Lecture Notes in Computer Science
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board

David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, University of Dortmund, Germany
Madhu Sudan, Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Moshe Y. Vardi, Rice University, Houston, TX, USA
Gerhard Weikum, Max-Planck Institute of Computer Science, Saarbruecken, Germany
4550
Julie A. Jacko (Ed.)
Human-Computer Interaction Interaction Design and Usability 12th International Conference, HCI International 2007 Beijing, China, July 22-27, 2007 Proceedings, Part I
Volume Editor

Julie A. Jacko
Georgia Institute of Technology and Emory University School of Medicine
901 Atlantic Drive, Suite 4100, Atlanta, GA 30332-0477, USA
E-mail: [email protected]

Library of Congress Control Number: 2007929779

CR Subject Classification (1998): H.5.2, H.5.3, H.3-5, C.2, I.3, D.2, F.3, K.4.2

LNCS Sublibrary: SL 2 – Programming and Software Engineering

ISSN 0302-9743
ISBN-10 3-540-73104-0 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-73104-7 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. Springer is a part of Springer Science+Business Media springer.com © Springer-Verlag Berlin Heidelberg 2007 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper SPIN: 12077908 06/3180 543210
Foreword
The 12th International Conference on Human-Computer Interaction, HCI International 2007, was held in Beijing, P.R. China, 22-27 July 2007, jointly with the Symposium on Human Interface (Japan) 2007, the 7th International Conference on Engineering Psychology and Cognitive Ergonomics, the 4th International Conference on Universal Access in Human-Computer Interaction, the 2nd International Conference on Virtual Reality, the 2nd International Conference on Usability and Internationalization, the 2nd International Conference on Online Communities and Social Computing, the 3rd International Conference on Augmented Cognition, and the 1st International Conference on Digital Human Modeling.

A total of 3403 individuals from academia, research institutes, industry and governmental agencies from 76 countries submitted contributions, and 1681 papers, judged to be of high scientific quality, were included in the program. These papers address the latest research and development efforts and highlight the human aspects of design and use of computing systems. The papers accepted for presentation thoroughly cover the entire field of Human-Computer Interaction, addressing major advances in knowledge and effective use of computers in a variety of application areas.

This volume, edited by Julie A. Jacko, contains papers in the thematic area of Human-Computer Interaction, addressing the following major topics:

• Interaction Design: Theoretical Issues, Methods, Techniques and Practice
• Usability and Evaluation Methods and Tools
• Understanding Users and Contexts of Use
• Models and Patterns in HCI

The remaining volumes of the HCI International 2007 proceedings are:
• Volume 2, LNCS 4551, Interaction Platforms and Techniques, edited by Julie A. Jacko
• Volume 3, LNCS 4552, HCI Intelligent Multimodal Interaction Environments, edited by Julie A. Jacko
• Volume 4, LNCS 4553, HCI Applications and Services, edited by Julie A. Jacko
• Volume 5, LNCS 4554, Coping with Diversity in Universal Access, edited by Constantine Stephanidis
• Volume 6, LNCS 4555, Universal Access to Ambient Interaction, edited by Constantine Stephanidis
• Volume 7, LNCS 4556, Universal Access to Applications and Services, edited by Constantine Stephanidis
• Volume 8, LNCS 4557, Methods, Techniques and Tools in Information Design, edited by Michael J. Smith and Gavriel Salvendy
• Volume 9, LNCS 4558, Interacting in Information Environments, edited by Michael J. Smith and Gavriel Salvendy
• Volume 10, LNCS 4559, HCI and Culture, edited by Nuray Aykin
• Volume 11, LNCS 4560, Global and Local User Interfaces, edited by Nuray Aykin
• Volume 12, LNCS 4561, Digital Human Modeling, edited by Vincent G. Duffy
• Volume 13, LNAI 4562, Engineering Psychology and Cognitive Ergonomics, edited by Don Harris
• Volume 14, LNCS 4563, Virtual Reality, edited by Randall Shumaker
• Volume 15, LNCS 4564, Online Communities and Social Computing, edited by Douglas Schuler
• Volume 16, LNAI 4565, Foundations of Augmented Cognition 3rd Edition, edited by Dylan D. Schmorrow and Leah M. Reeves
• Volume 17, LNCS 4566, Ergonomics and Health Aspects of Work with Computers, edited by Marvin J. Dainoff

I would like to thank the Program Chairs and the members of the Program Boards of all Thematic Areas, listed below, for their contribution to the highest scientific quality and the overall success of the HCI International 2007 Conference.
Ergonomics and Health Aspects of Work with Computers
Program Chair: Marvin J. Dainoff

Arne Aaras, Norway
Pascale Carayon, USA
Barbara G.F. Cohen, USA
Wolfgang Friesdorf, Germany
Martin Helander, Singapore
Ben-Tzion Karsh, USA
Waldemar Karwowski, USA
Peter Kern, Germany
Danuta Koradecka, Poland
Kari Lindstrom, Finland
Holger Luczak, Germany
Aura C. Matias, Philippines
Kyung (Ken) Park, Korea
Michelle Robertson, USA
Steven L. Sauter, USA
Dominique L. Scapin, France
Michael J. Smith, USA
Naomi Swanson, USA
Peter Vink, The Netherlands
John Wilson, UK
Human Interface and the Management of Information
Program Chair: Michael J. Smith

Lajos Balint, Hungary
Gunilla Bradley, Sweden
Hans-Jörg Bullinger, Germany
Alan H.S. Chan, Hong Kong
Klaus-Peter Fähnrich, Germany
Michitaka Hirose, Japan
Yoshinori Horie, Japan
Richard Koubek, USA
Yasufumi Kume, Japan
Mark Lehto, USA
Jiye Mao, P.R. China
Fiona Nah, USA
Shogo Nishida, Japan
Leszek Pacholski, Poland
Robert Proctor, USA
Youngho Rhee, Korea
Anxo Cereijo Roibás, UK
Francois Sainfort, USA
Katsunori Shimohara, Japan
Tsutomu Tabe, Japan
Alvaro Taveira, USA
Kim-Phuong L. Vu, USA
Tomio Watanabe, Japan
Sakae Yamamoto, Japan
Hidekazu Yoshikawa, Japan
Li Zheng, P.R. China
Bernhard Zimolong, Germany
Human-Computer Interaction
Program Chair: Julie A. Jacko

Sebastiano Bagnara, Italy
Jianming Dong, USA
John Eklund, Australia
Xiaowen Fang, USA
Sheue-Ling Hwang, Taiwan
Yong Gu Ji, Korea
Steven J. Landry, USA
Jonathan Lazar, USA
V. Kathlene Leonard, USA
Chang S. Nam, USA
Anthony F. Norcio, USA
Celestine A. Ntuen, USA
P.L. Patrick Rau, P.R. China
Andrew Sears, USA
Holly Vitense, USA
Wenli Zhu, P.R. China
Engineering Psychology and Cognitive Ergonomics
Program Chair: Don Harris

Kenneth R. Boff, USA
Guy Boy, France
Pietro Carlo Cacciabue, Italy
Judy Edworthy, UK
Erik Hollnagel, Sweden
Kenji Itoh, Japan
Peter G.A.M. Jorna, The Netherlands
Kenneth R. Laughery, USA
Nicolas Marmaras, Greece
David Morrison, Australia
Sundaram Narayanan, USA
Eduardo Salas, USA
Dirk Schaefer, France
Axel Schulte, Germany
Neville A. Stanton, UK
Andrew Thatcher, South Africa
Universal Access in Human-Computer Interaction
Program Chair: Constantine Stephanidis

Julio Abascal, Spain
Ray Adams, UK
Elizabeth Andre, Germany
Margherita Antona, Greece
Chieko Asakawa, Japan
Christian Bühler, Germany
Noelle Carbonell, France
Jerzy Charytonowicz, Poland
Pier Luigi Emiliani, Italy
Michael Fairhurst, UK
Gerhard Fischer, USA
Jon Gunderson, USA
Andreas Holzinger, Austria
Arthur Karshmer, USA
Simeon Keates, USA
George Kouroupetroglou, Greece
Jonathan Lazar, USA
Seongil Lee, Korea
Zhengjie Liu, P.R. China
Klaus Miesenberger, Austria
John Mylopoulos, Canada
Michael Pieper, Germany
Angel Puerta, USA
Anthony Savidis, Greece
Andrew Sears, USA
Ben Shneiderman, USA
Christian Stary, Austria
Hirotada Ueda, Japan
Jean Vanderdonckt, Belgium
Gregg Vanderheiden, USA
Gerhard Weber, Germany
Harald Weber, Germany
Toshiki Yamaoka, Japan
Mary Zajicek, UK
Panayiotis Zaphiris, UK
Virtual Reality
Program Chair: Randall Shumaker

Terry Allard, USA
Pat Banerjee, USA
Robert S. Kennedy, USA
Heidi Kroemker, Germany
Ben Lawson, USA
Ming Lin, USA
Bowen Loftin, USA
Holger Luczak, Germany
Annie Luciani, France
Gordon Mair, UK
Ulrich Neumann, USA
Albert "Skip" Rizzo, USA
Lawrence Rosenblum, USA
Dylan Schmorrow, USA
Kay Stanney, USA
Susumu Tachi, Japan
John Wilson, UK
Wei Zhang, P.R. China
Michael Zyda, USA
Usability and Internationalization
Program Chair: Nuray Aykin

Genevieve Bell, USA
Alan Chan, Hong Kong
Apala Lahiri Chavan, India
Jori Clarke, USA
Pierre-Henri Dejean, France
Susan Dray, USA
Paul Fu, USA
Emilie Gould, Canada
Sung H. Han, South Korea
Veikko Ikonen, Finland
Richard Ishida, UK
Esin Kiris, USA
Tobias Komischke, Germany
Masaaki Kurosu, Japan
James R. Lewis, USA
Rungtai Lin, Taiwan
Aaron Marcus, USA
Allen E. Milewski, USA
Patrick O'Sullivan, Ireland
Girish V. Prabhu, India
Kerstin Röse, Germany
Eunice Ratna Sari, Indonesia
Supriya Singh, Australia
Serengul Smith, UK
Denise Spacinsky, USA
Christian Sturm, Mexico
Adi B. Tedjasaputra, Singapore
Myung Hwan Yun, South Korea
Chen Zhao, P.R. China
Online Communities and Social Computing
Program Chair: Douglas Schuler

Chadia Abras, USA
Lecia Barker, USA
Peter van den Besselaar, The Netherlands
Amy Bruckman, USA
Peter Day, UK
Fiorella De Cindio, Italy
John Fung, P.R. China
Michael Gurstein, USA
Tom Horan, USA
Piet Kommers, The Netherlands
Jonathan Lazar, USA
Stefanie Lindstaedt, Austria
Diane Maloney-Krichmar, USA
Isaac Mao, P.R. China
Hideyuki Nakanishi, Japan
A. Ant Ozok, USA
Jennifer Preece, USA
Partha Pratim Sarker, Bangladesh
Gilson Schwartz, Brazil
Sergei Stafeev, Russia
F.F. Tusubira, Uganda
Cheng-Yen Wang, Taiwan
Augmented Cognition
Program Chair: Dylan D. Schmorrow

Kenneth Boff, USA
Joseph Cohn, USA
Blair Dickson, UK
Henry Girolamo, USA
Gerald Edelman, USA
Eric Horvitz, USA
Wilhelm Kincses, Germany
Amy Kruse, USA
Lee Kollmorgen, USA
Dennis McBride, USA
Jeffrey Morrison, USA
Denise Nicholson, USA
Dennis Proffitt, USA
Harry Shum, P.R. China
Kay Stanney, USA
Roy Stripling, USA
Michael Swetnam, USA
Robert Taylor, UK
John Wagner, USA
Digital Human Modeling
Program Chair: Vincent G. Duffy

Norm Badler, USA
Heiner Bubb, Germany
Don Chaffin, USA
Kathryn Cormican, Ireland
Andris Freivalds, USA
Ravindra Goonetilleke, Hong Kong
Anand Gramopadhye, USA
Sung H. Han, South Korea
Pheng Ann Heng, Hong Kong
Dewen Jin, P.R. China
Kang Li, USA
Zhizhong Li, P.R. China
Lizhuang Ma, P.R. China
Timo Maatta, Finland
J. Mark Porter, UK
Jim Potvin, Canada
Jean-Pierre Verriest, France
Zhaoqi Wang, P.R. China
Xiugan Yuan, P.R. China
Shao-Xiang Zhang, P.R. China
Xudong Zhang, USA
In addition to the members of the Program Boards above, I also wish to thank the following volunteer external reviewers: Kelly Hale, David Kobus, Amy Kruse, Cali Fidopiastis and Karl Van Orden from the USA; Mark Neerincx and Marc Grootjen from the Netherlands; Wilhelm Kincses from Germany; Ganesh Bhutkar and Mathura Prasad from India; Frederick Li from the UK; and Dimitris Grammenos, Angeliki Kastrinaki, Iosif Klironomos, Alexandros Mourouzis, and Stavroula Ntoa from Greece.

This conference would not have been possible without the continuous support and advice of the Conference Scientific Advisor, Prof. Gavriel Salvendy, as well as the dedicated work and outstanding efforts of the Communications Chair and Editor of HCI International News, Abbas Moallem, and of the members of the Organizational Board from P.R. China: Patrick Rau (Chair), Bo Chen, Xiaolan Fu, Zhibin Jiang, Congdong Li, Zhenjie Liu, Mowei Shen, Yuanchun Shi, Hui Su, Linyang Sun, Ming Po Tham, Ben Tsiang, Jian Wang, Guangyou Xu, Winnie Wanli Yang, Shuping Yi, Kan Zhang, and Wei Zho.

I would also like to thank the members of the Human Computer Interaction Laboratory of ICS-FORTH, and in particular Margherita Antona, Maria Pitsoulaki, George Paparoulis, Maria Bouhli, Stavroula Ntoa and George Margetis, for their contribution towards the organization of the HCI International 2007 Conference.
Constantine Stephanidis General Chair, HCI International 2007
HCI International 2009
The 13th International Conference on Human-Computer Interaction, HCI International 2009, will be held jointly with the affiliated Conferences in San Diego, California, USA, in the Town and Country Resort & Convention Center, 19-24 July 2009. It will cover a broad spectrum of themes related to Human Computer Interaction, including theoretical issues, methods, tools, processes and case studies in HCI design, as well as novel interaction techniques, interfaces and applications. The proceedings will be published by Springer. For more information, please visit the Conference website: http://www.hcii2009.org/
General Chair
Professor Constantine Stephanidis
ICS-FORTH and University of Crete
Heraklion, Crete, Greece
Email: [email protected]

Table of Contents
Part 1: Interaction Design: Theoretical Issues, Methods, Techniques and Practice

Design Principles Based on Cognitive Aging . . . . . . . . . . . . . . . . . . . . . . . .
Hiroko Akatsu, Hiroyuki Miki, and Naotsune Hosono
3
Redesigning the Rationale for Design Rationale . . . . . . . . . . . . . . . . . . . . . . Michael E. Atwood and John Horner
11
HCI and the Face: Towards an Art of the Soluble . . . . . . . . . . . . . . . . . . . . Christoph Bartneck and Michael J. Lyons
20
Towards Generic Interaction Styles for Product Design . . . . . . . . . . . . . . . Jacob Buur and Marcelle Stienstra
30
Context-Centered Design: Bridging the Gap Between Understanding and Designing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yunan Chen and Michael E. Atwood
40
Application of Micro-Scenario Method (MSM) to User Research for the Motorcycle’s Informatization - A Case Study for the Information Support System for Safety . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hiroshi Daimoto, Sachiyo Araki, Masamitsu Mizuno, and Masaaki Kurosu
49
Incorporating User Centered Requirement Engineering into Agile Software Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Markus Düchting, Dirk Zimmermann, and Karsten Nebe
58
How a Human-Centered Approach Impacts Software Development . . . . . Xavier Ferre and Nelson Medinilla
68
After Hurricane Katrina: Post Disaster Experience Research Using HCI Tools and Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Catherine Forsman
78
A Scenario-Based Design Method with Photo Diaries and Photo Essays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kentaro Go
88
Alignment of Product Portfolio Definition and User Centered Design Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ron Hofer, Dirk Zimmermann, and Melanie Jekal
98
A New User-Centered Design Process for Creating New Value and Future . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yasuhisa Itoh, Yoko Hirose, Hideaki Takahashi, and Masaaki Kurosu
108
The Evasive Interface – The Changing Concept of Interface and the Varying Role of Symbols in Human–Computer Interaction . . . . . . . . . . . . Lars-Erik Janlert
117
An Ignored Factor of User Experience: FEEDBACK-QUALITY . . . . . . . Ji Hong and Jiang Xubo
127
10 Heuristics for Designing Administrative User Interfaces – A Collaboration Between Ethnography, Design, and Engineering . . . . . . . . . Luke Kowalski and Kristyn Greenwood
133
Micro-Scenario Database for Substantializing the Collaboration Between Human Science and Engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . Masaaki Kurosu, Kentaro Go, Naoki Hirasawa, and Hideaki Kasai
140
A Meta-cognition Modeling of Engineering Product Designer in the Process of Product Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jun Liang, Zu-Hua Jiang, Yun-Song Zhao, and Jin-Lian Wang
146
User Oriented Design to the Chinese Industries Scenario and Experience Innovation Design Approach for the Industrializing Countries in the Digital Technology Era . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . You Zhao Liang, Ding Hau Huang, and Wen Ko Chiou
156
Emotional Experiences and Quality Perceptions of Interactive Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sascha Mahlke and Gitte Lindgaard
164
CRUISER: A Cross-Discipline User Interface and Software Engineering Lifecycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Thomas Memmel, Fredrik Gundelsweiler, and Harald Reiterer
174
Interface Between Two Disciplines - The Development of Theatre as a Research Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Maggie Morgan and Alan Newell
184
Aspects of Integrating User Centered Design into Software Engineering Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Karsten Nebe and Dirk Zimmermann
194
Activity Theoretical Analysis and Design Model for Web-Based Experimentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Anh Vu Nguyen-Ngoc
204
Collaborative Design for Strategic UXD Impact and Global Product Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . James Nieters and David Williams
214
Participatory Design Using Scenarios in Different Cultures . . . . . . . . . . . . Makoto Okamoto, Hidehiro Komatsu, Ikuko Gyobu, and Kei Ito
223
Wizard of Oz for Multimodal Interfaces Design: Deployment Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ronnie Taib and Natalie Ruiz
232
Extreme Programming in Action: A Longitudinal Case Study . . . . . . . . . Peter Tingling and Akbar Saeed
242
Holistic Interaction Between the Computer and the Active Human Being . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hannu Vanharanta and Tapio Salminen
252
The Use of Improvisational Role-Play in User Centered Design Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yanna Vogiazou, Jonathan Freeman, and Jane Lessiter
262
Quantifying the Narration Board for Visualising Final Design Concepts by Interface Designers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chui Yin Wong and Chee Weng Khong
273
Scenario-Based Installability Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xiao Shanghong
283
A Case Study of New Way to Apply Card Sort in Panel Design . . . . . . . . Yifei Xu, Xiangang Qin, and Shan Shan Cao
289
Design Tools for User Experience Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kazuhiko Yamazaki and Kazuo Furuta
298
Axiomatic Design Approach for E-Commercial Web Sites . . . . . . . . . . . . . Mehmet Mutlu Yenisey
308
Development of Quantitative Metrics to Support UI Designer Decision-Making in the Design Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Young Sik Yoon and Wan Chul Yoon
316
Scenario-Based Product Design, a Real Case . . . . . . . . . . . . . . . . . . . . . . . . Der-Jang Yu and Huey-Jiuan Yeh
325
Designing Transparent Interaction for Ubiquitous Computing: Theory and Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Weining Yue, Heng Wang, and Guoping Wang
331
Understanding, Measuring, and Designing User Experience: The Causal Relationship Between the Aesthetic Quality of Products and User Affect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Haotian Zhou and Xiaolan Fu
340
Enhancing User-Centered Design by Adopting the Taguchi Philosophy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wei Zhou, David Heesom, and Panagiotis Georgakis
350
A Requirement Engineering Approach to User Centered Design . . . . . . . . Dirk Zimmermann and Lennart Grötzbach
360
Part 2: Usability and Evaluation Methods and Tools

Design Science-Oriented Usability Modelling for Software Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Sisira Adikari, Craig McDonald, and Neil Lynch
373

Prototype Evaluation and User-Needs Analysis in the Early Design of Emerging Technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Margarita Anastassova, Christine Mégard, and Jean-Marie Burkhardt
383
Long Term Usability; Its Concept and Research Approach - The Origin of the Positive Feeling Toward the Product . . . . . . . . . . . . . . . . . . . . . . . . . . Masaya Ando and Masaaki Kurosu
393
General Interaction Expertise: An Approach for Sampling in Usability Testing of Consumer Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ali Emre Berkman
397
Are Guidelines and Standards for Web Usability Comprehensive? . . . . . . Nigel Bevan and Lonneke Spinhof
407
The Experimental Approaches of Assessing the Consistency of User Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yan Chen, Lixian Huang, Lulu Li, Qi Luo, Ying Wang, and Jing Xu
420
Evaluating Usability Improvements by Combining Visual and Audio Modalities in the Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Carlos Duarte, Luís Carriço, and Nuno Guimarães
428
Tool for Detecting Webpage Usability Problems from Mouse Click Coordinate Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ryosuke Fujioka, Ryo Tanimoto, Yuki Kawai, and Hidehiko Okada
438
A Game to Promote Understanding About UCD Methods and Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Muriel Garreta-Domingo, Magí Almirall-Hill, and Enric Mor
446
DEPTH TOOLKIT: A Web-Based Tool for Designing and Executing Usability Evaluations of E-Sites Based on Design Patterns . . . . . . . . . . . . Petros Georgiakakis, Symeon Retalis, Yannis Psaromiligkos, and George Papadimitriou
453
Evaluator of User’s Actions (Eua) Using the Model of Abstract Representation Dgaui . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Susana Gómez-Carnero and Javier Rodeiro Iglesias
463
Adaptive Evaluation Strategy Based on Surrogate Model . . . . . . . . . . . . . . Yi-nan Guo, Dun-wei Gong, and Hui Wang
472
A Study on the Improving Product Usability Applying the Kano’s Model of Customer Satisfaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jeongyun Heo, Sanhyun Park, and Chiwon Song
482
The Practices of Usability Analysis to Wireless Facility Controller for Conference Room . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ding Hau Huang, You Zhao Liang, and Wen Ko Chiou
490
What Makes Evaluators to Find More Usability Problems?: A Meta-analysis for Individual Detection Rates . . . . . . . . . . . . . . . . . . . . . . . . Wonil Hwang and Gavriel Salvendy
499
Evaluating in a Healthcare Setting: A Comparison Between Concurrent and Retrospective Verbalisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Janne Jul Jensen
508
Development of AHP Model for Telematics Haptic Interface Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yong Gu Ji, Beom Suk Jin, Jae Seung Mun, and Sang Min Ko
517
How to Make Tailored User Interface Guideline for Software Designers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ilari Jounila
527
Determining High Level Quantitative Usability Requirements: A Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Niina Kantola and Timo Jokela
536
Why It Is Difficult to Use a Simple Device: An Analysis of a Room Thermostat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sami Karjalainen
544
Usability Improvements for WLAN Access . . . . . . . . . . . . . . . . . . . . . . . . . . Kristiina Karvonen and Janne Lindqvist
549
A New Framework of Measuring the Business Values of Software . . . . . . . In Ki Kim, Beom Suk Jin, Seungyup Baek, Andrew Kim, Yong Gu Ji, and Myung Hwan Yun
559
Evaluating Usability Evaluation Methods: Criteria, Method and a Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . P. Koutsabasis, T. Spyrou, and J. Darzentas
569
Concept of Usability Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Masaaki Kurosu
579
How to Use Emotional Usability to Make the Product Serves a Need Beyond the Traditional Functional Objective to Satisfy the Emotion Needs of the User in Order to Improve the Product Differentiator - Focus on Home Appliance Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Liu Ning and Shang Ting
587

Towards Remote Empirical Evaluation of Web Pages’ Usability . . . . . . . .
Juan Miguel López, Inmaculada Fajardo, and Julio Abascal
594

Mixing Evaluation Methods for Assessing the Utility of an Interactive InfoVis Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Markus Rester, Margit Pohl, Sylvia Wiltner, Klaus Hinum, Silvia Miksch, Christian Popow, and Susanne Ohmann
604
Serial Hanging Out: Rapid Ethnographic Needs Assessment in Rural Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jaspal S. Sandhu, P. Altankhuyag, and D. Amarsaikhan
614
Effectiveness of Content Preparation in Information Technology Operations: Synopsis of a Working Paper . . . . . . . . . . . . . . . . . . . . . . . . . . . A. Savoy and G. Salvendy
624
Traces Using Aspect Oriented Programming and Interactive Agent-Based Architecture for Early Usability Evaluation: Basic Principles and Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jean-Claude Tarby, Houcine Ezzedine, José Rouillard, Chi Dung Tran, Philippe Laporte, and Christophe Kolski
632
Usability and Software Development: Roles of the Stakeholders . . . . . . . . Tobias Uldall-Espersen and Erik Frøkjær
642
Human Performance Model and Evaluation of PBUI . . . . . . . . . . . . . . . . . Naoki Urano and Kazunari Morimoto
652
Developing Instrument for Handset Usability Evaluation: A Survey Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ting Zhang, Pei-Luen Patrick Rau, and Gavriel Salvendy
662
Part 3: Understanding Users and Contexts of Use

Tips for Designing Mobile Phone Web Pages for the Elderly . . . . . . . . . . .
Yoko Asano, Harumi Saito, Hitomi Sato, Lin Wang, Qin Gao, and Pei-Luen Patrick Rau
675

The Role of Task Characteristics and Organization Culture in Non-Work Related Computing (NWRC) . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Gee-Woo Bock, Huei-Huang Kuan, Ping Liu, and Hua Sun
681

Searching for Information on the Web: Role of Aging and Ergonomic Quality of Website . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Aline Chevalier, Aurélie Dommès, Daniel Martins, and Cécile Valérian
691
Creating Kansei Engineering-Based Ontology for Annotating and Archiving Photos Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yu-Liang Chi, Shu-Yun Peng, and Ching-Chow Yang
701
Influence of Avatar Creation on Attitude, Empathy, Presence, and Para-Social Interaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Donghun Chung, Brahm Daniel deBuys, and Chang S. Nam
711
Sambad – Computer Interfaces for Non-literates . . . . . . . . . . . . . . . . . . . . . . Sagun Dhakhwa, Patrick A.V. Hall, Ganesh Bahadur Ghimire, Prakash Manandhar, and Ishwor Thapa
721
The Balancing Act Between Computer Security and Convenience . . . . . . Mayuresh Ektare and Yanxia Yang
731
What Makes Them So Special?: Identifying Attributes of Highly Competent Information System Users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Brenda Eschenbrenner and Fiona Fui-Hoon Nah
736
User Acceptance of Digital Tourist Guides Lessons Learnt from Two Field Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bente Evjemo, Sigmund Akselsen, and Anders Schürmann
746
Why Does IT Support Enjoyment of Elderly Life? - Case Studies Performed in Japan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kaori Fujimura, Hitomi Sato, Takayoshi Mochizuki, Kubo Koichiro, Kenichiro Shimokura, Yoshihiro Itoh, Setsuko Murata, Kenji Ogura, Takumi Watanabe, Yuichi Fujino, and Toshiaki Tsuboi
756
Design Effective Navigation Tools for Older Web Users . . . . . . . . . . . . . . . Qin Gao, Hitomi Sato, Pei-Luen Patrick Rau, and Yoko Asano
765
Out of Box Experience Issues of Free and Open Source Software . . . . . . . Mehmet Göktürk and Görkem Çetin
774
Factor Structure of Content Preparation for E-Business Web Sites: A Survey Results of Industrial Employees in P.R. China . . . . . . . . . . . . . . . . Yinni Guo and Gavriel Salvendy
784
Streamlining Checkout Experience – A Case Study of Iterative Design of a China e-Commerce Site . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Alice Han, Jianming Dong, Winnie Tseng, and Bernd Ewert
796
Presence, Creativity and Collaborative Work in Virtual Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ilona Heldal, David Roberts, Lars Bråthe, and Robin Wolff
802
Users Interact Differently: Towards a Usability-Oriented User Taxonomy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Fabian Hermann, Iris Niedermann, Matthias Peissner, Katja Henke, and Anja Naumann
812

Reminders, Alerts and Pop-ups: The Cost of Computer-Initiated Interruptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Helen M. Hodgetts and Dylan M. Jones
818
The Practices of Scenario Study to Home Scenario Control . . . . . . . . . . . . Yung Hsing Hu, Yuan Tsing Huang, You Zhao Liang, and Wen Ko Chiou
827
Effects of Time Orientation on Design of Notification Systems . . . . . . . . . Ding-Long Huang, Pei-Luen Patrick Rau, Hui Su, Nan Tu, and Chen Zhao
835
Being Together: User’s Subjective Experience of Social Presence in CMC Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ha Sung Hwang and SungBok Park
844
Age Differences in Performance, Operation Methods, and Workload While Interacting with an MP3 Player . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Neung Eun Kang and Wan Chul Yoon
854
A Usability Test of Exchanging Context in a Conference Room Via Mobile Device Interactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Doyoon Kim, Seungchul Shin, Cheolho Cheong, and Tack-Don Han
862
Conceptual and Technical Issues in Extending Computational Cognitive Modeling to Aviation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Alex Kirlik
872
Mental Models of Chinese and German Users and Their Implications for MMI: Experiences from the Case Study Navigation System . . . . . . . . . Barbara Knapp
882
Usability Test for Cellular Phone Interface Design That Controls Home Appliances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Haeinn Lee
891
Validating Information Complexity Questionnaires Using Travel Web Sites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chen Ling, Miguel Lopez, and Jing Xing
901
Maximizing Environmental Validity: Remote Recording of Desktop Videoconferencing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sean Rintel
911
The Impact of Moving Around and Zooming of Objects on Users’ Performance in Web Pages: A Cross-Generation Study . . . . . . . . . . . . . . . . Hitomi Sato, Kaori Fujimura, Lin Wang, Ling Jin, Yoko Asano, Masahiro Watanabe, and Pei-Luen Patrick Rau
921
Entelechy and Embodiment in (Artistic) Human-Computer Interaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Uwe Seifert and Jin Hyun Kim
929
Predicting Perceived Situation Awareness of Low Altitude Aircraft in Terminal Airspace Using Probe Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . Thomas Z. Strybel, Kim-Phuong L. Vu, John P. Dwyer, Jerome Kraft, Thuan K. Ngo, Vanessa Chambers, and Fredrick P. Garcia
939
Co-presence in Shared Virtual Environments: Avatars Beyond the Opposition of Presence and Representation . . . . . . . . . . . . . . . . . . . . . . . . . . Jan Söffner and Chang S. Nam
949
Using Memory Aid to Build Memory Independence . . . . . . . . . . . . . . . . . . . Quan T. Tran, Elizabeth D. Mynatt, and Gina Calcaterra
959
Perception of Movements and Transformations in Flash Animations of Older Adults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lin Wang, Hitomi Sato, Ling Jin, Pei-Luen Patrick Rau, and Yoko Asano
966
Studying Utility of Personal Usage-History: A Software Tool for Enabling Empirical Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kimmo Wideroos and Samuli Pekkola
976
Enable the Organization for UCD Through Specialist and Process Counseling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Natalie Woletz and Susanne Laumann
985
User Response to Free Trial Restrictions: A Coping Perspective . . . . . . . . Xue Yang, Chuan-Hoo Tan, and Hock-Hai Teo
991
A Study on the Form of Representation of the User’s Mental Model-Oriented Ancient Map of China . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1001
Rui Yang, Dan Li, and Wei Zhou
Towards Automatic Cognitive Load Measurement from Speech Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1011
Bo Yin and Fang Chen
Attitudes in ICT Acceptance and Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1021
Ping Zhang and Shelley Aikman
Part 4: Models and Patterns in HCI
Using Patterns to Support the Design of Flexible User Interaction . . . . . . 1033
M. Cecília C. Baranauskas and Vania Paula de Almeida Neris
Model-Based Usability Evaluation - Evaluation of Tool Support . . . . . . . . 1043
Gregor Buchholz, Jürgen Engel, Christian Märtin, and Stefan Propp
User-Oriented Design (UOD) Patterns for Innovation Design at Digital Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1053
Chiou Wen-Ko, Chen Bi-Hui, Wang Ming-Hsu, and Liang You-Zhao
Formal Validation of Java/Swing User Interfaces with the Event B Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1062
Alexandre Cortier, Bruno d’Ausbourg, and Yamine Aït-Ameur
Task Analysis, Usability and Engagement . . . . . . . . . . . . . . . . . . . . . . . . . . . 1072
David Cox
ORCHESTRA: Formalism to Express Static and Dynamic Model of Mobile Collaborative Activities and Associated Patterns . . . . . . . . . . . . . . 1082
Bertrand David, René Chalon, Olivier Delotte, and Guillaume Masserey
Effective Integration of Task-Based Modeling and Object-Oriented Specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1092
Anke Dittmar and Ashraf Gaffar
A Pattern Decomposition and Interaction Design Approach . . . . . . . . . . . 1102
Cunhao Fang, Pengwei Tian, and Ming Zhong
Towards an Integrated Approach for Task Modeling and Human Behavior Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1109
Martin Giersich, Peter Forbrig, Georg Fuchs, Thomas Kirste, Daniel Reichart, and Heidrun Schumann
A Pattern-Based Framework for the Exploration of Design Alternatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1119
Tibor Kunert and Heidi Krömker
Tasks Models Merging for High-Level Component Composition . . . . . . . . 1129
Arnaud Lewandowski, Sophie Lepreux, and Grégory Bourguin
Application of Visual Programming to Web Mash Up Development . . . . . 1139
Seung Chan Lim, Sandi Lowe, and Jeremy Koempel
Comprehensive Task and Dialog Modelling . . . . . . . . . . . . . . . . . . . . . . . . . . 1149
Víctor López-Jaquero and Francisco Montero
Structurally Supported Design of HCI Pattern Languages . . . . . . . . . . . . . 1159
Christian Märtin and Alexander Roski
Integrating Authoring Tools into Model-Driven Development of Interactive Multimedia Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1168
Andreas Pleuß and Heinrich Hußmann
A Survey on Transformation Tools for Model Based User Interface Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1178
Robbie Schaefer
A Task Model Proposal for Web Sites Usability Evaluation for the ErgoMonitor Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1188
André Luis Schwerz, Marcelo Morandini, and Sérgio Roberto da Silva
Model-Driven Architecture for Web Applications . . . . . . . . . . . . . . . . . . . . . 1198
Mohamed Taleb, Ahmed Seffah, and Alain Abran
HCI Design Patterns for PDA Running Space Structured Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1206
Ricardo Tesoriero, Francisco Montero, María D. Lozano, and José A. Gallud
Task-Based Prediction of Interaction Patterns for Ambient Intelligence Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1216
Kristof Verpoorten, Kris Luyten, and Karin Coninx
Patterns for Task- and Dialog-Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1226
Maik Wurdel, Peter Forbrig, T. Radhakrishnan, and Daniel Sinnig
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1237
Part I
Interaction Design: Theoretical Issues, Methods, Techniques and Practice
Design Principles Based on Cognitive Aging

Hiroko Akatsu1, Hiroyuki Miki1, and Naotsune Hosono2

1 Oki Electric Industry Co., Ltd.
1-16-8 Chuou Warabi-shi, Saitama, 335-8510 Japan
[email protected], [email protected]
2 Oki Consulting Solutions Co., Ltd.
[email protected]

Abstract. This study proposes design principles, based on cognitive aging, that balance ‘simplicity’ and ‘helpfulness’. Due to the growth of the aging population, many kinds of equipment are required to better assist elderly users. ATMs (Automatic Teller Machines) have long been considered difficult for elderly users to operate. This paper therefore discusses a new ATM interface design based on the proposed principles. The effectiveness of the new design was examined by comparing it with a conventional ATM. The usability test results favored the new ATM design, and it was consequently accepted by many elderly users.

Keywords: cognitive aging, design principles, elderly users, ATM.
1 Introduction

Due to the growth of the aging population, many kinds of equipment are required to better assist elderly users [1]. Although most forms of assistance are designed with consideration for the special needs of the elderly, they are often limited to perceptive characteristics, such as stronger contrasts or larger characters on the display, and physical characteristics, such as big buttons. The problem is that most elderly users cannot operate ATMs on the basis of such perceptive characteristics alone [2]. ATMs have not provided sufficient assistance for the cognitive aging and behavioral characteristics that have a definite influence on operation.

This research proposes design principles, based on cognitive aging, that balance ‘simplicity’ and ‘helpfulness’, using the ATM as a case study. Elderly users tend to read all the information on a screen, so a simple design is important. However, complicated operation steps require not only ‘simplicity’ but also ‘helpfulness’. A new ATM was therefore designed based on these principles, and the effectiveness of the new design was examined by comparing it with a conventional ATM.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 3–10, 2007.
© Springer-Verlag Berlin Heidelberg 2007
4
H. Akatsu, H. Miki, and N. Hosono
2 Influences of Cognitive Aging on Interactive Equipment

2.1 Issues

It is important to consider not only perceptive and physical characteristics; a comprehensive view that includes the cognitive behavioral characteristics with a definite influence on operation should also be taken into account (Figure 1).
[Figure 1 illustrates the elderly users’ characteristics when they operate various equipment. Aged changes (decreased vision, senile cataract, decreased sensibility, longer response time, diminished attention, decline in memory) span both perceptive and physical characteristics and cognitive behavioral characteristics: slow operations through repeated confirmations; difficulty understanding all the information at once; failure to notice screen changes; repetition of similar errors; hesitation to take initiatives.]

Fig. 1. Cognitive aging
The following characteristics of elderly users were found through usability tests of various equipment [3].

1) Longer Response Time than Younger Users
The time required for entries was quite long when using the 50-character keys; this included the time to insert a passbook or cash and the overall time spent responding to individual items. This often resulted in a time-out, meaning that many of the elderly users needed to repeat the procedure from the beginning. A comparison of the average times needed for each task revealed that the group of elderly users took twice as long as the group of university students for withdrawal operations and three times as long for fund transfers. However, by repeating the same operations, such as entering one’s name using the 50-character keys, the elderly users learned the operation, and this shortened the time for such tasks.
2) Difficulties Collecting All the Information in a Short Time
Under certain conditions, they experienced difficulties in collecting all the necessary information at once, such as being able to read only a portion of the messages displayed on the screen.

3) Excessive Response to Voice Messages
In general, voice message prompts prevented the elderly users from forgetting to press a key (example: a voice message such as “Please verify the amount and press the ‘Confirm’ key if the amount is correct”). However, when a voice message prompting them to “enter your name” was given after the name had been entered, the elderly users proceeded to enter the name again, even though the name entry had just been completed.

4) Recurrence of the Same Errors
It was found that once an operational error was made, there was a tendency to repeat the same error. It appears to be difficult for the elderly to determine what status they are currently in or how the operation was done previously, which makes it hard for them to avoid repeating the same errors.

5) They Tend to Respond to Items That Are Easily Seen or Can Be Touched Directly by Hand (example: hardware keys)

6) They Hardly Notice Changes to the Information Displayed on the Screen

7) They Cannot Always Extract the Necessary Information (or they will try to read all the information, but get tired along the way and are unable to finish reading)

8) They Will Not Take Any Initiatives on Their Own (or they will simply follow orders when asked to push keys, for example)

2.2 Ease of Use and Cognitive Aging: A Three-Layered Factor Model

By sorting out the problems of the elderly observed through various experiments, it appears that the three factors shown in Figure 2 overlap each other in a complex manner, causing the phenomenon that the elderly “cannot use equipment”.
The three factors are:

(a) Factors Associated with the Deterioration of the Cognitive Capacity of Elderly Users
The fundamental factor behind the inability to use equipment is the deterioration of cognitive function that occurs with aging. As reported by research in the field of experimental cognitive psychology, the deterioration of capabilities due to aging is considered to have a clear influence.
(b) Factors Relevant to the Lack of Knowledge and Mental Models (of Equipment and Systems)
A mental model is a user’s internal image of how equipment should be used. The lack of such knowledge is believed to amplify the effects of cognitive aging outlined in (a) of Figure 2, delaying the understanding of equipment operations. Such problems arise from the rapid advancement of IT equipment, which will continue to bring difficulties for the elderly. As long as new technologies continue to be developed, new problems, different from those of today, are expected to appear continuously.

(c) Factors Relevant to Attitude (Cultural and Social Values)
Elderly users often do not even want to try to use the equipment, instead selecting methods and means within their familiarity (example: using a teller rather than an ATM), as they do not want to be seen as incapable. This factor is a problem for manufacturers. Still, as mentioned before, with the branches of many banks being consolidated and reduced in number, there will be an increasing number of situations in which the elderly are forced to use ATMs, which remain difficult for them to use.

As our agenda for the future, it is essential to broaden the scope of usability research and to conduct studies from other perspectives, such as what needs to be done to enable the elderly to use equipment. It is necessary to consider that a given usability issue may be caused not by a single factor but by all three factors in combination. The design principles are therefore based on cognitive aging while considering all three factors. Consequently, a new ATM design for elderly users based on these principles is proposed, and its effectiveness is then compared with a conventional ATM.
[Figure 2 depicts the three layers:
(a) Factors associated with the deterioration of cognitive capabilities of the elderly: deterioration of inhibition functions; decrease in short-term memory capacity; delays in comprehension.
(b) Factors associated with a lack of knowledge and mental models: knowledge and mental models concerning particular modes of operation of equipment; knowledge relative to the concept of the information itself.
(c) Factors associated with attitudes: negative attitude toward using the equipment; values, knowledge and frameworks of each generation; selecting methods and means to effectively sustain their own capabilities.]

Fig. 2. Ease of use and cognitive aging: A three-layered factor model (the material was adapted and corrected by Harada and Akatsu [3])
3 Design Principles and ATM Design

Through consideration of the elderly users’ characteristics above, the following design principles were derived, and a new ATM design that balances ‘simplicity’ and ‘helpfulness’ based on cognitive aging is proposed.

1) Require Just One Operation per Screen
ATM design example: elderly users can perform the banking transaction in a step-by-step manner.

2) Screen Switches Must Be Noticeable
ATM design example: blinking buttons and screen switching by side slide at the time of page renewal (Figure 3).

3) The Operation Flow Must Also Be Comprehensible
ATM design example: the conventional ATM demands two operations, input and confirmation, on a single screen. The new ATM divides them into an input screen and a confirmation screen. As a result, elderly users could perform input and confirmation operations with confidence (see Figure 4).

4) The Screen Information Must Be Easy to Read (sufficient font size and contrast)

5) Screen Information Must Be as Simple as Possible

Voice announcements generally support the operation. However, announcements sometimes hinder the operation due to inappropriate timing or content. Hence the following points were considered.

6) The Same Content as the Announcement Must Be Displayed on the Screen

7) The Announcement Must Be Made Just Before Changing to the Next Screen, and It Must Not Repeat

8) Announcements of Feedback Messages Can Be Made Through the Handset Phone
Fig. 3. Screen switch by side slide
[Figure 4 contrasts the two designs. The conventional ATM uses a single screen (“Please enter the amount to remit. Next then please confirm.”) with an amount keypad and cancel/clear/confirm keys. The new ATM for elderly users splits this into an input screen (“Please enter the amount to remit.” with an amount keypad and Clear/Next keys) and a confirmation screen (“The amount is ‘65,000 yen’. Is it OK?” with Clear/Yes keys).]

Fig. 4. Input screen and Confirmation screen
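The input/confirmation split shown in Figure 4 can be expressed as a small step-by-step flow. The following is only an illustrative sketch of that idea, not the authors' implementation; all class and method names are hypothetical.

```python
# Illustrative sketch (not the authors' implementation) of the new ATM's
# step-by-step flow: one operation per screen, with input and confirmation
# split into separate screens. All names here are hypothetical.

class TransferFlow:
    """A minimal input-screen / confirmation-screen state machine."""

    def __init__(self):
        self.state = "input"   # which screen is currently shown
        self.amount = None

    def enter_amount(self, amount_yen):
        # Input screen: the only operation offered is entering the amount.
        assert self.state == "input"
        self.amount = amount_yen
        self.state = "confirm"  # explicit screen switch, so the change is noticeable
        return f'The amount is "{amount_yen:,} yen". Is it OK?'

    def confirm(self, ok):
        # Confirmation screen: the only operations offered are Yes and Clear.
        assert self.state == "confirm"
        if ok:
            self.state = "done"
            return "Transfer accepted."
        # Clear returns to the input screen instead of aborting, so a
        # mistake does not wipe out the user's previous effort.
        self.state = "input"
        self.amount = None
        return "Please enter the amount to remit."


flow = TransferFlow()
prompt = flow.enter_amount(65000)
result = flow.confirm(True)
```

The extra screen adds a step, consistent with the paper's observation that elderly users preferred simple, confident operation even at the cost of additional steps.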
4 ATM Usability Testing

The effectiveness of the new ATM design for elderly users was compared with the conventional one.

4.1 Methods

First, the test participants were instructed to vocalize what they were thinking while operating an ATM simulator (the “think-aloud method”). The collected data (every behavior and utterance of the test participants) were then analyzed using protocol analysis.

4.2 Test Participants

The test participants were six elderly users (three males and three females, aged between 68 and 75). They had never used an ATM before.

4.3 Experimental Equipment

As the system under test, an ATM simulator was prepared (a personal computer and a touch display installed in a paper model housing), and ordinary transaction
operations were then to be performed. A video camera, a tiepin-type small microphone, recording equipment, etc., were prepared as recording media.

4.4 Experimental Procedures

Each test was conducted individually. First, an explanation of the usability test objectives, an explanation of the use of the equipment, practice of the think-aloud method, and a preliminary questionnaire survey concerning the use of ATMs were conducted prior to performing the tasks. A follow-up questionnaire survey was conducted after the tasks had been completed, and additional interviews were also conducted. The two prepared tasks were (1) withdrawal using a cash card, and (2) money transfer.

4.5 Results and Considerations

1) Decreased Number of Time-outs from Operational Errors
It was found that most time-outs of an ATM operation occur when elderly users become confused and are uncertain of what to do next. When a time-out occurs, the display is usually returned to the top screen, wiping out the users’ previous efforts. The number of time-outs each user experienced during the money transfer task was counted. Overall, the new ATM design was found to decrease the number of time-out occurrences to less than half when compared with the conventional ATM. On the conventional ATM, time-outs mainly occurred during the money transfer operation, when entering the first letter of the bank branch name and selecting a branch from a list. On the new ATM, time-outs occurred during name input using the Japanese character list. Consequently, it can be said that the new ATM solved the main usability issues, even though some problems remain with name input.

2) Less Cognitive Load
The six users were interviewed after the experimental evaluation. They stated that the new ATM was easier to use, and for the most part they were satisfied.
From the comments made by the users, it is surmised that the accumulation of useful tips on each screen page and the overall effort to reduce cognitive load were effective.

3) Number of Operational Steps and Operational Confidence
There is a trade-off between simplifying the information on one screen page and the resulting additional number of page operations. In the elderly user mode, additional screen pages are added so that the operations can be performed more easily and with confidence. Operational rhythm is enhanced with subsidiary announcements to make the additional steps less noticeable. Interviews with the test participants showed that they preferred simple usability even if several steps are added. Judging by the results of the usability test, the effectiveness of the proposed principles was confirmed.
5 Conclusion

This paper proposed a new ATM interface design particularly reflecting the requirements of cognitive aging. Experimental evaluation showed fewer instances of operational confusion and fewer errors when compared with the conventional ATM. The elderly users appreciated the step-by-step operations, which were more in line with their input pace. The effectiveness of the proposed principles was thereby confirmed. The principles should be applicable not only to ATMs but also to other kinds of equipment.
References

1. Fisk, A.D., Rogers, W.A., et al.: Designing for Older Adults: Principles and Creative Human Factors Approaches. CRC Press (2004)
2. Kyoyou-Hin Foundation: List of inconveniences experienced by elderly people (1999)
3. Harada, T.E., Akatsu, H.: What is “Usability”? A Perspective of Universal Design in an Aging Society. In: Cognitive Science of Usability. Kyoritsu Publisher (2003)
Redesigning the Rationale for Design Rationale

Michael E. Atwood and John Horner

College of Information Science and Technology
Drexel University
Philadelphia, PA 19104 USA
{atwood, jh38}@drexel.edu
Abstract. One goal of design rationale systems is to support designers by providing a means to record and communicate the argumentation and reasoning behind the design process. However, there are several inherent limitations to developing systems that effectively capture and utilize design rationale. The dynamic and contextual nature of design, together with our inability to exhaustively analyze all possible design issues, results in cognitive, capture, retrieval, and usage limitations. In addition, there are organizational limitations that ensue when systems are deployed. In this paper we analyze the essential problems that prevent the successful development and use of design rationale systems. We argue that useful and effective design rationale systems cannot be built unless we carefully redefine the goal of design rationale systems.

Keywords: Design rationale, theories of design, interactive systems design.
1 Introduction

Over the past two decades, much has been written about design rationale. That design rationale has remained an active research area within the human-computer interaction (HCI) community for an extended time indicates that researchers see it as an attractive and productive area for research. We share this enthusiasm for research on design rationale. But, at the same time, we have little confidence that useful and usable design rationale systems will ever be built. And, should they ever be built, we have little confidence that they will be used. The only solution we see for successful research on design rationale is to carefully define the rationale underlying design rationale.

Our motivation in writing this paper is derived from two questions. First, since we don’t have a common understanding of what design is, how can we have a common understanding of what design rationale is? Second, why is the collection of papers that describe design rationale systems so much larger than the collection that describes design rationale successes?
2 How Does Design Rationale Relate to Design?

In order to get a better picture of the different views of design, we will use an author co-citation analysis initially reported in Wania, Atwood, and McCain [1].

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 11–19, 2007.
© Springer-Verlag Berlin Heidelberg 2007
12
M.E. Atwood and J. Horner
[Figure 1 is an author co-citation map of HCI design communities; the recoverable cluster labels include Participatory Design, User-Centered Design, and Cognitive approaches.]

Fig. 1. Design Communities
Wania et al. reported a bibliometric co-citation analysis of the HCI literature over much of the past two decades. From this analysis, shown in Figure 1, seven major approaches to design were identified. It is important to note that the Design Rationale cluster spans across much of the map, almost connecting one side to the other. Two points are worth noting here. First, design rationale is not a tool that other design communities use as much as it is a research area of its own; that is why it appears here as a separate cluster. Second, the design rationale community does not share a great deal of common interest. The authors in the Design Rationale cluster all seem to be boundary spanners: each author in this cluster is located very close to another cluster. This suggests that design rationale may mean different things to the different researchers and practitioners within this community.

2.1 Why Do the Papers Describing Systems Outnumber Those Describing Successes?

In analyzing the papers that describe design rationale systems, we will look at two end-points. In 1991, a special issue of the journal Human-Computer Interaction presented six papers on design rationale. Of these six, only one reported any data on system use, and this data indicated only that one design rationale system was usable; there was no data supporting a claim that it was useful. In 2006, an edited text [2] presented twenty papers on design rationale. Of these twenty, only one reported data on system usability; no data on usefulness was presented. Clearly, the number of papers describing design rationale systems is much larger than the number reporting design rationale successes.

In order to understand why design rationale is not seen as a tool for designers and why successes are so rare, we begin with a common view of design rationale. In Figure 2, we show the flow of information in most design rationale systems.
Initially, designers consider alternatives to the design issues they are facing [3]. Then, they store the rationale for their decisions in a design rationale system. At a later time, another designer can browse the design rationale system to review earlier decisions and potentially apply them to the current design. All of this, of course, sits in some organizational context.
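The flow just described can be sketched as a minimal store-and-browse pattern. This is a hypothetical illustration of the pattern only, not any particular design rationale system discussed in the paper; all class, method, and field names are assumptions.

```python
# Hypothetical sketch of the information flow around a design rationale (DR)
# system: a designer records the alternatives considered for an issue, and a
# later designer browses the store for relevant earlier decisions. Names and
# fields are illustrative, not taken from the paper.

from dataclasses import dataclass
from typing import List

@dataclass
class RationaleRecord:
    issue: str                 # the design question faced
    alternatives: List[str]    # the options that were considered
    decision: str              # the option that was chosen
    reasoning: str             # the argumentation behind the choice

class RationaleStore:
    def __init__(self):
        self.records: List[RationaleRecord] = []

    def capture(self, record: RationaleRecord) -> None:
        # The original designer stores the rationale for a decision.
        self.records.append(record)

    def browse(self, keyword: str) -> List[RationaleRecord]:
        # A later designer retrieves decisions on similar issues.
        kw = keyword.lower()
        return [r for r in self.records if kw in r.issue.lower()]


store = RationaleStore()
store.capture(RationaleRecord(
    issue="How should the system confirm a destructive action?",
    alternatives=["modal dialog", "undo after the fact"],
    decision="undo after the fact",
    reasoning="Undo preserves flow and tolerates errors better than interruption.",
))
matches = store.browse("confirm")
```

Even this toy version makes the paper's tension visible: `capture` costs the original designer effort, while the benefit of `browse` accrues to someone else later.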
[Figure 2 depicts an organizational setting containing Artifact A, Artifact B, and a DR system, connected by four numbered information flows between the designers and the system.]

Fig. 2. Barriers to Effective Design Rationale Systems
Overall, design rationale systems are intended to support communication, reflection, and analysis in design: communicating design decisions to others, reflecting on design options, and analyzing which option to select. But, referring back to Figure 2, the goal of transmitting information to future designers detracts from the goal of doing good designs today! Simply put, a designer’s cognitive energy can be focused on solving today’s problems or on recording information to be used in the future, and doing one detracts from the other. We argue that the main use of design rationale systems should be to support today’s design. In essence, this brings design rationale back to its starting point (e.g., [4]).
3 The Essential Barriers

For each of the activities shown in Figure 2, we list the essential problems that inhibit the success of design rationale systems. We use the term essential in the same way that Brooks [5] did: essential problems are inherent in the nature of the activity, in contrast to accidental problems, which are problems for today but are not inherent and may well be solved by future technological advances. After analyzing these essential problems, we return to two additional questions. In order to better understand what the rationale for design rationale should be, we must ask: what do designers do? And then: what should the goal of design rationale be?

3.1 Cognitive Barriers

Designers must focus their cognitive energy on the problem at hand. Imposing inappropriate constraints or introducing irrelevant information into design activities can have detrimental effects.

Satisficing, Not Optimization. People have a limited capacity to process information. This limitation can hinder the effectiveness of design rationale. Simon [6] states that we are bounded by our rationality and cannot consider all possible
alternatives. Therefore, people choose satisfactory rather than optimal solutions. Since we are bounded by the amount of information we can process, design rationale is necessarily incomplete.

Unintended Consequences. It is important to recognize the potential for unintended consequences, especially in systems where the risks are high [6]. In these situations, designers may want to ensure that they have exhaustively covered the design space so as to minimize the risk of unanticipated effects. The key question in this type of query is “what are we missing?” Design rationale is a potential solution to help designers identify issues that they might otherwise have left unconsidered. Systems could allow designers to search for similar projects or issues to identify issues that were considered in those projects.

Collaboration Hampers Conceptual Integrity. One mechanism for analyzing the design space more exhaustively is collaboration in the design process [7]. However, in any collaborative design context, maintaining conceptual integrity is important to keep the design project focused [5]. More people can consider more ideas, but this adds complexity and effort in keeping everyone on the design team up to speed. It also increases the effort of integrating diverse perspectives.

3.2 Capture Barriers

There are many situations in which design rationale may not be captured. In some cases, the omission is unintentional; in others, it is quite intentional. We consider both below.

Work-Benefit Disparity. Complex design is normally a group activity, and tools to support designers can therefore be considered a type of groupware. Grudin [8] describes several problems involved in developing groupware. One obstacle he discusses is of particular interest to design rationale systems: he contends that there should not be a disparity between who incurs the cost and who receives the benefit.
If the focus of design rationale is placed only on minimizing the cost to later users, it can add significant costs to the original designers. A major shortcoming in design rationale is the failure to minimize the cost to the original designers. Gruber and Russell [9] contend that design rationale must go beyond the record-and-replay paradigm and collect data that can benefit later users while not burdening designers.

Context Is Hard to Capture. Design rationale may be considered but unintentionally left unrecorded by the capture process. There are several reasons why considerations could be unintentionally omitted from design rationale. If design rationale capture takes place outside of the design process, contextual cues may not be present, designers may not recall what they deliberated upon, or designers may not be available at the time the rationale is captured. For these reasons, it would appear that rationale should be captured in the context of design. However, it is not always possible or advantageous to capture rationale in
Redesigning the Rationale for Design Rationale
the design context. Grudin [10] notes that in certain development environments, exploring the design space can be detrimental because it diverts critical resources. Additionally, many design decisions are considered in informal situations, where capturing the rationale is infeasible [11]. Tracking where the rationale was recorded, the persons present at the time of capture, their roles and expertise, and the environmental context of the capture can help reviewers infer why specific information was considered.

Designers Should Design, Not Record Rationale. Tacit knowledge [12] is a term used to describe things that we know but are not able to bring to consciousness. Design rationale may be unintentionally omitted because a designer cannot explicate such tacit knowledge. Designers may not be able or willing to spend the energy to articulate their thoughts into the design rationale system, especially when they reach breakdowns and are focusing on understanding and resolving the problem at hand. Conklin and Burgess-Yakemovic [7] state that designers' focus should be on solving problems, not on capturing their decisions. During routine situations, designers react to problems as they arise without consciously thinking about them.

Recording Rationale Can Be Dangerous! Sharing knowledge can be detrimental to designers, especially if the information they share could potentially be used against them. Designers may be hesitant to simply give away knowledge without knowing who will use it or how it will be used. Rewarding knowledge sharing is a challenging task that involves creating tangible rewards for intangible ideas. This is especially difficult considering that there is often no way to evaluate which ideas led to the success or failure of an artifact. In certain contexts, there are privacy and security concerns with the design rationale.
For instance, organizations may want to keep their rationale secure so that competing organizations cannot gain a competitive advantage. Similarly, there may be political repercussions or security breaches if policy makers make their rationale available to the public. For example, designers may not want to document all of their considerations because politically motivated information could be held against them. There are also situations where people working outside the specified work procedures may not want to document their work-arounds for fear that it will be detrimental to them. Designers may not want to capture rationale that could be viewed as detrimental to themselves or certain other people, and will therefore intentionally omit certain rationale. Additionally, individual designers may not want their design considerations to be available for post-hoc scrutiny.

3.3 Retrieval Barriers

Karsenty [13] evaluated design documents and found that design rationale questions were by far the most frequent questions during design evaluation meetings. However, only 41% of the design rationale questions were answered by the design rationale documentation. The discrepancy between the needed and the captured design rationale has several high-level causes, including analysts not capturing questions, options, or criteria; the inadequacy of the design rationale method; and a lack of understanding. Other literature has focused on several issues
that contribute to this failure, including inappropriate representations [14,15], the added workload required of designers [7,10], exigent organizational constraints [11], and contextual differences between the design environment at the time the rationale is captured and the time it is needed [9].

Relevance Is Situational. Initial designers and subsequent users of rationale may have different notions of what is relevant in a given design context. Wilson [16] describes relevance as a relationship between a user and a piece of information, independent of truth. Relevance is based on a user's situational understanding of a concern. Moreover, he argues that situational relevance is an inherently indeterminate notion because of the changing, unsettled, and undecided character of our concerns. This suggests that the rationale constructed at design time may not be relevant to those reviewing the rationale at a later time in a different context. When rationale is exhaustively captured, additional effort is required to capture the information; when too little information is captured, the reviewers' questions remain unanswered. Belkin [17] describes information retrieval as a type of communication whereby a user is investigating their state of knowledge with respect to a problem. Belkin contends that the success of the communication depends on the extent to which the anomaly can be resolved based on the information provided, and thus is controlled by the recipient. This suggests that designers cannot recognize the relevance of rationale until a person queries it, and that later users may not be able to specify what information will be most useful, but rather will only recognize that they do not have the necessary knowledge to resolve a problem.

Indexing. A more structured representation can make it more difficult to capture design ideas, but can facilitate indexing and retrieval.
One problem is that there is an inherent tradeoff between representational flexibility and ease of retrieval. Unstructured text is easier to record, but more difficult to structure in a database. One solution is to push the burden onto those who receive the benefit [8], which in this case would be the retrievers. However, if the potential users of the rationale find the system too effortful, it will go unused, and designers in turn will not be inclined to spend time entering design rationale into a system that will not be used.

3.4 Usage Barriers

People reviewing design rationale have a goal and a task at hand that they hope the design rationale will support. Often, these people are also involved in designing. If this is the case, the reviewers may not know whether retrieved rationale is applicable to their current problem.

The Same Problem in a Different Context Is a Different Problem. Because design problems are unique, even rationale that successfully resolved one design problem may not be applicable to a different problem. In addition to the problem of accurately and exhaustively capturing rationale, recognizing the impact of rationale can be difficult. Understanding rationale tied to one problem could help resolve similar problems in the future. However, design is contextual, and external factors often interact with the
design activity in a complex and unexpected manner. Reviewers of rationale are interested in understanding information to help them with their task at hand, and without understanding the context of those problems, utilization of the information becomes difficult. The inherent problem of identifying the impact of rationale across different design problems adds a net cost to utilizing rationale, decreasing its overall utility in the design process.

Initiative Falls on the User. Design rationale systems are passive rather than active. The initiative to find relevant rationale falls on the user. The system does not suggest it; it is the user's responsibility to retrieve it.

3.5 Organizational Barriers

As Davenport and Prusak warn in their book [18], "if you build it, they may not come." Being able to build a system is only an initial step; the "gold standard" against which success is measured, however, is whether people will accept and use it.

Designers Don't Control the Reward Structure of Users. As system builders, we have little control over the personal reward systems of individual users, or over the management mandates that many [18,19] recommend to enhance usage of the technology; we therefore cannot motivate our users in these ways and must rely on other factors.

Informal Knowledge Is Difficult to Capture. Design rationale tools must support both formal and informal knowledge, making the system flexible enough that broad content types are supported [20]. They must support multiple levels of organization of content, and design systems so that knowledge can be structured at any time after it is entered [21].
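The incremental formalization idea attributed to Shipman and McCall [21] can be sketched in a few lines. The sketch below is our own illustration, not code from any cited system, and all class, field, and tag names are invented: rationale is captured as cheap free text at design time, and structure is attached later, whenever someone is willing to spend the effort.

```python
# Illustrative sketch of incremental formalization [21]: notes are
# captured as free text with minimal effort, and structure (tags)
# is added after entry.  All names here are invented for illustration.

class RationaleNote:
    def __init__(self, text):
        self.text = text      # informal capture, at design time
        self.tags = set()     # formal structure, added later

    def formalize(self, *tags):
        """Attach tags after capture, without rewriting the note."""
        self.tags.update(tags)

def retrieve(notes, tag):
    """Simple tag-based retrieval over formalized notes."""
    return [n.text for n in notes if tag in n.tags]

notes = [RationaleNote("chose polling because the firewall blocks inbound connections"),
         RationaleNote("dropped the setup wizard; test users found it patronizing")]
notes[0].formalize("networking", "tradeoff")   # structured well after capture
print(retrieve(notes, "networking"))
```

Note that retrieval only works for notes someone has formalized, which mirrors the work-benefit disparity discussed in Section 3.2: the tagging effort falls on whoever wants the retrieval benefit.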
4 Conclusions

In this paper, we have explored the role of design rationale research within the broader design community, and we have examined a number of barriers that impede design rationale as an effective tool for reflection, communication, and analysis. The barriers were discussed in terms of cognitive, capture, retrieval, usage, and organizational limitations. At one level, the intent of design rationale is to transmit information from a designer working at one time and in one context to another designer working in another time and context. This is the most frequently cited goal in design rationale research. But is this the ultimate goal of design rationale? We argue that it is not. The goal of research on design rationale is to improve the quality of designs. There are fundamental barriers to developing information systems that support asynchronous communication among designers working on different design problems. Therefore, design research should focus on supporting designers in better understanding the context of their unique problems.
The focus of design rationale should therefore be on identifying what tools are most appropriate for the task. Using less persistent modes of communication, putting greater emphasis on supporting design processes rather than design tools, and creating systems optimized for a single purpose are necessary steps for improving design.
References

1. Wania, C., McCain, K., Atwood, M.E.: How do design and evaluation interrelate in HCI research? In: Proceedings of the 6th ACM Conference on Designing Interactive Systems, June 26-28, 2006, University Park, PA, USA (2006)
2. Dutoit, A.H., McCall, R., Mistrík, I., Paech, B. (eds.): Rationale Management in Software Engineering. Springer, Heidelberg (2006)
3. Horner, J., Atwood, M.E.: Design rationale: the rationale and the barriers. In: Proceedings of the 4th ACM Nordic Conference on Human-Computer Interaction: Changing Roles (2006)
4. Rittel, H., Webber, M.: Planning Problems are Wicked Problems. In: Cross, N. (ed.) Developments in Design Methodology, pp. 135–144. Wiley, Chichester (1984)
5. Brooks, F.P.: The Mythical Man-Month: Essays on Software Engineering. Addison-Wesley, Reading, MA (1995)
6. Simon, H.A.: The Sciences of the Artificial. MIT Press, Cambridge, MA (1996); Tenner, E.: Why Things Bite Back: Technology and the Revenge of Unintended Consequences. Knopf, New York (1996)
7. Conklin, E.J., Burgess-Yakemovic, K.C.: A process-oriented approach to design rationale. In: Moran, T.P., Carroll, J.M. (eds.) Design Rationale: Concepts, Techniques, and Use. L. Erlbaum Associates, Mahwah, NJ (1996)
8. Grudin, J.: Groupware and social dynamics: eight challenges for developers. Communications of the ACM 37(1), 92–105 (1994)
9. Gruber, T., Russell, D.: Generative Design Rationale: Beyond the Record and Replay Paradigm. In: Moran, T.P., Carroll, J.M. (eds.) Design Rationale: Concepts, Techniques, and Use. L. Erlbaum Associates, Mahwah, NJ (1996)
10. Grudin, J.: Evaluating opportunities for design capture. In: Moran, T.P., Carroll, J.M. (eds.) Design Rationale: Concepts, Techniques, and Use. L. Erlbaum Associates, Mahwah, NJ (1996)
11. Sharrock, W., Anderson, R.: Synthesis and Analysis: Five modes of reasoning that guide design. In: Moran, T.P., Carroll, J.M. (eds.) Design Rationale: Concepts, Techniques, and Use. L. Erlbaum Associates, Mahwah, NJ (1996)
12.
Polanyi, M.: The Tacit Dimension. Doubleday, Garden City, NY (1966)
13. Karsenty, L.: An empirical evaluation of design rationale documents. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 150–156. ACM Press, New York (1996)
14. Lee, J., Lai, K.: What's in design rationale? In: Moran, T.P., Carroll, J.M. (eds.) Design Rationale: Concepts, Techniques, and Use. L. Erlbaum Associates, Mahwah, NJ (1996)
15. MacLean, A., Young, R., Bellotti, V., Moran, T.: Questions, Options, Criteria: Elements of design space analysis. In: Moran, T.P., Carroll, J.M. (eds.) Design Rationale: Concepts, Techniques, and Use. L. Erlbaum Associates, Mahwah, NJ (1996)
16. Wilson, P.: Situational Relevance. Information Storage and Retrieval 9, 457–471 (1973)
17. Belkin, N.: Anomalous States of Knowledge as a Basis for Information Retrieval. Canadian Journal of Information Science 5, 133–143 (1980) 18. Davenport, T.H., Prusak, L.: Working Knowledge: How Organizations Manage What They Know. Harvard Business School Press, Boston, Massachusetts (1998) 19. Orlikowski, W.J., Hofman, J.D.: An Improvisational Model for Change Management: The Case of Groupware Technologies, Sloan Management Review/Winter, pp. 11–21 (1997) 20. Davenport, T.H.: Saving IT’s Soul: Human-Centered Information Management, Harvard Business Review: Creating a System to Manage Knowledge, 1994, product #39103, pp. 39–53 (1994) 21. Shipman, F., McCall, R.: Incremental Formalization with the Hyper-Object Substrate. ACM Transactions on Information Systems (1999)
HCI and the Face: Towards an Art of the Soluble

Christoph Bartneck1 and Michael J. Lyons2

1 Department of Industrial Design, Eindhoven University of Technology, Den Dolech 2, 5600 MB Eindhoven, The Netherlands
[email protected]
2 ATR Intelligent Robotics and Communication Labs, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan
[email protected]

Abstract. The human face plays a central role in most forms of natural human interaction, so we may expect that computational methods for analysis of facial information, and graphical and robotic methods for synthesis of faces and facial expressions, will play a growing role in human-computer and human-robot interaction. However, certain areas of face-based HCI, such as facial expression recognition and robotic facial display, have lagged others, such as eye-gaze tracking, facial recognition, and conversational characters. Our goal in this paper is to review the situation in HCI with regards to the human face, and to discuss strategies which could bring more slowly developing areas up to speed.

Keywords: face, hci, soluble, recognition, synthesis.
1 Introduction The human face is used in many aspects of verbal and non-verbal communication: speech, the facial expression of emotions, gestures such as nods, winks, and other human communicative acts. Subfields of neuroscience, cognitive science, and psychology are devoted to study of this information. Computer scientists and engineers have worked on the face in graphics, animation, computer vision, and pattern recognition. A widely stated motivation for this work is to improve human computer interaction. However, relatively few HCI technologies employ face processing (FP). At first sight this seems to reflect technical limitations to the development of practical, viable applications of FP technologies. This paper has two aims: (a) to introduce current research on HCI applications of FP, identifying both successes and outstanding issues, and (b) to propose, that an efficient strategy for progress could be to identify and approach soluble problems rather than aim for unrealistically difficult applications. While some of the outstanding issues in FP may indeed be as difficult as many unsolved problems in artificial intelligence, we will argue that skillful framing of a research problem can allow HCI researchers to pursue interesting, soluble, and productive research. For concreteness, this article will focus on the analysis of facial expressions from video input, as well as their synthesis with animated characters or robots. Techniques for automatic facial expression processing have been studied intensively in the pattern J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 20–29, 2007. © Springer-Verlag Berlin Heidelberg 2007
recognition community and the findings are highly relevant to HCI [1, 2]. Work on animated avatars may be considered to be mature [3], while the younger field of social robotics is expanding rapidly [4-6]. FP is a central concern in both of these fields, and HCI researchers can contribute to and benefit from the results.
2 HCI and the Face

Computer scientists and engineers have worked increasingly on FP, from the widely varying viewpoints of graphics, animation, computer vision, and pattern recognition. However, an examination of the HCI research literature indicates that activity is restricted to a relatively narrow selection of these areas. Eye gaze has occupied the greatest share of HCI research on the human face (e.g. [7]). Eye gaze tracking technology is now sufficiently advanced that several commercial solutions are available (e.g. Tobii Technology [8]). Gaze tracking is a widely used technique in interface usability, machine-mediated human communication, and alternative input devices. This area can be viewed as a successful sub-field of face-based HCI. Numerous studies have emphasized the neglect of human affect in interface design and argued that attention to it could have major impact on the human aspects of computing [9]. Accordingly, there has been much effort in the pattern recognition, AI, and robotics communities towards the analysis, understanding, and synthesis of emotion and expression. In the following sections we briefly introduce the areas related to analysis and synthesis, especially by robots, of facial expressions. In addition, we share insights on these areas gained during a workshop we organized on the topic.

2.1 Analysis: Facial Expression Classification

The attractive prospect of being able to gain insight into a user's affective state may be considered one of the key unsolved problems in HCI. It is known to be difficult to measure the "valence" component of affective state, as compared to "arousal", which may be gauged using biosensors. However, a smile or frown provides a clue that goes beyond physiological measurements. It is also attractive that expressions can be gauged non-invasively with inexpensive video cameras.
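As a toy illustration of how far even very crude measurements can go, the sketch below separates a smile from a frown using nothing but the vertical position of the mouth corners relative to the mouth centre. This is our own illustration, not an algorithm from the cited literature; it assumes facial landmarks have already been extracted by some upstream vision step, and the coordinates are invented (image convention: y grows downward).

```python
# Minimal sketch of a partial solution: a two-class smile/frown
# discriminator over pre-extracted mouth landmarks.  Landmark
# extraction itself is assumed to happen upstream; the coordinate
# values below are invented for illustration (y grows downward).

def smile_or_frown(left_corner, right_corner, mouth_center):
    """Corners above the mouth centre suggest a smile, below a frown."""
    corner_y = (left_corner[1] + right_corner[1]) / 2
    return "smile" if corner_y < mouth_center[1] else "frown"

print(smile_or_frown((40, 98), (60, 98), (50, 104)))    # corners raised
print(smile_or_frown((40, 110), (60, 110), (50, 104)))  # corners lowered
```

A real system would first have to locate the landmarks in video, but the point stands: a partial, two-class discriminator can already be useful to HCI researchers even though it is far from a general expression classifier.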
Automatic analysis of video data displaying facial expressions has become an active area of computer vision and pattern recognition research (for reviews see [10, 11]). The scope of the problem statement has, however, been relatively narrow. Typically one measures the performance of a novel classification algorithm on recognition of the basic expression classes proposed by Ekman and Friesen [12]. Expression data often consist of a segmented headshot taken under relatively controlled conditions, and classification accuracy is based on comparison with emotion labels provided by human experts. This bird's-eye caricature of the methodology used by the pattern recognition community is necessarily simplistic; however, it underlines two general reflections. First, pattern recognition has successfully framed the essentials of the facial expression problem to allow for effective comparison of algorithms. This narrowing of focus has led to impressive development of the techniques for facial expression analysis and substantial understanding. Second, the narrow framing of the
FP problem typical in computer vision and pattern recognition may not be appropriate for HCI problems. This observation is a main theme of this paper, and we suggest that progress on the use of FP in HCI may require re-framing the problem. Perhaps the most salient aspect of our second general observation on the problem of automatic facial expression recognition is that HCI technology can often get by with partial solutions. A system that can discriminate between a smile and a frown, but not an angry versus a disgusted face, can still be a valuable tool for HCI researchers, even if it is not regarded as a particularly successful algorithm from the pattern recognition standpoint. Putting this more generally, components of algorithms developed in the pattern recognition community may already have sufficient power to be useful in HCI, even if they do not yet constitute general facial expression analysis systems. Elsewhere in this paper we give several examples to back up this statement.

2.2 Synthesis: Robotic Facial Expressions

There is a long tradition within the HCI community of investigating and building screen-based characters that communicate with users [3]. Recently, robots have also been introduced to communicate with users, and this area has progressed sufficiently that some review articles are available [4, 6]. The main advantage that robots have over screen-based agents is that they are able to directly manipulate the world. They not only converse with users, but also perform embodied physical actions. Nevertheless, screen-based characters and robots share overlapping motivations for, and problems with, communicating with users. Bartneck et al. [13] have shown, for example, that there is no significant difference in users' perception of emotions as expressed by a robot or a screen-based character. The main motivation for using facial expressions to communicate with a user is that it is, in fact, impossible not to communicate.
If the face of a character or robot remains inert, it communicates indifference. To put it another way, since humans are trained to recognize and interpret facial expressions, it would be wasteful to ignore this rich communication channel. Compared to the state of the art in screen-based characters, such as Embodied Conversational Agents [3], however, the field of robotic facial expressions is underdeveloped. Much attention has been paid to robot motor skills, such as locomotion and gesturing, but relatively little work has been done on facial expression. Two main approaches can be observed in the fields of robotics and screen-based characters. In one camp are researchers and engineers who work on the generation of highly realistic faces. A recent example of a highly realistic robot is the Geminoid H1, which has 13 degrees of freedom (DOF) in its face alone. The annual Miss Digital award [14] may be thought of as a benchmark for the development of this kind of realistic computer-generated face. While significant progress has been made in these areas, we have not yet reached human-like detail and realism, and this is acutely true for the animation of facial expressions. Hence, many highly realistic robots and characters currently struggle with the phenomenon of the "Uncanny Valley" [15], with users experiencing these artificial beings as spooky or unnerving. Even
the Repliee Q1Expo is only able to convince humans of the naturalness of its expressions for at best a few seconds [16]. In summary, natural robotic expressions remain in their infancy [6]. Major obstacles to the development of realistic robots lie with the actuators and the skin. At least 25 muscles are involved in expression in the human face. These muscles are flexible, small, and can be activated very quickly. Electric motors emit noise, while pneumatic actuators are difficult to control. These problems often result in robotic heads that either have a small number of actuators or a somewhat larger-than-normal head. The Geminoid H1 robot, for example, is approximately five percent larger than its human counterpart. It also remains difficult to attach skin, which is often made of latex, to the head. This results in unnatural and non-human-looking wrinkles and folds in the face. At the other end of the spectrum, there are many researchers who are developing more iconic faces. Bartneck [17] showed that a robot with only two DOF in the face can produce a considerable repertoire of emotional expressions that make interaction with the robot more enjoyable. Many popular robots, such as Asimo [18], Aibo [19] and PaPeRo [20], have only a schematic face with few or no actuators. Some of these feature only LEDs for creating facial expressions. The recently developed iCat robot is a good example of an iconic robot with a simple physically animated face [21]. The eyebrows and lips of this robot move, allowing synthesis of a wide range of expressions. More general and fundamental unsolved theoretical aspects of facial information are also relevant to the synthesis of facial expressions. The representation of the space of emotional expressions is a prime example [22]. The space of expressions is often modeled either with continuous dimensions, such as valence and arousal [23], or with a categorical approach [12].
This controversial issue has broad implications for all HCI applications involving facial expression [22]. The same can be said for other fundamental aspects of facial information processing, such as the believability of synthetic facial expressions by characters and robots [5, 24].
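To make the two representations concrete: a categorical model labels an expression with one of a fixed set of emotions [12], while a dimensional model places it in a continuous valence/arousal plane [23], and the two can be bridged by treating each category as a prototype point in that plane. The sketch below is our own illustration; the coordinates assigned to each category are invented for the example, not values from the literature.

```python
# Sketch relating the two representations discussed above: emotion
# categories [12] as prototype points in a continuous valence/arousal
# plane [23].  The coordinates are illustrative guesses, not values
# taken from the cited literature.
import math

CATEGORY_COORDS = {          # (valence, arousal), each in [-1, 1]
    "happiness": (0.8, 0.5),
    "sadness":   (-0.7, -0.4),
    "anger":     (-0.6, 0.7),
    "fear":      (-0.7, 0.6),
    "surprise":  (0.2, 0.8),
    "disgust":   (-0.6, 0.2),
}

def nearest_category(valence, arousal):
    """Map a point in the continuous space to its closest category."""
    return min(CATEGORY_COORDS,
               key=lambda c: math.dist((valence, arousal), CATEGORY_COORDS[c]))

print(nearest_category(0.7, 0.4))   # close to the happiness prototype
```

Mapping a continuous estimate to its nearest prototype is one simple way a dimensional analysis component could drive a categorical display, such as a robot face with a fixed repertoire of expressions.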
3 Workshop on "HCI and the Face"

As part of our effort to examine the state of the field of FP in HCI, we organized a day-long workshop at the ACM CHI'2006 conference (see http://www.bartneck.de/workshop/chi2006/ for details). The workshop included research reports, focus groups, and general discussions, and it has informed our perspective on the role of FP in HCI, as presented in the current paper. One focus group summarized the state of the art in facial expression analysis and synthesis, while another brainstormed HCI applications. The idea was to examine whether current technology is sufficiently advanced to support HCI applications. The proposed applications were organized with regard to the factors "Application domain" and "Intention" (Table 1). Group discussion seemed to naturally focus on applications that involve some type of agent, avatar or robot. It is nearly impossible to provide an exhaustive list of applications for each field in the matrix; the ones listed in the table should therefore be considered only as representative examples.
Table 1. Examples of face processing applications in HCI and HRI

Application domain | Persuade | Being a companion | Educate
Entertainment | Advertisement: REA [3], Greta [25] | Aibo [19], Tamagotchi [26], My Real Baby [27] | -
Communication | Persuasive Technology [28], Cat [29] | Avatar [30] | Language tutor [31]
Health | Health advisor, Fitness tutor [32] | Aibo for elderly [33], Paro [29], Attention Capture for Dementia Patients [34] | Autistic children [35]
These examples illustrate a fundamental problem of this research field well. The workshop participants can be considered experts in the field, and all the proposed example applications were related to artificial characters, such as robots, conversational agents and avatars. Yet not one of these applications has become a lasting commercial success. Even Aibo, the previously somewhat successful entertainment robot, was discontinued by Sony in 2006. A problem that all these artificial entities have to deal with is that, while their expression processing has reached an almost sufficient maturity, their intelligence has not. This is especially problematic, since the mere presence of an animated face raises the expectation levels of its user. An entity that is able to express emotions is also expected to recognize and understand them. The same holds true for speech: if an artificial entity talks, then we also expect it to listen and understand. As we all know, no artificial entity has yet passed the Turing test or claimed the Loebner Prize. All of the examples given in Table 1 presuppose the existence of a strong AI as described by John Searle [36]. The reasons why strong AI has not yet been achieved are manifold and the topic of lengthy discussion. Briefly, then, there are, from the outset, conceptual problems. Searle [36] pointed out that a digital computer alone can never truly understand reality because it only manipulates syntactical symbols that do not contain semantics. The famous 'Chinese room' argument points out some conceptual constraints on the development of strong AI. According to his line of argument, IBM's chess-playing computer "Deep Blue" does not actually understand chess. It may have beaten Kasparov, but it does so only by manipulating meaningless symbols.
Drew McDermott [37] replied to this criticism: "Saying Deep Blue doesn't really think about chess is like saying an airplane doesn't really fly because it doesn't flap its wings." This debate reflects different philosophical viewpoints on what it means to think and understand. For centuries philosophers have thought about such questions, and perhaps the most important conclusion is that there is no conclusion at this point in time. Similarly, the possibility of developing a strong AI remains an open question. All the same, it must be admitted that some kind of progress has been made.
In the past, a chess-playing machine would have been regarded as intelligent. But now it is regarded as the feat of a calculating machine; our criteria for what constitutes an intelligent machine have shifted. In any case, suffice it to say that no sufficiently intelligent machine has yet emerged that would provide a foundation for the example applications given in Table 1. The point we hope to have made with this digression into AI is that the application dreams of researchers sometimes conceal rather unrealistic assumptions about what is possible to achieve with current technology.
4 Towards an "Art of the Soluble"

The outcome of the workshop we organized was unexpected in a number of ways. Most striking was the vast mismatch between the concrete and fairly realistic description of the available FP technology and its limitations arrived at by one of the focus groups, and the blue-sky applications discussed by the second group. Another sharp contrast was evident at the workshop: the actual presentations given by participants were pragmatic and showed effective solutions to real problems in HCI without relying on AI. This led us to the reflection that scientific progress often relies on what the Nobel Prize-winning biologist Peter Medawar called "The Art of the Soluble" [38]. That is, skill in doing science requires the ability to select a research problem which is soluble, but which has not yet been solved. Very difficult problems such as strong AI may not yield to solution over the course of decades, so for most scientific problems it is preferable to work on problems of intermediate difficulty, which can yield results over a more reasonable time span while still being of sufficient interest to constitute progress. Some researchers of course are lucky or insightful enough to re-frame a difficult problem in such a way as to reduce its difficulty, or to recognize a new problem which is not difficult, but nevertheless of wide interest. In the next two subsections we illustrate the general concept with examples from robotic facial expression synthesis as well as facial expression analysis.

4.1 Facial Expression Synthesis in Social Robotics

As we argued in section 2, the problems inherited by HRI researchers from the field of AI can be severe. Even if we neglect philosophical aspects of the AI problem and are satisfied with a computer that passes the Turing test, independently of how it achieves this, we will still encounter many practical problems.
This leads us to the so-called “weak AI” position, in which claims of achieving human cognitive abilities are abandoned. Instead, this approach focuses on specific problem-solving or reasoning tasks. There has certainly been progress in weak AI, but it has not yet matured sufficiently to support artificial entities. Indeed, at present, developers of artificial entities must resort to scripting behaviors. Clearly, the scripting approach has its limits, and even the most advanced common sense database, Cyc [39], is largely incomplete. FP should therefore not bet on the arrival of strong AI solutions, but focus on what weak AI solutions can offer today. Of course there is still hope that strong AI applications will eventually become possible, but this may take a long time.
26
C. Bartneck and M.J. Lyons
Fig. 1. Robots with animated faces
When we look at what types of HRI solutions are currently being built, we see that a large number of them barely have any facial features at all. Qrio, Asimo and Hoap-2, for example, are only able to turn their heads with 2 degrees of freedom (DOF). Other robots, such as Aibo, are able to move their heads, but have only LEDs to express their inner states in an abstract way. While these robots are intended to interact with humans, they certainly avoid facial expression synthesis. When we look at robots that do have truly animated faces, we can distinguish between two dimensions: DOF and iconic/realistic appearance (see Figure 1). Robots in the High DOF/Realistic quadrant not only have to contend with the uncanny valley [40], they also may raise user expectations of a strong AI which they are not able to fulfill. By contrast, the Low DOF/Iconic quadrant includes robots that are extremely simple and perform well in their limited application domain. These robots lie well within the domain of the soluble in FP. The most interesting quadrant is the High DOF/Iconic quadrant. These robots have rich facial expressions but avoid evoking associations with a strong AI through their iconic appearance. We propose that research on such robots has the greatest potential for significant advances in the use of FP in HRI.

4.2 Facial Analysis for Direct Gesture-Based Interaction

The second example we use to illustrate the “Art of the Soluble” strategy comes from the analysis of facial expressions. While there is a large body of work on automatic
HCI and the Face: Towards an Art of the Soluble
27
facial expression recognition and lip reading within the computer vision and pattern recognition research communities, relatively few studies have examined the possible use of the face in direct, intentional interaction with computers. However, the complex musculature of the face and extensive cortical circuitry devoted to facial control suggest that motor actions of the face could play a complementary or supplementary role to that played by the hands in HCI [1]. One of us has explored this idea through a series of projects using vision-based methods to capture movement of the head and facial features and use these for intentional, direct interaction with computers. For example, we have used head and mouth motions for the purposes of hands-free text entry and single-stroke text character entry on small keyboards such as those found on mobile phones. Related projects used action of the mouth and face for digital sketching and musical expression. One of the systems we developed tracked the head and the position of the nose, and mapped the projected position of the nose tip in the image plane to the coordinates of the cursor. Another algorithm segmented the area of the mouth and measured the visible area of the cavity of the user’s mouth in the image plane. The state of opening/closing of the mouth could be determined robustly and used in place of mouse-button clicks. This simple interface allowed for text entry using the cursor to select streaming text. Text entry was started and paused by opening and closing the mouth, while selection of letters was accomplished by small movements of the head. The system was tested extensively and found to permit comfortable text entry at a reasonable speed. Details are reported in [41]. Another project used the shape of the mouth to disambiguate the multiple letters mapped to the keys of a cell phone key pad [42].
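The head-and-mouth pointing scheme described above can be sketched roughly as follows. This is an illustrative reconstruction, not the implementation reported in [41]: the thresholds, the hysteresis scheme, and the function names are our assumptions, and the per-frame nose tracking and mouth segmentation are treated as given inputs.

```python
# Hypothetical sketch of the vision-based pointing scheme described in the text:
# a tracked nose-tip position drives the cursor, and the visible mouth-cavity
# area (from a segmentation step, not shown) acts as a mouse button.
# All numeric parameters are illustrative assumptions.

def nose_to_cursor(nose_xy, frame_size, screen_size):
    """Map the nose tip's image-plane position to screen coordinates."""
    nx, ny = nose_xy
    fw, fh = frame_size
    sw, sh = screen_size
    # Mirror horizontally so moving the head to the left moves the cursor left.
    return ((fw - nx) * sw / fw, ny * sh / fh)

class MouthClicker:
    """Turn the mouth-cavity area into a robust pressed/released state.

    Hysteresis (separate open/close thresholds) prevents the state from
    flickering when the measured area hovers near a single boundary.
    """

    def __init__(self, open_area=400.0, close_area=200.0):
        self.open_area = open_area    # area above which the mouth counts as open
        self.close_area = close_area  # area below which it counts as closed again
        self.pressed = False

    def update(self, cavity_area):
        if not self.pressed and cavity_area > self.open_area:
            self.pressed = True
        elif self.pressed and cavity_area < self.close_area:
            self.pressed = False
        return self.pressed
```

In a real system the nose position and cavity area would be re-estimated every camera frame; the hysteresis band is one simple way to obtain the robust open/closed decision the text describes.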
Such an approach works very well for Japanese, which has a nearly strict CV (consonant-vowel) phoneme structure and only five vowels. A strength of this system was that it exploited users’ existing expertise in shaping the mouth to select vowels. With some practice, users found they could enter text faster than with the standard multi-tap approach. The unusual idea of using facial actions for direct input may find the least resistance in the realm of artistic expression. Indeed, our first explorations of the concept were with musical controllers using mouth shape to control timbre and other auditory features [43]. Of course, since many musical instruments rely on action of the face and mouth, this work has precedents, and it was greeted with enthusiasm by some musicians. Similarly, we used a mouth action-sensitive device to control line properties while drawing and sketching with a digital tablet [44]. Here again our exploration elicited a positive response from artists who tried the system. The direct action facial gesture interface serves to illustrate the concept that feasible FP technology is ready to be used as the basis for working HCI applications. The techniques used in all the examples discussed are not awaiting the solution of some grand problem in pattern recognition: they work robustly in real-time under a variety of lighting conditions.
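The keypad-disambiguation idea can be sketched as follows. This is an illustrative reconstruction of the concept in [42], not its actual implementation: a key press selects a consonant row of the Japanese syllabary, and the vowel detected from the mouth shape selects the column, so one press replaces multi-tap. The row table follows the standard Japanese keypad layout; the vowel detection itself (from mouth-shape features) is assumed to be provided by a separate vision component.

```python
# Illustrative sketch of mouth-shape disambiguation for phone-keypad text entry,
# in the spirit of MouthType [42]. With Japanese's nearly strict CV structure,
# one key press (consonant row) plus one detected vowel (mouth shape) uniquely
# identifies a kana, shown here in romanized form. Only the first three rows of
# the standard keypad layout are included for brevity.

KEY_ROWS = {
    "1": {"a": "a",  "i": "i",   "u": "u",  "e": "e",  "o": "o"},   # vowel row
    "2": {"a": "ka", "i": "ki",  "u": "ku", "e": "ke", "o": "ko"},  # ka row
    "3": {"a": "sa", "i": "shi", "u": "su", "e": "se", "o": "so"},  # sa row
}

def kana_for(key: str, detected_vowel: str) -> str:
    """Combine a key press (consonant row) with the mouth-shape vowel (column)."""
    return KEY_ROWS[key][detected_vowel]
```

One key press plus one mouth shape per character, compared with up to five presses per character under multi-tap, is the source of the speed advantage reported in the text.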
5 Conclusion

In this paper we have argued in favour of an “Art of the Soluble” approach in HCI. Progress can often be made by sidestepping long-standing difficult issues in artificial
intelligence and pattern recognition. This is partly intrinsic to HCI: the presence of a human user of the system being developed implies leverage for existing computational algorithms. Our experience, and the discussions that led to this article, have also convinced us that HCI researchers tend towards an inherently pragmatic approach, even if they are not always conscious of this fact. In summary, we would like to suggest that skill in identifying soluble problems is already a relative strength of HCI, and that this is something worth developing further.
References

[1] Lyons, M.J.: Facial Gesture Interfaces for Expression and Communication. In: IEEE International Conference on Systems, Man and Cybernetics, The Hague (2004)
[2] Lyons, M.J., Budynek, J., Akamatsu, S.: Automatic Classification of Single Facial Images. IEEE PAMI 21, 1357–1362 (1999)
[3] Cassell, J., Sullivan, J., Prevost, S., Churchill, E.: Embodied Conversational Agents. MIT Press, Cambridge (2000)
[4] Bartneck, C., Okada, M.: Robotic User Interfaces. In: HC2001, Aizu (2001)
[5] Bartneck, C., Suzuki, N.: Subtle Expressivity for Characters and Robots. International Journal of Human Computer Studies 62, 306 (2004)
[6] Fong, T., Nourbakhsh, I., Dautenhahn, K.: A survey of socially interactive robots. Robotics and Autonomous Systems 42, 143–166 (2003)
[7] Zhai, S., Morimoto, C., Ihde, S.: Manual and gaze input cascaded (MAGIC) pointing. In: ACM CHI ’99 (1999)
[8] Tobii Technology (2007), retrieved February 2007 from http://www.tobii.com/
[9] Picard, R.W.: Affective Computing. MIT Press, Cambridge (1997)
[10] Pantic, M., Rothkrantz, L.J.M.: Automatic analysis of facial expressions: the state of the art. IEEE PAMI 22, 1424–1445 (2000)
[11] Fasel, B., Luettin, J.: Automatic facial expression analysis: a survey. Pattern Recognition 36, 259–275 (2003)
[12] Ekman, P., Friesen, W.V.: Unmasking the Face. Prentice-Hall, Englewood Cliffs (1975)
[13] Bartneck, C., Reichenbach, J., Breemen, A.: In your face, robot! The influence of a character’s embodiment on how users perceive its expressions. In: Design and Emotion (2004)
[14] Cerami, F.: Miss Digital World (2006), retrieved August 4th from http://www.missdigitalworld.com/
[15] Mori, M.: The Uncanny Valley. Energy 7(4), 33–35 (1970)
[16] Ishiguro, H.: Towards a new cross-interdisciplinary framework. In: CogSci Workshop Towards Social Mechanisms of Android Science, Stresa (2005)
[17] Bartneck, C.: Interacting with an Embodied Emotional Character. In: Design for Pleasurable Products Conference (DPPI2004), Pittsburgh (2003)
[18] Honda: Asimo (2002), retrieved from http://www.honda.co.jp/ASIMO/
[19] Sony: Aibo (1999), retrieved January 1999 from http://www.aibo.com
[20] NEC: PaPeRo (2001), retrieved from http://www.incx.nec.co.jp/robot
[21] Breemen, A., Yan, X., Meerbeek, B.: iCat: an animated user-interface robot with personality. In: 4th Intl. Conference on Autonomous Agents & Multi Agent Systems (2005)
[22] Schiano, D.J.: Categorical Imperative NOT: Facial Affect is Perceived Continuously. In: ACM CHI 2004 (2004)
[23] Russell, J.A.: Affective space is bipolar. Journal of Personality and Social Psychology 37, 345–356 (1979)
[24] Bartneck, C.: How convincing is Mr. Data’s smile: Affective expressions of machines. User Modeling and User-Adapted Interaction 11, 279–295 (2001)
[25] Pelachaud, C.: Multimodal expressive embodied conversational agents. In: Proceedings of the 13th Annual ACM International Conference on Multimedia (2005)
[26] Bandai: Tamagotchi (2000), retrieved from http://www.bandai.com/
[27] Lund, H.H., Nielsen, J.: An Edutainment Robotics Survey. In: 3rd Intl. Symposium on Human and Artificial Intelligence Systems (2002)
[28] Fogg, B.J.: Persuasive Technology: Using Computers to Change What We Think and Do. Morgan Kaufmann, Amsterdam, Boston (2003)
[29] Zanbaka, C., Goolkasian, P., Hodges, L.: Can a virtual cat persuade you? The role of gender and realism in speaker persuasiveness. In: ACM CHI 2006 (2006)
[30] Biocca, F.: The cyborg’s dilemma: embodiment in virtual environments. In: 2nd Intl. Conference on Cognitive Technology – Humanizing the Information Age (1997)
[31] Schwienhorst, K.: The State of VR: A Meta-Analysis of Virtual Reality Tools in Second Language Acquisition. Computer Assisted Language Learning 15, 221–239 (2002)
[32] Mahmood, A.K., Ferneley, E.: Can Avatars Replace the Trainer? A Case Study Evaluation. In: International Conference on Enterprise Information Systems (ICEIS), Porto (2004)
[33] Tamura, T., Yonemitsu, S., Itoh, A., Oikawa, D., Kawakami, A., Higashi, Y., Fujimoto, T., Nakajima, K.: Is an entertainment robot useful in the care of elderly people with severe dementia? Journals of Gerontology, Series A 59, M83–M85 (2004)
[34] Wiratanaya, A., Lyons, M.J., Abe, S.: An interactive character animation system for dementia care. Research poster, ACM SIGGRAPH (2006)
[35] Robins, B., Dautenhahn, K., te Boekhorst, R., Billard, A.: Robotic Assistants in Therapy and Education of Children with Autism: Can a Small Humanoid Robot Help Encourage Social Interaction Skills? Universal Access in the Information Society (UAIS) 4(2), 1–20. Springer, Heidelberg (2005)
[36] Searle, J.R.: Minds, brains and programs. Behavioral and Brain Sciences 3, 417–457 (1980)
[37] McDermott, D.: Yes, Computers Can Think. New York Times (1997)
[38] Medawar, P.B.: The Art of the Soluble. Methuen, London (1967)
[39] Cycorp: Cyc (2007), retrieved February 2007 from http://www.cyc.com/
[40] MacDorman, K.F.: Subjective ratings of robot video clips for human likeness, familiarity, and eeriness: An exploration of the uncanny valley. In: ICCS/CogSci-2006 (2006)
[41] de Silva, G.C., Lyons, M.J., Kawato, S., Tetsutani, N.: Human Factors Evaluation of a Vision-Based Facial Gesture Interface. In: IEEE CVPR (2003)
[42] Lyons, M.J., Chan, C., Tetsutani, N.: MouthType: Text Entry by Hand and Mouth. In: ACM CHI 2004 (2004)
[43] Lyons, M.J., Tetsutani, N.: Facing the Music: A Facial Action Controlled Musical Interface. In: ACM CHI 2001 (2001)
[44] Chan, C., Lyons, M.J., Tetsutani, N.: MouthBrush: Drawing and Painting by Hand and Mouth. In: ACM ICMI-PUI 2003 (2003)
Towards Generic Interaction Styles for Product Design Jacob Buur and Marcelle Stienstra Mads Clausen Institute for Product Innovation, University of Southern Denmark Grundtvigs Allé 150, 6400 Sønderborg, Denmark {buur, marcelle}@mci.sdu.dk
Abstract. A growing uneasiness among users with the experience of current product user interfaces puts pressure on interaction designers to innovate user interface conventions. In previous research we have shown that a study of the history of product interaction triggers a broader discussion of interaction qualities among designers in a team, and that the naming of interaction styles helps establish an aesthetics of interaction design. However, that research focused on one particular product field, namely industrial controllers, and it remained to be proven whether interaction styles have generic traits across a wider range of interactive products. In this paper we report on five years of continued research into interaction styles for telephones, kitchen equipment, HiFi products and medical devices, and we show how it is indeed possible and beneficial to formulate a set of generic interaction styles.

Keywords: Interaction styles, interaction history, product design, user interface design, tangible interaction, quality of interaction.
1 Introduction

Across many types of products — including cameras, radios and telephones — we can identify similar patterns of development over the 20th century. Some aspects of the historical development are clear: functions proliferate, products become smaller and more mobile, and computing power increases. Other aspects are perhaps less obvious: the form and function of the product are increasingly separated, and the interaction with the product becomes less and less physical. Early 20th century products express their core functionality through their forms and the actions required to access their functionality. In later products, the relationship between form, interaction and function becomes increasingly arbitrary as meaningful physical components are encased, replaced by electronics and become microprocessor-controlled. The user appears to lose touch with the functionality of the product as the interface layer in between them grows more complex and opaque. This is the major challenge for user interaction design today – a challenge much greater than simply creating ’user-friendly’ interfaces: to convey expression through form and interaction. IT products never come ‘neutral’: they always embody a style and certain values. In architecture and industrial design, the concept of style is used to achieve an aesthetic coherence with the predominant thinking in society. Style plays a major role

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 30–39, 2007. © Springer-Verlag Berlin Heidelberg 2007
in education as a way of explaining the historical inheritance and debating the difference between alternative design solutions. Since user interaction design shares many characteristics with industrial design, we claim that interaction design can benefit greatly from an understanding of the concept of style. It can provide designers with strong visions and a sense of direction in designing new user interfaces. In particular we focus on user interface design for physical IT-products with small displays and dedicated keys, because of the tight coupling of interaction design and industrial design. The design of such user interfaces seems largely governed by technological progress, and to a large extent they seem to inherit user interface principles from the computer world, just one generation delayed. Human-Computer Interaction (HCI) interface principles were designed for full keyboard and mouse operation; they therefore become much more cumbersome with a tiny display and a limited number of keys. And in particular, when moving away from buttons and screen to forms of tangible interaction, HCI principles fall short of providing much help. We are concerned that interaction designers, in their enthusiasm for new technologies, fail to transfer the qualities of use which were achieved with previous technologies. It is, however, pointless to exactly copy products of the past, as society’s needs and values have changed and technology has moved on. But we argue that it is possible to use the interaction style of a particular period as inspiration for an innovative blend of interaction style, functionality and technology within a contemporary interaction design. In this way we may be able to preserve qualities of interaction otherwise lost in history.
In previous research we have shown that a study of the history of product interaction triggers a deep discussion of interaction qualities among designers in a team, and that the naming of interaction styles helps establish an aesthetics of interaction design [1, 2]. Since then we have expanded our research from industrial controllers to a broader range of interactive products including telephones, kitchen equipment, HiFi products and medical devices, and in this paper we will show that it is possible and beneficial to formulate a set of generic interaction styles for interactive products. Our research is based on two types of investigations:

1. Historical analysis of interactive products. We identify and characterise style eras for each of five product fields, then compare the style eras across the fields.
2. Design experiments (research-through-design). We exaggerate the qualities found in historic eras, but implement them with contemporary technology, e.g., a mobile phone with the interaction experience of a 1930 rotary dial telephone. We then analyse all design experiments across the five product fields to identify core dilemmas in current interaction design.

Based on these investigations we propose a set of four generic interaction styles.
2 Interaction Styles in History

The concept of style has been the focus of much debate within all genres of art, from literature and visual arts to architecture and design. In recent decades, emphasis has shifted from understanding style as a method of categorisation based on particular
conventions of content and norms to an understanding that styles are defined within social groups and essentially dynamic both in form and function [3, 4]. In a relatively new field such as interaction design, discussions about style have only started recently, e.g., [2, 5, 6]. Style has been used for different purposes: to classify products and systems [6], but also to serve as an inspiration to create a specific look and feel [7]. In this paper we focus mainly on the latter approach. In our understanding of style, the following concepts are important: ‘network of norms’, ‘style marker’, and ‘interpretation community’. Essential to style is, as Merleau-Ponty explains in [5], the fact that perception – which lies at the basis of stylization – ‘cannot help but to constitute and express a point of view’. Stylization thus starts the moment we perceive an object and is an individual activity: it depends very much upon the person (his/her competences, references and experiences) and the context in which the stylization takes place. We compare the object with similar objects based on, for example, function and usage. Essential to this systematic activity is the existence of a given system, which Enkvist calls the ‘network of norms’ [8]: ‘a compilation of prior experiences with objects into a style taxonomy that makes it possible to find correspondences, both differences and similarities, between new objects and previous norms’. Enkvist observes that all style experiences arise from comparison. The comparison of artefacts that we see as similar lets us identify ‘style markers’, i.e. elements in the products that significantly correlate with or deviate from prevailing norms of design [8]. Our investigation covers five product genres: industrial controllers (in collaboration with Danfoss), telephones (Nokia), kitchen equipment, HiFi equipment (Bang & Olufsen), and medical devices (Novo Nordisk).
We organised the style study as a yearly 2-week seminar for graduate design students with a new product genre each year. Each seminar included literature search, museum studies, interviews with curators, and videotaping of interactions with historic products. Product collections in museums provide a good opportunity to engage in the comparison activity. With groups of 16-20 students (our ‘interpretation community’) we were able to cover 2-3 museums for each product genre; typically a combination of a science museum and a private company collection. To ensure a broad view of styles, we split into 3-5 teams, each with a particular focus of study:

• Society context: What is the dominating view on humans and technology?
• Hands and skills: What movements and skills are required to interact?
• Technology: What main technologies are employed for functions and manufacturing?
• Company spirit: What is the dominating self-image of the manufacturer?

Based on the collected data we divided a timeline into eras of a dominating ‘style’, characterised each style era, created an appropriate set of style names, and produced collage posters to communicate our findings (see fig. 1). The posters then served as input for the ensuing design experiments (described in section 3). The naming of style eras posed a particular challenge. Where style labels in the history of architecture and design typically spring out of the style discourse of the period (e.g., De Stijl, Dada, Art Deco) and the origins of dominant pieces of art (e.g., Bauhaus, Pop Art, Swiss Style), the discussion of user interaction experience is rather
Fig. 1. Four posters describing eras of telephone interaction styles, each covering the dominant aspects of community, hand movements & skills, knowledge allocation, technology & design
recent. So for each product genre we were in the unique situation of simultaneously discussing and naming all style eras of the 20th century.1 It became clear to us that interaction style names need to point to people, interaction purpose and experience, rather than to the visual identity of user interfaces (buttons, knobs, sliders). Thus we chose labels like Routine Caller (30s-70s telephones), Food Processor Queen (50s-70s kitchen equipment) and Analog Professional (60s-70s industry controllers) – rather flamboyant names to trigger the imagination of interaction designers. The naming discussions were long and intensive, because they seemed to condense the many observations and interpretations. In the first seminars we left the naming to a small sub-group to form consensus more easily. But later we realised that this discussion may well be the core of forming a shared style understanding, as it contributes exactly to the development of the ‘network of norms’. The naming discussion seemed precisely to foster the building of shared norms in the investigation team. According to Engholm [5]: ‘the stylization will always depend on the discursive context that one is part of and on one’s historical, cultural or technical competence’. Therefore it was even more important that all students were involved, not just a small group. In addition to the naming activity, the style posters worked exceedingly well as a format to synthesize what we had seen at the museums. The graduate students took pride in their work and the style period labels quickly became part of the language repertoire in discussions. 1
One can argue that the harnessing of electricity also gave birth to the field of user interface design as we know it today. We therefore focused our study on products invented at the end of the 19th century and the start of the 20th.
Fig. 2. Four interaction style eras presented in the form of a ‘style book’. The style eras have been generated after analyzing the timelines for all five product genres. The ‘operation’ of the pages symbolizes the main mode of interaction of the respective eras: turning, sliding, clicking, and brushing.
When we align the style timelines for all five product genres, it is obvious that they share similar developments, although for some genres a new era may arrive earlier than for others (see fig. 3). To stress this we have indicated rather sharp transitions between one style and the next (see fig. 2); in reality the eras should be seen as waves with large overlaps. One would expect this similarity, as all products in an overall sense are embedded in the same societal discourses and draw on the same technological inventions. We have, however, refrained from coining composite style labels, as we feel that collapsing the specific product style names would result in names too abstract to carry sufficient imaginative power.
3 Designing with Interaction Styles

To test the power of interaction style thinking, we challenged our graduate students to design contemporary digital products that would incorporate the interaction experience of each of the style eras studied: a mobile phone, a microwave oven, a motor controller. By keeping the specifications constant across styles we were able to compare experiences (see fig. 4). This led to a large collection of design samples and many challenging discussions, such as: how would you send an SMS with the feeling of an old-time crank? Or how would you use the rotary dialing motion in a portable telephone?
Fig. 3. Comparison of interaction style studies for five different product genres
Fig. 4. Four mobile phones inspired by respectively the Magic Connector period, the Routine Caller period, the Life Chatter period and the Information Navigator period (see also fig. 2)
Based on the design experiments we claim that interaction style thinking indeed helps designers to increase their sensitivity to experience issues and break with user interface conventions. We support this by three observations. Firstly, we observed a fine spread of interaction qualities in the designs that the graduate students produced following the history style studies. Along the way some teams found it very difficult to let go of their preoccupation with button and display technology, and some simply copied user interface components of the past. In the end, however, all teams created designs that support rich actions and established convincing links between actions and functions. Secondly, the graduate students were able to compare their designs to exemplars in history and, most importantly, they were explicit about the expression of interaction they wanted to support. They demonstrated in their presentations that they had established a shared understanding of different interaction styles based in history and the respective qualities of each style. Thirdly, the students themselves were positive about the interaction style thinking compared to their prior experiences. One student, for instance, expressed his surprise about the richness of interaction history: »Inspiration from the past is like going to the beach - there is so much more to find.« Another one adds: »We did suffer from preconceptions. We think we know all about telephones already.« Our next step was to see if by looking at the designs themselves we were able to abstract generic characteristics. We analyzed four motor controller designs, four mobile phones, and five microwave ovens. In order to reduce the risk of a circular argument (that what we learn from the designs only confirms what we knew already from the historical study), we added a set of 10 MP3 player designs.
This assignment did not explicitly refer to historical interaction styles, but required students to design a new interface for an iShuffle-inspired MP3 player (no screen, very simple functionality) that would support rich interaction, bodily engagement and the expressiveness of product movements [9]. For clarity reasons only 17 of the 23 designs appear on the clustering diagram in fig. 5. The analysis helped us explicate two dilemmas in current (tangible) interaction design. One, the designs seem to support either an explanatory or an exploratory mode of interacting. The ‘explanatory’ designs provide a direct link between the goal you want to achieve, and how to get there. Every step is explainable: there is a feeling of being in control. The ‘exploratory’ designs, on the other hand, are less ‘serious’ in that they support a playful building of interaction skills, where the goal may be less important than the action itself.
Fig. 5. Visual comparison of 17 student designs; 4 motor controllers, 4 telephones, 5 microwaves and 4 MP3 players. Only the MP3 players were not designed using the history styles explicitly as inspiration.
Two, there seems to be an important distinction between discrete and composite interface designs, or – put very bluntly – simple and complex products. The ‘discrete’ interaction designs favor one control for each function they offer (think of old radios with different buttons to choose wave lengths, sliders to select radio channels, knobs to adjust volume, treble, bass). Products with ‘composite’ interaction have general controls to access different, layered functions (think of the keypad on mobile phones).
4 Towards Generic Interaction Styles

Looking back at the interaction styles in history and comparing them with the designs made by students, we argue that it is possible to extract four generic and contemporary interaction styles from the presented material. We take inspiration from the work of Maaß & Oberquelle [10], who proposed four perspectives to explain differences in how designers conceive of the computer in relation to its users: the system perspective, the information processor perspective, the workshop perspective and the media perspective. We propose four interaction styles for interactive products, characterized as follows:

• Tangible Control (discrete, explanatory): the product exhibits its function through its design; the interface consists of several discrete controls; the spatial arrangement of the controls supports the understanding of the product; the
interaction takes place where the product is placed. This style supports the view that interactive technology is a tool that people employ to achieve a certain, explicit purpose.
• Elastic Play (discrete, exploratory): there are specific controls for specific functions; the interface consists of a wide variety of general control types (buttons, sliders, handles etc.); the interaction supports physical input and feedback; learning to interact with the product requires both a cognitive and a bodily understanding. Elastic Play banks on virtuosity: technology is an expressive instrument that people can learn to master, one that aims to grow with their skills.
• Rhythmic Logics (composite, explanatory): the product is a complex system which consists of different layers; the interaction requires a cognitive understanding of the product; input is a rhythmic sequence of simple actions, like button tapping; the interaction focuses on efficiency; feedback is digitally mediated. Technology is an ‘intelligent’ partner that people negotiate sense with.
• Touch-free Magic (composite, exploratory): the product reacts in surprising ways; it may not have one clear identity (e.g., phone, camera and music player in one); personal style (in appearance and/or interaction) is important – in a way, the user also becomes the designer of the product; the product supports an exploratory type of interaction with no or very light touch; the product may move and respond physically, but there is no tactile feedback; interaction with the product takes place where the user is. This style supports the view of technology as a wonder, as something unexplainable, a magic that people can learn to engage with.
5 Conclusions
The generic interaction styles presented here are based on five studies of interaction history combined with a number of conceptual design experiments. They came into being after long discussions among interaction designers and researchers. They clearly refer to qualities of interaction from the past, but have a contemporary character, being based on current technology and on the needs and values of today's society. A next step would be to investigate how the generic interaction styles work for interaction designers who have not been involved in the preceding discussions. Rather than use our generic style proposals as an analytic tool or as design guidelines, we aim at provoking interaction designers to discuss how the styles relate to their own product genres. Such discussions are vital for common understanding and agreement to arise, and for creating a shared 'network of norms' in Enkvist's sense [8]. Are the descriptions, examples and illustrations provided for each style enough to serve designers as the inspiration we intended them to be? Or are specific activities - like museum studies - required to gain a deeper understanding of the styles, and what could such activities be? Employed as a trigger for discussion, we believe the generic interaction styles can help interaction designers innovate on the dominant user interface conventions.
Towards Generic Interaction Styles for Product Design
39
Acknowledgments. We would like to thank the IT Product Design students at the University of Southern Denmark for their enthusiastic participation in the interaction style experiments, in particular Mads Vedel Jensen, Peng Cheng, Mette Mark Larsen, Ken Zupan, Kyle Kilbourn, Anda Grarup, René Petersen and Yingying Wang who created poster style guides and helped analyze the material.
References
1. Øritsland, T.A., Buur, J.: Taking the best from a company history - designing with interaction styles. In: Symposium on Designing Interactive Systems 2000. ACM Press, New York (2000)
2. Øritsland, T.A., Buur, J.: Interaction Styles: An Aesthetic Sense of Direction in Interface Design. International Journal of Human-Computer Interaction 15(1), 67–85 (2003)
3. Chandler, D.: An Introduction to Genre Theory (1997), http://www.aber.ac.uk/media/Documents/intgenre/intgenre.html [accessed 15.02.2007]
4. Ylimaula, A.M.: Origins of style - Phenomenological approach to the essence of style in the architecture of Antoni Gaudi, C.R. Mackintosh and O. Wagner. University of Oulu, Oulu, Finland (1992)
5. Engholm, I.: Digital style history: the development of graphic design on the Internet. Digital Creativity 13(4), 193–211 (2002)
6. Ehn, P., et al.: What kind of car is this sales support system? On styles, artifacts, and quality-in-use. In: Computers and Design in Context. MIT Press, Cambridge (1997)
7. Engholm, I., Salamon, K.L.: Webgenres and -styles as socio-cultural indicators - an experimental, interdisciplinary dialogue. In: In the Making, Copenhagen, Denmark (2005)
8. Enkvist, N.E.: Något om begrepp och metoder i språkvetenskaplig stilforskning. In: Om stilforskning/Research on Style. Kunglig Vitterhets Historie och Antikvitetsakademien, Stockholm (1983)
9. Djajadiningrat, J.P., Matthews, B., Stienstra, M.: Easy Doesn't Do It: Skill and Expression in Tangible Aesthetics. Personal and Ubiquitous Computing, Special Issue on Movement (forthcoming)
10. Maaß, S., Oberquelle, H.: Perspectives and Metaphors for Human-Computer Interaction. In: Floyd, C., et al. (eds.) Software Development and Reality Construction, pp. 233–251. Springer, Heidelberg (1992)
Context-Centered Design: Bridging the Gap Between Understanding and Designing Yunan Chen and Michael E. Atwood College of Information Science & Technology Drexel University, Philadelphia, PA, 19104, USA {yunan.chen, michael.atwood}@ischool.drexel.edu
Abstract. HCI is about how people use systems to conduct tasks in context. Most current HCI research focuses on single or multiple users' interaction with systems. Compared with the user, system and task components, context is a less studied area. The emergence of ubiquitous computing, context-aware computing, and mobile computing requires system designs to be adaptive and to respond to aspects of the setting in which tasks are performed, including other users, devices and environments. Given the importance of context in information system design, we note that even the notion of context in HCI is not well defined. In this paper, we review several theories of context as it relates to interaction design. We also present our Context-Centered Framework, which aims to bridge end users' understanding and designers' design work. The research design and expected outcomes are also presented.
1 Introduction
HCI is about how people use systems to conduct tasks in some context. Traditional HCI research focuses on a single user or multiple users interacting with computer systems to complete their tasks. Little research currently attacks the issue of context directly. In the 1980s, the Scandinavian school of HCI [1] illustrated the importance of involving end users in the design process and, in doing so, brought attention to the context in which work takes place. Now, studies typically include one, or a few, users, usually one system, some well-defined tasks and a well-defined context. Compared to people, system and task, context is less thoroughly studied in current research. The emergence of ubiquitous computing, context-aware computing and mobile computing requires that systems be designed to work seamlessly in their working environments. It is therefore necessary to take context into consideration, especially in context-rich environments where a complex working setting interacts with and affects the system application. Although HCI researchers and practitioners are becoming increasingly aware of the importance of context in system design, especially in highly complex working environments, a single, widely accepted definition of context does not exist. Often context has been used to represent fixed artifacts, such as the people, locations and surrounding artifacts involved in an interaction [2, 3]. This viewpoint focuses more on the environment surrounding the interaction than on the
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 40–48, 2007. © Springer-Verlag Berlin Heidelberg 2007
context of the interaction itself. Dourish [4] identified two perspectives on context: representational and interactional. He argues that the correct focus for research is on the interaction between objects and activities and not solely on the representation of the objects. We concur with this observation and also with Greenberg's point [5] that context is not a fixed, descriptive element, but a dynamic and interactive one. Designing context-aware systems for complex environments is very challenging because the knowledge needed to solve this complex problem is held by people who typically work in different domains. This is known as the symmetry of ignorance, and a communication breakthrough is needed in these cases [6]. Since end-users live in their context, they understand it much better than system designers do. But end-users must rely on others to design the systems they need. Doing so effectively requires a shared understanding of context to ensure a good design in context-rich environments. To address this problem, in this paper we present a Context-Centered Framework for interactive system design which is intended to answer the following three research questions.
− What is context when it is applied in interaction design?
− What are the components of context?
− How can we use context to bridge the gap between understanding and designing?
2 Literature Review: Theories and Metaphors
Although many current theories within HCI do not explicitly address the issue of context, some consideration of context is embedded in them. We review these theories in this section (Table 1).
Table 1. Theories and Metaphors Applicable to the Use of Context in HCI
| Theory | Basic Unit of Analysis | Components of Context |
|---|---|---|
| Activity Theory | An activity: a form of doing directed to an object that transforms the object into an outcome | Subject, Tools, Object, Rules, Community, Division of labor |
| Distributed Cognition | A cognitive system composed of individuals and the artifacts they use | Goals, Internal representation, External representation |
| Situated Action | The activity of persons acting in a setting | Person, Activity, Setting, Relationship between person and setting |
| GOMS | The user's cognitive structure | Goals, Operators, Methods for achieving the goals, Rules for choosing methods |
| Awareness | Awareness: knowledge about the state of some environment | People, Artifacts, Time, Actions that happened and are happening |
| Locales Framework | Locales: the relationship between a social world and its interactional needs, and the "site and means" its members use to meet those needs | Locales foundation, Civic structures, Individual views, Interaction trajectory, Mutuality |
2.1 Activity Theory
Activity theory (AT) is a research framework originating in Soviet psychology in the 1920s [7]. The application of activity theory has lately been introduced to the information systems area [8, 9]. The object of AT is to understand the unity of consciousness and activity [9].
Emphasis on Context: Nardi [9] argued that AT is a descriptive tool which provides a different perspective on human activity. Activity theory begins with the notion of activity. Unlike many other theories which take human actions as the unit of analysis, AT takes actions and their situated context as a whole and calls this an activity. Context is the activity and the environment in which it occurs.
2.2 Distributed Cognition
Distributed cognition theory [10, 11] holds that humans augment their knowledge by placing memories, facts, or knowledge on the objects, individuals, and tools in their environment. Distributed cognition breaks the traditional boundary between the internal and the external and combines them into a distributed system.
Emphasis on Context: The distributed cognition system as a whole is a context for the activities carried out within it. Since distributed cognition theory focuses on the distributed nature of the problem-solving process, it takes into account people and artifacts situated in various locations. It is widely adopted in Computer Supported Cooperative Work (CSCW) studies, which emphasize collaboration across multiple participants [12].
2.3 Situated Action
Situated action was first introduced in 1987 by Lucy Suchman [13]. Rather than decomposing circumstances and actions according to a preset plan, situated action theory holds that actions are highly contextualized; the context of the specific situation determines what the next action is.
Suchman believes that people construct their plan as they go along in the situation, creating and altering their next move based on what has just happened, rather than planning all actions in advance and simply carrying out that plan.
Emphasis on Context: Situated action theory holds that context is a dynamic thing associated with actions. From the situated action point of view, an action plan is not pre-defined, but consists of many unpredicted actions determined by the specific context in which it is situated. In this way we can define and analyze context as an interactional entity from the action point of view.
2.4 Locales Framework
The locales framework [14, 15] is a theory that creates a shared abstraction among stakeholders and bridges understanding and design in the CSCW field. Basically, a locale
is a space together with the resources available there, which has a particular relationship with a social world and the interactional needs of its members. Locales can be either physical spaces or virtually shared environments.
Emphasis on Context: Though Fitzpatrick only studied locales in the CSCW field, the notion of 'locales' can be applied to any interaction situation. A locale is an individual context in this sense, and the framework can help identify a locale's five properties. This could potentially help bridge understanding and design.
2.5 GOMS
GOMS [16] is a method for modeling and describing human task performance. GOMS is an acronym that stands for Goals, Operators, Methods, and Selection rules, the components of which are used as the building blocks for a GOMS model.
Emphasis on Context: GOMS provides an alternative view of context. Context, in this view, is not a shared environment together with the people and artifacts inside it, but a means to select and conduct activities. Context does not necessarily consist of tangible artifacts. As distributed cognition theory claims, human cognition is part of context too. Though the selection rules are not physical artifacts, they restrict the way in which actions are carried out.
2.6 Awareness
Awareness is generally defined in terms of two concepts: knowledge and consciousness. Within the scope of HCI, awareness is studied as it relates to the interaction between an agent and its environment.
Emphasis on Context: Dourish and Bellotti [17] defined awareness as "an understanding of the activities of others, which provides a context for your own activities." In this sense, awareness can be simply defined as "knowing what's going on in the context [18]." This definition indicates that awareness is associated with the context under which the intended task is being processed. Knowing what's going on also provides users with feedback on and consciousness of the context.
2.7 Contextual Factors Identified
From the above review, we conclude that context, although defined and used differently in these theories, does share some common elements. The contextual factors associated with each theory are outlined in Table 2. Our review and analysis suggest that context is not a fixed, descriptive element. Instead, it is a dynamic and interactive element which arises from the activity and is particular to each occasion of activity.
Table 2. Contextual Factors Extracted from HCI Theories
| Factor | Explanation | Relevant Theories |
|---|---|---|
| Motivation | The reasons for an action | Activity Theory, GOMS |
| Goal | The intended outcome of an action | Activity Theory, GOMS |
| Activity | The action itself | Activity Theory, Distributed Cognition, Situated Action, GOMS, Awareness, Locales Framework |
| Rules | Principles or regulations of an action | Activity Theory, Distributed Cognition, GOMS |
| Constraint | Limitations or restrictions on an action | Activity Theory, Distributed Cognition |
| Awareness | Knowing what's going on | Awareness, Locales Framework |
| Methods | Different ways of conducting an action | Activity Theory, GOMS, Locales Framework |
| People | People involved in an action and their roles | Activity Theory, Distributed Cognition, Awareness, Locales Framework |
| Objects | Relevant artifacts | Activity Theory, Distributed Cognition, Awareness |
| Settings | Either physical or virtual space for an action | Distributed Cognition, Awareness, Locales Framework |
3 Context Revisited
Given the importance of context in system design and the contextual factors extracted from the theories above, we are interested in what exactly context and context-aware systems are from this activity-bounded view.
3.1 Context Definition
Both Dourish's [4] point and the literature review above indicate that context is a property of an interaction between objects and activities, not of the objects in the environment alone. From this interactional point of view, context is "a relational property held between objects or activities. We cannot simply say that something is or is not context; rather, it may or may not be contextually relevant to some particular activity." [4] This viewpoint shows that context is a dynamic property particular to each occasion of activity or action. Therefore, context in our definition is: a dynamic property arising from activities, which interacts with and constrains the activities that happen within it.
3.2 Context-Aware Systems
A context-aware application is adaptive, reactive, responsive, situated, context-sensitive and environment-directed [19]. Since the definition of context varies depending on its usage, the notion and usage of context-aware applications also differ greatly. In the early stages, context awareness was depicted as "adapts according to its location of use, the collection of nearby people and objects, as well as
changes to those objects over time" [20]. Context in this definition is depicted only as a representational problem. What does context-aware mean when context is an interactional property? Dey [2] defines context-aware as follows: "a system is context-aware if it uses context to provide relevant information and/or services to the user, where relevancy depends on the user's task". There is no doubt that being adaptive and responsive to the surrounding environment is the key characteristic of context-aware computing. From the activity point of view, contextual information is decided by the activities happening within it; a task is a more general notion, and a task may contain many goal-oriented activities. Therefore we define a context-aware application as: a system which can incorporate relevant contextual information and be adaptive to the situation in which it is situated, where the contextual information is determined by the goal-oriented activities users carry out to complete tasks.
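As a minimal illustration of this definition (the function, item names and example values are hypothetical, not from the paper), context items can be modeled as relevant only to particular goal-oriented activities, so that what counts as context changes with the activity:

```python
# Sketch: contextual information is relevant only insofar as it relates
# to the current goal-oriented activity (interactional view of context).
def relevant_context(activity: str, context_items: dict) -> dict:
    """Select the context items tagged as relevant to the given activity."""
    return {name: value
            for name, (value, activities) in context_items.items()
            if activity in activities}

# Each item carries a value and the set of activities it is relevant to.
items = {
    "location":   ("ward 3", {"dispense medication", "locate patient"}),
    "noise":      ("high", {"record voice note"}),
    "free_hands": (False, {"dispense medication"}),
}

print(relevant_context("dispense medication", items))
```

The same environment yields different context for different activities, which is the point of the interactional definition: "noise" is context for recording a voice note but not for dispensing medication.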
4 Context-Centered Framework
Our Context-Centered Framework is intended both to incorporate context into design and to facilitate communication between end users and designers. In contrast to the locales framework's treatment of context as a static environment, we adopt a dynamic view that ties context to the task-solving process. End users can use this framework to identify the contextual information associated with their working activities. It also assists designers in analyzing system features and validating them in context. We take activity as the unit of analysis in this framework.
4.1 Action as a Unit of Analysis
The review shows that context is inseparable from activities: whether something is considered to be context is determined by its relevance to a particular activity. Therefore, we set the unit of analysis in our study at the activity level. From the interactional point of view, contextual information is initiated from and bounded by the activities happening within it. According to Nardi's [9] hierarchical levels of activities, activities are long-term formations and their objects cannot be translated into outcomes at once, but only through a process consisting of several steps or phases of actions. Actions under the same activity are related to each other by the same overall object and motive.
4.2 Context-Centered Framework Aspects
From the hierarchy-of-activity point of view [9], the activity is similar to the task which users are trying to accomplish, actions are steps toward achieving it, and operations are procedures under each step. Context differs for each step and also for the overall task. For each action, there are four aspects along which to analyze it. These four aspects are highly interdependent and overlapping; they are connected by the same action undertaken. Combined, the aspects have the potential to capture many contextual characteristics of the working setting.
Goal: The first thing in understanding the context is to identify the object of the activity, which determines what the relevant context information is. The goal includes the user's motivation and the intended outcome of performing the activity.
Setting: The setting is the place where participants perform the activity; it can be either a virtual or a physical environment. The relevant setting information includes:
− The people who conduct the activity and their roles
− The characteristics of the setting where the activity is performed
− The available tools, such as other available methods and approaches
− The artifacts involved in the setting, such as other devices and objects
Rules: Rules for using the resources in the current setting and constraints on how users perform the activity, e.g., a time constraint for an action.
− Constraints on using the resources in the working setting
− Rules for allocating resources
Awareness: An understanding of the others (either objects or people), which provides feedback on and consciousness of the context and the activities.
− The shared context: awareness of the other people involved in the activity and their roles; awareness of the tools and artifacts in the current setting; awareness of the rules and constraints for performing the activity
− Actions: awareness of the actions that have been taken; awareness of the actions being carried out now
Table 3. Contextual factors identified

| Aspect | Description |
|---|---|
| Goal | The object determines what the contextual information for the activity is. |
| Setting | The place where participants perform activities; it includes the resources involved in the task-solving process. |
| Rules | Rules and constraints for using the resources. |
| Awareness | An understanding of and consciousness of the setting and activity. |
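As a sketch (the record type and example values are our own illustration, not part of the framework's specification), the four aspects could be captured per action as a simple record that end users and designers fill in together:

```python
from dataclasses import dataclass, field

# Hypothetical record capturing the four aspects of the
# Context-Centered Framework for a single action.
@dataclass
class ActionContext:
    goal: str                                      # motivation and intended outcome
    setting: dict = field(default_factory=dict)    # people, place, tools, artifacts
    rules: list = field(default_factory=list)      # constraints on using resources
    awareness: list = field(default_factory=list)  # what actors must know is going on

# Example (invented, loosely inspired by the nursing domain mentioned later):
ctx = ActionContext(
    goal="administer medication on schedule",
    setting={"people": ["nurse", "patient"], "place": "ward", "tools": ["cart"]},
    rules=["double-check dosage", "complete round within one hour"],
    awareness=["which patients are already served", "colleague locations"],
)
print(ctx.goal)
```

One such record per action keeps the analysis at the activity level the framework prescribes, while the interdependence of the aspects shows up as shared entries (e.g., the same people appear in both setting and awareness).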
5 Research Design
In order to understand how we can use context to bridge the gap between understanding and designing, we designed a 2x2 experiment to test: 1) whether the Context-Centered Framework can bridge understanding and designing, and 2) whether the Context-Centered Framework generates a better design compared to a design made without contextual consideration. A scenario-based design (SBD) [21, 22] approach is applied in our experiment; SBD is an ideal way to measure the implications of context for design [23]. Two groups of students will be recruited to generate scenarios based on given tasks. Students who have taken HCI courses are assumed to have a certain design expertise, whereas students with nursing training are considered end-users. We will apply two conditions to these students: with and without Context-Centered Framework training (Table 4).
Table 4. Research Design
| | Designers | End Users |
|---|---|---|
| Without training | Group D1 | Group E1 |
| With training | Group D2 | Group E2 |
The hypotheses are:
Ha: In both the designer and end-user groups, using the Context-Centered Framework will produce better design scenarios.
Hb: Without contextual consideration, designers will generate better design scenarios; with Context-Centered Framework training, end-users will generate better design scenarios.
To assess the quality of the scenarios, two HCI experts will review and score them according to their quality as a basis for design.
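To illustrate how hypothesis Ha could be checked once the expert scores are in (the scores below are invented placeholders, not results from the study):

```python
from statistics import mean

# Illustrative analysis of the 2x2 design; the scores are made-up
# placeholders, NOT data from the experiment described in the text.
scores = {
    ("designer", "untrained"): [4, 5, 3],   # Group D1
    ("designer", "trained"):   [6, 6, 5],   # Group D2
    ("end_user", "untrained"): [3, 2, 3],   # Group E1
    ("end_user", "trained"):   [7, 6, 6],   # Group E2
}
cell_means = {cell: mean(vals) for cell, vals in scores.items()}

# Ha: within each role, framework training yields better-scored scenarios.
ha = all(cell_means[(role, "trained")] > cell_means[(role, "untrained")]
         for role in ("designer", "end_user"))
print(cell_means)
print("Ha supported on this sample:", ha)
```

A real analysis would add a significance test across the expert ratings; the cell-mean comparison only shows the structure of the hypothesis.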
6 Expected Results
We believe that the focus on context will improve communication between end-users and designers. This focus will also produce high-quality scenarios, which will lead to better design products. We expect that without Context-Centered Framework instruction, designers (Group D1) will produce better interaction scenarios than end users (Group E1), whereas when context is taken into consideration, end users (Group E2) will generate higher-quality scenarios than designers (Group D2).
7 Conclusion and Future Work
We intend to use the Context-Centered Framework to connect end users' understanding of the working setting with designers' design activities. We believe that the results of this study will be relevant to both researchers and practitioners and will help in designing useful and usable systems, for two reasons. First, the Context-Centered Framework can be a starting point to help analysts and designers understand the working environment. The task-dependent framework can be used to generate initial questions and direct observations. It can also capture working settings from the end users' point of view. Second, the Context-Centered Framework can be used by system designers to help identify where features can be added to enhance an existing design, to identify task-related context issues, and to decide how to incorporate them into system design. Our future work includes conducting the experiment for this study; we also intend to adapt the Context-Centered Framework into a contextual walkthrough for system evaluation.
References 1. Greenbaum, J., Kyng, M. (eds.): Design at Work: Cooperative Design of Computer Systems. Lawrence Erlbaum Ass, Hillsdale, New Jersey (1991)
2. Dey, A.K., Abowd, G.D., Salber, D.: A Conceptual Framework and a Toolkit for Supporting the Rapid Prototyping of Context-Aware Applications. Human-Computer Interaction 16, 97–166 (2001)
3. Schilit, B., Theimer, M.: Disseminating Active Map Information to Mobile Hosts. IEEE Network 8, 22–32 (1994)
4. Dourish, P.: What we talk about when we talk about context. Personal and Ubiquitous Computing 8, 19–30 (2004)
5. Greenberg, S.: Context as a Dynamic Construct. Human-Computer Interaction 16, 257–268 (2001)
6. Rittel, H.: Second-Generation Design Methods. In: Cross, N. (ed.) Developments in Design Methodology, pp. 317–327. John Wiley & Sons, New York (1984)
7. Wertsch, J.V.: Vygotsky and the Social Formation of Mind. Harvard University Press, Cambridge, MA (1985)
8. Bødker, S.: A human activity approach to user interfaces. Human-Computer Interaction 4, 171–195 (1989)
9. Nardi, B.: Context and Consciousness: Activity Theory and Human-Computer Interaction. MIT Press, Cambridge (1996)
10. Hutchins, E.: Cognition in the Wild. MIT Press, Cambridge, MA (1996)
11. Zhang, J., Norman, D.A.: Representations in Distributed Cognitive Tasks. Cognitive Science 18, 87–122 (1994)
12. Rogers, Y., Ellis, J.: Distributed cognition: An alternative framework for analysing and explaining collaborative working. Journal of Information Technology 9, 119–128 (1994)
13. Suchman, L.A.: Plans and Situated Actions: The Problem of Human-Machine Communication. Cambridge University Press, New York (1987)
14. Fitzpatrick, G., Mansfield, T., Kaplan, S.M.: Locales framework: exploring foundations for collaboration support, pp. 34–41 (1996)
15. Fitzpatrick, G., Kaplan, S., Mansfield, T.: Applying the Locales Framework to Understanding and Designing. In: Proceedings of the Australasian Conference on Computer Human Interaction, p. 122. IEEE Computer Society Press, Los Alamitos (1998)
16. Card, S.K., Moran, T.P., Newell, A.: The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, Hillsdale, NJ (1983)
17. Dourish, P., Bly, S.: Portholes: Supporting Awareness in a Distributed Work Group. In: Proceedings of the Conference on Human Factors in Computing Systems, Monterey, CA, pp. 541–547 (1992)
18. Gutwin, C., Greenberg, S.: A Descriptive Framework of Workspace Awareness for Real-Time Groupware. Computer Supported Cooperative Work (CSCW) 11, 411–446 (2002)
19. Abowd, G.D., Dey, A.K., Brown, P.J., Davies, N., Smith, M., Steggles, P.: Towards a Better Understanding of Context and Context-Awareness. In: Proceedings of the 1st International Symposium on Handheld and Ubiquitous Computing, pp. 304–307. Springer-Verlag, Karlsruhe, Germany (1999)
20. Schilit, B., Theimer, M.: Disseminating Active Map Information to Mobile Hosts. IEEE Network 8, 22–32 (1994)
21. Carroll, J.: Scenario-Based Design: Envisioning Work and Technology in System Development. John Wiley & Sons, Chichester (1995)
22. Rosson, M.B., Carroll, J.M.: Usability Engineering: Scenario-Based Development of Human-Computer Interaction. Morgan Kaufmann, Seattle, Washington (2001)
23. Pinelle, D., Gutwin, C.: Groupware walkthrough: adding context to groupware usability evaluation. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 455–462. ACM Press, Minneapolis (2002)
Application of Micro-Scenario Method (MSM) to User Research for the Motorcycle's Informatization - A Case Study for the Information Support System for Safety
Hiroshi Daimoto1,3, Sachiyo Araki1, Masamitsu Mizuno1, and Masaaki Kurosu2,3
1 YAMAHA MOTOR CO., LTD., Japan
2 National Institute of Multimedia Education, Japan
3 Department of Cyber Society and Culture, The Graduate University for Advanced Studies, Japan
Abstract. The Micro-Scenario Method (MSM) is an approach to uncovering consumer needs and establishing development concepts [2]. In this study, MSM is applied to the Information Support System for Safety for a motorcycle and is adapted for application efficiency. The adaptations are to build a prescriptive model before the interview research and to set up syntax rules for the problem-scenario (a descriptive sentence of a problem situation). As a result, development efficiency is improved by the modified MSM. Communication among the relevant parties can be sped up, because the prescriptive model, in which keywords are structurally organized, helps development actors share wide-ranging information about problem situations. Moreover, the creation time for problem-scenarios can be cut, because the syntax rules simplify how to describe them. The modified MSM is an effort to put MSM into practical use at YAMAHA Motor Company Ltd. (YMC), and it was found to be a useful approach to reducing the workload of HCD (Human-Centred Design).
1 Introduction
YMC manufactures various types of vehicles, including motorcycles, boats, water vehicles, snowmobiles, and all-terrain vehicles (ATVs). In recent years, the informatization of vehicles, such as Intelligent Transport Systems (ITS) and the Advanced Safety Vehicle (ASV), has advanced, and navigation systems have begun to be fitted even to motorcycles. A transportation equipment maker needs to consider what kind of information should be provided to a vehicle driver from the standpoint of HCD. At present, HCD is applied to the fields of motorcycles and boats in YMC's R&D operations. In this paper, the studies of MSM at YMC are described, and the result of applying it to the Information Support System for Safety for a motorcycle is reported. MSM is a kind of scenario-based design [4]; it helps to systematically analyze qualitative data such as interview data and thus leads to the generation of adequate solutions. Therefore, it is regarded as an effective approach for creating the informatization concept of a motorcycle in order to design devices and systems to fit into
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 49–57, 2007. © Springer-Verlag Berlin Heidelberg 2007
user needs and to improve their usability. This approach corresponds to the activity of "understanding and specifying the context of use" in the early development stage of ISO 13407. The purpose of the present paper is to propose the modified MSM, which improves the analytical method for the problem-scenario (p-scenario). There are two distinctive improvements to the analysis of the p-scenario: "the prescriptive model" and "the syntax rule".
2 The Prescriptive Model and the Syntax Rule of the Modified MSM
2.1 The Prescriptive Model
The prescriptive model consists of structured keywords derived from literature research. It is used to cover the rough aspects of the target field and to facilitate understanding of the research contents among development actors (users, engineers, designers, usability engineers). Before the interview research, we built a prescriptive model (see Fig. 1) organized from the standpoints of i) rider factors (physical factor, emotional factor, personality factor, information-processing factor), ii) vehicle body factors (breakdown, poor maintenance, etc.), and iii) environmental factors (surrounding vehicles, traffic situation, road surface condition, etc.). Fig. 1 shows the structured accident causes of a motorcycle. The keywords about the accident causes are grouped and organized structurally using the KJ method [1].
Fig. 1. Prescriptive model about the accident factors of a motorcycle
Table 1. Detailed descriptions of contextual factors in motorcycle accidents
Application of MSM to User Research for the Motorcycle's Informatization
After the interview research, the prescriptive model is revised by adding keywords derived from the interviews. Table 1 shows detailed descriptions of contextual factors in motorcycle accidents; the prescriptive model is based on this structured classification. The prescriptive model is used to help participants understand the whole picture of the accident causes. At the stage of p-scenario analysis, it is used by usability engineers to analyze the accident causes by connecting the prescriptive model (keywords) with the p-scenarios.

2.2 The Syntax Rule

The p-scenarios are derived by organizing the interview data and the literature research. The person responsible for usability spends plenty of time making p-scenarios, because the text derived from the interviews and the literature amounts to a huge volume of data. The writers of scenarios therefore struggle with how to describe the p-scenarios, and their ways of writing them vary considerably from person to person. This is resolved by setting up syntax rules for the p-scenarios: the syntax rule regulates the elements that should be described. Fig. 2 shows a case example of a p-scenario for the Information Support System for Safety related to motorcycles, and Fig. 3 shows the traffic situation (in Japan, traffic keeps to the left) of the p-scenario. This case example of a p-scenario describes "subject-object", "provided information", "to whom", "when", "condition of rider", "situation of environment and other vehicles", "kind of hazard", and "means". When one's own motorcycle goes straight through an intersection on a green light while there is a preceding vehicle (truck, etc.) and an oncoming right-turning car, and both the motorcycle's rider and the oncoming car's driver fail to see each other, there is a risk that the oncoming right-turning car might enter the intersection by mistake.
Therefore, it is desirable to indicate the presence of one's own motorcycle to the oncoming car's driver. However, such a means does not exist.
Fig. 2. A case example of p-scenario for the Information Support System for Safety related to a motorcycle
The p-scenarios were made up to cover all the keywords of the prescriptive model of Fig. 1. When the p-scenario of Fig. 2 is connected with the prescriptive model of Fig. 1, the accident factors (keywords of the prescriptive model) assumed by the p-scenario are "invisible" (an invisible oncoming car), "surrounding vehicle" (a preceding vehicle), and "road geometry" (an intersection).
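Under the syntax rule, each p-scenario fills a fixed set of slots and is linked to prescriptive-model keywords. As a rough sketch, the Fig. 2 case example could be represented as a structured record like the one below; the field names and the record itself only restate that example and are not part of the published method:

```python
from dataclasses import dataclass, field

# Slots prescribed by the syntax rule (see Fig. 2); names are illustrative.
@dataclass
class PScenario:
    subject_object: str
    provided_information: str
    to_whom: str
    when: str
    rider_condition: str
    environment_situation: str
    hazard: str
    means: str
    keywords: list = field(default_factory=list)  # links into the prescriptive model

# The case example of Fig. 2, expressed in these slots.
right_turn_case = PScenario(
    subject_object="one's own motorcycle",
    provided_information="presence of the motorcycle",
    to_whom="oncoming right-turning car's driver",
    when="going straight through an intersection on a green light",
    rider_condition="fails to see the oncoming car",
    environment_situation="preceding vehicle (truck) blocks the mutual view",
    hazard="oncoming right-turning car enters the intersection by mistake",
    means="none exists",
    keywords=["invisible", "surrounding vehicle", "road geometry"],
)

print(right_turn_case.keywords)
```

Fixing the slots in this way is what makes the 66 p-scenarios comparable across writers.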
3 An Application Study of the Modified Micro-Scenario Method
Fig. 3. The traffic situation of Fig. 2
An application study of the modified MSM for the Information Support System for Safety related to a motorcycle is described as follows. The modified MSM is characterized by "the prescriptive model" and "the syntax rule".
3.1 Participants

Participants in the interview research were 20 working people in their 20s to 50s. Table 2 shows the detailed attributes and numbers of participants. The general riders were recruited for payment through a research company; the instructors were motorcycle driving instructors. Further selection criteria were (1) riding a motorcycle more than twice a month and (2) having experience of riding on a highway. The attributes were spread as widely as possible in order to hear the voices of various riders. Participants in the questionnaire research were the same 20 working people as in the interview research; however, 4 of them could not take part, so the questionnaire data were analyzed for 16 participants.

Table 2. The attributes of participants
3.2 Procedure

In the interview research, each participant answered our questions ("What kinds of information do you want under what circumstances?", etc.) and explained the context of the problem situation. Additionally, each participant was presented with 15 typical traffic scenes leading to accidents (e.g., "crossing collision", "right turn", and "bump from behind"), taken from safety education teaching materials for motorcycles, and reported the requirements for motorcycle informatization for safety in each scene; together, these scenes covered the major traffic situations. At the end of each report, each participant was also shown the prescriptive model for the scene, which presented the envisioned accident factors. Then, after being given an explanation of the general accident causes of each scene, each participant was asked to report
more detailed requirements. Each interview took about two hours, and the voice data of the interviews were recorded. In the questionnaire research, each participant rated each p-scenario derived from the interview research on five-point scales for "Level of importance (How important is the problem to be solved for motorcycle safety?)", "Degree of risk (How dangerous is it due to the absence of a means to solve it?)", and "Frequency (How often does one encounter a case requiring the means?)". The p-scenarios were made up with reference to the syntax rule.

3.3 Results

The prescriptive model is exploited for two purposes. The first is to cover rough aspects of the accident factors and to facilitate understanding of the accident causes among development actors. Participants could easily share the whole picture of the accident causes with the interviewers and could easily report their requirements without omissions. As a result of the interviews, 66 p-scenarios covering various traffic situations and all 39 keywords of the prescriptive model were made up. Table 3 shows the number of p-scenarios for each traffic situation, derived from the interviews and the literature.

Table 3. The number of p-scenarios for each traffic situation
The second purpose is to analyze the accident causes by connecting the prescriptive model with the p-scenarios (39 keywords × 66 p-scenarios). Important accident factors can be identified by analysing the ten highest-scored p-scenarios (see Fig. 3-1 and Fig. 3-2). Table 4 shows the result of the accident factor analysis for the checked factors (checked = 1). The result indicates that "surrounding vehicles", "invisible", and "road geometry" were the particularly important factors.
Table 4. The result of the accident factor analysis
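The accident-factor analysis of Table 4 amounts to counting how often each prescriptive-model keyword is checked across the high-scored p-scenarios. A minimal sketch with illustrative data (the scenario contents below are invented, not the study's actual checks):

```python
from collections import Counter

# Keywords checked (= 1) for each of several high-scored p-scenarios;
# the contents are illustrative, not the study's actual data.
checked_factors = [
    ["invisible", "surrounding vehicles", "road geometry"],
    ["surrounding vehicles", "road surface condition"],
    ["invisible", "surrounding vehicles"],
]

# Count how often each keyword appears across the scenarios.
counts = Counter(k for factors in checked_factors for k in factors)
print(counts.most_common(3))
```

With the real 10 × 39 check matrix, the same tally surfaces "surrounding vehicles", "invisible", and "road geometry" as the dominant factors.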
Fig. 3-1. High-scored p-scenarios
*1 The score of the supporting data is the average questionnaire score for "Level of importance (How important is the problem for motorcycle safety?)", "Degree of risk (How dangerous is it due to the absence of a means to solve it?)", and "Frequency (How often does one encounter a case requiring the means?)" over the ten best p-scenarios.
*2 The order of the ten high-scored p-scenarios is defined by the overall score (overall score = "Level of importance" x "Degree of risk" x "Frequency").

Fig. 3-2. High-scored p-scenarios
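The ranking described in note *2 can be computed mechanically: multiply the three five-point ratings for each p-scenario and sort. A minimal sketch, with invented scenario names and ratings (not the study's data):

```python
# Each p-scenario carries three five-point questionnaire ratings:
# (importance, risk, frequency). Names and values are invented.
scenarios = {
    "right turn at intersection": (4.2, 4.0, 3.5),
    "bump from behind":           (3.1, 3.4, 2.8),
    "crossing collision":         (4.5, 4.1, 3.0),
}

# Overall score = importance x risk x frequency; rank highest first.
ranked = sorted(
    scenarios.items(),
    key=lambda item: item[1][0] * item[1][1] * item[1][2],
    reverse=True,
)

for name, (imp, risk, freq) in ranked:
    print(f"{name}: {imp * risk * freq:.2f}")
```

In the study, the ten top-ranked of the 66 p-scenarios were then fed into the factor analysis of Table 4.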
The syntax rule is exploited to describe the p-scenarios systematically. Using the syntax rule, the scenario writer made up 66 p-scenarios based on the text data of the interviews and the literature. Without the syntax rule, the writer would not have been able to make up the p-scenarios effectively and would have spent much time in vain. In fact, the writer reported that the syntax rule made the p-scenarios easy to write.
4 Summary

The purpose of the present study is to propose a modified MSM that improves the analytical method for the p-scenario. Specifically, the modified
MSM is applied to the Information Support System for Safety related to a motorcycle, and the application example is shown. Two distinctive improvements are made in the modified MSM: "the prescriptive model" and "the syntax rule". As a result of the application, it is indicated that (1) the prescriptive model helps development actors share wide-ranging information about the accident causes in a structured way, (2) the prescriptive model helps usability engineers make a detailed analysis of the accident causes, and (3) the syntax rule helps scenario writers make up p-scenarios easily. In the early development stage of HCD, the scenario method has been considered effective [3]. MSM is an analysis method that uses the scenario technique on qualitative data such as interview data, and its methodological framework is becoming clear. However, few examples of its adaptation to real development fields have been introduced. The present study is a case study of MSM, and the modified MSM is an effort to apply MSM to practical development at YMC. The coverage and quantitative evaluation of the modified MSM remain future problems, and the method will be improved by taking in more voices from the real development field.
References

1. Kawakita, J.: Hassouhou. Chuko Shinsho, Tokyo (1967) [in Japanese]
2. Kurosu, M.: Micro Scenario Method. Research Reports on National Institute of Multimedia Education, 17 (2006) [in Japanese]
3. Carroll, J.M.: Five Reasons for Scenario-Based Design. In: Proceedings of the 32nd Hawaii International Conference on System Sciences (Maui, HI, January 4-8) [published as CD-ROM]. IEEE Computer Society Press, Los Alamitos, CA (1999)
4. Carroll, J.M.: Scenario-Based Design of Human-Computer Interactions. MIT Press, Boston, MA (2000)
Incorporating User Centered Requirement Engineering into Agile Software Development

Markus Düchting1, Dirk Zimmermann2, and Karsten Nebe1

1
University of Paderborn C-LAB, Cooperative Computing & Communication Laboratory, Fürstenallee 11, 33102 Paderborn, Germany 2 T-Mobile Germany, Landgrabenweg 151, 53227 Bonn, Germany {markus.duechting, karsten.nebe}@c-lab.de,
[email protected]

Abstract. Agile Software Engineering approaches are gaining more and more popularity in today's development organizations. The need for usable products is also a growing factor for organizations; their development processes therefore have to react to this demand and offer ways to integrate the factor "usability". The approach presented in this paper evaluates how agile software engineering models consider activities of Usability Engineering to ensure the creation of usable software products. The user-centeredness of the two agile SE models Scrum and XP has been analyzed, and the question of how potential gaps can be filled without losing the processes' agility is discussed. Requirements play a decisive role during software development, in Software Engineering as well as in Usability Engineering; therefore, different User-Centered Requirements that ensure the development of usable systems served as the basis for the gap analysis.

Keywords: Agile Software Engineering, Usability Engineering, User-Centered Requirements.
1 Traditional Software Engineering

The ambition of Software Engineering (SE) is the systematic and manageable development of software, in order to make the development process more plannable. Many existing Software Engineering Models (SE Models) manage software development with regard to costs, time and quality. A well-established and prevalent SE model is the Waterfall Model introduced by Royce [11]. The model consists of seven phases that are passed through sequentially. The results of each phase are captured in documents, which serve as milestones for the development activities of the next phase. The Waterfall Model offers an easy way to schedule and manage software development because of its sequential progress. It has been successfully applied in situations where requirements and workflows can be completely determined upfront. Another common SE model is the Spiral Model introduced by Boehm [2], which is an enhancement of the Waterfall Model. The Spiral Model is an iterative and incremental approach which provides a cyclic repetition of four phases. Each cycle of

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 58–67, 2007. © Springer-Verlag Berlin Heidelberg 2007
the spiral consists of four major activities and ends with a progress assessment, followed by a planning phase for the next process iteration. Additionally, a risk assessment is performed after each iteration. The iterative approach allows reacting adequately to changing requirements. This makes the process of developing software more manageable and minimizes the risk of failure, in contrast to the sequential SE model.
2 Agile Software Engineering

A recently emerging trend in SE focuses on lightweight, so-called agile models, which follow a different approach to software development. Agile models follow the idea of Iterative and Incremental Development (IID), similar to the Spiral Model mentioned above. But in contrast to Boehm's model, the iteration length is shorter in agile models; the iterations in the Scrum model, for instance, take 30 calendar days. Agile software development does not rely on comprehensive documentation and monolithic analysis activities; instead, it takes a more delivery- and code-quality-oriented approach. Through co-location of the development team, the tacit knowledge among the team members compensates for extensive documentation efforts. Agile models emphasize communication and aspire towards early and frequent feedback through testing, on-site customers and continuous reviews. The basic motivation behind agile and iterative development is to acknowledge that software development is similar to creating new and inventive products [8]. New product development requires room for research and creativity. It is rarely possible to gather all requirements of a complex software system upfront and to identify, define and schedule all detailed activities; many details emerge only later during the development process. This is a known problem within the domain of SE and the reason for many failed projects [8]. For this reason, agile models implement mechanisms to deal with changing requirements and other unforeseen incidents when planning, monitoring and managing SE activities.

2.1 Scrum

Scrum is an agile and iterative-incremental SE model. Its development tasks are organized in short iterations, called Sprints. Each Sprint starts with a Sprint Planning meeting where stakeholders decide on the functionality to be developed in the following Sprint. All requirements for a software system are collected in the Product Backlog.
The Product Backlog is a prioritized list and serves as a repository for all requirements related to the product. However, the Product Backlog is never a finalized document; rather, it evolves along with the product. At the beginning of a project the Product Backlog contains only high-level requirements, and it becomes more and more precise during the Sprints. Each Backlog item has a priority assigned to represent its business value, and an effort estimation to plan the resources required to implement it. During Sprint Planning, the Scrum Team picks high-priority backlog items that they think are realistic for the next Sprint. The Scrum Teams are small interdisciplinary groups of 7 to 9 people [12], which are self-organized and have full authority to determine the best way of reaching the
Sprint Goals. There are no explicit roles defined within the Scrum Team. Scrum places emphasis on emergent behavior, meaning the teams develop their mode of cooperation autonomously. This self-organizing aspect supports creativity and high productivity [12]. The Scrum Team and its manager, the Scrum Master, meet in a short daily meeting, called the Daily Scrum, to report progress, impediments and further proceedings. Every Sprint ends with a Sprint Review meeting, where the current product increment is demonstrated to project stakeholders.

2.2 Extreme Programming

Extreme Programming (XP) [1] is one of the established agile SE methodologies. Similar to Scrum, XP is an iterative-incremental development model; however, XP's iterations are even shorter than Scrum's. According to Beck, the optimal iteration length is somewhere between 1 and 3 weeks. XP adopts reliable SE techniques to a very high degree. Continuous reviewing is assured by pair programming, where two developers sit together at one workstation. XP also applies the common code ownership principle: all team members are allowed to make changes in code written by someone else when necessary. In addition, XP requires a user stakeholder to be on-site as a means to gather early user feedback. The requirements in XP are defined by the customer in so-called User Stories. Each story is a brief, informal specification of requirements. Similar to Scrum's Product and Sprint Backlogs, each User Story has a priority and an effort estimation assigned to it. Before a new iteration starts, the User Stories are decomposed into more granular technical work packages. The literature about XP does not mention an explicit design phase, but it highly emphasizes continuous refactoring and modeling. The functionality described in User Stories is converted into test cases, and the simplest concept that passes the tests is implemented. Development is finished when all tests pass.
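XP's test-first conversion of User Stories into test cases, as described above, can be sketched as follows; the story, the function name, and the VAT rate are invented for illustration and are not taken from the XP literature:

```python
# User Story (invented): "As a shop customer, I want order totals
# to include 19% VAT so that I see the final price."

def total_with_vat_cents(net_cents):
    # The simplest concept that passes the test below: work in integer
    # cents to avoid float rounding surprises; round to the nearest cent.
    return (net_cents * 119 + 50) // 100

# The story's acceptance criterion, converted into a test case.
def test_total_with_vat_cents():
    assert total_with_vat_cents(10000) == 11900  # 100.00 -> 119.00
    assert total_with_vat_cents(1050) == 1250    # 10.50 -> 12.50

test_total_with_vat_cents()
print("all tests passed")
```

The test is written before the implementation; once it passes, the story's functionality counts as done.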
3 User Centered Design

A recent trend can be observed showing that usability criteria have become a sales argument for products, and the awareness of the need for usable systems is growing. But many software development projects are mainly driven by the SE model that is used. Usability Engineering (UE) provides a wide range of methods and systematic approaches to support user-centered development. These approaches are called Usability Engineering Models (UE Models), e.g. the Usability Engineering Lifecycle [9] or Goal-Directed Design [4]. Mayhew's UE process consists of three phases, which are processed sequentially: the Requirements Analysis phase, followed by the Design/Testing/Development phase and the Installation of the product. The process is iterative: concepts, preliminary designs and detailed designs are evaluated until all problems are identified and resolved. In Cooper's Goal-Directed Design process, several phases are passed through as well. During the Research phase, qualitative research leads to a picture of how users work in their daily work environment. During the Modeling phase, Domain Models and User Models (so-called Personas) are developed, which are then translated
Incorporating User Centered Requirement Engineering
61
into a Framework for the design solutions, which is detailed in the Refinement phase. These two models have much in common, since both describe an idealized approach to ensuring the usability of a software system, but they usually differ in the details. UE Models usually define an order of activities and their resulting deliverables. UE activities often happen concurrently with the other software development activities, so there is an obvious necessity to integrate these two approaches in order to make the budgets, resources and timelines of the UE activities within software development predictable.
4 Motivation

According to Ferre [7], basic conditions for integrating SE and UE are an iterative approach and active user involvement. The two agile SE models outlined above are iterative-incremental approaches that rely on solid customer involvement. They even mention user representatives as a special kind of customer stakeholder. The involved customer should at least have a solid knowledge of the users' domain and needs. This raises the question of whether and how usability is ensured in an agile software development process, i.e. whether UE activities can be performed in a satisfying way. This paper discusses the user-centeredness of two agile SE models and the question of how potential gaps can be filled without losing the processes' agility. When exploring the UCD models described above, there is a commonality with the traditional SE models: both are strongly driven by phases and the resulting deliverables. In agile models, however, documentation plays a minor part. Due to their incremental approach and overlapping development phases, there are no distinct phases such as Analysis, Design, Development and Validation in agile SE models. Without such deliverables or activities, other criteria are needed to allow an assessment of the user-centeredness of agile SE models. Requirements play a decisive role during the software development lifecycle, in both the SE and the UE domain. SE is mainly concerned with system requirements, while UE takes the users' needs into account. Requirements are measurable criteria, and their elicitation, implementation and validation take place in most approaches to software development. Defining granular requirements makes it possible to look at activities independently of the larger modules, which lends itself well to the agile approach of developing smaller increments that ultimately add up to the final system rather than preparing a big design up front.
In order to develop recommendations for the integration, the authors analyze Scrum and XP to see how they are able to adopt UCD activities, specifically how they can utilize UCD requirements. The Requirements Framework introduced in the following section offers a way to approach this.
5 User Centered Requirements

Based on a generalized UCD model defined in DIN EN ISO 13407 [5], Zimmermann & Grötzbach [13] describe a Requirements Engineering framework in which three types of requirements are generated, each of which constitutes the analysis and design outcome of one of the three UCD activity types. Usability Requirements are
developed during the Context of Use analysis, which revolves mainly around the anticipated users, their jobs and tasks, their mental models, their conceptions of the usage of the system, the physical environment, organizational constraints and determinants, and the like. It is important to elicit these findings from actual users in their context of use in order to get a reliable baseline for requirements pertaining to users' effectiveness, efficiency and satisfaction. These requirements can be used as criteria for the system and for intermediate prototypes through usability tests, questionnaires, or expert-based evaluations. The Workflow Requirements focus on individual workflows and tasks to be performed by a user. Task performance models are elicited from the users the workflow is optimized for, and an improved task performance model is generated. The outcome of this module is a set of requirements pertaining to a specific user's interaction with the system in the context of a specific workflow or task, e.g. as described in use case scenarios. The requirements describe the discrete sub-steps of a user's interaction flow and the expected behavior of the system for each of these steps in an optimized workflow. It is important to validate these requirements against the usability requirements with users, e.g. by comparing an optimized workflow to the current state of workflow performance with regard to effectiveness, efficiency and user satisfaction. Workflow Requirements are ideal input for test cases, against which prototypes or the final system can be tested, either through usability tests or expert evaluations. The User Interface (UI) Requirements, generated in the Produce Design Solutions activities, define properties of the intended system that are derived from Usability or Workflow Requirements, e.g. interaction flow or screen layout.
During the development phase, the UI Requirements provide guidance for technical designers regarding the information and navigation model, which can then be aligned with other technical models. They also help programmers implement the required functions using the correct visual and interaction model. UI Requirements serve as criteria for the actual system that has been developed, i.e. to determine whether it follows the defined model for layout and interaction. These evaluations can be user- or expert-based, and can be conducted during system design and testing. Translating UI Requirements into test cases facilitates this evaluation step.
6 Proceedings

The authors used the User-Centered Requirements summarized above as the basis for a gap analysis in order to determine whether the two agile SE models (Scrum and XP) consider the three types of requirements adequately. As the different requirements pass through distinct stages, they have to be elicited, implemented and evaluated appropriately; the fulfillment of the requirements guarantees user-centeredness in the development process. To prepare the gap analysis, the authors used the descriptions of the different requirements to derive several criteria for the assessment. The goal was to specify criteria that apply to both models. There is no 1:1 relation between the stages (elicitation, implementation, evaluation) and the criteria derived for the different types of requirements; thus, there might be no criteria at a specific stage for a specific type of requirement, even where the framework suggests one. As an example, selected criteria for the UI Requirements are shown in Table 1.
Table 1. Selection of criteria, defined for the UI Requirements, based on the definitions in Section 5

Elicitation: develop appropriate representation of workflow by UI designer; specify interaction and behavioral detail
Implementation: verify feasibility; transform architecture into design solutions
Evaluation: evaluate if UI meets UI requirements; methods to measure improvements in effectiveness and efficiency; verify requirements and refine existing requirements
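A gap analysis of this kind boils down to a criteria × models matrix over three fulfillment levels, which can then be tallied per model. As a rough sketch (the criteria and ratings below merely abbreviate the paper's tables for illustration):

```python
# Fulfillment levels: '+' fulfilled, 'o' partly fulfilled, '-' not fulfilled.
# Criteria and ratings abbreviate the paper's tables for illustration only.
analysis = {
    "observe user in context of use":        {"Scrum": "-", "XP": "-"},
    "gather and consider stakeholder input": {"Scrum": "o", "XP": "o"},
    "verify feasibility":                    {"Scrum": "+", "XP": "+"},
}

def tally(model):
    # Count how many criteria each model fulfills at each level.
    ratings = [row[model] for row in analysis.values()]
    return {level: ratings.count(level) for level in "+o-"}

print(tally("Scrum"))
```

The per-model tallies make the overall pattern of strengths and gaps visible at a glance.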
According to these criteria, the two agile models Scrum and XP were analyzed as to whether the criteria are met. This allows comprehensive statements about the consideration of UE activities and outcomes in agile SE. The analysis results are presented by type of requirement, in the order of the three stages, and are based on the model descriptions from the sources cited above. Subsequent to the analysis, the authors give recommendations for the two agile SE models that would enhance the consideration of the three requirement types in Scrum and XP.

6.1 Implementation of User-Centered Requirements

The results of the analysis for the Usability Requirements (Table 2) show that neither Scrum nor XP considers this type of requirement appropriately. During the elicitation of Usability Requirements, only one criterion, the consideration of stakeholder input, is partly fulfilled by both models. The insufficient treatment of overarching Usability Requirements can also be seen in the evaluation activities: just one criterion is met by the Scrum model, and only to some extent.

Table 2. Selection of criteria, defined for the Usability Requirements. + fulfilled; ─ not fulfilled; o partly fulfilled.

Usability Requirements                                          SCRUM   XP
Elicitation
  observe user in context of use                                  ─      ─
  consider workflow-oriented quality criteria                     ─      ─
  measurable, verifiable and precise usability requirements       ─      ─
  gather and consider stakeholders' input                         o      o
Evaluation
  verify if requirements are met                                  ─      ─
  measure end user's satisfaction                                 ─      ─
  check requirements and refine existing requirements             o      ─
During the elicitation of Workflow Requirements, hardly any of the criteria could be found in Scrum or XP, except that the XP model partly fulfills the criterion of checking whether the new workflow is an improvement from the users' perspective. However, the agile models possess solid strengths in the evaluation of Workflow Requirements. The only criterion met by neither model is the verification of workflow mockups against the improved workflow; the impact on usability of this unconsidered criterion is negligible.
Table 3. Selection of criteria, defined for the Workflow Requirements

Workflow Requirements                                               SCRUM   XP
Elicitation
  specify system behavior for given task, related to concrete goal    ─      ─
  check if new workflow is an improvement from users' perspective     ─      o
Evaluation
  check correctness and completeness of workflow description          +      +
  check workflow mockups for correctness, completeness and
    possibly find new requirements                                    ─      ─
  verify requirements and refine existing requirements                +      +
  verify that final system meets requirements                         o      +
Table 4. Selection of criteria, defined for the User Interface Requirements

User Interface Requirements                                         SCRUM   XP
Elicitation
  develop appropriate representation of workflow by UI designer       ─      ─
  specify interaction and behavioral detail                           ─      ─
Implementation
  verify feasibility                                                  +      +
  transform architecture into design solutions                        +      +
Evaluation
  evaluate if UI meets UI requirements                                o      ─
  concluding evaluation to see if system meets requirements           ─      +
  methods to measure improvements in effectiveness and efficiency     ─      ─
  verify requirements and refine existing requirements                +      ─
The elicitation of User Interface Requirements is not provided for by either of the two models. However, for the criteria of the implementation activities, both models provide an opportunity to verify the feasibility of certain interaction concepts and to consider technical constraints in design decisions before the UI concepts are implemented. In terms of the evaluation of UI Requirements, the two models show several distinctions. The Scrum model provides a way to verify UI Requirements with users and experts, whereas there is no information about a comparable activity in the literature on the XP model. As opposed to that, XP does perform concluding evaluations, within the scope of automated tests, to see whether the system meets the UI Requirements. Neither model considers measuring the improvements in the users' effectiveness and efficiency.

6.2 Conclusion and Recommendations

Looking at the summarized results, it becomes apparent that both agile models have significant deficiencies in handling User-Centered Requirements. Usability Requirements are treated insufficiently in the important stages of development. Regarding the more detailed requirements, the agile models possess certain strengths and the potential for integration with UE activities.
Workflow Requirements, for instance, are dealt with appropriately in the evaluative activities, but it needs to be ensured that they are elicited and processed adequately during the previous stages from a UE standpoint. Development can be essentially influenced at the granular level of UI Requirements; however, the UI Requirements have to be derived from correct workflow descriptions and qualitative Usability Requirements. The recommendations listed below provide suggestions for amending the two models in order to include the criteria of User-Centered Requirements and ensure the usability of a software product. The recommendations are derived from the results of the analysis described above. The descriptions of both models mention an explicit exploration phase prior to the actual development. The development teams work out system architecture and technology topics to evaluate technical feasibility, while customer stakeholders generate Product Backlog items (in Scrum) or User Stories (in XP). Compared to common UE analysis activities, the exploration phases in Scrum and XP are rather short and are supposed not to exceed one usual process iteration. Nevertheless, these exploration phases can be used by UE experts to support the development teams with a rough exploration of the real users in their natural work environment. In order to stay agile, it is important not to insist on comprehensive documentation of the results, but rather to emphasize lightweight artifacts and to share the knowledge with the rest of the team. Having a UE domain expert on the development team also ensures that generic Usability Requirements are taken into account during requirement-gathering activities. Due to the vague definition of the customer role in Scrum and XP, it is not guaranteed that real users are among the group of customer stakeholders.
From a UE point of view, it is essential to gather information regarding the context of use and the users' workflows and to validate early design mockups and prototypes with real users. It is therefore necessary to explicitly involve users on-site for certain UE activities instead of other customer stakeholders, even when the latter claim to have solid knowledge of the end users' needs. The Product Backlog (in Scrum) and the User Stories (in XP) would be the right place to capture Workflow Requirements. However, there is a risk of losing the "big picture" of how single system features are related to each other, because both artifacts focus on documenting high-level requirements instead of full workflows. Modeling the workflow with Essential Use Cases and scenario-based descriptions [10] would be sufficient, but neither of the two models intends this. Neither Scrum nor XP intends to perform usability tests to verify whether the requirements are met, nor do they measure user satisfaction, e.g., using questionnaires. However, the Sprint Review in Scrum offers an opportunity for expert evaluations by involving people with UE expertise and/or real users in the Scrum Team and as attendees of the Sprint Review. This cannot substitute for comprehensive usability evaluations, but it helps to avoid user problems at an early stage. System testing in terms of usability is a problem in agile models because the solutions are specified, conceptualized and developed in small incremental steps. To perform a usability test with real users, however, the system has to have reached a certain level of completeness so that the implementation of their workflows can be evaluated. In traditional SE models, which also use incremental development, these workflows and the corresponding requirements are documented beforehand, and a prototype could be developed regarding
66
M. Düchting, D. Zimmermann, and K. Nebe
such a set of requirements for one workflow to be tested with the users. It certainly does not make sense to demand usability testing after each process iteration, but the tests could be tied to a release plan. Agile models provide good opportunities for close collaboration between developers and designers during development activities. Due to the overlapping development phases and the multidisciplinarity of the development teams, the feasibility of certain interaction models can be checked with developers frequently and without fundamentally slowing down design and implementation activities. In this way, design decisions can take Usability Requirements and technical constraints into account easily and at an early stage. In terms of the evaluation of UI Requirements the two models differ in their proceedings. The Sprint Review in Scrum can be used to review the user interface in order to verify whether the design meets the previously defined specifications - presuming that those specifications have been created and defined as Sprint Goals beforehand. XP does not stipulate a review meeting like the Scrum Model does. Unlike Scrum, however, the XP Model explicitly demands constant testing on a frequent basis. Certain subsets of UI Requirements are suited for automated tests, e.g., interaction- or behavior-related requirements, but it is barely possible to automatically verify the accurate implementation of a style guide.
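A behavior-related UI Requirement of the kind suited to XP's automated tests can be checked without any UI toolkit, by testing a view-model that drives the interface. The `SearchFormModel` class and its requirement ("submit stays disabled until the query field is non-blank") are hypothetical, chosen purely for illustration:

```python
# Hedged sketch: a hypothetical view-model for a search form, with an
# automated check of one behavior-related UI requirement.
class SearchFormModel:
    def __init__(self):
        self.query = ""

    def set_query(self, text):
        self.query = text

    def submit_enabled(self):
        # UI Requirement: submit only when a non-blank query is present.
        return bool(self.query.strip())

def test_submit_disabled_for_empty_query():
    form = SearchFormModel()
    assert not form.submit_enabled()   # initially disabled
    form.set_query("   ")
    assert not form.submit_enabled()   # whitespace does not count
    form.set_query("usability")
    assert form.submit_enabled()       # enabled once text is entered

test_submit_disabled_for_empty_query()
```

Style-guide conformance (colors, spacing, wording) resists this treatment, which is exactly the gap noted above: the behavioral subset automates well, the visual subset does not.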
7 Summary and Outlook

The underlying criteria for the assessment do not claim to be exhaustive. Nevertheless, they show the right tendencies and allow statements to be made about the realization in the particular models. The approach presented in this paper is used to evaluate how agile software engineering (SE) models consider activities of usability engineering (UE) in order to ensure the creation of usable software products. The user-centeredness of the two agile SE models Scrum and XP has been analyzed, and the question of how potential gaps can be filled without losing process agility was discussed. As requirements play a decisive role during software development, in software engineering as well as in usability engineering, the authors assumed that requirements can serve as the common basis on which agile SE models can work together with the results of usability engineering activities. The User-Centered Requirements defined by Zimmermann and Grötzbach describe three types of requirements derived from the results of UCD activities outlined in DIN EN ISO 13407 [5]. Using these three types of requirements, the authors derived more specific criteria in order to perform a gap analysis of the two agile models. As a result, the fulfillment of the criteria allowed comprehensive statements about the consideration of UE activities and outcomes in agile SE. It turned out that both agile models have significant deficiencies in handling User-Centered Requirements. Usability Requirements are treated insufficiently in all the important stages of development. The presented approach has been used to acquire first insights into the ability of agile SE models to create usable software. However, the authors are well aware of the need for further, more extensive and more specific criteria. Using and applying them to other agile models will make it possible to derive more generic statements about the integration of UE in agile SE models in general.
References

1. Beck, K.: Extreme Programming Explained. Addison-Wesley, Boston (2000)
2. Boehm, B.: A Spiral Model of Software Development and Enhancement. IEEE Computer 21, 61–72 (1988)
3. Cohn, M.: User Stories Applied – For Agile Software Development. Addison-Wesley, Boston (2004)
4. Cooper, A.: About Face 2.0. Wiley Publishing Inc., Indianapolis, Chichester (2003)
5. DIN EN ISO 13407: Human-centered design processes for interactive systems. CEN – European Committee for Standardization, Brussels (1999)
6. DIN EN ISO 9241-11: Ergonomic requirements for office work with visual display terminals (VDTs) – Part 11: Guidance on usability. International Organization for Standardization (1998)
7. Ferre, X.: Integration of Usability Techniques into the Software Development Process. In: Proceedings of the 2003 International Conference on Software Engineering, pp. 28–35, Portland (2003)
8. Larman, C.: Agile & Iterative Development – A Manager's Guide. Addison-Wesley, Boston (2004)
9. Mayhew, D.J.: The Usability Engineering Lifecycle. Morgan Kaufmann, San Francisco (1999)
10. Rosson, M.B., Carroll, J.M.: Usability Engineering: Scenario-Based Development of Human-Computer Interaction. Academic Press, London (2002)
11. Royce, W.: Managing the Development of Large Software Systems. In: Proceedings of IEEE WESCON, vol. 26, pp. 1–9 (August 1970)
12. Schwaber, K., Beedle, M.: Agile Software Development with Scrum. Prentice Hall, Upper Saddle River (2002)
13. Zimmermann, D., Groetzbach, L.: A Requirement Engineering Approach to User Centered Design. In: HCII 2007, Beijing (2007)
How a Human-Centered Approach Impacts Software Development

Xavier Ferre and Nelson Medinilla

Universidad Politecnica de Madrid, Campus de Montegancedo, 28660 Boadilla del Monte (Madrid), Spain
{xavier, nelson}@fi.upm.es
Abstract. Usability has become a critical quality factor in software systems, and it requires the adoption of a human-centered approach to software development. Including humans and their social context among the issues to consider throughout development deeply influences software development at large. Waterfall approaches are not feasible, since they are based on eliminating uncertainty from software development. On the contrary, the uncertainty of dealing with human beings, and their social or work context, makes the introduction of uncertainty-based approaches into software development necessary. HCI (Human-Computer Interaction) has a long tradition of dealing with such uncertainty during development, but most current software development practices in industry are not rooted in a human-centered approach. This paper reviews the roots of current software development practices, illustrating how their limitations in dealing with uncertainty may be tackled by adopting well-known HCI practices.

Keywords: uncertainty, software engineering, waterfall, iterative, Human-Computer Interaction-Software Engineering integration.
1 Introduction

Software development practices are mostly rooted in Software Engineering (SE), since SE as a discipline is pervasive in software development organizations all over the world. Its concepts are the ones with which the majority of developers are familiar, and this is especially true of senior management at software development organizations. HCI, on the other hand, has traditionally been considered a specialist field, and its view of development is not as present in software development organizations as the SE perspective. According to Seffah, "HCI structure and techniques are still relatively unknown, under-used, difficult to master, and essentially not well integrated in software development teams" [24]. Nevertheless, there is an increasing interest in usability, due to the importance of graphical user interfaces nowadays [11], and in the HCI methods to manage it, which will likely achieve wider use and greater impact in the near future [27]. Therefore, usability awareness has risen greatly in software development in the last decade. There is now a certain consensus on the aim of building usable systems,

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 68–77, 2007. © Springer-Verlag Berlin Heidelberg 2007
leading to a need to integrate usability methods into SE practices, giving them the necessary human-centered flavor. The term "Human-Centered Software Engineering" has been coined [25] to convey this idea. In turn, HCI practitioners need to show upper management how their practices provide value to the company in the software development endeavor, in order to get a stronger position in the decision-making process. HCI and SE need to understand each other so that both can complement each other effectively. While SE may offer HCI practitioners participation in decision-making, HCI may offer proven practices that help in dealing with the uncertainty present in most software development projects. In the next section the diverging approaches of HCI and SE are analyzed. Next, in Section 3 the role of uncertainty in software development is outlined, elaborating on problem-solving strategies and how they apply to software development. Section 4 presents how joint HCI-SE strategies may be adopted for projects where uncertainty is present. Finally, Section 5 presents the conclusions.
2 HCI and SE Development Approaches

SE is defined as the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software [13]. In the pursuit of these objectives, SE has highlighted software process issues, and it has also traditionally focused on dealing with descriptive complexity. On the other hand, HCI is a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use in a social context, and with the study of major phenomena surrounding them [22]. Usability is the main concern of HCI, which is multidisciplinary in essence. The HCI view of software development is, in a certain sense, broader than the SE one, which mostly focuses on the running system in isolation. In contrast, HCI does not address specific issues, like software process or software architecture, in comparable depth. Fig. 1 shows how SE and HCI differ in their main subject of interest in software development. While HCI cares about the impact created by the software on the user and his social context, SE focuses mainly on the correctness of the running software system itself. Software engineers mostly consider usability a user interface issue, usually dealt with at the end of development, when the 'important' part of the system has already been built. Conversely, HCI experts carefully study the users and their tasks in order to better fit the system to the intended users, and they consider that once the system interaction has been defined, software engineers may begin 'building' the system. There is a high degree of misunderstanding between both fields, along with some lack of appreciation for the work performed by the other discipline. Practitioners of both fields think that it is they who do the "important job" in software development. Compared to SE, HCI may appear to lack maturity.
Along these lines, Mayhew states that the integration of usability engineering with the existing software development lifecycle has not yet been solved, mostly due to the state of maturity of the Usability Engineering discipline [20]. Conversely, SE methods may look too system-centered to achieve an effective user-system interaction, as understood in HCI.
Fig. 1. Comparison between HCI and SE main focus
Despite this lack of mutual understanding, both disciplines need to collaborate, since there is a non-trivial overlap between their respective objects of study and practice. In particular, requirements-related activities are considered a cornerstone of the work of both HCI and SE. The decision of which system is going to be built is quite important for usability purposes, so HCI has a lot to say about it, while requirements engineering is an SE subdiscipline with recognized importance in the field, so software engineers will not completely hand over requirements-related activities to usability experts. The traditional overall approach to development in SE has been the waterfall lifecycle. With respect to requirements, it is based on requirements that are fixed (frozen) at early stages of development. Nevertheless, the waterfall lifecycle is nowadays considered in SE as valid only for developing software systems of low-to-medium complexity in domains where the development team has extensive experience. As an alternative to the waterfall, iterative development is currently identified as the development approach of choice, even if its practical application finds some opposition. HCI, in contrast, has traditionally adopted an iterative approach to development. Therefore, some promising opportunities for SE-HCI collaboration arise. Conflicts may occur between both kinds of practitioners, but they must be solved if usability is to be considered a relevant quality attribute in mainstream software development. Fortunately, recent trends in SE show a higher acceptance of uncertainty in software development, and this can provide a higher appreciation for HCI practices, as explained in the next sections.
3 Uncertainty in Software Development

Uncertainty is currently accepted as a necessary companion of software development [3], [19]. However, SE has traditionally considered uncertainty harmful and eradicable. The aim was to define a "safe" space where no uncertainty could affect the work of software developers. The development of software systems of ever higher complexity has led to the need to change this approach. In order to deal with complexity, the traditional SE view considers only descriptive complexity (the quantity of information required to describe the system, according to Klir & Folger [17]). It is a useful dimension for working in the software universe but, on most occasions, it is not enough on its own to explain that universe. Descriptive complexity needs to be combined with the complexity due to uncertainty, which is defined by Klir & Folger as the quantity of information required to resolve any uncertainty related to the system [17]. Ignoring uncertainty in software development obstructs the objective of better coping with the highly complex problems to be addressed by software systems, since it narrows the interpretation of both the problem and the possible strategies for building a successful solution. Complexity due to uncertainty adds a new dimension to the software space, as shown in Fig. 2. When the software universe is extended to two dimensions, some hidden issues that hinder software development projects are uncovered, and new solutions emerge.
Fig. 2. Extension of the software universe when considering the uncertainty dimension
Dealing with uncertainty is unavoidable in software development. But uncertainty is not just an undesired companion on the software development journey; it can be used as a tool that offers a powerful means of attacking recurring problems in software development. Having uncertainty-based means in their toolbox offers software development teams a richer background and vision with which to tackle their work in the complex software universe. The usage of uncertainty as a tool in software development takes several forms: the introduction of ambiguity in the solution, and the adoption of problem-solving strategies that manage uncertainty.
3.1 Ambiguity as a Way of Introducing Uncertainty in the Solution

Abstraction is a simplification tool that expresses just the essential information about something, leaving out the unnecessary details. This omission deliberately introduces uncertainty, which manifests itself in the form of ambiguity. An abstraction is precise with respect to the essence of the topic conveyed, but it is necessarily ambiguous with respect to the particulars, which are intentionally taken out of the picture. When making design decisions, uncertainty also plays a major role in providing solutions that are easier to maintain, modify or extend. For example, the information hiding principle [21] promotes the introduction of uncertainty in the design by not providing details on how a particular module is implemented. Modularization on its own does not provide benefits for this purpose, since a careful design of the modules and their interfaces is necessary to attain the necessary relation of indifference between modules. Any design decision that attempts to introduce some degree of ambiguity in the solution being developed uses uncertainty as a tool for allowing easier future modifications. As a side effect, development usually gets more complex and more difficult to manage when employing uncertainty-based strategies; in a similar way, object-oriented design is more complex than the structured development paradigm, but provides a more powerful and less constrained instrument for the development of complex software systems.

3.2 Problem-Solving Strategies and Uncertainty

Human beings use different strategies according to the extent of the uncertainty they must confront: a linear or industrial strategy may be employed with zero or negligible uncertainty; a cyclical or experimental strategy when facing medium uncertainty (something is known); and an exploratory or trial-and-error strategy when high uncertainty needs to be dealt with.
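The information hiding principle mentioned above can be sketched in a few lines. The `TaskQueue` example and its storage strategy are hypothetical illustrations, not taken from [21]; the point is only that clients depend on the interface while the representation stays free to change:

```python
# Hedged sketch of information hiding: clients see only push() and
# pop_next(); the internal storage strategy is a hidden detail.
class TaskQueue:
    """Interface: push() and pop_next(). Storage strategy is hidden."""

    def __init__(self):
        self._items = []  # hidden: a plain list today, could be a heap tomorrow

    def push(self, task, priority):
        self._items.append((priority, task))

    def pop_next(self):
        self._items.sort()            # lowest priority value served first
        return self._items.pop(0)[1]  # clients never see how ordering is done

q = TaskQueue()
q.push("write tests", priority=2)
q.push("fix crash", priority=1)
print(q.pop_next())  # -> fix crash
```

The deliberate ambiguity about how ordering is implemented is exactly the useful uncertainty the section describes: the module can later swap the list for a heap without touching any client.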
The higher the uncertainty level allowed by a strategy, the higher its power for dealing with uncertainty in the problem. The linear strategy (step after step) follows a path from a starting point to an ending one, given that both points and the path between them are known in advance. That is, it is necessary to know the problem, the solution, and the way to reach that solution. If all these requirements are met, the linear strategy is the cheapest one. To make its application possible, any uncertainty needs to be eradicated before the resolution process begins. The paradigm that represents the linear strategy in software development is the waterfall life cycle. It follows the sequence requirements, analysis, design, implementation, and testing, which is a direct translation of the Cartesian precepts enunciated in the Discourse on Method [8]: evidence, analysis, synthesis and evaluation. The idea behind these principles is to undertake first the what and afterwards the how. This separation between requirements and design is an abstract goal and not a human reality [1]. The so-called incremental strategy is a variant of the linear one in which the problem is divided into pieces, which are then undertaken one by one. The cyclical or experimental strategy (successive approximations), when it converges, comes progressively closer to an unknown final destination through the periodic refinement of an initial proposition (hypothesis). A cyclical strategy is adopted when
the solution is unknown, but there is enough information on the issue to formulate a hypothesis. The paradigm for the cyclical strategy in the software world is the spiral model [2]. A common statement in software development describes each cycle of the spiral model as a small waterfall. This is inappropriate, since the spiral recognizes the presence of uncertainty throughout the (risk-driven) process, whereas the waterfall, whatever its size, requires eradicating uncertainty at the beginning. The arboreal or exploratory strategy (trial and error) is the way to reach an unknown destination without a best first guess, provided that the universe is closed. In the case of an open universe, the exploratory strategy does not ensure finding the solution, but neither can any of the other strategies, given the same conditions of uncertainty. An exploratory strategy is in place every time a solution is discarded and development goes back to the previous situation. The Chaos life cycle [23] is very close to an exploratory strategy, but it is limited by Raccoon's waterfall mindset.

3.3 Uncertainty and HCI

HCI has developed interactive software for decades, without the obsession with uncertainty eradication present in SE. In fact, the HCI literature contains some examples of insight regarding real software development. Hix & Hartson's [10] observations about the work of software developers show that they usually operate in alternating waves of two complementary types of activities: bottom-up, creative activities (a synthesis mode) and top-down, abstracting ones (an analysis mode). Hix & Hartson also reveal the closeness that exists between analysis and design activity types, especially in the requirements-related activities. It is not sensible, then, to try to draw a clear separation between the two activity types.
With regard to the methodologies in place in software development companies (based on a waterfall approach), they report that in some of their empirical studies they noticed that "iterative and alternating development activities occurred, but because corporate standards required it, they reported their work as having been done strictly top-down" [10]. The reality of development was hidden behind the waterfall's mask of order. According to Hakiel, "There is no reason why a design idea might not survive from its original appearance in requirements elicitation, through high- and low-level design and into the final product, but its right to do so must be questioned at every step" [9]. This approach is a radical departure from the waterfall mindset mostly present in SE, which was traditionally presented as the way to develop software in an orderly manner. The multidisciplinary essence of HCI has helped to provide a less rigid approach to development in the field. As Gould and Lewis [12] say, when a human user is considered (as in the upper part of Fig. 1), a coprocessor of largely unpredictable behavior has been added. Uncertainty accompanies any attempt to develop interactive systems of non-trivial complexity, since human beings are part of the supra-system being addressed: the combination of the user and the software system, trying to perform tasks that directly relate to the user's goals. User-centered or human-centered development is the HCI approach to the development process, and it has traditionally introduced uncertainty by labelling itself as iterative. In this sense, [5], [10], [16], [22], and [26] agree on considering iterative development a must for a user-centered development process. Therefore,
iterativeness is at the core of HCI practices. This is a real iterativeness, in the sense that evaluation is often considered formative: not just an exam for identifying mistakes, but a tool for giving form to the interaction design, and possibly for identifying new requirements.
4 Common HCI-SE Problem-Solving Strategies

As presented in the previous section, uncertainty is a tool for problem resolution; in particular, it is a tool for interactive software development. Uncertainty-based approaches have been adopted in the resolution strategies of both HCI and SE, without being labeled as such. When trying to integrate usability and HCI methods into mainstream development, the extensive HCI experience in dealing with uncertainty may be incorporated into SE practices, making them better prepared to cope with the development of complex systems with a high usability level. Non-linear problem-solving strategies present important challenges with respect to estimation and planning, along with the danger of continuously iterating without advancing towards the solution. A certain degree of flexibility, such as HCI usually employs, is necessary for dealing with these issues. Accordingly, some degree of uncertainty will have to be introduced into the formal procedures advocated by SE methodologies.

4.1 Iterative Development

Iterative-cyclical strategies are currently at the center of debate in SE, with agile and iterative practices (see, for example, [18]). When adopting cyclical strategies of this kind, the introduction of HCI practices may be undertaken with greater success than in former proposals for integration into waterfall lifecycles, like [7]. The aim of integrating usability engineering and HCI practices into mainstream software development, which mostly refuses to deal with uncertainty, has led to solutions that are more formal in an SE sense but leave out the uncertainty present, for example, in iterative approaches. One such solution is Mayhew's Usability Engineering Lifecycle [20], which is based on a two-step process where analysis activities are performed in a first phase, and design and evaluation activities are then performed iteratively in a second phase; but there is no place for resuming analysis activities.
Therefore, it is based on a frozen-requirements paradigm, reminiscent of a waterfall mindset. Nevertheless, iterativeness has been at the heart of usability engineering practices, because usability testing has been the central point around which the whole development effort turns. It is necessary to test any best-first-guess design. Observational techniques and sound analysis are performed with the aim of getting a high-quality first design, but usability testing with representative users is then performed to check the logical constructs the design is made of against reality. The expected functionality and quality levels of the final system can be specified, but there is a certain degree of uncertainty in building the solution, the software system, in the sense that when undertaking the construction of some part of the system we do not know exactly how far we are from the specified solution. This is especially true when dealing with usability. Any design decision directed to usability
improvement needs to be tested with representative users, in order to check the actual improvement in usability attributes like efficiency in use. When the system under scrutiny includes the final user on top of the computer system, as is necessary for managing the usability of the final product, flexibility is required to adapt the partial prototypes according to evaluation outcomes.

4.2 Exploratory Strategies and the Definition of the Product Concept

Exploratory strategies are not yet dealt with in SE literature and practice. Traditional information systems, like payroll systems, are well defined, and most SE methodologies are directed at building them. Input-process-output models fit this kind of problem very well: the automation of procedures previously performed manually, with well-defined rules and algorithms. The product concept is clearly delimited in this kind of system, so requirements can be written down with less risk of misunderstandings between the customer and the development team. Actually, the IEEE body of standards includes a standard for establishing the user requirements, the Concept of Operations [15] or ConOps, but it is seldom used in software development, unlike the more system-oriented (or developer-oriented) IEEE recommended practice for software requirements specification [14], which receives much more attention from the SE field. On the other hand, the HCI field has a long tradition of dealing with ill-defined problems, developing new products with a high degree of innovation. Even if the creation of these systems has not been its main focus of activity, dealing with problems with neither an obvious solution nor indications of how development should proceed has been part of HCI practitioners' work. Accordingly, several HCI techniques are especially well suited for defining the product concept.
These techniques favor participative and creative activities, which fit very well the purpose of creating a model of how the system works from the user's point of view, and of studying whether it fits user needs and expectations. Examples of these techniques are Personas [6], Scenarios [4], Storyboards and Visual Brainstorming [22]. As the development of interactive systems keeps changing to new paradigms of interaction, with an ever-increasing degree of novelty required, these HCI techniques will have to be either adopted by software engineers or applied by HCI experts belonging to the development team.
5 Conclusions

In this paper we have shown how uncertainty plays a major role in the construction of non-trivial interactive software systems. While uncertainty in the problem may be harmful, uncertainty in the solution may be useful when used as a tool for dealing with the former kind of uncertainty (the one in the problem). HCI has traditionally applied flexible processes that allow participatory design, and it has promoted the usage of prototypes aiming at greater flexibility for making changes to the (partial) software solution. Additionally, some HCI techniques are especially well suited for the development of innovative software systems, which
are ill-defined by definition, and they may be adopted for exploratory problem-solving strategies. Even if this is part of standard HCI practice, the convenience of this approach has not been formalized in a way that helps the integration of HCI methods into mainstream software development practices. Recent awareness of the obstacles that traditional approaches, like the waterfall life cycle, impose on the endeavor of successful systems development leads to a more favorable attitude towards the introduction of HCI methods, which ultimately lead to better quality products. In particular, HCI may play an important role in introducing practices that improve the usability of the final product, while additionally preparing businesses to deal better with uncertainty in software development. Understanding the roots of current software development practices and knowing their deficiencies in dealing with uncertainty is essential for any software development business. A model for software development that considers uncertainty is needed, in order to move from a field based only on the expertise of gurus to a software development field with sound foundations for the selection of development practices.
References
1. Blum, B.I.: Software Engineering: A Holistic View. Oxford University Press, New York, USA (1992)
2. Boehm, B.W.: A Spiral Model of Software Development and Enhancement. ACM SIGSOFT Software Engineering Notes 11-4, 14–24 (1986)
3. Bourque, P., Dupuis, R., Abran, A., Moore, J.W., Tripp, L., Wolf, S.: Fundamental Principles of Software Engineering – A Journey. Journal of Systems and Software 62, 59–70 (2002)
4. Carroll, J.M.: Scenario-Based Design. In: Helander, M., Landauer, T., Prabhu, P. (eds.) Handbook of Human-Computer Interaction, 2nd edn., pp. 383–406. Elsevier, North-Holland (1997)
5. Constantine, L.L., Lockwood, L.A.D.: Software for Use: A Practical Guide to the Models and Methods of Usage-Centered Design. Addison-Wesley, New York, USA (1999)
6. Cooper, A., Reimann, R.: About Face 2.0: The Essentials of Interaction Design. Wiley Publishing, Indianapolis (IN), USA (2003)
7. Costabile, M.F.: Usability in the Software Life Cycle. In: Chang, S.K. (ed.) Handbook of Software Engineering and Knowledge Engineering, pp. 179–192. World Scientific, New Jersey, USA (2001)
8. Descartes, R.: Discourse on the Method of Rightly Conducting One's Reason and of Seeking Truth (1993), http://www.gutenberg.org/etext/59
9. Hakiel, S.: Delivering Ease of Use. Computing and Control Engineering Journal 8-2, 81–87 (1997)
10. Hix, D., Hartson, H.R.: Developing User Interfaces: Ensuring Usability Through Product and Process. John Wiley and Sons, New York (NY), USA (1993)
11. Glass, R.L.: Facts and Fallacies of Software Engineering. Addison-Wesley, Boston (MA), USA (2003)
12. Gould, J.D., Lewis, C.: Designing for Usability: Key Principles and What Designers Think. Communications of the ACM, 300–311 (March 1985)
13. IEEE: IEEE Std 610.12-1990. IEEE Standard Glossary of Software Engineering Terminology. IEEE, New York (NY), USA (1990)
14. IEEE: IEEE Std 830-1998. IEEE Recommended Practice for Software Requirements Specifications. IEEE, New York (NY), USA (1998)
15. IEEE: IEEE Std 1362-1998. IEEE Guide for Information Technology – System Definition – Concept of Operations (ConOps) Document. IEEE, New York (NY), USA (1998)
16. ISO: International Standard: Human-Centered Design Processes for Interactive Systems, ISO Standard 13407:1999. ISO, Geneva, Switzerland (1999)
17. Klir, G.J., Folger, T.A.: Fuzzy Sets, Uncertainty and Information. Prentice Hall, N.J. (1988)
18. Larman, C.: Agile and Iterative Development: A Manager's Guide. Addison-Wesley, Boston (MA), USA (2004)
19. Matsubara, T., Ebert, C.: Benefits and Applications of Cross-Pollination. IEEE Software, 24–26 (2000)
20. Mayhew, D.J.: The Usability Engineering Lifecycle. Morgan Kaufmann, San Francisco (CA), USA (1999)
21. Parnas, D.L.: On the Criteria To Be Used in Decomposing Systems into Modules. Communications of the ACM 15-12, 1053–1058 (1972)
22. Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S., Carey, T.: Human-Computer Interaction. Addison-Wesley, Harlow, England (1994)
23. Raccoon, L.B.S.: The Chaos Strategy. ACM SIGSOFT Software Engineering Notes 20-5, 40–46 (1995)
24. Seffah, A., Andreevskaia, A.: Empowering Software Engineers in Human-Centered Design. In: Proc. of the ICSE'03 Conference, Portland (OR), USA, pp. 653–658 (2003)
25. Seffah, A., Gulliksen, J., Desmarais, M.D. (eds.): Human-Centered Software Engineering – Integrating Usability in the Development Process. Springer, Heidelberg (2005)
26. Shneiderman, B.: Designing the User Interface: Strategies for Effective Human-Computer Interaction, 3rd edn. Addison-Wesley, Reading (MA), USA (1998)
27. Vredenburg, K., Mao, J.Y., Smith, P.W., Carey, T.: A Survey of User-Centered Design Practice. In: Proc. of CHI 2002, Minneapolis (MN), USA, pp. 471–478 (2002)
After Hurricane Katrina: Post Disaster Experience Research Using HCI Tools and Techniques Catherine Forsman USA
[email protected] Abstract. This paper focuses on the time period between September 2005 and September 2006 where HCI research experiments were deployed in a posthurricane Katrina disaster area. This area stretched from the cities of Waveland and Bay St. Louis, Mississippi (the epicenter of hurricane Katrina) to shelters in Baton Rouge and Houston, Texas. The HCI experiments were constructed in order to understand immediate disaster aftermath issues of a population in context of activities, information and organizational needs. The use of a Participatory Design (PD) methodology, Ethnographic techniques, and design Probes were refined over the course of the longitudinal study. Field notes were created in an iterative process with individual participants over a course of time due to the impact of shock and cognitive issues early on. These field notes then influenced a set of personas that were iterated and used as a vehicle to gather and validate field research findings and people’s needs within the disaster framework. The main goal of this paper is not to propose informational, organizational or technology solutions to the complex problems inherent in a disaster cycle, but to illustrate both the failure and success of using HCI methods in a post disaster situation. Therefore, a disaster cycle is outlined and described in this paper. Insights and thoughts regarding the Rescue and Recovery phases are described and notes where HCI as a practice may influence or contribute to these areas within the disaster cycle are outlined. The last part of the paper illustrates the first HCI experiment in the field and some of the iterations and findings from this practice. This first research study was undertaken at a grassroots level, yet this does not mean valuable information could not be gathered in further studies of governmental, NGOs, or businesses participating in planning, preparing or rescue and recovery efforts during a disaster. 
In fact, the opportunity to combine grassroots and governmental HCI research could offer immense benefits. However, as a grassroots initiative, it is a level of inquiry without the constraints of political hierarchy. Given this, the paper focuses less on how HCI can be used in the more typical framework where a sponsor, such as a client, and an HCI worker collaborate in HCI "workplace" research, and more on developing tools and methods within communities.
Keywords: Disaster Management, HCI & Hurricane Katrina, Pervasive Computing, Context, Ethnographic Research in Disaster Situations, Participatory Design, Community Prototype Development, Design Probes, HCI.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 78–87, 2007. © Springer-Verlag Berlin Heidelberg 2007
1 Introduction
Disaster is a complex human and environmental event, oftentimes perplexing to the most brilliant social scientist, humanitarian worker, governmental official, or human being. FEMA and other governmental agencies work with a defined categorization of disaster, called a cycle of disaster [1]. This high-level cycle is an overall view of stages within a continual loop of disaster. Yet defining a fixed type of disaster, and the resulting solutions for appropriate information dissemination, acquisition, and organization, is difficult because no two disasters are the same. Each disaster brings with it specific characteristics. To envision the complexity, one could consider the difference between the impact of extreme environmental elements such as wind, water, air, and fire, and then compound that with the varying contexts in which such a disaster could take place. These contexts could be urban or rural, with different cultural and language requirements. For example, imagine the difference between 9/11 and Hurricane Katrina. These disasters took place in two different cities with different local governments, cultural histories, demographics, and urban plans. If one were to envision the types of activities and information involved in both disasters, there may be a few overlapping qualities at a high level, but in reality they are specifically different at the level of informational needs, activities, and urban context. In other words, disaster has a site-specific element that involves understanding context, activities, information, and the flexibility of real-time, ad hoc information adaptation to contextual activities. If this hypothesis holds, then research in disaster management from an HCI perspective may reap interesting findings.
Because HCI deals with the study of information systems and appropriate technologies for people within situated activities, it is a unique field, well suited for understanding human needs in adaptive and changing environments. That is what this paper is about: the process of using HCI tools and techniques in a post-disaster situation in order to learn how context and activities shape what people learn to do, and what information and technology they may need in order to do those things. In the future, conducting HCI research in disaster areas may lead to important findings regarding innovation for disaster situations, technology devices, information structures, and the creation of ontological frameworks of experience used as infrastructures for adaptive learning tools when the cycle repeats itself.
2 The Cycle of Disaster
The disaster cycle is outlined next in order to illustrate a framework and define what is meant by "disaster management."
Mitigation. This phase encompasses lessening the effects of possible disasters. It differs from other phases because it involves trying to learn from past disasters, through information and data, to lessen the severity of any future disaster. This phase also deals with evaluating risks and risk management [2].
Preparedness. Common preparedness measures include the proper maintenance and training of emergency services; the development and exercise of emergency population warning methods, combined with emergency shelters and evacuation plans; the stockpiling, inventory, and maintenance of supplies and equipment; and the development and practice of multi-agency coordination. An efficient preparedness measure is an emergency operations center (EOC) combined with a practiced region-wide doctrine for managing emergencies [2]. A development of interest to HCI professionals in this area is one where ethnographic observations regarding self-organizing behavior were used after 9/11. In 2002, the US Federal government created a new procedure for evacuating federal employees in Washington. The protocol is based upon social dynamics observed on 9/11 and attempts to "improve the ad hoc process" based upon ethnographic findings [3]. But even if there are some insights into how field research can contribute to understanding self-organizing systems for future disaster scenarios, is the concept of preparedness flawed? Certainly some risks can be avoided, but disaster by definition is about chaos and the unexpected, taking place in specific contexts that cannot be predetermined. Is it possible to be prepared for dynamic and complex situations that may not now exist, even in a risk model? In two surveys conducted by NYU's Center for Catastrophe Preparedness and Response (CCPR), one after 9/11 and one after Hurricane Katrina, they noticed a steep rise in participants' belief that one could not prepare for a disaster after the widespread destruction of Hurricane Katrina. The survey data is as follows: "62% of Americans said that it was nearly impossible to be very prepared for terrorist bombings, 60% said the same about hurricanes and floods, and 55% said the same of a flu epidemic" [4].
This shows, perhaps, a lack of confidence in the idea that anyone can "prepare" for such events. But it also illustrates a perception that preparedness may be an area of inquiry. The question is: what tools does one use to understand this issue? In reality, managing disaster is a combination of understanding the ad hoc organization before, during, and after the disaster occurs, which is a very difficult proposition.
Response. The response phase includes the mobilization of the necessary emergency services and first responders in the disaster area, such as firefighters, police, volunteers, and non-governmental organizations (NGOs). Additionally, this phase includes organizing volunteers [2]. One could imagine the dynamic events that occur in the real world during a disaster. There are also static preparedness protocols that can be described taxonomically, such as a "type" of response (e.g., rescue in water) and scenarios of rescue (e.g., evacuation to hospital). However, what actually occurs is rescuing someone from a nest of poisonous water snakes as the person struggles to stay afloat in oil-enriched water, with no clear directive on where to take the victim due to limited radio frequency and a lack of organizational directives. The actual narrative of events is very different from the simulation of the event prior to the disaster. This underscores an aspect inherent in response: the need for real-time collaboration in interactions with people, information, and technology, in a social-networking and ad hoc organizational manner, as needs arise with outcomes that can rarely be predicted. It also underscores the need for HCI research that deals both
with narratives of actual events and the creation of technical infrastructures, information structures, and organizational models for real-time response, access, and organization (the reading of patterns).
Recovery. The aim of the recovery phase is to restore the affected area to its previous state. It differs from the response phase in its focus: recovery efforts are concerned with issues and decisions that must be made after immediate needs are addressed [2]. The idea of "restructuring" brings with it a wealth of opportunity to explain and explore contextual and population needs through narrative. In other words, new social orders can be explored and remapped in accordance with what may not have existed before but ideally could have. The narrative can be shown in personas and scenarios, yet grounded in field research that can be validated by communities and individuals in order to ensure a community feedback loop.
3 Technology and Disaster
A closer look at the technology used during the Rescue and Recovery phases after Katrina leads to some interesting findings. For example, first responders often communicate via two-way radios. Two-way radios have a limited range, about a kilometer, but repeater stations can be positioned to increase the range. They are most often used to coordinate supplies, rescue missions, and communications between team members and a coordinator [5]. Additionally, to accommodate real-time dynamics, cell phones can be used, but oftentimes the network cannot respond, or infrastructure or device failure occurs due to environmental issues [6]. Moreover, even where a cell phone could work, when power failure occurred, all 911 center capabilities were disabled [7]. There was nowhere to call but friends or family. A useful technology after Hurricane Katrina was Ham radio. Commercial radio antennas are placed on top of hills in order to cover broader areas of reception, making them highly vulnerable to wind and earthquake tremors, whereas Ham radio operators build smaller antennas, knowing their broadcast range is within the local area, such as a city. If a Ham radio tower (which can be as small as 100 feet) falls over, it is easy to pick up and reinstall. ARRL President Jim Haynie testified before Congress in 2005 that "1000 Amateur Radio volunteers have been serving the stricken area to provide communication for such agencies as The Red Cross and the Salvation Army and to promote interoperability between agencies" [8]. FEMA passed out radios to the citizenry in Bay St. Louis, MS, so that they could listen to broadcasts from a local radio station and obtain information regarding food, shelter, and medicine [9]. The idea of local and smaller technology prevailed simply because it was quickly repairable and could be supported easily by governmental agencies (just pass out the radio from a truck).
There is no guarantee that any type of technology would not suffer the same fate as cell phones did during Hurricane Katrina: they may work, but there are important factors to consider that involve information and organization (e.g., 9/11 operators, system overload, complete loss of communication tools). However, as in the case of Ham radio, the key is the separation of information structures from the device, and the local, smaller scale of the technology. This has some precedent in HCI with pervasive computing [10, 11].
For the volunteers outside the disaster zone, a proliferation of internet use and social networking took place (bulletin boards, websites, etc.). The internet became a platform for grassroots initiatives and for the needs of individual and small-group citizen rescue missions (e.g., "Please send money for a gas card for the plane we are flying into the Gentilly area to distribute water.") [12]. While those outside the disaster zone can use the internet for organizational and informational purposes, those within the disaster area are most likely without immediate access to it. The important takeaway is that technology, understood in a real-world framework, worked best when it was easily restructured, could relay distribution and organizing information, and allowed for social networking via either voice or text.
4 Considering Disaster, Context and the User
If we think about the "user's" capacity for information organization and processing in a complex environment such as a disaster, the idea of the user as an isolated element, understood and normalized for specific psycho-cognitive interactions with an information system in a laboratory, does not hold. This type of user definition arose from cybernetics, where Herbert Simon proposed his ideas of bounded rationality and learning through information feedback and adaptation [13]. The objective of these studies was to determine task models, ergonomic needs, information models, and cognitive responses to a system [14]. From the set of results, usually gathered in a laboratory with task-based questions or via survey, a baseline could be created of a user with varying degrees of expertise and satisfaction in relation to a technology system and the tasks performed. Another train of thought, as written about by Drs. Lamb and Kling, is the idea of the user as a social actor, where context plays an important role in understanding the requirements for interactions [15]. The added relation of context and user accounts for the complexities of situated actions, such as space, interactions with objects and people, and power dynamics in the use of systems or information. Situated action was first introduced in 1987 in Lucy Suchman's book, Plans and Situated Actions: The Problem of Human-Machine Communication. And, as Lamb and Kling point out, "years later a particularly formative and influential study in this area appeared in Mumford's socio-technical ETHICS PD approach [16]. PD practitioners became keenly aware that structural constraints may prevent the exchange of information, but they believed that users were social actors and capable of mobilizing change.
This is not to say the early PD research sided only with the social actor as worker, but included within its perspective organizational changes through technology throughout the full power hierarchy of an organization." Basically, it was a creative way to consider context and people as social actors within their interactions with context, information, and technology [17, 18, 19]. Disaster, by its very definition, is chaos: rapidly changing and possibly disintegrating contexts. Context here can be the urban landscape, such as a city or home, or mental models of operation, such as knowing how to reach a medical facility if one is hurt. Due to the disruptive nature of disaster with respect to context, it is an important subject for HCI field research in disaster situations. That is, how do people, whether organizers or
survivors, deal with varying cognitive loads of information and organizational complexity in order to readjust themselves within survival contexts (shelters, hospitals, etc.)? Extending context beyond the workplace has some precedent in HCI in the area of pervasive and urban computing, where context is extended so that "cities can also be viewed as products of historically and culturally situated practices and flows. When we view urban areas in this context, rather than as collections of people and buildings, infrastructure and practice are closely entwined" [15, 20]. Over the course of time, methodologically, the research organically moved closer to the sensibilities of PD as envisioned in earlier Scandinavian HCI projects [23]. Observational notes were requested by NGOs so that they could better understand the conditions of specific locations. These notes were sent via email with participants' permission and editing. Participants in the research began organizing within the shelter and asking for photography or journal-recording advice in order to post information to the internet. Traditionally, PD dealt with context and human activities in work environments and was deployed with the understanding that a community would be studied and affected by decisions made about information systems or machines [24]. The core premise of PD was that better and safer working conditions would result from some sharing of power and an appreciation of the tacit knowledge and adaptive capabilities that workers contribute to organizational processes. In other words, researchers immersed themselves in a culture in order to contextualize it within their research, create a feedback loop with people within the context, and participate with the community in developing prototypes and articulations of requirements. In the historical context of PD, the research itself became a conduit for requirements that arbitrated the needs of the workers to management and vice versa.
In the case of HCI disaster research, the need for research within context becomes even more strongly coupled than in industry, as understanding the needs of a population cannot be divorced from survival actions in context. Understanding what organizational and information needs confront people while coping with the myriad facets of disaster can very likely inform information structures for disaster response in the future, as well as immediate feedback loops in the present.
5 The Research Experience
The full study took place over a two-year period, but for the purposes of this paper, the first experiment is briefly described, as it set the direction for the following research experiments. To understand the organizational complexity of people and how they rapidly relearn a new context in order to survive, ethnographic observation and contextual inquiry were used. Interviews were conducted regarding people's memories of their experience of the changed flow of the city during escape, compared to preexisting conditions. Interviews took place in order to understand the memory of how the participant once contextualized their day-to-day experiences within their
homes and neighborhoods, and how this had changed. Additionally, design probes¹ (cameras, diaries, sketching, asking participants to post photos to an internet site and video to YouTube) were used [21]. The research had as its goal to understand what impact changes in context-as-situated-action had on participants. The inquiry highlighted the needs of survivors for both information and organization around flow, in order for them to co-create ongoing survival strategies. The output from this area of inquiry was a set of narratives, field notes, scenarios, and personas with clear representations of the participants in context before and after the disaster. These personas and scenarios were taken back to communities and, via workshops in some cases or individuals in others, validated and iterated, thereby involving a "community" aspect. Another interesting output from this research was political in nature. Given that context became a central area of inquiry, findings emerged regarding the appropriate distribution of goods and the cultural misinterpretation of needs between governmental and NGO workers and hurricane survivors. That is, the survivor believed "You just don't get me" when it came to privacy concerns, the organization of cots, shower schedules, and food. The NGO and Red Cross organizers followed a protocol of organization that had little to do with context and had been prepared prior to the hurricane. This may be categorized as a missing layer of information interfacing between a specific, situated population and the relief organizations' protocols. In order to bridge this gap, organizational meetings took place within the shelters, and field notes were used as tools of information dissemination, pointing to the need for a more efficient, yet malleable, form for creating an information interface between the two populations. Similar research was conducted in the city of Bay St. Louis, MS.
In this context, gas stations and Red Cross distribution centers had gluts of certain supplies (baby diapers, and sweaters in 90 °F weather, in an area from which the babies had been evacuated) and very little of the supplies needed for navigating the new context (accessible medical centers for tetanus shots, appropriate makeshift shelter for the 200+ people sleeping in the church's parking lot, or forklifts to clear major thoroughfares, and street signs to order the flow of traffic). This is not an uncommon pattern, as witnessed in ethnographic work in Peru in 1990, when a powerful earthquake struck. Noticing that indigenous populations were not receiving the food and goods that arrived through NGOs, Theo Schilderman, an anthropologist, studied the problem of how official relief agencies, survivors, and grassroots volunteers misinterpreted each other's needs. The result was a deprivation of goods to survivors, because both governmental agencies and NGOs were unfamiliar with the community conditions. Additionally, when the goods were finally distributed, there was a mismatch between what emergency management authorities were trying to give to the victims and what they actually needed [22]. Creating information structures that can easily be accessed around population needs within specific contexts could alleviate some of this tension.

¹ By the word "probe" is meant a label used to categorize a set of tools used in field research and design practice to gather information and iterate ideas with people. Tools categorized as probes include diaries, remote cameras, drawing exercises mailed to the researcher, etc. Probes have a lineage within the design field for open inquiry. However, the author has intentionally not classified Participatory Design (PD) within the area of probes, in order to distinguish a different historical lineage: PD resulted from a need to incorporate an understanding of politics into the research process, whereas classic probes (diaries, cameras, etc.) were developed for design feedback loops that may or may not have had as an objective an understanding of mediating politics.

5.1 Aftermath Experiment
The first experiment was performed from September 25–30, 2005 in three separate locations. Location 1: the Houston Astrodome; Location 2: an NGO center outside a Baton Rouge shelter; and Location 3: various places within the city limits (homes or makeshift shelters) of Bay St. Louis and Waveland, Mississippi. Two days were spent in each location, interviewing and observing people with a video camera and recording field notes. Recruitment was done by word of mouth and over the internet.

5.2 Social Networking Via the Internet
Before arriving onsite, the necessary connections with NGOs and emergency medical personnel in specific locations were quickly established through email and the internet. Craigslist and Katrina bulletin boards were also used during this phase. Due to the organizational complexities thwarting NGOs and governmental agencies in the accurate distribution of volunteers and supplies, certain people formed their own organizations and drove into the disaster area distributing goods. They organized via the internet and appeared on the doorsteps of shelters, churches, or roadsides with supplies. Connecting with these people became invaluable for covering wider geographic areas of interest, because they distributed life-saving goods in an ad hoc fashion.

5.2.1 Research Issues and the Failure of the Probe
Ethnography was performed in the following way: observational, interactive (conversational), and sometimes participatory, in a shelter or on the street with survivors.
Cognitive issues arose, such as memory recall in the telling of a story. Additionally, the relative importance of temporal information would shift with sudden interruptions in the conversation. Participants would interrupt themselves with more pressing concerns, such as fears that "my house may blow up," or "I don't know where my child is; could you help me get information?" Given this, cameras were passed out to each participant so that they could record the details of their lives when they had time and mail them to the researcher at a later date. Envelopes and stamps were included so that, wherever they relocated, they could mail the envelope from that location. These diaries, handed out in September 2005, are still arriving in the mail, illustrating the importance of understanding post-traumatic stress disorder and its lasting effects, as well as how long it may take to restructure key infrastructures, such as the post office. Below is a chart illustrating the response times.
Table 1. Cameras passed out in September 2005, with return dates

Participant Location   Number of Cameras   Date/Amount Returned
Baton Rouge Shelter    25                  1, Jan. 2005; 10, Dec. 2006
Houston Shelter        10                  5, June 2005
Bay St. Louis          27                  3, Oct. 2005; 5, Nov. 2005; 6, collected
Waveland               8                   2, collected Mar. 2006
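As a rough worked example, the per-location return rates implied by Table 1 can be tallied with a short script. The location names and camera counts below transcribe the table's rows; reading each "n, date" entry as n cameras returned is an assumption of this sketch, not a claim from the study.

```python
# Tally camera return rates from Table 1.
# (handed_out, returned_batches) pairs transcribe the table's rows;
# treating each "n, date" entry as n returned cameras is an assumption.
returns = {
    "Baton Rouge Shelter": (25, [1, 10]),
    "Houston Shelter": (10, [5]),
    "Bay St. Louis": (27, [3, 5, 6]),
    "Waveland": (8, [2]),
}

for location, (handed_out, batches) in returns.items():
    returned = sum(batches)
    rate = returned / handed_out
    print(f"{location}: {returned}/{handed_out} returned ({rate:.0%})")
```

Under that reading, return rates range from a quarter (Waveland) to roughly half (Houston, Bay St. Louis), which is consistent with the paper's point that returns trickled in over many months.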
Understanding the context (destroyed post offices, multiple relocations of participants, stress in recounting events) demonstrates the need for adaptable research. After the initial field experiment, participants began to email images taken with their cell phones or cameras. Additionally, participants wished to post information on YouTube and Flickr in order to reach a wider audience. They expressed the belief that these structures worked better for them as they struggled for assistance and wished to be noticed by a wider population. As this pattern developed, new approaches for organizing information in the research took shape. A good example of a website specifically for ethnographic research and disaster is the Indigenous Knowledge in Disaster Management website.
6 Conclusion
The Disaster Cycle was outlined in this paper in order to set the stage for HCI field research. The research was explained so that anecdotal evidence could be presented on how HCI research needed to be both participatory and adaptive in a post-disaster environment.
Acknowledgments. Thanks to the participants in Bay St. Louis, Slidell, and Waveland, Mississippi, and in New Orleans and Baton Rouge, who graciously gave of their time.
References
1. Alexander, D.: Principles of Emergency Planning and Management. Terra Publishing, Harpenden (1991)
2. Haddow, G.D., Bullock, J.A.: Introduction to Emergency Management. Butterworth-Heinemann, Amsterdam (2004)
3. Jason, P.: (August 14, 2002), http://www.govexec.com
4. Berne, R.: CCPR: Organizational & Community Preparedness Project Executive Summary (2005)
5. SAFAM Summary of Events for a Medical Mission to Mozambique (2007)
6. Banipal, K.: Strategic Approach to Disaster Management: Lessons Learned from Hurricane Katrina. Disaster Prevention and Management, pp. 299–421 (2006)
7. Hurricane Katrina Timelines. The Brookings Institute (2004)
8. ARRL President Congressional Testimony on Hams' Katrina Response, Submitted to the House Government Reform Committee (September 15, 2005)
After Hurricane Katrina: Post Disaster Experience Research
9. Moyers, B., Klinenberg, E.: Fighting for Air, Transcripts (2006)
10. Dourish, P.: Seeking a Foundation for Context-Aware Computing. Human-Computer Interaction 16(2–4), 229–241 (2001)
11. Dourish, P.: Speech-Gesture Driven Multimodal Interfaces for Crisis Management. Proceedings of the IEEE 91, 1327–1354 (2003)
12. Anonymous: Craig's List posting, retrieved September 10, 2005
13. Simon, H.A.: A Behavioral Model of Rational Choice. Quarterly Journal of Economics 69, 99–118 (1955)
14. Norman, D.A.: Cognitive Engineering. In: Norman, D.A., Draper, S.W. (eds.) User Centered System Design. Lawrence Erlbaum Associates, Hillsdale, NJ (1986)
15. Dourish, P.: What We Talk About When We Talk About Context. Personal and Ubiquitous Computing 8(1), 19–30 (2004)
16. Mumford, E.: Effective Systems Design & Requirements Analysis: The ETHICS Approach. Macmillan, New York (1995)
17. Greenbaum, J., Kyng, M.: Design at Work: Cooperative Design of Computer Systems. Lawrence Erlbaum, Hillsdale, NJ (1992)
18. Gutwin, C., Greenberg, S.: Design for Individuals, Design for Groups: Tradeoffs between Power and Workspace Awareness. In: Proceedings of the ACM 2000 Conference on Computer Supported Cooperative Work (CSCW 2000), Philadelphia, PA (2000)
19. Nardi, B., Miller, J.: Twinkling Lights and Nested Loops: Distributed Problem Solving and Spreadsheet Development. International Journal of Man-Machine Studies 34, 161–184 (1991)
20. Curry, M., Phillips, D., Regan, P.: Emergency Response Systems and the Creeping Legibility of People and Places. The Information Society 20, 357–369 (2004)
21. Boehner, K., Vertesi, J., Sengers, P., Dourish, P.: How HCI Interprets the Probes. In: Proceedings of CHI 2007 (2007)
22. Schilderman, T.: Strengthening the Knowledge and Information System for the Urban Poor. Cambridge University Press, Cambridge (2003)
23. Nygaard, K.: Program Development as a Social Activity. In: Kugler, H.-J. (ed.) Information Processing, pp. 189–198. Elsevier Science Publishers, Amsterdam (1986)
24. Schuler, D., Namioka, A.: Participatory Design: Principles and Practices. Lawrence Erlbaum Associates, Hillsdale, NJ (1993)
A Scenario-Based Design Method with Photo Diaries and Photo Essays

Kentaro Go

Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi, 4-3-11 Takeda, Kofu 400-8511, Japan
[email protected]

Abstract. In this paper, we propose a requirements elicitation method called Scenarios, Photo Essays and Diaries as User Probes (SPED-UP). In SPED-UP, participants create photographic diaries and photographic essays themselves. Each participant creates a photographic diary to capture a day in his or her own life, reflects upon the personal experiences it records, and creates a photographic essay based upon this reflection. This approach enables designers to collect user data conveniently. Designers, who might be participants themselves in a participatory approach, can then analyze these experiences by forming design concepts, envision scenarios by imagining contexts of use, and create artifacts by sketching these scenarios. We also describe an exemplary workshop using the SPED-UP approach.

Keywords: user research, photographic diary, photographic essay, probe, requirements inquiry, scenario.
1 Introduction

User research is the basis of user-centered design (UCD). User research is usually conducted in a company's marketing department, which is often an independent division. Consequently, user data from user research is not fully exploited in the design process because of organizational and communication problems [11] and the problem of representing user data [12]. A more direct approach to user research is participatory design, in which actual users of target products participate in design activities and can thereby contribute directly to requirements elicitation and continuous evaluation. However, user research is typically very expensive: it requires investments of time and money that are sometimes wasted, and conducting appropriate user research requires high levels of skill and experience. User research in UCD is easily demanded but hard to execute effectively, especially for designers. It is much harder if designers work on consumer products than on products developed on request, because the potential users, their tasks, and their activities are not easily identified for consumer products. Consequently, designers of consumer products can slide into a situation in which they create designs based on sheer intuition. Alternatively, they may use user data only as a loose guide for design, so that no rational relationship exists between the user data and the resulting products.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 88–97, 2007. © Springer-Verlag Berlin Heidelberg 2007
To address this issue, we propose a design approach that offers lightweight user research for designers and creates design ideas from the resulting user data. Our approach, Scenarios, Photo Essays and Diaries as User Probes (SPED-UP), is a scenario-based design method using participants' self-produced photographic essays and photographic diaries. In this paper, we give an overview of the SPED-UP approach, specifically examining photographic diaries and photographic essays as representations of user research.
2 User Research to Elicit Requirements

Four goals of the early stage of design for human-computer interaction are the following:

• Elicit potential desires and requirements.
• Envision novel scenarios of use.
• Create designs reflecting the material of user research.
• Bring actual users into design activities.
Several efforts have been made to study user research for design. Researchers and practitioners have transferred field work research methods to the design of human-computer interaction. For example, in the contextual inquiry technique [1], researchers visit users' work settings and ask questions during the actual activities. This technique is useful for recording and understanding actual users' tasks and activities in order to elicit their potential wants and requirements. Gaver, Dunne and Pacenti [3] created cultural probes: packages of devices such as post cards, disposable cameras, notebooks, and so forth. Each device is designed to encourage potential users to keep a diary themselves, as instructions and messages from the designers are printed on it. The packages are distributed to potential users, who in turn keep a diary using the devices and send the package back to the designers. The designers browse the materials, which provide them with clues for design. As in the cultural probe technique, photographs taken by actual users often play a central role in user research. Frost and Smith [2] used photographs taken by patients with diabetes for self-management training. In the marketing research field, Holbrook and Kuwahara [8] proposed a data collection method using collective stereographic essays to probe consumption experiences in depth. Holbrook and Kuwahara's approach inspired us to develop the Participatory Requirements Elicitation using Scenarios and Photo Essays (PRESPE) approach [6, 7]. Based on experiences using the PRESPE approach, we created the SPED-UP approach. With devices such as photographs and writings created by potential users, we intend to address the above four goals in the early stage of the design process.
3 SPED-UP: Scenarios, Photo Essays and Diaries as User Probes

Our approach to user research for design employs three key devices: scenarios, photographic essays, and photographic diaries. The approach is called Scenarios, Photo Essays and Diaries as User Probes (SPED-UP). Fig. 1 depicts an overview of the SPED-UP approach.
[Fig. 1 diagram: a coordinator assigns a theme to the participants, who (1) collect photo diaries, (2) reflect on personal experience, and (3) analyze photo essays into a design concept capturing requirements and needs; the group then (4) envisions scenarios and (5) translates them into artifacts.]

Fig. 1. Overview of the Scenarios, Photo Essays and Diaries as User Probes (SPED-UP) approach
3.1 SPED-UP Overview

As a participatory design approach, SPED-UP sets a group of major stakeholders (including designers and real users) working together to produce and evaluate product designs [11]. The SPED-UP approach encompasses two roles: coordinators and participants. The coordinators assign a project theme and provide ongoing support for the participants' activities. The five main activities are (1) collection, (2) reflection, (3) analysis, (4) envisioning, and (5) translation. Participants collect their own personal photographic diaries. For the assigned theme, participants create photo essays to reflect on their personal experiences using existing artifacts. The participants are divided into several groups; the remaining SPED-UP activities are conducted as group work. By comparing the individual photographic essays, the participants can analyze shared ideas, identify the concepts behind them, and then develop design concepts. The participants can then use these design concepts as inspiration for future uses of the relevant technology when they envision use scenarios and contexts. This activity, called scenario exploration, is a structured brainstorming session with role-playing using scenarios and questions. The participants then translate scenes described in the scenarios into artifacts by making sketches of the scenes [4]. Three devices are used in SPED-UP: photographic diaries, photographic essays, and scenarios.

3.2 Photo Diaries

A photographic diary comprises a series of photographs and their descriptions. Fig. 2 shows an example of a photographic diary. A participant takes a photograph at specified time intervals and describes the activity at the time the photograph is taken.
A Scenario-Based Design Method with Photo Diaries and Photo Essays
91
In Fig. 2, the participant took a photograph and wrote a diary entry at one-hour intervals. Each photograph and description represents a scene from a day in the participant's life. The purpose of collecting users' photographic diaries is to capture actual scenes of their lives. The final outcome of the design process is design ideas or designed products relating to information and communications technologies; therefore, we are interested in finding opportunities for information processing and communication in their daily lives.
8:45 All I have in my wallet is a thousand-yen note. I stop at an ATM to withdraw money on the way to work.
9:45 While I was working on a business meeting, a business partner phoned.
10:45 The business meeting started at 10:30. The meeting material got in under the wire. I will be giving a talk on the material soon.
Fig. 2. An example of a photographic diary
A timer or prompter is useful for reminding users to take the photographs for their photographic diaries. However, using a self-timer that takes photographs automatically is not appropriate for our approach, because it might capture unintended scenes and cause privacy and security concerns. For this reason, we ask users to take the photographs themselves, so that they can choose what to capture as a scene of daily life. Rather than having photographs taken automatically, we ask users to capture scenes that represent their actions, tasks, or activities as well as the environment surrounding them. In fact, we ask them to appear in the photographs themselves to represent clearly what they are doing and in what situation. Current technologies such as small portable digital cameras, mobile telephones with digital cameras, and personal digital assistants (PDAs) with digital cameras make it possible to create photographic diaries without too much trouble. In addition, self-production of photographic diaries by the participants enables designers to collect user data in a short period of time.
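The interval-prompted, self-taken capture scheme described above can be sketched as a small data model. This is an illustrative sketch only — the class and function names are our own invention, not part of the SPED-UP method:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class DiaryEntry:
    taken_at: datetime  # when the photograph was taken
    photo_file: str     # path to the self-taken photograph
    note: str           # short description of the captured scene

def prompt_times(start, end, interval_minutes=30):
    """Return the times at which a participant is reminded to take a photo.

    The reminder only prompts; the participant still frames the shot, so the
    unintended scenes and privacy problems of an automatic self-timer are avoided.
    """
    times = []
    t = start
    while t <= end:
        times.append(t)
        t += timedelta(minutes=interval_minutes)
    return times

day = datetime(2006, 2, 1)
prompts = prompt_times(day.replace(hour=8), day.replace(hour=20), 60)
print(len(prompts))  # 13 hourly reminders between 8:00 and 20:00
```

Each prompt would yield one `DiaryEntry`; the collected entries form the photographic diary shown in Fig. 2.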
3.3 Photo Essays

A photographic essay contains representative photographs on an assigned theme and an essay explaining why the participant thinks the photographs fit the theme. Photos might be stereograms to increase the viewer's sense of reality [8]. Fig. 3 shows an example of a photographic essay.
I live alone. The first thing I do is turn on the TV when I get back home. I guess I might be feeling lonely. I try to find an entertaining program. I watch many kinds of programs, such as variety shows, dramas, and comedies. Because I live alone, I have a habit of channel surfing. Because I do not subscribe to a newspaper, I do not know what TV programs are currently on the air. So after turning on the TV, I start channel surfing and stop when I find an entertaining program. During commercial breaks, I start channel surfing again because I do not want to miss any entertaining programs that might be airing simultaneously on a different channel. Another reason for this habit is that I am not disturbing anyone because I live by myself. I think that this habit might change depending on my environment.

Fig. 3. An example of a photographic essay: Channel surfing [7]. The theme assigned to the participant was "something I usually do with an IT product." In the essay, the author assumed that the television is an IT product.
The purpose of collecting users' photographic essays is to elicit potential hidden needs through the users' deep introspection on the assigned theme. The photographic diaries and photographic essays are the key user data in the SPED-UP approach. We expect that users' needs or requirements emerging from the photographic essays can be matched with the opportunities for information processing or communication found in the photographic diaries. Toward this end, designers analyze the collected photographic diaries and essays. The ideas obtained from the data are summarized and listed as the design concepts shown in Fig. 1. The next step in the SPED-UP approach is to create scenarios.

3.4 Scenarios

Scenarios in the SPED-UP approach have two aspects: a tool to support idea generation and a representation of design ideas based on user data. At the idea
generation stage, designers conduct brainstorming sessions using an affinity diagram, working from the design concepts produced by the photographic diary and photographic essay analysis. In this activity, scenarios might take a textual narrative form. During the SPED-UP brainstorming session, participants create short scenarios that include usage situations. The participants ask 5W1H (What, Why, Who, When, Where, and How) and what-if questions to identify concrete details of various use situations. The answers to the questions are represented as scenarios with detailed information. As a representation of design ideas, designers create scenarios that represent scenes of a task or activity. Scenarios at this stage are much longer descriptions than those in the brainstorming session.
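The 5W1H questioning step can be illustrated with a short sketch that expands a scenario seed into concrete brainstorming prompts. The question templates and function name below are hypothetical illustrations, not a fixed list prescribed by SPED-UP:

```python
# Question templates for the 5W1H probing of a use situation.
FIVE_W_ONE_H = {
    "What":  "What is the user trying to accomplish in this scene?",
    "Why":   "Why does the user want to accomplish it?",
    "Who":   "Who else is present or affected?",
    "When":  "When does the scene take place?",
    "Where": "Where does the scene take place?",
    "How":   "How does the user currently accomplish it?",
}

def brainstorm_questions(scenario, what_ifs):
    """Expand a short scenario into concrete questions for role-playing.

    Combines the six 5W1H templates with free-form what-if variations.
    """
    questions = [f"[{w}] {q} (scenario: {scenario})"
                 for w, q in FIVE_W_ONE_H.items()]
    questions += [f"[What if] {cond} (scenario: {scenario})"
                  for cond in what_ifs]
    return questions

qs = brainstorm_questions(
    "A student shares lecture notes during a study session",
    ["What if the student joins remotely?", "What if no one brings notes?"],
)
print(len(qs))  # 6 + 2 = 8 questions
```

The answers gathered for each prompt would then be folded back into the scenario as detailed information, as described above.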
4 Example

We conducted a two-day workshop on the SPED-UP approach at the Ergo-Design Division, Japan Ergonomics Society. This section describes an overview of the workshop as an example. Other reports on the workshop can be found in [9, 10, 16]. The workshop was intended to create design specifications for a ubiquitous computing system for a university campus. Specifically, we designed the system not only for traditional usability aspects but also for emotional aspects; in this sense, we intended to incorporate the aspect of happiness into the system. The workshop participants came from several companies and universities in Japan. They had various backgrounds and experience in industrial and product design but no experience using the SPED-UP approach. Box 1 and Box 2 show the assignments given to the participants. Following our SPED-UP approach, we asked them to address a theme – "Something I feel happy about" – by taking a representative photograph and writing a brief vignette indicating the significance of the photo. We provided the assignments to the workshop participants beforehand; they created their photographic diaries and photographic essays prior to the workshop. Fig. 4 shows the first two hours of a photographic diary created by a participant. She is a supporting staff member of a university field hockey team, and she describes her day during spring break. The photographic diaries provided by the participants enabled the workshop members to share and understand each individual's daily life.

Photo Diary Project
Description
Do the following.
• Take a photograph at thirty-minute intervals from morning to night (one-hour intervals are acceptable if you find thirty minutes too frequent).
• Write a short diary entry that explains the scene captured in the photograph.
• Construct a summary document (a PowerPoint presentation or a poster) that contains the photographs and diary.

Notes
• Consider what the theme means to you.
• Describe the scene in the photographs; explain why you selected that particular scene.

Box 1. Photo diary assignment given to the participants
Photo Essay Project
Description
For the theme below, do the following.
• Take a pair of photographs (an overview and a close-up) that describes the theme.
• Write a short essay that explains the meaning of the scene captured in the photographs.
• Construct a summary document (a PowerPoint slide) that contains the photographs and essay.

Theme
• Something I feel happy about

Notes
• Consider what the theme means to you.
• Describe the scene in the photographs, and explain why you selected that particular scene.

Box 2. Photo essay assignment given to the participants
8:30–9:00 (1), (2): I wake up in the morning and check e-mail first. I use a microwave to make a drink in the cold winter.
9:00–9:30 (3): At a convenience store, I use the photo-printing service. The cash insertion slot is out of reach of the printing terminal.
9:30–10:00 (4): I time warm-up exercises with a stopwatch behind the backstop on the field hockey field.
10:00–10:30 (5): I hand out drinks to the players at thirty-minute intervals.

Fig. 4. A photographic diary created by a participant. She is a supporting staff member of a university field hockey team (excerpt from her poster, translated by the author).
Fig. 5 shows a photographic essay created by a participant. He explains in the photographic essay why self-made coffee in the morning is important to him. At the workshop, we started with the participants explaining the photographic diaries and photographic essays they had brought. Then we divided the workshop members into three groups. Each group reviewed all the photographic diaries and photographic essays to find common ideas and opportunities behind them, and created design keywords through this activity. All materials were posted on the wall of the workshop room so that the participants could review them at any time.
Fig. 5. A photographic essay created by a participant. He explains why self-made coffee in the morning is important for him to have a happy day.
During the analysis phase of the photographic diaries and photographic essays, the participants created keyword descriptions. Box 3 shows an example of the keyword descriptions created by a participant group. Based on those keywords, the participants conducted scenario-based brainstorming sessions. Finally, they created design ideas about restructuring the concept of a lecture on campus. They proposed the "learning like a pot-luck party" concept: a student-led learning environment where anyone can come and leave at any time and share knowledge and experience.

Keyword descriptions:
Relativity: The degree of happiness is perceived in a relative manner. The same life event can be experienced differently from person to person.
Rhythm: Series of events in daily life create a harmony of happiness.

Box 3. Keyword descriptions by a participant group
5 Conclusions

In this paper, we introduced a user research and design method using a scenario-based approach with photographic diaries and photographic essays. The Scenarios, Photo Essays and Diaries as User Probes (SPED-UP) approach enables designers to collect user data at the beginning of the design process in a lightweight manner. In this paper, we specifically addressed the representation of photographic diaries and photographic essays.
We introduced the SPED-UP approach at a workshop held by the Ergo-Design Division, Japan Ergonomics Society in February 2006. The participants at the workshop quickly acquired the approach; they then started using it in the design departments of several companies and universities in Japan, including Fujitsu Co. Ltd., Canon Inc., Ricoh Company, Ltd., Chiba University, Musashi Institute of Technology, Kurashiki University of Science and The Arts, and the University of Yamanashi. The Ergo-Design Division is now considering using it as a basic design approach for ubiquitous services, applications, and products. Ueda and Watanabe [15] reported that the SPED-UP approach enables design students to center their creative efforts specifically on their design target, which suggests the potential value of SPED-UP for use in design education.

Acknowledgments. The author thanks the Ergo-Design Division, Japan Ergonomics Society. The photographic diary and photographic essay in Section 4 were provided by Saori Oku, Wakayama University, and Hiromasa Yoshikawa, Design Center, Fujitsu Co. Ltd.
References

1. Beyer, H., Holtzblatt, K.: Contextual Design: Defining Customer-Centered Systems. Morgan Kaufmann, San Francisco (1998)
2. Frost, J., Smith, B.K.: Visualizing Health: Imagery in Diabetes Education. In: Proceedings of DUX 2003 (Designing for User Experiences), Case Study, ACM/AIGA (2003)
3. Gaver, B., Dunne, T., Pacenti, E.: Cultural Probes. Interactions 6(1), 21–29 (1999)
4. Go, K., Carroll, J.M.: Scenario-Based Task Analysis. In: Diaper, D., Stanton, N. (eds.) The Handbook of Task Analysis for Human-Computer Interaction, pp. 117–134 (2003)
5. Go, K., Carroll, J.M.: The Blind Men and the Elephant: Views of Scenario-Based System Design. Interactions 11(6), 44–53 (2004)
6. Go, K., Takamoto, Y., Carroll, J.M., Imamiya, A., Masuda, H.: PRESPE: Participatory Requirements Elicitation using Scenarios and Photo Essays. In: CHI 2003 Extended Abstracts on Human Factors in Computing Systems, pp. 780–781 (2003)
7. Go, K., Takamoto, Y., Carroll, J.M., Imamiya, A., Masuda, H.: Envisioning Systems Using a Photo-Essay Technique and a Scenario-Based Inquiry. In: Proceedings of HCI International 2003, pp. 375–379 (2003)
8. Holbrook, M.B., Kuwahara, T.: Collective Stereographic Photo Essays: An Integrated Approach to Probing Consumption Experiences in Depth. International Journal of Research in Marketing 15, 201–221 (1998)
9. Inoue, A.: A Proposal for New Campus Life for the Ubiquitous Generation: An Approach Using the Photo Scenario Method. The Japanese Journal of Ergonomics 42 Supplement, 58–59 (in Japanese) (2006)
10. Ito, J.: How to Make Campus Life Unforgettable with Ubiquitous Service. The Japanese Journal of Ergonomics 42 Supplement, 54–55 (in Japanese) (2006)
11. Muller, M.J., Haslwanter, J.H., Dayton, T.: Participatory Practices in the Software Lifecycle. In: Helander, M., Landauer, T.K., Prabhu, P.V. (eds.) Handbook of Human-Computer Interaction, 2nd edn., pp. 255–297. Elsevier, Amsterdam (1997)
12. Poltrock, S.E., Grudin, J.: Organizational Obstacles to Interface Design and Development: Two Participant-Observer Studies. ACM Transactions on Computer-Human Interaction 1(1), 52–80 (1994)
13. Pruitt, J., Adlin, T.: The Persona Lifecycle: Keeping People in Mind throughout Product Design. Morgan Kaufmann, San Francisco (2006)
14. Rosson, M.B., Carroll, J.M.: Usability Engineering: Scenario-Based Development of Human-Computer Interaction. Morgan Kaufmann, San Francisco (2001)
15. Ueda, Y., Watanabe, M.: A Study of Vision-Development Methods for the Ubiquitous Generation. In: Proceedings of the 36th Annual Meeting of the Kanto Branch, Japan Ergonomics Society, pp. 29–30 (in Japanese) (2006)
16. Yoshikawa, H.: Campus Life Support by Ubiquitous Technology. The Japanese Journal of Ergonomics 42 Supplement, 56–57 (in Japanese) (2006)
Alignment of Product Portfolio Definition and User Centered Design Activities

Ron Hofer1, Dirk Zimmermann2, and Melanie Jekal3

1 Siemens IT Solutions and Services C-LAB, Fürstenallee 11, 33102 Paderborn, Germany
[email protected]
2 T-Mobile Deutschland GmbH, Landgrabenweg 151, 53227 Bonn, Germany
[email protected]
3 Universität Paderborn C-LAB, Fürstenallee 11, 33102 Paderborn, Germany
[email protected]

Abstract. To reach a product's business objectives, the requirements of all relevant stakeholders have to be analyzed and considered in the product definition. This paper focuses on the processes applied to analyze and consider the needs and expectations of two of these stakeholder groups, namely the customers and the users of a product. The processes used to produce customer centered and user centered product definitions are compared, rendering visible opportunities to increase their efficiency and effectiveness by means of collaboration.

Keywords: Business Requirements, Customer Requirements, Marketing, Product Definition, Product Portfolio Management, Usability Engineering, User Centered Design, User Requirements.
1 Introduction

Activities for customer centered and user centered product definitions generally differ in scope. Therefore, usability researchers and practitioners conclude that research on customers cannot substitute for research on users ([3], [5]). The authors believe, however, that understanding the basic differences between customer centered and user centered activities provides opportunities to identify areas in which they can supplement each other within the product lifecycle (PL), resulting in reduced schedules and effort, optimized usage of the available resources, and increased product quality. This paper focuses on the initial step of the PL, in which the elements of the product portfolio are defined. The authors call this initial, cradle step the Product Portfolio Definition (PPD). Within this step, a variety of influencing factors has to be considered and combined into consistent product scopes to be realized and commercialized in subsequent steps. For companies that need to develop products with an assured degree of quality in use, one of these influencing factors is the context of use [13].

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 98–107, 2007. © Springer-Verlag Berlin Heidelberg 2007
2 The Playground

The roles that one or more persons might perform in a buying decision can be classified into six buying roles: the initiator, the influencer, the decider, the buyer, the user, and the gatekeeper [1]. This framework helps in understanding the different viewpoints, expectations, and needs of the customers and the users regarding the same products. Business plans consider all six roles to define products that intentionally influence all factors leading to a purchase decision. One of these buying roles is the user. User Centered Design (UCD) offers established processes, methods, and tools to understand and consider this part of the six buying roles, which leads to the authors' belief that an early start of UCD activities supports business decisions already in the initial phase of the PL. Another buying role is the decider (the one who decides on the purchase of a product). In the context of this paper, the motivation to make a purchase decision differs between organizational customers that purchase IT systems to be used by members of the organization (e.g., a call center or an intranet solution) and private customers who are actual end-users (e.g., the purchaser of tax software or a mobile phone). These differences will be addressed at relevant points within the paper. The PPD is conducted at the very beginning of a product's lifecycle. Product portfolios (PPs) consist of a unified basic product platform and product modules, which are tailored to fit the needs of specific market segments. Objectives and requirements of PPs are defined in "product vision" documents [22]. The modules of a PP can be developed and launched as independent projects at different times. There is a wide range of drivers influencing the definition of product visions for PPs.
Company-external drivers, such as society and politics, science and technology, and the target market, as well as internal drivers like the business strategy, the product strategy, and existing and planned own and competitive products, are to be considered. This paper focuses solely on one aspect of these drivers, the so-called "voice of the customer" ([16], [22]), which has to be heard and considered in the definition of product visions and project scopes in order to tailor the modules of a product line to customer segments and to align each module with specific customer needs and expectations. Literature on the process of product definition (PD) emphasizes that the analysis of the context of use provides valuable insights into customers' needs and expectations and should be considered in the definition of product visions and project scopes ([16], [22]). On the other hand, usability experts (e.g., in the QIU Reference Model [8]) and related ISO standards (DIN EN ISO 13407 [7], ISO/TR 18529 [13] and ISO/PAS 18152 [12]) point out that the interests and needs of the user groups that will work with the products should be considered throughout the entire product lifecycle, "from the cradle to the grave," to thoroughly ensure and enhance the ease of use and usability of interactive products.
3 Comparison of Focus and Methods

The following comparison identifies activities within both processes that need to be aligned to assure and increase both the customer and the user acceptance of products. To ease the comparison, both processes are divided into four steps, namely "Analyze Context", "Specify Requirements", "Produce Concepts", and "Evaluate Concepts". This sequence is in line with the iterative human centered design steps [7] and customer centered approaches to defining products [16]. For each step, product definition and UCD activities are juxtaposed to identify opportunities to increase the efficiency within both processes by joint activities and to explore the usage-related effects of decisions within the product definition phase. The steps are mapped onto a schematic diagram visualizing the commonly acknowledged sequence from the PPD phase to the UCD phase.

Fig. 1. The four activities within the PPD and the subsequent UCD phase

3.1 Analyze Context

Analyzing the Business and Customer Context

Within the business context, product visions and project scopes are defined based on a thorough analysis. This paper focuses on a significant part of the overall analysis activities, namely the identification of "the voice of the customer" [22]. Within this part, significant differences between groups of customers are identified in order to segment markets, and detailed insights about each customer group's specific current and future needs and expectations are gathered. In the case of product offerings for private customers, information about customers' geographics, demographics (addressing social levels and the family lifecycle), psychographics (addressing the patterns by which people live and spend time and money), and behavioristics (addressing the customers' extent of use and loyalty, unrealized consumer needs, and the usage situation) ([9], [10]) supports "the process of dividing a market into groups of similar consumers and selecting the most appropriate group(s) […] for the firm to serve" [19] and provides valuable information about private customers' motivation to make purchase decisions.
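The segmentation just described — dividing a market into groups of similar consumers — can be sketched as a simple grouping over customer attributes. The attribute names and sample data below are invented for illustration; real segmentation draws on the much richer geographic, demographic, psychographic, and behavioristic variables cited above:

```python
from collections import defaultdict

# Invented sample data: each customer record carries two illustrative
# segmentation attributes (a demographic and a behavioristic one).
customers = [
    {"name": "A", "life_stage": "student", "usage": "heavy"},
    {"name": "B", "life_stage": "family",  "usage": "light"},
    {"name": "C", "life_stage": "student", "usage": "heavy"},
    {"name": "D", "life_stage": "retired", "usage": "light"},
]

def segment(customers, keys):
    """Group customers whose values for the given attributes coincide."""
    segments = defaultdict(list)
    for c in customers:
        segments[tuple(c[k] for k in keys)].append(c["name"])
    return dict(segments)

print(segment(customers, ["life_stage", "usage"]))
# e.g. {('student', 'heavy'): ['A', 'C'], ('family', 'light'): ['B'],
#       ('retired', 'light'): ['D']}
```

Each resulting group would then be examined for its specific needs and expectations before deciding which segments the firm should serve.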
Common sources to analyze customers’ needs and expectations are problem reports and enhancement requests for a current system, marketing surveys, system
Alignment of Product Portfolio Definition and User Centered Design Activities
101
requirements specifications and descriptions of current or competitive products. These sources are supplemented with interviews and discussions with potential users, user questionnaires, the observation of users at work and the analysis of user tasks [22] to "perform foresight research into potential user groups in order to identify forthcoming needs for systems and new users or user organizations" and to "identify expected context of use of future systems" [13]. These methods overlap significantly with analysis methods used in the UCD process.

Analyzing the User Context
UCD processes begin with a thorough analysis of the context of use. The context of use includes "the characteristics of the intended users", "the tasks the users are to perform" and "the environment in which the users are to use the system" [7]. Additionally, a "competitive analysis" [17] of competitive systems can add valuable information. The characteristics of the intended users include information about their "knowledge, skill, experience, education, training, physical attributes, habits, preferences and capabilities" [7]. This information is summarized in user profiles [14], often represented as Personas ([20], [5]). User profiles help to keep each user group's specific constraints, abilities and mental models in mind throughout product development. The relevant user goals are captured and analyzed to identify the as-is sequences of tasks that users follow to reach these goals. The usage environment analysis adds information about "the hardware, software and materials to be used [and] the organizational and physical environment" [7]. Information about the environment helps to consider restrictions and to identify potential opportunities to enhance the product-to-be. Common methods to analyze the context of use are structured on-site visits and structured interviews, or interviews using the master/apprentice model [3], with users and customers ([6], [11]).
3.2 Specify Requirements

Business and Customer Requirements
Business requirements set the overall "product vision" and determine the product portfolio modules to be developed. Furthermore, business requirements contain the identified business opportunity, business objectives and criteria, customer and market needs, business risks, scopes and limitations, and the business context containing information about the stakeholder profiles. Customers are a subset of the overall stakeholders considered in the definition of business requirements. Business requirements are the basis for eliciting customer requirements for each project. This is done in tight collaboration with customers. Customer requirements can be grouped into nine classes, namely "Business Requirements, Business Rules, Use Cases or Scenarios, Functional Requirements, Quality Attributes, External Interface Requirements, Constraints, Data Definitions and Solution Ideas" [22]. Business as well as customer requirements address issues related to the context of use. High-level business requirements "set business tasks (Use Cases) that the product enables" and "influence the implementation priorities for use cases" [22], and project-related customer requirements include those Use Cases.
User Requirements
User or workflow requirements specify how the system should support users in completing their tasks and thus have an impact on the early definition of products and market segments [7]. They are captured in Use Cases that "describe the system behavior under various conditions as the system responds to a request from one of the stakeholders, called the primary actor" [4]. The core element of a Use Case is the main scenario, which lists the flow of interaction to reach a specific goal. This interaction flow is reengineered into an improved to-be status to "realize the power and efficiency that automation makes possible" and to "more effectively support business goals" [14] and customer requirements. Use Cases are an ideal container to gather all functional requirements necessary to enable a specific user group (primary actor) to reach a specific goal. As products usually enable several distinctively different user groups to reach several goals, Use Cases can be organized into a matrix showing user groups and their respective user goals. This matrix supports decisions concerning product portfolio elements and project scopes.

3.3 Produce Concepts

Business and Project Concepts
On the business level, a consistent concept is developed under consideration of the business requirements. This process is of a complex nature, as there is more than one alternative solution for each component of the concept ([2], [18]). On the product level, customer requirements are consolidated into product definition concepts describing the "Place" variable (referring to a geographic location, an industry and/or a group of people, a segment, to whom a company wants to sell its products or services) and the "Product" variable (addressing a product's functionality, product differentiation, product shape and Product Portfolio management) of the "4Ps" of the so-called "marketing mix".
From a marketing perspective, the "Pricing" and "Promotion" variables supplement the concepts [15]. Methods to systematically derive an optimum configuration of business and product concepts address the visualization of complex requirement interrelations, the production and usage of prototypes, and the prioritization of requirements. To deal with the given uncertainties, usually several concepts are derived and evaluated to reduce the risk of misconceptions [22].

User Interface Concept
The conceptual phase within the UCD process deals with two major objectives. The first objective is to organize the identified and reengineered tasks into models describing the overall hierarchy and interrelations of tasks, considering the user and business perspective. The second objective is to translate these models into a consistent specification of the UI through several iterations. The first iteration focuses on the creation of the "Conceptual Model Design", which defines "a coherent rule-based framework that will provide a unifying foundation for all the detailed interface design decisions to come" [14]. This framework, visualized in mock-ups, represents the reengineered task models in a more tangible way and can thus support customer-focused evaluation activities.
3.4 Evaluate Concepts

Evaluation of Business and Project Concepts
From a business perspective, evaluation activities address business concepts and product concepts defining the segmentation of markets and the corresponding products. These concepts are reviewed with customers (usually specific registered customers of the company) and relevant stakeholders and domain experts [22]. Customer requirements are evaluated with customers to get feedback on how to adjust concepts and which concept to choose. Feedback on product concepts is gained by surveys, focus groups, reviews and structured interviews with potential and known customers. When several concepts are to be compared, benchmarking methods such as the Kano method or Conjoint Analysis [16] are used to identify promising project concepts and marketing mixes. These methods are based on the assumption that customers are able to explain and predict their thinking and behavior [20]. They can be supplemented by methods to gain insights about the 95% of thinking that takes place in customers' unconscious minds and strongly affects their purchasing behavior [23]. Additionally, launching products with a limited area of circulation or limited functionality (single modules, beta versions) provides early feedback from the marketplace.

User Centered Evaluation
One of the basic principles of UCD is to develop human-system interfaces in iterations to decrease the chance of costly changes and revisions at late stages of product development [22]. With this approach, the risk of unforeseen obstacles which might result from reengineered task sequences, task models and UI concepts can be reduced, and initially undetected issues concerning the users' needs and expectations can be considered at an early stage of UI development. There are two types of UCD evaluations. Summative evaluations (e.g.
usability tests, benchmarks and reviews) aim at the final assessment of products, whereas formative evaluations are conducted continuously to support decisions concerning UCD concepts within the process. As this paper discusses mutual benefits of joint customer-centered and user-centered activities in the early "cradle" step of product development, the formative UCD evaluation is of foremost interest. The methods used for formative evaluations at this point of product development are collaborative reviews, expert reviews, validations with users and customers, and focus groups. Formative evaluations confirm intermediate results within the process and identify potential areas for optimization or correction.
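The Kano method mentioned in this section classifies product attributes from paired answers to a "functional" question (how would you feel if the feature were present?) and a "dysfunctional" question (if it were absent?). The sketch below follows the commonly published form of the Kano evaluation table, not a definition taken from this paper; the answer labels are illustrative.

```python
# Standard Kano evaluation table (commonly published form; a sketch, not an
# artifact of this paper). Answers to both questions use a 5-point scale.
ANSWERS = ["like", "must-be", "neutral", "live-with", "dislike"]

# Rows: answer to the functional question; columns: answer to the
# dysfunctional question. A=attractive, O=one-dimensional, M=must-be,
# I=indifferent, R=reverse, Q=questionable.
TABLE = [
    # like  must-be neutral live-w  dislike
    ["Q",   "A",    "A",    "A",    "O"],   # functional: like
    ["R",   "I",    "I",    "I",    "M"],   # functional: must-be
    ["R",   "I",    "I",    "I",    "M"],   # functional: neutral
    ["R",   "I",    "I",    "I",    "M"],   # functional: live-with
    ["R",   "R",    "R",    "R",    "Q"],   # functional: dislike
]

def kano_class(functional: str, dysfunctional: str) -> str:
    """Look up the Kano category for one respondent's answer pair."""
    return TABLE[ANSWERS.index(functional)][ANSWERS.index(dysfunctional)]

print(kano_class("like", "dislike"))  # one-dimensional -> "O"
```

Aggregating these classifications over many respondents is what lets the method rank concepts, as described above; the table itself only classifies a single answer pair.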
4 Mutual Benefits

As shown, the methods used within product development overlap with methods used in UCD activities. This overlap can be a promising starting point to reduce time and effort (the two basic metrics for efficiency) within product development.
Fig. 2. Promising areas for collaboration within the PPD and subsequent UCD phase
The second advantage of simultaneous Product Portfolio Definition and UCD activities is the opportunity to explore the effects of PPD activities on the context of use within the PPD phase. This feedback is a valuable basis for making adjustments within each of the PPD steps, enhancing the reliability of all subsequent steps and reducing cost-intensive change requests in subsequent PL phases. It enables the product definition team to adjust analysis plans, requirement specifications, concepts and evaluation focus accordingly. In the following, we summarize all potential areas of collaboration. The areas are mapped onto the schematic diagram (Figure 2) visualizing the PPD and subsequent UCD phase, introduced in chapter 3.

a) Joint Analysis and Customer Selection
The identification of relevant customer and user segments for analysis activities can be simplified by collaboration of business and user analysts. Business analysts can utilize user groups described in Personas to segment markets ([20], [22]), which leads to a significant reduction of the set of customers to be investigated [16]. On the other hand, "ethnographic interviewers should use market research to help them select interview partners" [5] and derive user groups [20]. Some of the main methods used to analyze the characteristics of target customers are equally used within the UCD process to gain insights about the characteristics of
the intended users, their goals and the environment in which the users are to use the system. A simultaneous analysis approach could therefore reduce time and effort. The relevant interview partners can be jointly interviewed, adding valuable mutual insights. As stated by Cooper, "data gathered via market research and that gathered via qualitative user research complement each other quite well" [5].

b) Exploring User Requirements for Product Definition
First (jointly analyzed) insights about customers and users can be utilized by UCD activities to "perform foresight research into potential user groups in order to identify forthcoming needs for systems and new users or user organizations" [13], which can serve as a basis for user-group and goal-oriented product modularization and the identification of "technology capabilities that would be needed" [21]. The UCD methods to translate user goals into meaningful Use Case requirements can be utilized in PD to "identify expected context of use of future systems" [13]. Use Cases fill the "Use case or scenario" class of the customer requirements derived within PD [22]. Furthermore, early insights about the expected context of use can indicate missing analysis data about customers within the customer analysis step.

c) Joint Requirement Specification
Business requirements "determine both the set of business tasks (use cases) that the application enables" and "influence the implementation priorities for use cases and their associated functional requirements" [22]. Within the requirement elicitation phase of PD, analysts elaborate customer and user statements into general customer requirements. Some of these requirements address statements concerning user goals or business tasks that users need to perform.
UCD methods can be utilized to condense these requirements into Use Cases, which cluster all product requirements necessary to fulfill a certain user goal in one single requirement and can thus reduce the complexity of the requirements to be considered [17]. In the requirement phase of the UCD process, task sequences are reengineered to optimally achieve the identified business goals. These UCD reengineering activities allow improved workflows and changes in users and tasks to be considered within the PD phase.

d) Explorative Concepts
Usage-related product requirements can be translated into first conceptual models and mock-ups. Especially in the context of private customers, these mock-ups can be used in the requirement elicitation phase to get early customer feedback and adjust the requirements accordingly.

e) Joint Conception
In the concept phase of PD, several marketing mix concepts are derived to identify the best mixture of all variables of the product offering. Joint conceptualization activities make it possible to see the effect of trade-off decisions in the marketing mix immediately and to adjust the marketing mix concepts accordingly. Furthermore, a simultaneous creation of first conceptual UI models increases the real-world character of the marketing mixes to be evaluated with customers and users.
f) Explorative Evaluation of Usage-Related Components of the Marketing Mix
Explorative evaluation efforts to "assess the significance and relevance of the system to each stakeholder group which will be end users of the system and/or will be affected by input to or output from the system" [13] provide early feedback on the context of use. Marketing mix concepts can be evaluated up front by UCD activities based on the first set of user requirements, allowing usage-related concept adjustments within the PD phase.

g) Joint Evaluation
UCD processes offer appropriate methods to evaluate the (high-level) usability of product concepts. Furthermore, UI mock-ups derived within the UCD process help to communicate the product part of marketing mixes to customers and users within review and evaluation sessions.

h) Positive Influence on Schedule, Budget, Resources and Quality
The alignment of PD and UCD activities reduces time and effort, enables teams to utilize each other's expertise, and increases product quality and thereby the predictability of product acceptance by customers and users.
5 Summary

This paper identified opportunities to improve the alignment of PPD and UCD activities. It offers a basis for the discussion of how these joint activities can be embedded into established product development processes. Considering the specific requirements of users within the Product Portfolio Definition increases the user acceptance of future products and helps to smoothly implement the UCD process into the overall Product Development:

• The users' acceptance of future products is considered from the beginning and leads to strategic product portfolios aiming at high-level user goals.
• As UCD activities can start earlier in the product development process, the time necessary to analyze the context of use in subsequent process steps is reduced.
• The simultaneous customer and user focus enhances the shared understanding and awareness of business and user goals across development teams early in the project development process.
• Feedback about the user acceptance of portfolio definitions is provided early in the process, which enables the adjustment of product portfolios within the first process steps and thus reduces the extra costs of change requests in subsequent steps.
References

1. American Marketing Association: Dictionary of Marketing Terms. Retrieved February 16, 2007, from http://www.marketingpower.com/mg-dictionary-view435.php
2. Becker, J.: Marketing-Konzeption. Grundlagen des ziel-strategischen und operativen Marketing-Managements, 8th edn. Vahlen, München (2006)
3. Beyer, H., Holtzblatt, K.: Contextual Design. Defining Customer-Centered Systems. Morgan Kaufmann Publishers, San Francisco, CA (1998)
4. Cockburn, A.: Writing Effective Use Cases. Addison-Wesley, Boston, MA (2001)
5. Cooper, A.: About Face 2.0. Wiley Publishing Inc., Indianapolis, IN (2003)
6. Courage, C., Baxter, K.: Understanding Your Users. A Practical Guide to User Requirements [...]. Morgan Kaufmann Publishers (Elsevier), San Francisco, CA (2005)
7. DIN EN ISO 13407: Human-centered design processes for interactive systems. CEN - European Committee for Standardization, Brussels (1999)
8. Earthy, J., Sherwood-Jones, B.: Quality in Use Processes and Their Integration - Part 1: Reference Model. Lloyd's Register of Shipping, London (2000)
9. Engel, J.F., Blackwell, R.D., Miniard, P.W.: Consumer Behavior. The Dryden Press, Chicago (1990)
10. Evans, M., Jamal, A., Foxall, G.: Consumer Behaviour. John Wiley & Sons Ltd., West Sussex, England (2006)
11. Hackos, J.T., Redish, J.C.: User and Task Analysis for Interface Design. John Wiley & Sons, Inc., USA (1998)
12. ISO/PAS 18152: Ergonomics of human-system interaction - Specification for the process assessment of human-system issues. ISO, Geneva (2003)
13. ISO/TR 18529: Ergonomics - Ergonomics of human-system interaction - Human-centred lifecycle process descriptions. ISO, Geneva (2000)
14. Mayhew, D.J.: The Usability Engineering Lifecycle. Morgan Kaufmann, San Francisco (1999)
15. McCarthy, J.: Basic Marketing - A Managerial Approach. Irwin, Homewood, IL (1960)
16. Mello, S.: Customer-Centric Product Definition. Amacom, New York (2002)
17. Nielsen, J.: Usability Engineering. Academic Press, Boston (1993)
18. Nieschlag, R., Dichtl, E., Hörschgen, H.: Marketing, 18th edn. Duncker & Humblot, Berlin (1997)
19. Peter, J.P., Olson, J.C.: Consumer Behavior and Marketing Strategy. McGraw-Hill Higher Education, Boston (2002)
20. Pruitt, J., Adlin, T.: The Persona Lifecycle. Keeping People in Mind Throughout Product Design. Morgan Kaufmann Publishers (Elsevier), San Francisco, CA (2006)
21. Sengupta, U., Sherry, J.: Future Vision 2015: Building a User-Focused Vision for Future Technology. Technology@intel Magazine (September 2004)
22. Wiegers, K.E.: Software Requirements. Practical Techniques for Gathering and Managing Requirements [...], 2nd edn. Microsoft Press, Redmond, WA (2003)
23. Zaltman, G.: How Customers Think. Essential Insights into the Mind of the Market. Harvard Business School Press, Boston, MA (2003)
A New User-Centered Design Process for Creating New Value and Future

Yasuhisa Itoh (1,2), Yoko Hirose (3), Hideaki Takahashi (3), and Masaaki Kurosu (3)

(1) U'eyes Design Inc., Housquare Yokohama 4th Floor, 1-4-1 Nakagawa, Tsuzuki-ku, Yokohama, Kanagawa-ken 224-0001, Japan
(2) Department of Cyber Society and Culture, The Graduate University for Advanced Studies, 2-12 Wakaba, Mihama-ku, Chiba-shi 261-0014, Japan
(3) National Institute of Multimedia Education, 2-12 Wakaba, Mihama-ku, Chiba-shi 261-0014, Japan
Abstract. This paper presents a new process model of user-centered design that can be applied to the development of new value and future. Realizing that the widely known conventional human-centered design process, defined by ISO 13407, is not always effective, we propose a new process model and introduce an overview of activities based on this process. The aim is not only to develop new value and future, but also to generate new ideas in concept planning.

Keywords: User-centered design; ISO13407; Developing new value and future; Concept planning.
1 Introduction

We have recently witnessed rapid growth in the market for a range of information appliances, with cellular phones and digital cameras at the forefront of this development. This has been accompanied by increasing attempts to improve the usability of such products as the industries involved become increasingly competitive. As a result, the development and use of usability reviews has gathered pace in the development of home electrical appliances and IT devices and in the creation of various websites. Even industries that have previously seen no need for such usability studies have recently started paying attention to these issues and their application. These types of usability review, as part of a user-centered design process, are accompanied by the implementation of user-context studies, requirement definition, and prototyping. These types of activity are all defined as user-centered design processes under ISO 13407 (JIS Z 8530) [1] (Figure 1). Processes defined under this category display the characteristics of a user-centered design process, and are notable for having areas that differ from the actual design process. In some ways, this process model has been followed as a canonical process. Recently, however, we have realized that a number of areas of development in projects we have been involved in have not fitted so well to

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 108-116, 2007. © Springer-Verlag Berlin Heidelberg 2007
the ISO 13407 processes. Here we will introduce some specific characteristics of the projects we are focusing on, which are outlined below:

• The system's realization date (the launch date) is in the near future
• The system will make use of technology that is not currently available
• New value is desired, but no specific ideas currently exist

Development that meets these kinds of requirements does not include products that have just become available for sale, but is aimed at products or systems that will be released over a period ranging from 2 to 3 years following development, up to 5 or 10 years in the future. These products will also include items that contain entirely new functions or added value, are equipped with a completely new user interface, or fall under the category of a completely new product or service. In order to realize these new functions and added value, new technology is often required, as well as a suitable amount of time for the development of this technology. This often means that rather than using the most recent technology, what is actually required is technology that, while not currently available, will be developed in the near future. In the initial stages of such development the product or service itself is often still in the middle of the planning process, and this often makes it necessary to create new ideas regarding new value and to investigate the feasibility of actually realizing them as part of the product planning process. In this study we introduce a conceptual model for a user-centered design process that involves a system that is both close to realization in the near future and capable of creating new value. This period of the near future is defined as being from 2 to 10 years from the current date.
2 Scope of a New User-Centered Design Process

2.1 Scope
Table 1 shows the scope of the proposed process model. The points that divide the scope are whether there is actual new value in the system being examined (either a product or a service) and the proposed realization date of the relevant system. Although the values on each axis (realization date and new value) cannot be divided into strictly quantitative categories, the table presents a general concept of how these cases can be distinguished and separated. For the area in Table 1 that shows no new value and a recent realization date, the ISO 13407 process model is thought to be a suitable model for bringing these products to development. Items that have already undergone the product planning stage under ISO 13407, however, can be treated using the "Specific requirements for user-centered planning" as defined in the upper-left panel in Figure 1. After deciding whether the product or service is in need of user-centered design, we think that they can then undergo the same actual process. In contrast, in the development process that is the subject of this study, the user-centered design process starts from a stage taken as being during the initial product planning process (Figure 2). This results in the planning process being incorporated as the first of a series of processes.
Table 1. Scope of a new user-centered design process (divided by new value and by realization time for the relevant system)

• New value, recently realized: development of new value that has been recently realized (suitable for application using the proposed process)
• New value, near future: development of new value that will be realized in the near future (suitable for application using the proposed process)
• No new value, recently realized: relatively little development of new value that has been recently realized (suitable for application using ISO 13407)
• No new value, near future: relatively little development of new value that will be realized in the near future (suitable for application using the proposed process)
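The scoping logic of Table 1 can be expressed as a small decision function. This is a sketch; the threshold separating "recent" from "near future" follows the paper's 2-to-10-year definition, and the function name is our own.

```python
def applicable_process(creates_new_value: bool, years_to_realization: float) -> str:
    """Pick a process model per Table 1.

    'Near future' is taken as roughly 2 or more years ahead, following the
    paper's 2-to-10-year definition of the near future.
    """
    near_future = years_to_realization >= 2
    if not creates_new_value and not near_future:
        # Conventional case: no new value, recently realizable system.
        return "ISO 13407"
    # All three remaining quadrants call for the proposed process.
    return "proposed process"

print(applicable_process(False, 1))  # conventional development
print(applicable_process(True, 5))   # new value, near-future realization
```

The single special-cased quadrant mirrors the table: only the "no new value, recently realized" combination stays with the conventional ISO 13407 model.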
Fig. 1. Process of human-centered design activities
2.2 The Process Model
Figure 2 shows a general outline of the conceptual model for the user-centered design process that we propose here. In contrast to the process model outlined in Figure 1, which is included within the conventional ISO 13407, three processes have been added: 1) user-centered planning, 2) study and prediction of future circumstances, and 3) selection and creation of new value.

1) The user-centered planning process differs from cases of development in which the plan has already been decided, in that the process begins from the initial product planning stage. As the product planning that takes place here includes a user-oriented philosophy, we decided to name this process user-centered planning.
Fig. 2. A new user-centered design process for creating new value and future
2) The study and prediction of future circumstances is a necessary process for envisaging an actual realization time for the relevant product or service in the near future. If the development period is within several months or between 1 to 2 years from the current period, it can be assumed that future circumstances and users will be virtually unchanged from the current period, so that development can already take place. In contrast, if the realization period is in the near future (anticipated as being between 2 to 10 years away), it is likely that a wide variety of factors will change in this period, including the currently available technology, and it is also difficult to envisage future users having the same needs and requirements as current users. In this event it is necessary to study future circumstances and global changes as well as to predict the characteristics of potential users in the future, together with the anticipated conditions for the relevant product or service. This process of study and prediction of future circumstances is therefore an integral part of the proposed process.

3) The process of creating and selecting new value is also connected to user-centered planning. If new value is one of the requirements of this planning process, then coming up with new ideas is an essential element of it. If such ideas are subsequently found to be of high value and feasible to implement, they can be used as the basis for the refinement of the product planning process. In order to carry this out, however, it is first necessary to develop a number of creative ideas. This involves generating a number of ordinary ideas and subsequently choosing the best from among them for use as the basis for refinement of the product planning process.
This element of generating and selecting new ideas is an important factor in the user-centered design process.
2.3 The ISO13407 Process Model and Its Application
In addition to the 3 processes outlined in section 2.2, there are a number of processes that share many of the same points as ISO 13407. The content of each of these processes, however, has undergone some change and expansion, and will be touched on in chapter 3.

2.4 The Life Cycle of ISO/IEC15288 and Its Application
Table 2 shows the system life cycle stages of ISO/IEC 15288 [2]. We consider our newly proposed process (Figure 2) as corresponding to the concept and development stages of ISO/IEC 15288. In our proposed process we anticipate each activity involved in the concept stage and development stage to be repeated. There is thus the possibility of switching between the concept stage and the development stage; in the event that the result does not fully satisfy the user, the organization, or the planning point of view, the process returns to the previous stage and is repeated.

Table 2. System life cycle stages and purposes [2]

LIFE CYCLE STAGES  PURPOSE
CONCEPT            Identify stakeholders' needs; explore concepts; propose viable solutions
DEVELOPMENT        Refine system requirements; determine system components; build system; verify and validate system
PRODUCTION         Mass produce system; inspect and test
UTILIZATION        Operate system to satisfy users' needs
SUPPORT            Provide sustained system capability
RETIREMENT         Retire; archive or dispose of the system

DECISIONS (applicable at each stage): execute next stage; continue this stage; go to previous stage; hold project activity; terminate project
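The per-stage decisions listed in Table 2 can be sketched as a simple stage-transition function. This is only an illustration of the decision options as tabulated above, not an implementation prescribed by ISO/IEC 15288.

```python
# Life cycle stages in the order given in Table 2.
STAGES = ["concept", "development", "production",
          "utilization", "support", "retirement"]

def next_stage(current: str, decision: str):
    """Apply one of the Table 2 decisions to the current stage.

    Returns the resulting stage, or None when the project is held or
    terminated (no active stage remains).
    """
    i = STAGES.index(current)
    if decision == "execute next stage":
        return STAGES[min(i + 1, len(STAGES) - 1)]
    if decision == "continue this stage":
        return current
    if decision == "go to previous stage":
        return STAGES[max(i - 1, 0)]
    # "hold project activity" or "terminate project"
    return None

print(next_stage("development", "go to previous stage"))  # -> "concept"
```

The "go to previous stage" branch is the one the proposed process relies on: unsatisfactory results in the development stage send the project back to the concept stage, where the cycle is repeated.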
3 Proposed Process Activity

3.1 User-Centered Planning
Product planning is an essential element of the process of developing the relevant service or product. Although product planning gives rise to technology-driven planning in a number of cases, this process adopts an approach that does not rely solely on
technology but also involves planning that takes into account the perspective of the users who will actually use the system. Based on the subsequently developed plan, the realization date for the service and the requirements from the planning side for creating new value can then be determined. The process model we describe here is expected to be mainly applied to systems whose realization date is in the near future and which require the creation of new value. It is also possible, however, to utilize this process for cases in which the creation of new value is required but the realization date is more recent, or in which the realization date is in the near future but there is no demand for the creation of new value. In such cases some parts of the process will not be required (in the former case there is no need to carry out the study and prediction of future circumstances; in the latter case the process for creating and selecting new value becomes redundant).

3.2 Understand and Specify the Context of Use
The proposed process also involves carrying out a survey and analysis of users. The subjects of such a survey should be the actual anticipated users of the relevant system. Note, however, that potential future users cannot themselves be surveyed when the system's realization date lies in the future. If the proposed realization date is 1 to 2 years after the planning process, a survey can be carried out on the assumption that future users will not be noticeably different from current users. If the system or product's current use and users are unclear, it will be difficult to develop the system in the future, meaning that it is essential to survey current users. The results of this survey can then be used for the requirement definitions of the system as well as serving as important original data for creating new value.
3.3 Study and Prediction of Future Circumstances

In aiming to implement the system in the immediate future, it is first necessary to carry out a study and prediction of future circumstances. While the future cannot be predicted completely, it is possible to survey and predict, as far as possible, the circumstances relevant to the development of the system and its targeted users. If the proposed realization date is only 1 to 2 years after the initial planning process, future users can be expected not to differ noticeably from current users, and there should be no significant changes [7]. For conditions that are anticipated to change, quantitative predictions can be made by extrapolating from previous data [7]. Even so, it is important to remember that considerable change can still occur in new technology, products, or services, and that the rate of usage or adoption of the relevant service or product is also subject to significant change. If the realization date is expected to be in the near future (roughly 2 to 10 years after the initial planning process), significant change between present and future conditions is only to be expected, and the ability to predict the future becomes valuable. Although it is impossible to
completely predict the future, applying the principles of scenario planning offers the potential to portray a number of different scenarios for the future [4], [5]. To carry out such predictions it is first necessary to clarify which items are subject to change in the future, and to identify the principal factors that drive change on a global scale [4], [5]. With these factors at the core, a number of different possible futures can then be considered. Figure 3 shows a conceptual diagram of futures that have a high potential of actually coming about. Since more than one future has a high potential of being realized, a number of different scenarios with different prospects for the future should be drawn up. Each future scenario drawn up in this way includes stages showing the specified requirements, and it will also serve as an important data source in creating new value.
Fig. 3. Model of future scenario planning
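As a minimal illustration of the "quantitative prediction based on extrapolation of previous data" mentioned in Sect. 3.3, consider fitting a least-squares line to past observations and projecting it to the realization date. The sketch and its usage figures below are hypothetical, not taken from the paper:

```python
# Illustrative sketch: quantitative prediction by linear extrapolation
# of previous data, suitable for the 1-2 year case described above.

def extrapolate(years, values, target_year):
    """Fit a least-squares line to (year, value) pairs and project it."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values)) \
            / sum((x - mean_x) ** 2 for x in years)
    intercept = mean_y - slope * mean_x
    return slope * target_year + intercept

# Hypothetical adoption rate (%) of a service over past years:
years = [2003, 2004, 2005, 2006]
adoption = [10.0, 14.0, 18.0, 22.0]
print(extrapolate(years, adoption, 2008))  # projects the linear trend: 30.0
```

Such an extrapolation is only meaningful under the assumption, stated above, that no disruptive change intervenes before the realization date.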
3.4 Specify the User and Organizational Requirements, and Future Circumstances

This process uses the results of user-centered planning, of understanding and specifying the context of use, and of the study and prediction of future circumstances to extract the necessary requirements for the relevant system and to describe them in text form. Beyond the requirements specified in the ISO13407 process, requirements from the initial planning stage and for future circumstances must also be specified. The usability requirements for the system and the required conditions for the relevant functions are the same as conventional requirements. Of particular note is that the results of the study and prediction of future circumstances can define which technologies can be used, and which will be unsuitable, at the intended future realization date. These results also act as constraints on creating new value. In order to be able to choose from a wide variety of different ideas in creating new value
it is necessary to define a rating scale for ideas; this rating scale can be developed based on the requirements for new value.

3.5 Creation and Selection of New Value

This process is a distinctive element of the process model, and it is essential whenever the creation of new value is required from the product planning stage. What is called new value here is not simply a few minor changes to the product or a routine model change, but rather the introduction of completely new functions, a new user interface, high added value that did not previously exist, or a product or system implemented on the basis of new findings. A range of creative ideas is therefore necessary to realize such new value, and this usually involves brainstorming sessions or individual thinking by product planning and design staff. The most appropriate of the ideas developed can then be selected, and a concept generated from the best of them; this is ultimately compiled as part of the product planning process. This process draws on the ideas of user-centered design and the results of user surveys and analysis, as well as the context of use and the predictions for the future world, future markets, and future users, as a basis for creative thinking and idea development. We plan to explore specific methods for creative thinking in a separate study and publication. Once a broad range of ideas has been generated, the ideas should be evaluated quantitatively against the rating scale developed in Section 3.4, and the results used as a basis for selecting the most appropriate ideas.

3.6 Produce Design Solutions

The requirements, including the selected ideas, can then be used to design and develop a range of solutions.
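The quantitative evaluation and selection of ideas against the rating scale (Sections 3.4 and 3.5) could be sketched as a weighted scoring matrix. All criteria, weights, and ratings below are hypothetical, not taken from the paper:

```python
# Illustrative sketch: score candidate ideas against a rating scale
# derived from the requirements, then rank them for selection.

# Rating scale: criterion -> weight (would come from the requirements).
scale = {"novelty": 0.4, "feasibility_by_target_date": 0.35, "user_value": 0.25}

# Each idea rated 1-5 on every criterion by the planning staff (hypothetical).
ideas = {
    "idea A": {"novelty": 5, "feasibility_by_target_date": 2, "user_value": 4},
    "idea B": {"novelty": 3, "feasibility_by_target_date": 5, "user_value": 4},
    "idea C": {"novelty": 4, "feasibility_by_target_date": 3, "user_value": 2},
}

def weighted_score(ratings, scale):
    """Weighted sum of an idea's ratings over all criteria in the scale."""
    return sum(scale[c] * ratings[c] for c in scale)

ranked = sorted(ideas, key=lambda i: weighted_score(ideas[i], scale), reverse=True)
for name in ranked:
    print(name, round(weighted_score(ideas[name], scale), 2))
# In this made-up example, idea B ranks first (3.95), ahead of A and C.
```

The point of such a scheme is that the selection step becomes traceable back to the requirements, rather than resting on the intuition of the planning staff alone.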
At this point we recommend that a number of different prototypes of the relevant product be created. Because this process includes a concept stage, it is carried out on the basis of planned ideas selected for their high feasibility of implementation. The processes of user-centered planning, specifying the relevant requirements, creating and selecting new value, and developing solutions through planning may in some cases be carried out repeatedly and in parallel in a progressive manner, and they will not necessarily be performed in the order indicated by the arrows in Figure 2. The same holds for the process in ISO13407 [1], [3].

3.7 Evaluate Designs Against Requirements

Products or system prototypes created in the previous process can then be evaluated in this process. The evaluation is essentially carried out with anticipated users of the relevant product or system, and the evaluation method is based on usability tests and user tests. These differ from regular tests, however, in that the anticipated users will be users at some point in the future. Although it is impossible to carry out an evaluation test on actual future users, it is possible to test subjects who are anticipated to be relatively close to such future users.
116
Y. Itoh et al.
In order to evaluate systems that will be used in the future, it is preferable to carry out such a test not on regular users but on progressive users of the product or system [6].
4 Conclusion

We have introduced a conceptual model of a user-centered design process for systems that involve the creation of new value and that will be realized and implemented in the future. Although development efforts based on this conceptual model are already under way and the resulting systems are currently in use, we have yet to see any clear results from these efforts. In the future we plan to investigate the effectiveness of this process further, and to continue using this process model as part of the development process.
Acknowledgement

We presented the first draft of this paper at HIS2004.
References

1. ISO13407: Human-centered design processes for interactive systems (1999). JIS Z 8530: Human-centered design process for interactive systems (2000)
2. ISO/IEC 15288: Systems engineering – System life cycle processes (2002). JIS X 0170: Systems engineering – System life cycle processes (2002)
3. Kurosu, Hirasawa, Horibe, Miki: Understanding Human-Centered Design Processes for Interactive Systems. Ohmsha (2001)
4. Schwartz, P.: The Art of the Long View. John Wiley & Sons (1997). Translated as Shinario puraningu no giho (Scenario Planning Techniques) (trans. Taomoto and Ikeda), Toyokeizai (2000)
5. Teramoto, Yamamoto, Yamamoto: Advanced Evaluation of Technology. Nikkei BP (2003)
6. Holmquist, L.E.: User-Driven Innovation in the Future Applications Lab. In: Proc. CHI 2004, pp. 1091–1092 (2004)
7. Sherden, W.: The Fortune Sellers: The Big Business of Buying and Selling Predictions. Diamond (1999)
The Evasive Interface – The Changing Concept of Interface and the Varying Role of Symbols in Human–Computer Interaction Lars-Erik Janlert Department of Computing Science Umeå University, Sweden
[email protected]

Abstract. This is an analysis of the changes the concept of interface is going through in the shift from the currently dominating virtuality paradigm of use to two new use paradigms, namely ubiquity and mobility; an analysis of the concomitantly shifting role of symbols in relation to the user and to the world; ending with an attempt to identify and analyze important research issues in the new situation that arises, two of which are how to better understand the various ways different kinds of interface symbols can link to their real-world referents, and how to combine tracking reality with supporting the user’s own thinking.
1 Changing Paradigms of Use, Changing Notions of Interface

There is enormous diversity in the ways modern information technology—that is, computer, telecommunication and interface technology¹—has been put to use. Narrowing down to uses that would normally count as involving a “user” and falling within the field of study of human–computer interaction (HCI) still leaves a very great variety. On a high level of abstraction it is possible to discern general, broadly characterized forms of use, which may be helpful in identifying and understanding long-term trends and important challenges ahead. Often, specific technological advancements (e.g. in display or telecommunication technology) play a major role in determining new forms of usage, but there is also considerable inertia in a well-established form of use, which strives to assimilate technological changes while retaining basically the same form. In this paper three of the most important paradigms of use of the last decades will be identified and examined: one older and well established—the virtuality paradigm; and two new ones, which are rapidly gaining ground theoretically as well as in practical applications—the ubiquity and the mobility paradigms. The purpose of this analysis is to draw some conclusions from the changing notion of interface and to identify some central research issues that arise as a consequence of the ongoing paradigm shifts. The choice of the term “paradigm” in this context is inspired by Thomas Kuhn’s famous
¹ Usually just “information technology” (IT) or “information and communication technology” (ICT), as if deliberately ignoring the fact that such technologies (per definition) have been around since the beginning of history.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 117–126, 2007. © Springer-Verlag Berlin Heidelberg 2007
118
L.-E. Janlert
notion of scientific paradigms [9]. A use paradigm comprises important design examples, use scenarios, specific techniques and technologies, specific views on key concepts, such as what a “user” is and what goals to pursue in HCI—and, not least important, groups or communities of people (researchers and interaction designers) developing and defending the paradigm. Unlike the scientific paradigms in Kuhn’s understanding of scientific development, however, new paradigms in HCI seldom completely replace old ones; even if a new paradigm becomes predominant over time, older paradigms can find niches where they survive. In this manner several use paradigms can coexist and come to be seen as complementing each other. Shifting paradigms of use imply shifting notions of interface. The interface concept in HCI has emerged from a variegated background: the precise physical specifications of components necessary for industrial mass production and assembly, the control panels and steering devices of complicated engines, and, of course, software interfaces between different parts of complicated programs. Within HCI, the interface concept has developed into a complex and multifaceted notion, and its development with regard to the three chosen use paradigms will specifically be studied here. The changing role of symbols is of particular interest.

1.1 The Role of Symbols in Human–Computer Interaction

Human–computer interaction, as it is usually understood (which includes the three paradigms examined here), invariably involves the use of symbols, in the general, technical sense of the term [1]. Symbols are used to represent input, control settings, system status, events, ongoing processes, available resources, possible user actions, results, outputs, etc., all for the benefit of the user. The earliest use of computers was as advanced calculating machines.
Since then, there have been important changes not only in the kinds of symbols used, but also in what they are used to refer to, in how the user envisages the relations between self, symbol and referent, and in how these relations are upheld, in the abstract and concretely. Broadly speaking, in this context symbols may serve three general purposes: as a means for the user to access and acquire information; as a means for the user to supply information, including for the special purpose of achieving certain ends; and as an instrument simplifying, supporting or extending the user’s own thinking. The first two are easy enough to understand: symbols are used for output; they are also used for input, data as well as control. The third purpose, supporting cognition, is less obvious but of special interest here.

1.2 Cognitive Artifacts

Donald Norman [12] introduced the concept of a cognitive artifact, defining it as “artificial devices that maintain, display, or operate upon information in order to serve a representational function and that affect human cognitive performance.” Computer applications that normally concern HCI are generally cognitive artifacts. But there are two different senses in which a cognitive artifact can assist thinking, related to the distinction Norman makes between the “personal view” and the “system view.” One sense is that it can substitute for (parts of) the thinking involved in performing a certain task. An example would be the pocket calculator. The user doesn’t have to do all the thinking
involved in the usual routine for multiplying numbers using pen and paper. The pocket calculator requires the user to press the right buttons to input the right numbers (watching out for errors) and the desired operation, and to read off the result, too—but that is very much the same as in the pen-and-paper version. Norman would say that the task for the user has changed: the “personal view” has changed significantly, whereas from the “system view” the result should be the same but delivered faster and possibly with fewer errors. The other manner in which a cognitive artifact can assist in the cognitive work of the user is well illustrated by the manual method of multiplying numbers using cognitive artifacts such as pen, paper, and mathematical notation such as numerals and arithmetical operator symbols. In many cases, these two senses can be two different aspects of the same artifact. For this to happen, it is important that the symbols employed in the interface are chosen with care: they should raise the level of abstraction in such a manner that they really support fruitful higher-level thinking on the user’s part. We should not underestimate the extent to which computer applications can support users in their own thinking, serving as cognitive artifacts in the second sense. Spreadsheet applications and word processors are typical examples of a range of common applications where support for the user’s own thinking is about equally important as support for producing results.

1.3 Thinking Versus Doing and Perceiving

To be able to claim that thinking is taking place, it is of some importance that a distinction between thinking and doing can be maintained, even if it is relative and at a symbolic level itself: e.g.
to entertain the possibility of X should not be tantamount to causing X.² Thinking by “doing” is certainly sometimes a possibility, for example when we think about how best to lay the table by trying out different placements of plates, cutlery, glasses, etc. It is reported that in playing the game of Tetris, contrary to what one would expect, as users become more skilled they increase rather than decrease the number of “epistemic” actions, i.e. actions performed to uncover or speed up access to information, compared to “pragmatic” actions, i.e. actions performed with the purpose of putting the current piece in its chosen place and orientation [8, 10]. In some circumstances it seems difficult to say whether an action is part of the thinking preceding the “real,” effective action or the effective action itself, until after the fact. Computer applications supporting undo encourage such tentative actions, but if the ultimate purpose of the application is some real-world implement or effect, we can still see this as a (productive) play with symbols, at least as long as the symbols are easier to change than their (ultimate) referents, and failures are less devastating at the symbolic level than at the referent level. In considering various hypothetical stages of thinking in evolution, Daniel Dennett arrives first at what he calls the “Popperian creature,” which, as Karl Popper succinctly put it, “permits our hypotheses to die in our stead,” and then at the “Gregorian creature,” named after Richard Gregory, which is also able to take cognitive shortcuts by importing “mind tools” from the environment [1].
² Compare Hegel’s remark in his Lectures on the Philosophy of History that whereas animals cannot interpose anything between an impulse and its satisfaction, human beings have the ability to arrest an impulse and reflect on it before letting it pass into action [5].
Mainstream cognitive science has been attacked from different quarters for attaching too much importance to thinking with the help of symbols. Within HCI there have been several attempts to rectify the predominance of interaction through explicit symbols by investigating alternatives in the direction of rich perceptual experiences and complex physical actions, which presumably make better use of natural human capabilities to interact. An influential case in point is the concept of affordance, brought in from ecological psychology and adapted to HCI by Donald Norman [14]. Affordances, in Gibson’s original version, are not symbols (possibly they might count as indices in Peirce’s taxonomy of signs); they are rather perceptual cues that trigger responses, behaviors [3]. Still, it is one thing to perceive that a button invites pressing, another to know what the effect will be, and when and for what purpose it is appropriate to perform the action. In the pedagogical examples Norman likes to use, such as operating doors and water taps, the function of the artifact is severely limited and well known: just about the only thing you expect to be able to do with a door is to open and close it, so if you perceive a button-looking feature that invites pushing, you can reasonably infer that pushing the button will either open or close the door. Going further in the direction of tangible user interfaces, consider computerized artifacts that lack a dedicated symbolic interface, e.g. a computerized chair that adapts to your body, interprets your spontaneous small movements, learns and remembers your favorite positions, wakes you up when you fall asleep, makes you change your posture when you have been sitting too long in the same position, etc. It may be an academic question whether this is really HCI, but researchers and designers will have to deal with such cases. At this time, however, none of the paradigms studied here seems to include artifacts of this kind.
2 The Virtuality Paradigm

In what may be called the virtuality paradigm, the interface is a means for the user to access a different, symbolic world. This is the use paradigm that has become so common and dominating that we are hardly aware of it. The user ultimately wants to get through the interface, partly (as in the typical graphical user interface, GUI) or completely (virtual reality), into that other world. Transparency is commonly seen as an ideal. In engaging with the virtual world, the user more or less shuts out the real world and the specific situation of use, which rather disturb the interaction and task performance. Maintaining links and relations between the symbols and the real world is the responsibility of the user and the service provider: mapping real-world regularities and states of affairs into symbolic models, and interpreting and mapping symbolic results back for application in the real world. This arrangement puts the user in the position of a middleman: streams of information pass through the user in both directions; the user easily becomes a bottleneck, exhausted and confused by the traffic, afflicted by information and communication overload. Although taking its name from virtual-reality technology (VR)—which may be said to have as its ideal the complete immersion of the user in an alternative, virtual world appearing as real to the user as the real world—the virtuality paradigm not only
antedates VR, but also GUIs.³ The “other” world accessed through early textual interfaces, before the advent of graphical user interfaces, was also a symbolic world, typically consisting of mathematical models and data about the real world. It was a rather abstract “world,” usually lacking spatiality and shape, in some sense comparable to the world evoked by a book. GUIs transformed these abstract and spatially weak symbolic models into what could more properly be called worlds, directly accessible to the user’s perception, in the process also replacing the earlier conversation model of interaction with the acting-in-a-world model. In some sense this parallels the step from book to motion picture.

Interface concept. The interface provides access to a different, symbolic world, whether the means are textual or graphical (or involve other modalities). The interface is something the user wants to reach or get through, to engage in the virtual world behind it. Graphical interfaces open for a more vivid interpretation of “world,” and the interface can be viewed literally as an opening.

Use scenario. The user accesses or enters the virtual, symbolic world via the interface in order to perform some operations in that world, to retrieve information, to update and develop. In many cases this is done to support some real-world activity: tasks arise in the real world; the user enters the virtual world beyond the interface for help and assistance, mentally more or less leaving the real world (since it is difficult to engage in more than one world at a time); and eventually returns to the real world with an answer. In preparation for future uses, the user may also learn facts about the real world and enter the virtual world to record or modify the facts or change the model. Whereas it is hard to find examples of virtual worlds that bear absolutely no relation to the real world, some uses are undeniably more escape from, than support for, the real world.

Symbols.
In most cases the symbolic world thus represents aspects of the real world, even if large parts can be hypothetical, counterfactual, even fantastic. The task of keeping track of which real-world referents the symbols have, and what status they have, falls on the user and the service provider (maintaining the basic model, updating variable data). The situation of use is not linked to the model world. While the user is engaged with the application, the application is basically the only means of accessing the real world, which usually means a rather abstract, alienated view of the world, with little chance to verify that the virtual world gives a correct picture of the real state of affairs, especially since the user is in principle cut off from the real world by the very way the interface concept works.

2.1 Mixed Reality

Leaving the purely virtual approach, where symbols are unaffected by the real world, there are now many applications in which virtual-world elements are causally coupled to real-world counterparts. Some actions in the virtual world have real-world effects,
³ Similar to how computer graphics has had as its longstanding ideal the ability to produce pictures qualitatively indistinguishable from photographs of any actual or imaginable real-world scene.
they are not just symbolic actions; and some real-world changes are reflected in virtual-world updates. This is part of the idea of cyberspace as interpreted by, among others, Benedikt [1]. By this move users are somewhat relieved of their role as mediators: information can bypass the user, and some tasks can be completely automated, taking the user out of the loop entirely. Typically, though, real-world feedback to the user through the interface is weak and abstract, giving the user a feeling of unreality (as, e.g., in computerized warfare). In lifting part of the responsibility for connecting symbols to reality off the user, the overview of consequences and the quality of control may suffer. In more radical forms of mixed reality, like augmented reality, the user may face a single world that is a fusion of real world and virtual world, where it may become difficult to distinguish what is real and what is just a symbol, or perhaps even to insist that the distinction still exists. There are two types of augmentation. The first is to superimpose extra information (normally inaccessible to the user’s senses) about the real world on top of the real-world elements it is about, producing a kind of “annotated reality.” The second is to introduce elements, components, or aspects that are simply non-existent, fictional, in relation to the real, actual world. The first type of augmentation is less problematic as long as the extra symbols are easy to distinguish as such (e.g. textual annotations); the second kind is more problematic: it is what may turn this into a kind of “magic reality,” where you might become uncertain whether you can walk through that wall or not. Of course, it is not easy to freely mix fantasy with hard reality if the basic requirement is that reality is perceived as such and as it is.
This branch of the virtuality paradigm is not so well developed yet—it clearly needs the addition of mobility to become more than very locally realizable—so it will have to await further analysis, but potentially there is a whole new use paradigm hidden here, just waiting for the right technology: efficient, comfortable, and cheap. One interesting technical possibility is hand-held VR [6].
3 The Ubiquity Paradigm

If the old idea was to put a world into the computer, the new idea is to put the computer into the world of real objects and environments. In what may be called the ubiquity paradigm, ubiquitous computing and computer artifacts divide the traditional interface into a multitude of individual thing and environment interfaces. The computer artifact is reality, and the interface is a way to use and control the real thing. This is a notion of interface more in line with traditional industrial design: an envelope of the object, negotiating between inner and outer environment, as elaborated by Herbert Simon [16]. Whereas the virtual approach is arbitrarily free relative to the real world, the ubiquitous approach tends to be earthbound, welding symbol and object together, as in the notion of the object symbol introduced by Donald Norman and Edwin Hutchins [15, 14]. In the more environmentally oriented areas of ubiquitous computing, such as the calm technology introduced by Mark Weiser [17], the unobtrusiveness and even invisibility of the interface is emphasized. The interface can signal real-world states of affairs, but not in the form of proper symbols; rather like indexical signs in nature (e.g. smoke or the smell of burning indicates fire).
Interface concept. The interface is the surface of a real, clearly distinguishable physical object, which it covers and is the means of controlling. The “invisibility” ideal, the interface as something the user should not have to think about, is an ideal of superficiality—everything of importance is on the surface—which is complementary to the transparency ideal of the virtuality paradigm.

Use scenario. Users use facilities on site, wherever they happen to be, and use objects and devices where they are present, for purposes pertinent to the situation of use. Computer artifacts typically have specialized functions (compared to the traditional general-purpose computer), dedicated uses.

Symbols. Symbols are strongly related to the real world, more precisely to the real-world situation of use, to the point where symbol and referent threaten to fuse into one entity. There is no reference to a different world. Accessing the symbols is accessing the real world, here and now.

3.1 The Problem with Object Symbols

Three of the most basic expectations we have of symbols are: 1) that they are lightweight and easy to manipulate compared to their referents; 2) that they can be at a distance from their referents; and 3) that they can symbolize states of affairs other than the real and actual. In dropping one or two of these conditions, the third in particular, we also lose some or all of their ability to serve as tools for thinking. They may still work as tools for observing and acting. The notion of the object symbol was put forward to encourage very tight couplings between symbols and referents in HCI—as well as in artifact interaction in general—but in many older, mechanical artifacts this tight coupling is already present and seen as an ideal by Norman: “when the object in the artifact is both the means of control (for execution of actions) and also the representation of the object state (for evaluation), then we have the case of an object symbol” [12].
It seems that object symbols violate all three of the above conditions. Per definition they violate the third condition, and thus give poor support for the user’s own thinking: if you cannot represent counterfactual states of affairs, if you do not have the ability to fantasize, you are not, properly speaking, thinking at all. Tracking reality is not thinking. Per definition, object symbols also violate the second condition: when objects represent themselves, or a larger artifact of which they are a proper part, they cannot be at a distance from their referent. Of course, in the ubiquity paradigm this is a feature, not a bug. Again, much depends on how cognitively sophisticated the applications and artifacts under consideration are. For example, since we do not use stoves to help us think, perhaps the idea of object symbols might work out fine there. Imagine that the knobs of the stove are object symbols: not only can the user control the heat by turning a knob, the current temperature is simultaneously indicated by the knob’s current angle of rotation. Here we see the effect of violating the first condition: if the stove has ordinary electric heaters, the logic of object symbols will require the user to apply torque to the knob for as long as it takes the stove to reach the desired temperature. Not very convenient. And there is another problem: if the symbol really works both ways, how does the user express desired artifact states except by constantly working the controls? What stops the stove from gradually
124
L.-E. Janlert
getting cooler, slowly turning the knob to indicate lower and lower temperature? In many ways it is easier to make interfaces to virtual worlds than to the real world where you cannot adjust the physics to suit the desired logic of the interface.
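The two-way-binding conflict in the stove example can be made concrete with a purely illustrative toy model (the class and its names are our invention, not part of the paper): when the knob's angle must always track the actual temperature, there is simply no slot left in which to store the user's *desired* temperature.

```python
class ObjectSymbolKnob:
    """Toy model of a knob that is both control and display (an object symbol)."""

    def __init__(self, temperature=20.0):
        self.temperature = temperature  # the actual stove state, in deg C

    @property
    def angle(self):
        # Evaluation: the knob's angle always shows the current state
        # (say, one degree of rotation per degree Celsius).
        return self.temperature

    def turn_to(self, target_angle):
        # Execution: the user turns the knob toward a target -- but since
        # the physics, not the user, owns the state, the desired
        # temperature has no representation inside the artifact.
        desired = target_angle
        return desired  # it exists only in the user's head

knob = ObjectSymbolKnob()
desired = knob.turn_to(200.0)
# The knob's angle still reflects the actual state, not the user's wish.
```

The point of the sketch is structural: `desired` has nowhere to live inside the artifact, which is exactly why a pure object symbol cannot express counterfactual states.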
4 The Mobility Paradigm

Another important new use paradigm is mobility, very much a consequence of mobile computing: the use of mobile, "untethered," and (usually) small units, connected through wireless technology. Mobility brings two new scenarios of use: remote operation, which is the main focus of practical applications at present; and, more important, in situ application, which is just beginning to be explored. The latter creates a new kind of situation with regard to the interface. When computer applications are brought to bear directly and dynamically on their very point of use in the real world, precisely where the user is in space and time, the user will need to relate symbols to their copresent real-world referents, contingent on real-world location and real-world changes. Contrary to the virtuality paradigm, the real world, and the actual situation of use in particular, is not a distraction but a resource, as well as an obviously present target for the use of the application.

Interface concept. The interface concept is not one and fixed. One possible concept ties in with the remote-access use scenario, basically inheriting the interface concept of the virtuality paradigm. With regard to the in situ use scenario, the issue of interface concept is interesting but so far unresolved: it is clear that, as in the ubiquity paradigm, the interface must relate closely to the objects and environment at hand in the situation of use; on the other hand, the interface must allow access to informational and computational resources not tied to a particular real-world location or time, as in the virtuality paradigm.

Use scenarios. There are thus two use scenarios. One scenario is remote access and control, that is, use independent of situation, which can be seen as extending the virtuality paradigm to allow remote operation from wherever the user happens to be; as if bringing along your desktop computer, connections and all.
The second use scenario is the exact opposite, in situ application: use is determined by and dependent upon the situation. The computational resources are brought to bear on the very situation of use and user.

Symbols. For the remote-operation scenario and interface notion, symbols work similarly to the virtuality paradigm. For the in situ application scenario and interface notion, we have a more complex situation. Some of the symbols need to relate to referents that are copresent with the user: the user needs to mentally and dynamically link present real-world referents to symbols in the interface. This is different from both the virtuality paradigm, where the user disappears into the interface, and the ubiquity paradigm, where the referents are within the artifact itself, so it puts the interaction designer in a new kind of situation. We have not really had to deal before with how the user is supposed to match symbols to particular, present real-world referents, dynamically and efficiently. In [7] there is an attempt to begin a systematic investigation of the possibilities for making this kind of linking in the particular case of visual symbols.
The Evasive Interface
125
4.1 Context Awareness and Use Situation

The mobility paradigm brings with it the opportunity and challenge of context-aware computing (CAC) [11]. Many suggested applications of CAC build on the assumption that the physical setting and situation largely define social roles and agenda. Ironically, just when we have the means to automatically silence mobile phones as we enter the meeting room (remote-operation scenario), it is becoming less obvious that we should do so, and less axiomatic that "meeting room" is a physically fixed location with this one purpose. Before, the physical environment very much determined the social environment (e.g., a classroom is for teaching, which involves teacher and pupils playing their particular roles) and, vice versa, the informational environment, i.e., the available informational and computational resources, very much determined the physical environment (e.g., to access the reference literature you would need to go to the library). With the mobility paradigm of use we are now both freer to mix environments and more exposed to inconvenient environment combinations (e.g., driving and using the mobile phone at the same time). Before, the user would typically do one thing at a time: handling the physical stuff, negotiating with people, and doing the thinking and information work, in turns. The mobility paradigm creates a condition where the total situation of use (i.e., the information situation, the social situation, and the physical setting) has to be taken into account in parallel, and where the course of events in each environment can no longer be assumed to be well correlated with the course of events in the others.
5 Conclusion

Earlier and more recent developments in HCI have worked to modify, extend and elaborate the concept of user interface, making it the complex and multifaceted notion it is today. The meeting of the established virtuality paradigm with the new ubiquity and mobility paradigms (and there are no signs at this point that any of these three paradigms will recede into the background) seems to result in a confusion of options and requirements that need to be satisfied regarding the status of symbols and their relation to the real world. The mobility paradigm, in particular, produces some new research challenges by bringing to the fore the issue of linking interface symbols to the real world at the very point of use.

Research challenges identifiable from the above analysis include: examining and developing the various ways different kinds of symbols can link to their real-world referents, as seen from the user's point of view; investigating how conceptual links can be turned into effective perceptual links; studying how to make the different statuses of relation between symbols and reality perspicuous to the user, as well as the distinction between symbol and reality itself; and finding out how, in doing all this, we can strike a balance between tracking reality and allowing the symbolic "freedom of thought" that supports the user's thinking, without confusing the user too much. When it comes to practical answers, they will certainly depend on the application, on the particular circumstances and functions.
References

1. Benedikt, M.: Cyberspace: Some Proposals. In: Benedikt, M. (ed.) Cyberspace: First Steps. The MIT Press, Cambridge MA (1991)
2. Dennett, D.C.: Darwin's Dangerous Idea. Simon & Schuster, New York (1995)
3. Gibson, J.J.: The Ecological Approach to Visual Perception. Lawrence Erlbaum Associates, Hillsdale NJ (1986)
4. Goodman, N.: Languages of Art: An Approach to a Theory of Symbols, 2nd edn. Hackett, Indianapolis IN (1976)
5. Hegel, G.W.F.: Vorlesungen über die Philosophie der Geschichte (1837)
6. Hwang, J., Jung, J., Kim, G.J.: Hand-held Virtual Reality: A Feasibility Study. In: Proceedings of the ACM Symposium on Virtual Reality Software and Technology, pp. 356–363. ACM Press, New York (2006)
7. Janlert, L.E.: Putting Pictures in Context. In: Proceedings of the Working Conference on Advanced Visual Interfaces, pp. 463–466. ACM Press, New York (2006)
8. Kirsh, D., Maglio, P.: On Distinguishing Epistemic from Pragmatic Action. Cognitive Science 18, 513–549 (1994)
9. Kuhn, T.S.: The Structure of Scientific Revolutions, 2nd edn. The University of Chicago Press, Chicago (1970)
10. Maglio, P.P., Kirsh, D.: Epistemic Action Increases with Skill. In: Proceedings of the Twenty-First Annual Conference of the Cognitive Science Society. Lawrence Erlbaum Associates, Hillsdale NJ (1999)
11. Moran, T.P., Dourish, P. (eds.): Context-Aware Computing. Special Issue of Human–Computer Interaction 16(2–4) (2001)
12. Norman, D.: Cognitive Artifacts. In: Carroll, J.M. (ed.) Designing Interaction. Cambridge University Press, Cambridge (1991)
13. Norman, D.: Emotional Design. Basic Books, New York (2004)
14. Norman, D.: The Psychology of Everyday Things. Basic Books, New York (1988)
15. Norman, D.A., Hutchins, E.L.: Computation via Direct Manipulation (Final Report: ONR Contract N00014-85-C-0133). Institute for Cognitive Science, University of California, San Diego, La Jolla CA (1988)
16. Simon, H.A.: The Sciences of the Artificial, 3rd edn. The MIT Press, Cambridge MA (1996)
17. Weiser, M., Brown, J.S.: The Coming Age of Calm Technology. In: Denning, P.J., Metcalfe, R.M. (eds.) Beyond Calculation: The Next Fifty Years of Computing. Springer, Heidelberg (1997)
An Ignored Factor of User Experience: FEEDBACK-QUALITY

Ji Hong (1) and Jiang Xubo (2)

(1) Shanghai Research Institute of China Telecom, [email protected]
(2) ISAR User Interface Design, [email protected]

Abstract. User experience plays an increasingly important role in the design and development of information products. In the field of network-based (Internet and mobile-network) applications, many research and development teams focus their user-experience work on information architecture (IA) and user interface (UI) design, which sit at the middle and front layers of the product. At the same time, a very important factor of user experience is ignored: feedback quality, which is determined by the quality of the telecommunication service supporting the product. Through long-term observation and research we find that this factor can fundamentally influence most network-based products.

Keywords: feedback quality, feedback period, feedback period integrality, feedback time.
1 Brief Introduction

At present, studies of user experience concentrate on user interface design, which is in direct contact with the user, while another important factor is ignored by most people: what we call feedback quality. By studying three kinds of remote video-surveillance software, we discovered that this ignored factor plays an important part in the user experience of information systems, which are mostly mediated by the Internet.
2 Definitions

To make our discussion clear, we introduce several definitions.

2.1 Feedback Period

A feedback period is the process from the moment the user sends an instruction aimed at the information repository until the user receives the corresponding feedback. The model is:

User <---> Intermedium <---> Information repository

Fig. 1. Feedback period model

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 127–132, 2007. © Springer-Verlag Berlin Heidelberg 2007
128
H. Ji and X. Jiang
We can see two principal parts in the model: the user and the information repository. Between them there is an intermedium. This intermedium contains the user interface that we are familiar with, but also another important part of the whole information system: the network connection.

User interface ----- network physical layer ----- information repository interface (machine interface)

Fig. 2. Composition of the intermedium
2.2 Feedback Quality

Feedback quality is a standard for measuring the efficiency of feedback, and an important factor of user experience that has been ignored for a long time. We think there are two criteria for measuring feedback quality:

1. Integrality of the feedback period, which directly decides whether the feedback period can be completed at all; in user-experience terms, it decides whether the user's needs can be satisfied or not.
2. Feedback time, also called the time a user needs to finish a feedback period, which satisfies the user's need for efficiency. Generally speaking, the user-experience field only pays attention to the effects of the user interface, but our study shows that network speed also affects user experience.
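The two criteria above can be operationalized in a small measurement harness. The sketch below is illustrative only: the helper names (`send_instruction`, `wait_for_feedback`) stand in for whatever the tested system provides, and the 2-second patience threshold is an assumption motivated by the 2–3 s delay reported in Section 5.3 as clearly too long.

```python
import time

PATIENCE_LIMIT_S = 2.0  # assumed threshold; Section 5.3 reports 2-3 s as too long

def measure_feedback_period(send_instruction, wait_for_feedback):
    """Time one feedback period: instruction out -> feedback back."""
    start = time.monotonic()
    send_instruction()              # user -> intermedium -> information repository
    complete = wait_for_feedback()  # repository -> intermedium -> user
    elapsed = time.monotonic() - start
    return {
        "feedback_time_s": elapsed,
        "period_complete": complete,  # integrality: did feedback arrive at all?
        "within_patience": elapsed <= PATIENCE_LIMIT_S,
    }

# Toy usage: a fake system that responds after 50 ms.
result = measure_feedback_period(lambda: None,
                                 lambda: (time.sleep(0.05), True)[1])
```

A broken feedback period (Section 5.3's interrupted video feed) would surface here as `period_complete` being false; a slow one as `within_patience` being false.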
3 Methodology

Our findings come from a usability test of three software products from China Telecom. The methodology is standard usability testing: real users perform selected tasks in designed scenarios; from statistics on completion times and error counts, together with interviews with the participants, we identify the usability problems of the tested products [1][2].
4 The Design of the Experiment

This test is a within-subjects comparison of three different versions of remote-controlled security-surveillance software. We recruited 7 users to perform tests of the software's main functions.

4.1 Choice of Participants

First of all we set the standards for participants: staff working in security rooms, or ordinary people without any prior experience of the tested software. On this basis we found that 2 of the 7 participants worked in selling these software products, so we excluded their data from the test.

4.2 The Test Plan

To avoid learning effects, we set a matrix (Latin-square) order for the participants in the test.
Table 1. Test order for each participant

Order   Participant 1   Participant 2   Participant 3   Participant 4   Participant 5
1       Version A       Version C       Version B       Version A       Version B
2       Version B       Version A       Version C       Version C      Version A
3       Version C       Version B       Version A       Version B       Version C
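The "matrix order" idea can be sketched as a cyclic Latin-square generator, one standard way of balancing presentation order so that each version appears in each position equally often. This is a generic illustration, not a reconstruction of the authors' exact matrix (which adds extra permutations for participants 4 and 5).

```python
# Sketch: generate cyclic Latin-square orderings so each condition
# appears in each position equally often across participants.
def latin_square_orders(conditions, n_participants):
    k = len(conditions)
    orders = []
    for p in range(n_participants):
        # Participant p starts at offset p and cycles through all conditions.
        orders.append([conditions[(p + i) % k] for i in range(k)])
    return orders

orders = latin_square_orders(["Version A", "Version B", "Version C"], 5)
# Participant 1 sees A, B, C; participant 2 sees B, C, A; and so on.
```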
4.3 Task Arrangement

We designed tasks for the users according to the main functions of the software:

1. Show the XXXX watch menu in the top left corner window.
2. Show the YYYY watch menu in the bottom right corner window.
3. Select the picture in the top left corner window, turn the camera left and up, then take an establishing shot.
4. Take a picture from the bottom left corner window.
5. Check the picture taken in step 4.

4.4 Data Collection Criteria

1. Time criterion for task completion. Timing starts after the user finishes reading the task and ends with the user's announcement of completion. If the time a user takes exceeds the average time, that user's task completion is counted as a failure.
2. Criterion for successful task completion. The user announces completion, and the completion is confirmed by the moderator.
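The time criterion in Section 4.4 (a completion slower than the task's average counts as a failure) can be sketched as follows, using the Software A, Task 1 times from Section 5.1 as sample data.

```python
# Sketch of the Section 4.4 time criterion: a participant fails a task
# if their completion time exceeds the average time for that task.
def flag_failures(times_s):
    avg = sum(times_s) / len(times_s)
    return [t > avg for t in times_s], avg

# Software A, Task 1 completion times (seconds), from Section 5.1.
task1_a = [231, 261, 453, 142, 177]
flags, avg = flag_failures(task1_a)
# avg = 252.8 s; users 2 and 3 (261 s and 453 s) exceed it.
```

Note that this criterion is relative: the slowest participants fail by construction whenever times are spread around the mean, which is worth keeping in mind when reading the failure counts in Section 5.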
5 Analysis of Data and Experimental Results

5.1 Statistics of Testing Times

Software A:

         User 1   User 2   User 3   User 4   User 5
Task 1   231s     261s     453s     142s     177s
Task 2   218s     224s     125s     22s      177s
Task 3   100s     113s     134s     37s      71s
Task 4   73s      124s     155s     36s      43s
Task 5   43s      -        -        -        -
Software B:

         User 1          User 2   User 3   User 4   User 5
Task 1   89s             26s      34s      57s      24s
Task 2   185s            82s      27s      224s     79s
Task 3   378s (failure)  124s     140s     100s     68s

Tasks 4 and 5 of Software B: the cell alignment was lost in data conversion; the recoverable values, in source order, are 52s, 245s, 66s, 57s, 23s, 94s, 24s, 49s, and 103s.

Software C: the cell alignment of this table was largely lost in data conversion. The recoverable values, in source order, are: 139s, 35s, 25s, 103s, (failure), (failure), 431s (failure), 169s, 167s, 115s, (failure), (failure), 12s, 111s, (failure), 127s, 35s, 18s, 211s (failure), (failure), 422s (failure), 93s, 325s (failure), 107s. (Section 5.3 notes that four of the five participants failed Task 2 of version C, and that Task 3 of version C showed both long times and many errors.)
5.2 Statistics of Users' Mistakes

Tasks 1 and 2 (mistakes observed across Software A, C, and B; the number of occurrences is given in parentheses):

1. User selected "Kinescope Research" to display the watch picture (2)
2. User hit the control under the "facility list" catalog, hitting images of the subdirectory instead of the camera image (5)
3. User dragged and hit images of the control (3)
4. User hit other irrelevant widgets while trying to bring up the watch pictures (4)

Task 3 (mistakes observed across Software A, C, and B):

1. User went into "Image Effect" and "Advanced Control" looking for the establishing-shot control function (3)
2. User could not find the direction control function (4)
3. User misoperated the direction control repeatedly (4)
4. User hit the wrong widget to control the establishing shot (3)
5. User confused the icons for establishing shot and close shot (5)
Tasks 4 and 5. Since all three tested versions had task failures here, comparison and statistics are hard to produce. However, the task failures themselves reveal an error-tolerance problem in the software design. When mistakes happened, none of the three products showed a clear hint or help, nor offered any supporting function, so the user's operation could only follow a single mode. If any problem occurs in this process, the user has no way to finish the whole task. This is the biggest error-tolerance problem of the three products so far.

5.3 Our Findings

After analyzing the statistics on task times and errors, together with the interviews with the participants, we found that version C gets the worst user-experience score. But we did not immediately conclude that all the problems can be ascribed to the user interface design of this version, because we found two strange phenomena:

1. Task 2 of version C is not difficult; it is just a simple select-and-perform operation. Yet we got a lot of error records: 4 out of 5 participants got lost. It puzzled us why, after learning from Task 1, the participants' performance got worse, when the difficulty of the task should have been reduced. To explain this, we reviewed the participants' post-test feedback and watched the video tapes again. Through this analysis we found that the objective reason for the failures was an interruption of the video connection: participants could not find the video they wanted, and moreover there was no clear cue for them. We define this as a break in the feedback period: its integrality is destroyed.

User interface --X-- Intermedium ----- Machine interface

Fig. 3. The integrality of the feedback period is destroyed
2. We got bad records of both task time and errors in Task 3 of version C. After reviewing the test we discovered the reason: there was a 2–3 second delay when participants tried to control the direction of the camera. That means the feedback time was far beyond the participants' limit of patience. As a result, they could not carry out operations in their accustomed way; they faced an additional learning difficulty, and their efficiency suffered.

In the end we conclude: incompleteness of the feedback period and overly long feedback time directly degrade feedback quality, and in turn reduce the user experience; the causes are not confined to the field of user interface design.
6 Conclusions

User experience is often considered to be just user interface design. Once a problem is related to a technical issue, it is easily neglected and simply regarded as a technical bug. However, through this usability testing we conclude that user experience is not just user interface design. A company should not only focus on user research and user interface design at the two ends of the feedback period, but should also pay attention to the related technical aspects. For China Telecom in particular, improving feedback quality is significant: satisfying customers' basic need for complete feedback periods, and shortening feedback time to meet their need for efficiency. User experience can be thought of as a piece of furniture made up of several pieces of wood; user research, feedback quality and user interface design are among them, and if any one is neglected, user experience will suffer. Therefore, researching and improving feedback quality, which has long been neglected, is a key point for China Telecom in improving user experience.
References

1. Rubin, J.: Handbook of Usability Testing: How to Plan, Design, and Conduct Effective Tests. John Wiley & Sons, New York, pp. 25–26
2. Dumas, J.S., Redish, J.C.: A Practical Guide to Usability Testing. Intellect, p. 4
10 Heuristics for Designing Administrative User Interfaces – A Collaboration Between Ethnography, Design, and Engineering

Luke Kowalski and Kristyn Greenwood

Oracle Corporation, 500 Oracle Parkway, Redwood Shores, CA 94065
[email protected], [email protected]

Abstract. The lack of focus on administrative interfaces often comes from management's mandate to prioritize end-user screens ahead of others. This often shortchanges a more technical class of users with unique needs and requirements. At Oracle, design heuristics for administrative GUIs were gathered from many sources in the corporate ecosystem. Ethnographers, software architects, designers, and the administrators themselves all contributed to a better understanding of this often forgotten class of user. Administrators were found to fall into anywhere from two to five classifications, depending on the size of the company. Recently, an ethnographer studied one classification, the Database Administrator, in greater detail, while a designer, in the course of an E-Business Suite Installer project, analyzed another, the Application Administrator. What emerged from the gathered data was a remarkably consistent and universal set of rules and tools that can be used to lower the total cost of ownership and increase usability, attractiveness, and satisfaction for administrative interfaces.

Keywords: Design, administrative interfaces, design techniques, heuristics, ethnographic research, design methods.
1 Introduction

Research at Oracle has identified 5 distinct administrator types, with varying amounts of skill, responsibility, and training. This number applies to large, complex environments that consume a suite of enterprise applications, or even multiple layers of the technology stack (database, application server, and application). In smaller environments administrators play multiple roles, but the general classifications break down as follows.

The Data Center Administrator tends to focus on the infrastructure. He or she does not design it, but administers the firewall and networks and manages large-scale deployments. This individual usually holds multiple operating system, network, and language certifications. Their duties are managerial in nature and include reporting, trending, and forecasting.

The System Administrators deal with hardware and some software, but on a much smaller scale than the Data Center Administrator. They usually administer servers, troubleshoot operating-system-level issues and PCs, and often work in help desk departments.

Database Administrators

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 133–139, 2007. © Springer-Verlag Berlin Heidelberg 2007
134
L. Kowalski and K. Greenwood
deal with the administration, configuration, tuning, and maintenance of databases. They possess highly specialized skills. This last user type was the subject of a 2-year-long, 8-site ethnographic study. Data collection involved 23 Database Administrators and included a self-report survey, observational sessions during which task, object, and tool use was recorded at set intervals, and a follow-up interview to elicit more qualitative data. The study was designed to find out what the administrators spend time on, what tools they use, and how this information could influence the next generation of Oracle's server products.

The fourth administrator type is the Application Administrator or Functional Administrator. These professionals usually work with a given application like Human Resources, Manufacturing, or Financials. Their duties encompass Lifecycle Change Management (LCM), which spans installation, setup, configuration, maintenance (patching), and upgrade. It is often said that the funds spent on LCM are anywhere from 2 to 4 times the initial license cost of the software. Total Cost of Ownership (TCO) issues are much more relevant in software supporting complex enterprises than in consumer software. Most of the administrator's time is spent tailoring the applications to meet the business needs and practices of a given company. The fact that an enterprise suite is installed does not mean that it is ready to use. Administrators need to configure things like security, populate or provision the system with users, and define defaults for invoicing, printers, and tax structure, among other tasks. These individuals were studied in detail in the context of a design project to improve the task completion of a suite installer. The installer was a Java-based wizard that installed the database, application server, and applications tiers of the Oracle E-Business Suite.
The last type of administrator belongs in the Business Analyst or Implementation Consultant class. They often customize the seeded business flows to meet specific business needs, or work on legacy system integration projects. When the project gets too technical they are often joined by a team of developers who extend and customize the application programmatically, often using a developer tool like Oracle JDeveloper. In studying the administrators through the ethnographic research and through a series of design projects we were able to abstract out heuristics and tools that are generalizable for most administrators and could help a designer better target their deliverables to the needs of this unique community.
2 Heuristics

Heuristic 1: Do Not Force a Graphical User Interface (GUI). Innovate Only Where Appropriate. In the ethnographic study, we found that 32 percent (see Fig. 1) of the administrators relied on the command line as their primary tool on the job. They often found it more efficient, faster, and offering more feedback than a GUI. It can also be accessed remotely. Furthermore, the UNIX command line does not require any setup or configuration to be immediately usable. Designers often assume that command-line tools and utilities only exist because engineers did not have the time to develop a GUI. Instead of forcing a GUI, it is advisable to support the habits, comfort zone, and core competencies of the administrators by developing tools that accommodate the command line. These could include repositories of custom scripts for batching jobs, or logging tools and information visualization for mentoring.
Fig. 1. Percentage of Time Using Tool Categories. Data from Oracle Study of Database Administrators.
Heuristic 2: Design Based on Observation. Do Not Rely on Self-Reported Data When Designing for Administrators. Participate in user groups and advisory councils, and include observational data. This is a near-universal truth of data-gathering methods, but we found it to be more pronounced for this type of user. Surveys and interviews provided inconsistent data compared with observational sessions. Administrators told us that they spent little time on troubleshooting, whereas the observational data showed otherwise (Fig. 2). In the study design we made sure that our sample came from a representative day and did not consist of a single task. It is recommended to combine 2 or 3 methods when gathering information for the design of administrative applications. One of them should include some form of direct observation, in context or with a prototype.
Fig. 2. Comparison of Self-Reported and Observed Database Administrator Time for the Top 5 Self-Reported Tasks
Heuristic 3: Design Lightweight and Flexible Applications to Accommodate Remote Administration. Administrators often work from home, or administer hardware located in a data center far away. We have observed that if a tool needs to be installed, or if it has slow performance, or long download times, it will not be used
136
L. Kowalski and K. Greenwood
at all. With current technology, this means thin-client web applications, as opposed to native operating system applications or Java on the client. Mobile applications are critical for administrators as well. Pagers that notify the administrator of a given alert are slowly being replaced by more intelligent devices that can provide more information about a given escalation. Personal Digital Assistants (PDAs) like Treos and BlackBerries were very popular in the Database Administrator and Application Administrator environments. One data point came from a user in a supervisory capacity. His role was to serve as a traffic master for alerts and data center escalations. He would send specific tasks to administrators based on severity and acquired competencies.

Heuristic 4: Design for Collaboration. Administrators spend a large portion of their time communicating with others. Database Administrators spent 19 percent of their time talking to others and 9 percent using e-mail. A good set of collaboration tools can help them become more efficient, automate certain tasks, or just become better organized. Accountability and record keeping also come into question. If collaboration tools are not integrated with the other tools used to monitor or tune the hardware and software, they are not considered as useful. Rob Barrett of IBM Almaden Research presented a similar finding, where collaboration was found to be a critical element in the Database Administrator's work [1]. In our study we found that administrators underreported all of the communication tasks. Once we identified collaboration as a key feature, we were able to design it into the knowledge repository and other tools used by our users, and these features were extremely well received in subsequent lab tests.

Heuristic 5: Integrate the Major Administrative Tool Silos: Collaboration, Monitoring, Information Knowledgebase.
All administrators studied expressed a desire for a better-integrated portal that would provide an overview of their systems and tools. The application to monitor and tune was only useful if it had an “in context” connection to the application that was used to troubleshoot (Information Knowledgebase, or the repository of solutions to known problems). Collaboration tools were also deemed more useful if they were integrated with their monitoring tools and were designed specifically for administrators to collaborate on lifecycle management of the software environments they were supporting. A good example of this is the ability for the administrator to append notes to an alert in the application that monitors database performance. Administrators are often presented with multiple interrupts of different priorities. We found that they could be more efficient if provided additional context. If they receive two critical notifications (running out of tablespaces) they will triage the one that involves a sales deal database before the end of the quarter and then try to troubleshoot one that belongs to a test system for a future version implementation. Heuristic 6: Documentation for Administrators Is More Frequently Referenced, Needs To Be Fresher, Task vs. Product-Based, and Include the Web. If an application administrator needs to apply patches to their system, they need to have the most recent source of truth, since patches can affect security and stability of the applications they are administering. A 3-month-old printed manual will not be as
useful as online documentation (Fig. 3). Administrators, in contrast to end users, study the documentation and form detailed project plans around installation and production deployments. Administrators also work with software and tools authored by sometimes disconnected product groups within one company. Their tasks do not correspond to product or organizational boundaries; they often span them. In working with application administrators, in the context of administering a Common Industry Format (CIF) test from NIST, we found that when administrators were stuck after reading the documentation, they turned to Google. They would often find a web-based discussion group where the exact error message had been analyzed and the problem solved. These were not always official or company-sponsored sites.
Fig. 3. Documentation in the form of a Post-Installation Page with Links to Tools, Guides, and Information Knowledgebases
Heuristic 7: Manage Complexity by Providing Defaults, and Automating Tasks. A constant point of feedback from the application administrators was a request to provide tool defaults that work. This tends to entail a reduction in the number of
screens and fewer decision points. If an administrator is using a wizard to perform an installation, they do not always want to see all the choices and all the paths (Fig. 4). Creating a Quick Install path and an Expert path resonated very well with administrators in the next iteration of the design. Sometimes intelligent assumptions are better, and the optimization, or "tweaking," can happen after the system is working in its basic configuration. Other feedback included complaints about the number of manual steps necessary to prepare for the installation. Automating some of these steps proved to be the answer. In one usability test, one issue involved the absence of system checks that, when not performed beforehand, would cause the installation to fail. One of the checks, for free disk space, took place at the end of the installation, when it was too late to do anything about it.
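The "check before, not after" lesson can be sketched as a pre-flight routine run before the installer's first screen. This is an illustration only: the function names and the 2 GiB requirement are assumptions, not values from the Oracle installer.

```python
import shutil

# Illustrative requirement only; a real installer would compute this
# from the components the user has selected.
REQUIRED_FREE_BYTES = 2 * 1024**3  # assumed 2 GiB

def preflight_ok(target_dir, required=REQUIRED_FREE_BYTES):
    """Return True if target_dir has enough free space for the install."""
    return shutil.disk_usage(target_dir).free >= required

# Run checks like this before the wizard's first screen, so a doomed
# installation fails early instead of at the end of the copy phase.
```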
Fig. 4. Managing Complexity by Providing Alternate Paths and Decreasing the Number of Decisions on a given screen
Heuristic 8: Perform Competitive Analysis, Including Open Source Tools. As much as a company may try, it is impossible to force administrators to use only your tools. They will find utilities developed by their user group, an open source product for monitoring and health checks, or even deploy your competitor's product. The more a designer studies these tools, the more effective the integration exercise becomes, and the insights can be used to enhance existing applications. Subjects of our DBA study all had their favorite collections of tools, and while there were some patterns, there seemed to be a race to discover the latest and coolest utility to make administration tasks more efficient and operations more transparent. Heuristic 9: International Focus and Hosted Applications. Administration is being outsourced. In some cases, the physical infrastructure and the software are remote to both the end users and the administrators. This is the case when a company hosts an application suite for a customer who accesses it over the Web. In other cases, only the administrators are in remote locations. Designers need to include sensitivity to other cultures and design with internationalization support in mind, including support for
10 Heuristics for Designing Administrative User Interfaces
multiple languages, bi-directional text, and the accessibility standards relevant to local government bodies. Heuristic 10: Use the Right Communication Vehicle during the Design Process. When designing for administrators, it is very common to create designs that are not implemented. Results of studies are often communicated in 100-page reports that the stakeholders do not have time to read. In contrast, posters representing flows, or "before and after" designs, are more successful. What also helps is "speaking the administrator's or the developer's language" and using the defect (bug) database to record design and usability issues. Communication among team members can also prove to be a failure point for a designer. Utilizing a collaborative tool like TWiki can accelerate communication and foster the feeling of an extended virtual team, with everyone working toward the same goal. A designer is furthermore successful if they extend their role and try to understand why technology, legal, or business issues stand in the way of their vision's implementation. Standardized testing, while not always useful in the creative phases of a project, can still be instrumental when comparing unassisted task completion rates between one release and the next, or when comparing yourself to the competition. And lastly, direct involvement with the end users and project stakeholders tends to work better than management mandates and lengthy, abstract guidelines.
3 Conclusion Administrators are not yet a fully understood user type. More work is needed to develop complete user profiles. Enterprise software also represents just one dimension: consumer companies like eBay and Yahoo are also cultivating their own administrative ecosystems. The domain is not an easy one, since it involves constantly evolving technology and industry standards. Furthermore, few enterprise installations include only the software sold; there are always legacy systems, and integration exercises present unique logistical, financial, and human factors challenges. The heuristics identified here provide a focus for a designer who is new to this domain and its user type. If taken into consideration, they will help avoid the most basic administration UI design bloopers.
Reference 1. Barrett, R.: System Administrators are Users, Too. Stanford Human Computer Interaction Seminar (May 30, 2003), http://hci.stanford.edu/seminar/abstracts/02-03/030530-barrett.html
Micro-Scenario Database for Substantializing the Collaboration Between Human Science and Engineering Masaaki Kurosu1, Kentaro Go2, Naoki Hirasawa3, and Hideaki Kasai4 1
National Institute of Multimedia Education, 2 University of Yamanashi, 3 Otaru University of Commerce, 4 NEC Software Hokkaido, Ltd.
[email protected],
[email protected],
[email protected],
[email protected] Abstract. To achieve effective and efficient human-centered design, a database of problem micro-scenarios (p-MS) is proposed. In this concept, human scientists work first, gathering information about the user and the context of use by applying field work methods. Information about problems discovered in the field data is stored in the p-MS database together with tag and ground information. Engineers who plan to manufacture something can retrieve relevant problem information from this database and can thus shorten the time required for the early stages of development. This idea of a p-MS database is believed to facilitate human-centered design, and a feasibility study will be conducted within a year of this presentation. Keywords: usability, scenario based design, micro scenario method, database.
1 Introduction Human science (especially psychology, ergonomics, sociology, and anthropology) has been serving as a source of information on human beings, especially on their internal and external behaviour, for engineering and manufacturing. The upper part of Figure 1 indicates this situation: there was merely a one-way contribution from the human science side, providing the information requested by the engineering side. But since the advent of the human-centered design paradigm, it has gradually been realized that human science should establish its own stance in the total design process, and both processes have been integrated into one, as shown in the lower part of Figure 1. This article focuses on how engineering and human science should collaborate and proposes the idea of a micro-scenario database for optimizing the collaboration. J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 140–145, 2007. © Springer-Verlag Berlin Heidelberg 2007
Fig. 1. Relationship between human science and engineering (above: the previous situation, where human science served separately, just as an information source; below: the situation where HCD is implemented and both approaches are integrated into one)
2 Collaboration Between Human Science and Engineering As shown in Figure 2, there are two types of collaboration between human science and engineering. The upper one in this figure is an idealistic type, in which the human scientist takes the first role, investigating the user characteristics and the context of use and then summarizing the requirements.
Fig. 2. Two types of collaboration between the human science and the engineering
But most real development takes the lower type, in which both parties start at the same time. Although this type of development is better than no collaboration, engineers will not wait for the requirement information to be presented to them. Because waiting is a waste of time, they start "something" in the meantime. As a result, by the time the requirement information is given, engineers may already have stepped into the design process without adequate information about the user and the context of use. If engineers are flexible and receptive, they will redo the design. But in most cases, to our regret, engineers do not lend their ears to the requirements and thus design something that does not fit the user requirements. On the other hand, the serial approach shown at the top of Figure 2 is difficult, because it is unbearable for engineers just to wait for the completion of the requirements and do nothing until then.
3 Micro Scenario Database One answer to the problem above is to construct a database of problem micro-scenarios, as shown in Figure 3.
Fig. 3. Concept of micro-scenario database
The problem micro-scenario (p-MS) is a scenario that represents the micro information structure constructed from field work data. It is the output of the first half of the micro-scenario method (Kurosu et al. 2003; Kurosu 2004, 2005, 2006) described in Figure 4. The micro-scenario method is a successor to the scenario-based design originally proposed by Carroll (1995). As shown in Figure 5, each p-MS represents a problem in terms of the relationship between the user and the artefact. Fundamental information about the user and the context of use is described as the ground information (GI) and linked to each p-MS; hence, if one wants the background information of a p-MS, it can be obtained by tracing the link to the GI. Each p-MS also carries tag information that represents the content or the domain of the problem, similar to a keyword. The tags are used to retrieve relevant p-MSs from the p-MS database, so the user of the system can collect p-MSs with similar problems and summarize the information. In this way, the p-MS database can be used to create the requirements for developing products or systems.
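The structure just described, p-MS entries carrying tags and a link to their ground information, retrieved by keyword, can be sketched as follows. All names and the simple tag-matching logic are illustrative assumptions; the paper does not specify an implementation.

```python
from dataclasses import dataclass

@dataclass
class GroundInfo:
    """GI: fundamental facts about the user and the context of use."""
    user_profile: str
    context_of_use: str

@dataclass
class ProblemMicroScenario:
    """A p-MS: one observed problem, tagged and linked to its GI."""
    description: str
    tags: frozenset
    ground_info: GroundInfo

class PmsDatabase:
    def __init__(self):
        self._entries = []

    def store(self, pms):
        self._entries.append(pms)

    def retrieve(self, keyword):
        """Return every p-MS whose tags contain the keyword."""
        return [p for p in self._entries if keyword in p.tags]

# An engineer starting a project queries by keyword, then traces the GI link:
db = PmsDatabase()
gi = GroundInfo("first-time user", "home use in the evening")
db.store(ProblemMicroScenario("Could not locate the power switch",
                              frozenset({"power", "visibility"}), gi))
for pms in db.retrieve("visibility"):
    print(pms.description, "--", pms.ground_info.user_profile)
```

In a real system the tag vocabulary would need curation, and retrieval would likely use full-text search rather than exact tag matching, but the p-MS/tag/GI triple stays the same.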
Fig. 4. Basic flow of micro scenario method
Fig. 5. Problem micro scenario
As shown in Figure 3, human scientists investigate the user and the context of use using field work methods, independently of the engineering development process. They summarize the information as sets of p-MS, tag, and GI and store them in the database. Engineers can use that database whenever they would like to start a project for manufacturing something; relevant information can be retrieved from the database by entering a keyword. Figure 6 represents the situation where the micro-scenario database is used by many engineers. In this figure, an interpreter is added at the top of each engineering project. The interpreter must have a background in usability engineering and be able to interpret the retrieved p-MSs adequately in order to create the requirements.
Fig. 6. Use of micro scenario database
4 Conclusion The p-MS database is just a concept at the time of this presentation, but it is planned to be implemented in a year or two, after which the feasibility study will be started. The authors believe that this kind of database will surely be useful for spreading human-centered design. Besides, the micro-scenario authoring tool (Kurosu et al. 2006) that has just been completed will facilitate the use of the database.
References 1. Carroll, J.M. (ed.): Scenario-Based Design: Envisioning Work and Technology in System Development. Wiley, Chichester, UK (1995) 2. Kurosu, M., Nishida, S., Osugi, T., Mitsui, M.: Analysis of Field Data by Micro-scenario Method (in Japanese). In: Proceedings of Human Interface Symposium (2003)
3. Kurosu, M.: Micro-scenario Method for Designing and Re-designing the e-Learning System. E-Learn 2004 (2004) 4. Kurosu, M.: Micro-scenario Method: A New Approach to Requirement Analysis. WWCS 2004 (2004) 5. Kurosu, M.: Micro-scenario Method (MSM): A New Approach to Requirement Analysis. Human Interface Society SIGUSE (2004) 6. Kurosu, M.: Micro-scenario Method: Interface Design Based on the Context of Use Information. Design IT (in Japanese) (2005) 7. Kurosu, M.: Scenario Creation by Using the Micro-scenario Analysis System. JPA 2006 (in Japanese) (2006) 8. Kurosu, M., Kasai, H., Hirasawa, N., Go, K.: Analysis Tool for Micro-scenario. Human Interface Society SIGUSE (in Japanese) (2006) 9. Kurosu, M.: Micro-scenario Method. NIME Report (in Japanese) (2006) 10. Ohnishi, J., Go, K.: Requirement Engineering. Kyoritsu-shuppan (in Japanese) (2002)
A Meta-cognition Modeling of Engineering Product Designer in the Process of Product Design Jun Liang, Zu-Hua Jiang, Yun-Song Zhao, and Jin-Lian Wang Department of Industrial Engineering & Management, School of Mechanical Engineering, Shanghai Jiao Tong University, 800, Dong Chuan Road, Shanghai, 200240, P.R. China {jliang, zhjiang, lemon_zhao, buada}@sjtu.edu.cn
Abstract. To enable more effective reuse of tacit knowledge in the process of product design, individual cognitive processes, cognitive factors, and cognitive strategies need to be understood in order to find the essential factors that affect the generation of tacit knowledge and control designer activities throughout the design process. These key factors are relative to individual cognitive capability and meta-cognitive level. So, based on the physical symbol system hypothesis (PSSH) and connectionism, this paper provides a meta-cognition model of the engineering product designer to elucidate active monitoring and the consequent regulation. Designers' cognitive activities in the process of product design are analyzed from the viewpoint of cognitive science. Finally, the cognitive differences between experienced designers and novices in the process of fuel injection pump design are compared and elaborated in detail. Keywords: Meta-cognition, Cognitive activity, Individual Difference, Product design.
1 Introduction In the process of product design, both routine design and knowledge-intensive intelligent design, the tacit knowledge and individual cognition of the designers affect the design, ideation, and creativity of the product. Individual cognitive activities and meta-cognitive level play important, distinct roles in each stage of engineering product design. For more effective reuse of tacit knowledge, individual cognitive processes, meta-cognitive patterns, cognitive strategies, cognitive activities, and the relationships between these conscious or unconscious processes and the concrete design problems should be understood, and the essential factors that affect the generation of tacit knowledge and control designer activities in the whole design process should be found by users. Then the tacit knowledge of the individual designer can be used more effectively to guide creative design. Furthermore, individual cognitive capability and meta-cognitive level can be improved to increase the intellectual assets and core competitiveness of the enterprise. Meta-cognition refers to our ability to monitor and consequently effectively use our cognitive processes [1]; in brief, it is "cognition about cognition" [2] and one's knowledge of and control over one's own thinking [3]. In the process of engineering J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 146–155, 2007. © Springer-Verlag Berlin Heidelberg 2007
product design, meta-cognition can monitor and control the designers' cognitive processes concerning product design. For example, in case-based design, the designer starts cognitive activities and meta-cognitive activities from design tasks and design requirements, continues with the confirmation of features, case retrieval, case revision, and case use, and ends with the accomplishment of the design. Effective design support systems must complement human cognitive activities and must be based on a sound understanding of human cognitive abilities [4]. This paper focuses on the analysis of designers' cognitive and meta-cognitive activities and builds a bridge connecting cognitive psychology and engineering product design. This paper is organized in the following way. Section 2 introduces the cognitive foundation of the designer meta-cognition model and related work on meta-cognition. Section 3 provides a meta-cognition model of the engineering product designer and presents the components of the model. Cognitive and meta-cognitive activities in the process of product design are explored and analyzed in Sect. 4. Section 5 discusses cognitive and meta-cognitive activities in fuel injection pump design and compares the cognitive differences between experienced designers and novices, and the conclusions are presented in Sect. 6.
2 Cognition Science Foundation for Individual Meta-cognition in Product Design Meta-cognition emphasizes personal mental activity, thought, perception, memory, and the interaction of cognitive activities, and pays particular attention to self-awareness and self-regulation. Meta-cognition is defined by Flavell as "knowledge and cognition about cognitive phenomena" [5], is often described as the executive process governing our cognitive efforts [1], and consists of meta-cognitive knowledge and self-regulation [6]. Susan V. Baxt [7] defined six meta-cognitive processes, i.e., problem definition, planning, strategy selection, flexibility (of strategy use), evaluating, and checking and monitoring, based on the meta-cognition models above. Meta-cognitive activity is significantly related to knowledge acquisition, skilled performance at the end of training, and self-efficacy [8]. Monitoring and control are the two important information flows: the flow from the cognitive to the meta-cognitive level allows monitoring of the cognitive level by the meta-cognitive level, while the flow from the meta-cognitive to the cognitive level allows control of cognition by meta-cognition [9]. Monitoring one's thinking and the effects of controlling it are the model's mechanisms for increasing meta-cognitive understanding [10]. Furthermore, the importance of context was emphasized by Erik Hollnagel [11], who pointed out that cognition and context are inseparable. Valkenburg R. [12] and Smith R.P. [13] studied the cognitive activities of design teams. Mao-Lin Chiu [14] considered design status to be a manner of design operation, implemented through sense input, perception processes, conception processes, status-structure processes, and memory construction processes.
3 Meta-cognition Model of Engineering Product Designer 3.1 The Framework of the Model Designer meta-cognition in the product design domain refers to designers' monitoring and control of a series of cognitive activities that marshal individual knowledge to solve design problems, carried out in self-awareness as stimulus information from the design environment interacts with their cognitive behaviors. The meta-cognitive process in product design is a continuous process of driving the design task forward until its accomplishment. Designers can cognize not only design objectives and the design process, but also their cognitive process and cognitive results, and these cognitive activities happen in a positive and self-conscious state.
Fig. 1. Meta-cognition model of Engineering Product Designer
As shown in Fig. 1, the meta-cognition model of the engineering product designer involves five sub-modules: meta-cognitive knowledge, meta-cognitive experience, meta-cognitive operation, the product design cognition sub-module, and the long-term memory module of product design knowledge, introduced in the following sections. Among them, meta-cognitive knowledge, meta-cognitive experience, and meta-cognitive operation are the hard core of the model. The long-term memory module of product design knowledge provides various kinds of knowledge for solving the design problems, and the product design cognition sub-module supports the minute cognitive activities, such as sensation and perception.
3.2 The Components of the Model 3.2.1 Meta-cognitive Knowledge Meta-cognitive knowledge refers to the beneficial knowledge, experiences, and lessons that impact cognitive processes, cognitive strategies, cognitive structures, and cognitive results during the cognitive activities of the product design process; it supports and affects meta-cognitive operation and meta-cognitive activities and transfers cognitive tasks. In product design, meta-cognitive knowledge covers three main aspects, people, tasks, and strategies, described as follows: People takes the designers themselves or others as the objects and concerns cognitive capability, intelligence level, design experiences, knowledge, cognitive structure, etc.; it involves cognizing one's own cognitive capability and perceiving cognitive states in design requirements, cognitive differences and similarities, and the special cognitive rules and experiences formed in the product design process. Tasks mainly means the cognitive knowledge used when the designer analyzes and judges detailed cognitive goals and cognitive requirements, which includes cognizing the requirements, goals, and features of cognitive tasks; the properties, characteristics, and mode of appearance of cognitive materials; and the familiarity, degree of difficulty, and schedule of the cognitive object in product design. Strategies means the cognitive knowledge and methods used by designers when they plan, employ, monitor, control, and adjust cognitive activities, which include methods for cognizing the designer's cognitive activities, analysis of the merits and demerits of cognitive strategies, guidelines for handling exceptional problems in the process, and directions for cognitive activities such as attention, memory, and thought.
3.2.2 Meta-cognitive Experience Meta-cognitive experience refers to designers' comprehension and consciousness of their cognitive activities and cognitive process; it reflects awareness and unawareness of cognitive activities and shows itself in the form of affective experience. The execution of designers' cognitive activities in product design emerges from meta-cognitive knowledge activated by meta-cognitive experience, which changes meta-cognitive knowledge from an activated level to a working state in order to serve meta-cognitive monitoring and meta-cognitive regulation. The positivity or negativity of meta-cognitive experience impacts designers' cognitive activities and decides designers' decision-making behaviors, such as different degrees of attention to the design process, cognitive strategy, and method choice, and finally determines the success or failure of the product design. Meta-cognitive experience is a mediator and a trigger for monitoring and regulating cognitive activities. From the viewpoint of engineering product design, at the initial stage of product design, the designer experiences the degree of difficulty, familiarity, and ongoing situation of the cognitive tasks. In the middle stage, the designer experiences the process of the cognitive tasks, all kinds of difficulties and obstacles in the cognitive tasks, the gap between planning and practice, and the rescheduling of cognitive strategies. At the final stage, the designer experiences the effect of the cognitive activities, the evaluation of planning and practice, meta-cognitive activities such as the improvement of cognitive strategies, and emotional experiences such as gladness and sadness. So, it is very important to
arouse designers' meta-cognitive experience in product design, because meta-cognitive experience can activate enthusiasm for cognitive activity and improve the validity of the cognitive process on design problems. 3.2.3 Meta-cognitive Operation Meta-cognitive operation refers to a series of meta-cognitive activities that monitor and regulate the designer's cognitive activities through the activation of meta-cognitive experience, where the object of study is the cognitive activities of the designers themselves. It is a continuous process of different operative behaviors; it regulates and acts on cognitive activities directly and interacts with meta-cognitive experience and meta-cognitive knowledge. Here, the operative behaviors of meta-cognitive operation include choosing, controlling, feedback, monitoring, evaluating, comparing, analyzing, etc., which are self-consciously governed by the phenomenal consciousness called the "meta-cognitive center" in this model. All these meta-cognitive operative behaviors may execute in concurrent or serial mode. For example, the operative sequence "monitoring–feedback–controlling" is a serial handling structure in the individual cognitive process of product design, but the "choosing" of meta-cognitive knowledge or design domain knowledge from the long-term memory module and the "monitoring–feedback–controlling" behaviors are dealt with in a concurrent mode. The meta-cognitive center is the core of meta-cognitive operation; it guides the operative behaviors and keeps contact with meta-cognitive knowledge. It is affected by the designers themselves and calls cognitive tasks with the corresponding cognitive strategies. Meta-cognitive operation revises meta-cognitive knowledge and responds to the activation of meta-cognitive experience. In the individual cognitive process of product design, meta-cognitive operation carries out meta-cognitive activity to monitor, control, and regulate the cognitive process of product design, interacting with the other sub-modules in the model.
3.2.4 Cognition of Product Design Cognition of product design refers to the set of cognitive activities happening in the designer's consciousness, starting from receiving the stimulation of design requirements and design tasks and ending with the completion of a concrete design. This process is a special application of general cognitive activities to product design, an interface between meta-cognition and cognition, and the cognitive channel for design problems. It includes cognitive activities in product design, the cognitive process of product design, attention, the characteristics of the cognitive tasks of product design, cognitive effects, mental feeling, etc. 3.2.5 Product Design Expertise Knowledge in Long-Term Memory From the moment the product design task comes to hand until the product is accomplished, all individual memory contents about the product design, such as the expertise knowledge, experiences, and lessons that accompany all product design activities, are stored in the long-term memory module of product design knowledge, and this module serves meta-cognitive operation. Tulving [15] divides memory into episodic memory and semantic memory. Here, semantic memory refers to memory of the general knowledge and rules of product design and relates to the connotations of the concepts that emerge from the
whole product design process. The information in episodic memory, however, comes from external information resources and concerns design experiences together with their concrete scenes and specific details. This module provides the needed expertise knowledge, domain knowledge, and other knowledge for the designer carrying out cognitive activities, and it supports meta-cognitive knowledge.
4 Relationship Between Meta-cognitive and Cognitive Activities and the Product Design Process Individual cognitive activities in the product design process mainly focus on the image and cognition of the components, concepts, execution, and completion of the cognitive tasks of design, and involve cognitive processes and mental activities such as sensation, perception, imagery, thinking, memory, and attention. Individual meta-cognition is cognition about product design cognition and a continuous process of realizing the design tasks. Designers can cognize the objective tasks as well as their own cognitive process and cognitive results. Cognitive activities and meta-cognitive activities for product design are governed and regulated in a positive and self-conscious status through self-regulation, self-awareness, and self-control. Designers start their cognitive activities, such as sensation, perception, and attention, upon receiving the stimulation of the design tasks. At the same time, meta-cognitive activities such as meta-cognitive monitoring and meta-cognitive controlling work in a concurrent mode. With the development of the product design activities, cognitive activities and meta-cognitive activities continue to advance and improve. Finally, individual cognitive and meta-cognitive activities end along with the completion of the design tasks. Observed from a specific time or space viewpoint, the cognitive and meta-cognitive activities of designers exist in a dispersed, fragmented, and concurrent mode, but over the whole design process they go along in a sequential, ordered, serial mode.
5 Cognitive and Meta-cognitive Activities and Cognitive Differences in Fuel Injection Pump Design The retrospective verbal protocols of two experienced designers and four novices have been analyzed and compared to study the cognitive process and meta-cognitive activities occurring in the PM fuel injection pump design process. 5.1 Cognition and Meta-cognition Analysis in Fuel Injection Pump Design As soon as designers get the design task of the PM fuel injection pump, their cognition and thinking start to deal with the related design tasks and cognitive tasks, from task assignment, technology resources, strategy, and role to potential problems. The design information is sensed and perceived through the designers' vision and audition, and the design requirements of the PM fuel injection pump, such as the type of matching engine and the key parameters, receive attention first. With the stimulation of design information, the meta-cognitive center handles the related cognitive information from
bottom to top. Meta-cognition analyzes the cognitive tasks, considers the designer's own role, cognitive goals, and intentions, and monitors cognitive activities through the meta-cognitive center. Meta-cognitive operation goes into effect in serial or concurrent mode, for example planning the individual cognitive process, selecting a cognitive strategy, and comparing, in mental feeling, the differences between this design task and previous design tasks. At the same time, meta-cognitive operation inspires meta-cognitive experience, which activates meta-cognitive knowledge to call up the related knowledge and design scenario segments, like PL and/or PM fuel injection pump design scenes. Designers' meta-cognitive knowledge guides and affects meta-cognitive operation and comprehends meta-cognitive experience; in reverse, meta-cognitive experience supports all kinds of operative behaviors. They interact, restrict, collaborate, and depend on one another to monitor, control, and regulate the cognitive activities in product design. The design tasks and design intention of the PM fuel injection pump, its function-behavior-structure, and its sub-goals and sub-tasks need to be arranged, discussed, and determined at meetings and branch meetings over several working days, which leads to designer cognition existing in a dispersed and fragmented mode when observed from time and space viewpoints. Designers' cognitive and meta-cognitive activities govern and dominate individual behaviors such as the lingual expression of the design scheme, the drawing practice, and the concrete design steps. As for the detailed design calculations and basic parameters, designers finish them under the direction of design templates and design manuals, or with professional software, in which there is little creative activity, so designers only need to notice, monitor, and control their cognitive activities. When designers encounter difficulties, they need to extract related experiences, knowledge, and shortcuts from the long-term memory module of product design.
Sometimes designers need to activate individual imagery, creativity, inspiration, etc. to complete the design task and design activities of the PM fuel injection pump. In general, designers' cognitive and meta-cognitive activities in PM fuel injection pump design conform to the principle of economy. 5.2 Cognitive Differences of Different Designers The whole design process of the PM fuel injection pump contains, in a nutshell, two stages: the preparation of the design scheme, and the concrete design and calculation of the PM fuel injection pump. At the first stage, the experienced designers and the novices differ in cognitive plan, cognitive strategy, and the perception and prediction of the PM fuel injection pump design process and its detailed steps. The cognitive differences between them mainly concern cognitive effects, mental feeling, cognitive goals and intentions, result prediction of cognitive tasks, cognitive process, and meta-cognitive activities, as shown in Fig. 2. For example, in cognitive effects and cognitive tasks, the experienced designers like to perceive the design tasks from all sides to plan their cognitive tasks and to transfer and use their design experience, whereas the novices focus their attention on design details and design difficulties, and their cognitive strategies differ. Furthermore, the experienced designers emphasize the utilization of the techniques, setting forms, materials, and tolerances of existing series products, like the IW fuel injection pump and the P fuel injection pump, and of mature products, like the PW2000 fuel injection pump, whereas hardly any such use of experience is found among the novices.
A Meta-cognition Modeling of Engineering Product Designer
153
Fig. 2. Meta-cognitive and cognitive activities and individual differences in the preparation of fuel injection pump design
Due to the differences between them in knowledge quantity, problem analysis, and possession of experience and shortcuts for similar design tasks, the design effects and design schemes they generate are obviously distinct at this stage. At the second stage, the experienced designers and the novices solve the minute design problems and parameter calculations. The cognitive differences here focus on the perception of key problems in the design process, design experience, knowledge quantity, and knowledge structure, which is reflected in the selection of methods for concrete parts, like the plunger and camshaft, and in the determination of parameters, like the fuel supply pressure. For example, the novices design the plunger and plunger barrel simply according to fuel delivery per cycle and duration of feeding, but the experienced designers analyze the parameters and historical data of the dimension chain and the maximum pump-end pressure of the PL and PW2000 fuel injection pumps, and consider the influence of fuel supply rate, spray quality, and the combustion-system pressure at the end of injection, to calculate the ratio of plunger diameter to effective stroke and the chute inclination of the plunger. Table 1 shows a partial comparison of the cognitive differences between the experienced designers and the novices in the fuel injection pump design process. Due to the differences in design role, cognitive tasks, cognitive strategies, knowledge structure, etc., the designers have different mental feelings, perceptions, cognitive activities, and meta-cognitive activities, and their activated meta-cognitive experience and meta-cognitive operative behaviors also differ.
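The novice approach described above, sizing the plunger directly from fuel delivery per cycle, can be illustrated with the standard pump displacement relation q = (π/4) · d² · h_e · η_v. The function name, units, and the volumetric-efficiency figure below are illustrative assumptions for this sketch, not values from the paper:

```python
import math

def plunger_diameter(q_cycle_mm3, effective_stroke_mm, vol_efficiency=0.85):
    """Solve q = (pi/4) * d^2 * h_e * eta_v for the plunger diameter d.

    q_cycle_mm3: fuel delivery per cycle (mm^3); effective_stroke_mm: h_e (mm);
    vol_efficiency: assumed volumetric efficiency eta_v. All values illustrative.
    """
    area = q_cycle_mm3 / (effective_stroke_mm * vol_efficiency)  # required bore area, mm^2
    return math.sqrt(4.0 * area / math.pi)

# e.g. 180 mm^3 of fuel per cycle with a 3.5 mm effective stroke
d = plunger_diameter(180.0, 3.5)
```

A longer effective stroke or higher volumetric efficiency reduces the required bore, which is exactly the diameter/effective-stroke trade-off the experienced designers weigh against fuel supply rate and spray quality.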
154
J. Liang et al.
Table 1. Partial cognitive differences between the experienced designers and the novices in fuel injection pump design

Differences | Experienced Designers | Novices
Cognitive People | Understand them and solve difficulties by easy stages, but lack creativity | Deficiency of self-cognition; excessive self-confidence or negativity in problem-solving; sometimes creativity
Cognitive Style | Tend to field dependence; reflective, divergent, holist | Tend to field independence; impulsive, convergent, serialist
Knowledge Quantity | Abundant expertise knowledge, domain knowledge, and practical experience | Only partial expertise knowledge, learned at university or by enterprise training
Organized Manner | Ordered, connected, and hierarchical organizing | Out-of-order, untrimmed, and random organizing
Cognitive Level of Problem | Simple and effective product design cognitive process | Form the product design cognitive process gradually
Manner of Knowledge Extraction | Extracting according to the rules of schema and hierarchy | Extracting in a stochastic and disordered manner
6 Conclusions

This paper explores the designer's cognitive activities in the process of product design and provides a meta-cognition model of the engineering product designer, which affords a bedrock of cognitive psychology for research into the cognitive process and meta-cognitive activities in the engineering design process. The core factors of the model are described and discussed in detail; they interact with, restrict, collaborate with, and depend on one another in the product design process. Meta-cognitive and cognitive activities in the process of product design are analyzed, and the cognitive differences between the experienced designers and the novices in the PM pump design process are compared. This can support and serve cognition research in engineering design. Furthermore, meta-cognitive activity can guide the reuse of important tacit knowledge and provide the designer with effective knowledge, experience, and the right design orientation. At the same time, this study provides a useful reference for research in other domains on cognitive and meta-cognitive activity.

Acknowledgments. This work is supported by the Shuguang Program of the Shanghai Educational Committee under grant No. 05SG15 and the National Basic Research Program of China (973 Program) under grant No. 2003CB317005.
References

1. Sternberg, R.J.: Human intelligence: the model is the message. Science 230(4730), 1111–1118 (1985)
2. Flavell, J.H.: Cognitive monitoring. In: Dickson, W.P. (ed.) Children's Oral Communication Skills, pp. 35–60. Academic Press, New York (1981)
3. Walczyk, J.J.: The Development of Verbal Efficiency, Metacognitive Strategies, and Their Interplay. Educ. Psychol. Rev. 2, 173–189 (1994)
4. Sherman, Y.T., Lang, J.D., Ralph, O.B.: Cognitive factors in distributed design. Comput. Ind. 48, 89–98 (2002)
5. Flavell, J.H.: Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry. Am. Psychol. 34, 906–911 (1979)
6. Brown, A.L.: Metacognition, executive control, self-regulation, and other more mysterious mechanisms. In: Weinert, F.E., Kluwe, R.H. (eds.) Metacognition, Motivation and Understanding, pp. 65–116. Lawrence Erlbaum Associates, Hillsdale, New Jersey (1987)
7. Baxt, S.V.: Metacognition gets personality: a developmental study of the personality correlates of metacognitive functioning. Carleton University, Ottawa (1995)
8. Ford, J.K., Smith, E.M., Weissbein, D.A., Gully, S.M., Salas, E.: Relationships of goal orientation, metacognitive activity, and practice strategies with learning outcomes and transfer. J. Appl. Psychol. 83, 218–233 (1998)
9. Butterfield, E.C., Albertson, L.R., Johnston, J.: On making cognitive theory more general and developmentally pertinent. In: Weinert, F.E., Schneider, W. (eds.) Memory Performance and Competence: Issues in Growth and Development, pp. 181–205. Lawrence Erlbaum, Hillsdale, New Jersey (1995)
10. Butterfield, E.C., Hacker, D.J., Albertson, L.R.: Environmental, Cognitive, and Metacognitive Influences on Text Revision: Assessing the Evidence. Educ. Psychol. Rev. 8(3), 239–297 (1996)
11. Hollnagel, E.: Cognition as Control: A Pragmatic Approach to the Modelling of Joint Cognitive Systems. IEEE Trans. Syst. Man Cybern. (in press), http://www.ida.liu.se/~eriho/Publications_O.htm
12. Valkenburg, R., Dorst, K.: The reflective practice of design teams. Des. Stud. 19, 249–271 (1998)
13. Smith, R.P., Leong, A.: Observational study of design team process: a comparison of student and professional engineers. J. Mech. Des., Trans. ASME 120(4), 636–642 (1998)
14. Chiu, M.L.: Design moves in situated design with case-based reasoning. Des. Stud. 24, 1–25 (2003)
15. Tulving, E.: Episodic and semantic memory. In: Tulving, E., Donaldson, W. (eds.) Organization of Memory, pp. 381–403. Academic Press, New York (1972)
User Oriented Design to the Chinese Industries Scenario and Experience Innovation Design Approach for the Industrializing Countries in the Digital Technology Era

You Zhao Liang, Ding Hau Huang, and Wen Ko Chiou

Chang Gung University, 259 Wen-Hwa 1st Road, Kwei-Shan, Tao-Yuan 333, Taiwan, R.O.C.
[email protected]

Abstract. Designing for Chinese industries and the new China market has become a 'hot' issue within the global and Chinese industrial design society. The characteristics of low labor costs and hard-working Chinese people have had an effect on the rapid economic development of the region as a whole. The purpose of this paper is to analyze the state of the art of industrial development within Taiwan and Mainland China, and to evaluate the critical problems of industrial design development in both regions. Additionally, it discovers how Taiwan Chinese digital technology industries confront this situation with user-oriented design (UOD). This paper synthesizes six approaches into an innovative product development framework of new product development procedures, with user-oriented scenario prediction and an experience innovation approach. These approaches not only generate original design data from a user's point of view, but also make it much easier to get consensus from product development teams and to really create innovative designs through interdisciplinary collaboration, creating innovative cultural enterprises.

Keywords: User oriented design, Scenario approach, Innovation design, Industrializing countries, Digital technology.
1 Introduction

Designing for Chinese industries and the new China market has become a 'hot' issue within the global and Chinese industrial design society. It is a key factor in the transformation from a 'developing' to a 'developed' economy through the integration and application of information and automation technologies. The characteristics of low labor costs and hard-working Chinese people have had an effect on the rapid economic development of the region as a whole. On the Pacific rim of Mainland China, it seems as though industries are following in the same footsteps as the Taiwan Chinese. Most Asian industrialized and developing countries seem to be following Taiwan's footsteps concerning industrial development. Whilst industries are rapidly progressing, most manufacturers have been reluctant to create new designs of their

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 156–163, 2007. © Springer-Verlag Berlin Heidelberg 2007
own, preferring instead to copy or imitate products that are already available in highly industrialized countries. Most manufacturers in the region involve themselves more with technical and production problems and with upgrading their production and technical quality. It is thus obvious that most makers are primarily concerned with 'how to produce' rather than with 'what to produce'. In the past, Taiwan developed the export of low-priced items based on the island's competitive edge, which stemmed from relatively low labor costs. Taiwan has been competing in terms of 'price' rather than 'quality'. Product design has not been considered 'important' and manufacturers have spent comparatively little on it. This situation has been changing as other nations with even lower labor costs produce lower-priced products. Looking particularly at the recent history of Taiwan, the slow but steady implementation of industrial design reflects this dilemma. This history can be grouped into three periods. The first, the economic industrial development period from 1966 to 1973, focused on 'design as a tool' in developing products which must satisfy local users' needs as well as environmental requirements. The second, the export industries development period from 1973 to 1989, emphasized 'design as a bridge' between foreign buyers and local manufacturers. The third period, the industrial period from 1981 to the present, has implemented 'design as a tool' in developing unique Taiwanese products for the global market. Therefore, the purpose of this paper is to analyze the state of the art of industrial development within Taiwan and Mainland China, and to evaluate the critical problems of industrial design development in both regions. Additionally, it discovers how Taiwan Chinese digital technology industries confront this situation with user-oriented design.
2 The Value of Design

Firstly, we propose how product design and development actually work.

2.1 Definition and Scope of Industrial Design

A number of managers in Taiwan's local industries have understood that industrial design is a very important element in industry. However, it is still necessary to clarify the role of industrial design as something more than cosmetic 'face-lifting' or the creation of a 'nice outer shell' surrounding technology in general. In this respect, we would like to quote the definition of industrial design as formulated by the International Council of Societies of Industrial Design (ICSID): "Industrial design is a creative activity. Its objective is to improve human life and its environment through product design which satisfies users' needs and habits, and is concerned with their functional and emotional requirements" [3]. Today, most top managers in global business enterprises have recognized the importance of industrial design, not only as an important specialized field during the product development process, but also as a quality 'tool'.
2.2 Product Design and Value Planning

To further illuminate the issue, we would like to quote the industrial design policy of the Concern Industrial Design Centre (CIDC) of Philips, Netherlands [2]: "It is the task of the CIDC to transform technology into products which are simple to produce, ergonomically correct, safe and easy to use and to service, and which are also aesthetically appealing, thereby improving man's comfort and environment". Based on this policy we can list the main design factors as 'function', 'use', 'appearance' and 'production'. Each factor significantly influences a product's quality and value. The relationship between the factors can be formulated as:

V = Q / C = (F + U + A) / C

where V = value, C = cost of materials and production, Q = quality, F = function, U = use, and A = appearance.

2.3 Function of Product Design

A number of managers in the Taiwan region imagine the function of product design to be simply a product's engineering and manufacturing. Others may think of it in terms of electronics. In fact, design can be defined as a conscious plan. Its main contribution to product development lies in the synthesis of a concept using carefully assembled facts. Design skills may be defined in relation to the type of product and may also be related to the various functions of the designer. The three main groups directly involved in the product design and development process are: 'the marketing group', 'the technical development and production group', and 'the industrial design group'. Team work is the key word during the product development process. All specialists involved cooperate according to a systematic product development pattern, and they must be competent enough to coordinate their specialized 'optimal solutions' with the expected holistic solution. This coordination creates an optimal product or product system and, at the same time, prevents the dominance of one function over another. The product development procedure is a systematic process which integrates all product design and development activities from the idea stage to mass production, to ensure the product meets market and consumer time and price needs. Product development also works as a coordinator and integrator to ensure that every functional division works as an integrated team, maintaining good communication with full commitment to the project goal.
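The value relation V = Q/C = (F + U + A)/C from Sect. 2.2 can be sketched as a small computation; the 0–10 scoring scale and the example figures are illustrative assumptions, not data from the paper:

```python
def product_value(function, use, appearance, cost):
    """Value V = Q / C, where quality Q = F + U + A (the CIDC-style factors)."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (function + use + appearance) / cost

# Two hypothetical designs scored on an assumed 0-10 scale per factor:
v_a = product_value(8, 7, 9, cost=4.0)  # well-designed, moderate cost
v_b = product_value(8, 5, 4, cost=4.0)  # same cost, weaker use/appearance
```

At equal cost, the design that scores higher on use and appearance yields the higher value, which is the point of listing all four factors rather than cost alone.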
3 Experience in Taiwan

On the Pacific rim of Mainland China, it seems as though industries are following in the same footsteps as the Taiwan Chinese, that is, developing their industries on the basis of original equipment manufacturer (OEM) orders, and then trying to upgrade to an
original brand manufacturing (OBM) level through original design manufacturing (ODM) business. Therefore we can discuss the experience of Taiwanese industry, especially in product design and development, as follows.

3.1 The Gap of the Smile Curve and Its Shifting

'ACER' has a famous brand image the world over. It symbolizes that Taiwan has not just a manufacturing industry, but can create a brand which embodies value. The founder of ACER described the current characteristics of the electronic equipment manufacturing industry with a theory called the 'smile curve'. Within the smile curve, the two sides are marketing and development; the manufacturing function is in the middle [4]. Mr. Shih encouraged Taiwan industry to move to these two functions of the smile curve within the global value system. Marketing and development have higher added value throughout the industry. Taiwan should not stay in the middle of the curve, which has lower value within the industry. Therefore Taiwan should develop industries on the basis of OEM orders, upgrading to an OBM level through ODM business (shown in figure 1).
Fig. 1. Smile curve
However, a product developed without market strategy, positioning and user knowledge will find it hard to gain customer acceptance and to become a market-recognized brand. Therefore, in the knowledge economy era we should shift the strategy from the 'smile curve theory' to the 'close cycle concept'. We should integrate manufacturing knowledge at the bottom of the smile curve, technology knowledge on the left side and marketing knowledge on the right side, but more importantly we should add the content knowledge of user-oriented needs at the top of the cycle (shown in figure 2).

3.2 The Missing Link from OEM to OBM

Regarding the industrial development gap as well as product development practice in Taiwan, the term ODM actually means 'own development manufacturing'; that is, we are qualified in technical and engineering development, but usually only offer an on-going solution with 'me too' (follower) design, rather than applying user-oriented design (UOD) principles.
Fig. 2. Smile curve shifting
Fig. 3. Reasonable product innovation development process
However, a reasonable innovation design process should first define the direction of the innovation strategies, then conduct R&D and design according to the goals of those strategies, and finally put the result into manufacture. Moreover, we should be concerned with both technological innovation and product strategy at the same time, building brand image efficiently through interdisciplinary collaborative design and marketing-based value. In order to bridge the gap of industrial development in Taiwan, UOD and 'interdisciplinary collaboration' integration should be emphasized. It is proposed to upgrade own development manufacture (follower design) to original design manufacture and to build design strategies including 'high design' on a user-oriented base, 'high tech' on a technology base and 'high management' on an interdisciplinary collaboration base (shown in figure 3).
4 Scenario and Experience Innovation Design Approach

This paper recommends six approaches to carry out a user-oriented innovative product development framework of new product development procedures, which can be applied to a series of practical cases. The approaches are as follows:

4.1 I-Ching and Darwin's Natural Law

Applying I-Ching (the theory of change) and Darwin's natural law to describe the principles by which form, shape and function are created and developed from the 'natural environment and scenario' in which things live.

4.2 Competitive Product Appraisal and Monitoring Competition

As products are developed following their 'field of use' and 'use scenario', monitoring competition from the users' point of view and from market positioning assists in evaluating their advantages and disadvantages, in order to position and define the competitive advantage.

4.3 Macro Vision Scenario

From an economic, social and technological point of view, this defines the product opportunity from a macro vision to develop the key issues for new product development.

4.4 Micro Scenario

This defines the target user group and detailed scenario situations and activities from the above product opportunities, which interact with the products (the user target groups are generated from character mapping, defined from a set of attributes relating to the product and users). From the micro scenario, key issues and design requirements for new products can be identified. The above approaches not only generate original design data from a user's point of view, but also make it much easier to get consensus from product development teams and really create innovative designs through interdisciplinary collaboration to create innovative cultural enterprises.

4.5 Scenario Observation

Observations of actual situations and interaction with actual sampling characters verify the critical issues and design requirements generated from micro scenarios, so that final design definitions become evident.
4.6 Design Development and Scenario Verification

Scenario simulation and scenario verification are facilitated by means of rough 'mock-ups', prototyping and 'field test sampling' to experience and verify users' scenarios, in order to refine designs and to reduce risks from both the users' and the business' points of view.
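The six approaches of Sects. 4.1–4.6 form an ordered progression from broad framing to verified design. A minimal sketch of that ordering follows; the step names are paraphrased and the output labels are assumptions made for illustration:

```python
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    produces: str  # main output fed to the later steps

# Ordered UOD pipeline paraphrasing Sects. 4.1-4.6
UOD_PIPELINE = [
    Step("I-Ching / natural-law framing", "principles of form, shape and function"),
    Step("Competitive product appraisal", "market positioning and advantages"),
    Step("Macro vision scenario", "key new-product-development issues"),
    Step("Micro scenario", "target users and design requirements"),
    Step("Scenario observation", "verified critical issues"),
    Step("Design development and verification", "refined, de-risked design"),
]
```

The ordering matters: each step narrows the output of the previous one, from environment-level framing down to a verified, risk-reduced design.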
5 User Oriented Innovation Design Concept

With this approach, we collaborated with ADVANTECH Co. Ltd [1], Taiwan, a leader in the industrial computing and automation market. The above methods were applied to a series of interactive interface products for e-automation systems, including an industrial automation e-platform, a service automation e-platform (medical, vehicle) and a home automation e-platform at ADVANTECH, with UOD scenario prediction and the experience innovation approach. The innovative UOD concept for e-automation industries is shown in figure 4.
Fig. 4. Innovative UOD concept [1]
6 Conclusion

The most important consideration for managers in this region is the development of marketing and design, and not just technology and production. Products have to be designed to closely fit the market and complement users' life-styles, needs and habits.
It is also essential for our region's producers to think more along the lines of long-term advantages instead of immediate profit. Manufacturers have to put more effort into creating new products as well as improving existing products. They must simultaneously establish their own corporate identity and product image to further their global development. These goals are best met by adhering to a set procedure of product development. This will give customers both what they desire and generate an inbred 'quality consciousness' toward innovative design manufacturers. As noted earlier, product development is the coordinator and integrator of the entire product development cycle. It ensures that the overall program stays on schedule and that the product introduction date is met. Most important, the whole concept is based on the premise that the customer is the boss.
7 Implications

Taiwan is an island with a population of 23 million; the market is too small for new innovative products to survive unless enterprises scale up to international markets. With a population of 1.3 billion, Mainland China's market and industries will have many more opportunities to develop innovative UOD in today's knowledge economy era.
References

1. ADVANTECH: http://www.advantech.com/
2. CIDC: Concern Industrial Design Centre, Philips, Netherlands: http://www.design.philips.com
3. ICSID: http://www.icsid.org/
4. Shi, Z.R.: Acer reconstruction: Initiating, growing up and challenge. Commonwealth Publishing (2004)
Emotional Experiences and Quality Perceptions of Interactive Products

Sascha Mahlke and Gitte Lindgaard
Centre of Human-Machine Systems, Berlin University of Technology, Franklinstr. 28/29 – FR2-7/2, 10587 Berlin, Germany
[email protected]

Human-Oriented Technology Lab, Carleton University, 1125 Colonel By Drive, Ottawa, K1S 5B6, Canada
[email protected]

Abstract. Over the past few years, various novel approaches have been applied to the evaluation of interactive systems. Particularly, the importance of two categories of concepts has been emphasized: non-instrumental qualities and emotions. In this paper we present an application of an integrative approach to the experimental study of instrumental and non-instrumental quality perceptions as well as emotional user reactions as three central components of the user experience. A study is presented that investigates the influence of system properties and context parameters on these three components. The results show that specific system properties independently influence the perception of instrumental (i.e. usability) and non-instrumental qualities (i.e. visual aesthetics). Especially the perception of instrumental qualities was shown to have an impact on the users' emotional reactions (subjective feelings as well as cognitive appraisals). There was also evidence suggesting that context parameters influenced emotional user reactions.
1 Introduction To date, approaches to the evaluation of interactive systems have mainly focused on tasks and goals, their efficient achievement, and the cognitive information processing involved [1]. In the past few years, various ideas have been discussed that go beyond the notion of efficiency and that aim to better understand how people experience technology. In this regard, two important concepts have been explored: noninstrumental qualities and emotions. In the next sections we will discuss these two concepts and their relations. 1.1 Non-instrumental Quality Perceptions Traditionally, evaluations of technology have focused on instrumental aspects of interactive systems, predominantly the concepts of usefulness and usability. Noninstrumental qualities on the other hand can be described as quality aspects that address user needs that go beyond tasks, goals and their efficient achievement. J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 164–173, 2007. © Springer-Verlag Berlin Heidelberg 2007
Mahlke [2] reviewed various approaches to the study of non-instrumental quality aspects. Briefly, he argued that two distinct categories of non-instrumental qualities have been differentiated in most approaches. On the one hand, aesthetic aspects have been discussed. These contain first and foremost visual aspects of product appearance, but can also imply other sensory experiences like haptic or auditory aspects of product use, as for example discussed by Jordan [3] and captured in his definition of physiopleasure. The other category refers to a symbolic dimension of product appearance. The concept of hedonic quality discussed by Hassenzahl [4] belongs to this category, which is similar to what Jordan [3] calls socio- and ideo-pleasure. Although much is being said about non-instrumental quality aspects and their application to design, only a few empirical studies actually measuring these have been reported. In a study of the interplay of non-instrumental quality perceptions with other concepts, Tractinsky, Katz and Ikar [5] highlighted the connection between aesthetics and usability. They argue that users’ aesthetic judgment made before using an interactive system affects their perceived usability even after using it. Lindgaard & Dudek [6] found a more complex relationship between these two concepts. Hassenzahl [4] studied the interplay between usability and hedonic quality in forming overall judgments concerning beauty and goodness. He found that judgments of beauty are more influenced by the user’s perception of the hedonic qualities, while judgments of goodness - as a more general evaluative construct - are affected by both hedonic quality and usability. Although a few empirical studies do exist that contribute to a better understanding of the role of non-instrumental qualities and their interplay with other relevant aspects of technology use, many questions remain to be addressed. 
In particular, the relationships between quality perceptions and emotional experiences have barely been explored.

1.2 Emotions as Part of the User Experience

Rafaeli and Vilnai-Yavetz [7] attempted to link quality perceptions and emotional experience. They suggested that artifacts should be analyzed in terms of three conceptually distinct quality dimensions: instrumentality, aesthetics, and symbolism. They conducted a qualitative study in a non-interactive product domain to better understand the influence of these three quality dimensions on emotional responses. All three categories contributed significantly to the emergence of emotion. Tractinsky and Zmiri [8] applied this idea to an interactive domain by studying various existing websites, which yielded similar results, and Mahlke's [9] study of actual audio players showed that various instrumental and non-instrumental quality perceptions influenced users' emotional responses. While Rafaeli and Vilnai-Yavetz [7] used interviews, Tractinsky and Zmiri [8] and Mahlke [9] applied questionnaires to assess users' emotional responses. All these studies focused on the subjective feelings that arise when perceiving or using the relevant products. Much research has been conducted on the measurement of emotion during interaction with technical devices, and different methods have been proposed to measure emotions in interactive contexts. Mahlke, Minge and Thüring [10] used Scherer's [11] multi-component model of emotion to structure a range of relevant emotion-measurement methods and to relate them to the five components of emotion:
subjective feelings, facial expressions, physiological reactions, cognitive appraisals and behavioral tendencies. Taken together, there are two major problems with the interpretation of results emerging from the studies reported above that relate emotional experiences during the interaction to users' quality perceptions [7, 8, 9]:
1. They took a quasi-experimental approach by using existing products. As it was not determined which properties of the stimuli or which other variables influenced quality perceptions and the emotional experience, this question remains unanswered.
2. Rather than measuring all five components of Scherer's [11] model, only subjective feelings were measured as indicators of emotions.

1.3 Research Approach

Mahlke and Thüring [12] describe an integrated research approach to the experimental study of emotional user reactions considering both instrumental and non-instrumental quality perceptions of interactive systems. Their model defines instrumental and non-instrumental quality perceptions as well as emotional reactions as three central components of the user experience, claiming that characteristics of the interaction affect all three of these. These characteristics primarily depend on system properties, but user characteristics and context parameters, like aspects of the tasks and the situation, can also play an important role. The outcomes of the users' interactive experience, as expressed in overall judgments of a product, usage behavior or choices among alternatives, are shown to involve all three components, namely emotional user reactions as well as instrumental and non-instrumental quality perceptions. This model has been applied to study the influence of system properties on the three user experience components and users' overall appraisal of the system [12].
In an effort to affect the perception of instrumental qualities as well as user performance, the level of usability was systematically varied, as were other system properties expected to affect the perception of visual aesthetics. Emotions were measured in terms of subjective feelings, motor expressions and physiological responses. The results confirmed that the manipulations had the predicted impact on the perception of both instrumental and non-instrumental qualities. Prototypes high in usability and attractiveness were rated significantly more highly than those low in both aspects. The results of the questionnaire assessing subjective feelings showed an effect of both factors: the effect of variations in usability was greater than that of variations in visual aesthetics on both valence and arousal measures. Consequently, the high-usability/high-aesthetics prototype was experienced as most satisfying, while the low-usability/low-aesthetics prototype was found to be most annoying. Since no statistical interaction of usability and aesthetics was found, both factors contributed additively to these emotions. EMG data from facial muscle sites and other physiological measures (dermal activity and heart rate) supported this interpretation. The following study is based on the same research approach, but differs in two aspects. First, the measurement of emotions focuses on subjective feelings and cognitive appraisals, to learn more about another component of emotion defined by Scherer [11]; second, task demands were varied as an example of contextual parameters. Hassenzahl, Kekez and Burmester [13] found that the influence of instrumental and non-instrumental quality perceptions on overall judgments differs
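The measurement choices discussed in this section can be summarized against Scherer's component model of emotion; the mapping below paraphrases the components and the example measures named in the text (the measure labels themselves are illustrative):

```python
# Scherer's five components of emotion mapped to example measures
# mentioned in the studies discussed above (labels are illustrative).
EMOTION_COMPONENTS = {
    "subjective feelings": ["valence/arousal questionnaire"],
    "facial (motor) expressions": ["EMG of facial muscle sites"],
    "physiological reactions": ["dermal activity", "heart rate"],
    "cognitive appraisals": ["appraisal questionnaire"],
    "behavioral tendencies": ["observed usage behavior"],
}

# The earlier experiment covered the first three components; the present
# study instead focuses on subjective feelings and cognitive appraisals.
measured_earlier = {"subjective feelings", "facial (motor) expressions",
                    "physiological reactions"}
measured_here = {"subjective feelings", "cognitive appraisals"}
```

Laying the components out this way makes explicit that neither study measures all five, which is the gap the authors flag in point 2 above.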
Emotional Experiences and Quality Perceptions of Interactive Products
167
depending on whether users are in a goal- or action-mode. In the goal-mode participants were required to accomplish given tasks, while in the action-mode they had the same amount of time to explore the system on their own. This variation was applied to investigate the effect of context parameters on emotional responses. The following predictions were made:

1. The versions with higher levels of usability and/or visual aesthetics would lead to higher instrumental and/or non-instrumental quality ratings.
2. Quality ratings would not be influenced by the usage mode [13].
3. The versions with higher levels of usability and/or visual aesthetics would lead to differences in the cognitive appraisal of the usage situation and more positive subjective feelings.
4. In goal-mode, the correlation between instrumental quality perceptions and subjective feelings would be higher than that between non-instrumental quality perceptions and subjective feelings. In action-mode the opposite would be found.
2 Method

The variables investigated concerned the influence of system properties associated with usability and aesthetics of the system, and of task demands (goal- versus action-mode), on the perception of instrumental and non-instrumental qualities and emotional user reactions. The latter included subjective feelings and cognitive appraisals.

2.1 Participants

Eighty undergraduate students (48 women, 32 men) participated in the study. They were between 18 and 54 years old (average 21.3 years) and received course credit for participation. Most of the participants (n = 72) owned a portable audio player and used it regularly. Almost all (n = 78) used computers daily.

2.2 Material

Portable audio players were chosen as the domain of study and different versions were simulated on a computer. The aim of the variation of system attributes was to influence perceived usability and aesthetics of the system independently. To produce two versions with different levels of usability, three system features were varied: the number of menu lines shown (five versus two), a scrollbar indicating available but hidden menu items (given or not), and a cue about the present position in the menu hierarchy (given or not). These variations had been used in a previous experiment [12], in which their effect on usability was in the predicted direction, that is, the most usable version received the highest usability ratings. With respect to system features designed to influence the perception of visual aesthetics, two different body designs were used in the earlier experiment [12], varying in symmetry (high or low), color combination (high or low color differences) and shape (round or square). Because these manipulations resulted in only small differences in perceived aesthetics between the two versions, an attempt was made here to improve the high-aesthetics version by consulting a professional designer.
168
S. Mahlke and G. Lindgaard
The prototypes were presented on a 7" TFT display with touch-screen functionality that participants could hold in their hands for providing input. The display was connected to a computer which ran the simulation of the audio player.

2.3 Design

Three independent variables were manipulated: 'usability', 'visual aesthetics', and 'mode' (goal- vs. action-mode). Since each of the variations of 'usability' and 'visual aesthetics' had two levels ('high' and 'low'), four prototypes were created: (a) 'high-usability' and 'high-aesthetics', (b) 'high-usability' and 'low-aesthetics', (c) 'low-usability' and 'high-aesthetics', (d) 'low-usability' and 'low-aesthetics'. In the goal-mode participants were required to accomplish a set of tasks, and in the action-mode they freely browsed the system for the same amount of time. All three variables were between-subjects factors.

2.4 Measures

Two types of behavioral data were recorded in the goal-mode condition to ensure that versions of assumed high or low usability differed as planned: task completion rates and time on task. Questionnaires were employed to assess the user's perception of instrumental and non-instrumental qualities. Selected sub-dimensions (controllability, effectiveness, helpfulness, learnability) of the Software Usability Measurement Inventory (SUMI) [14] served to rate usability. The dimension 'classical visual aesthetics' of a questionnaire developed by Lavie and Tractinsky [15] was used to measure visual aesthetics. Subjective emotional data were obtained via the Self-Assessment Manikin (SAM) [16], which captures the quality, or valence (positive/negative), and intensity (arousal) of emotions. Cognitive appraisals were obtained via a questionnaire based on the Geneva Appraisal Questionnaire [17]. It measures five appraisal dimensions: intrinsic pleasantness, novelty, goal/need conduciveness, coping potential, and norm/self compatibility.
Novelty is a measure of the familiarity and predictability of the occurrence of a stimulus, while intrinsic pleasantness describes whether a stimulus event is likely to result in a positive or negative emotion. A goal conduciveness check establishes the importance of a stimulus for the current goals or needs. Coping potential refers to the extent to which an event can be controlled or influenced. Norm/self compatibility describes the extent to which a stimulus satisfies external and internal standards.

2.5 Procedure

The experiment took roughly 30 minutes on average. Participants were given instructions describing the experimental procedure and the use of SAM. They were then asked to rate their subjective feelings as a baseline measure. Then, depending on the experimental condition to which they were randomly assigned, the relevant player was presented and participants rated its visual aesthetics. Next, they read a short text describing how to use the system.
Participants were then asked either to complete the set of five tasks or to explore the system for a certain amount of time. In the goal-mode condition a limit of two minutes was set for each task. Typical tasks were ‘Please have a look which songs you find on the player in the Genre POP’ or ‘Please change the sound setting of the player to CLASSIC’. However, participants actually completed the five tasks in five minutes on average. Therefore, a five-minute time limit was also set for the browsing participants. In the task condition participants filled in SAM scales after the first, third and fifth task. In the browsing condition, they were asked to rate their current subjective feeling after one, three and five minutes of exploration. At the end of this, the cognitive appraisal questionnaire was completed and usability ratings were obtained.
3 Results

A 2x2 ANOVA for 'usability' and 'visual aesthetics' was performed on the goal-mode data only, assessing task-completion rates and task-completion time. There was a significant main effect for 'usability' only, for both task-completion rates, F(1,38)=9.20, p < .01, and task-completion time, F(1,38)=13.10, p < .01. Thus, high usability led to better performance on both measures.

3.1 Instrumental and Non-instrumental Quality Perception

Table 1 summarizes the average usability and visual aesthetics ratings for each condition. The ratings were transformed to values between 0 and 1 because the range of ratings differed between the variables. The table shows that the average ratings were comparatively high even in the low-usability and low-aesthetics conditions.

Table 1. The first number in each cell represents the average usability rating and the second number the average visual aesthetics rating for each condition (ratings are transformed to values between 0 and 1)
                 Goal-mode (tasks)              Action-mode (exploration)
                 Usability high  Usability low  Usability high  Usability low
Aesthetics high  0.74 / 0.80     0.59 / 0.78    0.78 / 0.82     0.59 / 0.72
Aesthetics low   0.77 / 0.58     0.55 / 0.57    0.73 / 0.58     0.54 / 0.50
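The 0-1 transformation mentioned in the caption is a plain min-max rescaling of each questionnaire scale to its possible range; a minimal sketch (the concrete scale bounds below are assumptions for illustration, not taken from the paper):

```python
def rescale(value, scale_min, scale_max):
    """Map a raw rating onto [0, 1] so scales with different ranges become comparable."""
    return (value - scale_min) / (scale_max - scale_min)

# e.g. a rating of 4 on a hypothetical 1-5 questionnaire scale
print(rescale(4, 1, 5))  # -> 0.75
```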
A 2x2x2 ANOVA for ‘usability’, ‘visual aesthetics’ and ‘mode’ performed on the usability ratings revealed a significant main effect for ‘usability’ only, F(1,72)=9.0, p < .01. A similar 2x2x2 ANOVA carried out on the visual aesthetics ratings showed a significant main effect for ‘visual aesthetics’ only, F(1,72)=34.3, p < .001. Consistent with hypotheses 1 and 2, this suggests that the system properties affected the perception of both instrumental (i.e. usability) and non-instrumental qualities (i.e. visual aesthetics), and that quality perceptions were not influenced by usage mode.
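ANOVAs like those above can be computed directly from cell means for a balanced design; a rough two-factor sketch with made-up numbers, not the study's data (the paper's analyses also include a third factor, which works analogously):

```python
def two_way_anova_main_effects(cells):
    """cells: dict mapping (levelA, levelB) -> list of observations (balanced design).
    Returns (F_A, F_B), the F statistics for the two main effects."""
    values = [v for obs in cells.values() for v in obs]
    grand = sum(values) / len(values)
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})

    def ss_factor(levels, index):
        # between-levels sum of squares for one factor
        ss = 0.0
        for lev in levels:
            group = [v for key, obs in cells.items() if key[index] == lev for v in obs]
            ss += len(group) * (sum(group) / len(group) - grand) ** 2
        return ss

    ss_a = ss_factor(levels_a, 0)
    ss_b = ss_factor(levels_b, 1)
    # error SS: variation of observations around their own cell mean
    ss_err = sum(sum((v - sum(obs) / len(obs)) ** 2 for v in obs)
                 for obs in cells.values())
    df_err = len(values) - len(cells)
    ms_err = ss_err / df_err
    return (ss_a / (len(levels_a) - 1) / ms_err,
            ss_b / (len(levels_b) - 1) / ms_err)

# hypothetical scores where usability matters and aesthetics does not
data = {('hi', 'hi'): [10, 12], ('hi', 'lo'): [10, 12],
        ('lo', 'hi'): [6, 8],   ('lo', 'lo'): [6, 8]}
f_usability, f_aesthetics = two_way_anova_main_effects(data)
print(f_usability, f_aesthetics)  # -> 16.0 0.0
```

The F values would then be compared against the F distribution with (1, df_err) degrees of freedom to obtain p-values, as reported in the text.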
3.2 Emotional User Reactions

A series of 2x2x2 ANOVAs for 'usability', 'visual aesthetics' and 'mode' on each of the five cognitive appraisal dimensions showed that participants rated the intrinsic pleasantness of the interaction higher for the high-usability than for the low-usability version, F(1,72)=3.9, p < .05. Furthermore, the experience with the low-usability system was rated as more novel, F(1,72)=5.6, p < .05, and self/norm compatibility was higher for the high-usability version, F(1,72)=5.2, p < .05. Neither 'visual aesthetics' nor 'mode' influenced intrinsic pleasantness, novelty or self/norm compatibility, and goal conduciveness as well as coping potential showed no significant effect for any of the independent variables. In summary, then, we found partial support for hypothesis 3: there were differences on three of the five appraisal dimensions, and only the factor 'usability' had a significant influence. For the analysis of subjective feelings we calculated, for each participant, the changes from the baseline value obtained at the beginning of the experiment to the three values assessed during the interaction. For the changes from the baseline to the first two assessments of subjective feelings, the 2x2x2 ANOVAs with 'usability', 'visual aesthetics' and 'mode' as independent variables revealed no significant effects for either the valence or the arousal dimension. Figure 1 shows the average subjective feeling changes to the third data point at the end of the interaction for the four prototypes. A 2x2x2 ANOVA for 'usability', 'visual aesthetics' and 'mode' with the changes in valence as dependent variable revealed a significant effect for 'usability' only, F(1,72)=25.5, p < .05. The ANOVA for arousal as dependent variable showed no significant effects. Thus, only 'usability' affected the valence of subjective feelings, which again only partially supports hypothesis 3.
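The change scores described above are simple differences from each participant's baseline rating; a minimal sketch (the ratings here are illustrative, not study data):

```python
def change_scores(ratings):
    """ratings: [baseline, t1, t2, t3] SAM values for one participant.
    Returns the change from baseline at each later assessment."""
    baseline = ratings[0]
    return [r - baseline for r in ratings[1:]]

print(change_scores([4, 5, 3, 6]))  # -> [1, -1, 2]
```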
[Figure 1: scatter plot of changes in valence (x-axis, -2 to 2) against changes in arousal (y-axis, -1 to 1)]
Fig. 1. Changes of subjective feeling ratings from the beginning of the experiment to the third assessment during the interaction, for the four systems (square markers: high usability; round markers: low usability; filled: high aesthetics; unfilled: low aesthetics; SAM ratings were between 0 and 8)
In order to test prediction 4 we computed partial correlations between the usability and visual aesthetics ratings and subjective feelings in the two usage situations. As shown in Table 2, we found a high correlation between perceived usability and valence in the goal-mode, but none between perceived aesthetics and valence. For arousal, none of the correlations was significant. For the action-mode, the results yielded moderate significant correlations of valence with both perceived usability and perceived aesthetics. For arousal, again none of the correlations was significant.
Table 2. Partial correlation coefficients between quality ratings (usability and visual aesthetics) and subjective feelings (valence and arousal)

                                    Goal-mode (tasks)  Action-mode (exploration)
perceived usability – valence a)         .66 **             .35 *
perceived aesthetics – valence b)        -.01               .35 *
perceived usability – arousal a)         -.16               -.19
perceived aesthetics – arousal b)        .04                .22

a) visual aesthetics controlled; b) usability controlled; * p < .05; ** p < .01
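A first-order partial correlation like those in Table 2 can be obtained from the three pairwise Pearson correlations. A minimal pure-Python sketch (the data below are illustrative, not the study's):

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) *
                      sum((b - my) ** 2 for b in y))

def partial_corr(x, y, z):
    """Correlation of x and y with z partialled out (first-order formula)."""
    rxy, rxz, rzy = pearson(x, y), pearson(x, z), pearson(z, y)
    return (rxy - rxz * rzy) / sqrt((1 - rxz ** 2) * (1 - rzy ** 2))

# z is uncorrelated with x and y here, so the partial equals the raw correlation
print(round(partial_corr([1, 2, 3, 4], [1, 2, 3, 4], [1, -1, -1, 1]), 3))  # -> 1.0
```

In the study's case, x and y would be a quality rating and a valence change score, with the other quality rating as the controlled variable z.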
4 Discussion

As stated in hypothesis 1, system properties did independently influence instrumental as well as non-instrumental quality perceptions. Both the usability and the aesthetics manipulation affected quality perceptions in the predicted directions. In contrast to other studies [5, 18], we did not find any influence of the visual aesthetics variation on perceived usability. One reason may be that other studies used an overall usability rating, while we applied a detailed usability measure. No effect of the factor 'mode' was found on quality perceptions (prediction 2), as one would have expected based on Hassenzahl et al.'s [13] findings. The integration of cognitive appraisals as another component of emotions followed the recommendation by Mahlke et al. [10] to consider different components of emotions. We found an influence of the factor 'usability' on cognitive appraisals. The interaction with the low-usability system was experienced as less intrinsically pleasant, which corresponds to the findings regarding the subjective feelings. Furthermore, participants rated it as more novel or unusual, which may have led to more negative subjective feelings. The low-usability system was also rated as less self/norm compatible. Although this experiment is another step toward the study of cognitive appraisals in interactive contexts, further research is clearly needed on this topic. The users' subjective feelings were affected only by variations in usability, and only the valence dimension was influenced. Participants' subjective feelings were more positive in the high-usability condition towards the end of the experiment compared to the beginning. Surprisingly, we did not find an effect of 'visual aesthetics', although we had tried to increase the differences in visual aesthetics relative to a previous experiment [12].
The variation of usage mode revealed differences in the connections between quality perceptions and participants’ subjective feelings. These differences were most pronounced for the subjective feeling dimension of valence. While there was a high correlation between the valence of users’ subjective feelings and the perceived usability of a system and no correlation with the perceived visual aesthetics when participants focused on the given tasks in the goal-mode, we found moderate correlations between valence and both perceived usability and aesthetics when participants were merely exploring the system. These results indicate that context
parameters like usage mode influence both the specific quality dimensions that enter overall judgments [13] and the quality of the emotional experience. However, more research is needed on these relationships, especially with respect to the subjective feeling dimension of arousal. In future studies the influence of user characteristics should also be studied in addition to system properties and context parameters. Furthermore, the variation of system properties that influence non-instrumental qualities other than visual aesthetics (e.g. haptic and acoustic quality) may reveal important insights, especially for the domain of consumer electronics products.

Acknowledgements. This research was supported by the German Research Foundation (DFG) as part of the Research Training Group 'Prospective Engineering of Human-Technology Interaction' (no. 1013) and by the German Academic Exchange Service (DAAD) with a travel grant. We would like to thank Lucienne Blessing, Manfred Thüring and various colleagues at the Center on Human-Machine Systems in Berlin and the Human-Oriented Technology Lab in Ottawa for the discussions on the study.
References

1. ISO: ISO 9241: Ergonomic requirements for office work with visual display terminals. Part 11: Guidance on usability. ISO, Geneva (1998)
2. Mahlke, S.: Aesthetic and Symbolic Qualities as Antecedents of Overall Judgements of Interactive Products. In: Bryan-Kinns, N., Blandford, A., Curzon, P., Nigay, L. (eds.) People and Computers XX – Engage, pp. 57–64. Springer, Heidelberg (2006)
3. Jordan, P.W.: Designing Pleasurable Products. Taylor & Francis, London (2000)
4. Hassenzahl, M.: The Interplay of Beauty, Goodness, and Usability in Interactive Products. Human-Computer Interaction 19, 319–349 (2004)
5. Tractinsky, N., Katz, A.S., Ikar, D.: What is beautiful is usable. Interacting with Computers 13, 127–145 (2000)
6. Lindgaard, G., Dudek, C.: What is this evasive beast we call user satisfaction? Interacting with Computers 15(3), 429–452 (2003)
7. Rafaeli, A., Vilnai-Yavetz, I.: Instrumentality, aesthetics and symbolism of physical artifacts as triggers of emotion. Theoretical Issues in Ergonomics Science 5, 91–112 (2004)
8. Tractinsky, N., Zmiri, D.: Exploring Attributes of Skins as Potential Antecedents of Emotion in HCI. In: Fishwick, P. (ed.) Aesthetic Computing. MIT Press, Cambridge (2006)
9. Mahlke, S.: Studying user experience with digital audio players. In: Harper, R., Rauterberg, M., Combetto, M. (eds.) ICEC 2006. LNCS, vol. 4161, pp. 358–361. Springer, Heidelberg (2006)
10. Mahlke, S., Minge, M., Thüring, M.: Measuring multiple components of emotions in interactive contexts. In: CHI '06 Extended Abstracts on Human Factors in Computing Systems, pp. 1061–1066. ACM Press, New York (2006)
11. Scherer, K.R.: What are emotions? And how can they be measured? Social Science Information 44, 693–727 (2005)
12. Mahlke, S., Thüring, M.: Antecedents of Emotional Experiences in Interactive Contexts. In: CHI '07 Proceedings on Human Factors in Computing Systems. ACM Press, New York (2007)
13. Hassenzahl, M., Kekez, R., Burmester, M.: The importance of a software's pragmatic quality depends on usage modes. In: Luczak, H., Cakir, A.E., Cakir, G. (eds.) Proceedings of the 6th International Conference on Work With Display Units (WWDU 2002), pp. 275–276. ERGONOMIC Institut für Arbeits- und Sozialforschung, Berlin (2002)
14. Kirakowski, J.: The software usability measurement inventory: Background and usage. In: Jordan, P.W., et al. (eds.) Usability Evaluation in Industry, pp. 169–178. Taylor & Francis, London (1996)
15. Lavie, T., Tractinsky, N.: Assessing dimensions of perceived visual aesthetics of web sites. International Journal of Human-Computer Studies 60, 269–298 (2004)
16. Lang, P.J.: Behavioral treatment and bio-behavioral assessment: Computer applications. In: Sidowski, J., Johnson, H., Williams, T. (eds.) Technology in Mental Health Care Delivery Systems, pp. 119–137. Ablex Publishing, Greenwich (1980)
17. Scherer, K.R.: Appraisal considered as a process of multi-level sequential checking. In: Scherer, K.R., Schorr, A., Johnstone, T. (eds.) Appraisal Processes in Emotion: Theory, Methods, Research, pp. 92–120. Oxford University Press, New York (2001)
18. Ben-Bassat, T., Meyer, J., Tractinsky, N.: Economic and Subjective Measures of the Perceived Value of Aesthetics and Usability. ACM Transactions on Computer-Human Interaction 2, 210–234 (2006)
CRUISER: A Cross-Discipline User Interface and Software Engineering Lifecycle Thomas Memmel, Fredrik Gundelsweiler, and Harald Reiterer Human-Computer Interaction Lab University of Konstanz, D-78457 Konstanz, Germany {memmel,gundelsw,reiterer}@inf.uni-konstanz.de
Abstract. This article seeks to close the gap between software engineering and human-computer interaction by indicating interdisciplinary interfaces of SE and HCI lifecycles. We present a cross-discipline user interface design lifecycle that integrates SE and HCI under the umbrella of agile development. Keywords: Human-Computer Interaction, Usability Engineering, Extreme Programming, Agile Modeling, User-centered Design & Development (UCD).
1 Human-Computer Interaction and Software Engineering

From its birth in the 1980s, the field of human-computer interaction (HCI) has been defined as a multidisciplinary subject. To design usable systems, experts in the HCI arena are required to have distinct skills, ranging from an understanding of human psychology to requirements modeling and user interface design (UID) [1]. In this article we will use the term user interface (UI) designer as a synonym for a professional who combines knowledge of usability, graphics and interaction design.

Table 1. Methods for integrating SE and UE, based on [2] (excerpt)

Integration issue: Mediating and improving the communication lines between users, usability experts and developers
Method of application: Use medium-weight artifacts, work with toolkits appropriate for collaborative design, talk the same language, work in pairs

Integration issue: Extending software engineering artifacts for UI specification & conceptualization
Method of application: Use artifacts known by both professions and adjust their expressiveness

Integration issue: Extending RE methods for collecting information about users and usability
Method of application: Include principles, practice and light- to medium-weight methods from HCI into RE

Integration issue: Representing design artifacts including prototypes using different formalisms
Method of application: Apply prototyping as a method of participatory design; all stakeholders gather requirements
Whereas HCI focuses on UID issues such as ease of use, ease of learning, user performance, user satisfaction or aesthetics, software engineering (SE) considers how

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 174–183, 2007. © Springer-Verlag Berlin Heidelberg 2007
CRUISER: A Cross-Discipline User Interface and Software Engineering Lifecycle
175
functional requirements are translated into a running system. HCI and SE are recognized as professions made up of very distinct populations. Each skill set is essential for the production of quality software, but neither set is sufficient on its own. The interaction layer is the area where HCI and SE are required to work together, in order to ensure that the resulting software product behaves as specified in the initial requirements engineering (RE). To provide a high level of UI usability, SE has to work with people with a background in HCI, but the course of collaboration is mostly unclear. Thus, classic and agile SE methods still lack integration of HCI methods and processes (see Table 1). Bearing these two different engineering disciplines in mind, each software design process can be characterized in terms of its engineering orientation, ranging from a formal and model-based methodology to an informal explanatory design. SE tends to be more formal and "consequently, the business user and IT analyst may think that they both agree on a design, only to discover down the line that they had very different detailed implementations and behaviors in mind" [3]. Very formal or complex models are an inappropriate basis for communication, especially so for collaborative design processes with high user- and business-stakeholder participation. Scenarios [4] - known as user stories in Extreme Programming (XP) [5] - and prototypes are recognized as an interdisciplinary modeling language for RE and as bridging techniques between HCI and SE [6]. In SE, scenarios - as a sequence of events triggered by the user - are generally used for requirements gathering and for model checking. HCI applies scenarios to describe software context, users, user roles, tasks and interaction [4]. Prototypes in SE are used to verify functional specifications and models.
Agile Modeling (AM) and XP recognize prototypes as a type of small release [5, 7], whereas HCI mainly employs them for iterative UID [8]. The bottom line is that some informal methods of XP and AM are close to HCI practice and are therefore the pathfinder for a common course of action. While heavy-weight methods such as style guides (HCI) are far too expensive, lightweight methods such as essential use cases (SE) are in contrast too abstract for system specification. Cross-discipline agile methods are the optimum, and workable, compromise. Agile approaches of both SE [5] and HCI [9, 10] are therefore the interface for our common and balanced software lifecycle known as CRUISER.
2 From XP to Agile Cross-Discipline Software Engineering

In contrast to classic, heavy-weight SE processes like the V-Model, agile methods begin coding at a very early stage while having a shorter up-front RE phase. Following the paradigm of XP, implementation of code takes place in small increments and iterations, and the customer is supplied with small releases after each development cycle. During the exploration phase, teams write user stories in an attempt to describe user needs and roles. But the people interviewed need not necessarily be the end-users of the eventual software. XP therefore often starts coding based only on assumptions about end-user needs [10]. AM is less rigid than XP and takes more care over initial RE, as it provides room for low-fi prototyping, activity diagrams or use-case diagrams [11]. Nevertheless, the analysis phase is finished as soon as requirements have been declared on a horizontal
176
T. Memmel, F. Gundelsweiler, and H. Reiterer
level, because the iterative process assumes that missing information will be filled in at later stages. Development in small increments may work properly as long as the software is not focused on the UI. Changes to software architecture usually have no impact on what the user sees and interacts with. With the UI, however, it is a different story. When designing UIs, continual changes may give rise to conflicts with user expectations and learnability, cause inconsistency and finally lead to user dissatisfaction. Thus, agile development does not really qualify as user-centered design (UCD), but can function as one pillar of an integrated approach [10]. Both SE and UID have to cope with a shorter time-to-market, in which the quality of the delivered software must not suffer. This is therefore a great challenge both for management and for the methods and tools applied. Our idea is a balanced hybrid process, which is both agile SE and agile UCD, and which is consistent with the principles and practices of both disciplines. In order to identify interfaces between agile SE and agile HCI, we have to highlight different approaches to UID and analyze their agile potential and their different contributions to a cross-discipline process. Like XP, original UCD is a highly iterative process. It differs from agile methods, however, in that real users are taken into account and the development team tries to understand user needs and tasks before any line of code is written. The lifecycles of usability engineering processes [4, 12] provide numerous methods and tools that should support the designer in gathering all of the required information. Most of these methods are rated as heavy-weight, due to their claim to analyze and document as much as possible about users, work flows, context, etcetera right from the beginning.
Constantine [9] argues that UCD produces design ideas in a rather magical process in which the transformation from claims to design is neither comprehensible nor traceable. Such a "black box designer" produces creative solutions without being able to explain or illustrate what goes on in the process. Furthermore, UCD tries to converge the resulting, often diverse, design alternatives into a single solution, which is then continuously evaluated and refined. UCD may therefore take a long time, or even fail, if too many users are involved and narrowing the design space is difficult. Iteration may create the illusion of progress, although the design actually goes round in circles and solutions remain elusive. Altogether, a one-to-one integration of UCD processes and methods is in general inappropriate for an agile course of action. Constantine's usage-centered design approach takes up the basic philosophy of AM and concentrates on essential and easy-to-understand models. Through their application, HCI becomes more formal, but the simplicity of their syntax still enables collaborative design by engineering rather than by trial and error [9] (see Table 2). Although the list of usage-centered design success stories is creditable, the products praised tend to support user performance rather than user experience. This cannot be the only aspiration of a modern design approach, however. This is where Donald Norman's recently proposed activity-centered design approach (ACD) [13] comes in. Products causing a high joy of use can reach great user acceptance even when they lack usability. Norman therefore votes for the integration of emotional design issues and the stronger consideration of user satisfaction. In Löwgren and Stolterman's book about thoughtful interaction design (TID) [14], the designer, in order to design such highly usable and aesthetic systems, switches between three levels of abstraction: vision, operative image and specification.
If the designer is confronted with a design situation, at first an often sketchy and diffuse vision emerges. Frequently, several visions are promising and therefore compete to be implemented, eventually resulting in a chaos of conflicting visions. The initial
CRUISER: A Cross-Discipline User Interface and Software Engineering Lifecycle
177
version of the operative image is the first externalization of the vision, e.g. captured in mock-ups or elaborated interactive (hi-fi) prototypes. It enables manipulation, stimulation, visualization and decision making for the most promising design. The designer wants to learn as much about the design space as possible, narrowing the design towards the best solution as late as possible. The operative image is transformed into a (visual) specification of the final design once it is sufficiently detailed. Table 2 shows a comparison of the design approaches under discussion. Our development lifecycle is built on the core methods of all the approaches presented, such as selective user involvement (UCD, ACD), prototyping for visual thinking (TID), and modeling with scenarios or task maps (usage-centered design).

Table 2. Comparison of user interface design approaches, adapted from [9]

User-Centered Design: focus is on users; substantial user involvement; user studies, participatory design, user testing; iterative prototyping; informal, black-box process; design by trial-and-error.

Usage-Centered Design: focus is on usage; selective user involvement; explorative modeling, model validation, usability inspections; modeling & abstract prototyping; systematic, specified white-box process; design by engineering.

Activity-Centered Design: focus is on activities; authoritative user involvement; sequential activity/task analysis, emotional design; understanding activities; rule-breaking black-box process; design by authority.

Thoughtful Interaction Design: focus is on design; thoughtful user involvement; visual thinking, participatory prototyping, visual specification; switching between abstract and detail; depends on guidance and authority; design by visual thinking.
All designers in a project need to have a similar understanding of the vision and the wholeness of the system (TID). Thus continuous and lively discussion is necessary (XP). Informal communication across organizational borders should be easy, and teams should have common spaces (XP). Since reaching agreement on abstract notions (text) is difficult, ideas have to be made visible, allowing participants to look at, feel, analyze and evaluate them as early as possible (XP, AM). The process should be controlled by an authoritative person who must have a deep understanding of both SE and HCI. With our demand for such highly capable personnel, we concur with what XP and AM announced as one of their most important criteria for project success [5]. The leader navigates through the development process, proposes solutions to critical design issues and applies the appropriate design, engineering and development methods. Since the gap between SE and HCI becomes less significant “when the (HCI) specialist is also a strong programmer and analyst” [2], we chose XP as fundamental to our thoughts on bonding SE and HCI. Its principle of pair programming allows people with different fields of expertise, but common capabilities, to design a system together.
178
T. Memmel, F. Gundelsweiler, and H. Reiterer
The basis of our cross-discipline lifecycle is therefore the identification of similarities between XP and HCI (see Table 3), AM and HCI (see Table 4), as well as ACD and TID when compared to HCI, AM and XP (see Table 5). We outline some major similarities, although our comparison highlighted many more interfaces between these disciplines. Although different in their wording, agile principles and practices are comparable to HCI practices and show a significant overlap, for example in iterative design, small releases and prototyping, story cards and active stakeholder participation versus scenarios, or testing and evaluation. Modern UID approaches do not oppose collaboration with SE; on the contrary, they underline the commonalities.

Table 3. Similarities between XP and HCI (excerpt)
XP Practice | HCI Practice
Iteration, Small Increments, Adaptivity | Prototyping
Planning Game | Focus Groups
Story Cards, Task Cards, User Stories | Scenarios, User Profiles, Task Model
Table 4. Similarities between AM and HCI (excerpt)
Agile Modeling Practice | Usability Engineering Practice
Prove It With Code | Prototyping
Create Several Models in Parallel | Concurrent Modeling
Active Stakeholder Participation | Usage-Centered Design, User Participation
Consider Testability | Evaluation, Usability Inspections
Table 5. Overall comparison of agile SE, usual HCI and other practice (excerpt)
AM & XP Practice | HCI Practice | TID & ACD Practice
Minimalist documentation | Comprehensible models | Interactive representations
Show results early | Lo-/Hi-Fi prototyping | Make ideas visible asap
Small teams, design rooms | Design rooms, style guides | Informal communication
Active stakeholder part. | Collaborative design | Externalization of visions
User performance | User performance, user experience | User performance, user experience, hedonic quality
3 Agile Cross-Discipline User Interface Design and Software Engineering Lifecycle

Our agile cross-discipline user interface and software engineering lifecycle, called CRUISER (Cross-Discipline User Interface and Software Engineering Lifecycle), originates in our experience of developing various kinds of interactive software systems in teams with up to 20 members [16]. Although CRUISER is based on XP, we firmly believe that our lifecycle can scale to larger teams, bearing in mind success stories of agile development with several hundred team members [17] and within large organizations [18]. For the following explanation of CRUISER, we concentrate on those issues that need to be worked out collaboratively by HCI and SE experts. SE practices that are independent of UID are not described in detail.
CRUISER starts with the initial requirements up-front (IRUP, see Table 6), which must not take longer than the claims analysis in XP. The agile timeframe can be preserved if the methods employed can be rated as agile (see Tables 3, 4 and 5) and interdisciplinary. Concerning the design of the UI, XP and AM practice is not sufficient and has to be complemented by UID practice and authoritative design (TID, ACD).

Table 6. CRUISER initial requirements up-front; contributions of disciplines
Agile SE: Use Cases, Usage Scenarios; Technical Requirements; User Performance Goals
Human-Computer Interaction: Role & Task Model; User-, Task-, Interaction Scenarios; Essential Use Cases; UI Patterns; User Experience Goals
Authoritative Design: Sketchy Design Visions; Hi-Fi Prototypes; Hedonic Quality Goals
As discussed in Chapter 2, the real users have to be taken into account rather than just stakeholders of any kind. Appropriate cross-discipline methods for analyzing user needs are role models and task models. The model-based RE proposed by [9] focuses on surveying essential information and suits an agile course of action thanks to its use of index cards. The user roles are prioritized (Focal User Roles) and sorted in relation to their impact on product success. Finally, essential use cases describe user tasks and enable the building of the task model and task map. Like user roles, task cases are sorted in accordance with Kent Beck's proposal: "required - do first, desired - do if time, deferred - do next time". Whenever necessary for understanding and communication, scenarios are established: for a shared understanding among developers and for communication with stakeholders, all models are translated into scenarios, which can focus on different aspects of UID (users, tasks, interactions).
Since agile methods do not consider the UI in detail, they do not recognize extensive style guides as used in HCI practice. We therefore suggest lightweight style guides that are shorter, more relevant and contain UI patterns [19]. They ease the design process by providing design knowledge and experience (AM: Apply Design Standards, Use Existing Resources). During all IRUP assignments, users, HCI, SE and business personnel support and finalize RE with initial discussions about scenarios and design alternatives. This alone will result in various outline visions such as mockups or prototypes that make up the initial project design space. In contrast to other HCI lifecycles (e.g. [12]), CRUISER envisions the externalization of design visions even before the requirements analysis is finished. In our opinion, this break with common HCI practice enables the UI designer to decide
very early about the degree of user involvement and the necessity of more innovative solutions. He or she can have considerable influence on balancing user performance, user experience and hedonic quality demands, and can guide the IRUP accordingly.
The second phase of the development process is the initial conceptual phase (ICP, see Figure 1). In the ICP we envisage separating ongoing UI prototyping from architectural prototyping whenever possible, to speed up the process. The conscientious application of software patterns [19] facilitates this procedure. The development of UI and system architecture can take place in parallel as soon as a minimalist, common UI specification [13] is generated and the necessary interfaces are identified. Dependencies between UI and system architecture can be found with the help of task cases and scenarios established during IRUP. It is very likely that highly interactive UIs will have a greater impact on the system architecture.
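The task-case prioritization performed during IRUP (sorting index cards into Beck's "required", "desired" and "deferred" categories, ranked by their impact on product success) can be sketched in code. The sketch below is illustrative only; the class and field names are our assumptions, not part of CRUISER.

```python
# Hypothetical sketch of IRUP index cards: task cases are sorted into
# Kent Beck's categories (required / desired / deferred) and ranked by
# their impact on product success. Names and fields are illustrative
# assumptions, not prescribed by CRUISER.
from dataclasses import dataclass, field

PRIORITY_ORDER = {"required": 0, "desired": 1, "deferred": 2}

@dataclass
class TaskCard:
    name: str
    priority: str               # "required" | "desired" | "deferred"
    impact: int                 # estimated impact on product success
    scenarios: list = field(default_factory=list)

def plan_iteration(cards):
    """Order cards as 'required - do first, desired - do if time,
    deferred - do next time', breaking ties by impact."""
    return sorted(cards, key=lambda c: (PRIORITY_ORDER[c.priority], -c.impact))

cards = [
    TaskCard("export report", "deferred", 2),
    TaskCard("log in", "required", 9),
    TaskCard("search catalogue", "desired", 6),
]
for card in plan_iteration(cards):
    print(card.priority, card.name)   # required first, deferred last
```

The same ordering can then drive iteration planning: required cards enter the current iteration, desired cards fill remaining capacity, and deferred cards wait for the next one.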
Fig. 1. CRUISER initial conceptual phase
As discussed, prototypes are common practice in HCI and SE. The overall purpose of the ICP is therefore the generation of more detailed and interactive prototypes for narrowing the design space towards a single solution through discussion with stakeholders and through scenario refinement [3]. For this assignment, the designer must leap between abstract and detailed levels of prototyping, always considering a timeframe and expressivity suitable for an agile project environment (see Table 7). Bearing in mind the claims of agile methods, prototypes should be easy to work with and, above all, quick to produce and easy to maintain. With more interactive and complex external representations, the designer conducts a dialogue about design solutions and ideas. Prototypes that are visually more detailed help us to overcome the limitations of our cognitive abilities to process, develop, and maintain complex ideas and to produce a detailed operative image (TID). As long as the prototype can be modified using simple direct manipulation techniques, the users can be proactively involved in the participatory process. In addition to low-fi prototyping for e.g. conceptual design, a modern UID approach must also provide methods and tools for hi-fi prototyping that overcomes most of the disadvantages mentioned in Table 7. We
recommend prototyping tools such as Macromedia Flash and iRise Studio. They are easy to use for all stakeholders due to the absence of coding, they allow reuse of components through the application of patterns or templates, and they produce running interactive simulations that can be enhanced into small releases.

Table 7. Low- and High-Fidelity Prototyping, based on [8] (excerpt)
Low-Fidelity
Advantages: less time & lower cost; evaluate multiple concepts; communication device; address screen layout issues
Disadvantages: limited usefulness for usability tests; navigational and flow limitations; facilitator-driven; poor specification
High-Fidelity
Advantages: partial/complete functionality; interactive; use for exploration and test; marketing & sales tool
Disadvantages: time-consuming to create; inefficient for proof-of-concept designs; blinds users to major representational flaws; management may think it is real
Interactive prototypes can also run as "Spike Solutions", which are used to evaluate and prove the functionality and interoperability of UI concepts and system architecture. More importantly, they can be applied as visual, interactive UI specifications in the ensuing construction phase. Visual specifications are unambiguous and can guarantee that the final system matches stakeholder expectations about UI design and behavior. The prototyping-based process minimizes the risk of making wrong design decisions and leads the way towards a winning design solution. Through the well-balanced and thoughtful application of selected RE methods, such as abstract modeling or detailed prototyping, CRUISER avoids design by trial-and-error and makes the design process move forward in a traceable manner. The process of identifying the most promising design solution is guided by UI evaluations, which can be kept at low complexity if the UE methods applied are agile [20]. In order to give due regard to the UI's hedonic qualities, such as the ability to stimulate or to express identity, we envision a design review with AttrakDiff [15].
On entering the construction and test phase (CTP), coding starts (see Figure 2). In this phase, the CRUISER lifecycle closely resembles the incremental and iterative manner of XP. CTP therefore begins with iteration planning and the creation of unit and acceptance tests, which are later used to evaluate parts of the system architecture (e.g. automatically) and the UI (e.g. with extreme evaluations [20]). The latter guarantees that the previously defined usability or hedonic quality goals are properly taken into account. UI evaluations are only executed if a usability expert on the team identifies a need for them. We therefore recommend the integration of HCI personnel in the pair programming development.
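As a minimal sketch of the test-first step at the start of the CTP (not code from the paper), an acceptance test for a hypothetical task case can be written before the code it evaluates; the task, function names and criteria below are invented for illustration.

```python
# Minimal sketch of test-first development in the CTP, under assumed names:
# the acceptance test for a hypothetical catalogue-search task case is
# written first, then the feature is coded until the test passes.

def acceptance_test(search):
    """Acceptance criteria taken from the (invented) task card: results are
    ranked by relevance score, and an unmatched query yields an empty list."""
    items = [("Blue Chair", 0.4), ("Chair Pad", 0.9), ("Table", 0.1)]
    assert search("chair", items) == ["Chair Pad", "Blue Chair"]
    assert search("sofa", items) == []

def search_catalogue(query, items):
    """Implementation written after the test, just enough to satisfy it."""
    matches = [(name, score) for name, score in items
               if query.lower() in name.lower()]
    return [name for name, _ in sorted(matches, key=lambda m: -m[1])]

acceptance_test(search_catalogue)  # raises AssertionError until the code is correct
```

In CRUISER the same pattern applies to UI goals: an evaluation criterion (usability or hedonic quality goal) is fixed before coding, and the small release is checked against it.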
As with the construction of prototypes, the actual coding of UI and system architecture again takes place in parallel, and components of the UI that have great impact may be developed faster initially and then later refined during the following iterations. As in XP, the CTP ends with the deployment of a small release. Before the next iteration starts, each small release can again be evaluated using cheap and fast methods [20]. If usability or hedonic quality issues are identified, they can also be
documented on index cards ("defect cards"). Each defect is assigned to its corresponding task case. The usability defects can be sorted and prioritized, and thus reviewed during the following iterations. If usability or design catastrophes occur, HCI and SE experts and stakeholders can decide on the necessary measures. The last step in the CRUISER lifecycle is the deployment phase. While users are working with the system, new functionality may be requested, or usability and design issues that were underrated during the iterations may be raised. The lifecycle therefore allows for a return to earlier phases to cater for such new requirements.
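The defect-card handling just described can be sketched as follows; the fields, severity scale and escalation rule are assumptions made for illustration, not prescribed by CRUISER.

```python
# Hypothetical sketch of "defect cards": each usability defect found when
# evaluating a small release is recorded, assigned to its task case, and
# prioritized for a later iteration. The fields, severity scale and
# escalation rule are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class DefectCard:
    description: str
    task_case: str          # the task case this defect is assigned to
    severity: int           # e.g. 1 = cosmetic ... 4 = usability catastrophe

def review_queue(cards):
    """Queue ordinary defects, most severe first; usability catastrophes
    (severity 4) are escalated to HCI/SE experts and stakeholders instead."""
    queued = sorted((c for c in cards if c.severity < 4),
                    key=lambda c: -c.severity)
    escalated = [c for c in cards if c.severity == 4]
    return queued, escalated

cards = [
    DefectCard("label ambiguous", "search catalogue", 1),
    DefectCard("data loss on cancel", "edit profile", 4),
    DefectCard("confusing navigation", "checkout", 3),
]
queue, catastrophes = review_queue(cards)
```

Linking each card to a task case keeps the defect backlog traceable to the requirements established during IRUP.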
Fig. 2. CRUISER construction and test phase
4 Summary

Our motivation was to take a step towards a cross-discipline procedure for software design with respect to agile movements. With the CRUISER lifecycle, we bridge HCI and SE based on the commonalities of both fields. Similarities can be found in basic principles and practices as well as among the methods and tools that are typically applied. CRUISER has important links to XP [5], but differs from it in many important aspects related to AM, HCI and beyond. For integrating all critical disciplines under the umbrella of one common lifecycle, we concur with the findings of interdisciplinary researchers and use scenarios and prototypes as fundamental artifacts propelling a design process with high involvement of users and stakeholders.
References

1. Pyla, P.S., Pérez-Quiñones, M.A., Arthur, J.D., Hartson, H.R.: Towards a Model-Based Framework for Integrating Usability and Software Engineering Life Cycles. In: Proceedings of Interact 2003, Zurich, Switzerland, September 1-3. IOS Press, Amsterdam (2003)
2. Seffah, A., Gulliksen, J., Desmarais, M.C. (eds.): Human-Centered Software Engineering – Integrating Usability in the Development Process, pp. 3–14. Springer, Heidelberg (2005)
3. Zetie, C.: Show, Don't Tell – How High-Fidelity Prototyping Tools Improve Requirements Gathering. Forrester Research Inc. (2005)
4. Rosson, M.B., Carroll, J.M.: Usability Engineering: Scenario-Based Development of Human-Computer Interaction. Morgan Kaufmann, San Francisco (2002)
5. Beck, K.: Extreme Programming Explained. Addison-Wesley, London (1999)
6. Sutcliffe, A.G.: Convergence or Competition between Software Engineering and Human-Computer Interaction. In: Seffah, A., Gulliksen, J., Desmarais, M.C. (eds.) Human-Centered Software Engineering – Integrating Usability in the Development Process, pp. 71–84. Springer, Heidelberg (2005)
7. Blomkvist, S.: Towards a Model for Bridging Agile Development and User-Centered Design. In: Seffah, A., Gulliksen, J., Desmarais, M.C. (eds.) Human-Centered Software Engineering – Integrating Usability in the Development Process, pp. 219–244. Springer, Heidelberg (2005)
8. Rudd, J., Stern, K., Isensee, S.: Low vs. High-Fidelity Prototyping Debate. Interactions 3(1), 76–85. ACM Press, New York (1996)
9. Constantine, L.L.: Process Agility and Software Usability: Toward Lightweight Usage-Centered Design. Information Age 8(8) (August 2002)
10. Gundelsweiler, F., Memmel, T., Reiterer, H.: Agile Usability Engineering. In: Keil-Slawik, R., Selke, H., Szwillus, G. (eds.) Mensch & Computer 2004: Allgegenwärtige Interaktion, pp. 33–42. Oldenbourg Verlag, München (2004)
11. Ambler, S.W.: Agile Modeling. John Wiley & Sons, New York (2002)
12. Mayhew, D.J.: The Usability Engineering Lifecycle – A Practitioner's Handbook for User Interface Design. Morgan Kaufmann, San Francisco (1999)
13. Norman, D.: Human-Centered Design Considered Harmful. Interactions 12(4), 14–19 (2005)
14. Löwgren, J., Stolterman, E.: Thoughtful Interaction Design: A Design Perspective on Information Technology. MIT Press, Cambridge, MA (2004)
15. Hassenzahl, M., Platz, A., Burmester, M., Lehner, K.: Hedonic and Ergonomic Quality Aspects Determine a Software's Appeal. In: Proceedings of CHI 2000, Conference on Human Factors in Computing, The Hague, NL, pp. 201–208 (2000)
16. Limbach, T., Reiterer, H., Klein, P., Müller, F.: VisMeB: A Visual Metadata Browser. In: Rauterberg, M. (ed.), pp. 993–996. IOS Press, Amsterdam (2003)
17. Eckstein, J.: Agile Software Development in the Large: Diving Into the Deep. Dorset House Publishing Co., New York (2004)
18. Lindvall, M., Muthig, D., Dagnino, A.: Agile Software Development in Large Organizations. Computer 37(12), 26–34 (2004)
19. Borchers, J.: A Pattern Approach to Interaction Design. John Wiley & Sons, New York (2001)
20. Gellner, M., Forbrig, P.: Extreme Evaluations – Lightweight Evaluations for Software Developers. In: INTERACT 2003 Workshop on Bridging the Gap Between Software Engineering and Human-Computer Interaction (2003)
Interface Between Two Disciplines – The Development of Theatre as a Research Tool

Maggie Morgan and Alan Newell
School of Computing, University of Dundee, Scotland, DD1 4HN
[email protected]

Abstract. Dundee University's School of Computing is researching technology for older users, whose difficulties with technology often exclude them from its benefits. This paper discusses the problems raised in consulting potential users who feel they do not understand technology and are anxious about using it. How should the technologists and designers get over to this clientele the somewhat abstract concepts of 'what might be developed' and how it might affect the users' quality of life? How could they keep the focus of discussion while giving the older people the confidence to be truthful? Experiments made with video and live theatre in consulting with older users, requirements gathering and evaluation of designs are described. This paper addresses: the process of scientific data being transformed into appropriate and useful 'stories' to the satisfaction both of writer and researchers; the role of actors and facilitator; the impact on the 'extreme users' in the audience; and the data thus gained by the researchers.
1 Background: Research into the Interface with Older Users

Dundee University's School of Computing has several research projects concerned with the interface between older people and technology, and has developed methods for consulting with these potential users at all stages of design, including the very early conceptual stage [8]. The researchers work regularly with older volunteers, both individually and in small groups as appropriate. The School has a cohort of over 200 interested volunteers, which shows that older people's interest in the usefulness of technology can be generated when they are given an opportunity for their voices to be heard. Early research showed that older users tend to miss out on the progress of technology, at a time when they are living longer, demand a better quality of life and will ultimately need a good quality of care. Older people have not been brought up with fast-moving technology: they feel excluded; do not understand the language of, for example, computers; are afraid they might 'break' the machine; want to avoid feeling stupid; tend to blame themselves, not bad design; and do not understand the potential benefits and relevance of new technologies. In addition, the increasing minor disabilities that come with age (worse sight, hearing and dexterity; short-term memory problems; and lack of general mobility) exacerbate this trend.
The changing demography, with a rapidly increasing older population in many parts of the world, means there will be a shortfall of carers, both formal and informal, to look after people as they become increasingly frail. Technology should have an important role in improving an old frail person's quality of life, giving him/her more control over his/her environment, and in giving support to the carers. In order for such technology to be successful, however, older people should be consulted as part of any design process [3].

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 184–193, 2007. © Springer-Verlag Berlin Heidelberg 2007
2 Problems of Consultation

Consulting older people about the design of potential technology raises a number of questions:
− How do you translate rather abstract scientific concepts into a 'reality' that older people can relate to and apply to their own lives?
− How can you make older people really understand a piece of technology that has not yet been developed?
− How can you make it easier for older people to be critical? They often do not want to 'upset' designers, and their responses aim to please.
− How can you create a 'safe' method of lively discussion between older people and designers, without the older people feeling intimidated and ashamed of their 'ignorance' or the designers either being frustrated or unwittingly patronising?
3 The Introduction of Drama

The School of Computing is experimenting with using drama, both video and live theatre, to address these problems [7]. This is based on the following premises. Theatre, whether live or on video, has the ability to 'pretend', so undeveloped technology can be presented as real and working. Scientific concepts and novel technology, with their esoteric language and jargon, can be translated into everyday life. This enables the audience to apply them to their own situation, thus facilitating significant information transfer between researchers and older users. Stories, with 'real' characters with whom the audience can identify, help the audience engage with the problems and questions encountered [4,5,11]. All discussion, debate and criticism are focussed on the story and the characters; no-one is going to be offended. This enables both older people and designers to discuss, argue, inform and share needs and experience in a very safe way. This very safety helps older people and designers to draw on and share their experiences, which can be particularly useful in an area where individual needs and disabilities are subject to very wide variation. The roles of researchers, writers, actors and facilitators within this process are all very important, and will be discussed later in this paper.

3.1 Maggie Morgan

The Scotland-based Foxtrot Theatre Company, which specialises in interactive forum theatre, provided Maggie Morgan, a theatre writer, director and interactive theatre
facilitator, to work with researchers, write scripts and produce video for two research projects within the School of Computing. The success of these resulted in her being awarded a Leverhulme Artist-in-Residence Fellowship for the academic year 2005-6, with the remit to further develop the role of theatre as a research tool within computing.

3.2 The Fall Mentoring Project – Requirements Gathering Using Video

A group of researchers was developing a monitoring system, involving video cameras within an old person's home, which detected falls. The pictures would be transmitted to a computer, which would alert a carer if it detected the person suffering a fall [6]. The initial reaction of people to the idea of having cameras in the home can be completely negative, but is perhaps an uninformed judgement. To address this issue in more depth, Morgan and the researchers devised four different situations which would inform the viewers, open up wider discussion, and provide valuable data for the researchers. Videos of these scenarios were then made using professional actors and video engineers. The four brief video scenes consisted of:
− An older man rushing to answer the door bell, tripping and falling when there was no monitor in his house to detect the fall.
− An older woman who has a monitor in her room reaches up to dust, loses her balance and falls. She is shocked and cannot get up. The monitor registers the fall, and soon someone arrives, having been alerted.
− A false alarm: an older woman – with a monitor – drops a jigsaw and gets down on the floor to pick up the pieces. The monitor registers this and alerts her daughter, who rings her immediately. Music is playing, so the mother does not hear the phone for a long time; the daughter rushes out from an important meeting and arrives to find her mother enjoying her jigsaw. She is both relieved and frustrated!
− A daughter, talking to her father, describes the monitor her mother-in-law has, and says that it has somewhat eased the burden of checking up on the old lady. The conversation is interrupted by a phone message from the computer connected to the mother-in-law's monitor. The computer is letting her know that, although the old lady has not fallen, she is not moving around as usual. The daughter-in-law rings to check whether she might be ill. It's OK! It is Wimbledon fortnight; the old lady is a tennis enthusiast and is hardly moving from the television! Relief but some irritation – but the father comments to his daughter that she might be very glad of this function some day.
Pauses were built into these video scenes so that the audience could comment on and discuss each scenario. The researcher facilitating the discussions, who had been trained in facilitation by Morgan, was able to answer questions about what happened to the 'pictures' the cameras took and how carers might be alerted. Audiences varied from relatively fit older people living independently in sheltered housing or their own houses, to very frail old people who needed a lot of care in order to stay at home and who came together at Day Centres. One audience consisted of a group of professional carers. Each audience brought its own experiences and perspectives; among the
topics covered were anxieties about privacy; what support systems were already in use and how effective these were; anxieties about falling or becoming ill and this being detected; where falls were most likely; how their individual activities differed; and false alarms. The narrative form of the video clips engaged the audience and kept the focus of the discussion. Using drama was found to be an extremely useful method of provoking discussion at the pre-prototyping stage and provided many insights that we believe would not have been obtained without such techniques. This confirms the comments made by Sato & Salvador [13] that human-centred stories lead to a more detailed discussion and that the drama provides a point of contact, which makes the evaluative task much easier. Although Strom [14] reported that he found it difficult to combine large or dramatic consequences with the exploration of an interface, this was not an issue in this piece of research.

3.3 The UTOPIA Trilogy Video – An Attitude-Changing Exercise Using Video

A similar technique to the above was used to produce narratives for discussion aimed at communicating the essential findings of the UTOPIA project (Usable Technology for Older People: Inclusive and Appropriate) [2] to designers of technology for older people. During the research phase of the project, which included discussions with individuals and groups of older people, important data emerged concerning older people's problems with language; anxiety; assumptions of knowledge that they in fact lacked; confusing software; and the increase of disabilities with ageing. Designers, however – usually young – found it difficult to conceive of people who were totally unfamiliar with basic modern technology. Three videos were produced, which focussed on: installing a web camera, a completely novice user attempting to use email, and a first-time user of a mobile telephone [15].
The video stories were viewed and discussed by several audiences: some consisting of designers and engineers, some of older people, others mixed. Changes in audience attitudes were measured by identical questionnaires about perceptions of older people, filled in before the viewing and at the end of the event. Each performance provoked lively discussion and proved very enjoyable. Significant changes in attitude were noted in all audiences who viewed these videos [1].

3.4 The Rice Digital Television Project – Live Theatre for Requirements Gathering

Rice, a researcher in the Dundee University School of Computing, used focus groups in his initial requirements gathering for the design of a home telecommunication system for older adults, and subsequently used live interactive theatre as a method of holding in-depth discussions with large groups of older people [12]. Although digital television and its possible applications are very topical, many people, particularly older ones, neither understand how digital TV works nor what its potential uses are, especially those which could enhance the quality of life of older people. The potential uses of digital TV examined were: a 'chatting' service providing communication between homes via a camera; a 'scrap book'; and a reminder service. The problems of describing technology which had not yet been developed, and therefore ascertaining
how desirable or useful it might be seen to be, were solved by the ability of theatre to 'pretend'. A 'multi-media' production was scripted, developed and produced, using professional actors, on-stage props, and the projection of DVD onto a back screen. The situations chosen were those frequently found in real-life experience: children and grandchildren living at a distance; having to move from the family home to a smaller place; becoming more forgetful. The creation of characters in life-like situations resulted in a 'reality' with which older audiences could identify and empathise, directly relating the action to their own experiences and expectations. The discussion was enhanced even further when the characters – i.e. the actors, who remained in role – took part in the discussion with the audience. The characters bore the brunt of being unsure of the role of the technology and finding the possible disadvantages – but also discovered how it might help their human situation. The performances, and all the audience interaction, were conducted in a purpose-designed studio theatre within the School of Computing [9] and were recorded using four cameras and a high-quality sound system. This ensured that all the interaction within the audience was faithfully recorded; the recordings were subsequently transcribed. This provided extensive data which was extremely useful both in the decision-making process for, and in the detailed development of, digital television applications.
4 Experiments with Combining Video and Live Theatre

Live theatre has a big impact, but a fully rehearsed performance is not always practically or financially feasible. We therefore also experimented with a mixture of video clips and live theatre: the showing of a video clip was followed by the actors in that clip being present 'in role' to engage in dialogue with the audience. The aim of the viewings was to measure change in attitude towards older people and technology with three audiences: undergraduate students, post-graduate students and professionals at an HCI conference. The undergraduate and post-graduate students reported that, although the video was interesting and informative, being able to question and discuss with the 'live' characters had more impact. The response of the professional audience at the HCI conference [10], who were not specialists in designing for older people, was very mixed, but again the session with the actors stimulated a huge amount of discussion and argument and made the session highly memorable for the audience. With all three very different audiences, the fact that the characters were actually actors liberated everyone to say what they really thought. The 'characters' were highly believable and convincing, but the audience could attack them, knowing that the actors were not going to take the comments personally.
5 Continuing Use of Theatre in Technological Research

Plans are already being implemented by a group of researchers from four Scottish Universities, and involving "telecare", health and social work stakeholders, to use live theatre for requirements gathering, evaluation and inter-communication among
audiences of older people, formal and informal carers, designers and engineers, and health and social work professionals. Two different formats for discussion following the performances will be tested, and the results of this methodological experiment will be reported at HCI 2007.

5.1 How Does It Actually Work?

The essential constituents of Interactive Forum Theatre are the quality of:
• the script,
• the performance,
• the facilitation of interaction with the audience, and
• the use of appropriate interaction techniques.
5.1.1 The Script – The “Story of an Interface”

The script must be the result of thorough collaboration between researchers and writer. The task of the researchers is to convey their aims accurately and clearly to the writer: the questions they want answered and/or the information they wish conveyed. The writer’s task is to understand clearly the aims of the researchers and to translate their research issues into the form of a story. This interaction between researchers and writer may sound simple, but is in fact complex. The researchers may well be anxious about their measured scientific data being rendered inaccurately; they may find the whole process very alien. Researchers with no experience of this method may feel a lack of trust both in the process and in the writer. The writer, on the other hand, may find their technical jargon impenetrable and have to ask many ‘idiot’ questions in order to understand what is really required. The writer has to produce a good story that will work dramatically in performance: how can this be reconciled with scientific data and analyses? The writer too can feel frustrated if the researchers seem not to understand what (s)he is trying to do and are even suspicious of the process. The process, however, gradually builds up a rapport between researchers and writer. The writer goes through several stages of composition: she produces one or more outline ideas, then a first draft of the script, then a second draft, then a ‘working draft’ that the director and actors can begin to rehearse with. At each stage, the writer’s outlines and scripts are referred back to a working group of the researchers for review. The writer needs to be clear about her limitations and continually ask the researchers to amend or suggest. For example, when audience responses to technological help in the home are needed: what pieces of technology would the researchers like to see in the story?
What are the questions they would like asked around this piece of technology? How would the character make it work? An older person might have disabilities to take into account when operating it. How might you persuade an older person that this facility would really benefit them? Alien as this process may seem to traditional theatre, the structure of a dramatised story is actually very appropriate. Tension and conflict are needed to achieve drama: characters resisting or struggling with pieces of technology introduce tension and raise questions, and, as with all HCI technology, the interface is with human beings, with their own psychologies, knowledge and context. Theatre can create the “story of an interface”, where an audience can look at a piece of technology, its possible
M. Morgan and A. Newell
usefulness, design and usability, and at how a human being, with attitudes, emotions, physical difficulties and needs, interacts with it.

5.1.2 The Actors

Only professional actors have been used in the experiments reported. Minimal costume and only essential props were used, and the actors were physically very close to the audience. This form of theatre requires experienced professional actors who can take direction and immediately, or almost immediately, produce three-dimensional, believable characters. The actors used in this interactive work are also experienced in interactive theatre, are able to ‘suspend disbelief’, and have the ability to engage an audience without the normal technical aids of a full theatre production. The actors were very well briefed on the aims of the theatre: the way the pieces of technology were supposed to ‘work’, how these might relate to the lifestyle and needs of the character, and what questions might arise in the audience that they may have to react to. It was extremely useful for one or more researchers to be present for some of the rehearsals. Questions inevitably arise about the technology during rehearsal, and a researcher can supply the information and explanation the actors need. This also assures the researchers that they still have control over the project and that their research is being respected in detail. For example, if a character is being ‘hot-seated’ – questioned ‘in role’ in a dialogue with the audience – (s)he needs to be well versed in the character’s own story and circumstances and also in the issues around the piece of technology. Other dramatic possibilities with this format include the audience being able to redirect a character in the story.
For example, one of the characters in the story may have explained the technology in a way that is either incomprehensible or patronising to the older person; the audience can then be given the opportunity to replay that part of the story to see the effect of a different approach to the challenge of communicating technology to older people.

5.1.3 The Director and Facilitator

The director needs to understand the research aims thoroughly and brief the actors as they rehearse. The director and facilitator have to be as well briefed as the writer. In the case of the work reported here, the writer was also the director and facilitator. If this is not the case, the writer, director and facilitator must work very collaboratively. The facilitator’s role is crucial. (S)he must:
• thoroughly understand the issues which the researchers need investigated,
• explain clearly and simply to the audience how the process will work and how the facilitator will enable them to interact,
• hold a brief, relaxed ‘warm-up’ session – particularly with older people, but in fact with any audience – to begin the process of audience members responding and beginning to focus, and to establish the rapport between facilitator and audience,
• at the ‘pauses’ for interaction, guide the audience through the techniques appropriate at that point,
• ask questions that are as open as possible, and accept contributions from the audience unconditionally; no one should be made to feel belittled by a facilitator’s response.
• frequently repeat or paraphrase what an audience member has just said, both to reinforce the point and to make sure everyone in the audience has heard it,
• where conflicting attitudes and perspectives come from the audience, briefly and respectfully sum up the divergence, which often moves the discussion on; the different perspectives are aired and heard by everyone, with the safety of the differences being projected onto the characters and the situation in the story,
• if the focus of the discussion is being lost, regain it by referring back to the story.

5.1.4 Co-facilitation

In some projects it is appropriate to have a co-facilitator who is a member of the research team. Whenever scientific issues or queries arise, the main facilitator can call on the co-facilitator to supply the information. If the researcher/co-facilitator thinks an important issue or question is being missed in the discussion, (s)he can raise it with the audience. This method of co-facilitation has worked well [12].

5.2 Focus

The performance of the story maintains the focus of the discussion, the characters bear the brunt of any negative comments, the audience increasingly engages and feels comfortable joining in, and a great deal of data emerges from the discussion. The whole process can be recorded unobtrusively (though with permission) for subsequent transcription and analysis.

5.3 Cost

Video and live theatre are both extremely useful for engaging and informing an audience and stimulating lively discussion. They can be used for requirements gathering and evaluation with large groups of people at a time. The impact of live theatre, and the ability of the audience to respond to and often directly interact with the characters, should not be underestimated. If a video is used, the discussions following the viewing need to be as well facilitated as those in live performances, though obviously there is no direct interaction with the performers.
The balance of costs between producing a DVD and giving live performances depends on the number of performances planned. To be economical, live performances need to be scheduled close together, so that the actors are employed for a single period that includes a single rehearsal period. If the presentations are spread out in time, re-briefing and re-rehearsal of the actors will be needed. The cost of producing a good-quality video can be up to five times the cost of producing a series of between two and five live performances within a single run, but if researchers wish to use the performance many times, at intervals and in different places, the initial cost of a video may be more economical. A useful compromise, where performances have to be at intervals, is to make a video and have at least one of the actors present in character for dialogue with the audience. This means that the actor(s) do not need a rehearsal period prior to the performance.
6 Conclusions: The Appropriateness of Theatre for HCI

The work reported has shown that theatre can be very effective at many stages of the development of technology. There is a logic to the use of theatre in HCI research. Human needs and wants should be the starting point, with researchers frequently needing to consult potential users at the earliest stage, and theatre provides a very effective communication method. Once technological ideas begin to be developed, further consultation is needed with potential users. At the pre-prototype stage, theatre is particularly useful in helping researchers create a ‘reality’ in which we imagine the devices being used, while raising questions about the appropriateness of the design for older people’s life situations and about usability for people who are unsure about technology and slower to learn than when they were younger. An interactive performance essentially provides a very flexible ‘virtual’ world in which an audience can play with novel technology and concepts.

Acknowledgements. The work reported has been supported by the Scottish Higher Education Funding Council, the Engineering and Physical Sciences Research Council, and the Leverhulme Trust.
References

1. Carmichael, A., Newell, A.F., Dickinson, A., Morgan, M.: Using theatre and film to represent user requirements. In: Include 2005, Royal College of Art, London (April 5–8, 2005)
2. Dickinson, A., Eisma, R., Syme, A., Gregor, P.: UTOPIA: Usable Technology for Older People: Inclusive and Appropriate. In: Brewster, S., Zajicek, M. (eds.) A New Research Agenda for Older Adults, Proc. BCS HCI, London, pp. 38–39 (2002)
3. Eisma, R., Dickinson, A., Goodman, J., Mival, O., Syme, A., Tiwari, L.: Mutual inspiration in the development of new technology for older people. In: Proc. Include 2003, London, pp. 7:252–7:259 (March 2003)
4. Grudin, J.: Why personas work – the psychological evidence. In: Pruitt, J., Adlin, T. (eds.) The Persona Lifecycle: Keeping People in Mind Throughout Product Design, Elsevier (in press)
5. Head, A.: Personas: Setting the stage for building usable information sites. Online 27(4), 14–21 (2003)
6. Marquis-Faulkes, F., McKenna, S.J., Gregor, P., Newell, A.F.: Gathering the requirements for a fall monitor using drama and video with older people. Technology and Disability 17(4), 227–236 (2005)
7. Newell, A.F., Carmichael, A., Morgan, M., Dickinson, A.: The use of theatre in requirements gathering and usability studies. Interacting with Computers 18, 996–1011 (2006)
8. Newell, A.F., Gregor, P.: User sensitive inclusive design – in search of a new paradigm. In: Scholtz, J., Thomas, J. (eds.) Proc. First ACM Conference on Universal Usability (CUU 2000), USA, pp. 39–44 (2000)
9. Newell, A.F., Gregor, P., Alm, N.: HCI for older and disabled people in the Queen Mother Research Centre at Dundee University, Scotland. In: CHI 2006, Montreal, Quebec, Canada, April 22–27, 2006, pp. 299–303 (2006)
10. Newell, A.F., Morgan, M.: The use of theatre in HCI research. In: “Engage”, 20th Annual BCS HCI Conference, University of London (September 11–15, 2006)
11. Pruitt, J., Grudin, J.: Personas: Practice and theory. In: Proceedings DUX 2003, CD ROM, 15 (2003)
12. Rice, M., Newell, A.F., Morgan, M.: Forum Theatre as a requirement gathering methodology in the design of a home telecommunication system for older adults. Behaviour and Information Technology (in press)
13. Sato, S., Salvador, T.: Playacting and focus troupes: Theatre techniques for creating quick, intensive, immersive and engaging focus group sessions. Interactions, pp. 35–41 (September–October 1999)
14. Strom, G.: Perception of human-centered stories and technical descriptions when analyzing and negotiating requirements. In: Proceedings of the IFIP TC13 Interact 2003 Conference (2003)
15. The Utopia Trilogy can be downloaded from: http://www.computing.dundee.ac.uk/projects/UTOPIA/utopiavideo.asp
Aspects of Integrating User Centered Design into Software Engineering Processes

Karsten Nebe¹ and Dirk Zimmermann²

¹ University of Paderborn, C-LAB, 33098 Paderborn, Germany
[email protected]
² T-Mobile Deutschland GmbH, Landgrabenweg 151, 53227 Bonn, Germany
[email protected]

Abstract. Software Engineering (SE) and Usability Engineering (UE) both provide a wide range of elaborated process models for creating software solutions. Today, many companies have realized the need for usable products and have understood that a systematic and structured approach to usability is as important as the process of software development itself. However, theory and practice still struggle to incorporate UE methods efficiently and smoothly into established development processes. One challenge is to identify integration points between the two disciplines SE and UE that allow a close collaboration with acceptable additional organizational and operational effort. The approach presented in this paper identifies integration points between software engineering and usability engineering on the level of process models. The authors analyzed four different software engineering process models to determine their ability to create usable products. To this end, the authors synthesized demands of usability engineering and performed an assessment of the models.

Keywords: Software Engineering, Usability Engineering, Standards, Models, Processes, Integration, Assessment.
1 Introduction

The question of how to align the two disciplines of software engineering and usability engineering originates from the need of usability practitioners to embed their activities into an existing organization that already applies methods for structured software development. A standalone usability process often does not have sufficient impact, and its results are not readily absorbed by the development organization. It would therefore be desirable to identify potential points for exchanging information, or possibly sharing resources and efforts, to achieve a situation where the benefits of usability engineering become usable for the development organization. This paper aims to identify these integration points on different levels and to create a framework for aligning the two disciplines.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 194–203, 2007. © Springer-Verlag Berlin Heidelberg 2007
1.1 Software Engineering

Software engineering is a discipline that adopts various engineering approaches to address all phases of software production, from the early stages of system specification up to the maintenance phase after the release of the system ([15], [18]). Software engineering tries to provide a systematic and plannable approach to software development. To achieve this, it provides comprehensive, systematic and manageable procedures: so-called software engineering process models (SE Models). SE Models usually define detailed activities, the sequence in which these activities have to be performed, and the resulting deliverables. The goal of SE Models is to define a process in which project achievement does not depend on the individual efforts of particular people or on fortunate circumstances [5]. Hence, SE Models partially map to process properties and process elements and add concrete procedures. Existing SE Models vary with regard to specific properties (such as the type and number of iterations, the level of detail in the description or definition of procedures or activities, etc.), and each model has specific advantages and disadvantages concerning predictability, risk management, coverage of complexity, generation of fast deliverables and outcomes, etc. Examples of such SE Models are the Linear Sequential Model (also called the Classic Life Cycle Model or Waterfall Model) [16], Evolutionary Software Development [12], the Spiral Model by Boehm [1], and the V-Model [9]. Software engineering standards define a framework for SE Models on a higher abstraction level. They define rules and guidelines as well as properties of process elements as recommendations for the development of software. Thereby, standards support consistency, compatibility and exchangeability, and cover the improvement of quality and communication. The ISO/IEC 12207 provides such a general process framework for the development and management of software [7].
It defines processes, activities and tasks and provides descriptions of how to perform these items on an abstract level. Thus, there is a hierarchy of different levels of abstraction for software engineering: standards define the overarching framework, and process models describe systematic and traceable approaches for the implementation. All these levels put the focus on system requirements and system design.

1.2 Usability Engineering

Usability engineering is a discipline concerned with the question of how to design software that is easy to use. Usability engineering is “an approach to the development of software and systems which involves user participation from the outset and guarantees the efficacy of the product through the use of a usability specification and metrics.” [4]. It therefore provides a wide range of methods and systematic approaches to support the development process. These approaches are called Usability Engineering Models (UE Models). Examples are Goal-Directed Design [2], the Usability Engineering Lifecycle [11] and the User-Centered Design Process Model of
IBM [6]. They describe an idealized approach to ensure the development of usable software, but they usually differ in their details, in the applied methods (the “how”) and in the general description of the procedure (the “what”, e.g. phases, dependencies, goals, responsibilities, etc.) [19]. Usability engineering provides standards which are similar in spirit to software engineering standards. They also serve as a framework to ensure consistency, compatibility, exchangeability and quality. However, usability engineering standards put the focus on the users and on the construction of usable solutions during the development of software. Examples of such standards are the DIN EN ISO 13407 [3] and the ISO/PAS 18152 [8]. The DIN EN ISO 13407 introduces a process framework for the human-centered design of interactive systems. Its overarching aim is to support the definition and the management of human-centered design activities. The ISO/PAS 18152 is based on the DIN EN ISO 13407 and describes a reference model to measure the maturity of an organization in performing processes that make usable, healthy and safe systems. Thus, in usability engineering there is a hierarchy of abstraction levels similar to that in software engineering: standards define the overarching framework, and process models describe systematic and traceable approaches for the implementation. However, usability engineering puts the focus on creating usable and user-friendly systems rather than on system requirements and system design.

1.3 Relationship of Standards, Models and Operational Processes

In general, standards and models are seldom applied directly, in either software engineering or usability engineering. Standards merely define a framework to ensure compatibility and consistency and to set quality standards.
Models are adapted and/or tailored to the corresponding organizational conditions, such as existing processes, organizational or project goals and constraints, legal policies, etc. Accordingly, the models are detailed by the selection and definition of activities, tasks, methods, roles, deliverables, etc., as well as the responsibilities and relationships between them. The derived instantiation of the model, fitted to the organizational context, is called a software development process (for SE Models) or a usability lifecycle (for UE Models). Thus, the resulting operational process is an instance of the underlying model and the implementation of activities and information processing within the organization. This applies to both software engineering and usability engineering. There is therefore not just a single hierarchy of standards and models but an additional level of operational processes, for software engineering as well as for usability engineering. Standards define the overarching framework, models describe systematic and traceable approaches, and on the operational level these models are adjusted and put into practice (Figure 1). In order to achieve sufficient alignment between the two disciplines, all three levels have to be regarded, to ensure that the integration points and suggestions for optimized collaboration meet the objectives of both sides and do not lose the intentions behind a standard, model or operational implementation.
Fig. 1. Similar hierarchies in the two disciplines software engineering and usability engineering: standards (ISO/IEC 12207 for SE; DIN EN ISO 13407 and ISO/PAS 18152 for UE), process models and operational processes
2 Motivation

For development organizations, SE Models are an instrument to plan and systematically structure the activities and tasks to be performed during software creation. However, software development organizations aim to fulfill specific goals when they plan a software solution. Such goals could be the rapid development of a new software solution (to become the leader in this area) or the development of a very stable and reliable solution (e.g. because of the organization’s prestige), and of course the creation of revenue. Depending on its goals, an organization will choose the SE Model (or combination of models) that, in its estimate, fits best. As an example, the Linear Sequential Model, with its predefined results at the end of each phase and its sequential flow of work, certainly provides a good basis for plannability. On the other hand, Evolutionary Development might not be a good choice if the main focus is on error-robustness, because the continuous assembling of the solution is known to cause problems in the structure and maintenance of software code. As usability engineering puts the focus on the user and on the usability of products, which is an important aspect of quality, usability becomes important for the development process and thus also an important criterion for organizations choosing a well-suited SE Model. However, usability engineering activities are not just a subset of software engineering or SE activities. Although different models exist for software and usability engineering, there is a lack of systematic and structured integration [17]. They often coexist as two separate processes in an organization and therefore need to be managed separately and, in addition, synchronized by adding usability engineering activities to the software engineering process models.
In order to identify integration points between the two disciplines, the authors believe examinations have to be performed on each level of the hierarchy: on the level of standards, it has to be shown that aspects of software engineering and usability
engineering can coexist and be integrated, even at this abstract level. On the level of process models, it has to be analyzed how usability engineering aspects can be incorporated into SE Models. And on the level of operational activities, a close collaboration should be achieved with reasonable additional organizational and operational effort.

2.1 Common Framework on the Level of Standards

In previous work the authors performed an initial analysis on the first two hierarchy levels, standards and process models [13]. First integration points on the level of standards were found by comparing the software engineering standard ISO/IEC 12207 with the usability engineering standard DIN EN ISO 13407. The standards’ detailed descriptions of processes, activities, tasks, output artifacts, etc. were analyzed and similarities were found. Based on common goals and definitions, the single activities of the standards could be consolidated into five common activities: Requirement Analysis, Software Specification, Software Design and Implementation, Software Validation, and Evaluation. These common activities represent and subdivide the process of development from both a software engineering and a usability engineering point of view. The five common activities can be seen as a basis for integrating the two disciplines on the overarching level of standards: a common framework for software engineering and usability engineering activities. The authors used the framework to set the boundaries for the next level of analysis in the hierarchy: the level of process models.

2.2 Ability of SE Models to Create Usable Products

Based on the common framework, different SE Models were analyzed to see how well they already support the implementation of the usability activities. Thus, an assessment of SE Models was performed with the goal of identifying their ability to create usable software solutions.
In order to create valuable results, the authors defined several tasks to be performed. First, adequate criteria for the assessment of the SE Models needed to be defined, by which unbiased and reliable statements about process models and their ability to create usable software can be made. The assumption was that, based on the results of the assessment, specific recommendations can be derived to enrich the SE Models by adding or adapting usability engineering activities, phases, artifacts, etc. By doing this, the development of usable software on the level of process models can be guaranteed. Furthermore, hypotheses about the process improvements can be made for each recommendation, which can then be evaluated on the operational process level. To this end, case studies will be identified, based on which the recommendations can be translated into concrete measures. These measures can then be evaluated by field-testing to verify their effectiveness in increasing the user-centeredness of software engineering activities. In summary, four types of analyses need to be performed: two on the level of process models and two on the operational process level. The four respective analysis topics differ in their proceedings as well as their expected results:
- Operationalization of the base practices and identification of criteria for the assessment of usability engineering activities and the corresponding deliverables,
- Assessment of SE Models based on the identified criteria, and derivation of adequate recommendations,
- Inspection of case studies with regard to the recommendations, and derivation of specific measures for the implementation of UE activities in SE processes,
- Evaluation of the measures in practice.
For each of the analyses several methods can be used, some of which involve domain experts as interview partners, whereas others are more document-oriented. This paper focuses on the description of the analyses performed for the first topic listed above, i.e. the operationalization of base practices and the derivation of UE criteria for the assessment, and gives first results on the second topic as a forecast based on the results of the first.

2.3 Criteria for the Assessment of SE Models

As the authors identified the need for assessment criteria to define the degree of usability engineering coverage in SE Models, the following section shows how these criteria were gathered and what results were derived and are to be expected from further research activity. To obtain detailed knowledge about usability engineering activities, methods, deliverables and the related quality aspects, the authors analyzed the DIN EN ISO 13407 and the ISO/PAS 18152. In addition to the identified common activities of the framework, the ISO/PAS 18152 defines, within the human-centered design activities, detailed base practices that specify the tasks for creating usable products. These base practices have been used as a foundation to derive requirements that represent the usability engineering perspective on the common activities. The number of fulfilled requirements for each activity of the framework indicates the SE Model’s level of compliance with the base practices, and therewith with the usability view of the activities. For each base practice the authors determined whether the model complied with it or not. In a second iteration of the gap-analysis, expert interviews will lead to more detailed criteria in order to assess the corresponding SE Models more specifically. Additionally, the completeness and correctness of the base practices and human-centered design activities as defined in the ISO/PAS 18152 itself needs to be verified.
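The aggregation step described above (judging each base practice as fulfilled or not, then computing a level of compliance per module) can be sketched in a few lines. This is an illustrative sketch only: the base-practice identifiers and verdicts below are hypothetical, not the authors' actual assessment data.

```python
# Gap-analysis aggregation sketch: each base practice of an ISO/PAS 18152
# module is judged fulfilled (True) or not (False) for a given SE model;
# the module's coverage is the share of fulfilled base practices.
# NOTE: the identifiers and verdicts here are hypothetical examples.

def coverage(verdicts):
    """Percentage of fulfilled base practices, as a whole percent."""
    return round(100 * sum(verdicts.values()) / len(verdicts))

# Hypothetical verdicts for one module of one SE model:
context_of_use = {
    "HS 3.1.1": True,
    "HS 3.1.2": False,
    "HS 3.1.3": False,
    "HS 3.1.4": False,
}

print(coverage(context_of_use))  # → 25
```

Applying this per module and per SE model yields a compliance matrix of the kind reported in the results section.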
The detailed descriptions of the base practices have been used to pre-structure the collection of criteria and the expected results. Since the base practices are structured in terms of activities, methods and deliverables, the authors used this structure for the expected results as well. Additional expected results are criteria about the quality aspects of the overall process. The results will be separated into those based on the specific human-centered design activities and those that are more generic and overarching. This results in a matrix of activities & methods, content & deliverables, roles & quality aspects in relation to the human-centered design and overall activities, as shown in Table 1.
Table 1. Structure and orientation of criteria for the assessment of software engineering models

Criteria dimensions (rows): Activities & Methods; Deliverables & Content; Roles; Quality Aspects.
Assessed activities (columns): Context of Use; User Requirements; Produce Design Solutions; Evaluation of Use; Overarching Aspects.
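The Table 1 matrix can be represented as a simple nested structure, with one cell (a list of criteria) per dimension and activity. This is a minimal sketch; the example criterion added at the end is hypothetical, not one of the authors' actual criteria.

```python
# Sketch of the Table 1 criteria matrix: dimensions x activities,
# each cell holding the criteria collected for that combination.
activities = ["Context of Use", "User Requirements",
              "Produce Design Solutions", "Evaluation of Use",
              "Overarching Aspects"]
dimensions = ["Activities & Methods", "Deliverables & Content",
              "Roles", "Quality Aspects"]

# Empty criteria matrix: one list of criteria per (dimension, activity) cell.
criteria = {d: {a: [] for a in activities} for d in dimensions}

# Hypothetical example of recording a criterion gathered in an expert interview:
criteria["Deliverables & Content"]["Context of Use"].append(
    "The model defines a deliverable documenting user characteristics")

print(len(criteria["Deliverables & Content"]["Context of Use"]))  # → 1
```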
Based on this, several evaluation questions were gathered, focusing on the abstract level of process models. The goal is to define overarching criteria, not to evaluate the concrete accomplishment within one specific model or particular procedure: e.g. questions about overlaps of activities, phases and deliverables, or questions about the relevance of specific activities or roles within a process model. According to these questions and based on the initial structure shown in Table 1, the authors performed the first analysis, the documentation of existing SE Models (Linear Sequential Model, Evolutionary Software Development, the Spiral Model by Boehm and the V-Model), and for the second analysis created an interview guideline that is currently used as the basis for the expert interviews. Initial results of these analyses are described in the following section.
Table 2. Summary results of the gap-analysis, showing the sufficiency of SE Models in covering the requirements of usability engineering (based on the ISO/PAS 18152; HS 3)

                           Context   User          Produce Design   Evaluation   Across
                           of Use    Requirements  Solutions        of Use       Activities
Linear Sequential Model      0 %        0 %            0 %             60 %         13 %
Evolutionary Development    13 %       40 %           40 %             80 %         39 %
Spiral Model                13 %       80 %           40 %            100 %         52 %
V-Modell                    88 %       80 %           40 %            100 %         78 %
Across Models               28 %       50 %           30 %             85 %
3 Results

As a result of the first analysis of the selected SE Models, first general statements can be made: the overall level of compliance of the SE Models with the base practices, and therewith with the usability view of activities, is rather low (Table 2). For none of the
Aspects of Integrating User Centered Design into Software Engineering Processes
201
SE Models are all base practices of the ISO/PAS 18152 fulfilled. However, there is also large variability in the coverage rate between the SE Models. For example, the V-Model shows very good coverage for all modules except for a lower compliance with the HS 3.3 Produce Design Solutions criteria, whereas the Linear Sequential Model fulfills only a few of the HS 3.4 Evaluation of Use criteria and none of the other modules. Evolutionary Development and the Spiral Model share a similar pattern of findings, in that they show little coverage of Context of Use, medium to good coverage of User Requirements, limited coverage of Produce Design Solutions, and good support for Evaluation of Use activities. Comparing the percentage of fulfilled requirements for each SE Model in the summary of results shows that the V-Model complies better than the other models and can basically be regarded as capable of producing usable products. In this comparison, the Linear Sequential Model falls shortest, followed by Evolutionary Development and the Spiral Model. Both the overview and the detailed findings show that the emphasis of all SE Models lies on evaluation (Evaluation of Use), especially in comparison to the remaining activities. The lowest overall coverage is found for Context of Use and Produce Design Solutions. Based on the relatively small compliance values for the Context of Use (28%), User Requirements (50%) and Produce Design Solutions (30%) activities across all SE Models, the authors see this as an indicator that there is only a loose integration between usability engineering and software engineering. In summary, the results confirmed the authors' expectations, showing the low level of integration between both disciplines on the level of the overarching process models. As expected, it becomes apparent that there is a dire need to compile more specific and detailed criteria for the assessment of the SE Models.
As the analysis showed, the base practices currently leave too much leeway for interpretation. In addition, it turned out that the dichotomous assessment scale (in terms of "not fulfilled" or "fulfilled") is not sufficient; a more fine-grained rating is necessary to evaluate the process models adequately. Performing the documentation analysis of the SE Models produced first insights, but it turned out that the documentation is not comprehensive enough to ensure the validity of the resulting statements. In the second analysis, more specific criteria will be determined, according to the previously described structure. These will be compiled in semi-structured interviews with experts from the domain of usability engineering. The criteria focus on the activities defined in the module Human-centered design (ISO/PAS 18152) and their respective base practices, and specifically on: fundamental activities, basic conditions and constraints, relevance of activities, resulting outcomes, type of documentation, and respective roles and responsibilities. Beyond this, a substantial focus is put on quality aspects of the activities, deliverables, roles and the superordinate model. The criteria will be evaluated concerning questions like:
- How to identify good activities?
- How to identify good results or deliverables?
- How to identify appropriate roles?
- What are properties/characteristics of relevance and frequency?
- How can the progress of an activity or deliverable be measured and controlled?
Based on these criteria the authors expect to obtain evidence on which activities, deliverables and roles are necessary, from the experts' point of view, to ensure the development of usable products. Relevant factors of influence could be, for instance: "When will an activity A not be performed, and why?" or "Under which circumstances will an activity A be performed completely, and when just partly?" Additionally, criteria are to be gathered based on which the progress of the process can be measured. The central point, however, will be the collection of criteria that focus on quality aspects of the activities, deliverables and roles, as well as on their relevance. It is expected that the results cannot just be used as more detailed criteria for the assessment, but will also provide evidence on the level of completeness of the ISO/PAS 18152 and surface potential areas of improvement.
4 Summary and Outlook

The approach presented in this paper was used to identify integration points between software engineering and usability engineering on the level of process models. The authors analyzed four different software engineering process models to identify their ability to create usable products. The authors synthesized the demands of usability engineering and performed an assessment of the models. The results provide an overview of the degree of compliance of the models with usability engineering demands. It turned out that there is relatively small compliance with the usability engineering activities across all software engineering models. This is an indicator that only little integration between usability engineering and software engineering exists. There are few overlaps between the disciplines regarding these activities, and it is therefore necessary to provide suitable interfaces to create a foundation for the integration. The authors identified the need to compile more specific and detailed criteria for the assessment, as well as a more differentiated assessment scale (instead of the dichotomous one), to evaluate the process models appropriately. Therefore, the authors introduced a structured approach for the follow-up analysis. The more detailed criteria will be compiled in semi-structured interviews with experts from the domain of usability engineering. Thereby, a substantial focus is put on quality aspects of the activities, deliverables, roles and the superordinate model. Based on these criteria the authors expect to be able to make statements about their necessity and relevance for ensuring the development of usable products from the experts' point of view. It is expected that the results can not only be used as criteria for the assessment of software engineering models, but can also define the demands of usability more precisely and give evidence about the completeness and potential extension areas of the ISO/PAS 18152.
References

1. Boehm, B.: A Spiral Model of Software Development and Enhancement. IEEE Computer 21, 61–72 (1988)
2. Cooper, A., Reimann, R.: About Face 2.0. Wiley, Indianapolis, IN (2003)
3. DIN EN ISO 13407: Human-centered design processes for interactive systems. CEN European Committee for Standardization, Brussels (1999)
4. Faulkner, X.: Usability Engineering, pp. 10–12. Palgrave, New York (2000)
5. Glinz, M.: Eine geführte Tour durch die Landschaft der Software-Prozesse und -Prozessverbesserung. Informatik – Informatique, pp. 7–15 (6/1999)
6. IBM: Ease of Use Model (11/2004). Retrieved from http://www-3.ibm.com/ibm/easy/eou_ext.nsf/publish/1996
7. ISO/IEC 12207: Information technology – Software life cycle processes. Amendment 1, 2002-05-01. ISO copyright office, Switzerland (2002)
8. ISO/PAS 18152: Ergonomics of human-system interaction – Specification for the process assessment of human-system issues. First Edition 2003-10-01. ISO copyright office, Switzerland (2003)
9. KBST: V-Modell 97 (05/2006). Retrieved from http://www.kbst.bund.de
10. Larman, C., Basili, V.R.: Iterative and Incremental Development: A Brief History. Computer 36(6), 47–56 (6/2003)
11. Mayhew, D.J.: The Usability Engineering Lifecycle. Morgan Kaufmann, San Francisco (1999)
12. McCracken, D.D., Jackson, M.A.: Life-Cycle Concept Considered Harmful. ACM Software Engineering Notes, pp. 29–32 (4/1982)
13. Nebe, K., Zimmermann, D.: Suitability of Software Engineering Models for the Production of Usable Software. In: Proceedings of Engineering Interactive Systems 2007, HCSE (IFIP Working Group 13.2, Methodologies for User Centered Systems Design). Lecture Notes in Computer Science (LNCS), Springer, Heidelberg (in prep. 2007)
14. Pagel, B., Six, H.: Software Engineering: Die Phasen der Softwareentwicklung, 1st edn., vol. 1. Addison-Wesley Publishing Company, Bonn (1994)
15. Patel, D., Wang, Y. (eds.): Annals of Software Engineering. Editors' introduction: Comparative software engineering: Review and perspectives, vol. 10, pp. 1–10. Springer, Heidelberg (2000)
16. Royce, W.W.: Managing the Development of Large Software Systems. In: Proceedings of IEEE WESCON, pp. 328–338 (1970)
17. Seffah, A. (ed.): Human-Centered Software Engineering – Integrating Usability in the Development Process, pp. 3–14. Springer, Heidelberg (2005)
18. Sommerville, I.: Software Engineering, 7th edn. Pearson Education Limited, Essex (2004)
19. Woletz, N.: Evaluation eines User-Centred Design-Prozessassessments – Empirische Untersuchung der Qualität und Gebrauchstauglichkeit im praktischen Einsatz. Doctoral Thesis, University of Paderborn, Paderborn, Germany (4/2006)
Activity Theoretical Analysis and Design Model for Web-Based Experimentation∗ Anh Vu Nguyen-Ngoc Department of Computer Science University of Leicester United Kingdom
[email protected]

Abstract. This paper presents an Activity Theoretical analysis and design model for Web-based experimentation, one of the online activities that play a key role in the development and deployment of the flexible learning paradigm. Such a learning context is very complex, as it requires both synchronous and asynchronous solutions to support different types of interaction, which can take place not only among users, but also between the user and the provided experimentation environment, and between the different software components that constitute the environment. The proposed analysis and design model helps clarify many concepts needed for the analysis of a Web-based experimentation environment. It also represents an interpretation of Activity Theory in the context of Web-based experimentation.

Keywords: Analysis and design model, Activity Theory, Web-based experimentation.
1 Introduction

For about a decade, several engineering departments in colleges and universities have faced the logistical challenge of educating more students with the same resources while maintaining the quality of education. There is also an increasing need to expand the diversity of laboratory resources provided to students. Within this challenging context, the flexible learning paradigm [1, 2] can be seen as an appropriate solution. It refers to a hybrid-learning scheme in which traditional courses are combined with online activities. In engineering education, Web-based experimentation is one of the online activities that play a key role in the development and deployment of such a flexible paradigm. In fact, over the last decade, several institutions have already exploited the Web infrastructure and developed the experimentation courses in their engineering curricula using this medium as the main infrastructure. However, Web-based experimentation is a very complex socio-technical setting [2-4]. As a consequence, understanding the main factors that constitute this particular learning context is an essential step in finding solutions to support and sustain interaction,
∗
Most of this work has been carried out while the author was with the Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 204–213, 2007. © Springer-Verlag Berlin Heidelberg 2007
Activity Theoretical Analysis and Design Model for Web-Based Experimentation
205
collaboration and learning processes. Though several Web-based experimentation environments have been developed, such as [5-9], there is so far no analysis and design model that really captures the main characteristics of such a learning context and provides useful guidance for analysts, designers, and developers to design and develop Web-based experimentation environments. This paper proposes such a model. Section 2 discusses the major characteristics of Web-based experimentation. Section 3 presents a typical scenario of interaction and collaboration processes in this learning context. The Activity Theoretical analysis and design model is discussed in Section 4. Finally, Section 5 concludes the paper.
2 Characteristics of Web-Based Experimentation

Although there have been several works on the design, development, and deployment of Web-based experimentation environments, there is still no clear standard for determining the main characteristics of the collaborative hands-on activities in such learning environments. In this section, a list of these essential characteristics is discussed.

2.1 Hands-On Activities Support

First of all, the content delivered in engineering courses that rely on Web-based experimentation includes not only static documents, textual presentations, or video presentations, but also computation, graphics generated on-the-fly, measurements from real devices, and the like. Web-based experimentation can include virtual and/or remote laboratory resources. In fact, real experimentation is still irreplaceable in engineering curricula, since students need to have contact with the apparatus and materials, and labs should include the possibility of unexpected data occurring as a result of material problems, noise, or other uncontrolled real-world variables. Virtual and remote laboratory resources provide a complementary means to carry out real experimentation online and/or at a distance. A typical virtual laboratory resource is an interactive experiment that relies on a simulation engine. A typical remote laboratory resource is a physical experimental system that is equipped with the necessary facilities to enable Web-based measuring, monitoring, and manipulation [2].

2.2 Components Integration

Due to the complexity of hands-on work [2-4], several components may need to be integrated into the same experimentation environment. These components should support the whole experimentation process, from the preparation stage, to the design stage, to the experiment stage, and to the experimental analysis stage. Each component provides a working space or working console where students carry out some dedicated tasks to solve a particular problem of a complete experiment. Since the output from one stage may serve as the input for the next stages, there should be some linkage between these components. A comparative study has been carried out
in various engineering courses at the EPFL to determine the most common service spaces that require supporting components for completing typical experimentation assignments [2, 10]. Each service space can be supported by one or several components developed using different technologies. These spaces are as follows:

• The first space that needs to be supported relates, of course, to the experimentation itself. This can be regarded as the interaction part of the environment. It enables the actual realization of experiments by interacting with virtual or remote laboratory resources.
• The second space that needs to be supported concerns tools to carry out interactive design and analysis activities related to the experiment.
• The third space of a Web-based experimentation environment relates to collaboration support. This is where the professors and the teaching assistants can interact with the students to monitor their progress and to guide their learning activities, and where students interact with each other to get the tasks done.
• Furthermore, a Web-based experimentation environment may also need to integrate supplementary components, which give access to a number of pieces of information, including relevant reminders or links presenting the underlying theory, the experimental protocol, and a description of the environment, including the laboratory resources and the environment features that are used in the experiment.

Obviously, depending on the experimental protocol, a Web-based experimentation environment may not need to integrate all of these components.

2.3 Multi-session Experiment

Typical Web-based experimentation sessions are mediated by teaching assistants and also by the professors responsible for the course. There may be some face-to-face sessions, in which the students work in the laboratory in the presence of the professor and/or teaching assistants, but most of the learning activities take place in flexible sessions.
Actually, multi-session experiments are an important factor in enabling students to perform experimentation in a flexible way. In a Web-based experimentation environment, students should be able to carry out several trial-and-error experiments that help them reinforce their understanding of theoretical lectures and physical phenomena in a framework where errors are neither penalized nor hazardous. Ideally, a Web-based experimentation environment should allow students to reconstruct the whole experiment, or some parts of it, and perform it as many times as they want. Hence, the experimental parameters need to be stored for later reconstruction or reuse of that experiment. To support multi-session experiments carried out by a single student or by groups of students, many issues need to be addressed, such as the continuity of interaction [11] that allows students to interact smoothly and uninterruptedly with the experimentation environment and the laboratory resources, and also with other students. Several asynchronous and synchronous collaboration facilities need to be considered as well.
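The parameter-storage requirement above can be sketched as a small session store. The function and field names here are illustrative assumptions, not part of any actual experimentation environment:

```python
import json

# Multi-session support: experimental parameters are serialized so that an
# experiment can be reconstructed and re-run in a later session.

def save_session(store, student_id, params):
    """Append one experiment configuration to the student's session history."""
    store.setdefault(student_id, []).append(json.dumps(params))

def restore_last(store, student_id):
    """Reconstruct the most recent experiment configuration, if any."""
    history = store.get(student_id, [])
    return json.loads(history[-1]) if history else None

store = {}
save_session(store, "s42", {"gain": 1.5, "setpoint": 10})
save_session(store, "s42", {"gain": 2.0, "setpoint": 12})
print(restore_last(store, "s42"))  # the latest stored parameters
```

Keeping the full history, rather than only the latest state, is what makes trial-and-error reconstruction across sessions possible.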
2.4 Types of Collaboration

The importance of collaboration among students has long been recognized in education, especially in distance and online education. According to social constructivists, learning is a social construct mediated by language via social interactions [12], in which collaboration among learners is a critical concept [13]. In addition, hands-on activities are usually conducted in small groups [2]. Consequently, Web-based experimentation environments should integrate components that help students actively create their own contextual meaning, rather than passively acquire knowledge structures created by others [3]. These components should enable students to interact with their peers, discuss their positions, form arguments, re-evaluate their initial positions, and negotiate meaning. Students become responsible for learning as they collaborate with one another, with their environment, and with their teaching assistants and professors. Both synchronous and asynchronous collaboration should be supported in a Web-based experimentation environment.

2.5 Discretionary Collaboration

The autonomy of individual students working in flexible modalities means that collaboration with other students is, in many cases, not strictly required. In other words, students collaborate with other students only when they believe it is worthwhile to do so. In fact, students participating in a course that uses the provided Web-based experimentation may be enrolled in various other courses. This means that they may have different study schedules and may carry out different tasks at different times. These variations can make it difficult to find common times at which students can collaborate. As a consequence, even when working in groups, students usually work together, either in face-to-face or distance mode, when a due date is approaching, e.g. before the laboratory sessions or before the laboratory test.
Of course, other modes of group working also exist. Our experience in observing the students' work shows that there are some "well-organized" groups, in which the members clearly divide the tasks among themselves. There are also many cases in which only one member of the group does the "whole job". However, depending on the experimental protocol, more precisely on how the laboratory test is carried out, it is sometimes difficult for the teaching assistants and professors to recognize such problems. The Web-based experimentation environment should allow students to switch between a single working mode and a collaborative working mode. This switching should be as smooth and transparent as possible from the student's point of view.
3 Typical Scenario of Interaction and Collaboration Processes

Fig. 1 illustrates the interaction and collaboration process that happens in Web-based experimentation, in which collaborative actors perform a chain of activities to obtain an outcome, i.e. to acquire knowledge from the course (see 1 in the figure). Collaborative actors are, for instance, student groups enrolled in the course who are using the environment to carry out their experimentation. In hands-on sessions, the group size is usually small (2 or 3 students) [2, 3]. These actors share their common
background, divide tasks, coordinate their work, and collaborate with each other based on some social rules to get the work done. To support the coordination and communication between these actors, several collaboration and communication facilities may be needed and integrated into the experimentation environment.
Fig. 1. The interaction and collaboration process of Web-based experimentation
These actors interact with various (software) objects displayed in the GUI of the Web-based environment (2). For example, a student uses the mouse to modify the parameters of an electrical drive, which are displayed in the GUI as scrollbars. These objects are actually representations of software components (3), which may be located on different servers. The interaction between the actors and the objects may change the status and behaviour of the components, and may invoke the interaction and/or internal computation processes of these components (4). In turn, the interaction between the components at the system level facilitates the interaction process at the user level, which may serve the next activities of the students (5). To summarize, this scenario depicts the complexity of a context in which:

• Students can collaboratively carry out their hands-on activities in a flexible way.
• The online learning community is heterogeneous and its members may have different roles. The coordination and collaboration among the members of the community may be defined by different social protocols and rules.
• The Web-based experimentation environment itself may integrate a large variety of software components, which constitute what we call the system level. These components are represented by several objects displayed in the interface of the provided experimentation environment.
• The interaction process conducted by the actors, which happens externally and internally at both the user and the system level, allows the actors to acquire the outcome of the course.
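The coupling between GUI objects and the components they represent (steps 2-4 of the scenario) can be illustrated with a minimal sketch. The class and method names are hypothetical, not taken from the described environment:

```python
# A GUI object (e.g. a scrollbar) is a representation of a software
# component; user interaction at the user level changes the component's
# state at the system level, as in steps 2-4 of the scenario.

class DriveComponent:
    """Server-side component holding the state of an electrical drive."""
    def __init__(self):
        self.params = {}

    def update(self, name, value):
        # Changing a parameter may also trigger internal recalculation.
        self.params[name] = value

class Scrollbar:
    """GUI-level object representing one parameter of a component."""
    def __init__(self, component, name):
        self.component, self.name = component, name

    def drag_to(self, value):
        # User-level interaction is forwarded to the system level.
        self.component.update(self.name, value)

drive = DriveComponent()
Scrollbar(drive, "gain").drag_to(1.8)
print(drive.params)  # {'gain': 1.8}
```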
4 Activity Theoretical Analysis and Design

Obviously, the complexity of Web-based experimentation is caused by several social and technical factors. As a consequence, when studying collaborative hands-on work in Web-based experimentation, the interaction and collaboration process should be analyzed as a whole, not as any of its constituent entities in separation, since there are close, active, reciprocal, and bidirectional interdependences among these entities. The importance of Activity Theory as a framework for conceptualizing human activities has long been studied by the CSCW and CSCL communities [14, 15]. In an influential paper published in 1999, Jonassen and Rohrer-Murphy argued that Activity Theory provides a powerful instrument to analyze the needs, tasks, and outcomes for designing constructivist learning environments [16]. They proposed a framework that helps analyze and design a constructivist learning environment. However, one of the most difficult problems for analysts and designers is how to apply these abstract concepts to a real-world problem, e.g. to design a real Web-based experimentation environment that supports online collaborative hands-on activities. In this section, the Jonassen and Rohrer-Murphy framework is adapted to introduce a mapping and interpretation of the abstract concepts of Activity Theory into the real context of Web-based experimentation. The constructed framework helps understand and clarify the context of Web-based experimentation from an Activity Theoretical perspective.

4.1 Activity Theory Concepts

1. Subject: There could be several types of subjects in the context of Web-based experimentation. The most important ones are:
   a. Professor: in charge of the course. His/her role is to design and construct the pedagogical scenario of the course, to guide students in their learning process during the whole course, and to evaluate the students' progress and their acquired knowledge.
   b. Teaching assistant: may play a very important role in distributing knowledge in the class. The teaching assistant helps students during hands-on sessions; his/her role can also include supporting course management and administration.
   c. Student: the main subject using the environment, who enrols in the course to carry out experimentation using the environment provided.
   d. Technician: responsible for the configuration of the physical equipment in the laboratory.
   e. Evaluator, research assistant: responsible for assessing the effectiveness and efficiency of the environment, and/or proposing further improvements, developments, and the like.
2. Object: Different objects can be defined; they are transformed during the course to obtain different outcomes:
   a. Long-term object: can be composed of both physical and mental products. The physical product could be the deliverables obtained after finishing the course, e.g. a course report, or a set of adequate parameters to obtain a stable state of the system. The mental product refers rather to the knowledge, the concepts, or the perceptions of students in a particular engineering domain.
   b. Short-term object: objects for each experimental session or module. Deliverables representing short-term objects could be a report, a mathematical problem to be resolved, a hands-on module to be realized, and the like. Short-term objects can also be the knowledge obtained after finishing these modules.
4.
5.
6.
Community: All professors, assistants, students, technicians using the environment for the course form an online learning community, in which the student is the central character and the professors, teaching assistants are usually the central source of knowledge distribution. Rule: Several rules can be defined for a course depending on the course requirements, the laboratory policies, and on the pedagogical scenarios. The task organization among the members of the same groups normally relies on a social protocol or a compromise established within the group or between groups in the community. In hands-on sessions, experimental protocol is what the professors define to guild the students’ hands-on steps. Tool, artefact: Tools that need to be integrated should support and reflect the major characteristics of Web-based experimentation as presented in the contextual model. Various tools may be required. The analysts and designers should also consider the question of developing the tools themselves or integrate those having been developed by other institutions. Division of labour: This also means the division of tasks between the members of the learning community. The division of labour is actually dependent upon the learning community and the rules defined for that community.
4.2 Activity Structure

This part involves the definition of the activities that engage the subject. Each activity can be decomposed into its component actions and operations. However, the definition of the activity structure and its granularity is based solely on the pedagogical scenarios as well as on the objectives of the environment evaluators. In a practical course, an activity is usually equated with the task students need to complete [11]. For each activity (or task), actions are the full, meaningful experimental steps that need to be realized. Operations are what students do unconsciously by interacting with the environment to complete each step. In an automatic control laboratory course, for example, a task could be "Modelling and control of an electrical drive". For each task, several actions need to be realized. These actions have an immediate, pre-defined goal, such as "preparing the pre-lab", "manipulating the physical drive", or "analyzing the experimental result". Actions consist of a chain of operations, such as "moving the parameter scrollbar to increase or decrease the value of a parameter of the electrical drive under study".
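The activity-action-operation decomposition of this section can be represented as a simple tree. The structure below paraphrases the automatic-control example from the text; the operation lists are illustrative additions:

```python
# Activity -> actions -> operations decomposition (Section 4.2).
# The task and action goals follow the automatic-control example;
# the concrete operations are illustrative placeholders.

activity = {
    "task": "Modelling and control of an electrical drive",
    "actions": [
        {"goal": "preparing the pre-lab",
         "operations": ["read the experimental protocol"]},
        {"goal": "manipulating the physical drive",
         "operations": ["move the parameter scrollbar"]},
        {"goal": "analyzing the experimental result",
         "operations": ["export the measurements"]},
    ],
}

def count_operations(activity):
    """Total number of operations across all actions of the activity."""
    return sum(len(action["operations"]) for action in activity["actions"])

print(count_operations(activity))  # 3
```

The chosen granularity of such a tree is a design decision, which, as the text notes, depends on the pedagogical scenario and the evaluators' objectives.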
4.3 System Dynamism

This part investigates the interrelationships between the components that are integrated into the environment. These interrelationships depend upon the pedagogical scenarios defined by the professors. The dynamics of the relationships between members of the community, who use the environment for their learning activities, depend on the social protocol, the division of labour established, and the rules set for the course. Usually, in hands-on sessions, the experimental protocol is pre-defined by the professors and always available for students to follow; hence, for students, the task complexity mostly depends on how they carry out the tasks following the steps defined in the experimental protocol. In addition, the "objective of work" is also pre-defined; thus collaborative activities do not necessarily need to reach the co-construction level of activity [17]. Fig. 2 summarizes the Activity Theoretical analysis and design model, in which all major elements of Activity Theory are mapped into the context of Web-based experimentation. In other words, the proposed model illustrates our Activity Theoretical vision of the analysis and design of Web-based experimentation environments. It can also be used as independent guidance for analysts and designers to analyze and design Web-based experimentation environments. In fact, this model has facilitated the design and development of the eJournal, an electronic laboratory journal integrated into the eMersion experimentation environment. In turn, the iterative design and development of the eMersion environment and the eJournal have validated the reliability and usefulness of the proposed model. The eMersion environment has been used in several automatic control courses offered by the EPFL for several academic semesters.
It has also been deployed and tested in other European institutions, such as the University of Hanover in Germany, the UNED University in Spain, and the Ecole Nationale de Mines St. Etienne in France. More information about the design and evaluation of eMersion and the eJournal can be found in [2, 3, 10, 18, 19].
Fig. 2. Activity Theoretical analysis and design model
A.V. Nguyen-Ngoc
5 Conclusion
This paper has presented what we call the Activity Theoretical analysis and design model. It discusses the characteristics of Web-based experimentation and introduces a typical scenario of interaction and collaboration processes in such a learning context. The model sheds light on many concepts needed for the design of Web-based experimentation environments. It also represents a mapping from Activity Theory to the context of Web-based experimentation. The goal of the proposed model is to capture the important aspects of collaborative hands-on activities in a Web-based experimentation environment. The model can be used by a variety of users. Researchers and professors can rely on it to study students’ behaviours and activities in this particular learning context. Environment developers can use the model to facilitate their development tasks, as it already focuses on the most relevant issues of the domain, and to structure the environment in a coherent way.
Acknowledgments. This work would not have been finished without the invaluable support of the eMersion team, EPFL.
References
1. Holmberg, B.: Theory and Practice of Distance Education. Routledge, London (1995)
2. Gillet, D., et al.: The Cockpit: An effective metaphor for Web-based experimentation in engineering education. Int. Journal of Engineering Education, 389–397 (2003)
3. Gillet, D., Nguyen-Ngoc, A.V., Rekik, Y.: Collaborative Web-based experimentation in flexible engineering education. IEEE Trans. on Education, 696–704 (2005)
4. Feisel, L.D., Rosa, A.J.: The role of the laboratory in undergraduate engineering education. ASEE Journal of Engineering Education (2005)
5. Böhne, A., Faltin, N., Wagner, B.: Synchronous tele-tutorial support in a remote laboratory for process control. In: Aung, W., et al. (eds.) Innovations 2004: World Innovations in Engineering Education and Research, pp. 317–329. iNEER in cooperation with Begell House Publishers, New York (2004)
6. Schmid, C.: Using the World Wide Web for control engineering education. Journal of Electrical Engineering, 205–214 (1998)
7. Tzafestas, C.S., et al.: Development and evaluation of a virtual and remote laboratory in robotics. In: Innovations 2005: World Innovations in Engineering Education and Research, pp. 255–270. iNEER in cooperation with Begell House Publishers, New York (2005)
8. Ko, C.C., et al.: A Web-based virtual laboratory on a frequency modulation experiment. IEEE Trans. on Systems, Man, and Cybernetics, pp. 295–303 (2001)
9. Sepe, R.B., Short, N.: Web-based virtual engineering laboratory (VE-LAB) for collaborative experimentation on a hybrid electric vehicle starter/alternator. IEEE Trans. on Industrial Applications (2001)
10. Nguyen-Ngoc, A.V., Rekik, Y., Gillet, D.: Iterative design and evaluation of a Web-based experimentation environment. In: Lambropoulos, N., Zaphiris, P. (eds.) User-Centered Design of Online Learning Communities, pp. 286–313. Idea Group Inc., Pennsylvania (2006)
11. Nguyen-Ngoc, A.V., Rekik, Y., Gillet, D.: A framework for sustaining the continuity of interaction in a Web-based learning environment for engineering education. In: ED-MEDIA Conference, Montreal, Canada (2005)
12. Vygotsky, L.S.: Mind in Society: The Development of Higher Psychological Processes. Harvard University Press, London (1978)
13. Jonassen, D.H., et al.: Constructivism and computer-mediated communication in distance education. The American Journal of Distance Education, pp. 7–26 (1995)
14. Kuutti, K.: Activity Theory as a potential framework for Human-Computer Interaction research. In: Nardi, B.A. (ed.) Context and Consciousness: Activity Theory and Human-Computer Interaction. The MIT Press, MA (1995)
15. Nardi, B.A.: Context and Consciousness: Activity Theory and Human-Computer Interaction. MIT Press, MA (1996)
16. Jonassen, D.H., Rohrer-Murphy, L.: Activity Theory as a framework for designing constructivist learning environments. Educational Research and Development, pp. 61–79 (1999)
17. Bardram, J.E.: Collaboration, Coordination, and Computer Support: An Activity Theoretical Approach to the Design of CSCW. University of Aarhus (1998)
18. Nguyen-Ngoc, A.V., Gillet, D., Sire, S.: Evaluation of a Web-based learning environment for hands-on experimentation. In: Aung, W., et al. (eds.) Innovations 2004: World Innovations in Engineering Education and Research, pp. 303–315. iNEER in cooperation with Begell House Publishing, New York (2004)
19. Nguyen-Ngoc, A.V., Gillet, D., Sire, S.: Sustaining collaboration within a learning community in flexible engineering education. In: ED-MEDIA Conference, Lugano, Switzerland (2004)
Collaborative Design for Strategic UXD Impact and Global Product Value
James Nieters1 and David Williams2
1 255 W Tasman Ave, San Jose, CA 95134
2 934 Nanjing West Road, Suite 505, Shanghai, 20041 China
[email protected],
[email protected] Abstract. Experts in the field of HCI have spoken at length about how to increase the strategic influence of User Experience Design (UXD) teams in industry [2] [5]. Others have talked about how to build a usability or user experience team in industry [3], and others have offered courses in managing HCI organizations [1] [7]. At the same time, other experts have spoken about the importance of making products usable and desirable for international audiences [9] and the value of “offshoring” their usability efforts [8]. Few though have discussed the value and process for an embedded UXD Group functioning as an internal consultancy to different product teams within their organizations. This paper presents both how the consultancy model can increase the strategic effectiveness of UXD inside a company, and how, by leveraging partners internationally, such groups can broaden the usefulness, usability, and desirability of their products to a more global audience. Keywords: User Experience Design, Organizational development, User Experience Teams, Management, Internationalization.
1 Introduction
The Cisco UXD Group grew from a single person to almost 60 people. As this group increased its scope, it evolved through several traditional funding and organizational models (central funding, client funding, distributed teams). In the more traditional organizational models, UXD resources are typically spread across multiple projects in order to support the design of all (or most) products across the company [1] [5] [7]. Most recently, this team has settled on the consultancy model. In this model, cross-functional teams of UXD experts combine to deliver an exceptional experience on just a few projects. The traditional models, their benefits, and challenges include:
• Centralized funding model, with a large budget from one centralized organization. In this model, one senior leader can manage the entire organization, which permits consistency of practice and process, as well as flexibility of UXD resources across the organization. The major challenge is that a central organization becomes a target for reduction or elimination, because it is such a large cost center. Also, in a highly decentralized organization (such as Cisco),
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 214–222, 2007. © Springer-Verlag Berlin Heidelberg 2007
leaders in each division may request that UXD resources working on their project report directly to them.
• Client-funded model, where individual business units fund a central team that provides UXD resources to their teams, and one central UXD organization manages these people. The benefits of this model are similar to those of the central model. In addition, the central organization does not become a cost center, because other divisions pay for UXD resources. However, managers in each division may feel that UXD practitioners who are not part of their organization are not core or central to their business—and they can decline to pay for the individuals at any point. This challenge becomes more likely when managers need to reduce headcount and do not want to eliminate the individuals whom they “own” (who report to them).
• Distributed model, where there is no central UXD group, but UXD practitioners (and smaller groups) report directly to the divisions for the products on which they work. One benefit of this model is that such people are viewed more as “insiders,” as part of the team. While an increasing number of companies are using this model, it poses many challenges for UXD groups and their influence. There is often no explicit sharing of resources or processes across UXD groups, and destructive competition can arise. Unless each UXD group is large enough, practitioners can end up reporting to a manager who does not understand the value of the UXD function. In addition, without a central UXD group, there is no team responsible for UXD process, standards, or infrastructure.
At Cisco, these more traditional organizational structures met with some success. One group within the centralized model was able to show an ROI of more than 10x, or $50 Million USD annually. However, $50 Million in a company that grew from $4 Billion to >$30 Billion from 1999 through 2006 was barely noticed.
Senior leaders at Cisco and other companies, in both mature and emerging markets, are held responsible for steep revenue growth. As such, they are in search of the next “advanced technology” (AT). ATs are disruptive innovations [6] that differentiate one company from its competition, resulting in large revenue increases. To become strategically relevant, the Cisco UXD team needed to deliver disruptive innovation that changed the way people thought about and interacted in a domain. Executives want to invest in groups that can drive radical differentiation. They may also invest in groups that incrementally increase revenue or decrease costs (such as prior Cisco usability teams), but they are likely to invest the most in groups who prove that they can stimulate disruptive innovation [6]. Attempting to improve the usability, usefulness, and desirability of too many products at one time diminished the Cisco UXD group’s ability to gain the sustained support of senior executives. The Cisco UXD group needed a different model, one that could increase revenue geometrically instead of incrementally. To influence a complex-systems company [6] such as Cisco, the UXD Group needed an ROI of 100x to 1000x.
2 Enter the ‘External Consultancy Model’
Within the areas of product and interaction research, design, and testing, independent design studios have flourished in mature markets such as the US, Europe, and South Korea (IDEO, Fitch, Razorfish). Now, a new breed of international, cost-effective design studios such as Asentio Design is developing business from bases in emerging markets such as China and India. Asentio Design flourishes thanks to its ability to allocate multi-function design teams to chosen client projects without being constrained by the processes and corporate politics experienced by design teams within companies. By capitalizing on its geographic and linguistic context, it is also able to act as a design bridge between clients in mature markets and ODM/OEM design teams in emerging markets. This model has been referred to as “Collaborative Design” [9]. With such companies in mind, the Cisco UXD Group is able to act like an external design firm. Instead of assigning one UI designer to one or even multiple projects, the Cisco UXD Group now assembles highly focused teams comprising multiple cross-functional experts to support speedy innovation on carefully selected products. These experts include user researchers, interaction designers, visual designers, developers, and industrial designers as necessary to deliver a superior user experience in a very short time. The consultancy model has the additional advantage of placing the UXD group outside of the organization, allowing freedom of decision-making and objectivity when selecting projects to pursue. Following this model, the group can focus intensively on the five or six most strategic products, and work with teams truly interested in its expertise. Since the conversion to the Focus Team model, senior leaders have recognized that the UXD Group’s contribution to revenue increased to more than $2.5 Billion!
Such impact has been difficult to ignore; one result is that Cisco’s new motto is “Lead the Experience.” Cisco executives now recognize that the experience itself is the next “advanced technology.”
2.1 Engagement Model for Successful Focus Teams
The Internal Consultancy Model is not ideal in every environment. For it to succeed, UXD management must:
1. Choose only worthwhile projects where measurable opportunity exists for demonstrable impact, and where management is willing to give credit to the UXD Focus Team.
2. Merge each UXD Focus Team into the product development team with clearly delineated roles.
3. Adhere to best practices by following a clearly defined process, with well-defined entry and exit criteria.
4. Choose Focus Team members carefully.
5. Follow through to demonstrate impact.
2.1.1 Choosing Worthwhile Projects
While it is a shame to forgo UXD on smaller projects, the point is to dedicate resources where they will have the most effect—we must pick our battles wisely. To take this metaphor a bit further, a classic military strategy is to focus overwhelming resources on a single target; then, when success has been achieved, move to the next target. This model can apply to UXD efforts: shouldn’t any UXD manager make sure that critical projects are fully resourced, even if it means neglecting other projects? The alternative is to be spread desperately thin, resulting in average improvements on most projects rather than disruptive innovation [6] on a few projects. Choosing the right projects also includes:
1. Conducting an Opportunity Review before agreeing to commit resources, to ensure that the product team is receptive and executives recognize the problem. The product team must agree that their success requires a UXD Focus Team.
2. Generating a Project Brief, a statement of work that describes:
• Statement of value (summary)
• Challenges (such as competition)
• Solution (typically broken into multiple phases)
• Deliverables to be provided
• Resources (people) required on the UXD team
• Detailed schedule
• Costs
• Assumptions and risks
3. Obtaining Concept and Execution Commitments, in which managers from the different organizations agree to supply people and money.
4. Embedding and integrating the UXD Focus Team with the product development team.
5. Giving the project clear start and stop points, with clear exit criteria, so that it is not open-ended.
6. Choosing focus team members who love to collaborate and excel at working in teams.
When UXD Group leaders decide which projects to accept, they consider the following factors:
• Product team receptivity. The product development team itself has requested support from UXD, rather than having it “pushed” upon them by management. If a product team is ambivalent, the UXD group disengages.
• Potential revenue or cost savings. The UXD group seeks projects on which they anticipate a minimum revenue increase of $25 Million in the first year.
• Advanced technology—a new technology that has not yet been introduced to the market, so the UXD Group can make a larger impact than on legacy products (preferable, but not required).
• Leveraging the Cisco UE Standards (UI guidelines and tools). If a product team does not intend to adopt the UE Standards, the UXD Group will not assign
resources. These standards include component libraries to help engineers quickly create code that is accessible, usable, internationalized, and branded.
• High visibility. If a project is a “pet project” of a cross-functional or highly visible organization within the company, the UXD Group is more willing to accept it.
• Point in the product lifecycle. If design has already begun, it is often too late to impact a product’s overall experience at a fundamental level. There are times when the UXD group agrees to work on a project through multiple iterations, starting late in one cycle to impact a subsequent release.
• Realistic time-to-market demands. The Cisco UXD Group delivers value rapidly. However, if project schedules make delivering a high-quality user experience impossible, the UXD group is less likely to accept the project.
While there are other factors, this list represents the most salient ingredients in deciding whether to work on a project.
2.1.2 Merging the UXD Focus Teams into Product Teams with Clearly Delineated Roles
UXD Focus Teams must integrate completely with the product development team during a project. They cannot function as the “icing on the product team’s cake.” In the centralized and client-funded models, product teams can more easily treat UXD team members like outsiders. In the focus team model, management and product team members have all committed to a stellar user experience. UXD Focus Teams need to be viewed as true partners with product teams, and they must treat each product team like the paying customer it actually is. The roles of the UXD Focus Team must be specifically defined, just as the roles of the product team members are. Cisco’s UXD management created a role grid that explicitly defines UXD roles and skills. The UXD Focus Team functions as the architect who provides the blueprint for the elements of the product that define the user experience, and the developers function as the carpenters who deliver to the specifications.
If the product team does not agree in advance to these roles, the UXD group does not accept the project.
2.1.3 Choosing Focus Team Members Carefully
To win the trust and respect of product teams, members of the UXD Group must demonstrate world-class user experience design skills. Of equal importance, UXD practitioners must have the business, teamwork, technical, communication, and advocacy skills to ensure that product teams will choose to work with the UXD Focus Team. We must understand the larger business context of our work rather than drive single-mindedly toward an ideal design goal. By approaching the design role as though the product team is a customer with a revenue target that we need to help meet, we become more strategically relevant in our organizations. Despite their underlying focus on business goals, corporate executives need to trust that you understand their requirements and can help them succeed. Personal trust and accountability can be more important than ROI. UXD Focus Team members must be able to build this credibility.
2.1.4 Following Through to Demonstrate Impact
As any consultancy knows, it is essential to make all successes visible. Future business requires such demonstrable impact. No one would engage a consultancy without a fine reputation and portfolio, and the same rules apply to internal consultancies. To achieve this visibility, the Cisco UXD Group tracks impact and records case studies on its website, as you would find on the websites of design firms in industry. The stories in this portfolio describe:
• The Problem
• Our Solution
• The Impact
If the UXD Group cannot calculate the financial impact and managers do not provide a quote attesting to the value of the UXD Group’s activities, that project does not appear on the portfolio website. Other managers can refer to these examples of impact and trust that the group can deliver the same value for them.
2.2 Extending UXD with a Partner Ecosystem
Since Cisco’s UXD Group now behaves as an internal consultancy, it has been able to increase its influence by subcontracting to external consultants. To the customers of the UXD Group (Cisco’s product teams), there is little difference. Such collaboration with external design firms such as Asentio Design in China not only increases the internal UXD team’s capacity; it also injects emerging and global perspectives on research, design, technology, partnerships, and the connections between these domains. Such fresh perspectives are critical to stimulating the innovation required in such a company. The UXD Group soon realized it needed an ecosystem of partners who could augment staff, drive entire projects, and introduce ideas that stimulate disruptive innovation. Using external consultants has become a natural extension of the group’s engagement model. The UXD partner ecosystem includes different types of design firms for different types of design projects.
Asentio Design, through its international team, can provide dedicated support in all areas of the design lifecycle, as well as specific market knowledge and partner relationships from its base in China. As product experiences are increasingly designed to support both emerging and mature markets, models such as Asentio Design’s are crucial in allowing Cisco to collaborate with manufacturers in, and develop new products for, emerging markets. A partner ecosystem therefore provides opportunities for innovation between internal and external consultancies, as well as reducing costs and providing design “bridges” between markets. Many Western companies now leverage a global network of partners [4]. Should we as designers not also leverage this business model to deliver rapid, low-cost, and globally relevant products?
2.3 Leveraging Intact Design Firms Is Not Offshoring
It is important to distinguish “offshoring” from leveraging intact global partners. In the consultancy model, companies that work with external design firms are seeking rapid, high-quality, and globally relevant engagements. This differs from “offshoring,” which in this paper we define as a company hiring its own resources in another country in order to decrease costs. One of the key value propositions of hiring an intact design team (an international design firm) is that it has already done the hard work of finding and hiring trusted researchers and designers. These teams have also already gone through the process of team-building. Developing an ecosystem of partners spares leaders of UXD organizations from having to attract, hire, and retain talent, which can be even more difficult across international boundaries.
3 Examples of Impact
Asentio Design and Cisco are currently working on joint projects that we hope will change market dynamics, but because these products have not yet reached market, we look forward to reporting on them in subsequent years. From a Cisco perspective, the company is attempting to enter emerging markets in which it has less experience with cultural expectations, norms, and challenges from a user perspective. As such, it is critical that it partner with design firms in the following areas:
• Design of personal experiences, which encompasses physical products, application user interfaces, out-of-box experiences, and retail environments.
• Consumer research in markets where Cisco does not have a UXD research or design presence. The costs of leveraging a company such as Asentio Design are significantly lower than those of setting up a presence in each such emerging market.
• Globalization. As Cisco focuses more on internationalization and localization when entering new markets, it needs partners in-country to help test its products for these international audiences. Asentio Design has many examples of working with US and European companies and delivering world-class, culturally appropriate designs at a much lower cost than if US- or European-based companies had designed them.
The following case studies show examples of such international collaboration.
3.1 Case-Study 1 (US/China): Commercialization of a Military Product
The client had a long history of developing products for military customers. However, they now wished to take their advanced image-processing technology into the commercial marketplace. While building sourcing relationships in China, the client was introduced to Asentio Design as a possible design partner. In order to develop their first consumer product, the scope of the client’s requirement was broad, covering consumer research, feature planning, retail and packaging, user interface design, and industrial design.
Asentio Design, through its international team and position in Shanghai (allowing rapid travel to the client’s West Coast US headquarters), conducted
consumer research on the US East and West coasts and personal experience strategy planning through two design workshops at the client’s US-based office. Research and strategy work was followed by a user interface and product design phase in which teams in Shanghai and the US worked in close collaboration, with frequent face-to-face meetings.
3.2 Case-Study 2 (Europe/China): Research into Digital Imaging Lifestyles in China and Europe
A European mobile phone OEM wished to research and compare the usage of high-end camera phones in Europe and China. The company approached Asentio Design because of the latter’s partners’ long experience in researching and designing mobile personal experiences across global markets, its location in China, and its lower cost base compared to European design consultancies. Asentio Design, through its multilingual team, was able to conduct diary studies, one-on-one interviews, and online surveys in four languages (Mandarin, Cantonese, English, German) in Shanghai, Hong Kong, London, and Germany. The ongoing results of the research were presented to client teams in Europe and China, allowing wide dissemination and providing the stimulus for subsequent, more focused research.
4 Choosing a UXD Organizational Model
The Focus Team model is not right for every company. Perhaps the most important factor in deciding which UXD structure to adopt for your group is whether management understands what business model is appropriate for your company’s unique environment. The Focus Team, or Internal Consultancy, model is best when:
• The organization does not have enough UXD practitioners to support every project.
• Cost is an issue. Working with a reputable design firm, such as Asentio, that knows how to deliver excellent results provides highly qualified resources at a much lower cost.
• You need to design products for international markets and need a partner who can design a culturally appropriate product.
• Your team’s survival or reputation depends on delivering excellence on every project (you cannot afford to assign one designer to multiple projects, thus diluting their impact).
• Product teams can “opt out” of working with you. If your company does not require every product team to follow UCD practices and work with UXD staff, then working only with motivated teams can optimize your resources.
• You can “opt out” of minor projects and focus on the highest-priority projects in the company. Trying to make small improvements on all (or most) products can dilute a UXD group’s impact.
5 Summary Personal experience design is now a truly global activity. In order for companies such as Cisco to effectively support product teams and innovate in global markets, their
UXD groups must look increasingly to the new breed of international design studios located in these markets. Companies such as Asentio Design can offer local knowledge allied with Western design processes and experience.
References
1. Anderson, R.I.: Managing User Experience Groups. Second Offering, UCSC Extension, Cupertino, CA (2006), http://www.well.com/user/riander/mguxgrps.html
2. Bias, R.G., Mayhew, D.J.: Cost-Justifying Usability. Academic Press, Inc., San Diego, CA, USA (1994)
3. Huh, B.-L.H.: Developing a usability team in a company: Multiple perspectives from industries. In: Conference Proceedings, Asia-Pacific CHI (2006)
4. Engardio, P., Einhorn, B.: Outsourcing Innovation. BusinessWeek (March 21, 2005)
5. Innes, J., Friedland, L.: Re-positioning User Experience as a Strategic Process. CHI 2004 tutorial (2004)
6. Moore, G.A.: Dealing with Darwin: How Great Companies Innovate at Every Phase of Their Evolution. Portfolio, New York, NY, USA (2005)
7. Rohn, J.: Managing a User Experience Team. In: Proceedings of the Nielsen Norman Group Conference, Seattle, WA (2006)
8. Schaffer, E.: Offshore Usability: Helping Meet the Global Demand? Interactions, p. 12 (March-April 2006)
9. Williams, D.M.L.: Co-Design, China and the Commercialisation of the Mobile User Interface. ACM Interactions, Special Gadget Issue, vol. XIII(5) (October 2006)
Participatory Design Using Scenarios in Different Cultures
Makoto Okamoto1, Hidehiro Komatsu1, Ikuko Gyobu2, and Kei Ito1
1 Media Architecture, Future University-Hakodate, Kamedanakano 116-2, Hakodate, 041-8655, Japan
2 Faculty of Human Life and Environmental Sciences, Ochanomizu University, 2-1-1 Otsuka, Tokyo, 112-8610, Japan
{maq, g2105009, k-ito}@fun.ac.jp,
[email protected] Abstract. In this paper we examine the effects of scenarios from a participatory design and cross-cultural perspective. The Scenario Exchange Project was an international workshop based on scenarios; the participants were university students from Japan and Taiwan. The impetus behind the project was the practical need for designers to correctly understand different cultures when designing products and services. We confirmed that scenarios are effective techniques for bolstering participatory design. Furthermore, we recognized that we must create new methods for describing the lifestyle and cultural background of personas. Keywords: Scenario, Information Design, Cross Culture, Situated Design, Participatory Design.
1 Introduction
The information environment has become the basis of work and everyday life for modern people. It is not enough for designers to only improve a product’s appearance or user interaction. Designers need to address the whole range of user–system relations. That is, the situations in which human beings find themselves (situated design) and their experiences (experience design) should be the object of design. New design techniques are necessary for designers to tackle this challenging objective. We have been researching design techniques which use scenarios. We have been studying methods for detailing the lifestyles of users using scenarios (by means of words and images), proposal techniques based on scenarios, and information environments for sharing scenarios of observations and proposals. Information systems need to be designed from a global perspective, overcoming geographical constraints and differences in cultural backgrounds. Yet efforts at designs which tackle such conditions are still few and far between. Therefore, we implemented a design workshop using scenarios, with students from Japan and Taiwan. This paper describes the effectiveness of scenarios in participatory design.
224
M. Okamoto et al.
2 Design Process Using Scenarios Many researchers have worked on the scenario-based design approach, starting with John M. Carroll [1]. As they describe human activities and goals using unformatted symbols (words or pictures), scenarios are special in that anybody can understand and use them easily. Furthermore, they facilitate smooth, mutual understanding between stakeholders, such as requirement analysts (designers), customers, users and developers [2]. While scenarios are common tools for expression, they are also effective tools for interaction designers or media architects. However, it cannot be said that most designers commonly use scenarios in their work. A number of innovations need to be made in order for designers to use them at work, including identifying the situations in which they are effective, simple and effective methods for describing situations, how to elicit requirements, ways of expressing new scenarios, and methods for evaluating them. We believe that the meaning behind the use of scenarios is the process by which designers and users cooperate in order to understand unknown living conditions and create new products. As advances continue to be made in information technology, we will be confronted with situations for which we have never designed before. The scenario-related tasks which we have worked on include systems that support the lifestyles of the visually impaired [3, 4], systems that assist exchanges between people from different cultural and linguistic backgrounds [5], and mobile communications services [6]. In an increasingly complex and globalizing modern society, we believe that there is a limit to the world view which individuals are capable of understanding, and new techniques for sharing situations, such as scenarios, will become more and more necessary in the future. In this paper we report on cases in which we implemented design activities using scenarios in situations where language and culture differed.
3 Scenario Exchange Project Okamoto, Der-Jang Yu and associates held a workshop for students from different cultures to design systems using scenarios (from May 2005 to May 2006). The Scenario Exchange Project (hereafter, SEP) proposed by Yu was one in which Japanese and Taiwanese students designed new information systems through the medium of scenarios. They used the Scenario Exchange Server to share scenarios: problem scenarios that described situations, and solution scenarios that addressed those situations. Furthermore, in order to verify factors which could not be expressed via scenarios or online communication, we held workshops in the respective countries (Table 1). There were two kinds of scenarios which we used with this technique. The first was a Problem Scenario. It describes how users applied the device and the kinds of problems they may have confronted. It was written up from field surveys and interviews. Based on the requirements extrapolated from an analysis of this scenario, a Solution Scenario, which describes how the proposed service should be used, was introduced. These scenarios are special in that they are specific and easy for anybody to understand, making them useful for communication between stakeholders, such as designers, engineers and those involved in the process [1].
Participatory Design Using Scenarios in Different Cultures
225
Table 1. Summary of Project
1st Workshop. Title: Scenario Design & Episode Exchange. Term: Dec 16-18, 2005. Place: Hakodate, Japan. Participants: FUN 11, NCTU 18.
2nd Workshop. Title: Mobile Taiwan & Ubiquitous City. Term: May 7-9, 2006. Place: Taipei, Taiwan. Participants: FUN 14, NCTU 18, TAU 22, NYUST 4.
Table 2. U-team’s and D-team’s Roles (each group is split into both teams)
U-team (User). Role: user’s perspective (or assuming the role of the user); writing a Problem Scenario.
D-team (Designer). Role: idea development; establishing a hypothesis.
SEP constructed the Scenario Exchange Web to enable stakeholders to share scenarios and exchange opinions with each other. This Web enables the whole process, from Problem Scenario to Solution Scenario, to be recorded and shared. It is possible to post not only text, but also camera images and hand-drawn sketches. Furthermore, users and designers can exchange opinions by means of a function for commenting on scenarios. This environment makes it possible for information to be shared via unformatted symbols. Our aim was to consider how much participants who were interacting in this environment were able to understand their counterpart’s situation. The participants were students studying user interface, graphic design and product design. Every group had five to six members, a combination of Japanese and Taiwanese. Each group was further subdivided into a U-team and a D-team (Table 2). The U-team acted as observers. They had to carefully observe the condition of users (or assume the role of users) and create Problem Scenarios. The D-team had to propose ideas based on U-team’s Problem Scenario, that is, they acted as designers. All of the workshops were on the theme of proposing information-processing devices which facilitate travel. By adopting the theme of travel, the students had to take into account the local characteristics of the travel destination.
4 The SEP Process SEP comprised two phases (Fig.1). Phase 1 was the Remote Research Phase. During this phase, U-team and D-team carried out activities from separate locations (Japan and Taiwan). U-team was the first to travel. The actions of one subject being observed were recorded using a camera or by taking notes. The observer interviewed the subject and wrote a Problem Scenario. This scenario was then uploaded onto the Scenario Exchange Web. The scenario was divided into multiple scenes, and each scene was provided with Positive, Negative and Wish categories. Observers wrote down brief notes on the users’ satisfied, positive attitudes and behavior under
Positive, mistakes and passive attitudes and behavior under Negative, and desires under Wish. Furthermore, personas were set up based on the subject observed, and brief profiles were written down at the beginning of each scenario. The personas described in this paper are virtual user profiles used in the scenario method. Usually, personas are set up based on multiple persons and the most appropriate one is determined from among them, but this step was omitted in the SEP. D-team gained an understanding of what U-team had experienced on their travels from the scenarios and proposed ideas (establishing a hypothesis) via the Scenario Exchange Web. They proactively used online communication such as the Internet, email and chat, asking questions about unclear points in the Problem Scenarios. In Phase 2, D-team actually visited U-team’s country and held a joint workshop. D-team re-experienced the situations which they had previously only been able to understand from the Problem Scenarios. U-team answered questions, in particular on differences in the social and cultural background, and facilitated D-team’s understanding. By re-experiencing, they became aware of things which they had been unable to understand with the scenario, reconsidering ideas based on the new insights and views which they had attained. Solution Scenarios and Product Images were then created collectively.
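A Problem Scenario in the SEP thus carries a brief persona profile plus a list of scenes, each annotated as Positive, Negative or Wish. As a rough illustration only, such a record might be modelled as follows; the field names and classes are our own assumptions, since the paper does not define a schema:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of an SEP-style Problem Scenario record; all names
# here are illustrative assumptions, not the project's actual data model.

@dataclass
class Scene:
    description: str   # what the observed user did in this scene
    category: str      # "Positive", "Negative" or "Wish"
    note: str = ""     # observer's brief note on attitude, behavior or desire

@dataclass
class ProblemScenario:
    persona: dict                          # brief profile written at the top
    scenes: list = field(default_factory=list)

    def wishes(self):
        """Collect the desires observers recorded under the Wish category."""
        return [s.note for s in self.scenes if s.category == "Wish"]

scenario = ProblemScenario(persona={"gender": "male", "age": 22, "hobby": "skiing"})
scenario.scenes.append(Scene("Buys bait at the tackle store", "Positive", "enjoys preparing"))
scenario.scenes.append(Scene("Cannot find the pier", "Negative", "gets lost easily"))
scenario.scenes.append(Scene("Wants to compare catches", "Wish", "compete with other fishermen"))
print(scenario.wishes())  # -> ['compete with other fishermen']
```

The point of such a structure is only that the Positive/Negative/Wish tags turn free-form observation notes into something D-team can query for requirements.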
Fig. 1. Process of Scenario Exchange Project
5 Project Results 5.1 Workshop 1: Scenario Design and Episode Exchange (December 2005, Japan) We will discuss the significance of this technique by using an example from a group that worked on the “Service Proposal for Fishermen” at the first workshop, which was held in Hakodate in December, 2005. Phase 1: Three students from Future University (U-team) went to a fishing port in the vicinity of Hakodate City for a fishing trip. The actions of one subject being observed were recorded using a camera and by taking notes. Additionally, details of the participant’s experiences were gathered at an interview, and were written up as a Problem Scenario before being uploaded onto the Scenario Exchange Web (Fig.2, Left). The Problem Scenario descriptions began with the fishing preparations on the day prior to the trip, right up until the moment when the fish that had been caught were eaten.
Fig. 2. Problem Scenario (Left) and Idea Sketch based on User Requirements (Right)
Fig. 3. Solution Scenario (Left) and Product Images (Right)
Four students from Chiao Tung University (D-team) extracted user requirements from the Problem Scenario and proposed three Idea Sketches (Fig.2, Right). Phase 2: On the first day of the workshop, the students from Future University re-experienced fishing with the students of Chiao Tung University. By actually experiencing the situation which they had previously only known via the scenario, D-team was able to understand the enjoyment of fishing and the persona’s feelings. During Phase 1, the members of D-team assumed that hardly anybody went fishing in the winter, and thought that young people did not fish. These assumptions were at odds with the facts. Re-experiencing enabled the students to become aware of such assumptions and led to an understanding of the persona’s intentions. On the second day, the students discussed in groups whether the ideas were valid or not. U-team and D-team cooperated with one another to create a final design summary. Students then proposed an information system that enhances the enjoyment of fishing by allowing people to compete against other fishermen with regard to the size
of the fish they had caught, using Solution Scenarios and 3D models (Fig.3). The proposed solutions provided extensive support, from the persona making preparations at a fishing tackle store to taking a fish print of the catch as a memento. We feel that this was the result of a design that grasped the broad spectrum of the persona’s experiences. 5.2 Workshop 2: Mobile Taiwan and Ubiquitous City (May 2006, Taiwan)
The second workshop was held in Taiwan in May 2006. In this workshop the roles were reversed, with the Japanese students becoming D-team and the Taiwanese students becoming U-team. We use the proposal for a “device which supports communication between people who do not understand each other’s language during a trip” as an example for discussion. Phase 1: Two students from Chiao Tung University (U-team) went on a trip to Tamsui in northern Taipei. Tamsui is a historical town blessed with water and greenery. In accordance with the SEP process, they created a Problem Scenario and uploaded it onto the Scenario Exchange Web (Fig.4, Left). The Problem Scenario described subjects who were unfamiliar with Tamsui freely traveling in the town. Four students from Future University (D-team) then proposed multiple ideas (Fig.4, Right) based on the Problem Scenario, from the perspective of new tourism experiences in Tamsui.
Fig. 4. Problem Scenario (Left), Ideas based on User Requirements (Right)
Phase 2: On the first day of the workshop the Japanese and Taiwanese students joined together and went on a trip to Tamsui. Tamsui is one of Taipei’s leading sightseeing areas, with many market stalls. The group walked around sampling the local food and taking in the natural scenery, historic buildings and landmarks. As a result of re-experiencing Tamsui, D-team realized that the streets were a maze, complicated and easy to get lost in. The group then became aware that many problems arose when the Japanese and Taiwanese students communicated with each other, such
Fig. 5. Solution Scenario (Left) and Product Image (Right)
as when trying to fill information directly onto the map to decide the next destination, or when the Taiwanese students communicated about food which they recommended through pictures. Furthermore, they focused their attention on the importance of finger pointing in these activities. The group then proposed that an IC chip be attached to the finger and a device be worn over one of the eyes (Fig.5). A canvas could then hypothetically be spread in the air and words and pictures drawn onto it, operated by movements of the fingers. Additionally, it would also be possible to search for information over a network. This innovative design was grounded in the very exchange of experiences between U-team and D-team.
6 Discussion 6.1 Scenario and Hypothesis Exchange In Phase 1, the Problem Scenario which U-team had created was exchanged for a hypothesis which D-team formed in response to it. The Problem Scenario appears to have been useful in communicating the situation to D-team, whose members were from a different cultural background. U-team had to condense the actual experience of the trip into the form of a scenario. Although the actions of the persona and the situations could be written into the scenario, the intentions behind individual actions and their cultural meaning remained obscure. The persona profiles described, for example, that someone was male, 22 years old and that their hobby was skiing, but nothing more detailed. As a result, it became clear from the follow-up interviews that D-team had trouble understanding intentions or culture, even though they were able to learn about the individual situations. Although scenarios were definitely able to relay the situation, it was difficult for them to convey factors, such as context and culture, that were related to those situations.
6.2 Re-experiencing in Workshops Re-experiencing in Phase 2 was useful for developing a more refined solution, as it gave D-team an experience equivalent to that of U-team. The interactive efforts during Phase 1 also deepened rapport at the time of the workshops. Furthermore, the level of cross-cultural understanding appears to have increased with the depth of that rapport. We concluded from the students’ reports that proactive communication for gaining background knowledge of their counterparts, together with re-experiencing, made them realize that they had made assumptions about the facts; in real-world situations, such assumptions may lead to poor interaction between stakeholders and flawed design processes.
7 Conclusion The advantages and limitations discussed so far are summarized in Table 3. In the SEP, students from completely different cultural backgrounds cooperated to design products. As a result of their activities, they created proposals which offered rich experiences and were able to practically apply situated designs. In these efforts, an understanding of the cultural background was important when designing. Cultural background is not limited to the national and ethnic cultures of Japan and Taiwan: culture exists in different structures, such as age, generation, occupation, family or area. Scenarios are extremely helpful for grasping situations, but they are no more than a doorway to understanding. The repetition of questions about problematic or obscure points contained in scenarios leads to a deep understanding of the user (context or cultural background). Although re-experiencing grants a deeper understanding of the counterpart’s situation, the formation of a rapport between users and designers, such as that gained in our workshops, is also of significance. The use of a representational scenario as a mediator has the effect of stimulating active participation, even when the participants’ counterparts are from countries where different languages are spoken. Scenarios are effective media in participatory design efforts. However, scenarios also have the following limitations: • The information takes a lot of processing effort (resizing of photographs, composition of text, etc.) before it can be sent to the server. • Text-intensive descriptions take time to read and write. • There is no easy way of writing up background information (context and culture). In order to solve these problems, we would like to create a design system which stores observational records on a server with minimal effort and allows viewers to understand situations easily.
Table 3. Advantages and Limitations of the Scenario Exchange Project

Phase 1, Scenario Exchange. Advantages: 1. D-team can understand the situation in which U-team is placed. 2. Easy for D-team to discover problems from scenarios. 3. Scenarios give an opportunity to try to understand intentions and culture (questions and interests arise easily). 4. Since scenarios and hypotheses are disclosed on the Web, they are always available for viewing. Limitations: 1. U-team occasionally takes time to write up scenarios; innovations for explicitly relaying situations are required. 2. Risk of subjective and objective perspectives becoming mixed in scenarios. 3. Skills for expressing oneself in a foreign language (English) are required in order to communicate ideas to the counterpart. 4. Possibility that D-team will carry its assumptions onto reality.

Phase 1, Online Communication. Advantages: 1. Can hold discussions in real time by using chat software. 2. Can observe the counterpart’s face and voice by using video chat. 3. Can exchange information which cannot be completely supplemented with scenarios. 4. Leads to rapport building. Limitations: 1. Risk of exchanges taking up a lot of time. 2. U-team is required to be well acquainted with their own country’s culture and have the skills to relay that knowledge adequately.

Phase 2, Re-experience. Advantages: 1. D-team can notice environments or information which was not described in the scenarios (discovery of new problem areas). 2. D-team can notice assumptions about reality. 3. D-team can increase its level of understanding of intentions and culture. 4. Can verify whether ideas are appropriate through re-experience. Limitations: 1. Possibility that information gathering will be insufficient when time is short.
References
1. Carroll, J.M.: Making Use. MIT Press, Cambridge, MA (2000)
2. Go, K.: Requirement Engineering. Kyoritsu Publisher (2002)
3. Kato, S., Okamoto, M.: Tool Supporting Memory of Visually Impaired Person. WIT (2006)
4. Okamoto, M., Akita, J., Ito, K., Ono, T., Takagi, T.: CyARM: Interactive Device for Environment Recognition Using a Non-Visual Modality. In: Miesenberger, K., Klaus, J., Zagler, W., Burger, D. (eds.) ICCHP 2004. LNCS, vol. 3118. Springer, Heidelberg (2004)
5. Komatsu, H., Ogawa, T., Gyobu, I., Okamoto, M.: Scenario Exchange Project: An International Workshop Using Scenario-Based Design. In: Human Interface 2006, Japan, pp. 503–508 (2006)
6. Okamoto, M., Ishii, K.: Method for Information Design Based on User’s Thinking Process. JSSD, pp. 18–19 (2002)
Wizard of Oz for Multimodal Interfaces Design: Deployment Considerations Ronnie Taib and Natalie Ruiz ATP Research Laboratory, National ICT Australia Locked Bag 9013, NSW 1435, Sydney, Australia School of Computer Science and Engineering The University of New South Wales, NSW 2052, Sydney, Australia {ronnie.taib, natalie.ruiz}@nicta.com.au
Abstract. The use of Wizard of Oz (WOz) techniques for the acquisition of multimodal interaction patterns is common, but often relies on highly or fully simulated functionality. This paper suggests that a more operational WOz can benefit multimodal interaction research. The use of a hybrid system containing both fully functional components and WOz-enabled components is an effective approach, especially for highly multimodal systems and, collaterally, for cognitively demanding applications. We describe the requirements and the resulting WOz set-up created for a user study in the design of a traffic incident management application. We also discuss the impact of the ratio of simulated to operational parts of the system dictated by these requirements, in particular those related to multimodal interaction analysis. Keywords: Wizard of Oz, Multimodal user interface, Speech and gesture, User-centred design.
1 Introduction Multimodal interfaces can make interactions with computer systems more flexible and adaptable to user preferences, application content, and varying context [1]. Several input combinations have been studied, including modalities such as pen, speech, stylus, touch screen and finger pointing. In particular, the combination of speech and hand gesture has been explored in the context of deictic gestures over large displays, initially relying on specific hand-worn devices [2] until video processing allowed for acceptable hand tracking results [3]. Gesture and speech interfaces benefit applications with spatial content, e.g. in the domain of intelligent transport systems, which rely heavily on map-based applications (Figure 1). Traffic incident management (TIM) requires operators to retrieve information and make decisions quickly and accurately, especially under very high time pressure and within strict safety requirements. We study the benefits of multimodal user interaction in collaboration with a large TIM centre. A large part of our research focuses on interface design and development with the aim of reducing the cognitive load of operators, especially during peak hours and emergency situations. Cognitive load refers to the amount of mental effort imposed by a J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 232–241, 2007. © Springer-Verlag Berlin Heidelberg 2007
particular task, and has been associated with the limited capacity of working memory [4, 5]. TIM operators are bombarded with live information that needs to be integrated, synthesised and entered into the system. They need to monitor heterogeneous information sources and respond to several complex incidents at a time, activities which induce very high levels of cognitive load. Thus, the research involves correlating two factors: high levels of cognitive load and use of multimodality, i.e. speech and gesture interaction.
Fig. 1. TIM User Interface in the Experiment
We hypothesised that the operators’ patterns of multimodality would significantly change as their cognitive load increased. In more detail, we expected: • An increase in the level of multimodality, i.e. using more than one modality when given the choice, as a strategy for cognitive load management; • An increase in the frequency of complementary information carried across modalities, for example, where the same message is carried partly by speech and partly by gesture, with no semantic overlap; • A decrease in the level of redundant information in interactions as cognitive load increased, i.e. fewer occurrences of input where each modality carries the same message, with semantic overlap. In this paper, we present the design of a Wizard of Oz (WOz)-based user experiment intended to verify these hypotheses. We review the constraints imposed by the study of multimodal interaction, given our research field and requirements, and discuss the trade-off between the simulated and operational parts of the WOz.
2 Background Multimodal interaction, though characterised as more intuitive or natural, is not yet robust enough to fulfil its promises. Product-oriented multimodal systems become
limited in their functionality to alleviate robustness problems, while research-oriented multimodal systems can suffer from over-customisation and application dependency, not allowing broader reuse of components. The Wizard of Oz (WOz) technique was recognised early on as an essential tool for the design of multimodal interfaces, where novel interaction patterns were expected to appear [6]. WOz set-ups allow intuitive interaction while removing limitations such as input recognition error rates and misinterpreted semantic fusion of multimodal signals. The ethics of the method have been criticised, considering that subjects are deceived into believing that they are interacting with a working system; however, [7] noted a positive acceptance by subjects when they were informed during post-hoc debriefing. Another limitation, that of evaluating simulated rather than real working systems, is mitigated by the same authors on the grounds that human users can adapt to the capabilities of a system. While this remark is interesting and correct, we have found that an unconstrained WOz implementation is an efficient UCD tool, since it may still require substantial user adaptation to the system functionality. This highlights a crucial aspect of the development of a WOz set-up: the relationship between the boundaries of the real system and the simulated functionality of the WOz system.
3 Design Methods for MMUI Systems 3.1 Task Design for Eliciting Multimodal Interaction A user study of multimodal user interaction requires well-planned experiment tasks in order to elicit interaction that is as natural as possible, yet provides targeted data. The traffic incident management scenario we designed comprised the tasks of marking entities or locations on a map, then deploying resources in relation to those incidents. Four sets of tasks with varying difficulty corresponded to four distinct cognitive load levels. Each set comprised three modality-based conditions, namely speech only, gesture only, and multimodal (speech and gesture) interaction, the latter being the focus of this paper. Each condition had three repeat tasks in order to obtain statistical power. Hence, subjects had to perform 48 tasks in total. Each set of tasks was completed with the same interface, and the subjects were trained in all conditions during a preliminary session. Task difficulty can be induced in two ways. Firstly, the content and inherent complexity of the problem can be increased: this is known as intrinsic load [4]. Secondly, task difficulty can be induced by increasing the complexity of the representation of the data, known as extraneous load [4]. A good example of this is performing a simple ‘drag and drop’ operation with a mouse-driven UI versus a speech-driven UI. The operation is the same, so the difference in complexity originates from the affordances of the input modality. It is much simpler to increase task difficulty (and cognitive load) by increasing the inherent complexity of the concepts in the task, rather than by providing more complex representations, where the effects are much more subjective and unpredictable. For these reasons, we chose to
manipulate intrinsic load to increase task difficulty. The four distinct levels of cognitive load varied in five ways (Table 1): • Visual complexity: the number of streets in each task increased from ~40 to ~60; • Entities: the number of distinct entities in the task description was increased; • Extras: the number of distractor entities (not needed for the task) increased; • Actions to completion: the minimum number of actions required for task completion increased; • Time limit: the most difficult level was induced by a time limit for completion.

Table 1. Cognitive Load Levels
Level | Entities | Actions | Distractors | Time
1     | 6        | 3       | 2           | ∞
2     | 10       | 8       | 2           | ∞
3     | 12       | 13      | 4           | ∞
4     | 12       | 13      | 4           | 90 sec.
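Read as data, the load levels of Table 1 amount to a small configuration per level. A sketch of such an encoding follows; the field names and the use of None for an unlimited time are our own assumptions, not the authors' implementation:

```python
# Illustrative encoding of the Table 1 cognitive load levels; only level 4
# carries a time limit, which is how time pressure is layered on top of the
# intrinsic-load manipulation.
LOAD_LEVELS = {
    1: {"entities": 6,  "actions": 3,  "distractors": 2, "time_limit_s": None},
    2: {"entities": 10, "actions": 8,  "distractors": 2, "time_limit_s": None},
    3: {"entities": 12, "actions": 13, "distractors": 4, "time_limit_s": None},
    4: {"entities": 12, "actions": 13, "distractors": 4, "time_limit_s": 90},
}

def is_timed(level: int) -> bool:
    """Only the hardest level adds time pressure on top of intrinsic load."""
    return LOAD_LEVELS[level]["time_limit_s"] is not None

print([lvl for lvl in LOAD_LEVELS if is_timed(lvl)])  # -> [4]
```

Note that levels 3 and 4 share the same intrinsic parameters; the step from 3 to 4 is carried entirely by the time limit.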
3.2 Modality Selection Intelligent transportation systems manipulate large amounts of spatial data, and traffic control rooms invariably offer wall displays providing an overview of complex situations. In this context, we introduce speech and gesture as two novel communication media, allowing the operator to interact with the large displays from a distance. We further discriminate modalities over these carrier media by the type of interaction that they allow. The resulting modalities are: • Hand pointing: simple deictic gestures can be used to point to items on the large display, e.g. a specific location on the map; • Hand pausing: pausing for a short lapse during the deictic movement results in the selection or activation of the item being pointed at; • Hand shapes: a few specific hand shapes have each been allocated an arbitrary meaning, e.g. a closed fist to tag a location as an accident; • Speech: natural language can be used for the selection or tagging of items; • Menu bar buttons: some graphical buttons on the display can be selected by hand pointing and pausing, in order to tag items. The hypotheses of this study necessitate that all tasks be achievable in either a complementary or redundant multimodal way; hence all modalities should be made as semantically equivalent as possible. For this experiment, most tasks could be achieved using the three main modalities: speech, hand pointing and hand shapes. This required careful crafting of the user interface so that the modalities provide similar functionality in spite of their various affordances. Table 2 provides some examples of equivalent speech and gesture interaction.
An important aspect to note is that the design allowed users the freedom to choose combined multimodal interaction. They could opt to interact with a single input, in either modality, or with more than one input, in the same or a different modality. This applied to the task as a whole, e.g. performing the whole task using speech or using gesture, but also to each subtask, e.g. performing an item selection by pointing and tagging it using a hand shape or speech.

Table 2. Examples of multimodal inputs
• Zooming in. Speech: “Zoom in to the top left quadrant”. Gesture: point to the corners of the top left quadrant.
• Selecting an element. Speech: “Select the Church on Street X”; or “Church on Street X”; or “St Mary’s Church”. Gesture: point to the element and pause.
• Requesting information on an element. Speech: <select an element> then “Information on selected element please”; or “Information”. Gesture: <select an element> then point to the “Info” button.
• Tagging an element as an accident. Speech: <select an element> then “Mark as accident”; or “Accident”. Gesture: <select an element> then point to the “Accident” button; or make a closed-fist shape.
• Tagging an element as an event. Speech: <select an element> then “Mark as event”; or “Event”. Gesture: <select an element> then point to the “Event” button; or make a scissors shape.
Using automatic speech and video-based gesture recognition would dramatically decrease the usability of the system because of the average recognition rates exhibited by such technologies [8]. Reduced usability, in turn, forces subjects to adapt to the system’s limitations, which works against our primary objective of collecting natural inputs. Hence a WOz approach was selected for this set of experiments, where the wizard manually performed speech and hand shape recognition. An automated hand tracking module developed in-house was found to be sufficiently robust to use during the experiment.
3.3 Data Collection Given our hypotheses and selected modalities, a number of interaction features have to be captured and analysed. Each stream reflects different aspects of the interaction and involves specific requirements. Application-Generated Events. The application monitors progress towards task completion by recording relevant actions such as the selection or tagging of items on the map. The time of occurrence of such actions may also be used to estimate the subject’s performance on the task. Speech Input. Speech is a major input in many multimodal user interfaces, and we decided to use unconstrained, natural language input during this experiment. The wizard is in charge of interpreting speech inputs during the task. However, a complete recording of the speech inputs is very desirable as it contains rich features for the post-analysis of the interaction. Since this experiment involved a single user at a time, we opted for a directional microphone connected to a camcorder in order to capture speech. The major benefit is the inherent synchronisation with the video signal. Gesture Input. An in-house gesture tracking and recognition software module was used to capture hand movements and shapes. It provides untethered gesture interaction with fair reliability in a lab setting, using a dedicated high-quality FireWire camera focused on the subject’s hand. The subjects were also videotaped on a conventional camcorder in order to capture overall gestures (see Figure 2). Biosensor Data. Physiological data was captured in order to evaluate the level of stress and arousal of the subject during the interaction. In particular, galvanic skin response (GSR) and blood volume pulse (BVP) were recorded using an external device with finger sensors.
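Combining these streams in post-analysis requires every record to carry a common timebase. A minimal sketch of such an event log follows; this is our own illustration, as the study does not describe its actual logging format:

```python
import time
import json

class EventLog:
    """Append-only log that stamps each record against one shared clock, so
    application events, wizard decisions and sensor samples can be aligned
    with the audio/video track during post-analysis."""

    def __init__(self):
        self.t0 = time.monotonic()   # session start; one clock for all streams
        self.records = []

    def log(self, stream: str, payload: dict):
        self.records.append({
            "t": round(time.monotonic() - self.t0, 3),  # seconds into session
            "stream": stream,                            # e.g. "app", "wizard", "gsr"
            "payload": payload,
        })

    def dump(self) -> str:
        """Serialise as one JSON object per line for later analysis."""
        return "\n".join(json.dumps(r) for r in self.records)

log = EventLog()
log.log("app", {"action": "select", "item": "church_street_x"})
log.log("wizard", {"recognised": "mark as accident"})
```

A monotonic clock is used rather than wall-clock time so that timestamps cannot run backwards if the system clock is adjusted mid-session.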
Fig. 2. Gesture and speech TIM prototype
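To keep the four streams comparable during post-analysis, each capture can be reduced to a uniform timestamped record. A minimal sketch in Python follows; the field names and sample values are illustrative assumptions, not taken from the actual TIM implementation:

```python
import time
from dataclasses import dataclass, field

@dataclass
class StreamEvent:
    """One timestamped record from any capture stream (names are illustrative)."""
    stream: str   # e.g. "app_event", "speech", "gesture", "gsr", "bvp"
    payload: dict  # stream-specific fields
    t: float = field(default_factory=time.time)  # seconds since epoch, one clock

# Example records, one per stream described above (hypothetical values):
events = [
    StreamEvent("app_event", {"action": "select_item", "item_id": 42}),
    StreamEvent("speech", {"transcript": "put that there"}),
    StreamEvent("gesture", {"hand_shape": "point", "x": 0.31, "y": 0.72}),
    StreamEvent("gsr", {"conductance_uS": 4.7}),
]
```

Because every record carries a timestamp from a single clock, the streams can later be merged and sorted on `t` for annotation.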
R. Taib and N. Ruiz
3.4 Data Type Limitations

Each data stream provides a rich source of information for the analysis of multimodal interaction; however, there are inherent limitations that have to be balanced against the experiment's purpose.

Volume. Audio-visual information is very rich, but high-quality recordings imply large storage capacity requirements and potential playback and trans-coding issues. Recording on tapes (e.g. MiniDV) requires transfer to a computer at a later stage, often with trans-coding. Beyond the consumable cost, this process is extremely time consuming. Hence we opted for connecting the camcorder directly to a computer and recording the stream straight to the hard drive. A flat-format codec was used for the video streams in order to ensure correct synchronisation between audio and video channels. The resulting files are very large, though, so we recorded them directly on external hard drives to provide maximum flexibility during post-analysis, while avoiding file copies and transfers that have the potential to corrupt data. Biosensors also generate large amounts of data because of the high sampling rate at which they must be acquired; but since samples are short text records, the overall file sizes remain easily manageable.

Reliability. Multimodal interaction analysis relies on the combination of distinct modality streams in order to improve recognition of other parameters, such as cognitive load. This mutual disambiguation process [8] is most effective when the individual streams are unreliable because of inaccurate recognisers or user context, e.g. automatic speech recognisers or noisy rooms. The biosensor acquisition chain is fairly complex, hence often inaccurate. The position and stability of the sensor are paramount for reading GSR, for example. In our experiment, subjects used their dominant hand for gesture interaction, while their other hand was connected to the biosensors and rested on the backrest of a chair.
Any unnecessary movement with the 'sensor' hand could cause a disruption in the reading. While it may be difficult to compare results across subjects, within-subject evaluation is reasonably stable with this set-up. Another key reliability issue is manual annotation. Uniformity among annotators is difficult to achieve and requires precise annotation rules and cross-validation between annotators. The precision of manual annotations usually comes at a cost; for example, we annotated the start and end of speech with a precision of around 10 ms, which required specialised software tools and more annotation time. Finally, data precision is important as it can constrain the scope of numerical analysis. Biosensor technologies vary in cost and precision, so a trade-off between these parameters dictates the final choice. In this experiment, we used professional-grade biosensors with a real-time link to the computer for acquisition.

Synchronisation. Accurate synchronisation of all the data streams is crucial to the effective annotation and analysis of multimodal interaction. Logging data on separate computers and devices requires means to ensure synchronisation during recording, for example using the Network Time Protocol (NTP) to synchronise the computers' clocks.
But it also requires means to synchronise streams post hoc, which may be unreliable for video or biosensor data, for example. To alleviate this issue, we directed all data streams, except the audio-visual one, to a single software logging application. The latter provides a uniform time scale as well as a preformatted output that eases annotation. Output is buffered in memory during the tasks, in order to avoid loss of information, and is stored to disk files between tasks. Post-hoc synchronisation of the audio-visual stream is possible thanks to auditory beeps played by the system at the beginning and end of each task. The time of occurrence of the beeps is logged by the unified software logger and can be manually reconciled with the audio channel during annotation.
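The buffering and beep-marker scheme described above can be sketched as follows. This is a hypothetical reimplementation in Python; the original logger's interface is not specified in the paper:

```python
import time

class TaskLogger:
    """Buffers events in memory during a task and flushes to disk between
    tasks, mirroring the unified-logger design described above (sketch)."""
    def __init__(self, path):
        self.path = path
        self.buffer = []

    def log(self, stream, message):
        # Uniform time scale: every event is stamped from one clock.
        self.buffer.append((time.time(), stream, message))

    def beep(self, which):
        # Start/end beeps are logged so the audio track can later be
        # reconciled with this timeline during annotation.
        self.log("sync", f"beep_{which}")

    def flush(self):
        # Writing happens between tasks, not during them.
        with open(self.path, "a") as f:
            for t, stream, msg in self.buffer:
                f.write(f"{t:.3f}\t{stream}\t{msg}\n")
        self.buffer.clear()

log = TaskLogger("session01.log")
log.beep("task_start")
log.log("wizard", "recognised: 'select hospital'")
log.beep("task_end")
log.flush()
```

The `wizard` stream name and log path are illustrative; the key point is the single clock and the buffered, between-task disk writes.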
4 Discussion: Level of Actual Functionality

In the field of multimodal research, system robustness is critical to eliciting intuitive and reliable interactions from the subjects: users will compensate unnaturally for errors in recognition of a certain input, and may stop using it or increase use of other modalities that are perceived to be less error-prone. Hence, WOz systems are usually highly or fully simulated, sometimes based on multi-wizard support (e.g. one wizard for speech recognition and one for output preparation). However, there are no general guidelines available in terms of design factors for WOz systems, and our experiment allowed us to determine some characteristics of the data that greatly impact the design, such as volume, reliability and synchronisation.

Beyond those characteristics, we discovered that the balance between functional components and 'wizardry' is highly dependent on the user study design and the goals of the research. When the goals are largely evaluative, more functional modules are necessary, so that feedback on actual functionality can be assessed and incorporated into final versions of the software. In addition, having a fairly functional system makes product development far more achievable. In our case, the focus was on identifying behavioural patterns in highly multimodal systems: the goals were exploratory and we aimed to capture naturalistic interaction. Though input could only be conveyed through three asynchronous modalities (speech, hand movements and hand shapes), the temporal, syntactic and semantic characteristics of the interaction were highly complex. To illustrate: the least expressive modality, free-hand gesture, could be used to issue 11 distinct commands in a single movement, each of which could then be combined with other commands, in groups of two or three, along various temporal arrangements to alter the semantics of the command.
Further, any command could also be conveyed through the speech modality and, again, combined with others in various temporal arrangements. The choice of modality and the temporal arrangement are delicate characteristics of interaction, subject to both unreliable input recognition and individual subject differences [9]. The state of the art in fully functional speech and gesture recognition would not be sufficiently error-free to produce unbiased interaction, and for this reason the decision was made to use wizard-based simulation in place of the recognition and multimodal fusion engines. Giving the wizard this responsibility meant that very few other tasks could be allocated to him, so as to prevent overload. The limitations of the wizard's attention span, and the lack of resources
to provide a second wizard, drove the rest of the functionality to be automated as much as possible.

The WOz technique relies on the user believing the system is fully functional. This gives rise to two aspects of system design that impact the implementation of the system and hence the proportion of actual vs. simulated functionality. The complex form of input in multimodal interaction requires equally complex forms of output. Though primarily graphical, the task scenario was also required to provide suitable textual output at different stages of input, forcing the lag time for system feedback to be as short as possible. The feedback for each kind of command may require more than one element to appear on the screen, or some text at various stages of the command being issued. The back-end logic of the application, e.g. responses and immediate output, was fully functional and largely operated by the wizard once user input was interpreted, but the wizard did not need to concern themselves with selecting the content or form of output on the fly. The wizard's interface was tailored to suit, providing large buttons to facilitate this process.

Another factor that may drive the decision of how to distribute functional vs. simulated components in a WOz system is the post-analysis required. The more system events are fully automated, the more markers can be placed on the data and the more features can be recorded on the fly, such as time stamps and command sequences and types. The centralisation of system modules on a single machine also allows better synchronisation of input signals, facilitating post-hoc data analysis.

In conclusion, our WOz design allowed us to collect the target data and to confirm our hypotheses. However, there are still many aspects of multimodal user interaction that need addressing, especially in view of the evaluation of the cognitive load experienced by a user.
Reflecting on our design choices brought some important insights for the design of future WOz-based user experiments. In particular, we identified data characteristics that have a deep impact on design choices, and we clarified the necessary trade-off between implemented and simulated functionality.
References

1. Oviatt, S.: Ten Myths of Multimodal Interaction. Communications of the ACM 42(11), 74–81 (1999)
2. Bolt, R.A.: "Put-That-There": Voice and Gesture at the Graphics Interface. In: Proc. 7th Annual Conference on Computer Graphics and Interactive Techniques, Seattle, WA, USA, pp. 262–270. ACM Press, New York (1980)
3. Schapira, E., Sharma, R.: Experimental Evaluation of Vision and Speech based Multimodal Interfaces. In: PUI'01, Workshop on Perceptive User Interfaces, Orlando, FL, pp. 1–9. ACM Press, New York (2001)
4. Paas, F., et al.: Cognitive load measurement as a means to advance cognitive load theory. Educational Psychologist 38, 63–71 (2003)
5. Baddeley, A.D.: Working Memory. Science 255(5044), 556–559 (1992)
6. Salber, D., Coutaz, J.: A Wizard of Oz platform for the study of multimodal systems. In: Ashlund, S., Mullet, K., Henderson, A., Hollnagel, E., White, T. (eds.) INTERACT'93 and CHI'93 Conference Companion on Human Factors in Computing Systems, Amsterdam, The Netherlands, pp. 95–96. ACM Press, New York (1993)
7. Dahlbäck, N., Jönsson, A., Ahrenberg, L.: Wizard of Oz studies: why and how. In: Gray, W.D., Hefley, W.E., Murray, D. (eds.) Proc. 1st International Conference on Intelligent User Interfaces, Orlando, FL, USA, pp. 193–200. ACM Press, New York (1993)
8. Oviatt, S., Cohen, P.: Perceptual user interfaces: multimodal interfaces that process what comes naturally. Communications of the ACM 43(3), 45–53 (2000)
9. Oviatt, S., DeAngeli, A., Kuhn, K.: Integration and Synchronization of Input Modes During Multimodal Human-Computer Interaction. In: Proc. SIGCHI Conference on Human Factors in Computing Systems, Atlanta, GA, USA, pp. 415–422 (1997)
Extreme Programming in Action: A Longitudinal Case Study

Peter Tingling 1 and Akbar Saeed 2

1 Faculty of Business Administration, Simon Fraser University, 8888 University Drive, Burnaby, Canada V5A 1S6
[email protected]
2 Ivey School of Business, University of Western Ontario, 1151 Richmond St. N., London, Canada N6A 3K7
[email protected]

Abstract. Rapid Application Development (RAD) has captured interest as a solution to problems associated with traditional systems development. Describing the adoption of agile methods and Extreme Programming by a software start-up, we find that XP principles were not all adopted equally and were subject to temporal conditions. Small releases, on-site customer, continuous integration and refactoring were most vigorously advanced by management and adopted by developers. Pair programming, on the other hand, was culturally avoided.

Keywords: Extreme Programming, Agile Methods, Rapid Application Development.
1 Introduction

The speed and quality with which systems are delivered continue to concern both practitioners and academics. Traditional methodologies, while praised for their rigor, are often criticized as unresponsive, bloated, bureaucratic, or contributing to late and over-budget systems that, when delivered, solve problems that are no longer relevant. Various solutions have been proposed. Frequently combined under the rubric of Rapid Application Development (RAD), these include extensive user involvement, Joint Application Design, prototyping, integrated CASE tools, and, more recently, agile methods such as eXtreme Programming (XP). Following a qualitative study of agile methods and concepts, we conclude that adoption and the extent of agile principle appropriation are affected temporally and by culture. Coding standards, for example, may initially be excluded in a search for creativity and flexibility. Similarly, in addition to the continuous improvement of refactoring, bursts of intense focus also occur.
2 RAD and Agile Methods

The need for software quality and reliability is a mainstay of application development [1, 2]. While recognizing there is no "silver bullet" [3-5], the Systems Development

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 242–251, 2007. © Springer-Verlag Berlin Heidelberg 2007
Life Cycle is a well-adopted 'systematic, disciplined, quantifiable approach to the development, operation and maintenance of software' [2, 6]. However, with increasing backlogs, some high-profile development failures, and the need to adapt to emerging business conditions, the SDLC has been subject to criticism that it is constraining, heavyweight, and results in projects that are outdated before they are finished [7]. Consequently, many organizations have adopted alternatives that emphasize incremental development with constant customer feedback (Rapid Application Development); structured processes where constituents collectively and intensely review requirements (Joint Application Development); construction of partial systems to demonstrate operation, gain acceptance or establish technical feasibility (Prototyping); and tools that assist in software development and business analysis (Computer Aided Systems Engineering).

Table 1. Agile Principles of Extreme Programming

XP Principle: Rationale and Description
- 40-Hour Work Week: Alert programmers are less likely to make mistakes. XP teams do not work excessive hours.
- Coding Standards: Co-operation requires clear communication. Code conforms to standards.
- Collective Ownership: Decisions about the code are made by those actively working on the modules. All code is owned by all developers.
- Continuous Integration: Frequent integration reduces the probability of problems. Software is built and integrated several times per day.
- Continuous Testing: Test scripts are written before the code and used for validation.
- On-Site Customer: Ongoing customer acceptance ensures features are provided. Rapid decisions on requirements, priorities and questions reduce expensive communication. A dedicated and empowered individual steers the project.
- Pair Programming: Two programmers using a single computer write higher quality code than individual programmers.
- Planning Game: Business feature value is determined by programming cost. The customer decides what is done or deferred.
- Refactoring: The software is continually improved.
- Simple Design: Programs are simple and meet current rather than future evolving requirements.
- Small Releases: Systems are updated frequently and migrated on a short cycle.
- System Metaphor: Communication is simplified and development guided by a common system of names and descriptions.

Source: Adapted from [8]
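The Continuous Testing principle — test scripts written before the code they validate — can be illustrated with a minimal test-first fragment. Python is used purely for illustration, and `tag_item` is a hypothetical operation, not code from the case study:

```python
# Test-first sketch: the check is written before the function it exercises.
def test_tag_item():
    item = {"id": 7, "tags": []}
    tag_item(item, "priority")
    tag_item(item, "priority")  # tagging twice must not duplicate the label
    assert item["tags"] == ["priority"]

# Only now is the (deliberately minimal) implementation written to make the
# test pass -- in the spirit of Simple Design: meet current requirements only.
def tag_item(item, label):
    if label not in item["tags"]:
        item["tags"].append(label)

test_tag_item()
```

The test doubles as an executable specification: any later refactoring of `tag_item` can be validated by re-running it.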
In 2001, a group of programmers created a manifesto that embodied the core principles of a new methodology [9]. An extreme application of RAD, agile methods capitalize on member skill; favor individuals and interactions over process and tools; working software over comprehensive documentation; customer collaboration over negotiation; and change rather than plans and requirements. Dynamic, context specific, aggressive and growth oriented [10, 11], agile methods favor time boxing
P. Tingling and A. Saeed
and iterative development over long or formal development cycles. The most widely adopted agile development methodology, eXtreme Programming, is a generative set consisting of twelve inter-related principles. These are described in Table 1.
3 Methodology and Data Collection

For this study, we used a case-oriented approach, an "empirical inquiry that investigates a contemporary phenomenon within its real-life context, especially when the boundaries between phenomenon and context are not clearly evident" [12]. Site selection was opportunistic, the result of an ongoing relationship with Semper Corporation, a year-old start-up developing an interactive software product. Data was collected between August 2005 and December 2006 and consisted of interviews with employees, observation of the environment and work practices, and retrospective examination of documents and email [13]. These are described in Table 2.

Table 2. Data Collection Activities

- Interviews: The development staff and company principals were regularly interviewed throughout the year-long data gathering.
- Observation: Programming and development staff were observed at least weekly. This was both active (at the development offices) and passive (by remote viewing of video cameras).
- Artifact examination: Employment and programming records, progress and bug reports, and copies of each build and version of the product were reviewed, as was email correspondence.
The main steps in analysis involved identification of concepts and definition followed by the theorizing and write up of ideas and relationships. Content relating to agile methods and extreme programming content were separated and categorized according to discrete principles. Illustrative yet concise examples were then selected. Direct quotations have been italicized and placed within quotation marks.
4 Extreme Programming at Semper Corporation

This section reviews the principles described in Table 1. Although these principles were meant to be generative rather than all-inclusive, typical recommendations recognize their inter-relatedness and suggest that implementation be done in its entirety, with adaptation encouraged only once use and familiarity are established [6]. Findings are summarized in Table 3.

40-Hour Work Week. Company policy was one of flexible work hours. Other than core hours between 10:00 and 15:00, developers were free to set their own schedules. While there were work weeks longer than 40 hours (during customer testing or
resolving production problems), this was the exception rather than the norm. There was no overtime compensation. Another factor affecting the schedule was the young age (average 21) of the developers, who adopted nocturnal habits because of their social schedules. For example, advising when he might be in the office, one of the developers noted "I will be in early tomorrow - around 10:00-10:30". Email conversations (where a message was sent and a response received) between the developers and managers declined during the core hours from 45% to 31% and increased from 6% to 37% between 22:00 and 04:00.

Table 3. Adoption Faithfulness of XP Principles

Principle | Adoption Level* | Temporal Effects | Summary
40-Hour Work Week | Full | N | Developers worked flexible but regular workdays.
Coding Standards | Low to Partial | Y | Standards were initially avoided but later implemented.
Collective Code Ownership | Partial | Y | Code was officially shared but developers exhibited possessiveness.
Continuous Integration | Full | N | Code was rarely broken and was continually linked and compiled.
Continuous Testing | Partial to Full | Y | Testing was continuous but advance scripts were not created. Black box testing was phased.
On-Site Customer | Full | N | The CEO and Analytic Director acted as customers.
Pair Programming | Low | N | Programmers were independent except when difficulties or interdependencies existed.
Planning Game | Full | N | Value engineering balanced features against time and budget.
Refactoring | Full | Y | Modules were constantly improved. Periodic bursts of dramatic improvement occurred.
Simple Design | Full | N | Working software was favored.
Small Releases | Full | N | Frequent (weekly) build cycles.
System Metaphor | Full | N | Communication was simple and informal but unambiguous.
*Adoption is considered Full, Partial or Low.
Coding Standards. Coding standards were initially avoided. For example, rather than conventionally declaring all variables at the beginning of a module, one programmer simply added them wherever they were needed. Requests to impose standards were generally ignored by management until the program became sufficiently complex as to require tighter control and the CEO realized development teams continually rewrote the variables when they refactored or changed modules. Staff attrition later resulted in de facto standards.
Collective Code Ownership. With the exception of a few core modules, module decisions were made by the active developers. As a consequence, modules were continually rewritten according to individual ideas and preferences. Although modules had multiple authors, one team tended to write the analytic modules while another wrote the graphically intense reporting component. While officially shared and located on a common storage medium, developers were reluctant to adapt code written by others and continued to speak of "their code".

Continuous Integration. The programming environment was Visual Basic in a Microsoft .Net framework. Modules were tested in isolation and embedded into the program several times a day, in addition to the formal build schedule with weekly integration. During the sixteen months of observation, more than 35 complete formal versions of the product and 225 integrations were compiled. In addition, internal and external users were given replacement Dynamic Link Libraries (DLLs), which encouraged up-to-date testing. Despite a preference for working code, there were several occasions when changes to the data model required extensive rewrites and the code was broken for up to two weeks as modules were re-written and tested.

Continuous Testing. Test scripts were not written in advance of coding (as recommended by XP) but were frequently developed in parallel. Ongoing functional and compatibility testing used standardized and ad hoc test scripts. Because the design was modular and addressed a specific rather than a generic problem, the majority of the code could be tested in isolation. Integration testing was completed after each weekly build and was conducted by management and external users. Black box testing was conducted using a combination of end-user and test samples. HCI and usability aspects were the most dynamic, with the majority of the changes immediately accepted or rejected by the on-site customer.
The few exceptions to this occurred when the developers were given free rein to creatively design new ideas or when previously adopted choices were abandoned. The CEO often challenged the developers to present complex information simply and intuitively rather than providing them with a design to be implemented. After reviewing the work, he frequently commented that they seemed to anticipate what he wanted or were able to implement what he had been unable to imagine. In addition to a comprehensive series of test scripts that were developed and executed, the program was also provided to industry professionals. Two beta tests involving early customer experience programs were used by the company for acceptance testing and both of these surfaced unanticipated areas for attention. Semper used formal bug and feature tracking software for major or outstanding problems but generally the developers tended to simply immediately fix problems once identified. Often the first indication that management had of a problem was when a fix was provided or noted in the change log. Discussing the need to document bugs, the programmers opined that judgment was used to determine if a bug report should be completed after the fact and that this was only done for difficult or particularly complex solutions. Onsite Customer. Because Semper was an early-stage pre-market company, they did not have customers in the traditional sense. Instead, the product vision was provided by the CEO and the Director of Analytics. Originally trained as a mainframe
programmer, the CEO was empathetic to technical problems but was not familiar with modern systems development and did not get involved in construction details. He would often jokingly describe programming and analytic modules as "it is just a sort and a print, right - what is the big deal - three to four hours programming, tops!" and would often laugh and offer to write some code himself if he thought some simple aspects were taking too long. He would challenge developers by reminding them that they learned little by programming simple tasks. A developer's response to his question about a particularly complex change provides an example: "This is possible but will be hard to do. This is because [text redacted]. Anyway, I'm not going to start talking about the how-to parts. I know your response will be 'if it were easy, why would you want to do it?' ". The Director of Analytics, on the other hand, had current technical skills and would often interact directly with the developers and offer suggestions.

Generally, developers worked interactively with the management team and demonstrated prototypes for immediate feedback. Where planned requirements or changes necessitated extensive coding and development work, Unified Modeling Language use cases, conceptual sketches and data models were used as scaffolding, to be discarded in favor of a prototype. A great deal of the management and developer communication was oral, but because offices were physically separated, email and instant messenger were used extensively. The main design artefacts were the data model and build reports that identified progress and what was planned for or deferred to the next iteration.
Although management discussed paired programming as an option with developers when they were hired (new applicants were interviewed by the programming staff and in addition to technical competency had to “fit in”) it was not pursued. Developers, hired directly from university where assignments and evaluations were competitive and individual; did not embrace collective approaches. While the environment was co-operative, developers would occasionally compete to see who could write the most efficient and effective code. Further exacerbating the difficulties with paired programming were work schedules, staff turnover, and personalities. Two of the development staff for example preferred to listen to iPods and to be isolated. Although programmers would often compete to see who could develop the better module they were reluctant to comment on code written by co-workers except in a joking manner. However, once a programmer left the company or was assigned to a different capacity they immediately became part of the out group and their code would often be referred to as “strange” , “poorly written” or “in need of a re-write”. Although developers would blame problems on former co-workers they would laugh when reminded that they may ultimately be subject to the same criticism. After one developer had been gone for six months another noted it was “too late to blame [redacted] now”. Planning Game. Management realized that development had aspects of both art and science. Nevertheless the planning game was used extensively and trade-offs between time and features were routine. Estimates were almost exclusively provided by the developers and once established were treated as firm deadlines against which they
were evaluated. Development was categorized into Structural Requirements, Differentiating Features, Human Computer Interaction, and Cosmetic changes.

Structural Requirements. Features and capabilities outlined in the business plan, considered core and treated as priority and foundational items.

Differentiating Features. Provided differentiating or competitive capabilities and were further grouped into "must have", "nice to have" and "defer". The majority of the "must haves" differentiated the product. Additions to this list resulted from competitive reviews or extensions to existing capabilities suggested by users. Typically a few "must haves" were included each week, and developers knew that these could delay the build (there were two or three occasions where a deadline was missed). "Nice to have" items were optional. There were between eight and twenty of these each week, although they were added to a cumulative list. Approximately three-quarters of these were included in each time box. "Defer" items were a combination of large and small features or changes that could be moved over time into the "must have" or "nice to have" group. Examples ranged from the tutorial to complex encryption requirements that were included in subsequent builds.

Human Computer Interaction. Although management realized that HCI was important, it was considered secondary to programming, and design staff were not hired until the first version of the product had been completed. The main proponent of a more expanded view of usability was the Director of Analytics. Rather than criticize the existing product, he would usually make his point by identifying other products that he believed exemplified good design. The result of these comparisons was a complete re-write of the traditional Windows-based interface (icons, menus and pointers) into one that was much more intuitive and conversational.
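The weekly triage described above can be sketched as a simple greedy fill of a time box: all "must have" items go in first, then "nice to have" items that still fit. This is a hypothetical model with made-up feature names and estimates; Semper's actual process was judgment-based, not automated:

```python
# Planning-game triage sketch (hypothetical data).
def plan_build(features, capacity_hours):
    musts = [f for f in features if f["group"] == "must"]
    nices = [f for f in features if f["group"] == "nice"]
    chosen = list(musts)  # must-haves are always in, even if they delay the build
    remaining = capacity_hours - sum(f["est"] for f in musts)
    # Fill the rest of the time box with the cheapest nice-to-haves first.
    for f in sorted(nices, key=lambda f: f["est"]):
        if f["est"] <= remaining:
            chosen.append(f)
            remaining -= f["est"]
    return chosen  # "defer" items are left out entirely

features = [
    {"name": "encryption", "group": "must",  "est": 10},
    {"name": "tooltip",    "group": "nice",  "est": 2},
    {"name": "export",     "group": "nice",  "est": 6},
    {"name": "tutorial",   "group": "defer", "est": 20},
]
build = plan_build(features, capacity_hours=14)
print([f["name"] for f in build])
```

With a 14-hour box, the 10-hour must-have goes in first and only the cheapest nice-to-have still fits, mirroring the roughly three-quarters inclusion rate reported above.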
Despite the fact that human-computer interface issues were later seen as critical to the system and a great deal of time was spent on design, HCI was considered technically minor by the CEO.

Cosmetic Changes. Semper viewed all non-programming changes as important to customers and use, but mainly "cosmetic". There were numerous evolutions and changes to text, font, color, position and alignment. These were continuous and, in the words of a developer, were "tedious but not hard". The frequency and approach used to manage these changes are described in Table 4.

Refactoring. Code focused on functionality and was continually refined and improved. The first product build, created after just two weeks, was essentially a shell program but was designated version 1.0.0. Substantive changes incremented the second-order digit and minor changes usually incremented the low-order identifier. In addition, there were several major changes. For example, a complete change in the system interface required that all of the modules be re-written simultaneously, and the main analytic engine (over 6,000 lines of code) was completely re-written over a two-month period. As such, in addition to continuous improvement through refactoring, there were periods of intense improvement in function, usability, reliability, speed and stability.
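The numbering scheme described above — second digit for substantive changes, low-order digit for minor ones — amounts to a two-line bump rule. A sketch follows; whether substantive bumps reset the low-order digit to zero is an assumption, as the paper does not say:

```python
# Version-numbering sketch for the 1.x.y scheme described in the text.
def bump(version, substantive=False):
    major, feature, minor = (int(p) for p in version.split("."))
    if substantive:
        # Assumption: a substantive change resets the low-order digit.
        return f"{major}.{feature + 1}.0"
    return f"{major}.{feature}.{minor + 1}"

v = "1.0.0"                    # the initial shell build
v = bump(v)                    # minor change
v = bump(v, substantive=True)  # substantive change
print(v)
```

Starting from the 1.0.0 shell build, a minor fix yields 1.0.1 and a subsequent substantive change yields 1.1.0.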
Table 4. Development Taxonomy

Type | Description | Number of Changes | Approach
Structural | Fundamental aspects or product core. | 100 | Simple Design & System Metaphor.
Feature | Market and competitive requirements. Grouped into "must have", "nice to have" and "defer". | >250 | On-Site Customer, Planning Game, Simple Design, & Refactoring.
HCI | Usability issues such as placement of glyphs, screen dialogue and presentation. | >1,000 | On-Site Customer, Small Releases, Continuous Testing, & Refactoring.
Cosmetic | Icons, glyph, color, dialogue and position changes (not all simple). | | On-Site Customer, Refactoring, & Small Releases.
Simple Design. Development was guided by simple principles while trying to avoid architectural constraints, or what the CEO called "painting themselves into a corner". Problems were designated BR or AR. BR problems were those that impacted customers and had to be fixed before revenue; AR problems were those that could be solved with the increased resources provided after revenue. The planning game arbitrated the cost of desired features, and refactoring delivered functionality that was later improved. Conceptually, developers were told to consider the metaphor of a 'modern digital camera', where a high level of complexity and functionality sits behind a simple interface that users can employ in a myriad of sophisticated ways.

Small Releases. Time boxing was part of the discipline. Consequently, developers released a new version almost every second week. This was relaxed during major revisions and accelerated to almost daily versions when approaching a major deadline. In addition, management and users were also given replacement modules (DLLs) that delivered specific functionality, fixed problems or generally improved the code. Despite periods where developers complained that the ongoing short-term focus impeded delivery of a more systematic and quality-oriented product, management remained committed to the concept of small releases. In a twelve-month period developers delivered approximately 35 complete versions, with almost two dozen non-developer compiles and more than 150 replacement DLLs over and above the build cycle. Working through the planning game, management and the developers laid out a build schedule that was tracked using basic project management tools and rarely modified.

System Metaphor. Communication was simple and most often facilitated directly by the data model, the program itself, and the fact that, with the exception of the Director of Finance and two junior business analysts, all employees had been formally trained in systems analysis or computer programming.
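The weekly triage described earlier (all "must haves" in, roughly three-quarters of the cumulative "nice to have" list, "defer" items waiting to be promoted) can be sketched as follows. This is an illustration of the reported process, not the team's actual tooling; the item names and the hard three-quarters cutoff are our own simplifications.

```python
# Hypothetical sketch of the weekly time-box planning game.

def plan_time_box(items):
    """items: iterable of (name, category) pairs, with category in
    {'must have', 'nice to have', 'defer'}."""
    must = [n for n, c in items if c == "must have"]
    nice = [n for n, c in items if c == "nice to have"]
    deferred = [n for n, c in items if c == "defer"]
    cutoff = (len(nice) * 3) // 4   # roughly three-quarters made each build
    return {"build": must + nice[:cutoff],
            "carry_over": nice[cutoff:] + deferred}

week = [("encryption", "must have"), ("tooltip text", "nice to have"),
        ("font tweak", "nice to have"), ("help pane", "nice to have"),
        ("new skin", "nice to have"), ("tutorial", "defer")]
print(plan_time_box(week)["build"])
# ['encryption', 'tooltip text', 'font tweak', 'help pane']
```

In the case itself the cutoff was a judgment call made in the planning game rather than a fixed ratio, which is exactly the arbitration role described above.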
P. Tingling and A. Saeed

Design of the products was handled through a combination of strategic and tactical adjustments. Joint Application Design (JAD) sessions were used to begin product development, after each of the beta programs, and before each of the three program redesigns. Tactically, designers and management met twice a week to receive the weekly build and to review progress, bug status and planned revisions to the upcoming version schedule. We next draw conclusions about the degree and extent of appropriation, discuss limitations, and suggest future research and implications.
5 Conclusions and Summary

Semper's partial adoption of agile principles reinforces other findings indicating that up to two-thirds of large companies have adopted 'some form' of agile methods [8], which are then blended with more traditional practices. Practitioners have not adopted XP in an all-or-nothing fashion, and faithful appropriation of all principles seems to be a rarity. Initially, Semper implemented only eight principles. Interestingly, three of the remaining four (continuous testing, shared code and coding standards) did later become more fully and faithfully appropriated. At first glance, it would appear that Semper should have applied more diligence in following agile principles from the outset. Alternatively, we suggest that these principles may have required a certain level of maturity not present in the organization's employees and processes. Coding standards were initially eschewed by management in favor of creativity, until a basic level of code had been developed. While the programming staff themselves favored standards, they were unable to agree on the specifics until staff turnover and management support of a standard pressured them to do so. Similarly, developers still sought code ownership despite a concerted effort by management to curb such behavior. Pair programming, the only principle that did not gain any momentum, continues to be supported by management but has yet to be embraced by the developers. Therefore, we find that temporal conditions and maturity affect the extent to which extreme programming principles are adopted, and that both management and developer cultures are salient considerations. Consequently, future research should consider both cultural conditions and managerial preferences.

Acknowledgments. We are grateful to Semper Corporation. This research was supported by a grant from Simon Fraser University.
References

1. Georgiadou, E.: Software Process and Product Improvement: A Historical Perspective. Cybernetics and Systems Analysis 39(1), 125–142 (2003)
2. Gibbs, W.W.: Software's Chronic Crisis. Scientific American 271(3), 89–96 (1994)
3. Berry, D., Wirsing, M., Knapp, A., Simonetta, B.: The Inevitable Pain of Software Development: Why There Is No Silver Bullet. Radical Innovations of Software and Systems Engineering in the Future, Venice (2002)
4. Brooks, F.P.: The Mythical Man-Month. Addison-Wesley, London, UK (1975)
5. Duggan, E.W.: Silver Pellets for Improving Software Quality. Information Resources Management Journal 17(2), 1–21 (2004)
6. Beck, K.: Extreme Programming Explained: Embrace Change. Addison-Wesley, Reading, Mass (2000)
7. Highsmith, J.: Agile Software Development Ecosystems. In: Cockburn, A., Highsmith, J. (eds.) Agile Software Development Series. Addison-Wesley, Boston (2002)
8. Barnett, L., Narsu, U.: Best Practices for Agile Development (2003), http://www.gigaweb.com, accessed January 15, 2005
9. Agile Manifesto: The Agile Manifesto (2001)
10. Goldman, S.L., Nagel, R.N., Preiss, K.: Agile Competitors and Virtual Organizations. Van Nostrand Reinhold, NY (1995)
11. Williams, L., Cockburn, A.: Agile Software Development: It's About Feedback and Change. Computer 36(6), 39–43 (2003)
12. Yin, R.K.: Case Study Research: Design and Methods. Sage Publications, Thousand Oaks, CA (1994)
13. Spradley, J.P.: The Ethnographic Interview. Holt, Rinehart and Winston, New York (1979)
Holistic Interaction Between the Computer and the Active Human Being Hannu Vanharanta and Tapio Salminen Tampere University of Technology, Industrial Management and Engineering, Pohjoisranta 11, 28101 Pori, Finland
Abstract. In the design, development and use of computer-based decision support systems, the ultimate challenge and goal is to arrange and organize successful interaction between the computer and the active human being. This paper therefore examines the extent to which, by applying the hyperknowledge framework developed by Ai-Mei Chang, the holistic concept of man developed by Lauri Rauhala, and the Circles of Mind metaphor developed by Hannu Vanharanta for decision support systems, these systems can be made to emulate human cognitive processes. The approach is a new one, and it represents an emerging paradigm for achieving emulation and synergy between human decision-making processes and computer configurations. Keywords: Holistic, Interaction, Human Beings, Computer Systems, Concepts, Constructs, Architecture, Co-Evolution, Decision Support Systems.
1 Introduction

Technological advancement is spreading information technology into ever more varied organizations, user groups and activities. The basic purpose of a decision support system is to extend the individual decision maker's own capability to master the usage, knowledge and information content of the computer, as well as the procedures and interpretations that the computer is processing in any situation where the user works with a computer system. Designers, developers and users of these new support systems and their applications will therefore need to understand much more than previously required about the nature of the human actor, the activities of the human brain, and the theories, methodologies and methods of decision making.

This paper focuses on the design of a computer-based system for supporting the user beyond the mere retrieval and processing of data. We examine the extent to which, by applying the hyperknowledge framework, the holistic concept of man and the Circles of Mind metaphor for decision support systems, these systems can be made to better emulate human cognitive processes. We also aim at an improved understanding of the situationality of the active user in order to better understand which constructs and concepts should be included in the design of computer systems. By first combining the hyperknowledge framework with the holistic concept of man framework, and then by combining this with the Circles of Mind metaphor, our research provides a new approach to the design of computer systems.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 252–261, 2007. © Springer-Verlag Berlin Heidelberg 2007

This new
architecture can easily be applied to many computer systems, as well as to new areas of computer usage where holism plays an important role.

1.1 A Philosophic Model of the User

The Holistic Concept of Man (HCM) is a philosophic model that has been described in a number of books and articles by Rauhala, a Finnish phenomenological philosopher and psychologist [1] [2] [3]. Rauhala's source material consists, in particular, of the works of two well-known German philosophers: Husserl [4] and Heidegger [5]. The advantage of the holistic concept of man, compared to the theories presented by Husserl and Heidegger, is that it has a rather simple construction and is therefore more understandable for non-experts.

1.2 The User's Mind

The Circles of Mind metaphor [6] opens up the mind's most important sectors: the memory system, interpretation system, motivation system and automatic system. These systems and their content must be reinforced by the computer system so that the user feels supported when using the computer.

1.3 The User as a Decision Maker

The hyperknowledge framework [7], in turn, views a decision maker as cognitively possessing many diverse and interrelated pieces of knowledge (i.e. concepts). Some are descriptive, some procedural, some are concerned with reasoning, and so forth. The mind is able to deal with these in a fluid and inclusive manner via the controlled focusing of attention. That is, the decision maker actively acquires (recalls, focuses on) desired pieces of knowledge by cognitively navigating through the universe of available concepts. The result is then a hyperknowledge view of the underlying concepts and content involved in the decision making.

1.4 Computer Architecture

By combining these three above-described views, we end up with a new architecture for computer applications and constructs.
2 The Holistic Concept of Man Metaphor

2.1 Modes of Existence

The Holistic Concept of Man (HCM) is a human metaphor. The basic dimensions of the metaphor consist of a body, a mind and a situation [2] [8]. A human being is an organism [9] which uses thinking processes and exists in particular and individually formed situations. The human being consists simultaneously of three modes of existence based on the above basic dimensions of the HCM, and these modes cannot be separated from each other. According to the HCM, all three modes are needed in
order to make human existence possible and to understand the holistic nature of the human being. These modes of existence of the human being are called:

1) corporeality, or existence as an organism with organic processes (the body);
2) consciousness, or existence as a psychic-mental phenomenon, as perceiving and experiencing (the mind);
3) situationality, or existence in relation to reality (the situation).

Human beings have relationships and interrelationships that characterize their qualities as individuals in specific situations [10].

2.2 Corporeality

The first mode of existence, corporeality, maintains the basic processes of existence and implements the physical activities of the human being. The human brain and sense organs (internal and external) are needed when observing the objects and concepts in a specific situation, which send meanings to the observer [10].

2.3 Consciousness

In consciousness, the active human being experiences, perceives and understands the phenomena encountered. This is more than a mere thinking process (cf. res cogitans) because qualities such as experiencing, perceiving and understanding are also involved. When human beings use their inner and outer senses to receive physical signals from the environment, the situation provides the consciousness with a meaningful content, and the human being understands this content, i.e. perceives the corresponding construct(s), object(s) or concept(s) to be "something." As a result of an act of understanding, there emerges a relationship, or a meaning or meanings.

The HCM metaphor separates the terms consciousness and mind. Consciousness is the totality of the psychic-mental existence of the human being. Mind is used in a more functional sense to refer to the psychical and mental processes which, when taken as a totality, form the mode of existence called consciousness. Mind is a continuous process in which meanings emerge, change and relate to each other.
Meanings are linked together in the mind and collectively form networks of meanings. The totality of these networks is called the world view of a human being. In relation to the world view, a human being understands both old and new phenomena. "Cause maps" or "mental models and maps" used in the cognitive psychology approach correspond to some degree to the notion of the world view. The psychological term "memory" also corresponds to some degree to the world view in the HCM metaphor [10].

2.4 Situationality

Situationality is the third dimension of human existence. Situationality emphasizes that a human being exists not only "as such," in a vacuum or isolation, but always in relation and interrelation to reality with its multitude of aspects. The world, or reality, is all that exists concretely or ideally, i.e. the world with which people in general can
relate to. Situation (or the situation of life) is that part of the world with which a particular human being forms relationships and interrelationships [10]. Situationality is always unique to each individual. Human beings understand the same object(s) in their situation in an individual way.

[Figure 1 depicts an active human being in terms of corporeality (senses, brain, limbs, activities), consciousness (the mind, with will, intuition, feeling and belief) and situationality (object(s), scientific information, everyday knowledge and the world view).]
Fig. 1. An active human and different types of meaning [10]
3 The Circles of Mind Metaphor

3.1 The Theatre Metaphor

The HCM metaphor, or the idea of the human being in a specific situation as a totality, is not sufficient for the development of a brain-based system: it does not incorporate current research findings on the unconscious part of the human brain. Baars [11] has combined psychology with brain science and the old conception of the human mind to create a metaphor based on the workspace of the mind. The totality can be explained through the theatre metaphor, where the self as an agent and observer behaves as if on the theatre stage. Close to the stage is the unconscious part of the brain (the audience), which is divided into four main areas: the motivational system, automatic systems, interpreting system and memory system. The spotlight controller, context and theatre director are also present.

3.2 The Circles of Mind Construct

A combination of the HCM and the theatre metaphor of Baars led to our new, particular and very practical metaphor, named the Circles of Mind metaphor [6]. The Circles of Mind metaphor was also designed as a physical entity so that it could be used for design purposes. This has led to the idea of a brain-based system which contains the physical body following the Cartesian mind-body relationship, i.e. as a thinking thing and an extended thing [9]. One version of the Circles of Mind metaphor is presented in Figure 2 below.
[Figure 2 depicts conscious experience on a central stage, surrounded by the unconscious audience of the mind: the memory system (autobiographical, declarative and skill memory; beliefs, facts, lexicon), the interpreting system (object and event recognition, speech and face recognition, syntactic analysis, social inferences), the motivational system (goals, conflicts, attitudes, intuitions, dreams) and the automatic systems, together with the players, the director, the contexts behind the scenes, and the sensory and motor systems.]
Fig. 2. The Circles of Mind metaphor [6]
Res cogitans/A Thinking Thing was evident here, giving us the four main parts for the architecture of a new computer system. Res extensa/An Extended Thing (body) represents the other dimension of man, which physically uses the computer keyboard and gives the power of functionality to the computer application to be used on the stage.
4 The Hyperknowledge Framework

4.1 Hyperknowledge

The hyperknowledge framework views the decision maker, i.e. here an active computer user, as cognitively possessing many diverse and interrelated pieces of knowledge (i.e. concepts). Some concepts are descriptive, some procedural, some are concerned with reasoning, and so forth. The mind is able to deal with these in a fluid and inclusive manner via the controlled focusing of attention. That is, the decision maker actively acquires (recalls, focuses on) desired pieces of knowledge by cognitively navigating through the universe of available concepts. To the extent that a DSS emulates such activity, interacting with it should be relatively "friendly," natural and comfortable for the user. That is, the DSS can be regarded as an extension of the decision maker's innate knowledge management capabilities. The decision maker is able to contact and manipulate knowledge embodied in the DSS as a wide range of interrelated concepts, and to navigate through the concepts of the DSS in either a direct or an associative fashion, pausing to interact with them. Thus, the hyperknowledge framework regards a decision support environment ideally as an extension of the user's mind or cognitive faculties. Its map of concepts and relationships extends the user's cognitive map, pushing back the cognitive limits on knowledge representation. Its knowledge processing capabilities augment the
user's skills, overcoming cognitive limits on the speed and capacity of human knowledge processing. In the following passages we summarize, on a technical level, the major contents and functionality of a DSS specified as per the hyperknowledge framework. For further details, readers can refer to Chang, Holsapple, and Whinston [12], [7], and also study the prototype applications based on Vanharanta's framework [13].

4.2 Decision Support Content and Functionality

According to the hyperknowledge framework, a decision support system is defined, architecturally, in terms of a language system [LS], a presentation system [PS], a problem processing system [PPS], and a knowledge system [KS]. The LS is the universe of all requests the DSS can accept from the user, and the PS is the universe of all responses the DSS can yield. The KS is the extensive universe of all knowledge stored in the DSS. The PPS has a wide range of knowledge management capabilities corresponding to the wide range of knowledge representations permitted in the KS. The KS holds concepts that can be related to each other by definition and association. These concepts and their relationships can be formally expressed and processed in terms of database, formal logic and model constructs. Associative and definitional relationships among concepts in the KS are the key to creating a hyperknowledge environment and navigating within it. The KS also contains more than just models and data: it contains reasoning, assimilation, linguistic and presentation knowledge (see Figure 3, the human system metaphor developed by Dos Santos and Holsapple [14]).
Fig. 3. Structure of the decision support system [14]
The dynamics of the DSS involve transformations of messages from the user's language system to the decision support system’s LS. These transformations are carried out by the PPS (subject to the content of the KS) using four basic functions: translation (t), assistance (a), functionality (f), and presentation (p). The user interface and functionality of a DSS specified as per the hyperknowledge framework are depicted in Figure 4.
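The four PPS functions can be read as a simple message pipeline from request to response. The sketch below is our own illustration under stated assumptions: the class names, the toy knowledge store and the message formats are invented; only the decomposition into translation (t), assistance (a), functionality (f) and presentation (p) follows the framework.

```python
# Minimal, hypothetical sketch of the LS -> PPS -> PS flow over a KS.

class KnowledgeSystem:
    def __init__(self):
        self.facts = {"revenue": 1200}          # stands in for Kdesc

class ProblemProcessingSystem:
    def __init__(self, ks):
        self.ks = ks
    def translate(self, request):               # t: LS request -> internal form
        return request.strip().lower()
    def assist(self, query):                    # a: help resolve the query
        return query if query in self.ks.facts else None
    def function(self, query):                  # f: apply knowledge in the KS
        return self.ks.facts.get(query)
    def present(self, result):                  # p: shape the PS response
        return f"value: {result}" if result is not None else "unknown concept"

def handle(pps, request):
    """One pass through the t -> a -> f -> p pipeline."""
    query = pps.assist(pps.translate(request))
    return pps.present(pps.function(query) if query else None)

pps = ProblemProcessingSystem(KnowledgeSystem())
print(handle(pps, " Revenue "))  # value: 1200
```

A real DSS in this framework would of course route far richer knowledge types (Kling, Kreas, Kproc, Kpres) through the same four functions.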
[Figure 4 shows the user exchanging messages with the system through an inner and an outer loop: requests pass from the user through the Language System to the Problem Processing System, which applies its translation (t), assistance (a), functionality (f) and presentation (p) functions to the knowledge held in the Knowledge System before responses return to the user.]
Fig. 4. User interface and functionality of the hyperknowledge framework [13]
The knowledge symbols in Figure 4 signify the following:

Kling = linguistic knowledge available in the KS
Kreas = reasoning knowledge available in the KS
Kdesc = descriptive knowledge available in the KS
Kproc = procedural knowledge available in the KS
Kpres = presentation knowledge available in the KS
4.3 Working Space

When a decision maker is working in the hyperknowledge environment, a concept must be "contacted" before it can be "impacted" (affected) by, or have an "impact" on, the decision maker. Contact is the recognition of a concept in the environment and entails sensing the existence of the concept and bringing it into focus. Either implicitly or explicitly, the user is provided with a "concept map" as the basis for establishing contacts [13]. The concept map indicates what concepts are in the environment and what their interrelationships are. An implicit map is external to the DSS (e.g. in the user's cognitive environment), which may become burdensome as the KS grows complex. An explicit map is provided by the DSS itself and can be regarded as a piece of descriptive knowledge held in the KS, describing the present state of its contents. With a concept map as the original contact point within the environment, the user can make controlled, purposeful contacts with any desired concept in the hyperknowledge realm. Users can focus their attention on any part of an image, multiple windows can provide different views of parts of the same image, and different images of the same underlying concept can be seen in various windows. The result is extensive user interface flexibility, which is important for a facile and adaptive interface design.
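An explicit concept map of the kind just described is essentially a labelled graph of definitional and associative links that the user navigates by bringing one concept into focus. The class below is a minimal, hypothetical rendering (not part of the framework's specification); the concept names are invented for illustration.

```python
# Hypothetical explicit concept map: concepts related by definitional
# and associative links, navigated by "contacting" one concept at a time.

class ConceptMap:
    def __init__(self):
        self.links = {}                         # concept -> {(kind, concept)}
    def relate(self, a, kind, b):
        """Record that concept a relates to concept b (kind is
        'definitional' or 'associative')."""
        self.links.setdefault(a, set()).add((kind, b))
    def contact(self, concept):
        """Bring a concept into focus and list its related concepts."""
        return sorted(self.links.get(concept, set()))

cmap = ConceptMap()
cmap.relate("profit", "definitional", "revenue")
cmap.relate("profit", "definitional", "cost")
cmap.relate("profit", "associative", "forecast model")
print(cmap.contact("profit"))
# [('associative', 'forecast model'), ('definitional', 'cost'),
#  ('definitional', 'revenue')]
```

Navigating "in a direct or associative fashion" then amounts to repeatedly calling contact() and following a chosen link, which is what the windows and images described above surface to the user.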
5 Emerging Paradigm 5.1 Fusion Framework In the developed computer architecture, we have based our thinking on co-evolution by combining the HCM metaphor, hyperknowledge framework and the Circles of
Mind metaphor in one design framework, i.e. a fusion framework. The basic idea has been to map computer constructs and computer applications according to our theories, based on modern brain science and on the basics of HCM and hyperknowledge functionality. With the created fusion framework we can design various computer applications and alter the design of existing knowledge bases and databases. First, our applications contain the same systems that are integral to the human brain, emulating business processes just as the brain emulates reality through its own processes. The knowledge structure therefore contains the same important areas as the unconscious part of the brain.

5.2 Functionality of the Fusion Framework

Figure 5 shows the user's brain processes interacting with the user interface via the computer screen. The functionality is described as the hyperknowledge functionality, and the database construction as the unconscious part of the human brain. In contemporary Internet applications, it is possible to navigate through the data and then combine the information according to the user's needs, just as the hyperknowledge functionality describes the active computer user [10]. Again, these new applications share the same construct: to support the user through the user interface and, furthermore, to support the basic human processes of the mind, i.e. interpretation, memory, motivation and automatic activities. On the other hand, the combination possibilities are huge and, therefore, we have to focus on creating efficient and effective computer content for the computer user in a context-specific situation. In our computer applications, we first describe the content and the objectives of the application itself. The creation of the context-specific ontology then becomes crucial.
Fig. 5. A human compatible computer system (Salminen & Vanharanta 2007)
The construction is new and can be applied in many ways, from application design to database and computer design purposes. Our goal is to demand more from the computer and its application design. We require the design to be more holistic for the user.
6 Conclusion

In the developed computer architecture, we have based our thinking on co-evolution. In this kind of overall system design, the computer has been illustrated to have the same sub-systems as we have in our brains. This framework can be applied to many different applications which use hardware and software. We can increase our knowledge through computer interaction; hyperknowledge is then created on the computer screen. The construction as such contains the basic ideas of co-evolution: self-development through the use of, and interaction with, the computer. Some applications bring the user information automatically and others extend the user's memory capacity. Some applications also help the user to interpret the current reality, while others may help motivate the user. There are even some applications which support all system areas. Therefore, all applications, to one extent or another, increase and support our brain processes.

In the same way, we can work with concepts other than computers within the conscious experience of humans. If we put an object into the conscious experience, for example different business processes, it is possible to create an externalization of the business processes through an application. The actor can then explore the concept and gain a holistic view and understanding of the matter. These kinds of applications need supporting ontologies, concepts and technology to uncover the underlying models behind the motivation, interpretation, memory and automatic systems, and to show how these different sub-systems can be used in real-life applications. These models also need other living-system concepts to evolve with the processes and make the applications all the more humanistic.

Acknowledgments. The work behind this paper has been financed by the 4M project (cf. National Agency of Technology in Finland, DNro 770/31/06 and Nro 40189/06) at the Industrial Management and Engineering department in Pori, Finland.
References

1. Rauhala, L.: The Hermeneutic Metascience of Psychoanalysis. Man and World 5, 273–297 (1972)
2. Rauhala, L.: Ihmiskäsitys ihmistyössä (The Conception of Human Being in Helping People). Gaudeamus, Helsinki (1986)
3. Pihlanto, P.: The Holistic Concept of Man as a Framework for Management Accounting Research. Publications of the Turku School of Economics and Business Administration, Discussion and Working Papers, vol. 5 (1990)
4. Husserl, E.: Husserliana I–XVI, Gesammelte Werke. Martinus Nijhoff, Haag (1963–1973)
5. Heidegger, M.: Being and Time. Blackwell, Oxford (1962)
6. Vanharanta, H.: Circles of Mind. In: Identity and Diversity in Organizations – Building Bridges in Europe, XIth European Congress on Work and Organizational Psychology, May 14–17, 2003, Lisbon, Portugal (2003)
7. Chang, A., Holsapple, C.W., Whinston, A.B.: A Hyperknowledge Framework of Decision Support Systems. Information Processing and Management 30(4), 473–498 (1994)
8. Rauhala, L.: Tajunnan itsepuolustus (Self-Defense of the Consciousness). Yliopistopaino, Helsinki (1995)
9. Maslin, K.T.: An Introduction to the Philosophy of Mind. Blackwell Publishers, Malden (2001)
10. Vanharanta, H., Pihlanto, P., Chang, A.: Decision Support for Strategic Management in a Hyperknowledge Environment and the Holistic Concept of Man. In: Proceedings of the 30th Annual Hawaii International Conference on System Sciences, pp. 243–258. IEEE Computer Society Press, California (1997)
11. Baars, B.J.: In the Theatre of Consciousness. Oxford University Press, Oxford (1997)
12. Chang, A., Holsapple, C.W., Whinston, A.B.: Model Management: Issues and Directions. Decision Support Systems 9(1), 19–37 (1993)
13. Vanharanta, H.: Hyperknowledge and Continuous Strategy in Executive Support Systems. Acta Academiae Aboensis, Ser. B, 55(1). Åbo Akademi Printing Press, Åbo (1995)
14. Dos Santos, B., Holsapple, C.W.: A Framework for Designing Adaptive DSS Interfaces. Decision Support Systems 5(1), 1–11 (1989)
The Use of Improvisational Role-Play in User Centered Design Processes Yanna Vogiazou, Jonathan Freeman, and Jane Lessiter Psychology Department, Goldsmiths College London, University of London New Cross SE14 6NW {y.vogiazou, j.freeman, j.lessiter}@gold.ac.uk
Abstract. This paper describes the development and piloting of a user-centered design method which enables participants to actively engage in a creative process to produce intuitive representations and inspire early design concepts for innovative mobile and ubiquitous applications. The research has been produced as part of the EC-funded project PASION, which aims to enhance mediated communication in games and collaborative environments through the introduction of socio-emotional information cues, represented in ways that are meaningful yet abstract enough to accommodate variable thresholds of privacy. We describe our design research methodology, which combines analytical approaches, aiming to uncover participants' needs, desires and perceptions, with creative, generative methods, with which participants inform and inspire the design process.
1 Research Context

Mobile technologies are undoubtedly changing the way we interact with other people and our patterns of communication, with new forms of language or signalling emerging among mobile users and enabling a sense of being in touch, even with minimal text-based exchanges. We are interested in exploring the potential of non-verbal communication among individuals and groups, and the kind of emergent interactions and user behaviours that can occur through the awareness of social and emotional cues that are normally absent in mediated communication. Our aim is to create engaging user experiences by introducing implicit or invisible social cues, such as an individual's or group's emotional and mental state. Our starting point of investigation is the kinds of individual and group information elements that are relevant in particular contexts of communication. Recent research in communicating people's states through the visualization of changes in the measurement of physiological data like arousal, skin conductivity, or heartbeat (Paulos, 2003) has highlighted fascinating opportunities for the creation and evolution of a new, user-generated 'interaction language', shared among people in close relationships. Other research projects, like the 'Affective Diary' (Lindström et al., 2006) and Emotion Map (Nold, 2006), have focused on users' perceptions of their own states and their reflection on everyday mediated interactions or city journeys respectively.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 262–272, 2007. © Springer-Verlag Berlin Heidelberg 2007

Such reflections were possible through the visualization of collected user
The Use of Improvisational Role-Play in User Centered Design Processes
physiological data over time and the association of the data with communication exchanges with other people (in the former case) or the places users were visiting (in the latter) at the time the changes in their physiological states were recorded. In a similar line of thought, we are particularly excited by the possibility of introducing such feedback in real time in social and collaborative contexts and observing the kinds of spontaneous individual and group behaviours that could emerge. For this purpose we have adopted a bottom-up, user-centered design research approach in order to identify, initially, how people express various communication, personal and contextual cues in spontaneous ways that make sense to them. The benefits and future opportunities deriving from these research directions span a range of application areas, in particular those in which individuals and groups communicate and collaborate through new technologies. For instance, Reimann and Kay (2005) discuss the role of visualizations in helping groups of learners improve their knowledge and performance. The authors consider groups as complex systems, where global dynamics can result from local interactions, and propose visualizations as a means of providing team awareness. Research in social computing applications (Vogiazou, 2006) has shown that even minimal indicators of other people's presence facilitate group awareness, which is beneficial for strengthening social bonds among groups and communities. Our interest in patterns of group behaviour and social dynamics in collaborative interactions, across work, learning and leisure-oriented activities, motivated the initial phase of the research described in this paper. The goal is to identify, through design and user research, the kinds of socio-emotional cues that can provide useful feedback in communication, and to explore the group and individual user behaviours that emerge from the introduction of such cues.
The studies we discuss in this paper, which are part of the EC-funded PASION (Psychologically Augmented Social Interaction Over Networks) project, aim to identify:
• social, emotional and contextual information elements (situational cues, environmental context, and individual and group behaviours) that are relevant in mediated communication in collaborative work and social settings;
• potential real-time and historical representations of these elements in the form of multimodal, non-verbal/textual representations; and
• the relevance and importance of such cues in collaborative work and social gaming situations at different levels of privacy disclosure.
Next we describe the design research method we deployed to address these issues.
2 Design Research

The main premises of user-centered design are to bring users closer to the design process and to help designers gain empathy with users and their everyday activities through the use of different methodologies. Role playing has been used in user-centred design workshops for the concept generation of innovative products in everyday life (Kuutti et al., 2002) as well as for testing out design ideas with potential consumers (Salvador and Sato, 1999). In interaction design research, role playing has
Y. Vogiazou, J. Freeman, and J. Lessiter
been performed extensively with the use of low-fidelity prototypes to develop design ideas further. What Buchenau and Fulton Suri describe as 'Experience Prototyping' is usually based on improvising user scenarios that create opportunities for some kind of technological intervention or design solution (Laurel, 2003). These scenarios of use are often acted out either by users or designers with some kind of props or imaginary objects, aiming to identify potential breakdowns as well as design opportunities. This method of user involvement in the design process tends to generate potential or 'futuristic' functionalities for products the design team is working on. The functionalities are then eliminated or developed further by the design team in the continuing process. Role play has the main advantage of facilitating empathy with the context of use while trying out early design ideas. When acting out the everyday problems refrigeration technicians are confronted with, with designers as actors and target users as the audience, Brandt and Grunnet (2000) found that the users recognized the situations shown in the dramatized scenarios as ones they often experienced. The designers who performed the scenarios, on the other hand, found it harder to use drama in such an unfamiliar context. In another study, role playing was used to elicit a first brainstorm among users about the potential functionalities of an interactive book, the Dynabook, in the home environment. Both studies showed that drama can help designers achieve greater empathy for users and contexts of use. In our research, role play was not related to a particular prototype or imaginary object aiming to elicit ideas for functionality, but was used as an expressive medium for users to communicate emotional states and contextual situations. The provided props were open to interpretation and aimed to facilitate the acting itself, without binding the design process to a particular artefact.
Previous research on role playing as a design methodology has outlined the difficulties of involving drama professionals as facilitators (Svanaes and Seland, 2004) because: a) introducing users to acting techniques can be very time consuming and is a separate activity from design – with drama exercises lasting 4.5 hours, the creative sessions need to be arranged in other time slots; and b) drama professionals tend to focus on their subject of expertise – teaching and facilitating the acting – rather than the generation of design ideas, and therefore need to be able to understand the purpose and scope of a generative workshop. In our studies it was important to ensure that the participants were initially immersed in the themes and ideas of the workshop. Following a group discussion, the role playing itself was presented as a game, so there was no need to provide any training in performance; it was sufficient to describe the activity and act out an example of what was asked, introduced by one of the facilitators. An innovative research method, combining analytical and generative approaches, was developed and deployed in two user group workshops, which focused on collaborative work (at the Center for Knowledge and Innovation Research, Helsinki School of Economics, Finland) and social gaming (at the Department of Computing and Informatics, University of Lincoln, UK) respectively. The workshops were designed to identify relevant and potentially useful elements of personal, social and contextual information, represented in meaningful ways so as to be readily interpretable. User attitudes in relation to privacy and comfort with sharing these information cues were also explored.
Both the collaborative work and social gaming workshops followed a similar structure, which encouraged participants to become immersed in the subject and discuss their views before engaging in creative activities that required them to generate ideas and concepts for representations. The phases can be summarized as follows:
• General group discussion and brainstorm. The discussion was focused on everyday collaborative work practices and different forms of play in the respective workshops, aiming to identify relevant information elements about individuals, teams and context.
• Feedback on early sketches. Participants were shown the same set of rough sketches (see Fig. 1 for an example), representing individual, collective and contextual states and cues, and were asked to guess what they were meant to suggest. This initiated further discussion and suggestions on non-verbal representations. At the same time this activity acted as a warm-up, to prepare for the generative session that followed and to inspire participants to think about representations in a more abstract, broader sense.
• Improvisational role playing. The role-playing was performed individually by each participant to come up with creative ideas about representing information using different modalities (e.g. visuals, sounds, actions). Here we focus on this method in particular.
• Card sorting activities. In this last task participants prioritized and grouped the main information elements that emerged in the initial group discussion. They were also asked to comment on when these elements need to be private and when they can be public.
Fig. 1. Left: sketch of a group state indicating collective activity (movement, excitement). Right: sketch of an individual in a calm environment.
For the workshop on collaborative work, six male participants aged 24–40 were recruited, all with professional experience of collaborative work, either as researchers or PhD students. For the social gaming workshop, nine participants aged 17–40 were recruited, five of whom were female and four male. Four participants were pursuing a postgraduate degree and the other five were A-Level Psychology students. Participants had varied gaming experience, ranging from massively multiplayer online games to traditional board and card games and physical street games. The role playing was not used as a re-enactment of a user scenario or for the evaluation of a design concept, but in an entirely generative way: participants were asked to act out, non-verbally, different situations that were relevant to the workshop theme. For example, some situations in collaborative work were: “You are confused by what your manager is saying to you in a conversation”, “You are very
stressed about a forthcoming deadline”, “Being on the bus or train to work, very crowded during peak time”. Situations related to social gaming were along the lines of: “You and your team are exploring a new area – a danger is approaching”, “You have developed bonds with a team of people”, “You are playing a mobile game in a really crowded café”. Participants were asked to pick from a box one situation and one modality to use for their representation, both written on strips of paper. Examples of modalities were: “Draw on paper”, “Act out a situation, improvise, mimic an activity” and “Make a sound orally”. The activity was introduced as a game of ‘Charades’, which appears in variations across cultures. Part of the challenge for participants was to represent individual, group and contextual information cues in such a way that the rest of the group could guess what was being represented. Various props (e.g. a tambourine, plasticine, paper, coloured pens and cups) were provided and used to express different modalities (e.g. auditory, visual, tactile). The workshop was recorded on audio and video. The video recording of the role-playing workshop was used for the further generation of concepts and design ideas. Video recordings, still photographs and sketches from these role-playing activities were then used as generators (Holmquist, 2005): they formed part of a process that generated inspiration, insights and ideas – the beginning rather than the end of concept development. Following the two workshops we organised a third one (at Goldsmiths College, University of London), which was primarily generative, aiming to explore in more depth the key themes that emerged from the previous workshops, using a similar method. Two teams of graduate designers aged 23–30 (five male, one female) were recruited to generate a breadth of concepts and multidimensional representations of individual, collective and contextual states.
The workshop was structured as follows:
• Brainstorm and concept mapping. Participants were asked to discuss the key concepts of ‘group power’ and ‘connecting’ in the context of different situations, taking into account location, user attributes and collaboration, whether work or leisure related. They documented the generated ideas by collaboratively drawing a ‘concept map’ (Novak, 1998) on large sheets of paper. This acted as a point of reference for further discussion and debate around the ideas.
• Individual role play. A role-playing activity performed individually to come up with creative ideas about representing various situations using different modalities (e.g. visuals, sounds, drawing, modeling, actions). As in the earlier workshops, participants had to randomly choose a ‘situation’ to represent and a modality to use.
• Collaborative role play. Role-playing in pairs: participants acted out together an idea they had generated in the earlier discussion, using various props. A range of props was provided to facilitate improvisation and idea generation on the fly, including a mixer with many different sounds for experimenting with representations (figure 2). The mixer had two CDs: one with ambient sounds (e.g. park, street noises) and another with short sound effects (e.g. clapping, a stampede). These could also be used in combination with a touch microphone attached under the table, which allowed participants to produce sounds spontaneously by tapping on the table or moving objects on its surface, enhancing the role-playing experience and the richness of the representations created. The design graduates
Fig. 2. Participants exploring and then using the audio equipment for sound based representations
produced detailed multimodal representations using, for example, samples of background sounds to represent emotional states and environmental situations, and combining traditional design processes like sketching and modelling with acting.
3 Insights from Improvisational Role-Playing Activities

The two user workshops in Finland and the UK produced spontaneous representations with noticeable cross-cultural similarities. For example, we observed an open posture and shaking hands (or a tambourine) as a representation of positive affect relating to celebration or excitement. A more closed body posture indicated negative affect, namely confusion or sadness in the collaborative work and gaming scenarios respectively. Participants in all three workshops engaged with the process and gave positive feedback; their interactions became more spontaneous during role play. Their non-verbal representations were very compelling in presentation and encouraged the continuous involvement of the rest of the group, as they tried to guess as closely as possible what was being represented. We found it easy to change the activity on the fly in the workshop, because of its flexible and non-prescriptive nature; participants could act out representations on their own or improvise collaboratively in pairs. The collaborative acts were more detailed and made extensive use of the available props. In the first two workshops, participants used visual representations, and in particular actions (as opposed to static poses or drawing), more than other modalities, in spite of encouragement to explore all modalities. Often participants would try to combine modalities (e.g. drawing and then performing gestures in relation to the drawing) in order to communicate their situation more accurately. The design graduates who participated in the third workshop discovered the use of sound as a powerful creative tool through mixing the different sounds provided. Modifying the available props in future workshops could therefore reveal more emergent representations and encourage more diverse improvisational play.
Below we present a selection of the generated representations, illustrating a variety of individual, group and contextual cues in collaborative work practice and social gaming, using various media (e.g. sound, hand gestures, poses, drawing and actions).

1. Using particular postures and movement to indicate personal states (confusion, stress, sadness)

Posture turned out to be a powerful means of communicating individual emotional states and social cues in role play. The postures and expressions of confusion (figure 3) were sketched out after the workshop to illustrate possible visual representations of confusion in technology-mediated communication.
Fig. 3. Two different postures showing lack of understanding and confusion in communication, with sketches outlining the posture used to represent confusion
Fig. 4. One participant kept moving in a loop to indicate a high level of stress
Even when not using the whole body for acting, posture could be suggested by other means. A participant in the social gaming workshop communicated body posture in a rather abstract way using his fingers. An imaginary sad figure, represented by a bent finger, with another finger moving away from it, showed the growing distance between two team players. This inspired the sketch in figure 5:
Fig. 5. Growing distance between two players
2. Using a continuous sound for ‘context’ and short sounds for an individual state or alert

Sound was used to communicate a sense of atmosphere. One participant in Finland made a continuous noise orally (i.e. “bla bla bla bla bla”), occasionally interrupted by sounds of yawning, to suggest boredom during a presentation, which the rest of the group understood. A social gaming participant produced an intense and continuous sound alert to indicate approaching danger. The sound (generated by beating a spoon inside a glass) became louder and more intense as it moved towards an imaginary player (represented by an object), signifying some kind of danger getting closer. This was also easily perceived by the rest of the group. In the third workshop, graduate designers experimented by combining techniques they were familiar with, such as sketching, with acting or sound creation on the fly, using a combination of sounds from the mixer. Sound was a good tool for communicating environmental cues. For example, the noise of traffic and a stampede of horses were played to represent crowd flow, while the designer drew the sketch in figure 6 to illustrate the flow of people in different directions during rush hour.
Fig. 6. Crowd flow during rush hour, accompanied by the combination of city sounds with the noise of a stampede
3. Using lively sounds and an open posture to indicate excitement

Open postures were used in all workshops to communicate positive affect, with cross-cultural similarities. One participant in the collaborative work workshop (Finland) held the tambourine up and shook it to show joy. Similarly, in the social gaming workshop (Lincoln) another participant used an open gesture and moving wrists to show celebration of victory in a game.
Fig. 7. An expression of celebration of success (left) and victory in a game. The middle sketch illustrates the same posture.
Similar representations of excitement emerged in the third, generative workshop, illustrated by shaking a pair of maracas. Others included making an exclamation mark from plasticine and drawing ‘emoticons’ (smileys), a rather common representation of joy. Excitement was also communicated through a juxtaposition of natural sounds – the sound of animals (monkeys) against a background of calm, environmental sound.

4. Size as a status indicator

In the social gaming workshop, the size of a figure was used in a drawing to suggest that a player holds higher status in a game.
Fig. 8. Size of figure represents hierarchical status; circle indicates one’s own team
5. Representations of private space

Participants’ attitudes to privacy issues were explored through discussion and card sorting in the first two workshops to identify different levels of privacy. The concept of personal or ‘communication-free’ space also emerged in the third, creative workshop, in which ideas on privacy and personal space were visualised in different ways: for example, by marking the space with a line made out of objects, or by creating an ‘isolation tank’ which completely disconnects all communication and external stimuli. The ‘tank’ was also sketched as an ‘isolation island’, a kind of mobile ‘cloud’ that protects the person from the intrusion of wireless communication when this is not desired. In some of the performances in the third workshop, a participant would try to engage a ‘stranger’ in conversation, for example by playing lively natural sounds (e.g. monkeys), making eye contact, pointing out objects, getting closer to the other person or drawing links between individuals to show connection. The other person would respond by trying to maintain his or her privacy, for example by hiding behind sunglasses or a book and moving further away. This performance also illustrated the idea of a state of ‘disconnection’ and of maintaining one’s privacy and personal space.
Fig. 9. Different representations of private space
6. Varying degrees of disagreement represented with ‘emoticons’ and gestures

In the third workshop, the design graduates created representations showing varying degrees of disagreement within a group discussion or disapproval of a person. For example, in the sketch in figure 10, gradual disagreement is represented by a ‘smiley’ that eventually stops smiling and responds with ‘abuse’. Another participant drew different icons for disagreement and then smashed a plasticine model of himself to show complete rejection and exclusion. An interesting representation through role play, which was fun to observe, was performed by a designer who pretended he was having a discussion with another participant (who had no idea what he was trying to communicate). He made gestures of anger, shaking his finger at the other person, and then hit his fist on the table fitted with the contact microphone, making a very loud sound, and ripped up a sheet of paper.
Fig. 10. A sketch of a meeting in which gradual disagreement results in abuse! Different indicators of disagreement
4 Conclusion

The combination of analytical and generative methods worked well: it initially immersed users in the ideas of the PASION project, helping to identify their needs and desires, and then engaged them in communicating those ideas in interesting ways that can be further explored and developed through a design process. The initial discussions introduced participants to the themes of collaboration, connection to other people and non-verbal communication. Asking participants to guess what the sketchy drawings meant was also a way of encouraging them to consider more abstract non-verbal representations of personal, environmental and collective states, and set the scene for the role play. By introducing the role-play activity as a fun game of ‘Charades’ and demonstrating an example, we shifted the focus from trying to be a good actor to trying to come up with interesting ideas. Role playing and experimentation with different media also opened up a range of creative possibilities for the participating graduate designers, enabling them to enrich initial ideas and bring them to life, from a one-line sentence written on a piece of paper to an engaging performance. Because the activity was not bound to a particular artefact or technology – unlike other uses of role play, where a user scenario is acted out to identify product functionalities or solutions to design problems – the generated representations were open to interpretation and diverse in their use of expressive media (actions, props, sound, drawing). In the future, we would like to see how this kind of improvisational role play can be applied in the exploratory design research phase for other innovative products and applications, not necessarily focused on non-verbal representations.
The concepts generated through the activities discussed in this paper demonstrate that improvisational role playing can be a powerful tool for both participants and designers: a) enabling participants to engage creatively in user-centred design workshops, and b) generating useful initial user input for the design process that can then be developed further for the design of easily interpretable and intuitive visualizations and interfaces. This method proved cost- and time-effective compared to other role-playing methodologies involving drama professionals as facilitators, in which some training in acting needed to be provided. Most importantly, the method generated valuable concepts and ideas for novel representations of socio-emotional and situational states, which became part of the core design process for the PASION project. These representations are currently being developed further through sketching, mock-ups for application concepts, and as user interface design elements that can be trialled with users.
Acknowledgements. The research is supported by the European Community under the Information Society Technologies (IST) programme of the 6th Framework Programme (FP6-2004-IST-4) – Project PASION (Psychologically Augmented Social Interaction Over Networks). The authors would like to thank all of our participants and Nela Brown (sound artist), who planned and arranged the setup for the sound experimentation in the third workshop.
References

1. Charades, Wikipedia definition and rules of play at http://en.wikipedia.org/wiki/Charades (last accessed 3/11/06)
2. Brandt, E., Grunnet, C.: Evoking the future: drama and props in user centered design. In: Cherkasky, T., Greenbaum, J., Mambrey, P. (eds.) Proceedings of the Participatory Design Conference, New York. CPSR (2000)
3. Buchenau, M., Fulton Suri, J.: Experience Prototyping. In: Proceedings of the DIS2000 Conference, pp. 424–433. ACM Press, New York (2000)
4. Holmquist, L.: Prototyping: generating ideas or cargo cult designs? Interactions 12(2), 48–54 (2005)
5. Kuutti, K., Iacucci, G., Iacucci, C.: Acting to Know: Improving Creativity in the Design of Mobile Services by Using Performances. In: Proceedings of the 4th Conference on Creativity & Cognition, Loughborough, UK (2002)
6. Laurel, B.: Design Research: Methods and Perspectives, pp. 49–55. The MIT Press, Cambridge, MA (2003)
7. Lindström, M., Ståhl, S., Höök, K., Sundström, P., Laaksolathi, J., Combetto, M., Taylor, A., Bresin, R.: Affective diary: designing for bodily expressiveness and self-reflection. In: CHI ’06 Extended Abstracts on Human Factors in Computing Systems, Montréal, Québec, Canada. ACM Press (2006)
8. Nold, C.: BioMapping Project. Available online at http://biomapping.net/press.htm (last accessed 25/01/06)
9. Novak, J.D.: Learning, Creating, and Using Knowledge: Concept maps as facilitative tools for schools and corporations. Lawrence Erlbaum Associates, Mahwah, NJ (1998)
10. Paulos, E.: Connexus: a communal interface. In: Proceedings of the 2003 Conference on Designing for User Experiences, pp. 1–4. ACM Press, San Francisco (2003)
11. Reimann, P., Kay, J.: Adaptive visualization of user models to support group coordination processes. Paper presented at the 2nd Joint Workshop of Cognition and Learning through Media-Communication for Advanced e-Learning, Tokyo, Japan (2005)
12. Salvador, T., Sato, S.: Playacting and Focus Troupe: Theater techniques for creating quick, intense, immersive, and engaging focus group sessions. Interactions 6(5), 35–41 (1999)
13. Svanaes, D., Seland, G.: Putting the users center stage: role playing and low-fi prototyping enable end users to design mobile systems. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’04, Vienna, Austria. ACM Press, New York (2004)
14. Vogiazou, Y.: Design for Emergence: Collaborative Social Play with Online and Location-Based Media. IOS Press, Amsterdam (2006)
Quantifying the Narration Board for Visualising Final Design Concepts by Interface Designers

Chui Yin Wong and Chee Weng Khong

Interface Design Department, Faculty of Creative Multimedia, Multimedia University, 63100 Cyberjaya, Malaysia
{cywong, cwkhong}@mmu.edu.my
Abstract. The narration board is a powerful design tool that helps translate user observation studies into a storytelling format. It helps communicate design values and ideas within the design team by visualising user scenarios in their proper context during the early design stages. This paper aims to discuss the narration board as a design tool to help the design team conceptualise and visualise user scenarios of interaction with future design concepts within their context of use. The second part of the paper discusses how narration boards assist interface designers in generating ideations and visualising final design concepts. Twenty design projects (N=20) were examined to study and quantify two factors: the components of the narration board in relation to the attributes of the final design concepts. A non-parametric correlation test was used to study the correlation coefficient between the scores of the two factors. The results show a statistically significant positive correlation between components of the narration board and attributes of the final design concept: projects with higher scores for narration-board components tended to produce better final design concepts, and vice versa.

Keywords: Narration, Interface Design, Storyboard, Design Concepts.
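As a brief illustration of the kind of analysis the abstract describes: a common non-parametric correlation is Spearman's rank correlation, which is the Pearson correlation computed on the rank vectors of the two score sets. The sketch below is ours, not the authors' code; the two score lists are invented placeholders standing in for the N=20 project scores, and the helper names are our own.

```python
def ranks(xs):
    """Average 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        # Extend j over the block of values tied with xs[order[i]].
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman_rho(x, y):
    """Spearman's rho = Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical 1-5 scores for 20 design projects (placeholder data only):
board_scores = [3, 5, 4, 2, 5, 3, 4, 1, 5, 2, 4, 3, 5, 2, 4, 1, 3, 5, 4, 2]
concept_scores = [2, 5, 4, 2, 4, 3, 4, 1, 5, 3, 3, 3, 5, 1, 4, 2, 3, 4, 4, 2]

rho = spearman_rho(board_scores, concept_scores)
print(round(rho, 3))  # a positive rho means the two score sets rise together
```

In practice a library routine (with its associated significance test) would be used instead of a hand-rolled version; the point here is only how rank-based correlation relates the two sets of scores.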
1 Introduction

In recent years, user studies have been highly advocated by the human factors and user interface design communities. Basically, such studies seek a greater understanding of users’ behaviour within its real context, and elicit user requirements for future products and services. In order to translate the data gathered from user studies throughout the design process, scenarios are usually presented as text-based descriptions that depict personas within social contexts. This, however, may not communicate the high-level product conceptualisation process well across the design team. It is imperative for interface designers to be able to convey ideation and concept designs to the design team effectively. Having a coherent design vision that can be shared across the design team is a challenge throughout the design process. In a design project, the team usually comprises multi-disciplinary members such as the project manager, designers, graphic artists, human factors specialists, user researchers, technical writers, software developers and engineers. In order for a

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 273–282, 2007. © Springer-Verlag Berlin Heidelberg 2007
multi-disciplinary team to share the same vision and theme on a design project, there needs to be a means of communicating high-level concept designs across the team. Hence narration, or storytelling, has become an important channel for depicting scenarios and sharing visions of design ideas and concepts in the design process. The objectives of this paper are twofold: firstly, to discuss the narration board as a design tool to help the design team conceptualise and visualise user scenarios of interaction with future design concepts within their context of use; secondly, to quantify the components of the narration (story) board in relation to the attributes of the final design concepts.
2 Storytelling and Scenarios

2.1 Rationale of Storytelling

User researchers or ethnographers conduct user studies to elicit user requirements during the early stages of the design process. This is meant to give a closer understanding of how users behave and interact with artefacts within the real environment. Such studies highlight social activities, trends and values, which are then analysed and incorporated in the scenario-building process to depict user personas in the context of use. Storytelling is perceived as an acceptable channel for sharing similar beliefs and thoughts within a community. In general, stories are more easily remembered by society than design principles, facts and figures. There are several reasons why stories are good communication catalysts for a design team [5], [11]:
• Stories communicate ideas holistically, conveying rich yet clear messages. Thus, they are an excellent way of communicating intricate ideas and concepts in an easy-to-understand format, and they allow people to convey tacit knowledge that might otherwise be difficult to articulate.
• Stories are easily remembered by people because they are circulated with human passion and emotion.
• Stories aid team-building by becoming a communication tool for sharing user activities, events and information that help construct a shared vision. They ease the communication flow by nurturing a sense of community and help to build relationships, especially in a multi-disciplinary design team.
• Storytelling provides the context in which knowledge arises as well as the knowledge itself, and hence can increase the likelihood of accurate and meaningful transfer of knowledge.

2.2 Adoption of Storytelling into Scenarios

Storytelling has been widely adopted in different disciplines, particularly in film, animation, education, design and business.
For instance, Walt Disney uses storyboards for creating motion pictures and animation characters in its film production process. In the business world, multi-national companies such as IBM, with its Knowledge Socialisation Project [6], use storytelling to share business visions within
Quantifying the Narration Board for Visualising Final Design Concepts
275
the organizations. Instructional designers may use storyboards to create learning objects for courseware design whilst developing educational systems. In design practice, storytelling has been used by designers to share conceptual design prototypes and design solutions across the design team. Stories and event scenarios are collected from observational fieldwork studies to share user behaviour, cultural beliefs and insights with the whole design team for design strategy. Stories are concrete accounts of particular people and events in particular situations; scenarios are often more abstract, being scripts of events that may leave out details of history, motivation and personality [5]. Despite the differences, storytelling and scenarios are intertwined, and it is often difficult to distinguish a design story from a user-interaction scenario. In the user requirements stage, user researchers collect user stories and observational information from fieldwork studies. Observational data is then analysed and translated into various themes and consumer insights. This helps to create realistic examples and build scenarios as shared stories in the design team. User profiles, characters and goals form personas in the scenario-building process. Cooper [4] first proposed the concept of the persona, and it has been widely applied in academic and industrial practice and integrated in various design projects. In essence, a persona is an archetypal person representing a user profile, whereas scenarios describe how a person interacts with the product in the context of use. As stories are easily memorised by people, the medium used to present them is crucial in making the stories memorable and ensuring the shared visions are comprehended within the design team. Rosson and Carroll [10] described user-interaction scenarios as comprising a setting, actors, task goals, plans, evaluation, actions and events.
However, design scenario activities are conventionally illustrated in text-based descriptions, embedding the characteristic elements of user-interaction scenarios. The next section therefore describes how narrative scenarios are illustrated in pictorial form to conceptualise high-level user-interaction scenarios.
3 Narration in Context

3.1 Narration in the Design Process

Narration has been used and applied in different phases of the design process. Lelie [8] described the value of storyboards in the product design process, using the term "storyboard" instead of narration board. In each phase of the design process, the storyboard has its own style of exploring, developing, discussing or presenting the product-user interaction. The design process ranges from the analysis, synthesis, simulation and evaluation phases to the decision phase, and the visualisation style differs in relation to the design activities, purpose/goals and representation form of each phase [8]. In our context, we discuss how narration boards are used in the interface design process during the early conceptual design stages for ideation purposes. Figure 1 shows the detailed requirements of the conceptual design phase for interface designers. Two types of narration board are adopted: the Narration Board (pre-ideation) and the Narration Board (post-ideation). For the Narration Board (pre-ideation),
276
C.Y. Wong and C.W. Khong
interface designers are required to translate the results of observation studies and market research into problem scenarios highlighting the problems or issues that users face in the real environment. Design aids such as mood boards, product design specifications and product positioning are also developed to assist designers in achieving a holistic grasp of the concept designs being developed. The interface designers are then required to produce another Narration Board (post-ideation) to project how their concept designs will be used in future scenarios.

Research – User Studies – Ideation/Conceptual Design – Prototype – (Re)evaluate

Fig. 1. Brief Conceptual Design Phase
In the realm of interface design, communication between designers and other team members is important for a successful design project. The narration board is a valuable design tool for the design team, as it provides a common visual-based medium for sharing an understanding of future design developments. Conventionally, scenarios are illustrated in textual descriptions to portray user-interaction scenarios [10]. For designers, visual-based media are important in 'visualising' and developing ideations for future design solutions. In such circumstances, scenarios described in visual forms accompanied by text explanations serve the communication purpose within the design team. Moreover, visual-based narrative is a valuable aid in provoking the thinking process, evoking ideations and spurring creativity to higher levels for interface designers. Several types of media have been used to illustrate narration or storytelling in either analogue or digital format, such as hand drawing, sketching, photography and video [2], [8]. Some software tools have also been developed for storytelling, such as DEMAIS [1] and Comic Book Creator™ [3]. In developing narration boards, the interface designers are required to consider the characteristics of the user personas, scenarios and context of use. They are able to select any medium of communication to illustrate the narrative scenarios. Due to time and cost considerations, hand sketching, marker rendering and drawing on layout pads are the most cost-effective ways. The designers then scan their narrative scenarios into digital formats, which can be posted online for sharing purposes. Alternatively, the interface designers can work from the photographs they captured during their observation studies using graphical software such as Adobe Flash™, Adobe Photoshop™ or Comic Book Creator™.
3.2 Types of Narration Boards

Narration boards also play an important role in bridging the communication gap between the design team and other corporate departments, such as top management, the manufacturing department and the clients themselves. Top management and clients usually do not have ample time to go through the detailed design levels. Hence, the narration board assists in projecting the problem scenarios of the user experience. This is illustrated in the Narration Board (pre-ideation) (figure 2). On the
Fig. 2. An example of a Narration Board (pre-ideation) depicting a scenario of a primary school pupil who is robbed on the way home from school
Fig. 3. An example of a Narration Board (post-ideation) illustrating a scenario of how E-Hovx plays a role in protecting the primary school pupil from a potential robbery
other hand, top management and clients are able to grasp the design solutions from an illustration of how the intended users interact with the new product design concepts or design solutions in future scenarios, as demonstrated in the Narration Board (post-ideation) (figure 3). The E-Hovx project example depicts a scenario faced by a primary school pupil who encounters danger as he is robbed on his way home from school (figure 2). Figure 3 shows how the E-Hovx device concept assists in this scenario by producing an alarm to alert the pupil and ward off any potential harm.
4 Evaluating Narration Board for Visualising Final Design Concept

4.1 Methodology

To evaluate how effective the narration board (pre-ideation) is as a design tool in helping interface designers generate ideations and visualise final design concepts, an empirical study was conducted by a usability specialist to examine the relation between the two variables of narration board and final design concept. The study examined twenty (20) different design projects developed by interface designers as test subjects (sample size N = 20) at the Interface Design Department. Based on the above description, the null hypothesis (H0) is that 'there is no relation between narration board and final design concept'. The alternative hypothesis (H1) is that 'there is a positive association between the narration boards (pre-ideation) and the final design concepts for a design project'. To produce a successful narration board, there are certain elements to be highlighted by the designers. Truong et al. [12] highlighted five significant elements for a narration board to convey its narrative across to the design team: level of detail, inclusion of text, inclusion of people and emotions, number of frames and portrayal of time. On the other hand, there are also five attributes that determine how usable and functional the final design concepts derived from the narration board input are. These five attributes of generating final design concepts in the later conceptual design stage are form and functionality, usability (ease of use), user-artefact illustration, product semantics, and design appeal (emotion and mood). This study looks at 20 design projects (DP) developed by interface designers addressing a common theme of 'i-Companion'. The DPs were selected based on the inclusion of a narration board (pre-ideation) and a final design concept in the design process.
To quantify the effectiveness of the narration board, the usability specialist assigned scores on a 5-point Likert scale (1 = least applicable, 5 = most applicable) for the elements of the narration board: level of detail, inclusion of text, inclusion of people and emotions, number of frames, and portrayal of time. A final score was then given for each of the 20 DPs as the sum of the five narration board element scores. Conversely, to evaluate the output of the final design concepts, the final design concept scores were calculated as the total sum of the 5 attributes, i.e. form and
functionality, usability, user-artefact illustration, product semantics, and design appeal (emotion and mood), for each of the 20 DPs.

4.2 Results, Data Analysis and Discussion

Result. The table below (table 1) summarises the final scores of the narration boards and the final design concepts for the 20 DPs.

Table 1. A summary of the final scores on Narration Board and Final Design Concepts for 20 Design Projects

Design Project (DP)    Narration Board Scores (N)    Final Design Concept Scores (C)
DP 1                   16                            13
DP 2                   17                            18
DP 3                   12                            12
DP 4                   10                            9
DP 5                   12                            7
DP 6                   8                             7
DP 7                   19                            12
DP 8                   13                            7
DP 9                   10                            10
DP 10                  12                            13
DP 11                  18                            21
DP 12                  12                            8
DP 13                  9                             5
DP 14                  5                             5
DP 15                  12                            15
DP 16                  12                            11
DP 17                  15                            15
DP 18                  14                            16
DP 19                  17                            16
DP 20                  16                            17
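As a sanity check on the correlation analysis reported below, Spearman's rho can be recomputed directly from the scores in Table 1. A minimal pure-Python sketch (the helper names are ours, not from the paper; mid-ranks are used for tied scores):

```python
# Recompute Spearman's rho for the 20 design projects.
# Score lists transcribed from Table 1 (N = narration board, C = final design concept).

def ranks(xs):
    """Average (mid) ranks, 1-based, with ties sharing their mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1  # extend the run of tied values
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(xs, ys):
    """Spearman's rho = Pearson correlation of the rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

N = [16, 17, 12, 10, 12, 8, 19, 13, 10, 12, 18, 12, 9, 5, 12, 12, 15, 14, 17, 16]
C = [13, 18, 12, 9, 7, 7, 12, 7, 10, 13, 21, 8, 5, 5, 15, 11, 15, 16, 16, 17]

print(round(spearman(N, C), 2))
```

Run on the Table 1 scores, this reproduces the rho of approximately 0.78 reported in the analysis.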
Data Analysis. To examine the relation between the two variables (narration board and final design concept), the non-parametric Spearman's rho test was conducted to study the correlation coefficient for the sample size (N) of 20. The table below (table 2) shows the correlation matrix of the two variables (scores of narration board and final design concept). From this matrix, there is a statistically significant positive correlation between the narration board and final design concept scores (rho = 0.78, df = 18, p < 0.05).

i > j (or i < j) in the recording table of paired comparison. The selected interface is noted 1, while the non-selected interface is noted 0.

Table 2. Table of Kendall Coefficient Calculation
Interface No.    A    B    C    D    E    F    G
A                -    4    1    1    3    9    5
B                     -    2    1    5    9    5
C                          -    9    1    1    0
D                               -    1    0    1
E                                    -    9    8
F                                         -    3
G                                              -
3.2.2 In-Complete Matching (Name Matching) Task

The in-complete matching method is used to assess the identification of products. It requires users to select a product name for the login interface of each product, and then the accuracy of the interface-product match (AM) is calculated. AM is the number of interfaces selected correctly by users divided by the total number of interfaces to be selected, that is, AM = AI (accurate interfaces) / TI (total interfaces). If the matching task gets a high score, we consider the product to have high identification.
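The AM measure is a simple proportion, which can be sketched as follows (the function name and example figures are illustrative, not taken from the study):

```python
def accuracy_of_match(accurate_interfaces: int, total_interfaces: int) -> float:
    """AM = AI / TI: fraction of login interfaces matched to the correct product name."""
    return accurate_interfaces / total_interfaces

# A user who names 5 of 7 assessed login interfaces correctly:
print(accuracy_of_match(5, 7))  # prints 0.7142857142857143
```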
Fig. 3. Depiction of In-complete Matching Task
4 Experiment Preparation

Screening of users was conducted by telephone. Combining this with the server registration information, we made quick telephone interviews and invited users to participate in our activities, requiring users to recall the relevant QQ products they had used and give a brief description of each product in order to ensure that the users met our requirements. 16 users in total participated in this experiment, among which were 10 men and 6 women; 13 users were aged 21-30 years and 3 users were aged 20 years or less. To
The Experimental Approaches of Assessing the Consistency of User Interface
425
ensure the users' familiarity with the QQ businesses, we selected users who had used at least five QQ businesses; only a few users had used fewer than five. Seven login interfaces of the typical businesses and three login interfaces of other instant messaging software, including MSN, Yahoo Messenger and Popo, were selected for the complete matching method. These interfaces were all processed in advance to remove the names and icons of the relevant products.
Fig. 4. Depiction of Interface Modifying of In-complete Matching Task (login interface before modifying; login interface after modifying)
5 Results

The average accuracy of the in-complete matching task across the 16 users is 61%. This result shows that the identification of the seven assessed interfaces is high. However, the Kendall's u coefficient calculated from the paired comparison task is approximately 0.28: there is a certain trend in the distribution of the users' options, but the trend is not significant. Therefore, we can conclude that the overall consistency of the 10 assessed interfaces is common. In the calculation process, we also found that the options of some pairs show a concentrating trend, as seen in the following table.

Table 3. Table of Paired Comparison Data
Interface No.    A    B    C    D    E    F    G
A                -    4    1    1    3    9    5
B                     -    2    1    5    9    5
C                          -    9   12   14   14
D                               -    4   14   12
E                                    -    9    8
F                                         -    3
G                                              -
426
Y. Chen et al.
A-G are the codes of the seven interfaces assessed. Among them, option C in the CE, CF and CG pairs is hardly selected; the selected times are respectively 4 (16 times in total minus the 12 times E was selected), 2 (16 minus the 14 times F was selected) and 2 (16 minus the 14 times G was selected). In the interviews, we learned that the colors of the interfaces have a great impact on the users' options. The color of interface C differs from the colors of all the other interfaces, which is why the options involving C show a concentrating trend. Likewise, in the DF and DG pairs, D is selected very few times: respectively 2 (16 minus the 14 times F was selected) and 4 (16 minus the 12 times G was selected). Through the interviews, we also learned that the sizes of the interfaces greatly influence the users' options. The size of interface D is very different from the sizes of interfaces F and G. Therefore, interface size is the main cause of this option concentration.
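The reported value u ≈ 0.28 can be reproduced from the Table 3 counts with Kendall and Babington Smith's coefficient of agreement for paired comparisons, assuming each of the 16 users judged every one of the 21 pairs; a sketch in Python:

```python
from math import comb

# Preference counts a_ij for each unordered pair of the seven interfaces
# (upper triangle of Table 3); each of the m = 16 users chose one interface
# per pair, so the complementary cell is m - a_ij.
m = 16
counts = [4, 1, 1, 3, 9, 5,   # AB AC AD AE AF AG
          2, 1, 5, 9, 5,      # BC BD BE BF BG
          9, 12, 14, 14,      # CD CE CF CG
          4, 14, 12,          # DE DF DG
          9, 8,               # EF EG
          3]                  # FG
n = 7  # number of interfaces

# Kendall and Babington Smith's coefficient of agreement:
#   u = 2 * sum(C(a, 2) + C(m - a, 2)) / (C(m, 2) * C(n, 2)) - 1
# u = 1 for complete agreement; u = -1/(m - 1) for random choices (even m).
sigma = sum(comb(a, 2) + comb(m - a, 2) for a in counts)
u = 2 * sigma / (comb(m, 2) * comb(n, 2)) - 1
print(round(u, 2))
```

With the Table 3 counts this yields u ≈ 0.28, matching the value reported in the Results section.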
6 Conclusion

We can conclude that the identification of the products is high from the in-complete matching task, and that the consistency among the products is low from the paired comparison method. Based on the list corresponding to the above study premises, we arrive at the following consistency description of the eight interfaces assessed, as shown in the table below.

Identification: high
Consistency among products: low
Complete consistency: common
Description: The complete consistency of the product experience is common; the products have high identification while the consistency among different products is low. Possible reason: some products have high identification, and sub-products may exist.
The eight interfaces assessed in the paired comparison task, including the QQ2006 interface as a standard, all have high overall identification, while the consistency among the products is low. Large differences in the design elements of some products, such as the size, color and icon of the interface, may be the cause of the rise in accuracy
in the matching task, while the lower consistency among the products supports the possibility that design uniqueness exists in some products. There are many possible causes of this uniqueness, and sub-products may be one of them.
7 Discussion

In this study, it was found that the in-complete matching method and the method of paired comparison can rapidly assess the consistency performance of products, and that an increase in the number of products does not affect the experimental progress; these two methods are therefore well suited to situations with a larger quantity and more categories of products. However, the study only involves the login interface, with a single set of design elements, and the wider applicability of the methods therefore needs further study. We plan to add follow-up user interviews in subsequent studies to explore the specific factors that affect consistency. Increasing the number of users and of experimental groups for comparison could also further verify the experimental conclusions.
References

[1] Nielsen, J.: Coordinating User Interfaces for Consistency. Academic Press, Boston, MA (1989), pp. 35-55
[2] Preece, J., Rogers, Y., Sharp, H.: Interaction Design: Beyond Human-Computer Interaction. John Wiley & Sons, Inc., New York (2002)
[3] Shneiderman, B.: Designing the User Interface: Strategies for Effective Human-Computer Interaction. Pearson Education (1998)
Evaluating Usability Improvements by Combining Visual and Audio Modalities in the Interface Carlos Duarte, Luís Carriço, and Nuno Guimarães LaSIGE – Faculty of Sciences of the University of Lisbon Edifício C6, Piso 3, Campo Grande 1749-016 Lisboa, Portugal {cad,lmc,nmg}@di.fc.ul.pt
Abstract. This paper reports the findings of an evaluation of an adaptive multimodal application for reading rich digital talking books. Results are in accordance with previous studies, indicating no user-perceived difference between applications with and without adaptivity. The NASA Task Load Index was also used and showed that users of the adaptive application reported less workload. Results also include a comparison between tasks executed with electronic support and tasks executed with print support, as well as which specific features of the interface benefited the most from the use of visual and audio modalities. Keywords: Evaluation, Adaptive Interfaces, Multimodal Interfaces, Electronic and Print Reading, Digital Talking Books.
1 Introduction

In today's ever-changing interaction scenarios, alternative modalities to the predominant visual channel will have to be explored in order to cope with the new challenges. For interfaces to be flexible enough to adapt to these requirements, other modalities will have to be employed, either by themselves or in combination with visual interaction. Balancing the load between modalities, or even switching to different modalities, in order to better adapt the interaction characteristics to the information being presented, can decrease the cognitive effort required [1]. This interaction adaptation may improve usability for the average user, and will certainly improve accessibility for people with disabilities. One of the most promising modalities for complementing or replacing visual interaction is audio. For the average user, sounds and speech are natural input and output modalities. For visually impaired users they are, perhaps, the most important input and output modalities. Research has shown that audio, used either as input or output, can provide solutions to specific interaction problems [2], but also that audio is more suited to specific tasks, while other tasks are better accomplished using other modalities [3]. From these results we can expect that audio-related benefits can be optimized when combined with other modalities.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 428–437, 2007. © Springer-Verlag Berlin Heidelberg 2007

Speech recognizers and voice synthesizers,
whose performance has increased over the years, are beginning to be deployed in general public applications, meaning more and more users have had contact with some kind of speech technology. However, most of these applications, like call centers, rely solely on audio. The combined use of two modalities remains out of the general public's reach. In this paper we explore usability issues in an application using visual and audio input and output modalities. The following section briefly introduces the application used in the evaluation sessions. The next section describes the experimental setting and procedures. This is followed by the presentation of the evaluation results. Section 5 discusses the results, and the final section concludes the paper and presents future work.
2 Rich Book Player – An Adaptive Multimodal Digital Talking Book Player

The application used in the usability evaluation was the Rich Book Player, an adaptive multimodal Digital Talking Book player [4]. This player can present book content visually and audibly, in an independent or synchronized fashion. The audio presentation can be based on previously recorded narrations or on synthesized speech. The player also supports user annotations and the presentation of accompanying media, like other sounds and images. In addition to keyboard and mouse input, speech recognition is also supported. Due to the adaptive nature of the player, the use of each modality can be enabled or disabled during the reading experience. Figure 1 shows the visual interface of the Rich Book Player. All the main presentation components are visible in the figure: the book's main content, the table of contents, the figures panel and the annotations panel. Their arrangement (size and position) can be changed by the reader, or as a result of the player's adaptation. The other visual component, not present in figure 1, is the search panel. Highlights are used in the main content to indicate the presence of annotated text and of text referencing images. The table of contents, figures and annotations panels can be shown or hidden. This decision can be taken by the user or by the system, with the system behavior adapting to the user's behavior through its adaptation mechanisms. Whenever there is a figure or an annotation to present and the corresponding panel is hidden, the system may choose to present it immediately or may choose to warn the user of its presence. The warnings are given in both visual and audio modalities. All the visual interaction components have a corresponding audio interaction element, with one exception: since the speech recognizer currently used in the player¹ does not support free speech recognition, annotations have to be entered by means of a keyboard.
All the other commands can be given using either the visual elements or vocal commands.

¹ This applies to the Portuguese version of the player, which was the one used in the usability study.
Fig. 1. The Rich Book Player’s interface. The center window presents the book’s main content. On the top left is the table of contents. On the bottom left is the annotations panel. On the right is the figures panel.
3 Experimental Setting

The usability evaluation was carried out in the context of an article-reviewing assignment for a Hypermedia Systems course. The students had several such assignments over the semester, which consisted of preparing a summary and an oral presentation of a given article. The summary and the oral presentation were group tasks, typically done over a two-week period. With the students' agreement, it was decided that one of those assignments would be done with support from the Rich Book Player, over a one-day period. The assignment consisted of reading the article "The Dexter Hypermedia" individually during the morning period, and preparing a group summary and answering a short test during the afternoon. Over a period of four days, thirty-three students participated in the evaluation: six on the first day, and nine on each of the other days. Given the number of simultaneous participants and the length of each session, the experiment was not conducted in our regular usability evaluation laboratory; instead, a special setting was prepared in another room. The room was set up with nine test stations. Each station consisted of a laptop computer with a larger screen, mouse, headphones, microphone and webcam attached to it. The Rich Book Player application was available in all stations. The application was endowed with logging capabilities, thus recording all interaction with the participants. The stations also did screen recording, voice input recording, and webcam recording, thus allowing for a full backup of the experiment. In addition to the stations, two digital video cameras recorded other aspects of the interaction.
The experiment was divided into two periods. The morning period started with 30 minutes for application familiarization, followed by 120 minutes for article reading, and ended with a usability questionnaire. The afternoon period was composed of a 75-minute session for summary preparation, 30 minutes for answering a short test without access to the article, 30 minutes for the same test with access to the article, and, finally, another questionnaire. For the summary preparation task, the annotations of all the group's members were merged, and the group worked on only one station. In order to evaluate the effects of using a multimodal application on the task of reading an article, the students were divided into two major groups. The control group read the article printed on paper, and the test group read the article using the Rich Book Player. In order to investigate the effects of adaptation, the test group was further divided into two groups: one group with some of the adaptation features turned off, and the other group with all the adaptation features on. In total, the control group comprised nine members, and the other two groups twelve members each. To reduce the effect of extraneous variables, the following controls were applied:

• The tasks were the same for each participant.
• The tasks had the same time constraints for all participants. The questionnaires were answered immediately after task completion.
• All test stations were equipped with laptop PCs of the same model (Sony VAIO TX3) and external monitors with the same dimensions. All stations were configured to use the same screen resolution, operating system version, applications and desktop configuration.
4 Evaluation Results

The experiment results consist of qualitative data, gathered from the different questionnaires answered by the participants, and quantitative data, gathered from the logs and the screen and video captures. In this paper we present and analyze the results from the qualitative data. Three sets of questionnaires were answered during the experiment by the participants from both test groups, and one set only by participants from the control group.

4.1 NASA Task Load Index

The first questionnaire administered to the participants was the NASA Task Load Index (NASA-TLX) [5]. All the participants answered this questionnaire, since it focused on the task, not the application. The questionnaire was presented to the participants immediately after the completion of the article reading task. The NASA-TLX is a subjective workload assessment measure: a multi-dimensional rating procedure that derives an overall workload score based on a weighted average of ratings on six subscales: Mental Demands, Physical Demands, Temporal Demands, Own Performance, Effort and Frustration. The NASA-TLX was used in this experiment with the main goal of finding a difference between the scores of participants in the adaptive and non-adaptive groups,
and between these groups and the control group. Previous findings [6,7] show users do not perceive advantages in using adaptive interfaces over non-adaptive interfaces. Using a subjective workload assessment measure might reveal a difference not directly perceived by the participants, leading to the following hypothesis:

H1: Performing the article reading task with the adaptive application, the non-adaptive application, or with a paper article, will result in different perceived workload measures.

Measures were collected for all participants (12 in the adaptive group, 12 in the non-adaptive group and 9 in the control group). A one-way ANOVA test revealed that the perceived workload of users of the adaptive application (M = 53.30, SD = 14.27), users of the non-adaptive application (M = 57.11, SD = 13.45), and users with only a paper article (M = 57.56, SD = 14.79) did not differ significantly, F(2, 30) = 0.31, p > 0.05. The statistical analysis does not support hypothesis H1, meaning that the perceived workloads do not differ significantly based on the support used for reading the article.

4.2 Usability Questionnaire

Following the NASA-TLX, participants in the adaptive and non-adaptive groups were asked to answer a second questionnaire. This 26-question questionnaire focused on feature usefulness and application usability, and was organized in the following groups: Navigation, Annotations, Images, Search, Adaptation (only for the adaptive application group), Presentation, Interaction and General Opinion. All the questions were answered on a 10-point scale. General Opinion was measured with three questions evaluating the participants' opinion of and reaction to the application (figure 2). The correlation between the answers to the three questions was calculated, and all three proved to be significantly correlated (p < 0.001).
Given this significant correlation, it was possible to reach a single measure of opinion by adding each participant's answers to the three questions. In accordance with what has been presented before, no significant difference was expected between the two groups, which led to the following hypothesis:

H2: The general opinion of users of the adaptive application is similar to the general opinion of users of the non-adaptive application.

To evaluate this hypothesis, a t-test was performed on the data, showing that the opinion of people in the adaptive group (M = 18.92, SD = 6.05) was not significantly different from the opinion of the non-adaptive group (M = 17.33, SD = 5.02), t(22) = 0.70, p > 0.05. For each of the other question groups in the questionnaire, t-tests were applied to the usability-related questions, in order to understand how the use of multimodal output (visual and audio combined) contributed to the overall usability of the application. In the following paragraphs, all reported t-tests take into consideration the necessary Bonferroni adjustment.
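Since a one-way ANOVA F statistic is fully determined by the group means, standard deviations and sizes, the F(2, 30) = 0.31 reported in Sect. 4.1 can be cross-checked from the summary statistics alone; a sketch in Python (values copied from the text above):

```python
# Cross-check the one-way ANOVA F from summary statistics:
#   F = MS_between / MS_within.
groups = [  # (n, mean, sd) for adaptive, non-adaptive, control
    (12, 53.30, 14.27),
    (12, 57.11, 13.45),
    (9,  57.56, 14.79),
]

N = sum(n for n, _, _ in groups)
grand_mean = sum(n * m for n, m, _ in groups) / N

# Between-group sum of squares from the group means,
# within-group sum of squares from the group SDs.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)

df_between = len(groups) - 1   # 2
df_within = N - len(groups)    # 30
F = (ss_between / df_between) / (ss_within / df_within)
print(round(F, 2))
```

This reproduces the reported F(2, 30) = 0.31.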
Fig. 2. Average of the answers per participant group to the three criteria on the General Opinion group of the usability questionnaire
Regarding navigation in the Rich Book Player, several features were offered, including navigation using the table of contents, going forward or backward a word, sentence, paragraph or chapter, and direct selection in the main window content. The results indicate that the available mechanisms were considered usable, t(23) = 10.79, p < 0.001.

Annotation creation is one of the most difficult mechanisms to implement. Previous evaluations showed this [8], prompting an alteration of the steps necessary to create an annotation. The procedure was redesigned to make explicit the need to first select the part of the text being annotated, and only after that step input the annotation. Better support for text selection was also developed, including an initial suggestion of the current sentence and simple commands to expand this selection. However, neither the sequence of commands to create an annotation, t(23) = 1.79, p > 0.05, nor the commands for helping with text selection, t(23) = 2.00, p > 0.05, reached statistical significance, meaning test participants did not consider them particularly usable.

Search results appear highlighted in the text. To improve context acquisition, the whole sentence containing the search term is also highlighted in a different color (lighter than the one used for the searched terms). This feature was considered to improve usability, t(21) = 4.21, p < 0.05.

The application also tried to minimize the movement of the main text windows whenever another window appeared or disappeared from the screen, by controlling the appearance point, the width of the windows, and the position of the remaining windows whenever a window was hidden. This feature was considered useful by the test participants, t(23) = 4.20, p < 0.05. On an overall interaction rating, the Rich Book Player was considered usable by the participants, t(23) = 7.05, p < 0.001.
434
C. Duarte, L. Carriço, and N. Guimarães
The awareness-raising mechanisms made special use of the two available modalities, displaying text which had been annotated, or had an image associated with it, in different background colors, and also using verbal cues to signal the presence of such text. The current chapter was also highlighted in the table of contents, and on arriving at a new chapter, verbal cues indicated its number and name (whenever applicable). A series of questions concerned these features and tried to evaluate whether they helped users become aware of their place in the book, and of what content existed around their current reading point. All the answers showed these features to be usable and effective awareness-raising mechanisms, p < 0.05.

4.3 Comparing Electronic and Paper Reading

The final questionnaire, presented after the group summary writing task, asked the participants from the adaptive and non-adaptive groups to compare their experience of reading an article with the Rich Book Player application to that of reading printed articles. A questionnaire with eight questions comparing different aspects of the reading experience was prepared. Answers were given on a 5-point Likert scale. Once again, all the t-test results presented in the following paragraphs take into account the necessary Bonferroni adjustment.

The first question compared navigation in the electronic format to the printed format. The average of the answers was 3.79, and a t-test revealed that participants felt navigation in the electronic format was significantly easier than in the printed format, t(23) = 4.98, p < 0.05. The next question compared searching in both formats. The answers averaged 3.96, and a t-test confirmed that participants felt that finding text in the electronic format is significantly easier than in the paper format, t(23) = 4.7, p < 0.05. The two following questions dealt with annotation creation and annotation reading. Neither of these showed statistically significant results.
Answers for ease of annotation creation averaged 3.00, while those for annotation reading averaged 3.46. The next question dealt with how easy it was to acquire the context of an image in both formats. Once again the result is not statistically significant, even though the average answer, 3.21, is above the scale's midpoint. Questions six and seven asked in which format users felt it was quicker to read, and easier to understand, the article's contents. The average for the first was 3.04, and for the second 3.13, with both failing to reach statistical significance. The last question asked which format was less tiring for reading the article. The average answer was 3.08, not reaching statistical significance.
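The exact form of these tests is not stated in the paper; given t(23) with 24 participants, one plausible reading is a one-sample t statistic comparing the answers against the scale midpoint (3). A minimal sketch under that assumption, with invented answers:

```python
import math

def one_sample_t(sample, mu):
    """t statistic for H0: the population mean equals mu."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    return (mean - mu) / math.sqrt(var / n)

# Hypothetical 5-point Likert answers to one comparison question
answers = [4, 5, 3, 4, 4, 3, 5, 4]
t = one_sample_t(answers, mu=3)  # positive t: electronic preferred
print(round(t, 2))
```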
5 Discussion

The analysis of the experimental results conducted so far allows drawing some conclusions regarding the usage of a digital book player endowed with multimedia and adaptive features: the comparison of an application with adaptive features turned on and off, the comparison of performing a task with electronic or printed support, and the improvements in usability gained from combining two modalities (in this case, visual and audio).
5.1 Adaptive Versus Non-adaptive Applications

When evaluating adaptive systems, additional problems have to be dealt with, in comparison to the evaluation of non-adaptive systems:

• The definition of a control group is difficult for systems that cannot switch off the adaptivity to produce a non-adaptive version, because it is an inherent feature of the system [9].
• Criteria for defining adaptivity success are not well established. On the one hand, objective standard criteria have regularly failed to find a difference between adaptive and non-adaptive versions of a system. On the other hand, subjective criteria, standard in HCI research, have rarely been applied to the evaluation of adaptive systems [10].
• The effects of adaptivity in most systems are expected to be rather subtle in comparison to what may be expected from individual differences, and thus require precise measurements, potentially taking into account behavioral and cognitive aspects of the users [11].

This study tried to deal with some of these aspects. By having some of the participants work with a print version of the article, it was possible to define a control group applicable to both adaptive and non-adaptive versions of the application. Furthermore, it was possible to turn off some of the application's adaptive features without rendering it unusable, enabling a comparison between two versions of the application. The study also tried to establish a comparison between adaptive and non-adaptive versions of the same application using different subjective measures. The results, however, are in accordance with previous results in the literature, indicating no significant perceived differences between the adaptive and non-adaptive versions of the application, even though the opinion of the participants who worked with the adaptive version was, on average, higher than that of the participants who worked with the non-adaptive version.
The same can be said about the perceived workload measured by the NASA TLX, where, once again, no statistical significance was found in the results. In this case, the comparison extended to the participants working with the print version, who achieved scores very similar to those of the participants working with the non-adaptive version of the application. The participants in the adaptive application group achieved lower scores on the NASA TLX, indicating a lower perceived workload. The difference was not large enough to be statistically significant, but it justifies further studies to investigate whether this indicator can identify a difference between adaptive and non-adaptive applications.

5.2 Reading in Electronic and Print Supports

Another aspect evaluated in this study was the participants' opinion regarding the task of reading an article using an electronic medium offering multimodal output, compared to reading printed works. A somewhat surprising result was that the average answer to every question was above or equal to the 5-point Likert scale's midpoint, meaning that no task was perceived as more difficult to perform in the electronic medium than in the printed medium. This
was the expected result for some tasks, like searching, but not for others, like annotation creation. However, only two tasks were significantly easier to perform with the Rich Book Player than with printed articles: navigating and searching. While this was an expected result for searching tasks, given the digital support's advantage, it is worth mentioning that navigation tasks also achieved the same level in the participants' opinion. This is probably explained by the vast possibilities the application offers for navigation, allowing users to navigate to any point with ease.

5.3 Improvements from Multimodality

Multimodal output is used throughout the application: content is presented visually and aurally, awareness-raising mechanisms combine both modalities, and the reading position is also presented in both modalities. Usability questionnaires assessed how the use of multimodality impacted the participants' opinion of the application. The results show that combining visual and audio led to improvements not felt in other areas of the interaction, where the modalities were not used in combination. This was particularly felt in the participants' opinion of the usability of the awareness-raising mechanisms.
6 Conclusions and Future Work

This paper presented the results of an evaluation of an adaptive multimodal Rich Digital Talking Book player. This player combines visual and audio modalities, both for input and output, and is also endowed with adaptive capabilities, leading to adaptation of the interface's behavior in response to changes in the user's behavior. The evaluation experiment involved 33 participants, arranged in three groups: an adaptive application group, a non-adaptive application group, and a control group which worked with printed texts.

Evaluation results confirmed no perceived differences between adaptive and non-adaptive applications. However, when considering the NASA Task Load Index, the perceived workload was lower for the adaptive application group. This result did not reach statistical significance, but nevertheless prompts the need for further experiments. When comparing tasks performed with the Rich Book Player and tasks performed with printed texts, the participants' general feeling was that it was easier to perform tasks with electronic support. While for some tasks (e.g. searching) this was expected, for others it was somewhat surprising. The use of multimodality also proved beneficial from the usability viewpoint, particularly for implementing awareness-raising mechanisms.

To gather further results that may shed some light on the effects of long-term usage of an adaptive application, another experiment is currently underway, in which participants have the Rich Book Player at their disposal in their home environment for a period of two months.
References

1. Kalyuga, S., Chandler, P., Sweller, J.: Managing split-attention and redundancy in multimedia instruction. Applied Cognitive Psychology 13(4), 351–371 (1999)
2. Sawhney, N., Schmandt, C.: Nomadic radio: speech and audio interaction for contextual messaging in nomadic environments. ACM Transactions on Computer-Human Interaction 7(3), 353–383 (2000)
3. Oviatt, S., Coulston, R., Lunsford, R.: When Do We Interact Multimodally?: Cognitive Load and Multimodal Communication Patterns. In: Proceedings of the 6th International Conference on Multimodal Interfaces, State College, PA, USA, pp. 129–136. ACM Press, New York (2004)
4. Duarte, C., Carriço, L.: A conceptual framework for developing adaptive multimodal applications. In: Proceedings of the 11th International Conference on Intelligent User Interfaces, Sydney, Australia, pp. 132–139. ACM Press, New York (2006)
5. Hart, S.G., Staveland, L.E.: Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In: Hancock, P.A., Meshkati, N. (eds.) Human Mental Workload, pp. 139–183. North-Holland, Amsterdam (1988)
6. Höök, K.: Evaluating the utility and usability of an adaptive hypermedia system. In: Moore, J., Edmonds, E., Puerta, A. (eds.) Proceedings of the 2nd International Conference on Intelligent User Interfaces, Orlando, Florida, USA, pp. 179–186. ACM Press, New York (1997)
7. Weibelzahl, S.: Evaluation of Adaptive Systems. PhD Dissertation. University of Trier, Germany (2003)
8. Duarte, C., Chambel, T., Simões, H., Carriço, L., Santos, E., Francisco, G., Neves, S., Rua, A.C., Robalo, J., Fernandes, T.: Avaliação de Interfaces Multimodais para Livros Falados Digitais com foco Não Visual. In: Proceedings of the 2nd Conferência Nacional em Interacção Pessoa-Máquina, Braga, Portugal (2006)
9. Höök, K.: Steps to take before intelligent user interfaces become real. Interacting with Computers 12(4), 409–426. Elsevier, Amsterdam (2000)
10. Weibelzahl, S., Lippitsch, S., Weber, G.: Advantages, opportunities, and limits of empirical evaluations: Evaluating adaptive systems. Künstliche Intelligenz 16(3), 17–20 (2002)
11. Karagiannidis, C., Sampson, D.G.: Layered evaluation of adaptive applications and services. In: Brusilovsky, P., Stock, O., Strapparava, C. (eds.) AH 2000. LNCS, vol. 1892, pp. 343–346. Springer, Heidelberg (2000)
Tool for Detecting Webpage Usability Problems from Mouse Click Coordinate Logs

Ryosuke Fujioka1, Ryo Tanimoto2, Yuki Kawai2, and Hidehiko Okada2

1 Kobe Sogo Sokki Co., Ltd., 4-3-8, Kitanagasadori, Chuo-ku, Kobe 650-0012, Japan
2 Kyoto Sangyo University, Kamigamo Motoyama, Kita-ku, Kyoto 603-8555, Japan
[email protected], [email protected]
Abstract. In this paper, we propose a method that detects inconsistencies between user interaction logs of a task and desired sequences for the task, based on mouse click coordinate logs. The proposed method models two successive clicks as a vector, and thus a sequence of operations in a user/desired log as a sequence of vectors. A vector runs from the ith clicked point to the (i+1)th clicked point on the screen. To detect inconsistencies between user interactions and desired sequences, each vector from the user logs is compared with each vector from the desired logs. As cues of usability problems, the method detects two types of inconsistencies: unnecessary and missed operations. We have developed a computer tool for logging and analyzing user interactions and desired sequences by the proposed method. The tool is applied to an experimental usability evaluation of ten business/public organization websites, and the effectiveness of the method is evaluated based on the application result. The proposed method contributes to finding 61% of the usability problems found by a manual method in a much smaller amount of time: the number of clicks analyzed by an evaluator with the proposed method is only 1/5-1/10 of that with the manual method. This result indicates the proposed method is efficient in finding problems.

Keywords: Automated usability evaluation, web, user interaction logs, mouse clicks, usability problem cues.
1 Introduction

Several methods have been proposed and developed for usability evaluation based on user interaction logs. A survey paper by Ivory et al. summarizes automated usability evaluation methods well; log-based methods are covered in chapter 4 of their paper [1]. Some of the methods compare user interaction logs for a test task with desired (expected) interaction sequences for the task and detect inconsistencies between the user logs and the desired sequences (e.g., [2],[3]). The inconsistencies are useful cues in finding usability problems: for example, an evaluator can find that users selected some unexpected link on a webpage when another link on the page was expected for the test task, and that the link selected by users may have some usability problem in its design (labeling, layout, etc.).

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 438–445, 2007. © Springer-Verlag Berlin Heidelberg 2007
The existing methods require widget-level logs for the comparisons: the logs must include data on widget properties such as widget label, widget type, title of the parent window, etc. This requirement degrades independence and completeness in logging user interactions with systems under evaluation. In this paper, we propose a method that detects inconsistencies between the user logs and the desired sequences based on mouse click coordinate logs. Coordinate values of clicked points can be easily and fully logged, independently of what widgets are clicked. We have developed a computer tool for logging and analyzing user interactions and desired sequences by the proposed method. The tool is applied to experimental usability evaluations of websites. The effectiveness of the method in usability testing of webpages is evaluated based on the application result.
2 Method for Analyzing Mouse Click Coordinate Logs

2.1 User Logs and Desired Logs

A user log can be collected by logging mouse clicks while a user (who does not know the desired sequence of a test task) performs the test task on a computer in user testing. In our research, a log file is collected per test user and test task: if the number of users is N and the number of tasks is M, then the number of user log files is N × M (where all N users complete all M tasks). A "desired" log is collected by logging mouse clicks while a user (who knows the desired sequence of a test task well) performs the test task. For a test task, one desired log file is usually collected. If two or more different interaction sequences are acceptable as desired ones for a test task, two or more desired log files can be collected (and used in the comparisons described later).

2.2 Method for Detecting Inconsistencies in User/Desired Logs

The proposed method models two successive clicks as a vector, and thus a sequence of operations in a user/desired log as a sequence of vectors. A vector runs from the ith clicked point to the (i+1)th clicked point on the screen. To detect inconsistencies between a user log and a desired log, each vector from the user log is compared with each vector from the desired log. If the distance between the two vectors (vu from the user log and vd from the desired log) is smaller than a threshold, vu and vd are judged as matched: the user operation modeled by vu is supposed to be the same operation as the one modeled by vd. The method defines the distance of two vectors as a weighted sum of the distance between their start points and the size of their difference (Fig. 1):

Distance between start points = sqrt(wx(x1 − x2)^2 + wy(y1 − y2)^2)   (1)

Size of difference = sqrt(wx(x3 − x4)^2 + wy(y3 − y4)^2)   (2)

Vector distance = wp(Distance between start points) + wv(Size of difference)   (3)
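Eqs. (1)-(3) translate directly into code. The sketch below is our reconstruction, not the authors' implementation: each operation vector is a (start, end) pair of clicked points, with (x1, y1)/(x2, y2) the start points and (x3, y3)/(x4, y4) the end points of vu/vd (our reading of Fig. 1); the weight and threshold values follow the "Original" row of Table 1.

```python
import math

WX, WY = 0.4, 1.0   # horizontal differences weighted less than vertical
WP, WV = 0.5, 1.0   # weights for Eq. (3)
THRESHOLD = 100     # pixels; vectors closer than this are "matched"

def weighted_dist(a, b):
    """Weighted point distance used by Eqs. (1) and (2)."""
    return math.sqrt(WX * (a[0] - b[0]) ** 2 + WY * (a[1] - b[1]) ** 2)

def vector_distance(vu, vd):
    """Eq. (3): vu, vd are ((start_x, start_y), (end_x, end_y))."""
    start_dist = weighted_dist(vu[0], vd[0])   # Eq. (1)
    diff_size = weighted_dist(vu[1], vd[1])    # Eq. (2)
    return WP * start_dist + WV * diff_size

def same_operation(vu, vd):
    return vector_distance(vu, vd) < THRESHOLD

# Two clicks on nearly the same screen positions match
print(same_operation(((10, 20), (30, 40)), ((12, 22), (33, 41))))  # True
```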
440
R. Fujioka et al.

Fig. 1. Two Vectors and Their Distance
The role of the weight factors wx and wy used in calculating the distance between start points and the size of difference is as follows. Users may click on links shown in a web browser window. The width of a link is usually larger than its height, especially for a text link. Therefore, the differences between clicked points for clicks on the same link are likely to be larger along the horizontal axis (the x coordinate values) than along the vertical axis (the y coordinate values). To deal with this, the weights wx and wy are used so that horizontal differences are counted as smaller than vertical differences.

User operations that scroll webpages with the mouse wheel must also be taken into account: wheel scrolls change widget (e.g., link) positions on the screen, so the clicked positions may differ even for the same widget. Our method records the amount of wheel scroll while logging interactions. By using the log of wheel scrolls, the coordinate values of clicked points are adjusted. Fig. 2 shows this adjustment. Suppose a user clicked the point (xi,yi) on the screen (Fig. 2(a)) and then clicked the point (xi+1,yi+1) (Fig. 2(b)). In this case, the vector derived from the two clicks is the one shown in Fig. 2(c). As another case, suppose the user scrolled a webpage down along the y axis by the mouse wheel between the two clicks, and the amount of the scroll was S pixels. In this case, the vector derived from the two clicks is the one shown in Fig. 2(d).
Fig. 2. Adjustment of Clicked Point for Mouse Wheel Scroll: without a scroll, vu = (xi+1 − xi, yi+1 − yi); with a scroll of S pixels between the clicks, vu = (xi+1 − xi, yi+1 + S − yi)
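The scroll adjustment of Fig. 2 can be sketched as follows (our reconstruction; the function names are ours). Each click is logged together with the cumulative wheel scroll S at that moment, and its y coordinate is shifted by S so that vectors are built in page coordinates rather than screen coordinates:

```python
def adjust_for_scroll(clicks, scrolls):
    """clicks: (x, y) screen coordinates of each click; scrolls:
    cumulative downward wheel scroll (pixels) at the time of each click."""
    return [(x, y + s) for (x, y), s in zip(clicks, scrolls)]

def clicks_to_vectors(points):
    """Model each pair of successive clicks as a (start, end) vector."""
    return list(zip(points, points[1:]))

# A click, a 200 px scroll down, then a second click: the second click's
# y coordinate is shifted by S = 200, as in Fig. 2(d)
pts = adjust_for_scroll([(100, 300), (120, 250)], [0, 200])
print(clicks_to_vectors(pts))  # [((100, 300), (120, 450))]
```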
2.3 Two Types of Inconsistencies as Cues of Usability Problems

As cues of usability problems, the proposed method detects two types of inconsistencies between user interactions and desired sequences, which we name "unnecessary" operations and "missed" operations. Fig. 3 illustrates both.

Fig. 3. Unnecessary Operations and Missed Operations
Unnecessary operations are user operations judged as not included in the desired sequences, i.e., operations in a user log for which no operation in the desired logs is judged to be the same. The method regards such user operations as unnecessary because they may not be necessary for completing the test task. Unnecessary operations can be cues for evaluators to find usability problems in which users clicked on a confusing link when another link was desired (expected) to be clicked for the task.

Missed operations are desired operations judged as not included in the user interaction sequences, i.e., operations in the desired logs for which no operation in a user log is judged to be the same. The method regards such operations as missed because they may be necessary for completing the test task, yet the user finished the task without performing them. Missed operations can be cues for evaluators to find usability problems in which a link is not clear enough or not easy enough for users to find.

Our method models an operation in a user/desired log by a vector derived from clicked point coordinate logs, so the method detects unnecessary/missed operations as unnecessary/missed vectors.

Suppose two or more successive operations in a user log are unnecessary. In this case, the first operation is likely to be the best cue among the successive unnecessary operations. This is because the user probably deviated from the desired sequence at the first operation (i.e., the expected operation, rather than the user's operation, was not clear enough for the user) and performed additional operations irrelevant to the test task until returning to the desired sequence.
The method can extract the first operations in runs of successive unnecessary operations and show them to human evaluators, so that the evaluators can analyze usability problem cues (unnecessary operations in this case) efficiently.

2.4 Unnecessary/Missed Operations Common to Users

Unnecessary/missed operations common to many of the test users are useful cues for finding problems independently of individual differences among the users. The method analyzes how many users performed the same unnecessary/missed operation. The analysis of the user ratio for the same missed operation is simple: for each missed operation, the number of user logs that do not include the desired operation is counted. To analyze the user ratio for the same unnecessary operation, the method
compares unnecessary operations extracted from all user logs of the test task. This comparison is achieved in the same way as operations (vectors) in user/desired logs are compared. By this comparison, unnecessary operations common among multiple users can be extracted (Fig. 4).
Fig. 4. Unnecessary Operations Common to Users
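Subsections 2.3-2.4 can be sketched end to end: user vectors with no match among the desired vectors are unnecessary, desired vectors with no match among the user vectors are missed, and only the first vector of each run of successive unnecessary operations is kept as a cue. This is our illustrative reconstruction; the `matches` function is a toy stand-in for the thresholded vector distance of Eqs. (1)-(3).

```python
def matches(vu, vd, tol=100):
    """Toy stand-in for the weighted vector-distance test of Sect. 2.2."""
    (s1, e1), (s2, e2) = vu, vd
    dist = (abs(s1[0] - s2[0]) + abs(s1[1] - s2[1])
            + abs(e1[0] - e2[0]) + abs(e1[1] - e2[1]))
    return dist < tol

def detect(user_vecs, desired_vecs):
    """Return (unnecessary, missed) operation vectors."""
    unnecessary = [vu for vu in user_vecs
                   if not any(matches(vu, vd) for vd in desired_vecs)]
    missed = [vd for vd in desired_vecs
              if not any(matches(vu, vd) for vu in user_vecs)]
    return unnecessary, missed

def first_of_runs(user_vecs, unnecessary):
    """Keep only the first vector of each run of successive unnecessary
    operations (the best problem cue, per Sect. 2.3)."""
    flags = [vu in unnecessary for vu in user_vecs]
    return [vu for i, vu in enumerate(user_vecs)
            if flags[i] and (i == 0 or not flags[i - 1])]

desired = [((0, 0), (10, 10))]
user = [((2, 1), (11, 9)),          # matches the desired operation
        ((500, 500), (600, 600)),   # deviation starts here
        ((505, 510), (610, 605))]   # still off the desired sequence
unnecessary, missed = detect(user, desired)
print(first_of_runs(user, unnecessary))  # [((500, 500), (600, 600))]
print(missed)                            # []
```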
3 Evaluating Effectiveness Based on a Case Study

3.1 Design of Experiment

Ten websites of business/public organizations are selected. For each site, a test task is designed. The average number of clicks in the designed sequences for the ten test tasks is 3.9. Five university students participate in this experiment as test users. Each test user is asked to perform each task on the corresponding site. They have enough knowledge and experience in using web pages with a PC web browser, but they use the websites for the first time. The desired sequences of the test tasks are not told to the test users. Thus, if the desired sequences are not clear enough for the test users, the users are likely to deviate from them, and unnecessary and/or missed operations will be observed.

The interaction of each user for each test task is logged into a user log file. To avoid fatigue affecting the results, the experiment time for each user is limited to 60 minutes: each test user is asked to perform a test task within five or ten minutes, depending on the task size. Fifty user logs (five users × ten tasks) and ten desired logs (one log per test task) are collected. For each task, a computer tool implementing the proposed method analyzes the logs and extracts possible cues of usability problems (i.e., unnecessary/missed operations). An evaluator then tries to find usability problems from the extracted cues.

3.2 Weight Factors and Thresholds for Vector Distance

Our method requires us to determine the values of the weight factors wx, wy, wp and wv and the threshold value of the vector distance (see subsection 2.2). To determine these values, we conducted a pre-experiment with another test user. Based on the analysis of the log files collected in the pre-experiment, we investigated appropriate values that lead to an accurate result in detecting unnecessary/missed operations. The row labeled "Original" in Table 1 shows the obtained values.
In our method, the distance between two operations (vectors) is defined by Eqs. (1)-(3). In the case where wv = 0, the vector distance reduces to the distance between start points, so two operations are compared by the clicked points only (i.e., a click in a user log and a click in a desired log are judged as the same operation if the clicked points are near). Similarly, in the case where wp = 0, the vector distance reduces to the size of difference, so two operations are compared by the size of the vector difference only (i.e., the position at which the click is performed is not considered). We evaluate these two variations of the method. Variations A and B in Table 1 denote those in which wp = 0 and wv = 0, respectively.

Table 1. Values of Weight Factors and Threshold for Vector Distance
              wx    wy    wp    wv    Threshold (pixel)
Original      0.4   1.0   0.5   1.0   100
Variation A   0.4   1.0   0.0   1.0   67
Variation B   0.4   1.0   1.0   0.0   34
3.3 Number of Problems Found

To evaluate the effectiveness of our method in finding usability problems, we compare the number of problems found by the method with the number found by a method based on manual observation of user interactions. In addition to recording click logs in the user interaction sessions, the PC screen image was also captured to movie files (using a screen recorder program on the PC). A human evaluator observes user interactions by replaying the captured screen movies and tries to find usability problems. This manual method requires much time for interaction observation, but it contributes to finding problems thoroughly. In this experiment, the evaluator who tries to find problems by the proposed method and the evaluator who tries to find problems by the manual method are different people, so that the result with one method does not bias the result with the other.

Table 2 shows the number of problems found by each of the methods. The values in the table are the sums over the ten test tasks (sites). Eleven problems are shared among the four sets of problems, i.e., the proposed (original) method contributes to finding 61% (= 11/18) of the problems found by the manual method. Although the number of problems found by the proposed method is smaller than with the manual method, the time for a human evaluator to find the problems by the proposed method is much less. In the case of the manual method, an evaluator has to investigate all clicks by the users because user clicks that are possible problem cues are not automatically extracted. In the case of the proposed method, a human evaluator only has to investigate the smaller number of clicks extracted as possible problem cues by the method. In this experiment, the number of clicks to be investigated with the proposed method is 1/5-1/10 of the number with the manual method.
Table 2. Number of Problems Found by Each Method
Method        #Problems
Manual        18
Original      15
Variation A   14
Variation B   13
This case study result indicates that the proposed method

• contributes to finding usability problems to a certain extent in terms of the number of problems, and
• is much more efficient in terms of the time required.

3.4 Unnecessary/Missed Operations Contributing to Finding Problems

Not all unnecessary/missed operations extracted by the proposed method contribute to finding usability problems. The larger the number of unnecessary/missed operations that contribute to finding problems, the more efficiently the problems can be found. This contribution ratio is investigated for the proposed method and its variations (Table 3). Values in the "Counts (first)" column are the counts of unnecessary operations that are the first in runs of two or more successive unnecessary operations (see subsection 2.3). For example, the original method extracted four missed operations in total from the log files of the ten test tasks, and 25.0% (one) of the four operations contributed to finding a problem. Similarly, the original method extracted 375 unnecessary operations in total, and 7.5% (28) of the 375 operations contributed to finding problems.

Table 3. Number of Unnecessary and Missed Operations Found by Each Method and the Ratio of Contribution in Finding Problems
              Missed             Unnecessary (all)    Unnecessary (first)
Methods       Counts  Ratio      Counts  Ratio        Counts  Ratio
Original      4       25.0%      375     7.5%         51      49.0%
Variation A   8       37.5%      422     5.7%         58      39.7%
Variation B   1       0.0%       299     9.4%         72      37.5%
Findings from the results in Table 3 are as follows.

• For all three methods, the ratios are larger for unnecessary operations (first) than for unnecessary operations (all). This supports our idea that an evaluator can find usability problems more efficiently by analyzing only the first operations in runs of successive unnecessary operations.
• In the results for unnecessary operations (first), the ratio for the original method is larger than for either of the two variations. In the results for missed operations, the ratio for the original method is larger than that for variation B but smaller than that for variation A. This indicates that both the original method and its variation A are promising.
4 Conclusion

In this paper, we proposed a method that extracts cues for finding usability problems from user/desired logs of clicked points. To detect inconsistencies between the user and desired logs, the method compares the operations in the logs. It models each user/desired operation as a vector derived from the coordinate values of the clicked points and checks the distance between two vectors, defined as a weighted sum of the distance between the start points and the size of the difference of the two vectors. The method extracts two types of inconsistencies: unnecessary and missed operations.

The effectiveness of the proposed method was evaluated in a case study. We tried to find usability problems for ten websites with both the proposed method and the manual method. The proposed method contributed to finding 61% of the usability problems found by the manual method in a much smaller amount of time: the number of clicks analyzed by an evaluator with the proposed method was only 1/5 to 1/10 of that with the manual method. This result indicates that the proposed method is efficient in finding problems.

In future work, we will extend our method by utilizing log data of click time intervals. Timestamps are another kind of data that can be logged easily, independently and fully. By utilizing logs of both clicked points and time intervals, usability problem cues that are more likely to contribute can be obtained. Additional case studies are also necessary for further evaluation of our method.
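The distance measure summarized in the conclusion can be sketched in a few lines. The paper does not give the weight value or the exact vector construction, so the `weight` default and the representation of an operation as a (start click, next click) pair are assumptions for illustration only.

```python
# Minimal sketch of the operation comparison: each operation is a vector from
# clicked coordinates; the distance is a weighted sum of the distance between
# start points and the size of the difference of the two vectors.
# The weight value 0.5 is an assumption, not from the paper.
import math

def operation_distance(op_a, op_b, weight=0.5):
    """Distance between two click operations ((x1, y1), (x2, y2))."""
    (ax1, ay1), (ax2, ay2) = op_a
    (bx1, by1), (bx2, by2) = op_b
    start_dist = math.hypot(ax1 - bx1, ay1 - by1)
    # size of the difference of the two displacement vectors
    dvx = (ax2 - ax1) - (bx2 - bx1)
    dvy = (ay2 - ay1) - (by2 - by1)
    return weight * start_dist + (1 - weight) * math.hypot(dvx, dvy)

# Identical operations have distance 0; a user operation far from every
# desired operation would be flagged as an unnecessary operation.
assert operation_distance(((0, 0), (10, 0)), ((0, 0), (10, 0))) == 0.0
```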
A Game to Promote Understanding About UCD Methods and Process

Muriel Garreta-Domingo1, Magí Almirall-Hill1, and Enric Mor2

1 Learning Technologies Dept., Universitat Oberta de Catalunya, Av. Tibidabo 47, 08035 Barcelona, Spain
{murielgd,malmirall}@uoc.edu
2 Computer Science, Multimedia and Telecommunication Dept., Universitat Oberta de Catalunya, Rambla del Poblenou 156, 08018 Barcelona, Spain
[email protected]

Abstract. The user-centered design (UCD) game is a tool for human-computer interaction practitioners to demonstrate the key user-centered design methods and how they interrelate in the design process in an interactive and participatory manner. The target audiences are departments and institutions unfamiliar with UCD but whose work is related to the definition, creation, and update of a product or service.

Keywords: Games with a purpose, game pieces, HCI education, HCI evangelization, user-centered design, role-playing, design games, experience.
1 Introduction

Working in an environment that has a good understanding of user-centered design and its implications for the development process is crucial for the work of HCI and usability practitioners. Otherwise, the results of their work might not be taken into account, used in an appropriate way, or applied throughout the whole process. As a consequence, promoting this understanding most often becomes part of these practitioners' tasks.

The context of the present article is the Campus project, which aims to create a virtual campus as open source software under the General Public License (GPL). This project was initiated by nine universities of the Catalonia region in Spain and is promoted by the Catalan government. The Campus project will provide higher education both in a completely online fashion and by combining online and offline learning for more than 300,000 students.

The key aspects of this project are unique in the field of e-learning. First, the Campus will support up to 10,000 users connected at the same time. Second, the product design will follow a user-centered design (UCD) approach. Third, the user interface will follow usability and accessibility principles and standards. These three pillars make the project challenging both from a human-computer interaction (HCI) perspective and from a technical stance, which is not the focus of this paper.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 446–452, 2007. © Springer-Verlag Berlin Heidelberg 2007
The project is organized in 12 work packages; besides the coordination and methodological packages, all except two are technically oriented and led by programmers. These two exceptions are the first work package, which consists of gathering user requirements, and the second work package, which is responsible for the prototyping and user testing of all developed modules. Therefore, these two work packages, with the help of the methodology package, are responsible for ensuring that the entire project and, consequently, all development teams follow a UCD approach.

As Twidale and Marty state, usability professionals often have to combine the roles of usability advocates, educators and practitioners [10]. Bias and Mayhew [2] addressed this issue by putting together a collection of articles on the cost-justification of usability. However, the argument of cost-justification by itself is not enough to introduce UCD in an organization. As Siegel [7] explains, "success will hinge not on a single convincing argument, but on the many interrelated ideas we introduce to our organizations, on the kinds of relationships we build with various stakeholders, and on how we demonstrate our value to them first hand."

At the Open University of Catalonia (UOC), one of the universities participating in the Campus project, we created a game as a tool to increase the understanding of UCD methods. Through a participatory and interactive format, its purpose is to promote a better understanding of a good design process, showing the importance of knowing the end user and keeping the focus on the user, as well as choosing the right methods for analyzing the users and evaluating the design. Several activities have been programmed in the context of the Campus project in order to proselytize and teach the project teams the importance of a UCD process and the best way to apply it. As part of these activities, we decided to deploy the UCD game for the Campus project teams.
2 The UCD Game: Origin, Audience and Goals

The UCD game idea arose after the celebration of World Usability Day (WUD) 2005. As part of the UCD diffusion goal at UOC, different activities were organized for the occasion. There were formal presentations about in-house projects that followed the UCD process. Outside the conference room, a set of independent stations was set up for visitors to receive an overview of the UCD process and methods, experience a usability test in a lab setup, and use a computer with a screen reader (JAWS) installed in order to understand the importance of accessibility. This accessibility station, where participants had to browse the Internet with the monitor off and only the help of the screen reader, was the most successful of all the activities organized.

The aim when designing the game was to obtain a set of engaging stations where participants can experience the different steps of a UCD process. It is structured as a team-based and participatory activity with a set of interrelated tasks, because the goal is not only to show how each project phase is accomplished individually but also how the project is completed and how the different phases relate to one another.

2.1 The Game Goals

The setup of the UCD game is similar to the Interactionary, a design exercise envisioned and organized by Berkun [1]. The main difference is that, unlike
the Interactionary, our game is not an on-stage competition and does not address designers or an HCI audience. In this sense, the goals of the game show created by Twidale and Marty [10] are more closely related to our objectives. Yet, while their game illustrated usability evaluation methods, our game strives to illustrate the UCD process and techniques.

Buchenau and Fulton [3], in their paper about experience prototyping, quote the Chinese philosopher Lao Tse: "What I hear, I forget. What I see, I remember. What I do, I understand!" Our game is a way of promoting understanding by doing. Like Buchenau's and Fulton's, there are several papers on how to include "doing" in the design process through role-playing, informance design, interactive scenarios, participatory design, etc. [4,5,6,8,9]. However, these papers address a different problem than the UCD game and are therefore aimed at a different audience and pursue different goals.

In discussing the design of exploratory design games, Brandt [4] describes various kinds of games, one of which is similar in concept to the UCD game: "The primary aim with the negotiation and workflow oriented games is for the designers to understand existing work practice. Game boards and game pieces are produced in paper. The outcome of the game playing is often flow diagrams showing relations between people and various work task or tools." In our case, we want the Campus project participants to understand UCD work practices using game pieces for each of the UCD phases and a game board to show the relations between the different phases and the end design.

In summary, the purpose of the game is:
1) To show the key steps of a UCD process in an enjoyable and informal setting.
2) To help participants understand how these steps relate to each other.
3) To provide an overview of the main HCI techniques and methods.
4) To illustrate that the target users and the methods used affect the end design.
2.2 The Target Audience

We initially created the UCD game in the context of the UOC, a completely online university with more than 40,000 students that offers 19 official undergraduate degrees as well as several graduate programs. As a result, the virtual campus plays a key role at the UOC, as it is the work tool for UOC employees, the teaching tool for faculty members, and the learning tool for students.

In such a context, UCD should play a central role in all UOC departments that design products for the virtual campus users. Nevertheless, this is still not the case today. Although the introduction of usability and HCI concepts in the organization started in 2002, they are still not well understood and therefore not always properly applied. Hence, we created the game as a tool to promote a better understanding of a good design process, in hopes of demonstrating the importance of understanding and focusing on the end user as well as choosing the right methods for analyzing the users and evaluating the design.

The target audiences are organizational departments that participate in the creation, definition, and update of the virtual campus applications. Even though this target audience is formed by people familiar with the concept of usability and UCD, the goal is to ensure that the game is comprehensible even for people unaware of the existence of UCD.
Within the context of the Campus project, nine universities are actively working on its development. Additionally, several other universities and public institutions act as observers of the project and, in the near future, may use the virtual campus as their learning management system. The audience is therefore much more diverse than at the UOC, since it includes active and passive management, project leaders and developers from both public and private institutions. For our purposes, the target audience of the game was management and project leaders, since they are responsible for ensuring that their teams follow a UCD process. However, our aim was also to include developers, because the more participants understand the value of UCD, the greater the opportunities for UCD to be applied throughout the development process.
3 The Game Structure

The game consists of four stations, each representing a phase of the UCD design process: defining the users, analyzing the users' needs, designing the artifact, and evaluating the resulting artifacts. As in exploratory design games [4], players do not compete. Each team goes through the stations and, at the end, all game boards are shown together in a separate room so that participants and observers can evaluate the design solutions.

The WUD promoted by the Usability Professionals' Association was the first setting for the UCD game. The Campus project was the context of our second application of the game. Groups of 3 to 4 people are formed with participants from different institutions and departments. To begin, they read an overall description of the game and are given a one-page description of the design problem.

3.1 The Design Problem

Like Berkun [1], we decided that a non-web problem would work best with a large audience and that the physical design of a public, non-work-related object would be better, as such concepts are familiar to everyone and the details are broad enough for everyone to follow along. These issues played a key role when considering a design problem; in the end, we opted for the design of an airport self-check-in machine.

The initial design problem was to create a ticket vending machine. To narrow the scope, the machine was only supposed to sell tickets to the airport, and it was to be placed in a central railroad station in Barcelona, Spain. For the self-check-in machine, we also narrowed the project to the flights between Barcelona and Melbourne or Sydney of a specific airline company.

3.2 The First Game Station: Defining the Users

The aim of this first station is to introduce the idea that good design is accomplished by thinking of the end user, and that this end user is neither the designer nor anyone else.
The team is presented with four groups of people, each containing four users with pictures and a short demographic description. Participants are asked to choose the group of users for whom they will design and write down their main characteristics.
450
M. Garreta-Domingo, M. Almirall-Hill, and E. Mor
Initially, all possible users were presented individually, but we realized that participants needed a substantial amount of time to choose and group a set of users in order to build their primary user type. We have found that having the groups already formed is clearer for our audiences.

3.3 The Second Game Station: Analyzing the Users' Needs

The aim of this second station is to show that designers use several quantitative and qualitative methods to gather data about the chosen target. Defining the users was the first step; in this phase, participants analyze the users' needs, wants, contexts, and limitations by choosing a maximum of three methods from the UCD toolbox. After opening the envelopes of the selected methods, the team has to summarize the findings and write down a list of characteristics that should be considered when designing the artifact.

For example, for the contextual inquiry method, the team watches a video of the Barcelona airport. For benchmarking, they have pictures of other self-check-in machines already in use at the airport. Other methods available in the toolbox are: in-depth interview, focus groups, surveys and log analysis. On the outside of each envelope there is a short description of the technique, to help participants choose the ones they consider most useful. Inside there is more information about the technique as applied to the design problem and the results of conducting it. For instance, for an in-depth interview, the inside page contains a list of possible interview questions and a list of possible answers given by users.

3.4 The Third Game Station: Designing the Artifact

The goal of the third station is to show that a successful design is focused on the end user. As a consequence, designers should not jump directly to the end design; instead, they should consider the output of the previous stations and follow an iterative design process. The team is also asked to use one of the evaluation methods available in another UCD toolbox.
Assuming the team has understood the UCD philosophy, the result of this station will be a simple prototype and a list of changes that should be made to it after applying an evaluation technique. The game organizers play the role of users for the evaluation techniques. For user testing, the team has to think of one or two tasks they would like the user to accomplish. The organizer then performs these tasks using the prototype.

3.5 The Fourth Game Station: Evaluating the Designed Artifacts

At the end of the game, each team pastes the one-page output of each station onto a horizontal game board. The board is divided into four quadrants: 1) photos of the target users and their key characteristics, 2) the required characteristics of the artifact according to the user analysis and the methods used, 3) the first low-fidelity prototype and a list of changes resulting from its evaluation, and 4) the evaluation of the development process.

Game boards are displayed in a room where participants and other observers can see the different designs and UCD processes. In order to evaluate the designs, participants and observers have a questionnaire that contains questions such as "Does the
design take into account the context of use?” or “Did the team evaluate their first design solution?”
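As a rough illustration only (the UCD game is a paper-based workshop activity, not software), the station sequence and per-team outputs described in this section could be modeled as data. All names below are ours, not from the paper.

```python
# Hypothetical sketch of the game flow: each team visits the four stations in
# order, producing one one-page output per station, which is later pasted onto
# the game board for the final joint review.
from dataclasses import dataclass, field

STATIONS = [
    "defining the users",
    "analyzing the users' needs",
    "designing the artifact",
    "evaluating the designed artifacts",
]

@dataclass
class Team:
    name: str
    outputs: dict = field(default_factory=dict)  # one page per station

def play(team: Team) -> Team:
    for station in STATIONS:
        team.outputs[station] = f"one-page output of '{station}'"
    return team

board = play(Team("team 1")).outputs
assert list(board) == STATIONS  # four quadrants on the game board
```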
4 Deploying the UCD Game

In order to test the game structure and its different stations, we initially ran a pilot of the game with a small group, half of whom were HCI experts while the others were familiar with UCD but had never applied a full UCD process. The mixed groups were required to pass through each of the four stations of the game: defining the target user, analyzing their needs, designing, and evaluating. It was very rewarding to see the groups make different decisions at each of the stations. Since the groups defined different user characteristics and target goals, as well as selected different evaluation methods, the final designs varied greatly. In this sense, the pilot study showed that the game is useful for demonstrating how the phases relate to each other and that designs depend on the characteristics of the end user and the methods used.

Through the post-game questionnaire, we concluded that all participants considered the game useful for showing the value of UCD methods and process and that it was an enjoyable, refreshing and enriching experience. We also obtained feedback on areas to improve, such as tighter control of the time for each station and a less technical and ambiguous description of the phases and methods.

Our second application of the game was on November 14th, during World Usability Day 2006. Around thirty people (8 groups) participated in the game. From observing the teams and from the post-game questionnaire, we gathered that most participants enjoyed the experience and found it a successful tool for showing the UCD process and methods. Again, the time spent at each station was perceived as too long, despite the timers at each station and the organizers, who tried to encourage groups to move on to the next station. Participants with an interest in UCD wanted to do a good job on each phase and therefore took longer than the allowed time. The biggest problem caused by the lack of time was that the teams were not able to view the other teams' solutions.
Thus, visualizing how different target users and processes led to different results was one of the goals not accomplished in this application of the game.

Running the UCD game for the Campus project participants was more challenging, since most people were not interested in UCD. This lack of interest reduced the time problem, and the game was still useful as another tool to promote understanding of the UCD process and approach.
5 Conclusions

We created the game in order to show the UCD process and methods to an audience of non-experts whose tasks are related to the definition, creation, and update of a product or service. We have deployed the game on three different occasions, with diverse contexts and audiences. The feedback given by the different types of participants told us that the game is perceived as enjoyable and useful for our purpose. Recalling our goals when creating the game (1) To show the key steps of a UCD process in an enjoyable and informal setting. 2) To help participants understand how these steps relate to each other. 3) To provide an overview of the main HCI
techniques and methods. 4) To illustrate that the target users and the methods used affect the end design), we are confident that the game manages to accomplish all four objectives in a short period of time. However, new questions arise: does it make a difference in the participants' everyday work? Will they consider applying a UCD approach in their next project? Will they be more willing to include the results of UCD methods in their work?

We plan to answer these questions by deploying the UCD game again in the Campus project context, but with the real design problem. As mentioned, the first work package of the project is to gather user requirements. The outputs of this package will be personas, scenarios and needs of the future campus users. We will use the project's main goal as the design problem and these outputs to prepare the game materials. With this new focus of the game, we expect to increase the developers' involvement in the UCD process as well as to gather interesting feedback from both developers and observers for the project development.

The UCD game is a powerful and flexible tool that can be applied for different goals, in diverse contexts and for different audiences. Although each setting will require an adapted design problem, the overall structure of the game is a useful guide for all cases.

Acknowledgments. This work has been partially supported by a Spanish government grant under the project PERSONAL (TIN2006-15107-C02-01) and by the Campus project promoted by the Generalitat de Catalunya.
References

1. Berkun, S.: Interactionary: Sports for design training and team building. http://www.scottberkun.com/dsports
2. Bias, R.G., Mayhew, D.J. (eds.): Cost-Justifying Usability: An Update for the Internet Age. Morgan Kaufmann, San Francisco, CA, USA (2005)
3. Buchenau, M., Fulton Suri, J.: Experience Prototyping. In: Proceedings on Designing Interactive Systems, pp. 424–433. ACM Press, New York (2000)
4. Brandt, E.: Designing Exploratory Design Games: A Framework for Participation in Participatory Design? In: Proceedings Participatory Design Conference, pp. 57–66. ACM Press, New York (2006)
5. Burns, C., Dishman, E., Verplank, W., Lassiter, B.: Actors, Hairdos & Videotape – Informance Design. In: Proceedings of CHI 1994, pp. 119–120. ACM Press, New York (1994)
6. Klemmer, S.R., Hartmann, B., Takayama, L.: How Bodies Matter: Five Themes for Interaction Design. In: Proceedings on Designing Interactive Systems, pp. 140–149. ACM Press, New York (2006)
7. Siegel, D.: The Business Case for User-Centered Design: Increasing Your Power of Persuasion. Interactions 10(3), 30–36 (2003)
8. Simsarian, K.T.: Take it to the Next Stage: The Roles of Role Playing in the Design Process. In: Proceedings of CHI 2003, pp. 1012–1013. ACM Press, New York (2003)
9. Svanaes, D., Seland, G.: Putting the Users Center Stage: Role Playing and Low-fi Prototyping Enable End Users to Design Mobile Systems. In: Proceedings of CHI 2004, pp. 479–486. ACM Press, New York (2004)
10. Twidale, M.B., Marty, P.F.: Come On Down! A Game Show Approach to Illustrating Usability Evaluation Methods. Interactions 12(6), 24–27 (2005)
DEPTH TOOLKIT: A Web-Based Tool for Designing and Executing Usability Evaluations of E-Sites Based on Design Patterns

Petros Georgiakakis1, Symeon Retalis1, Yannis Psaromiligkos2, and George Papadimitriou1

1 University of Piraeus, Department of Technology Education and Digital Systems, 80 Karaoli & Dimitriou, 185 34 Piraeus, Tel.: 0030 210 414 2746
{geopet,retal,papajim}@unipi.gr
2 Technological Education Institute of Piraeus, General Department of Mathematics, Computer Science Laboratory, 250 Thivon & P. Ralli, 122 44 Athens, Greece, Tel.: 0030 210 5381193, Fax: 0030 210 5381351
[email protected]

Abstract. This paper presents a tool that supports a scenario-based expert evaluation method called DEPTH (usability evaluation method based on DEsign PaTterns & Heuristics criteria). DEPTH is a method for performing scenario-based heuristic usability evaluation of e-systems. It focuses on the functionality of e-systems and emphasizes usability characteristics within their context: it examines not only the availability of a piece of functionality within an e-system but also the usability performance of the supported functionality in a specific context of use. The main underlying ideas of DEPTH are: i) to minimize the preparatory phase of a usability evaluation process and ii) to assist a novice usability expert (one who is not necessarily familiar with the genre of the e-system). To this end, we (re)use expert knowledge captured in design patterns and structured as design pattern languages for the various genres of e-systems. This paper briefly describes the DEPTH method and presents the way a specially designed tool supports it, along with the findings from an evaluation study.

Keywords: Heuristic evaluation, design patterns, reuse of design expertise.
1 Introduction

The World Wide Web has grown tremendously, with a vast amount of information bombarding its users every second. Users have become consumers of services and products from various types of e-sites. However, the developers of e-sites often do not make users' lives easy, due to bad design.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 453–462, 2007. © Springer-Verlag Berlin Heidelberg 2007

Usability evaluation, though necessary, is often neglected, characteristically in the area of e-commerce sites. Systems' user interfaces are now redesigned taking into
account the user needs. The anticipated benefits are: increased sales, customer satisfaction, customer retention, reduced support costs, and stronger brand equity [13].

Usability evaluation of e-sites is not an easy task and requires a lot of effort [9]. One approach is to use usability experts, which raises the cost for the organization undertaking the task [2]. It is often difficult to find a usability expert who will be able to perform his/her tasks and pinpoint the many usability problems that stem from the general usability heuristics, as well as to successfully determine the usability problems that have to do with the specific context of use of the e-site. Not only is it difficult to find usability experts [10], it is even harder to find usability experts for specific genres of e-sites. Digital genres are described as a classification system for kinds and types of digital products [12]. In recent years, several digital genres of e-sites have been studied, such as online newspapers, e-shops, e-travel sites, etc.

Thus, a practical approach to the problem of finding usability experts for the specific genre of an e-site under evaluation is to help a typical novice usability engineer perform the usability evaluation for that genre of e-sites accurately and efficiently. This can be achieved by transferring expert knowledge to novice usability engineers and guiding them through an e-site evaluation with the aid of systematic approaches and supporting toolkits. Such an approach is the DEPTH method (usability evaluation based on DEsign PaTterns and Heuristics criteria). DEPTH is a scenario-based expert evaluation method. It eliminates the difficulties of expert-based evaluation described above and provides an integrated framework where the novice usability evaluator can find and (re)use expert knowledge to better perform the evaluation tasks for genres of e-sites.
The innovative ideas behind the DEPTH approach are: i) the reuse of expert knowledge in the form of design patterns during the evaluation process (a design pattern describes a problem, a solution to it in a particular context, and the benefits or drawbacks of using that solution [1, 3]); and ii) the use of genre-specific usage scenarios for e-sites.

In this paper we describe the DEPTH toolkit, a prototype Web-based tool for designing and implementing usability evaluations of e-sites based on the DEPTH usability method [4, 11]. The rest of the paper is organized as follows: Section 2 describes DEPTH in detail. Section 3 describes the application of DEPTH to two systems classified as Learning Brokerage Platforms (LBPs) in order to clarify the main points of the method. Finally, in Section 4 we discuss the current status of the method, as well as our future plans.
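The design-pattern records that DEPTH reuses can be sketched directly from the definition quoted above (a problem, a solution to it in a particular context, and the benefits or drawbacks of the solution). The field names and the example entry below are ours for illustration, not from the DEPTH toolkit.

```python
# Hypothetical sketch of a design-pattern record and its binding to a
# genre-specific usage scenario, following the definition in the text.
from dataclasses import dataclass

@dataclass
class DesignPattern:
    name: str
    problem: str
    context: str
    solution: str
    consequences: str  # benefits or drawbacks of applying the solution

CART = DesignPattern(
    name="Shopping Cart",
    problem="Users need to collect items before a single checkout.",
    context="E-commerce sites with more than a handful of products.",
    solution="Provide a persistent, always-visible cart with an item count.",
    consequences="Familiar metaphor; adds a step for single-item purchases.",
)

# A usage scenario for the genre can be bound to one or more patterns.
SCENARIOS = {"add two items and check out": [CART.name]}
```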
2 The DEPTH Approach

2.1 Principles of DEPTH

According to DEPTH, the evaluation process for an e-site should focus on three dimensions: functionality according to genre, usability performance of the functionality according to the context of use, and general usability performance according to heuristic criteria. We present the whole process in Figure 1 using an activity diagram, which depicts the general activities and responsibilities of the elements that make up our method.
The basic aim of our method is to provide a framework where an evaluator can find and (re)use expert knowledge in order to perform an evaluation that supports the above dimensions.
Fig. 1. The whole process, depicting the general activities and responsibilities of the elements that make up the DEPTH method
The first swim lane presents the general steps/actions of the evaluator according to DEPTH. These steps are guided and supported by the DEPTH repository, which is the element constructed during the preparatory phase. The last element shows the deliverables of the execution phase of the evaluation process.

Each evaluation study should start by selecting the specific genre of the e-site under evaluation. There are various checklists of the functionality of various genres of e-sites which one can easily use and reuse. In case such a checklist cannot be found, an analysis of the most well-known systems of the specific genre should be made in order to determine their functionality and provide a superset of all features, categorized in groups, as well as an analytical table. Such genres of systems, along with the analytical tables of their supported functionality, become part of the DEPTH repository.

Having as input the analytical table of the functionality for the genre of the system under evaluation, the evaluator can easily perform the next step, which is a simple check of whether the system supports the underlying functionality. This step provides the first deliverable of our method: a functionality report, which describes the functions supported by the selected system.

At the next step the evaluator has to decide which of the supported functions will be further analyzed for usability performance. As we have already mentioned, the production of the functionality table alone is not enough for someone to select the right e-site. We may have systems of a similar genre, like e-commerce systems, which contain the same set of features but vary in usability. In other words, "It is not only the features of the applied technology but especially the way of implementation of the technology", as Lehtinen et al. [6] say of different genres of systems.
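The functionality-check step above can be sketched as a simple comparison between the genre's analytical feature table and the system under evaluation. The feature names and report format below are illustrative assumptions; the paper specifies only that the check yields a functionality report.

```python
# Hypothetical sketch of DEPTH's functionality-check step: the genre's feature
# superset (from the repository) is checked against the system's features.

GENRE_FEATURES = {"course catalogue", "enrolment", "payment", "user profiles"}

def functionality_report(system_features: set) -> dict:
    """First deliverable: which genre features the system supports."""
    return {
        "supported": sorted(GENRE_FEATURES & system_features),
        "missing": sorted(GENRE_FEATURES - system_features),
    }

report = functionality_report({"course catalogue", "enrolment", "search"})
assert report["missing"] == ["payment", "user profiles"]
```

The evaluator would then pick functions from `report["supported"]` for the scenario-based usability analysis described next.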
456
P. Georgiakakis et al.
Evaluating the usability performance of the system involves two primary tasks: (a) evaluation in the small, i.e. at the specific context, and (b) evaluation in the large, i.e. evaluating the general usability conformance to well-defined heuristic criteria. The first task is the most difficult, since it implies the use of domain experts and is therefore very expensive; moreover, the availability of domain experts is very limited. At this point our method suggests the (re)use of domain knowledge through design patterns and the underlying design pattern languages. Such a language can adopt issues from HCI design patterns, since usability is of prime importance, while at the same time taking into account the particularities of the genre under evaluation [14]. So, at the next step, for each specific function (or set of functions) identified for usability performance, the evaluator can see a related scenario. As described above, one or more related scenarios are bound to specific design patterns during the preparatory phase and are part of DEPTH's repository. The evaluator may also decide to modify a related scenario to better suit his/her case. The next step is the execution of the underlying tasks of the specified scenario. We have to stress here the essential role of the usage scenario, which acts as an expert wizard guiding the evaluator. After the execution, the evaluator is prompted by DEPTH to see the ideal solution as recorded in the related pattern(s). This is necessary because until now the evaluator has seen only the related usage scenario, not the solution. By seeing the actual solution, the evaluator can complement his/her findings about the e-site under evaluation and is better prepared to compose the evaluation report. The final evaluation report has two parts: a context-specific part and a general part.
The first reveals/measures the usability performance of the system under evaluation according to its specific context of use, while the second presents the general usability performance according to the expert/heuristic criteria. 2.2 The DEPTH TOOLKIT The DEPTH TOOLKIT supports the tasks of two categories of users: i) creators of the evaluation studies to be performed, and ii) novice usability engineers. From the creator's point of view, the toolkit supports four main tasks for each genre of e-sites: i) specification of the features of genres of e-sites, ii) assignment of scenarios and appropriate tasks to features of genres, iii) editing of design patterns, as well as of links between those patterns and specific features of the genre, and iv) management of evaluation sessions and recording of evaluation reports. From the novice usability engineer's perspective, the toolkit supports the evaluation study in two phases, first a preparatory and then an execution phase. During the preparatory phase, the user (novice usability engineer) chooses the genre of the e-site and selects from a list the features of the system that she is interested in evaluating. This list is generated by the toolkit and includes features related to systems of the same genre as the one specified. During the execution phase, the selected set of features is evaluated through the context-oriented scenarios that have been proposed and written by the creators of the evaluation studies for that specific genre of e-sites (Fig. 2). At the end of the evaluation process, a detailed report is automatically
DEPTH TOOLKIT: A Web-Based Tool
457
produced, describing the usability performance of the examined system on the chosen function(s) in its specific context of use, along with the general usability performance of the examined e-site according to Nielsen's heuristic criteria [8].
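The automatically produced report described above has two parts: the context-specific scenario answers and the general heuristic ratings. A minimal sketch of assembling such a report follows; the field names and rating scale are assumptions for illustration, not the toolkit's actual schema:

```python
# Sketch of assembling the two-part evaluation report described above.
# Field names and the rating scale are assumptions, not the toolkit's schema.

NIELSEN_HEURISTICS = [
    "Visibility of system status", "Match between system and the real world",
    "User control and freedom", "Consistency and standards", "Error prevention",
    "Recognition rather than recall", "Flexibility and efficiency of use",
    "Aesthetic and minimalist design", "Help users recover from errors",
    "Help and documentation",
]

def build_report(scenario_answers, heuristic_ratings):
    """Combine context-specific scenario answers with general heuristic ratings."""
    return {
        "context_specific": scenario_answers,               # per-function findings
        "general": dict(zip(NIELSEN_HEURISTICS, heuristic_ratings)),
    }

report = build_report({"S11: Shopping Cart": "cart hard to locate"}, [3] * 10)
```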
Fig. 2. DEPTH TOOLKIT interface
3 Evaluating the DEPTH Method 3.1 Scope of the Evaluation Study In order to evaluate our method we conducted an experiment with non-expert usability evaluators. Twenty-three (23) graduate students of our Department, after having completed an introductory MSc course on Human-Computer Interaction (we call them novice usability engineers), were asked to evaluate two e-sites of a specific genre. In order to make the experiment relevant to their interests (they attend an MSc programme on e-learning technologies), we proposed the evaluation of two Learning Brokerage Platforms (LBPs), namely Premier Training Online (https://www.premiertrainingonline.com/default.aspx) and the Adobe Store of North America (https://store1.adobe.com/cfusion/store/index.cfm). All students had average knowledge of this genre of systems, and none of them claimed to be expert in using (or designing) such systems. In fact, none of them had ever used either of the systems under evaluation. These e-sites allow the user to search, view, and purchase online learning objects related to training in specific areas of interest. The Premier Training Online site offers distance learning programs, providing comprehensive home study courses for those pursuing careers in the health and fitness industry. The Adobe Store provides training about Adobe products via an online training center, which gives access to many libraries full of engaging, interactive course contents, assessment features, and additional resources to maximize design and development skills. These two e-sites had been carefully selected because various usability problems had been identified during expert-based evaluations previously organized by our group. We used DEPTH only from the evaluator's point of view, since we wanted to focus on this specific perspective. The main research questions of this evaluation
study were: Can the DEPTH method help novice usability engineers identify usability problems (especially complex ones)? Can the DEPTH method improve novice usability engineers' ability to propose solutions to the identified usability problems? Is the DEPTH method easy to apply? Does the DEPTH method make the novice usability engineers' evaluation process easier, more flexible and enjoyable? Does the DEPTH method make novice usability engineers feel confident that they performed a good evaluation study? Do the novice engineers appreciate the added value of design patterns for usability evaluation? 3.2 Evaluation Process Several systems of the LBP genre have been thoroughly examined with respect to the features they provide, and a superset of those features is shown in Fig. 3.
Fig. 3. Features supporting online purchases
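Deriving the genre's feature superset from the examined systems amounts to a set union. The sketch below illustrates this; the system and feature names are invented examples, not the actual LBP data behind Fig. 3:

```python
# Sketch: the feature superset of a genre is the union of the feature sets
# of the examined systems. System and feature names are illustrative only.

systems = {
    "LBP_A": {"shopping_cart", "course_search", "wishlist"},
    "LBP_B": {"shopping_cart", "course_search", "language_selector"},
}

superset = set().union(*systems.values())
```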
Selected design patterns (DP) from Martijn van Welie's web design patterns repository (http://www.welie.com/) were related to a number of features (F) that an LBP may support, as shown in Table 1 below (each feature was related to one or more design patterns from Welie's repository). Table 1. Examples of relations between Features and Design Patterns
FEATURES (F) and DESIGN PATTERNS (DP):
F1. Select preferred language: http://www.welie.com/patterns/showPattern.php?patternID=language-selector
F2. Directions to the right section of the website: http://www.welie.com/patterns/showPattern.php?patternID=doormat
F3. Know where you are in a hierarchical structure: http://www.welie.com/patterns/showPattern.php?patternID=crumbs
F4. Navigate a hierarchical structure: http://www.welie.com/patterns/showPattern.php?patternID=doubletab
For each of these features we created a related usage scenario. For example, for the functionality “F11: Buy/use shopping basket” we assigned the scenario “S11: Shopping Cart”, as shown in Table 2.
Table 2. Task Scenario and Questions for a specific functionality
(S11) Shopping Cart
Description: Collect and purchase several items in one transaction.
Task: Locate the Shopping Cart / Shopping Basket. The basket is initially empty. Search for a product (either manually or with assistance from a search mechanism) and, if found, add it to the contents of the basket. Add to the basket a new product that is advertised in the home page. Search for a new product, other than those already included in the basket. Browse through the store. Delete one item from the shopping cart. Select another instance of a product included in the cart and add that instance to the contents of the cart. Select one of the products that are already in the cart and view its price. While viewing the shopping basket contents, try to locate a link related to shipping and handling costs and the calculation of their cost. While viewing the shopping basket contents, try to locate a link related to the return policy.
Questions:
• Was the name of the shopping cart used appropriately?
• Was the shopping cart easily located?
• Were you able to add to the basket a product advertised in the home page? How easy did you find this operation?
• While viewing search results, were you able to see the contents of the shopping cart?
• Could the operation of searching for a new product, other than those already included in the basket, be executed with zero or one click / move? Were you still able to view the contents of the shopping cart?
• Was it easy to delete items from the shopping cart? Was it easy to modify their quantity? Was the total price automatically recalculated?
• Was it easy to view the price of any product item included in the cart once you selected that item?
• Was it easy to locate the link related to shipping and handling and the calculation of their cost? Was the information provided satisfactory?
• Was it easy to locate the link related to the return policy? Was the information provided satisfactory?
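The bindings of Tables 1 and 2 amount to simple lookup tables from features to design patterns and to usage scenarios. The sketch below uses the F1 binding quoted in Table 1; the F11 pattern URL is an assumed placeholder, and the dict-based storage is illustrative, since the paper does not specify the repository's actual format:

```python
# Feature -> design pattern and feature -> scenario bindings, as in Tables 1-2.
# Stored here as plain dicts; the actual DEPTH repository format is not specified.

FEATURE_PATTERNS = {
    "F1": "http://www.welie.com/patterns/showPattern.php?patternID=language-selector",
    "F11": "http://www.welie.com/patterns/showPattern.php?patternID=shopping-cart",  # assumed ID
}
FEATURE_SCENARIOS = {"F11": "S11: Shopping Cart"}

def lookup(feature):
    """Return the scenario and pattern bound to a feature, if any."""
    return FEATURE_SCENARIOS.get(feature), FEATURE_PATTERNS.get(feature)

scenario, pattern = lookup("F11")
```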
The novice usability evaluators (i.e., the DPEs) had to carry out all the tasks of the proposed scenarios, with the ability to look at the related design pattern that the DEPTH method proposes. After completing the inspection of the LBP, they had to express their overall opinion about the e-site according to Nielsen's heuristic criteria [8], which are: visibility of system status, match between system and the real world, user control and freedom, consistency and standards, error prevention,
recognition rather than recall, flexibility and efficiency of use, aesthetic and minimalist design, help users recognize, diagnose, and recover from errors, and help and documentation. Finally, a report containing all the answers to the questions proposed by the scenario and to Nielsen's heuristic criteria is automatically generated. Not only did we analyze the reports written by the DPEs, but we also conducted focus group interviews (in teams of three students) to get better insight into their opinion about DEPTH and the DEPTH toolkit. The major advantage of conducting a focus group interview [5] was the ability to obtain detailed information through group cooperation. The findings resulted in important and promising conclusions, as shown below. 3.3 Evaluation Findings Using novice usability engineers throughout this experiment helped us verify what we intended to prove: the DEPTH method can actually enable novice usability evaluators to perform evaluations of expert quality. After following the related task scenarios, they were able to identify simple usability problems, while at the same time they were also assisted in identifying complex problems, which could not easily have been spotted if the scenarios and design patterns had not been given. The novice usability evaluators clearly stated that the design patterns helped them understand the good design practices concerning the various features of an LBP e-site. As an unexpected outcome, many of the evaluation reports that we received showed that the DPEs were also suggesting solutions for each of the problems identified. We, as reviewers of the experiment, wanted to know where this kind of knowledge came from. When we asked our students how they got these references, all of them mentioned the added value of the design pattern that accompanied each feature. By considering the solution given by the pattern and customizing it to the context of the specific e-site, they were able to offer clear solutions to the usability problems.
This assistance made them more confident, not only in indicating the usability problems, but also in proposing solutions for them. All students stated that the interface of the toolkit made the evaluation process flexible and enjoyable. The use of specific task scenarios, along with the categorization of all features, provides a source of tasks and requirements that can be easily evaluated. Among other remarks, it was also mentioned that DEPTH can be used to evaluate isolated areas of interest by simply choosing only a few features. However, the method has some disadvantages. Design patterns are still relatively few, so it is difficult to find mature pattern languages to support the variety of e-site genres. That became obvious from the collection of design patterns we proposed, as we deliberately chose some that are not fully mature. Even if we assume that a pattern language exists and is fully mature, will there always be a design pattern to validate all areas of interest in a digital genre? A problem that occurred during the evaluation was that the toolkit did not allow the users to revise their report after they had submitted it. This problem is not difficult to solve, and the next revised version of the toolkit will include such functionality. Another major issue related to the evaluation of the DEPTH method, and mainly depending on the user of the DEPTH toolkit, is the creation of genre-dependent scenarios. Who should be their creator? Will the scenarios be highly scripted or
loosely defined? What will the granularity of each scenario be? The need for experts in the creation of these task scenarios is evident. We may want to define scenarios that are very descriptive, or we may want to use scenarios that are more general. We need to have several scenarios, of different granularities, for each feature, and let the user weigh cost against efficiency and choose the one that is most appropriate to the case of study.
4 Conclusions In this paper we provided an overview of DEPTH, an innovative method for performing scenario-based expert heuristic usability evaluation of e-sites. It is innovative since it uses the added value of design patterns in a very systematic way in the usability evaluation process. The method can easily be used by a novice usability engineer. When DEPTH was used by non-expert engineers in the evaluation of LBPs using a supporting toolkit, called the DEPTH toolkit, the results were satisfactory. The expert knowledge embedded in the form of design patterns and usage scenarios was readily available to the novice engineers, thus enhancing their testing methods and improving their perspective on the usability of each functionality being tested. As the field of design patterns grows and matures, this method will be very promising and highly applicable. Acknowledgments. This work has been partially funded through the EU IST FP7 project Grid4All (http://grid4all.elibel.tm.fr/).
References 1. Alexander, C.: The Origins of Pattern Theory: The Future of the Theory, and the Generation of a Living World. Keynote speech at the 1996 ACM Conference on Object-Oriented Programs, Systems, Languages and Applications (OOPSLA) (1996), retrieved from http://www.patternlanguage.com/archive/ieee/ieeetext.htm 2. Dix, A., Finlay, J.E., Abowd, G.D., Beale, R.: Human-Computer Interaction, 3rd edn. Prentice Hall, Englewood Cliffs (2003) 3. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns – Elements of Reusable Object-Oriented Software. Addison-Wesley, London, UK (1994) 4. Georgiakakis, P., Tzanavari, A., Retalis, S., Psaromiligkos, Y.: Evaluation of Web Applications Using Design Patterns. In: Costabile, M.F., Paternò, F. (eds.) INTERACT 2005. LNCS, vol. 3585, Springer, Heidelberg (2005) 5. Krueger, R.A., Casey, M.A.: Focus Groups: A Practical Guide for Applied Research, 3rd edn. Sage Publications, Thousand Oaks, CA (2000) 6. Lehtinen, E., Hakkarainen, K., Lipponen, L., Rahikainen, M., Muukkonen, H.: Computer Supported Collaborative Learning: A Review of Research and Development (The J.H.G.I. Giesbers Reports on Education No. 10). Department of Educational Sciences, University of Nijmegen, Nijmegen, The Netherlands (1999) 7. Martín, G.Y.: Wiki Tools for Collaborative Learning Environments. Final project thesis, Telecommunications Engineering, Universidad de Valladolid (August 2005) 8. Nielsen, J.: Usability Engineering. Academic Press, London (1993)
9. Nielsen, C.: Testing in the Field. In: Werner, B. (ed.) Proceedings of the Third Asia Pacific Computer Human Interaction Conference. IEEE Computer Society, Los Alamitos, CA (1998) 10. Nielsen, J.: Designing Web Usability: The Practice of Simplicity. New Riders Publishing, Indianapolis (2000) 11. Sartzetaki, M., Psaromiligkos, Y., Retalis, S., Avgeriou, P.: Usability Evaluation of E-commerce Sites Based on Design Patterns and Heuristic Criteria. In: 10th International Conference on Human-Computer Interaction, Heraklion, Crete (June 22-27, 2003) 12. Schmid-Isler, S.: The Language of Digital Genres: A Semiotic Investigation of Style and Iconology on the World Wide Web. In: Proceedings of the 33rd Hawaii International Conference on System Sciences. IEEE Press, CD-ROM, Hawaii (2000) 13. Stefani, A., Xenos, M.: A Model for Assessing the Quality of E-commerce Systems. PC-HCI 2001 Conference (2001) 14. Van Welie, M., Klaassen, B.: Evaluating Museum Websites Using Design Patterns. Technical Report IR-IMSE-001 (2004), available at: http://www.welie.com/articles/IR-IMSE-001-museum-sites.pdf
Evaluator of User's Actions (EUA) Using the Model of Abstract Representation DGAUI Susana Gómez-Carnero and Javier Rodeiro Iglesias Escuela Superior de Ingeniería Informática de la Universidad de Vigo, Campus As Lagoas S/N, Ourense {jrodeiro,susanagomez}@uvigo.es
Abstract. User interfaces play an important role in the success of an application. Given the significant time and economic cost of their development, it is necessary to achieve high acceptability and an effective design. To be considered acceptable, a user interface must be pleasant to the user, fulfil its objectives, and be easy to use. In this paper an abstract model specification is presented that allows the acceptability of user interfaces to be evaluated. This is done in a semi-automatic way by validating the three criteria defined above. We also present a notation for user interface testing and a tool that allows the user to execute user tasks over the graphical user interface prototype generated by the tool. Keywords: user interface design, usability, user interface modelling, prototyping, user interface testing.
1 Introduction The user interface has a major influence on the success or failure of a software application and on the time and cost spent in its development. Myers [8], based on questionnaires to developers, estimates that about 48% of the effort of building interactive software goes into the interface, while the Gartner Group [2] puts the figure at 70%. It is therefore worthwhile to optimize the quality of the user interface, obtaining maximal acceptability through an effective design. User interface acceptability is based on three main questions: 1. Is the user interface pleasant to the user's view? In this case acceptability is related to aesthetic concepts. This question does not consider the memory load of the user or the correctness of the functionality; its essence is that the interface looks friendly and pleasant to the user. 2. Does the user interface do what it has to do? This determines whether the user interface is useful for the purpose for which it was created: useful in the sense that it supports the user's tasks, so the user can do what he wants to do. This question is closely related to the tasks the user has to perform with the user interface. 3. Is the user interface easy for the user to use? Here the term usability is introduced [13] [1] [5] [6] [9] [4]. If a user finds the use of the user interface complex, he will not accept it, even if it is efficient or pleasant to the view. J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 463–471, 2007. © Springer-Verlag Berlin Heidelberg 2007
464
S. Gómez-Carnero and J. Rodeiro Iglesias
The evaluation of compliance with these criteria is very complex. This complexity comes from the subjective approach taken to the evaluation in most cases, mainly through expert opinion or questionnaires for the user [7] [11] [9] [14]. Given this subjectivity, and given the importance of personal human perception in qualitative user interface evaluation techniques, it would be valuable to have a more direct and discrete method that avoids these personal perceptions. One engineering approach is to let the user interact with a prototype generated from a specification model obtained in a previous phase of requirements analysis. With this approach the user obtains a more tangible view of the user interface, identifying possible problems early, before time and money are spent by industry. Given the cost of developing software, and since most evaluation methods are applied after development, it seems sensible to try to do the evaluation prior to implementation. Following the criteria stated above, a large amount of related research exists, but almost all of it is in a theoretical state and not of practical application. After examining representation and notation techniques to probe whether they can be applied to complex interfaces for an objective evaluation based on these criteria, our conclusion is negative: they are not appropriate to cover the needs defined above. For this reason we have presented an abstract notation that allows the user interface to be represented using components, visual presentation (in graphic terms), and user interaction defined at component level. In this paper we present the part of the notation that represents the functionality of the user interface, extracting from it the user tasks, which allows a semi-automatic evaluation that minimizes development and evaluation costs. Section 2 presents the notation for representing user interface behaviour.
Section 3 presents the notation for defining user task tests. Section 4 presents the EUA tool, which allows dynamic evaluation of the usability of the user interface through an interactive simulation of the user interface. Section 5 presents the conclusions and future work of this research.
2 DGAUI Representation A review of user interface representation models was carried out in the literature, focusing on models that propose a visual representation and behaviour of the user interface [3]; given the problems we found in these representations, we present an alternative solution. The proposed representation, DGAUI, considers that the visual user interface is not a continuous structure; rather, it is composed of discrete, finite elements, defining the interface as a composition of individual elements called user interface components. These components form a topological hierarchy, in which one component may be contained in another [12]. For the definition of the visual user interface the notation allows one to: - Define the visual user interface components, with standard graphical primitives if the component has a visual representation on the user interface, or with specific properties if the component is for input of information or only a container of other user interface components.
- Determine the topological composition of the visual user interface components, to construct the visual user interface with which the user interacts at a given moment in time. - Represent the dialog between components, identifying the events that the user can trigger on components and the response of the other user interface components when the interaction occurs. The best choice for structuring the notation is XML, so we have created a DTD to allow easy parsing of the notation structure. Regarding the notation semantics, one part holds the initial representation of the visual user interface, and the second part represents every state of the user interface obtained from the interaction defined over the components, together with the transitions between states. Owing to the different nature of the two parts, we divide the notation into two DTDs. The first (called DGAUI-DEF) consists of a detailed definition of each of the user interface components that compose the whole interface. This separation makes it possible to reuse component definitions in other visual user interface representations. The second (called DGAUI-INT) depends on the first, because it is calculated from it. DGAUI-INT contains all the states that the visual user interface can reach. This set of states can be calculated from the initial representation of the user interface. The initial state is formed from the properties of the user interface component definitions. Then the possible individual events over the components of this state are simulated, and the changes on the components determine a new state (one that already exists or one identified as new) and a transition between the current state and the new one. This is the application of the concept of state diagrams to a user interface, but generated from the interaction on individual user interface components.
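The state-generation step just described, simulating an event on the current state and applying the resulting property changes to obtain a new state, can be sketched as follows. The component names, properties and event table are a toy example, not DGAUI syntax:

```python
# One step of DGAUI-style state generation: applying an event's effects
# (changes to component properties) to the current state yields a new state.
# States map component names to property dicts; all data is illustrative.

initial_state = {
    "Button1": {"Visible": True, "Activo": True},
    "Panel1":  {"Visible": False, "Activo": False},
}

# Effect of LeftClick on Button1: show and activate Panel1 (toy example).
EFFECTS = {("Button1", "LeftClick"): {"Panel1": {"Visible": True, "Activo": True}}}

def apply_event(state, component, event):
    """Return the new state after an event, copying before mutating."""
    new_state = {c: dict(p) for c, p in state.items()}
    for comp, changes in EFFECTS.get((component, event), {}).items():
        new_state[comp].update(changes)
    return new_state

next_state = apply_event(initial_state, "Button1", "LeftClick")
```

Repeating this for every active component and every defined event, and recording whether each resulting state is new, yields the full state set and transition list.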
This notation is oriented to the state of each visual user interface component rather than the state of the whole visual user interface. A state of the visual user interface is obtained from the combined states of its components. Thus, the notation separates the presentation from the behaviour of the user interface: the presentation is located in the representation definition, while the functionality is located in the states and the transitions between them. These transitions between states are calculated using the possible user actions (events) on the visual user interface components, and also events of the system. We define an interface state as the set of all visual user interface components that, through the values of their properties, can be reached by the user for interaction at a given moment in time. For the definition of the notation we assume that: - User actions are not arbitrary. - The set of visual user interface states is finite and can be described and evaluated. - A visual user interface state depends on the components it contains and the properties of each of them. - A state is a moment when the visual user interface is waiting for a user action, and it does not change while the user does not interact with it. Each state is characterized by the value of each component in these four properties:
- Visible: visibility property of the component: visible (T) or not visible (F) on screen. - Activo: indicates whether the component responds to user actions (T) or not (F). If a component has Activo(F) in a state, no transition to another state can be caused by this component. - InfI: this property activates the input of data from the user for the component itself. If it has value True, the component accepts data given by the user. - InfO: the data output function of the component is activated with value True. If this property has value True, the component will visualize the data sent from the “core” of the application to the user. Events are user actions on input hardware devices of the system. These events are detected by the system, which responds as defined for each event. An event is a single user action; for example, drag-and-drop is the combination of three single actions or events: click, move and release. We can also define pre-conditions and post-conditions. If an event is defined using the notation over a visual user interface component and it has no pre-condition, the changes for the other components are always performed when the event over this component is produced. If a pre-condition exists, for example if the pre-condition for event RightClick over ComponentTwo is “ComponentOne:Activo(T)” (the property Activo of ComponentOne has value True), then this user action over ComponentTwo will not be performed if the value of ComponentOne in property Activo is F. For post-conditions, we define the values of the properties that must be satisfied to reach the next state. The notation does not limit the events that can be defined for interaction: the HCI engineer can define the events he considers necessary and communicate their meaning to the workgroup.
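The pre-condition mechanism can be sketched as a guard evaluated before an event's effects are applied. The condition string below mimics the "ComponentOne:Activo(T)" example from the text, but the parsing and data are simplified illustrations, not the DGAUI notation itself:

```python
# Sketch of evaluating a DGAUI-style pre-condition such as
# "ComponentOne:Activo(T)" before allowing an event to fire.
# Parsing and state layout are simplified for illustration.

def precondition_holds(state, condition):
    """condition has the form 'Component:Property(T)' or 'Component:Property(F)'."""
    component, rest = condition.split(":")
    prop, value = rest[:-3], rest[-2] == "T"   # e.g. "Activo", True
    return state[component][prop] is value

state = {"ComponentOne": {"Activo": False}, "ComponentTwo": {"Activo": True}}

# RightClick on ComponentTwo requires ComponentOne:Activo(T), so it is blocked here.
allowed = precondition_holds(state, "ComponentOne:Activo(T)")
```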
Some examples of basic events that we use are: - LeftClick: click of the left mouse button - RightClick: click of the right mouse button - ReLeClick: release of the left mouse button - ReRiClick: release of the right mouse button - MouseOn: mouse pointer over a component - Key(keyboard key): a keyboard key combination With DGAUI, from the events over the visual user interface components it is possible to calculate the events over states. Thus we can build a labelled directed state graph of the user interface and establish what the following state of the user interface is if we know which component is affected by an event. The vertices are the visual user interface states and the labelled arcs are the transitions between states. Two special states determine the functionality of the interface: - Initial state: a vertex all of whose associated arcs are outgoing and which has no incoming arc, so that no state can be reached without passing through the initial state.
- Final state: a vertex all of whose associated arcs are incoming, with none outgoing. An anomalous situation exists if two or more vertices have only incoming arcs, because then more than one final state would exist in the visual user interface. The set of possible following states after a user action can be obtained from the initial state. This is done by applying the events over the visual user components with property Activo(T) and making the associated interaction changes on the other visual user components. From this first set of following states, applying the same process to each one, the rest of the states can be obtained, until a final state is reached in which no visual user component has value True in the properties Activo and Visible. During the state-building process, transitions or arcs (labelled with the component and the event that is applied) to previously identified states can be recognized. Two states are equal (and therefore the same state) if all their visual user interface components have the same values in the properties Activo, Visible, InfI and InfO. The components that form part of a state are those that have a functionality within the state; a visual user interface component belongs to a state if it has one of the following kinds of functionality. One is that the component has a visual appearance in the visual interface that provides relevant information to the user (in this case the component property Activo has value True). The other is that the component causes changes on the properties of other components when an event is produced over it. One of the advantages of this notation is that it allows the visual properties of a component to be modified without its behaviour varying (the user might not see the component, but its behaviour is maintained along intermediate states).
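The initial-state and final-state criteria above reduce to simple degree checks on the labelled state graph (a vertex with no incoming arcs, and vertices with no outgoing arcs, respectively). The graph below is a toy instance, not one derived from a real DGAUI specification:

```python
# Sketch: identify initial and final states of a DGAUI state graph by arc
# direction. Arcs are (source, target, label); the graph is a toy example.

arcs = [
    ("S0", "S1", ("Button1", "LeftClick")),
    ("S0", "S2", ("Button2", "LeftClick")),
    ("S1", "S2", ("Panel1", "LeftClick")),
]
states = {"S0", "S1", "S2"}

outgoing = {s for s, _, _ in arcs}   # states with at least one outgoing arc
incoming = {t for _, t, _ in arcs}   # states with at least one incoming arc

initial = states - incoming          # only outgoing arcs: the initial state
final = states - outgoing            # only incoming arcs: final state(s)
```

If `final` contains more than one vertex, the anomalous multiple-final-state situation described in the text has been detected.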
If a modification of the appearance of a visual user interface component varies its functionality, then it is a different visual user interface component. The user's interpretation of the appearance of a component must be unique for each component, and must also identify its functionality to the user; otherwise the component is ambiguous and the visual user interface design is wrong. This situation exists in interaction models that use interactors. The specification of an interactor is defined to support the different interactor states reached through the occurrence of user actions. According to the traditional specification of interactors, for each interactor state there is a concrete functionality and a unique appearance. There is no model that considers multiple rendering functions for a single interactor state, because the specification is based on the interactor dialog instead of its visual appearance. In the DGAUI proposal, different appearances may exist for a visual user interface component, caused by user actions (visual operations), while the component's behaviour remains the same. Visual operations are, for example, resizing or other changes in the size of components. Using DGAUI, the consistency of the visual interface is maintained, because two components with the same appearance must have the same behaviour; but if a component's appearance is modified as a personal choice of the user, this modification will not affect the component's behaviour. Because this work is oriented to early phases of prototyping, the DGAUI proposal does not consider the abstract representation of application data. The participation in the user interface of components as elements that allow the user to choose input and output information is defined by including, in version 3.04, domain definitions
468
S. Gómez-Carnero and J. Rodeiro Iglesias
of data or a mask for text input. If a user action drastically changes the appearance of a visual user interface component, the new appearance must be a new visual user interface component and therefore a different visual user interface state. If the visual user interface is correct, there exists only one state in which all visual user interface components have the properties Activo and Visible with value False (the final state). The initial state can also be identified from the representation of the visual user interface components in DGAUI-DEF by examining their properties. The visual user interface component definition, the topological composition, and the dialog between components are constant for a visual user interface. The information for each state of the visual interface is determined by the values of the visual user interface component properties. Once the states of the visual user interface have been obtained, we use a state graph (multidigraph) to represent the whole set of transitions between states. In it, the vertices are the visual user interface states and the arcs are the transitions between states. The arcs are labelled with the name of the visual user interface component and the event that causes the transition. The XML document (DGAUI-INT) contains the following information:
- Topological composition: visual user interface components contained in other visual user interface components.
- Information about the visual user interface states. All visual user interface states are defined by the description and properties of their components. The initial state is obtained from the visual user interface component descriptions, and the other states are obtained by an automatic process.
- Set of transitions between states, obtained during the automatic state-identification process.
Information about the XML structure and examples of the DGAUI notation may be seen at http://www.ei.uvigo.es/~susanagomez/hci.html
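As a sketch of the state-identification process just described, the following Python fragment (ours, not part of DGAUI) explores the reachable states from the initial one, treats two states as equal when every component has the same values of Activo, Visible, InfI and InfO, and labels each arc with the (component, event) pair that fires it. The callbacks `apply_event` and `events_of` stand in for the real DGAUI-DEF interpretation and are assumptions.

```python
# Hypothetical sketch (ours, not part of DGAUI) of state-graph construction.

def state_key(state):
    """Identity of a state: the frozen property values of its components."""
    return frozenset((c, p["Activo"], p["Visible"], p["InfI"], p["InfO"])
                     for c, p in state.items())

def build_state_graph(initial, apply_event, events_of):
    """Explore all reachable interface states.

    initial     -- dict mapping component name -> property dict
    apply_event -- (state, component, event) -> successor state
    events_of   -- state -> iterable of enabled (component, event) pairs
    Returns (states, transitions), transitions being labelled arcs.
    """
    states = {state_key(initial): initial}
    transitions = []                         # (src_key, (comp, event), dst_key)
    frontier = [initial]
    while frontier:
        state = frontier.pop()
        src = state_key(state)
        for comp, ev in events_of(state):
            nxt = apply_event(state, comp, ev)
            dst = state_key(nxt)
            if dst not in states:            # unseen state: explore it later
                states[dst] = nxt
                frontier.append(nxt)
            transitions.append((src, (comp, ev), dst))
    return states, transitions

# Toy interface with two buttons; clicking a button deactivates and hides it,
# so the state where both are inactive and invisible is the final state.
def _events_of(state):
    return [(c, "click") for c, p in state.items() if p["Activo"]]

def _apply(state, comp, ev):
    nxt = {c: dict(p) for c, p in state.items()}
    nxt[comp]["Activo"] = nxt[comp]["Visible"] = False
    return nxt

_init = {c: {"Activo": True, "Visible": True, "InfI": "", "InfO": ""}
         for c in ("ok", "cancel")}
STATES, TRANSITIONS = build_state_graph(_init, _apply, _events_of)
```

In this toy run the exploration yields four states and four labelled transitions, and exactly one state satisfies the final-state condition (all components with Activo and Visible False).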
3 User Test Definition

Once the visual user interface is defined, if we want to evaluate it we must define the test over this interface. The objective is that the user interacts with the prototype while we record as many parameters or actions as we desire. DGAUI provides the description of the components and of the state appearance, which may be rendered on a standard rendering device. It also provides the user tasks and the states that the user can find when using the visual user interface. The first phase in automating the evaluation of a visual user interface is to define a notation that allows describing the atomic parts of the evaluation. As with DGAUI-DEF, we use XML to structure the notation, and we have created a DTD to allow easy parsing of the notation structure. It is possible to define as many evaluations as desired. Each evaluation is formed by a set of user tasks to perform. Each user task is described by the following information:
- A description (a textual description of the user task, for documentation).
- The parameters to evaluate and record during the evaluation process. These may be time parameters, for example the total time used by the user to perform the task, the time until the user starts interaction, the mean time between events,
Evaluator of User's Actions (Eua) Using the Model of Abstract Representation Dgaui
469
the time to the first user mistake, etc. Other parameters may be counter parameters, for example the number of user events, the number of user mistakes during the evaluation task, the number of times that an error happens, etc. The last kind of parameter is the error parameter, dedicated to identifying and controlling types of user mistakes, for example which is the most frequent user mistake, or which of the previously defined user mistakes are produced in a state.
- The visual user interface states that will be evaluated. This defines which states and transitions (visual user interface components and user actions) are presented to the user in the prototype.
Information about the XML structure and examples of the EUA notation may be seen at http://www.ei.uvigo.es/~susanagomez/hci.html
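A user-task definition of this kind, together with the time and counter parameters listed above, could be represented as in the following sketch; every field and method name here is our assumption, not the actual EUA XML schema.

```python
import time
from dataclasses import dataclass, field

# Hypothetical sketch of one EUA user task and of the recording of the time
# and counter parameters named in the text; the names are assumptions.

@dataclass
class UserTask:
    description: str                 # textual documentation of the task
    states: list                     # interface states shown to the user
    events: list = field(default_factory=list)  # (time, component, event, is_error)

    def record(self, component, event, is_error=False, timestamp=None):
        t = time.time() if timestamp is None else timestamp
        self.events.append((t, component, event, is_error))

    # --- counter parameters ---
    def n_events(self):
        return len(self.events)

    def n_errors(self):
        return sum(1 for e in self.events if e[3])

    # --- time parameters ---
    def time_to_first_error(self, start):
        errors = [t for t, _c, _e, err in self.events if err]
        return min(errors) - start if errors else None

# Usage: one task of an evaluation, with two recorded interactions.
task = UserTask("open a document", states=["s0", "s1"])
task.record("menu", "click", timestamp=1.0)
task.record("item", "click", is_error=True, timestamp=2.5)
```

Each recorded interaction would be written to the database in the same way, so the counters and timings can be recomputed offline during analysis.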
4 Evaluator of User's Actions (EUA)

The EUA tool allows evaluating dynamically the usability of the visual user interface. This evaluation is done with an interactive simulation of the interface. From the abstract notation of DGAUI (specifically DGAUI-INT) we can build the visual appearance of the interface states and simulate the user actions over the components. With the EUA notation it is possible to define the user tasks that the user can try. The simulation reproduces the visual appearance of the interface following the user tasks described in Section 3. The user interaction on the simulation is recorded according to the parameters defined in Section 3, and this information is stored in a database for later study and analysis. The tool allows the HCI engineer to define as many evaluations as necessary and to obtain quantitative information about the real use of the visual user interface by a user. Normally, the HCI engineer explains to the user the objectives to reach while evaluating the visual user interface on the prototype. Then, with the information obtained, the HCI engineer can determine whether the interface has any problem before starting to code it. The EUA tool has a simple interface with two basic functionalities:
- Load Interface: allows selecting an XML file that contains the visual user interface description (DGAUI-DEF and DGAUI-INT) and generates the visual appearance of the user interface states.
- Test Interface: to evaluate the visual user interface it is necessary to select the individual user task definitions that make up the evaluation (EUA XML file). With this definition and with the configuration parameters to access the database, the simulation is executed. For each user interaction with the prototype, the information about this interaction is stored in the database for its study.
Fig. 1 shows a visual user interface state generated by the EUA tool from a DGAUI description. The interface example corresponds to a basic text processor.
Fig. 1. Text processor prototype
5 Conclusions and Future Work

In this work we present an abstract representation of user interfaces specially designed for visual interactive systems. The focus of this representation is the visual aspect of the user interface, because this is the most important part of the user interface for the user. The most significant information that the user obtains from the user interface comes through its appearance, and the user's interaction with the interface is based on this signification. Another contribution of this work is the concept of a visual user interface component with different appearances (in most cases size and position) for the same behaviour, allowing different rendering functions for the same component state. We showed that it is possible to describe in a notation a set of interface user tasks and to automatically create a prototype to evaluate, with users, its behaviour and acceptability in a quantitative way. As future work we are working on exception definitions for the user interface behaviour, including system events and user interface answers resulting from database queries. Another line of work is to define metrics to be used with the information obtained from the evaluation, and to create a system that can automatically find errors in a user interface prototype using agents.

Acknowledgements. This work has been funded by projects TIN2005-08863-C0302 and 05VI-C02.
Adaptive Evaluation Strategy Based on Surrogate Model

Yi-nan Guo, Dun-wei Gong, and Hui Wang

School of Information and Electronic Engineering, China University of Mining and Technology, 221008 Xuzhou, China
[email protected]

Abstract. Human fatigue is a key problem in interactive genetic algorithms, since it limits population size and the number of generations. Aiming at this problem, evaluation strategies based on surrogate models have been presented, in which some individuals are evaluated by models instead of by the human. Most of these strategies adopt a fixed substitution proportion, which cannot alleviate human fatigue to the largest extent. A novel evaluation strategy with a variable substitution proportion is proposed, in which the substitution proportion depends on the model's precision and on human fatigue. Different proportions lead to three evaluation phases: evaluated by the human only, mixed evaluation by the human and the model, and evaluated by the model only. In the third phase, the population size is enlarged. Taking a fashion evolutionary design system as an example, the validity of the strategy is proved. Simulation results indicate the strategy can effectively alleviate human fatigue and improve the speed of convergence.
1 Introduction

Interactive genetic algorithms (IGAs) are a kind of genetic algorithm in which the fitness of individuals is evaluated subjectively by a human. They are now applied to many optimization problems whose objectives cannot be expressed by explicit fitness functions, ranging from music composition [1] to production design [2]. However, humans easily get tired during evaluation, which limits population size and the number of generations. So human fatigue is a key problem that limits the applications of IGAs. According to the physiological characteristics of humans, a person absorbed in one task for a long time easily gets tired. In IGAs, the human needs to evaluate each individual in each generation: the more individuals the human evaluates, the more tired he or she feels. If cognitive knowledge about the human's preference for the optimization problem can be extracted from the evolution and used to evaluate individuals instead of the human, the number of individuals evaluated by the human decreases, which alleviates human fatigue. This cognitive knowledge is generally described by surrogate models. Up to now, much research on surrogate models has been done. On the one hand, different methods were introduced for the description of surrogate models, such as artificial neural networks [3] and support vector machines [4]. On the other hand, different evaluation methods, such as sparse expression [5] and hybrid methods [6],

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 472–481, 2007. © Springer-Verlag Berlin Heidelberg 2007
were adopted in order to lower the complexity of evaluation and reduce the human burden. In all the above research, surrogate models replace the human to evaluate all individuals, or part of the individuals, in each generation so as to reduce the number of individuals evaluated by the human. But they did not exploit surrogate models fully. First, the proportion of the population evaluated by surrogate models in each generation is fixed, which alleviates human fatigue only to a limited extent. Second, the population size is small and fixed all the time, which limits the performance of IGAs. Surrogate models compute the fitness of individuals on computers, without human participation, so the population size can be enlarged when only surrogate models are adopted in evaluation. Aiming at solving the above problems, a novel adaptive evaluation strategy based on a surrogate model is proposed. The number of individuals evaluated by the surrogate model is adaptively tuned according to the degree of human fatigue and the evaluation precision of the model, so as to effectively alleviate human fatigue. When the population is evaluated only by the surrogate model, the population size is enlarged so as to improve the speed of convergence. In the rest of the paper, the adaptive evaluation strategy is explicated in Section 2. To validate the strategy, experiments based on a fashion evolutionary design system and their results are analyzed in Section 3. Finally, future work planned to introduce distributed neural networks into the surrogate model is outlined.
2 Adaptive Evaluation Strategy Based on Surrogate Model

When the human's preference for the optimization problem is stable, a surrogate model is adopted to evaluate individuals instead of the human. Here, two problems must be taken into account. First, the surrogate model must keep consistency with human cognition and preference in order to ensure the convergence of the algorithm; so how to obtain a model with great prediction precision and generalization is the base of the strategy. Second, how to use the model instead of the human in evaluation influences the performance of the algorithm. In this paper, the latter is of interest. In the adaptive evaluation strategy, when the model is started up in the evaluation process and how many individuals are evaluated by the model in each generation are two key problems, and few researches have addressed them so far.

2.1 Startup Mechanism for the Surrogate Model

The startup mechanism gives the conditions that decide when to start up the surrogate model in the evaluation process; in the generation in which these conditions are satisfied, the population can be evaluated by the surrogate model in the proper proportion. In general, when the human feels tired and the surrogate model has learned the human's preference exactly, the model is adopted to calculate the fitness of individuals. So the startup mechanism includes two conditions: the condition of human fatigue and the condition of the model's precision. When either condition is satisfied, the surrogate model is started up for evaluation. This evaluation strategy is shown as follows.
F(P(t)) = {F_m(I, t), F_u(I', t) | I ≠ I', I, I' ∈ P(t)}, F_m ≠ ∅, ∀((Fa(t) ≥ ε) ∨ (Trf(t) ≥ Ψ))   (1)
where F_m denotes the fitness value calculated by the surrogate model and F_u denotes the fitness value given by the human. I and I' express individuals evaluated by the surrogate model and by the human respectively. P(t) denotes the population in the t-th generation. Fa(t) ≥ ε describes the condition of human fatigue, where Fa(t) expresses the degree of human fatigue and ε is the threshold for human fatigue. The degree of human fatigue reflects how tired the human is. Letting v(t) denote the time that the human spends evaluating and β(t) denote the proportion of the population evaluated by the human, the degree of human fatigue is defined as follows [7]:

Fa(t) = 1 − e^{−t·v(t)·β(t)·S(t)}   (2)
where t is the generation and S(t) is the similarity of the population, which describes the average similarity of individuals in the population:

S(t) = (2 / (|P|·(|P| − 1))) · Σ_{i=1}^{|P|−1} Σ_{j=i+1}^{|P|} Σ_{l=1}^{n} σ_l(x_i(t), x_j(t))   (3)
where |P| is the population size and n is the length of individuals. σ_l(x_i(t), x_j(t)) expresses the similarity of the l-th bit between two individuals: σ_l(x_i(t), x_j(t)) = 1 if the l-th bit of x_i(t) is the same as that of x_j(t), otherwise σ_l(x_i(t), x_j(t)) = 0. The human spends more time evaluating the individuals when there are more similar individuals in the population. It is obvious that the human feels more tired when the total number of individuals evaluated by the human is larger and the time for evaluation in each generation is longer. Trf(t) ≥ Ψ describes the condition of the model's precision, where Trf(t) expresses the reliability of the surrogate model and Ψ is the threshold for the reliability of the model. The reliability of the model reflects the consistency between the surrogate model and the human's preference. It is measured through the average Euclidean distance, over the individuals in a sampling population, between the fitness value calculated by the model and the fitness value given by the human:

Trf(t) = (1/|Ps|) · Σ_{i=1}^{|Ps|} (F_u(I_i, t) − F_m(I_i, t))²   (4)

where |Ps| is the sampling population size.
In a word, whether the surrogate model is started depends on two conditions: whether or not the degree of human fatigue exceeds the threshold for human fatigue, and whether or not the reliability of the surrogate model exceeds the threshold for the reliability of the model.
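Formulas (2) and (3) and the startup test of formula (1) translate directly into code, as in the sketch below (ours, not from the paper). The averaging form used for Trf(t), a mean squared difference over the sampling population, is our assumption about formula (4).

```python
import math

# Sketch (ours) of the startup quantities: S(t), Fa(t), Trf(t) and the
# startup condition of formula (1).  Individuals are equal-length bit strings.

def similarity(population):
    """S(t): formula (3) -- 2 * total per-bit agreement over all pairs,
    divided by |P|*(|P|-1)."""
    p = len(population)
    total = sum(sum(1 for a, b in zip(population[i], population[j]) if a == b)
                for i in range(p - 1) for j in range(i + 1, p))
    return 2.0 * total / (p * (p - 1))

def fatigue(t, v_t, beta_t, s_t):
    """Fa(t) = 1 - exp(-t * v(t) * beta(t) * S(t))   -- formula (2)."""
    return 1.0 - math.exp(-t * v_t * beta_t * s_t)

def reliability(f_human, f_model):
    """Trf(t): mean squared difference on the sampling population
    (the averaging form is an assumption about formula (4))."""
    return sum((u - m) ** 2 for u, m in zip(f_human, f_model)) / len(f_human)

def start_surrogate(fa_t, trf_t, eps, psi):
    """Formula (1): the model starts when either threshold is reached."""
    return fa_t >= eps or trf_t >= psi
```

With the paper's thresholds ε = Ψ = 0.7, the model would start as soon as either the fatigue degree or the reliability reaches 0.7.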
2.2 The Proportion of Population Evaluated by Surrogate Model
The proportion of the population evaluated by the surrogate model decides how many individuals are evaluated by the model in each generation. Up to now, in most evaluation strategies based on surrogate models, this proportion is fixed, which limits the effect of surrogate models on performance. Aiming at this problem, the proportion of the population evaluated by the model varies adaptively. Two factors are taken into account to decide this proportion. First, when the human feels more tired, the human hopes that fewer individuals need to be evaluated by himself or herself. Second, when the reliability of the surrogate model is higher, it is more appropriate for the model to evaluate more individuals instead of the human. So the proportion of the population evaluated by the model in the t-th generation is defined as

ρ(t) = Fa(t)·(1 − e^{−Trf(t)})   (5)
So the number of individuals evaluated by the model in the t-th generation is

N_f(t) = ⌊|P|·ρ(t)⌋   (6)

2.3 Substitution Mechanism for the Surrogate Model
In general, the evaluation process of IGAs adopting evaluation strategies with a fixed proportion of the population evaluated by the surrogate model can be divided into two phases, with two kinds of division. If ρ(t) = 1 while the conditions of the startup mechanism are satisfied, the two phases are a phase evaluated by the human only and a phase evaluated by the surrogate model only. If ρ(t) < 1 while the conditions of the startup mechanism are satisfied, the two phases are a phase evaluated by the human only and a phase of mixed evaluation by the human and the surrogate model. In this paper, however, an adaptive proportion of the population evaluated by the model is adopted, so the evaluation process differs from the above. According to the number of individuals evaluated by the human in each generation, there are three phases in the evaluation process of IGAs: evaluated by the human only, mixed evaluation by the human and the model, and evaluated by the model only. Here, the third phase is of interest: in this phase, the population size is enlarged, because there is no human fatigue when the surrogate model is adopted as an implicit fitness function.

Phase I: The population is evaluated by the human only. In this phase, all individuals are evaluated by the human and the surrogate model is not started up. So the evaluation mode and the number of individuals evaluated by the model are defined as follows.
F(P(t)) = {F_u(I', t) | I' ∈ P(t)}, ∀ Fa(t) < ε, Trf(t) < Ψ   (7)

N_f(t) = 0   (8)
It is obvious that in this phase the human does not feel tired and the surrogate model cannot yet reflect the human's preference exactly. This situation typically appears at the beginning of the evolution, so this evaluation mode is usually adopted in the early part of the evolution.

Phase II: The population is evaluated jointly by the human and the surrogate model. In this phase, the surrogate model is started up. Some of the individuals are evaluated by the human and the others' fitness values are calculated by the model. So the evaluation mode and the number of individuals evaluated by the model are as follows.
F(P(t)) = {F_m(I, t), F_u(I', t) | I ≠ I', I, I' ∈ P(t)}, ∀ Fa(t) < ε, Trf(t) ≥ Ψ   (9)

N_f(t) = ⌊|P|·Fa(t)·(1 − e^{−Trf(t)})⌋   (10)
In this phase, the degree of human fatigue does not exceed the threshold and the surrogate model has learned the human's preference exactly, so the number of individuals evaluated by the model increases. In the above two phases, the population size is fixed and small because the human participates in the evaluation process; in general, the population size in IGAs is less than ten, to alleviate human visual fatigue.

Phase III: The population is evaluated by the surrogate model only. In this phase, all individuals are evaluated by the surrogate model. So the evaluation mode is as follows.
F(P(t)) = {F_m(I, t) | I ∈ P(t)}, ∀ Fa(t) ≥ ε   (11)
The human often feels very tired in the later part of the evolution, when this evaluation mode is adopted. Because the evaluation based on the surrogate model is done by computers, the evaluation process in this phase is the same as in traditional genetic algorithms, so the population size can be enlarged. How to extend the population size is then a key problem: the higher the precision of the surrogate model, the better its generalization, and the larger the population can be. So the enlarged population size is defined as

N_p(t) = |P|^{⌊1/(Trf(t₀)·F_max) + 0.5⌋}   (12)

where F_max denotes the upper limit of fitness and Trf(t₀) expresses the reliability of the surrogate model at the moment the Phase III evaluation strategy is adopted. It is obvious that the value of the exponent in formula (12) may be 1, 2 or 3, so the population size may be enlarged to the corresponding power of |P|.
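The adaptive quantities of this section can be sketched as follows (ours, not from the paper): the substitution proportion of formula (5), the count of model-evaluated individuals of formulas (6)/(10), the enlarged population size of formula (12), and the three-phase dispatch of formulas (7), (9) and (11).

```python
import math

# Sketch (ours) of the adaptive evaluation quantities of Section 2.

def proportion(fa_t, trf_t):
    """rho(t) = Fa(t) * (1 - exp(-Trf(t)))   -- formula (5)."""
    return fa_t * (1.0 - math.exp(-trf_t))

def n_model_evaluated(pop_size, fa_t, trf_t):
    """N_f(t) = floor(|P| * rho(t))   -- formulas (6) and (10)."""
    return int(pop_size * proportion(fa_t, trf_t))

def enlarged_size(pop_size, trf_t0, f_max):
    """N_p(t) = |P| ** floor(1/(Trf(t0)*F_max) + 0.5)   -- formula (12)."""
    return pop_size ** int(1.0 / (trf_t0 * f_max) + 0.5)

def phase(fa_t, trf_t, eps, psi):
    """Phase 1: human only (7); 2: mixed (9); 3: model only, enlarged (11)."""
    if fa_t >= eps:
        return 3
    if trf_t >= psi:
        return 2
    return 1
```

For example, with |P| = 8 and Trf(t₀)·F_max = 0.25, the exponent ⌊4.5⌋ = 4 would blow the population up to 8⁴; the paper's remark that the exponent stays in {1, 2, 3} implicitly bounds Trf(t₀)·F_max from below.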
3 Simulations and Analysis

3.1 Background for Simulations
In this paper, a fashion evolutionary design system is adopted as a typical background to validate the rationality of the adaptive evaluation strategy. The goal of the system is to find a dress which wins the favor of the human [8]. Visual Basic 6.0 is used as the programming tool for the human-machine interface and Microsoft Access as the database. Matlab 6.5 is adopted to train the surrogate model based on artificial neural networks. In the fashion evolutionary design system, each dress is composed of collar, skirt and sleeve. Each part has two factors, pattern and color, each described by two bits. So each dress is expressed by 12 bits, which act as 6 gene-meaning-units (GM-units) [9]. Each gene-meaning-unit has four alleles. The meanings of each allele in a gene-meaning-unit are shown in Table 1.

Table 1. The meanings of each allele in a gene-meaning-unit

| GM-unit          | 00            | 01            | 10           | 11          |
|------------------|---------------|---------------|--------------|-------------|
| collar's pattern | medium collar | high collar   | wide collar  | gallus      |
| sleeve's pattern | long sleeve   | medium sleeve | short sleeve | nonsleeve   |
| skirt's pattern  | long skirt    | formal skirt  | medium skirt | short skirt |
| color            | pink          | blue          | black        | white       |
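The encoding of Table 1 can be exercised with a small decoder, as sketched below; the ordering of the six GM-units inside the 12-bit chromosome is our assumption, since the paper does not fix it.

```python
# Sketch of the 12-bit chromosome of Table 1: six two-bit gene-meaning-units
# (pattern and color for each of collar, sleeve and skirt).  The unit order
# inside the chromosome is an assumption.

ALLELES = {
    "collar": {"00": "medium collar", "01": "high collar",
               "10": "wide collar",   "11": "gallus"},
    "sleeve": {"00": "long sleeve",   "01": "medium sleeve",
               "10": "short sleeve",  "11": "nonsleeve"},
    "skirt":  {"00": "long skirt",    "01": "formal skirt",
               "10": "medium skirt",  "11": "short skirt"},
    "color":  {"00": "pink", "01": "blue", "10": "black", "11": "white"},
}

# (phenotype name, allele table) for each GM-unit, in assumed order.
UNITS = [("collar", "collar"), ("collar color", "color"),
         ("sleeve", "sleeve"), ("sleeve color", "color"),
         ("skirt",  "skirt"),  ("skirt color",  "color")]

def decode(chromosome):
    """Map a 12-bit string onto the phenotype described by Table 1."""
    assert len(chromosome) == 12
    return {name: ALLELES[table][chromosome[2 * i: 2 * i + 2]]
            for i, (name, table) in enumerate(UNITS)}
```

For instance, `decode("110111011101")` yields a blue gallus/nonsleeve/short-skirt dress, i.e. one plausible phenotype for the "fitting for summer, color blue" objective of Experiment II.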
3.2 Desired Objectives and Parameters in Experiments
In order to validate the rationality of the adaptive evaluation strategy and its influence on the performance of IGAs, two groups of experiments are designed. They have different desired objectives, which reflect different psychological requirements of the human:

Experiment I: To find a favorite dress fitting for summer, without any limit on color.
Experiment II: To find a favorite dress fitting for summer whose color is blue.

In both experiments, an artificial neural network is adopted as the surrogate model. The values of the parameters of the model and of the evolution are shown in Table 2.

3.3 Analysis of the Performance of the Adaptive Evaluation Strategy
In order to validate the rationality of IGAs with the adaptive evaluation strategy (AES-IGAs), 30 persons were gathered to do two groups of experiments aiming at the desired objective of Experiment II.
Table 2. The values of the parameters

Parameters of the evolution:

| crossover probability | mutation probability | population size | generation | ε   | Ψ   |
|-----------------------|----------------------|-----------------|------------|-----|-----|
| 0.5                   | 0.01                 | 8               | 40         | 0.7 | 0.7 |

Parameters of the model:

| input neurons | hidden neurons | output neurons | learning rate | epochs | error |
|---------------|----------------|----------------|---------------|--------|-------|
| 6             | 15             | 1              | 0.09          | 15000  | 10⁻²  |
Group I: Comparison of different proportions of the population evaluated by the surrogate model. A fixed proportion and the adaptive proportion of the population evaluated by the surrogate model are adopted in the experiments respectively. The testing results of all persons are integrated, as shown in Table 3.

Table 3. Comparison of the performance for different proportions of the population

| proportion of the population   | average generation | average number of individuals evaluated by human |
|--------------------------------|--------------------|---------------------------------------------------|
| ρ(t) = 0.5                     | 16                 | 100                                               |
| ρ(t) = 1                       | 14                 | 72                                                |
| ρ(t) = Fa(t)·(1 − e^{−Trf(t)}) | 12                 | 52                                                |
It is obvious that the difference in average generation among the different proportions is small, but the difference in the average number of individuals evaluated by the human is large. First, if ρ(t) = 0.5, the number of individuals evaluated by the human equals half the population size once the startup condition is satisfied, but the human must keep evaluating individuals all along; so the human feels more tired with this evaluation strategy than with the others. Second, if ρ(t) = 1, all individuals are evaluated by the surrogate model once the human feels tired and the model can reflect the human's preference exactly. Although the human evaluates fewer individuals than in the first strategy, the model is started later than in the adaptive evaluation strategy, so the number of individuals evaluated by the human is larger than in the last strategy.

Group II: Comparison of different population sizes in Phase III. A fixed population size and the enlarged population size are adopted in the experiments respectively. The testing results are shown in Table 4.
Table 4. Comparison of performance for different population sizes in Phase III

| population size                        | average generation | average number of individuals evaluated by human |
|----------------------------------------|--------------------|---------------------------------------------------|
| \|P\|                                  | 13                 | 52                                                |
| \|P\|^{⌊1/(Trf(t₀)·F_max) + 0.5⌋}      | 12                 | 52                                                |
It is obvious that the different population sizes in Phase III do not influence the degree of human fatigue in evaluation, because the average number of individuals evaluated by the human is the same. But the speed of convergence when adopting the enlarged population size is faster than when adopting the fixed population size. The reason is that the exploration of the algorithm is better when the population size is larger.

3.4 Comparison of the Performance of IGAs
In order to validate the improvement in the performance of IGAs with the adaptive evaluation strategy, 30 persons were gathered. Everyone did four experiments: Experiment I adopting IGA and AES-IGA respectively, and Experiment II adopting IGA and AES-IGA respectively. For each experiment, the testing results of all persons are integrated, as shown in Table 5.

Table 5. Comparison of performance with IGAs and AES-IGAs

| Experiment                                                        | I (IGA) | I (AES-IGA) | II (IGA) | II (AES-IGA) |
|-------------------------------------------------------------------|---------|-------------|----------|--------------|
| Average generation                                                | 28      | 9           | 40       | 12           |
| Average number of individuals evaluated by human                  | 224     | 41          | 240      | 52           |
| Average number of individuals evaluated by human per generation   | 8       | 4           | 8        | 4            |
| Generation when Fa(t) ≥ ε                                         | -       | 7           | -        | 9            |
Comparing the testing results in Experiment I, the number of generations with AES-IGA is on average 68.9% lower than with IGA, and the total number of individuals evaluated by the human with AES-IGA is on average 80% lower. These results indicate that the adaptive evaluation strategy can effectively alleviate human fatigue and speed up convergence, reducing the human burden of evaluation and letting the human concentrate on more creative design work.
Comparing the testing results between the two groups of experiments, the generation at which Fa(t) ≥ ε in Experiment I is lower than in Experiment II. This means humans get tired more easily when they must attend to more gene-meaning-units, which matches human physiological rules.
4 Conclusion

In order to further alleviate human fatigue in interactive genetic algorithms, a novel adaptive evaluation strategy with a variable substitution proportion is proposed. A startup mechanism for the surrogate model, considering the degree of human fatigue and the evaluation precision of the model, is given, and a variable proportion of the population evaluated by the surrogate model is proposed. Three phases are distinguished according to the number of individuals evaluated by the human in each generation: evaluated by the human only, mixed evaluation by the human and the model, and evaluated by the model only. In the third phase, the population size is enlarged. Taking a fashion evolutionary design system as a testing platform, the validity of the adaptive evaluation strategy is validated for different psychological requirements of the human. Comparing the testing results of IGAs with a fixed proportion of the population evaluated by the surrogate model or a fixed population size against AES-IGAs with the adaptive evaluation strategy proposed in this paper, the results indicate the adaptive evaluation strategy converges faster than the others and the human feels less tired. Compared with canonical IGAs, AES-IGAs can effectively alleviate human fatigue and improve the speed of convergence. A surrogate model based on distributed neural networks is the subject of future research.

Acknowledgements. This work was supported by the National Postdoctoral Science Foundation of China under grant 2005037225, the Postdoctoral Science Foundation of Jiangsu under grant 2004300, and the Youth Science Foundation of CUMT under grant OC 4465.
References

1. Biles, J.A., Anderson, P.G., Loggi, L.W.: Neural Network Fitness Functions for a Musical IGA. In: Proc. of the Symposium on Intelligent Industrial Automation & Soft Computing, pp. 39–44 (1996)
2. Takagi, H.: Interactive Evolutionary Computation: System Optimization Based on Human Subjective Evaluation. In: Proc. of IEEE Conference on Intelligent Engineering Systems, pp. 1–6 (1998)
3. Zhou, Y., Gong, D.W., Hao, G.S., et al.: Neural Network Based Phase Estimation of Individual Fitness in Interactive Genetic Algorithm. Control and Decision 20, 234–236 (2005)
4. Wang, S.F., Wang, S.H., Wang, X.F.: Improved Interactive Genetic Algorithm Incorporating with SVM and Its Application. Journal of Data Acquisition & Processing 18, 429–433 (2003)
5. Lee, J.Y., Cho, S.B.: Sparse Fitness Evaluation for Reducing User Burden in Interactive Genetic Algorithm. In: Proc. of IEEE International Fuzzy Systems, pp. 998–1003 (1999)
6. Sugimoto, F., Yoneyama, M.: An Evaluation of Hybrid Fitness Assignment Strategy in Interactive Genetic Algorithm. In: 5th Workshop on Intelligent & Evolutionary Systems, pp. 62–69 (2001)
7. Guo, Y.N., Cheng, J., Gong, D.W.: Knowledge-inducing Interactive Genetic Algorithms Based on Multi-agent. In: Jiao, L., Wang, L., Gao, X., Liu, J., Wu, F. (eds.) ICNC 2006. LNCS, vol. 4221, pp. 769–779. Springer, Heidelberg (2006)
8. Kim, H., Cho, S.B.: Application of Interactive Genetic Algorithm to Fashion Design. Engineering Applications of Artificial Intelligence 13, 635–644 (2000)
9. Hao, G.S., Gong, D.W., Shi, Y.Q.: Interactive Genetic Algorithm Based on Landscape of Satisfaction and Taboos. Journal of China University of Mining & Technology 34, 204–208 (2005)
A Study on the Improving Product Usability Applying the Kano’s Model of Customer Satisfaction Jeongyun Heo, Sanhyun Park, and Chiwon Song MC R&D Center, LG Electronics Inc, Kasan-Dong, KumChon-Ku, Seoul, Korea {jy_heo,sanghyun,chiwon}@lge.com
Abstract. User-centeredness is a popular approach for achieving user satisfaction. Nevertheless, when considering profit optimization under economic efficiency and the limits of the development period, it is almost impossible to apply solutions to all the usability problems reported during testing. Therefore, a strategic approach is required to maximize the perceived usability under these limited circumstances. Physical User Interaction (PUI) is defined as the physical side of usability and a broader concept of usability. In this research, we constructed UI guidelines for the PUI of mobile phones reflecting users' values. We applied Kano's model of customer satisfaction to classify the PUI guidelines into two groups. One is the design standards that must be satisfied to guarantee minimum satisfaction. The other is the value-adding criteria for holding a dominant position compared to competitive products. From this categorization, we could use the PUI design guidelines not only for evaluating current product quality, but also for finding the direction of strategic value improvement. Keywords: PUI (Physical User Interaction), Customer satisfaction, classification of usability problems, Perceived usability, Kano's model of customer satisfaction.
1 Introduction As the mobile phone has become the centre of digital convergence, a large number of features such as the camera, MP3 player, TV, and other functions have been loaded onto the mobile phone. These added features have brought various trials to product design, and as a consequence, new types of usability problems mainly related to product design have arisen. Bevan (1999) suggested the concept of "Quality in Use" as an enlarged concept of usability to emphasize that a system should match real user needs in a working environment. If the designer does not reflect the real usage context in the product design, the user may experience inconvenience and discomfort and be dissatisfied with the product. This research introduces the concept of Physical User Interaction (PUI) to define the physical side of usability and a broader concept of usability. J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 482–489, 2007. © Springer-Verlag Berlin Heidelberg 2007
Physical User Interaction (PUI) is defined as the physical side of usability and a broader concept of usability. PUI issues come not only from the ergonomics area, which one could easily conceive, but also from the usage experience of similar devices, emotional preferences, and the usage context of each function. Imagine a camera-phone that needs many clicks to take a photo because of the absence of a quick access key for the camera mode; you would probably not be satisfied with it. Likewise, if the buttons are too hard to press, your perceived usability will be bad no matter how good the usability of the other features is. Research based on real industrial field cases has reported that usability evaluation and improvement activities that do not consider users' values cannot lead to product value enhancement in the real market. The main reason seems to be that evaluation during the product development phase usually focuses on the detection of dissatisfaction factors and the improvement of detected issues. Besides, the internal development environment does not allow all detected issues to be improved, because of economic concerns such as maximizing ROI (Return on Investment) or constraints on development time. In other words, applying all the issues found to the product is almost impossible. That is why a strategic approach, such as applying issues according to priority, is needed. The priority of usability issues is usually decided by considering the severity of the issue itself, and most of the gap between the evaluated usability and the user-perceived usability comes from this priority difference. To enhance the perceived usability, we should find a way to reflect users' values in the priority of the usability issues. Kano's method (1984) shows a reasonable approach for reflecting users' values and understanding customer-defined quality. He reveals the relations between customer satisfaction and product requirements.
Furthermore, he characterizes the product requirements that influence customer satisfaction into three different groups: must-be requirements, attractive requirements, and one-dimensional requirements. Must-be quality is mandatory; without it, users cannot be satisfied at all. An attractive requirement is an optional one: if this type of requirement is provided, users may be attracted by the product, but without it users do not feel inconvenience. A one-dimensional requirement is functionally related to users' satisfaction: if not provided, users are dissatisfied; if provided, users are satisfied. In this research, we applied Kano's model to constructing UI guidelines for the Physical User Interaction (PUI) of mobile phones. Classification of usability issues considering their potential effect is the starting point of this research. We then applied Kano's model of customer satisfaction to prioritize the issues while reflecting users' values. This priority is the basis of strategic improvement. This research categorizes the UI guidelines into two groups. One is the design standards that must be satisfied. The other is the comparison criteria for holding a dominant position compared to competitive products. From this, the UI design guidelines may be applied not only to evaluating product quality, but also to providing a direction for value improvement. Furthermore, the use of the UI design guidelines may expand from quality control to user satisfaction.
2 Kano Model: The Theory of Attractive Quality Kano's model (1984) is based on the two-factor theory of job satisfaction by Herzberg (1974), which suggests that the factors causing job satisfaction are different from the factors causing job dissatisfaction. According to Kano, quality cannot be explained with a one-dimensional recognition. For instance, people are very dissatisfied
if they cannot make a call, but they are not especially satisfied when they can. The one-dimensional view of quality cannot explain this case. Kano et al. (1984) introduced a model that categorizes quality attributes based on customers' satisfaction with the level of quality. This view is useful for understanding how customers evaluate a product. Kano defined the customer expectations for product quality at five levels: 1) must-be, 2) one-dimensional, 3) attractive, 4) indifferent, and 5) reverse. Must-be quality is the minimum requirement to avoid the customer's dissatisfaction; it is also referred to as the must-have level. One-dimensional quality has a one-dimensional relationship between quality and satisfaction: users' level of satisfaction is proportional to the provided quality. Attractive quality is the counterpart of must-be quality: although the absence of attractive quality does not promote users' dissatisfaction, if provided, these features can excite and delight users. Indifferent quality is quality that results in neither customer satisfaction nor customer dissatisfaction. Although the Kano model explains the relationship between product quality and users' satisfaction, it is also applicable to the relationship between usability and users' satisfaction (Jokela, 2005). Fig. 1 shows the applied Kano model.
Fig. 1. The Kano model applied to the usability domain
We defined the characteristics of utility for users' satisfaction at three levels: 1) basic, 2) opportunity, and 3) attractive. The terms are changed for clarity, but the meanings are almost the same as explained above. While developing a product, not all of the guidelines can be applied, because of environmental constraints such as time and resources. Even worse, there may be conflicts among the guidelines, so we have to decide which one to apply. By adopting Kano's model of customer satisfaction, we can classify the PUI guidelines with priorities, and this will be helpful for finding the direction of strategic value improvement.
3 Constructing the PUI Guideline Using Kano's Model A guideline is a useful tool for an organization to systematically improve and monitor the usability of a product. A guideline can be structured by grouping the cases of PUI issues and then revised from the applicable design principles. This section introduces the details of the suggested PUI guideline.
3.1 Constructing PUI Guidelines A total of 106 design evaluation items were constructed based on design principles obtained from the usability problems reported through commercialized products. Some of these evaluation items are presented in the Appendix. 3.2 Adopting Kano's Model to Categorize the Collected PUI Guidelines Kano's survey model asks about users' preferences both in the case where the feature is provided and where it is not. The survey with the organized evaluation items was performed with sixty users. The users were not selected by gender, and their ages ranged from twenty-four to thirty-one. Kano's survey model is designed as follows.
Fig. 2. A pair of requirement questions in a Kano questionnaire
The results can be classified into three categories according to Kano's evaluation standard, shown in Fig. 3 below.
Fig. 3. Kano evaluation table adapted from Berger et al.(1993)
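The mapping of answer pairs to Kano categories can be expressed programmatically. The sketch below uses the standard Kano evaluation table after Berger et al. (1993) with the conventional five answer options; the paper's adapted table in Fig. 3 may differ in detail, so the labels here are assumptions:

```python
# Sketch of the standard Kano evaluation table (after Berger et al., 1993).
# Rows: answer to the functional question ("if the feature is provided...");
# columns: answer to the dysfunctional question ("if it is not provided...").
# A=Attractive, O=One-dimensional, M=Must-be, I=Indifferent, R=Reverse,
# Q=Questionable.

KANO_TABLE = {
    "like":      {"like": "Q", "must-be": "A", "neutral": "A", "live-with": "A", "dislike": "O"},
    "must-be":   {"like": "R", "must-be": "I", "neutral": "I", "live-with": "I", "dislike": "M"},
    "neutral":   {"like": "R", "must-be": "I", "neutral": "I", "live-with": "I", "dislike": "M"},
    "live-with": {"like": "R", "must-be": "I", "neutral": "I", "live-with": "I", "dislike": "M"},
    "dislike":   {"like": "R", "must-be": "R", "neutral": "R", "live-with": "R", "dislike": "Q"},
}

def classify(functional_answer, dysfunctional_answer):
    """Map one respondent's answer pair to a Kano category."""
    return KANO_TABLE[functional_answer][dysfunctional_answer]
```

For example, a respondent who likes having a feature and dislikes its absence is counted as one-dimensional, while one who is neutral either way is counted as indifferent.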
Kano's classification defines the representative property as the one that receives the most votes. However, this method hardly reflects the differences between the preferences of individual users. To compensate, Berger (1993) proposed users' satisfaction coefficients. However, if
most of the responses are irrelevant ones, such as indifferent, reverse, or questionable, another index for selecting meaningful properties is needed. We propose an effective coefficient in order to isolate effective responses from those that are not. The three coefficients are defined as follows:

Users' satisfaction coefficient = (A + O) / Total responses.    (1)

Users' dissatisfaction coefficient = (O + E) / Total responses.    (2)

Effective coefficient = (A + O + E) / Total responses.    (3)
The users' satisfaction coefficient is the ratio of positive responses: "One-dimensional" and "Attractive" are the positive responses, because they are directly proportional to increases in usability. The users' dissatisfaction coefficient is defined as the ratio of negative responses: "One-dimensional" and "Must-be" are the negative responses, because they tend to decrease users' satisfaction when usability is inferior. Here, the total responses are defined as the sum of the four kinds of responses, excluding the irrelevant responses, which seem unreliable, and the questionable responses. The effective coefficient is defined as the ratio of meaningful properties: "Must-be," "One-dimensional," and "Attractive" are the elements that directly affect the user. In this research, we defined a property with an effective coefficient of 0.65 or higher as valid. Fig. 4 plots the valid properties with effective coefficients of 0.65 or more by their users' satisfaction and dissatisfaction coefficients.
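Equations (1)-(3) are straightforward to compute from the response counts. A minimal sketch, where A, O, E, and I denote the counts of attractive, one-dimensional, must-be, and indifferent responses for one evaluation item:

```python
def kano_coefficients(A, O, E, I):
    """Satisfaction, dissatisfaction, and effective coefficients of
    Eqs. (1)-(3). Reverse and questionable responses are assumed to be
    excluded from the counts already, as the text describes."""
    total = A + O + E + I
    satisfaction = (A + O) / total       # Eq. (1)
    dissatisfaction = (O + E) / total    # Eq. (2)
    effective = (A + O + E) / total      # Eq. (3)
    return satisfaction, dissatisfaction, effective

def is_valid(effective_coefficient, threshold=0.65):
    """An item is kept as a valid property when its effective
    coefficient reaches the 0.65 cut-off used in this research."""
    return effective_coefficient >= threshold
```

For example, counts of A=20, O=15, E=10, I=10 give coefficients of roughly 0.64, 0.45, and 0.82, so that item passes the 0.65 cut-off and would be plotted in Fig. 4.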
Fig. 4. Three Utilities Classified Using Satisfaction Coefficients
According to Fig. 4, the "One-dimensional" properties are positioned in the first quadrant, "Must-be" in the second quadrant, and "Attractive" in the fourth quadrant. The properties in the third quadrant are known as indifferent properties according to Kano's model. 3.3 Strategic Use of the Proposed PUI Guidelines Properties related to fundamental quality are defined as "Basic Utility." The Basic Utility includes screen resolution and size when watching DMB (Digital
Media Broadcasting) or setting keys to prevent user errors such as outside interruption or unintended input. Moreover, emotional satisfaction factors relating to cell phone design and images, such as a well-taken picture, are defined as "Opportunity Utility." This categorization reflects the fact that users' expectations become higher as cell phones' functions and quality improve. However, unexpectedly, "Attractive Utility" includes hardly any items, because the items used in the survey were already extracted from preexisting usability problems and design principles. Likewise, improvement based on preexisting usability evaluation can rarely bring an epochal reformation of the product or of customer satisfaction. An improvement strategy can be described as follows, based on the Kano model's three types of property classification. Basic utility is defined as a product usability standard that must be applied to the product; if it is not, the product will fail in the market because of users' complaints. Opportunity utility is used as a comparative evaluation standard, because products need marketability in order to compete with other companies' products. The diagram below shows the suggested strategy for effectively improving the usability of a product.
Fig. 5. Strategy for improving usability considering the characteristics of utility
In particular, since these properties have an independent relationship, basic utility should be applied to the product, because users' dissatisfaction can occur when it is not satisfied (Jokela, 2004), and this cannot be compensated for by the addition of attractive utility. Moreover, this property classification depends on time, so that an attractive utility may become a basic utility after some years. This trend tends to appear particularly in products that require a short development period, such as cell phones; therefore, constant usability testing is necessary in order to identify new kinds of attractive utility and to apply these new properties to the product.
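The quadrant rule of Fig. 4, which separates the three utilities by their satisfaction and dissatisfaction coefficients, can be written as a small classifier. The 0.5 cut-off splitting the quadrants is an illustrative assumption; the figure's actual axis midpoints may differ:

```python
# Sketch of the Fig. 4 quadrant rule: x = users' satisfaction coefficient,
# y = users' dissatisfaction coefficient. The 0.5 cut-off is an assumption.

def classify_utility(satisfaction, dissatisfaction, cut=0.5):
    if satisfaction >= cut and dissatisfaction >= cut:
        return "one-dimensional"  # 1st quadrant: matters both ways
    if satisfaction < cut and dissatisfaction >= cut:
        return "must-be"          # 2nd quadrant: hurts only when absent
    if satisfaction >= cut:
        return "attractive"       # 4th quadrant: delights when present
    return "indifferent"          # 3rd quadrant: little effect either way
```

A guideline item classified as must-be would feed the design-standard group, while attractive and one-dimensional items would feed the value-adding comparison criteria.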
4 Conclusion In this research, we applied Kano's model to constructing UI guidelines for the Physical User Interaction (PUI) of mobile phones. PUI can be seen as the physical side of usability and a broader view of usability; moreover, PUI seems to be the most influential aspect of users' satisfaction. Usability issues of PUI should be taken into account from the concept phase of a product, considering the characteristics of the issues: as the development process goes on, the possible region of physical design change shrinks rapidly. A guideline is a useful tool for an organization to systematically improve and monitor the usability of a product. Classification of usability issues considering their potential effect is the starting point of this research. We then applied Kano's model of customer satisfaction to prioritize the issues while reflecting users' values. This priority is the basis of strategic improvement. This research categorizes the UI guidelines into two groups. One is the design standards that must be satisfied. The other is the comparison criteria for holding a dominant position compared to competitive products. From this, the UI design guidelines may be applied not only to evaluating product quality, but also to providing a direction for value improvement. Furthermore, the use of the UI design guidelines may expand from quality control to user satisfaction. The benefits of applying the Kano model to the design process are summarized as follows: 1) The characteristics and criteria of a product that affect user satisfaction can be revealed. In addition, potential elements that users usually do not describe explicitly can be understood. 2) It helps to find out the effect of the designer's intentions on users' satisfaction. 3) The categorization of characteristics may be used as criteria for decision making.
Especially when limits on development resources exist, the proposed categorization may be used as a criterion for deciding which characteristics to focus on. 4) It is easy to apply with large numbers of users, while most common methods for collecting users' needs, such as the focus group interview, are only applicable with small numbers of users.
References 1. Berger, C., Blauth, R., Borger, D., Bolster, C., Burchill, G., DuMouchel, W., Pouliot, F., Richer, R., Rubinoff, A., Shen, D., Timko, M., Walden, D.: Kano's methods for understanding customer-defined quality. The Center for Quality Management Journal 2(4) (1993) 2. Bevan, N.: Quality in use: meeting user needs for quality. Journal of Systems and Software 49(1), 89–96 (1999) 3. ChiWon, S., JeongYun, H., SangHyun, P.: Evaluating elements of the physical user experience (usability) of mobile devices. In: Proceedings of HCI 2006, Korea (2006) 4. ChiWon, S., JeongYun, H., SangHyun, P.: Classifying emotional elements of mobile devices to evaluate physical interface usability. In: Proceedings of the Korean Society for Emotion and Sensibility 2006, Korea (2006) 5. Herzberg, F.: Work and the Nature of Man (1974) 6. JeongYun, H., SangHyun, P., ChiWon, S.: A Study of Improving Product Usability Based on the Classification of Usability Problems Considering Users' Satisfaction 7. Jokela, T.: When Good Things Happen to Bad Products: Where Are the Benefits of Usability in the Consumer Appliance Market? Interactions, pp. 29–35 (2004)
8. Lofgren, M., Witell, L.: Kano's Theory of Attractive Quality and Packaging. Quality Management Journal 12(3), 7–20 (2005) 9. Kano, N., Seraku, N., Takahashi, F., Tsuji, S.: Attractive quality and must-be quality. The Journal of the Japanese Society for Quality Control 14(2), 39–48 (1984) 10. Matzler, K., Hinterhuber, H.H.: How to make product development projects more successful by integrating Kano's model of customer satisfaction into quality function deployment. Technovation 18(1), 25–38 (1998) 11. Zhang, P., von Dran, G.M.: Satisfiers and Dissatisfiers: A Two-Factor Model for Website Design and Evaluation. Journal of the American Society for Information Science 51(14), 1253–1268 (2000)
Appendix: Part of the PUI Guideline with Kano Classification A part of the PUI guideline, consisting of the valid items selected by the effective coefficient, is provided. The classification obtained by applying the Kano model, together with the users' satisfaction and dissatisfaction coefficients, is also included. Table 1. Classification example of the PUI guideline
The Practices of Usability Analysis to Wireless Facility Controller for Conference Room Ding Hau Huang, You Zhao Liang, and Wen Ko Chiou Chang Gung University, 259 Wen-Hwa 1st Road, Kwei-Shan, Tao-Yuan 333, Taiwan, R.O.C.
[email protected] Abstract. Increasingly, there are more and more advantageous technical facilities and automated systems visible in business conference rooms. One of the most advantageous media from the central system to users is the wireless facility controller, which is expected to bring individuals more convenience and efficiency by assisting them in controlling many kinds of media. This paper discusses 'usability analysis' with a 'scenario-based' approach to 'user-oriented' design concepts early in the product design process, through a practical case study concerning the controller. This study suggests a practical approach to scenario and usability analysis through a simple, structured framework. The framework is outlined by three major components: the design strategy from analyzing competitors' products; a scenario-based approach consisting of user, product, applications, and field of use as context variables; and usability analysis with product interaction and user observations of existing problems. Keywords: Wireless facility controller, User-oriented design, Usability, Interaction, Innovation.
1 Introduction Increasingly, there are more and more advantageous technical facilities and automated systems visible in business conference rooms, such as lighting control systems, presentation tools, window treatments, and multimedia systems. These systems are multi-functional and very complicated; therefore, an integrated central control system has been developed to organize them. One of the most advantageous media from the central system to users is the wireless facility controller. The interactive wireless controls are expected to bring individuals more convenience and efficiency by assisting them in the control of many kinds of media. Generally, we are accustomed to controlling different kinds of equipment with their individual handheld infrared remote controllers. For example, setting up a home theatre system for movies may involve the use of several different remote controllers. Each one has its own unique user interface, so users have to remember how to use all of them. Today, users can integrate controllable appliances using wireless communication technologies. In most cases, all the appliances can be controlled using a centralized controlling device. Generally, the user needs some manner of 'user J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 490–498, 2007. © Springer-Verlag Berlin Heidelberg 2007
interface' to control these controllable appliances, and when there are many different kinds of appliances to control, the user interface of the controlling device might be very challenging to use. Changes in the infrastructure may require software or even hardware changes to the user interface as well; therefore, controlling them all in an efficient way requires much thought on user interface and usability issues [4]. Moreover, an important function of a smart business conference room is to help document meetings: i.e., to capture and index the various activities that occur during meetings, presentations, and teleconferences. Other functions include controlling room equipment and ambience, managing media streams, and providing networked electronic whiteboards and note-taking devices [1]. All of the above are highly interactive and user-oriented activities; if there were a universal wireless facility controller that could integrate all of these functions, then how to 'understand' the context of people interacting with them would be a very important issue. A 'scenario'-based approach is useful for stimulating user-centered ideation, illustrating issues, evaluating design ideas from a user's point of view, and showing the role of a product in a larger context of use. All of these are important activities in establishing a common user-centered focus, particularly in the earliest phases of product design. Scenario building provides the human factors professional an effective additional method for exploring, prototyping, and communicating human factors issues within a product design context [5]. Lim and Sato [3] created a method for generating scenarios through the use of aspect models based on a design information framework (DIF) structure, and noted that, with this scenario generation technique, designers can effectively analyze complex use situations through multiple aspects and identify problems and requirements that lead to further design problem solving.
Scenarios that clearly embed the rationales for solutions become valuable reference sources of use-context information throughout the design process. The DIF itself was developed to enable designers to organize and manipulate information throughout a design process. In a design process, templates for archiving information into a DIF-structured database can be generated, and all types of design information, such as user study data, design concepts, models, scenarios, and prototype models, are then structured by those templates (shown in Figure 1) [2].
Fig. 1. Design information framework [2]
During this research, we collaborated with ADVANTECH Co. Ltd., Taiwan, which wished to develop the wireless facility controller and which is a leader in the industrial computing and automation market with more than 20 years of experience [6]. ADVANTECH covers the complete market of integrated solutions, from industrial automation to medical computing to home automation. It is the engineering and marketing departments that are concerned with user-oriented design (UOD) approaches to fit the new interactive generation. Thus, this paper discusses usability analysis with a scenario-based approach to UOD concepts early in the product design process through a practical case study. This paper describes some advantages and potential pitfalls in using scenario and usability analysis, and provides examples of how innovative concepts are developed and can be applied usefully.
2 Methods This research took the form of a case study of a new interactive product concept development process for a wireless facility controller for conference rooms. The wireless facility controller is a kind of interactive device with a wireless touch panel and fast operation keys that can control other automated devices. We applied usability analysis and a scenario-based approach to UOD concepts to create an innovative design concept, and synthesized the processes into an interactive product development framework. The detailed process is as follows. First, an analysis of competitors' products was carried out to elicit all the raw information, such as product functions, applications, user groups, and fields of use. Then a deeper, more involved analysis was conducted using the scenario-based approach, which included product functions to applications, fields of use to scenarios, and scenario synthesis. The third stage was product usability analysis to define the product positions, and the fourth was user observation. After appraising all the information, we discussed all the processes to establish an interactive product development framework. 2.1 Monitoring Competitor Offerings Competitors' product analysis: the two most famous companies, worldwide leaders in advanced control and automation systems, were chosen to elicit all the raw information. The first, CRESTRON, is the world's leading manufacturer of advanced control and automation systems, innovating technology and reinventing the way people live and work. Offering integrated solutions to control audio, video, computer, IP, and environmental systems, CRESTRON streamlines technology, improving the quality of life for people in corporate boardrooms, conference rooms, classrooms, auditoriums, and in their homes [8]. The second is AMX, a worldwide leader in advanced control and automation technology for commercial and residential markets.
The company's hardware and software products simplify the way people interact with technology. This includes making it easier for system integrators to sell, program, and install AMX products, ranging from touch panels, keypads, and handheld remotes to customizable resource management tools, as well as making the overall AMX end-user experience intuitive and simple [7].
During this stage, a product 'tree map' was initially used to identify all the related devices, to facilitate an overview of the whole product line and to determine which product types we should focus on. All the raw product information, such as company, product type, prices, specifications, fields of use, characteristics, product semantics, and related accessories, was then recorded on product analysis cards, from which a product information database was built. This database can serve as a strong information bank. 2.2 Scenario Analysis and Product Classification This stage was separated into three parts. The first was 'product functions to applications', to find which functions were supplied in the main market. For example, after analyzing competitors' products, we could elicit all the applications, including lighting automation, AV control, light control, presentation tools, window treatments, voting systems, drapes, shades & screens, climate control, internet access, remote manager, network campuses, media manager, security & intercoms, central control, security systems, automated bell towers, record manager, and electronic menus. The second was 'fields of use to scenarios', to understand which applications fit each individual field of use, such as business, whole home, home theater, government, education, house of worship, MDU, private transportation, entertainment, healthcare, broadcasting, network operation, and retail hotels, as well as to choose the main target among these fields of use. For this research, we used the business conference room as an example. The third part was 'scenario synthesis', to factor in ADVANTECH's strategy and to identify the main scenarios, fields of use, and their applications. 2.3 Usability Analysis and Product Positioning In this stage, seven related products from seven different companies were subjected to usability analysis to understand their basic specifications, hardware interfaces, and accessories.
Finally, a radar illustration was used to synthesize and compare the product characteristics. From this, we could define the product positions. 2.4 User Observation of Existing Problems During this stage, researchers went to observe the 'field situation' at an assembly hall and a business conference room where a central control system and wireless facility controller were installed. We conducted direct 'user observation' to discover how users interacted with the system, as well as asking questions focused on existing problems. 2.5 Constructive Demo Design After each stage, 'rough' design issues were synthesized, facilitating several 'demo' designs. However, only after completing all the processes were the key design issues gleaned; accordingly, a 'constructive demo' design could then be developed.
3 Results After completing all the processes, the main findings are discussed as follows: first, the findings gleaned from the 'scenario analysis'; second, the findings from the 'usability analysis'; and third, the findings gleaned from 'user observation'. Synthesizing all the results, the new design concept was accepted by the development team members, and the innovative results were better than those of the traditional method. 3.1 The Results of Competitors' Product and Scenario Analysis First, the findings gleaned from scenario analysis were modularization, standardization, and diversification (shown in Figure 2).
Fig. 2. The applications of different fields of use
By analyzing 'product functions to applications', 'fields of use to scenarios', and 'scenarios to interaction', we found that many applications shared the same key functions, so we needed to design a main type that has the same key functions but can still add functions to fit other fields of use. The main type must have one base cover, distinct from the front cover, with different fast keys, styles, and textures. According to these results, the 'demo design' shown in Figure 3 was refined from ADVANTECH's UbiQ350 (a kind of wireless facility controller). In this stage, we also applied the 'scenario approach' to simulate the main user targets and the different styles of conference room (shown in Figure 4). All the users' characteristics and space 'styles' can serve as references early in the design process.
Fig. 3. The demo concept from competitor’s product and scenario analysis
Fig. 4. Scenario analysis of main targets and styles
3.2 The Results of Usability Analysis By analyzing the product hardware interaction interfaces shown in Figure 5, the product illustration radar has eight dimensions: product functionality, number of hotkeys, ease of handling, shape, price, battery life, whole product size, and LCD capability. From the main findings of the radar-based usability analysis, we discovered that the controller should retain its multi-function aspect while remaining convenient, so we separated the touch panel and the fast keys into different parts. The independent fast-key part can be used conveniently, while advanced settings are supplied through the touch panel. 3.3 The Results of User Observation User observation revealed an existing problem: the handling should be improved. This should include advanced settings and accident prevention (shown in Figure 6).
496
D.H. Huang, Y.Z. Liang, and W.K. Chiou
Fig. 5. The product position analysis
Fig. 6. User observation situation
3.4 The Results of Final Design
Synthesizing all the results, the new design concept was accepted by the development team members, and the innovative results were better than those of the traditional method. Seven design issues were synthesized: (1) modularizing the hardware and software to easily fit different fields of use; (2) making the number and function of fast keys changeable for different applications; (3) adding more functions while retaining the same convenience; (4) personal and safety settings; (5) ease of handling; (6) preventing unexpected start-up; (7) one-touch scenario pre-setting.
According to all of the above, the final design is shown in figure 7.
Fig. 7. Final design
The touch panel and fast keys form separate parts. The fast-key part can be used conveniently on its own, while advanced settings remain available through the touch panel.
3.5 Design Framework
The user-oriented innovation design framework (UOIDF) was developed by means of a practical design case. The UOIDF enables designers to organize and manipulate user data and user-oriented information throughout a design process. In a design process, templates for archiving information into a UOIDF-structured database can be generated, and all types of design information, such as product functions, applications, field of use, observations and interactions, are then structured by those templates.

Fig. 8. User-oriented innovation design framework (components: product strategy; product positioning; competitor offering; usability analysis; scenario-based approach — user, product, applications, field of use; user interaction; product improvement — software interface, hardware interface, accessories; user observation — field survey, participant observation)
4 Conclusion
This study suggests a practical approach to scenario and usability analysis through a simple, structured framework. The framework comprises three major components: a design strategy derived from analyzing competitors’ products with a scenario-based approach (with user, product, applications, and field of use as context variables); usability analysis of product interaction; and user observation of existing problems. Based on this framework, this study established methods to specify interactive product features, to define the development context, and to measure usability. The effectiveness of the framework was demonstrated through case studies in which the usability of interactive products was developed using the UOD concepts in this study.
5 Implications
This study is expected to help product design practitioners in the consumer electronics industry in various ways. Most directly, it supports product development teams in planning and conducting new concept development in a systematic and structured manner. In addition, it can be applied to other categories of consumer interactive products (such as appliances, automobiles, and communication devices) with minor modifications as necessary.
References
1. Chui, P., Wilcox, L.: Kumo Interactive: A Smart Conference Room. In: DARPA/NIST/NSF Workshop on Research Issues in Smart Computing Environments
2. Lim, Y., Sato, K.: Development of design information framework for interactive systems design. In: Proceedings of the 5th Asian International Symposium on Design Research, Seoul, Korea (2001)
3. Lim, Y., Sato, K.: Describing multiple aspects of use situation: applications of Design Information Framework (DIF) to scenario development. Design Studies 27(1) (2006)
4. Ritala, M., Tieranta, T., Vanhala, J.: Context Aware User Interface System for Smart Home Control. In: HOIT 2003 Conference, Irvine, California (2003)
5. Suri, J.F., Marsh, M.: Scenario building as an ergonomics method in consumer product design. Applied Ergonomics 31, 151–157 (2000)
6. ADVANTECH: http://www.advantech.com/
7. AMX: http://www.amx.com
8. CRESTRON: http://www.crestron.com/
What Makes Evaluators to Find More Usability Problems?: A Meta-analysis for Individual Detection Rates
Wonil Hwang¹ and Gavriel Salvendy²
¹ School of Industrial & Information System Engineering, Soongsil University, 511 Sangdo-Dong, Dongjak-Gu, Seoul 156-743, South Korea
² School of Industrial Engineering, Purdue University, 315 N. Grant St., West Lafayette, IN 47906, USA, and Department of Industrial Engineering, Tsinghua University, Beijing 100084, P.R. China
[email protected],
[email protected] Abstract. Since many empirical results have been accumulated in usability evaluation research, it would be very useful to provide usability practitioners with generalized guidelines by analyzing the combined results. This study aims at estimating individual detection rate for user-based testing and heuristic evaluation through meta-analysis, and finding significant factors, which affect individual detection rates. Based on the results of 18 user-based testing and heuristic evaluation experiments, individual detection rates in user-based testing and heuristic evaluation were estimated as 0.36 and 0.14, respectively. Expertise and task type were found as significant factors to improve individual detection rate in heuristic evaluation.
Keywords: Usability evaluation, user-based testing, heuristic evaluation, meta-analysis, evaluator’s expertise, and task type.
1 Introduction Usability evaluation, a growing research area in Human-Computer Interaction (HCI), seeks the best way to improve the quality of systems or products in terms of ease of use. Researchers have developed various methods, such as user-based testing and heuristic evaluation, for usability evaluation and have drawn significant conclusions based on these methods. Even though usability evaluation research has accumulated significant quantities of results since the 1990s, it provides only context-based conclusions and often has left the generalization issues to future study. More generalized conclusions are needed to solve the research issues, such as estimates of individual detection rates and hidden factors to affect individual detection rates, and to provide more meaningful knowledge (or guidelines) to usability practitioners. There have been few efforts to synthesize the empirical results of usability evaluation research and to develop new useful implications for usability practitioners from the body of studies. J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 499–507, 2007. © Springer-Verlag Berlin Heidelberg 2007
In this situation, that is, a situation in which many empirical results have been accumulated from previous usability evaluation research, providing usability practitioners with generalized guidelines by analyzing the combined results of this body of work will be more useful than conducting new research that involves all the factors. The objectives of this study are (a) to estimate the individual detection rate for user-based testing and heuristic evaluation through meta-analysis, and (b) to find hidden factors that affect individual detection rates.
2 Related Literature
The individual detection rate, the ratio of the number of usability problems found by an individual evaluator or test user to the number of real usability problems that exist, is an important measure in usability evaluation research, because it reflects the individual ability to detect usability problems in a given situation. If the individual detection rate can be estimated more reliably, the optimal sample size issue, one of the most disputed issues in usability evaluation research, can be resolved. When Nielsen [14] suggested the so-called ‘4±1’ or ‘magic number five’ rule, which holds that only 3–5 evaluators are needed to detect 80% of usability problems with the heuristic evaluation method, the underlying assumption was that the mean probability of an evaluator detecting a problem (i.e., the mean individual detection rate) lay between 0.32 and 0.42. However, much empirical research has reported that the mean individual detection rate does not lie in this range, but in a much lower one. For example, Law and Hvannberg [10] reported that the mean of individual detection rates was 0.14 when user-based testing with the think-aloud method was employed, and Andre, Hartson and Williges [1] reported a mean of 0.179 when heuristic evaluation was used. Since the results of usability evaluation experiments do not support the assumption underlying Nielsen’s conclusion, there have been many arguments about the optimal sample size and the conditions under which Nielsen’s conclusion holds. Thus, in order to reach a more generalized conclusion about the optimal sample size issue, we need a valid and reliable estimate of the individual detection rate from the accumulated results of usability evaluation experiments.
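The link between the individual detection rate and sample size rests on the standard cumulative-detection model, under which n independent evaluators are expected to find a proportion 1 − (1 − p)^n of all problems, where p is the individual detection rate. A small sketch (the helper name is ours, not from the paper) shows how sharply the required number of evaluators grows as p drops:

```python
import math

def evaluators_needed(p, target=0.80):
    """Smallest n with 1 - (1 - p)**n >= target, i.e. the number of
    independent evaluators expected to uncover `target` of all problems."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))

# With Nielsen's assumed rate of about 0.36, 3-5 evaluators suffice:
print(evaluators_needed(0.36))  # -> 4
# At the much lower rates reported empirically, far more are needed:
print(evaluators_needed(0.14))  # -> 11
```

With p between 0.32 and 0.42 the rule's 3–5 evaluators follow directly; at a rate of 0.14, roughly three times as many evaluators are needed to reach the same 80% coverage.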
In addition, when the means of individual detection rates are significantly heterogeneous, we need to find hidden factors that affect the variability of individual detection rates, in order to suggest usability evaluation conditions under which evaluators’ ability to detect usability problems is improved. There are several scientific ways to summarize, integrate, and interpret independent studies. One of them is meta-analysis, the statistical methodology for combining findings from a selected set of studies. Lipsey and Wilson [11] indicated three conditions under which meta-analysis can be applied. First, meta-analysis applies only to empirical studies that produce quantitative results. Second, meta-analysis is conducted on statistics summarizing the research results rather than on the original data sets. Third, meta-analysis aggregates and compares the results of independent studies that deal with the same constructs and relationships and report their results in similar statistical forms. Due to these limited applications, meta-analysis has historically been applied to combine the results from studies that have been
repeated, such as astronomical and physical experiments, and social science research [4, 6]. During the 1990s, researchers tried to combine results from software engineering studies using meta-analysis methods [3, 12]. When statistical information, including means and standard deviations, for estimating effect sizes can be obtained, parametric estimation models, such as the fixed effects model, are utilized [9]. The underlying idea of these models is to combine the estimates of effect sizes computed from each study, weighting each study by the inverse of its variance. When a statistical test for the homogeneity of effect sizes is non-significant, the estimates can be combined without further modeling. Otherwise, researchers need parametric models that explain the variance of effect sizes (i.e., the between-study variability). In general, the fixed effects model is employed when a researcher believes that there are systematic sources, as potential moderators, which can explain the between-study variability. Hedges [8] described the fixed effects model as the ANOVA analog model, used when the study characteristic variables (independent variables) are categorical.
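The inverse-variance weighting at the heart of these models can be sketched in a few lines of Python (illustrative function and variable names; this is a minimal sketch, not code from any of the cited studies):

```python
def fixed_effects_pool(means, variances):
    """Combine per-study effect sizes by weighting each study with the
    inverse of its variance; returns the pooled estimate and its variance."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    return pooled, 1.0 / sum(weights)

# A precise study (small variance) dominates the pooled estimate:
pooled, pooled_var = fixed_effects_pool([0.36, 0.14], [0.0001, 0.01])
print(round(pooled, 3))  # close to 0.36, the low-variance study's mean
```

This is why, in the results below, studies that report very small standard deviations dominate the combined estimates.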
3 Methods
3.1 Data Collection
We collected 103 usability evaluation experiments from HCI-related journals, the proceedings of HCI-related conferences, technical reports and books published since 1990. Online academic databases and offline sources were used to search for relevant studies. Most of the references of the papers identified as relevant were also checked to make sure that no relevant studies were missed. As a result of these extended search efforts, without pre-selected sources, most major HCI-related journals, such as International Journal of Human–Computer Interaction, Behaviour & Information Technology, International Journal of Human–Computer Studies, and Human Factors, and the proceedings of major HCI-related conferences, such as the CHI Conference on Human Factors in Computing Systems and the Human Factors Society Annual Meeting, were included as sources of relevant studies. However, only 18 experimental results were used for the meta-analysis, because we selected usability evaluation experiments under two criteria: (a) user-based testing or heuristic evaluation was employed in the experiment, and (b) the experiment reported the mean and standard deviation of individual detection rates. Because a single paper may report multiple independent experiments, the 18 experimental results used for meta-analysis in this study come from 10 usability evaluation papers published between 1990 and 2004. Although some experiments were conducted under the same experimental conditions and reported in the same publications, each is considered an independent experiment that shares some conditions, such as evaluated systems, task type, and report type, because each was administered independently.
All 18 experiments reported mean and standard deviation values of individual detection rates, obtained after the interfaces of software products or information systems were evaluated for usability problems using user-based testing (9 experiments) or heuristic evaluation (9 experiments) (see Table 1).
Table 1. Data collected for meta-analysis

Usability evaluation method   N    Mean     SD       Reference
User-based testing            20   0.36     0.0006   [17]
                              12   0.32     0.0014   [18]
                              20   0.42     0.0015   [18]
                              17   0.14     0.07     [10]
                              36   0.4625   0.2032   [13]
                              18   0.0799   0.0269   [16]
                              18   0.0926   0.0595   [16]
                               7   0.1984   0.0707   [16]
                               6   0.2432   0.1172   [16]
Heuristic evaluation          12   0.084    0.038    [19]
                              12   0.094    0.064    [19]
                              14   0.19     0.0199   [5]
                              10   0.179    0.032    [1]
                              16   0.203    0.075    [2]
                               9   0.14     0.123    [2]
                              16   0.279    0.11     [2]
                              11   0.222    0.095    [2]
                              18   0.046    0.025    [10]

(N = number of test users or evaluators; Mean and SD = mean and standard deviation of individual detection rates.)
3.2 Methods of Analysis
First, bubble charts were used to see the overall shape of the data before the meta-analysis was conducted. In the bubble charts, each bubble represents an experiment, the center of the bubble indicates the mean of individual detection rates plotted against the number of test users or evaluators, and the radius of the bubble represents the standard deviation of individual detection rates. In line with the philosophy of meta-analysis, relatively small bubbles, which indicate smaller standard deviations, are given more weight than big bubbles, which indicate larger standard deviations, when the means of individual detection rates are estimated. Second, the means of individual detection rates from the individual usability evaluation experiments were combined using the inverse of the variance as a weight, in order to estimate the individual detection rate for user-based testing and heuristic evaluation. The Q statistic [15], which is known to follow a Chi-square distribution, was calculated to test the homogeneity of the effect sizes (i.e., the means of individual detection rates) used to estimate the parameter (i.e., the individual detection rate). Third, the fixed effects model was applied to find hidden factors that explain the variability of effect sizes, in case the means of individual detection rates were significantly heterogeneous. In
practice, candidates for hidden factors were selected, and homogeneity tests were then conducted repeatedly to check whether each candidate contributed to forming homogeneous subgroups, until hidden factors were identified as moderators in the fixed effects model.
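The homogeneity test just described computes Q = Σ wᵢ(mᵢ − m̄)², which under the null hypothesis of homogeneous effect sizes follows a Chi-square distribution with k − 1 degrees of freedom [15]. A minimal sketch, with hypothetical names (the paper itself gives no code):

```python
def homogeneity_q(means, variances):
    """Q statistic for homogeneity of effect sizes: values large relative
    to the chi-square(k - 1) critical value indicate heterogeneity."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    return sum(w * (m - pooled) ** 2 for w, m in zip(weights, means))

# Identical study means give Q = 0; spread-out means inflate Q:
print(homogeneity_q([0.2, 0.2, 0.2], [0.01, 0.01, 0.01]))  # -> 0.0
```

Comparing the resulting Q against the chi-square critical value (15.507 for d.f. = 8 at α = 0.05, as used below) decides whether the effect sizes may be pooled directly.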
4 Results
4.1 Bubble Charts Analysis
In order to see the overall shape of the data used for the meta-analysis, two bubble charts were drawn, one for user-based testing and one for heuristic evaluation. In the bubble charts, the x-axis represents the number of test users or evaluators, and the y-axis represents the mean of individual detection rates. As shown in Figures 1 and 2, the bubbles of user-based testing were more scattered along both axes than those of heuristic evaluation, and the bubble sizes for user-based testing were also more varied. This means that the collected user-based testing data were spread over a wider range of means, standard deviations, and numbers of test users than the data collected from heuristic evaluation experiments. Thus, we can conclude from the bubble chart analysis that the means of individual detection rates reported from user-based testing experiments are fairly heterogeneous, whereas those from heuristic evaluation experiments are somewhat heterogeneous but show the possibility of being sub-grouped.
Fig. 1. Bubble chart of user-based testing data (x-axis: number of test users; y-axis: mean of individual detection rate)
4.2 Estimation of Individual Detection Rate
Meta-analyses were conducted to combine the results from the 9 user-based testing experiments and the 9 heuristic evaluation experiments, respectively. As shown in Table 2, when user-based testing is used, the estimated individual detection rate is
Fig. 2. Bubble chart of heuristic evaluation data (x-axis: number of evaluators; y-axis: mean of individual detection rate)
0.36 and its 95% confidence interval is 0.361 ~ 0.363. When heuristic evaluation is used, the estimated individual detection rate is 0.14 and its 95% confidence interval is 0.115 ~ 0.164, which is not consistent with the assumption underlying Nielsen [14]’s conclusion. However, the results of the homogeneity tests show that the means of individual detection rates from both user-based testing and heuristic evaluation are significantly heterogeneous. Thus, we need to employ the fixed effects model to find hidden factors that contribute to this heterogeneity.

Table 2. Estimated individual detection rates and homogeneity tests

Usability evaluation method   Estimated rate   95% CI lower   95% CI upper   Q statistic   Chi-square (d.f. = 8, α = 0.05)
User-based testing            0.3615           0.3605         0.3625         2552.558      15.507
Heuristic evaluation          0.1393           0.1150         0.1637         27.669        15.507
4.3 Hidden Factors for Individual Detection Rate in Heuristic Evaluation
We considered experimental conditions, such as the expertise of test users or evaluators, task type, type of evaluated system, experimental duration, and report type, as candidate hidden factors that might explain the heterogeneity of the means of individual detection rates. For user-based testing, the means of individual detection rates were too heterogeneous to form homogeneous sub-groups based on the above candidates. Thus, in this study we could not find hidden factors that affect the individual detection rate in user-based testing.
For heuristic evaluation, the evaluator’s expertise (expert vs. novice) and task type (scenario-based task vs. free exploration) were found to be significant factors that explain the variability of individual detection rates. Using evaluator’s expertise and task type, the heterogeneous data could be divided into three homogeneous sub-groups: heuristic evaluation done by experts, by novices with a free exploration task, and by novices with a scenario-based task (see Table 3). Because only one data point resulted from evaluation done by novices with a scenario-based task, no homogeneity test was conducted for that case. When novice evaluators conduct heuristic evaluation with a free exploration task, the estimated individual detection rate is the highest (0.19). This implies that, in order to improve evaluators’ problem detection ability, the conditions of heuristic evaluation should be set up similarly to those of user-based testing (novice + free exploration).

Table 3. Hidden factors for individual detection rate in heuristic evaluation

Expertise   Task type          Estimated rate   95% CI lower   95% CI upper   Q statistic   Chi-square (α = 0.05)
Expert      Mixed              0.1386           0.0958         0.1815         4.915         9.49 (d.f. = 4)
Novice      Free exploration   0.1935           0.1563         0.2306         0.651         5.99 (d.f. = 2)
Novice      Scenario-based     0.0460           -              -              -             -
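The moderator search described above amounts to partitioning the studies by a candidate factor and re-running the pooling and homogeneity computation within each subgroup, accepting a split when each within-group Q falls below its chi-square critical value. A sketch of that loop, with made-up study values for illustration (not the data from Table 3):

```python
def within_group_q(groups):
    """For each moderator level, pool the studies (inverse-variance
    weights) and compute the within-group Q statistic; homogeneous
    subgroups have small Q. `groups` maps level -> [(mean, variance), ...]."""
    results = {}
    for level, studies in groups.items():
        weights = [1.0 / v for _m, v in studies]
        pooled = sum(m / v for m, v in studies) / sum(weights)
        q = sum((m - pooled) ** 2 / v for m, v in studies)
        results[level] = (pooled, q)
    return results

# Hypothetical split by task type: each level becomes nearly homogeneous.
groups = {
    "free exploration": [(0.20, 0.004), (0.19, 0.004)],
    "scenario-based": [(0.05, 0.001)],
}
for level, (pooled, q) in within_group_q(groups).items():
    print(level, pooled, q)
```

A candidate factor is kept as a moderator only when every resulting subgroup passes this within-group homogeneity check, which is the procedure behind Table 3.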
5 Conclusion and Discussion
We conducted meta-analyses to combine results from user-based testing and heuristic evaluation experiments and thus to estimate the individual detection rates for the two methods. The estimated individual detection rates in user-based testing and heuristic evaluation were 0.36 and 0.14, respectively, but they need to be interpreted carefully because they were derived from heterogeneous data. For heuristic evaluation, however, based on the fixed effects model with expertise and task type as moderators, we estimated two individual detection rates from homogeneous sub-groups: 0.14 (when experts conduct heuristic evaluation) and 0.19 (when novice evaluators conduct heuristic evaluation with a free exploration task). These individual detection rates are not consistent with the assumption underlying Nielsen [14]’s conclusion. This study makes two contributions to usability evaluation research. First, it combined results from user-based testing and heuristic evaluation experiments and estimated the individual detection rates for these methods. Usability practitioners can use this generalized conclusion when deciding optimal sample sizes for usability evaluation. Second, it identified significant factors, namely expertise and task type, that improve the individual detection rate
in heuristic evaluation. Usability practitioners can consider these factors to improve the performance of usability evaluation. However, this study has a limitation in that the number of collected data points is small: there are not enough experiments that report the statistical information needed for meta-analysis. This is one of the reasons why meta-analysis has rarely been applied in usability evaluation research [7]. In addition, the collected data from user-based testing were significantly heterogeneous, but we could not find significant factors that explain the variability of the means of individual detection rates. This issue is left to future study.
References
1. Andre, T.S., Hartson, H.R., Williges, R.C.: Determining the effectiveness of the usability problem inspector: a theory-based model and tool for finding usability problems. Human Factors 45, 455–482 (2003)
2. Baker, K., Greenberg, S., Gutwin, C.: Empirical development of a heuristic evaluation methodology for shared workspace groupware. In: Proceedings of the 2002 ACM Conference on Computer Supported Cooperative Work, pp. 96–105. ACM, New York (2002)
3. Chen, C., Rada, R.: Interacting with hypertext: a meta-analysis of experimental studies. Human-Computer Interaction 11, 125–156 (1996)
4. Cook, T.D., Leviton, L.C.: Reviewing the literature: a comparison of traditional methods with meta-analysis. Journal of Personality 48, 449–472 (1980)
5. De Angeli, A., Matera, M., Costabile, M.F., Garzotto, F., Paolini, P.: Validating the SUE inspection technique. In: Di Gesù, V., Levialdi, S., Tarantino, L. (eds.) Proceedings of Advanced Visual Interfaces (AVI 2000), pp. 143–150. ACM, New York (2000)
6. Glass, G.V., McGaw, B., Smith, M.L.: Meta-Analysis in Social Research. Sage Publications, Beverly Hills, CA (1981)
7. Hartson, H.R., Andre, T.S., Williges, R.C.: Criteria for evaluating usability evaluation methods. International Journal of Human-Computer Interaction 15, 145–181 (2003)
8. Hedges, L.V.: Fixed effects models. In: Cooper, H., Hedges, L.V. (eds.) The Handbook of Research Synthesis, pp. 285–299. Russell Sage Foundation, New York (1994)
9. Hedges, L.V., Olkin, I.: Statistical Methods for Meta-Analysis. Academic Press, Orlando, FL (1985)
10. Law, L.-C., Hvannberg, E.T.: Analysis of combinatorial user effect in international usability tests. In: CHI Conference on Human Factors in Computing Systems, pp. 9–16. ACM, New York (2004)
11. Lipsey, M.W., Wilson, D.B.: Practical Meta-Analysis. Sage Publications, Thousand Oaks, CA (2001)
12. McLeod, P.L.: An assessment of the experimental literature on electronic support of group work: results of a meta-analysis. Human-Computer Interaction 7, 257–280 (1992)
13. Nielsen, J.: Finding usability problems through heuristic evaluation. In: CHI Conference on Human Factors in Computing Systems, pp. 373–380. ACM, New York (1992)
14. Nielsen, J.: Estimating the number of subjects needed for a thinking aloud test. International Journal of Human–Computer Studies 41, 385–397 (1994)
15. Shadish, W.R., Haddock, C.K.: Combining estimates of effect size. In: Cooper, H., Hedges, L.V. (eds.) The Handbook of Research Synthesis, pp. 261–281. Russell Sage Foundation, New York (1994)
16. Spool, J., Schroeder, W.: Testing web sites: five users is nowhere near enough. In: CHI ’01 Extended Abstracts on Human Factors in Computing Systems, pp. 285–286. ACM, New York (2001)
17. Virzi, R.A.: Streamlining the design process: running fewer subjects. In: Proceedings of the Human Factors Society 34th Annual Meeting, pp. 291–294. Human Factors and Ergonomics Society, Santa Monica, CA (1990)
18. Virzi, R.A.: Refining the test phase of usability evaluation: how many subjects is enough? Human Factors 34, 457–468 (1992)
19. Zhang, Z., Basili, V., Shneiderman, B.: Perspective-based usability inspection: an empirical validation of efficacy. Empirical Software Engineering 4, 43–69 (1999)
Evaluating in a Healthcare Setting: A Comparison Between Concurrent and Retrospective Verbalisation Janne Jul Jensen Department of Computer Science, Aalborg University Fredrik Bajers Vej 7, E2-220, DK-9210 Aalborg East, Denmark
[email protected] Abstract. The think-aloud protocol, also known as concurrent verbalisation protocol, is widely used in the field of HCI today, but as the technology and applications have evolved the protocol has had to cope with this. Therefore new variations of the protocol have seen the light of day. One example is retrospective verbalisation. To compare concurrent and retrospective verbalisation an experiment was conducted. A home healthcare application was evaluated with 15 participants using both protocols. The results of the experiment show that the two protocols have each their strengths and weaknesses, and as such are very equally good although very different.
1 Introduction
One of the most commonly used protocols in usability evaluations is think-aloud. It is also known under the name concurrent verbalisation, which is the term used in this paper. Concurrent verbalisation was originally introduced by Karl Duncker [2] and has since been incorporated into HCI. Some of the strengths of the protocol are that it is easy to learn [1, 5], it can be used by non-specialists in usability [6], and it gives a fairly good insight into the cognitive processes of the participant in the evaluation [4]. However, over the years some weaknesses have also been revealed. These include a heightened mental workload for the participant [8] and the fact that thinking aloud disturbs the participant’s interaction with the application [7]. Originally, concurrent verbalisation was used in HCI in laboratory settings, but as applications have become mobile and context aware, among other things, the protocol has been challenged to cope with these changes. As with bringing telephone conversations out into public space, using think-aloud in all settings might prove troublesome. Take, for instance, the newer branch of applications for families or friends: here we are dealing with information that can be very private to the people involved, and thus a certain amount of awkwardness can be expected if they are to verbalise it in an evaluation. If verbalisation in the classical sense of concurrent verbalisation is not always appropriate, then it is necessary to think of alternatives. Another version of verbalisation that has been used in several contexts is retrospective verbalisation. Just like concurrent verbalisation, this protocol has both strengths and weaknesses. One advantage is
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 508–516, 2007. © Springer-Verlag Berlin Heidelberg 2007
the decreased mental workload, as the participant is free to focus on the task at hand. A drawback, however, is that participants quickly forget specific details of the task-solving process and are then unable to recall them afterwards [3]. To shed some light on the pros and cons of the two protocols, an experiment was conducted as a field evaluation in the home healthcare sector. The reason for choosing this setting and type of evaluation was to make the situation as realistic as possible, in order to investigate any effects the surroundings might have with regard to sensitivity. Is it possible to observe any awkwardness in using the concurrent think-aloud protocol compared to the retrospective think-aloud protocol in a sensitive setting?
2 The Experiment
To compare concurrent and retrospective verbalisation in a healthcare setting and to test the appropriateness of each protocol, an experiment was conducted. It was set up as a field evaluation to create as realistic a setting as possible. The system chosen for evaluation was an application developed to aid home healthcare workers in their daily work. It is an electronic replacement for the existing paper-based system currently in use in many municipalities in Denmark. It supports the current work procedure and offers new functionality, such as wireless access to additional information about the elderly citizens and the progress of co-workers, information that was previously available only at the main office building.
2.1 Participants
15 participants were chosen with the help of the head of the group of home healthcare workers, with due consideration for work plans etc. All 15 were trained home healthcare workers, and their demographic data are shown in Table 1.
Table 1. The demographic data of the 15 participants in the two protocols

Protocol        Statistic   Age    Experience local   Experience total   Experience computer (1-6)
Concurrent      Average     42.0   5½                 8¼                 3
                High        54     12                 13                 6
                Low         33     2½                 3¾                 1
Retrospective   Average     42.4   7                  10.3               3.9
                High        57     18                 23                 6
                Low         31     1                  1½                 1
The table shows the age, the experience as home healthcare workers in the municipality where the experiment took place, the experience as home healthcare workers in total and the level of experience with computers on a scale from one to six where 1 is
most experienced and 6 is least experienced. For each of these variables, the high, low, and average were calculated for each protocol.
2.2 Equipment
To support the field evaluation a mobile laboratory was used. It consists of small clip-on wireless mobile cameras (see figure 1), wireless microphones and a mobile digital video recorder. Running it also requires various types of batteries and receivers for the wireless equipment. Only the camera and microphone are carried by the participant; the rest is carried by the test monitor, packed in a small bag (see figures 2 and 3).
Fig. 1. The small clip-on wireless mobile camera from the mobile laboratory
Fig. 2. The equipment in the mobile laboratory used for concurrent verbalisation
Fig. 3. The mobile laboratory packed up for use
Fig. 4. The setup for retrospective verbalisation
For retrospective verbalisation, the digital recordings from the mobile video recorder were played back to the participant, and the retrospective verbalisation was captured using a camcorder (see figure 4).
Evaluating in a Healthcare Setting
511
2.3 Procedure

To gain the necessary insight into the field of home healthcare, a small ethnographic field study was conducted. Based on a thorough examination of the system and the insight gained from the field study, 8 tasks covering a wide range of the commonly used functionality in the application were designed, and the experiment was then designed in detail. With the design of the experiment in place, a pilot was conducted for both protocols, and the setup was adapted according to the minor issues discovered.

15 participants were recruited from a local municipality. 14 were female and one male, which was representative of the employment situation, where women far outnumber the men. The actual experiment took six days, and all evaluations were recorded on video. The evaluations took place in six different homes of actual elderly citizens, with the citizen present during the evaluation to further heighten the realism of the experiment. 7 of the 15 participants were assigned to evaluate using retrospective verbalisation, while the remaining 8 evaluated the application using concurrent verbalisation.

Each participant was given a thorough introduction to the experiment, explaining the equipment and its function, what their contribution was, what was expected of them, what would happen etc. They were also instructed thoroughly in how to apply the protocol assigned to them. They were then given 10 minutes to freely familiarise themselves with the system before trying to solve the tasks. After the introduction, the experiment itself took place in the home of an elderly citizen, where the participants attempted to solve the tasks handed out. 8 participants solved them thinking aloud during the evaluation, whereas the other 7 had their test session played back to them on a screen afterwards and thought aloud during the replay. Upon completion of the evaluation, each participant was debriefed.
All the raw video data was analysed afterwards and a list of problems was constructed. The severity of each problem was categorised according to the definition by Rolf Molich [5]. According to this definition, a problem experienced by a participant falls into one of three categories:

• Cosmetic: The user is delayed for less than one minute, is mildly irritated, or is confronted with information which deviates to a lesser degree from the expected.
• Serious: The user is delayed for several minutes, is somewhat irritated, or is confronted with information which deviates to some degree from the expected.
• Critical: The user's attempt to solve the task comes to a halt; the user is very irritated or is confronted with information which deviates to a critical degree from the expected.

The categorisation was done by observing the video recording of each participant and then evaluating each situation according to the guidelines described above. A given problem is often not experienced as equally serious by each participant; in those cases the problem is categorised in the most severe category.
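The categorisation rule described above (when participants experience a problem with differing severity, the most severe rating wins) can be sketched as follows. The problem names and observations below are hypothetical, not data from the study.

```python
# Molich's three severity categories, ordered from least to most severe.
SEVERITY_ORDER = {"cosmetic": 0, "serious": 1, "critical": 2}

def categorise(observations):
    """observations: {problem_id: [severity rating per participant who hit it]}.
    A problem rated differently by different participants gets the most
    severe rating it received."""
    return {
        problem: max(ratings, key=lambda r: SEVERITY_ORDER[r])
        for problem, ratings in observations.items()
    }

# Hypothetical example: three participants hit the login problem with
# differing severity, two hit the error-message problem.
obs = {"login-empty-fields": ["cosmetic", "serious", "cosmetic"],
       "error-message-unclear": ["serious", "critical"]}
print(categorise(obs))
# {'login-empty-fields': 'serious', 'error-message-unclear': 'critical'}
```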
3 Results

This section sums up the observations made from the list of problems extracted from the analysis of the raw video data.

3.1 Problems Revealed

In total, 105 problems were identified through the evaluation. Interestingly, the participants using concurrent verbalisation revealed a total of 87 problems, whereas the participants using retrospective verbalisation experienced only 61 problems in total. This is quite a big difference, whose origin is not clear. One explanation could be that the participants evaluating with retrospective verbalisation had an average computer experience level almost a point better (3.0) than that of the participants using concurrent verbalisation (3.9) on a scale from 1 to 6 (see Table 2).

Table 2. Total number of problems, unique problems and the average computer skill of the participants
                              All    Concurrent       Retrospective
                                     Verbalisation    Verbalisation
Problems revealed             105    87               61
Unique problems*              44     30 (47)          14 (33)
Average computer experience   3.4    3.9              3.0
* Note that the number in parentheses refers to problems that are unique to that protocol and not necessarily unique in total.
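The two senses of "unique" used in Table 2, problems encountered by exactly one participant versus problems found under only one protocol, can be illustrated with a small set-based sketch (the participant labels and problem IDs below are hypothetical, not the study's data):

```python
from collections import Counter

# Which participant found which problems (hypothetical IDs).
finds = {
    "c1": {"P1", "P3"}, "c2": {"P2", "P3"},   # concurrent participants
    "r1": {"P3", "P4"}, "r2": {"P4", "P5"},   # retrospective participants
}

# Globally unique problems: encountered by exactly one participant overall.
counts = Counter(p for probs in finds.values() for p in probs)
globally_unique = {p for p, n in counts.items() if n == 1}

# Protocol-unique problems (the parenthesised numbers in Table 2): found
# under one protocol but not under the other, by any number of participants.
concurrent = finds["c1"] | finds["c2"]
retrospective = finds["r1"] | finds["r2"]
unique_to_concurrent = concurrent - retrospective        # here: {"P1", "P2"}
unique_to_retrospective = retrospective - concurrent     # here: {"P4", "P5"}
```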
3.2 Unique Problems

Looking at the number of unique problems, the experiment in total reveals 44 unique problems. 30 of these are revealed by the concurrent verbalisation protocol, whereas the retrospective verbalisation protocol experiences only 14 of the 44. If we instead look at problems that are unique to each of the protocols, concurrent verbalisation discovers 47 problems unique to that protocol, whereas retrospective verbalisation encounters only 33 (see Table 2). It has long been debated in the literature whether unique problems are real or "false" problems, since they have only been encountered by one participant during the evaluation, and this seems increasingly likely as the number of participants increases. If unique problems are indeed "false" problems, then this experiment could indicate that retrospective verbalisation is better at eliminating them. This could be because the protocol is of a recall nature: the participant simply recalls fewer of these "false" problems afterwards than would be verbalised in the situation, because they are not really problems after all.

3.3 "False" Problems – Do They Exist?

However, retrospective verbalisation finds only slightly more than half of the total number of problems, and the question is whether nearly half of the problems found can be
considered "false" problems. Looking at the severity, concurrent verbalisation finds more problems in all three categories. If the extra problems found by concurrent verbalisation were "false" problems, it would be fair to assume that they would appear mostly as cosmetic problems. However, it is difficult to dismiss problems categorised as critical as being false, so eliminating "false" problems can only partly explain why retrospective verbalisation finds only slightly more than half the problems. Another explanation might be that the participant forgets some of the problems in the short time between the evaluation and the retrospective verbalisation. Perhaps problems seem less frustrating when looking back than when in the middle of them; it may be easier for the participant to keep an overview when sitting outside the situation looking in.

3.4 Problems Detected by Both Protocols

There are 43 problems registered by both protocols. As an example, one problem was that the participant did not enter username and password before pressing the "login" button. In another, the participants did not understand the error message displayed to them. In a third, the participants thought "Unplanned task" adds an extra task to the visit in progress. These three problems are typical of the 43 problems the two protocols have in common, and an initial inspection does not reveal any connection between them that explains why exactly those problems were revealed by both protocols. The same is the case with the unique problems, which also do not seem to have anything in common. Examples of those are: a participant thinks TAB will move the cursor to the next text field; a participant is unsure how to end a visit in progress; a participant is unsure what data the "search" button searches in.
3.5 Few or Many – Nothing in Between

It is notable that in concurrent verbalisation the participants seem to fall into one of two groups: they experience either few or many problems, not the average in between, whereas the number of problems experienced by the participants in retrospective verbalisation is more evened out. Three of the participants using concurrent verbalisation experienced only a few problems (6-11) while the other five experienced many (21-36), but none experienced an average number in between (12-20). This could be due to difficulties in verbalising concurrently with the task solving, which has been reported as a drawback of the concurrent think-aloud protocol [7]. It can manifest itself either as very little verbalisation, due to the difficulty of doing that simultaneously with the task solving (few problems experienced), or as extra problems occurring due to lack of concentration caused by the simultaneous verbalisation (many problems experienced). In retrospective verbalisation this is much more evened out, because the mental workload is lowered by letting the participants concentrate on one thing at a time; the differing number of problems experienced might simply be caused by their varying computer skills and differing skills in recalling their thought process in detail.
3.6 The Diverse Participants

Each participant in concurrent verbalisation revealed an average of 20.8 problems, whereas each participant in retrospective verbalisation discovered an average of only 16.0 problems (see Table 3). This difference is not particularly big, though, considering the large spread in experienced problems between the participants, and this spread is probably to be expected in a group of participants as diverse as the present one. The group contained a wide variety in both job experience and computer experience, and as such it would have come as a surprise if the number of problems experienced had been similar between the participants.

Table 3. Average number of problems experienced in total and for each of the two protocols
                   Total   Concurrent       Retrospective
                           Verbalisation    Verbalisation
Average problems   18.5    20.8             16.0
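As a quick consistency check on Table 3, the overall average equals the participant-weighted mean of the two protocol averages (8 concurrent and 7 retrospective participants); a sketch:

```python
# Weighted mean of the protocol averages from Table 3: 8 concurrent
# participants at 20.8 problems each, 7 retrospective at 16.0.
total_avg = (8 * 20.8 + 7 * 16.0) / 15
print(round(total_avg, 2))  # 18.56, i.e. the reported 18.5 up to rounding
```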
4 Discussion

Many attempts have been made to determine which of the two verbalisation protocols is better, but so far the results differ between studies. Nielsen et al. [7] discover quite a few weaknesses in concurrent verbalisation and propose that Mind Tape (a version of retrospective verbalisation) is a more viable option, whereas van den Haak et al. [9] rate the two protocols as equally good although clearly different. This study indicates that concurrent verbalisation finds more problems than retrospective verbalisation, but it seems that this can be both a good and a bad thing: good, if it means that the number of "false" (unique) problems is minimised; bad, since it is not only "false" problems that go undiscovered. Concurrent verbalisation, on the other hand, seems to place a higher mental workload on the participants, causing them either to focus on the task-solving process and thus tend to forget to verbalise, or to focus on the verbalisation and thus lose concentration on the task solving. However, the reason that retrospective verbalisation finds fewer problems might be that even in the short time between the actual evaluation and the retrospective verbalisation, things have already started to fade in the memory of the participant and problems are being forgotten. Thus, the conclusion tends to lean towards that of van den Haak et al. [9]: the two protocols are equally good, but very different.

As the observant reader might have noticed, the two protocols in the experiment had an uneven number of participants: 8 participants used concurrent verbalisation, while only 7 used retrospective verbalisation.
This of course influences the results in the subsection Problems Revealed of the Results section, but even if the numbers are corrected to compensate for it (done by taking all possible combinations of 7 participants out of the 8 and then taking the average of the number of problems found by these combinations of 7 participants in concurrent verbalisation), concurrent verbalisation still reveals 81.125 problems to retrospective verbalisation's 61. This is still a notable difference and does not change the conclusions drawn. The same is the case in the subsection Unique Problems, where concurrent verbalisation
still finds 27.3 of the globally unique problems (compared to 30) and 41.1 problems unique to that protocol (compared to 47) when the numbers are corrected to compensate for the extra participant as described above. Here, too, the differences remain noteworthy after the compensation and therefore do not change any of the conclusions above. It may look a bit odd to talk about a fraction of a problem, but it simply illustrates the average number of problems that would have been experienced if we had used only 7 participants and not 8, regardless of which 7 of the 8 were chosen. With the corrected numbers, Table 2 would look as shown in Table 4.

Table 4. Table 2 as it would look with the corrected numbers for concurrent verbalisation
                              All    Concurrent       Retrospective
                                     Verbalisation    Verbalisation
Problems revealed             105    87               61
Unique problems*              44     27.3 (41.1)      14 (33)
Average computer experience   3.4    3.9              3.0
* Note that the number in parentheses refers to problems that are unique to that protocol and not necessarily unique in total.
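The leave-one-out correction described above can be sketched as follows: enumerate all 7-participant subsets of the 8 concurrent participants and average the size of the union of problems each subset finds. The per-participant problem sets below are hypothetical, not the study's data.

```python
from itertools import combinations

def corrected_count(problem_sets, k):
    """Mean number of distinct problems found over all k-subsets of the
    participants' problem sets."""
    subsets = list(combinations(problem_sets, k))
    return sum(len(set().union(*s)) for s in subsets) / len(subsets)

# Hypothetical problem sets for 8 participants.
sets8 = [{1, 2}, {2, 3}, {3, 4}, {1, 5}, {5, 6}, {2, 6}, {4, 7}, {7, 8}]
print(corrected_count(sets8, 7))  # 7.875: dropping one participant loses a
                                  # problem only if no one else found it
```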
One purpose of the experiment was to look at the suitability of the protocols for sensitive settings, in this case healthcare in a field evaluation. Surprisingly, and contrary to expectations, there was no evidence that the participants using concurrent verbalisation were influenced by the awkwardness or private nature of the information they were verbalising about. This indicates that this is not an issue that affects the test situation or the participant. It is, however, unclear whether this holds for other settings, and it would be interesting to explore whether what can be described as sensitive settings influence the suitability of verbalisation. This requires a definition of what makes a setting sensitive, such as surroundings, participants etc., and then identifying application areas where this could pose a problem.

Acknowledgements. The research behind this paper was partly financed by the Danish Research Councils (grant number 2106-04-0022, the USE project), without which it would not have been possible. I would also like to thank my supervisor for his continuously constructive comments on the paper. Finally, a thank you to the home healthcare workers of Aars kommune in Denmark, who agreed to participate in this experiment, and to the elderly citizens, who so willingly opened their homes to us.
References

1. Dix, A., Finlay, J., Abowd, G., Beale, R.: Human-Computer Interaction. Prentice Hall, Englewood Cliffs (1997)
2. Duncker, K.: On Problem-solving. In: Dashiell, J.F. (ed.) Psychological Monographs, vol. 58, pp. 1–114. The American Psychological Association, Inc. (1945)
3. Ericsson, K.A., Simon, H.A.: Protocol Analysis: Verbal Reports as Data. MIT Press, Cambridge, MA (1993)
4. Hackos, J.T., Redish, J.C.: User and Task Analysis for Interface Design. Wiley, Chichester, UK (1998)
5. Molich, R.: User Friendly Systems (in Danish). Teknisk Forlag (1994)
6. Nielsen, J.: Estimating the Number of Subjects Needed for a Thinking Aloud Test. International Journal of Human-Computer Studies 41(3), 385–397 (1994)
7. Nielsen, J., Clemmensen, T., Yssing, C.: Getting Access to What Goes on in People's Heads? – Reflections on the Think-Aloud Technique. In: Proceedings of NordiCHI, ACM Press, New York (2002)
8. Preece, J.: Human-Computer Interaction. Addison-Wesley, London, UK (1994)
9. van den Haak, M., de Jong, M.D.T., Schellens, P.J.: Retrospective vs. Concurrent Think-Aloud Protocols: Testing the Usability of an Online Library Catalogue. Behaviour and Information Technology 22, 339–351 (2003)
Development of AHP Model for Telematics Haptic Interface Evaluation Yong Gu Ji, Beom Suk Jin, Jae Seung Mun, and Sang Min Ko Yonsei University, 134 Sinchon-Dong, Seodaemun-gu, Seoul, Korea {yongguji, kbf2514jin, mjs, sangminko}@yonsei.ac.kr
Abstract. These days, the main focus in developing telematics systems is to promote safety by decreasing the workload of the driver. To achieve this goal, simplification of the interface as well as the resolution of GUI interaction problems must be worked on. For this research, objective and quantitative assessments are provided in the early steps of building the haptic interface model. The purpose of this research is to create an evaluation model that uses the Analytic Hierarchy Process (AHP) method to fulfill user requirements. This research developed an AHP evaluation model that can present recommendations, as well as the degree of importance, for haptic interface design with quantitative assessments of the prototype by finding out the absolute and relative importance for evaluation groups and factors in early design levels using AHP. Keywords: Analytic Hierarchy Process, Haptic Device, Haptic Interface, Telematics.
1 Introduction

"Telematics" is a compound word combining "telecommunication" and "informatics." It is a wireless data service that allows the exchange of information through computer systems, wireless communication technology, GPS (Global Positioning System), and TTS (Text to Speech) technology in transportation media such as vehicles, aircraft, and ships. Telematics in vehicles is considered to be the field with the most potential [5]. Based on advanced technology like mobile telecommunication and LBS (Location Based Service), the vehicle telematics service offers accident detection, a security system, a drive path guide, and traffic and living information. Thus, next-generation telematics systems will develop from traffic information providers into multi-functional systems [8].

The primary task of driving requires various interactions with numerous devices. It is therefore necessary to develop an eyes-free haptic interface to reduce the workload from telematics manipulation and the distraction from interaction with the systems. Generally, telematics devices in vehicles require drivers to use numerous interaction methods such as push buttons, touch screens, and speech recognition. These complex interactions may cause difficulties in controlling the devices while driving. Also, there is a technological limitation in applying multi-function displays and controls for next-generation telematics, which need a multi-modal interface [9].

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 517–526, 2007. © Springer-Verlag Berlin Heidelberg 2007
518
Y.G. Ji et al.
Therefore, simplifying complex GUI interactions to reduce driver workload and secure driver safety has become an important issue for next-generation telematics systems. New interaction methods are needed to overcome the shortcomings of current speech recognition and touch screen technology; multi-function display and device control matching the driver's mental model would be a solution. The haptic interface will be an important part of intelligent vehicles, and drivers will manipulate devices easily using a haptic interface [1]. Moreover, the haptic interface will help us obtain core technology that can affect the future market of intelligent vehicles.

If tactile feedback is supported in telematics devices, it would greatly reduce the chance of telematics system errors caused by overload during driving. Furthermore, tactile feedback provides drivers with useful information on the functions of telematics, contributing to the reduction of driver distraction and further guaranteeing driver safety. While the previous haptic interface model, with its difficult and complex manipulation, has been inefficient, the new telematics device will support efficient and instinctive manipulation using combinations of tactile feedback [11].

Due to the lack of quantitative evaluation in early design steps, customer needs have not been reflected sufficiently in the haptic device. In this study, to reflect customer needs properly and to offer an effective evaluation method, we developed an Analytic Hierarchy Process (AHP) evaluation model for multi-criteria decision making. In the AHP evaluation model for an early prototype, quantitative values of each evaluation factor were computed using hierarchical analysis [14]. As a result, an objective and quantitative evaluation model for a user-centered haptic device design was developed. It will be useful to both drivers and developers.
2 Literature Review

Previous studies on haptic interface and device evaluation were reviewed to collect the evaluation factors for the AHP model. As human-machine interaction within vehicles becomes more and more complex, an advanced interface is needed. Interaction types like the haptic interface, touch screen, and voice control have been developed to manipulate multi-functional systems, and these interaction types have been compared in usability studies [2]. To study the relationship between practical applications and the haptic interface's hardware/software, user perception and motor control in haptic mode were evaluated: E. Kirkpatrick (2002) researched the hardware/software requirements for a haptic interface [6]. A study on how to design a haptic user interface and application was performed by Steven Wall (2006) [15]. It presented a heuristic guideline for improving usability, which was acceptable to users, and analysed the relations among the haptic device, human perception, and computer application; as a result, a haptic interface to improve interaction between the user and computer was proposed. A study of interface attributes of in-vehicle navigation shows that controllability, like speed and accuracy, as well as ease of use, is highly related to user satisfaction in car navigation systems. So, Robert (2003)
researched device design that considers users who have no experience manipulating complex devices [12]. E. Kirkpatrick (2001) researched predicting usability performance in the haptic environment by measuring the time needed to perceive the shape of a physical object [7]; this study offered ways to improve user performance by using tactile feedback from the haptic interface. Mark Evans (2005) conducted a usability evaluation of tactile feedback devices to improve interaction in a virtual environment [10]; the limitations of commercial haptic feedback were described, and a guideline for haptic feedback in new product development was provided. P. Richard (1999) evaluated controllability and accuracy of object control using haptic feedback to evaluate user performance of haptic interaction in a virtual environment [13]; task performance decreased when visual, auditory, and tactile feedback were all offered in a simple task. According to that study, multiple feedbacks are not necessary, and a single feedback is more effective for a simple task. Also, Camilla Grane (2005) compared haptic information, graphic information, and integrated (haptic and graphic) information in a simple menu selection task, and Dario D. (2005) measured the effectiveness of different input methods while driving [3][4].
3 Methodology

To develop the AHP evaluation model for evaluating the haptic interface, preceding studies on the haptic interface were reviewed and evaluation indexes were extracted. After the review, 25 evaluation indexes were finally selected and classified into 7 evaluation groups by factor analysis. Then, using AHP, each index's weight was generated and the hierarchical structure of evaluation indexes was organised.

3.1 Selection of Evaluation Indexes

The literature review was used to generate the evaluation indexes used in the AHP evaluation model. Based on this review, and on studies about general device evaluation indexes and the haptic interface, 50 usability evaluation indexes were extracted, including previous studies' haptic interface evaluation indexes, as criteria for the usability and functionality of the haptic interface model. These 50 indexes were reviewed twice to make the criteria more objective and clear. An expert group interview was performed in the first index selection; the criteria for selection, unification, and exclusion were decided in this step. Indexes with similar or duplicate meanings were merged, considering the research purpose and the characteristics of the object, and ambiguous evaluation factors were excluded. In the second step, 25 indexes were finally chosen to evaluate the haptic interface model. The criteria were: scope of definition (inclusion of an index's concept or scope of generality), hierarchical relation among concepts (one index's concept is a subset of another's), and correlation among concepts (causation or correlation between the concepts of indexes). Table 1 shows the 25 indexes and their descriptions.
Table 1. Evaluation indexes

Learnability: The way of manipulation should be easy for novices to learn.
Memorability: The manipulation of the device controller should be easy for users to remember once learned.
Ease: The device controller should make selection/execution/level control of functions easy.
Flexibility: The manipulation of the device controller should be designed to connect flexibly for each menu/mode.
Efficiency: The device controller should work efficiently (functions have to be executed with the minimum number of key operations).
Effectiveness: The device controller should work effectively (functions have to be executed so as to minimise the workload of the user's hands and brain).
Performance: The user's capacity for tasks should be excellent.
Speed: Searching menus and lists, and performing the task, should be fast.
Accuracy: Input such as selection/execution/level control should be delivered to the system exactly through the device controller.
Controllability: The device should provide users with controllability.
Feedback: Feedback on errors, including tactile, visual, and auditory feedback, should be provided so clearly that users can recognise errors easily.
Prevention: The user's input errors or incorrect operations should be prevented in advance.
Recoverability: Exit or cancel should be provided so that users can escape from wrong input or unwanted menus.
Visibility: The current state of the device/system should be shown visually.
Durability: The device controller should be designed robustly so that malfunction or damage is prevented.
Safety: The device controller should exclude unnatural manipulation so that overload of the user is minimised.
Size: The device controller should be designed considering the common user's hand size.
Familiarity: The device controller's features, like shape/form/surface/elasticity/weight/tactility, should be designed to increase the sense of grip.
Arrangement: The device controller should be placed within the hand's radius of action so that users have no difficulty operating it (control of the device location should be provided).
Attractiveness: The way of manipulating, or the form of, the device controller should attract the user's interest.
Complexity: Complex manipulation for selection/execution/navigation of functions should be excluded.
Simplicity: The device controller should be simple to manipulate for selection/execution/navigation of functions (to minimise the user's workload).
Cognition: The device controller should be designed so that users can anticipate how to manipulate it and which function it will perform.
Consistency: Functions manipulated in a similar way should be designed to be manipulated similarly.
Discriminability: Functions with different manipulation should be designed to use a different manipulation or controller.
Table 2. Results of factor analysis (loadings of the 25 evaluation indexes on the 7 extracted factors)

Factor 1: Learnability 0.899, Memorability 0.877, Ease 0.864
Factor 2: Flexibility 0.198, Efficiency 0.819, Effectiveness 0.758
Factor 3: Performance 0.704, Speed 0.686, Accuracy 0.509, Controllability 0.366
Factor 4: Feedback -0.828, Prevention -0.763, Recoverability -0.646, Visibility -0.590
Factor 5: Durability -0.642, Safety -0.599
Factor 6: Size 0.452, Familiarity 0.644, Arrangement 0.731, Attractiveness 0.366
Factor 7: Complexity 0.876, Simplicity 0.881, Cognition 0.682, Consistency 0.765, Discriminability 0.710
3.2 Hierarchical Classification of Evaluation Indexes

For the AHP analysis, a hierarchical classification was conducted on the basis of the 25 indexes in Table 1. Hierarchical classification and grouping of related indexes guaranteed a more effective and efficient evaluation of the haptic device model. The degree of relation among evaluation factors was assigned by 9 evaluators: 2 points for high relation, 1 point for middle relation and 0 points for low relation. Using factor analysis,
the evaluated points were used to organise and group the evaluation indexes. Table 2 shows the results of the factor analysis along with the degree of relation among the evaluation factors. The 25 evaluation factors were classified into 7 groups in accordance with the results of the factor analysis, and terminology was defined to represent each group. The 7 groups were "interaction support," "function support," "user support," "information support," "device capacity," "device appearance" and "device control." These were reclassified under "manipulation" and "device" for the haptic model. These grouped evaluation indexes formed the basis of the AHP evaluation model for the haptic interface. Table 3 describes each group.

Table 3. Grouping of evaluation indexes

Manipulation
- Interaction Support: indexes related to the interaction between user and device for controlling the device (Learnability, Memorability, Ease)
- Function Support: indexes related to device function (Flexibility, Efficiency, Effectiveness)
- User Support: indexes related to the user's ability to perform a given task (Performance, Speed, Accuracy, Controllability)
- Information Support: indexes related to feedback or information about the state of the device (Feedback, Prevention, Recoverability, Visibility)

Device
- Device Capacity: indexes related to durability and capacity in designing the hardware (Durability, Safety)
- Device Appearance: indexes related to physical features like shape, size, and arrangement of the device controller (Size, Familiarity, Arrangement, Attractiveness)
- Device Control: indexes related to the manipulation method for selecting and operating functions using the device controller (Complexity, Simplicity, Cognition, Consistency, Discriminability)
3.3 Analytic Hierarchy Process Evaluation Model

The hierarchical structure generated from the factor analysis of the evaluation indexes defines the structural levels of the AHP evaluation model. To calculate the degree of significance, pairwise comparisons of indexes at the same level were conducted, and the relative comparison values were collected from 10 evaluators' checklists. From these values, we generated the eigenvector of each comparison matrix, which was used to deduce the relative significance of the criterion indexes. Expert Choice, an AHP tool, was used to generate the absolute significance values by weighting with the upper criterion's significance.
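The study itself used the Expert Choice tool, but the underlying computation of Saaty's AHP can be sketched in a few lines: the weights are the normalised principal eigenvector of a reciprocal pairwise-comparison matrix, found here by power iteration. The 3x3 comparison matrix below is a hypothetical example, not data from the paper.

```python
def ahp_weights(A, iterations=100):
    """A: reciprocal pairwise-comparison matrix (list of lists).
    Returns the normalised principal eigenvector, i.e. the AHP weights."""
    n = len(A)
    w = [1.0 / n] * n
    for _ in range(iterations):
        # One power-iteration step: multiply by A, then renormalise.
        w = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        w = [x / total for x in w]
    return w

# Hypothetical comparison of 3 criteria: the first is judged twice as
# important as the second and four times as important as the third.
A = [[1.0, 2.0, 4.0],
     [0.5, 1.0, 2.0],
     [0.25, 0.5, 1.0]]
print([round(x, 3) for x in ahp_weights(A)])  # [0.571, 0.286, 0.143]
```

Because this example matrix is perfectly consistent, the weights come out exactly in the ratio 4:2:1; real judgment matrices are usually slightly inconsistent, which is why tools such as Expert Choice also report a consistency ratio.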
Fig. 1. Hierarchical structure of evaluation indexes (Analytic Hierarchy Process evaluation model)
4 Results

The weights of the evaluation factors of the haptic interface were generated in the AHP model. The evaluation factors' weights were divided into local and global results. Local result refers to the importance of the evaluation factors in each group, and global result refers to the importance of the evaluation factors for the whole model.

Table 4. Local results

| Class | Weight | Evaluation group | Weight | Evaluation index | Weight |
|---|---|---|---|---|---|
| Manipulation | 0.642 | Interaction Support | 0.286 | Learnability | 0.302 |
| | | | | Memorability | 0.362 |
| | | | | Ease | 0.335 |
| | | Function Support | 0.240 | Flexibility | 0.300 |
| | | | | Efficiency | 0.354 |
| | | | | Effectiveness | 0.345 |
| | | User Support | 0.230 | Performance | 0.260 |
| | | | | Fast | 0.281 |
| | | | | Accuracy | 0.243 |
| | | | | Controllability | 0.216 |
| | | Information Support | 0.244 | Feedback | 0.295 |
| | | | | Prevention | 0.246 |
| | | | | Recoverability | 0.259 |
| | | | | Visibility | 0.201 |
| Device | 0.358 | Device Capacity | 0.315 | Durability | 0.519 |
| | | | | Safety | 0.481 |
| | | Device Appearance | 0.319 | Size | 0.242 |
| | | | | Familiarity | 0.287 |
| | | | | Arrangement | 0.265 |
| | | | | Attractiveness | 0.207 |
| | | Device Control | 0.365 | Complexity | 0.182 |
| | | | | Simplicity | 0.192 |
| | | | | Cognition | 0.233 |
| | | | | Consistency | 0.213 |
| | | | | Discriminability | 0.179 |
524
Y.G. Ji et al.
4.1 Local Results

Through analysis using the AHP model, the local and global results were generated. The local results refer to the relative importance of the evaluation groups and evaluation factors within their parent level. Table 4 shows these comparative importances. In the haptic interface evaluation model, the "Manipulation" group is more important than the "Device" group. Within the Manipulation group, "Interaction Support" was most important, and "Device Control" was most important in the Device group. This shows that the manipulation method and the information offered to the driver are more important than the device's appearance.

4.2 Global Results

The global results refer to the importance of each evaluation factor with respect to the whole haptic interface model. Table 5 shows these values. In the haptic interface evaluation model, "Interaction Support," "Information Support" and "Function Support" were the most important of the 7 evaluation groups, and "Memorability," "Ease," "Durability," and "Learnability" were the most important of the 25 evaluation factors.

Table 5. Global results

| Class | Weight | Evaluation group | Weight | Evaluation index | Weight |
|---|---|---|---|---|---|
| Manipulation | 0.642 | Interaction Support | 0.183 | Learnability | 0.055 |
| | | | | Memorability | 0.066 |
| | | | | Ease | 0.061 |
| | | Function Support | 0.154 | Flexibility | 0.046 |
| | | | | Efficiency | 0.054 |
| | | | | Effectiveness | 0.053 |
| | | User Support | 0.148 | Performance | 0.038 |
| | | | | Fast | 0.042 |
| | | | | Accuracy | 0.036 |
| | | | | Controllability | 0.032 |
| | | Information Support | 0.157 | Feedback | 0.046 |
| | | | | Prevention | 0.039 |
| | | | | Recoverability | 0.041 |
| | | | | Visibility | 0.031 |
| Device | 0.358 | Device Capacity | 0.113 | Durability | 0.059 |
| | | | | Safety | 0.054 |
| | | Device Appearance | 0.114 | Size | 0.028 |
| | | | | Familiarity | 0.033 |
| | | | | Arrangement | 0.030 |
| | | | | Attractiveness | 0.024 |
| | | Device Control | 0.131 | Complexity | 0.024 |
| | | | | Simplicity | 0.025 |
| | | | | Cognition | 0.030 |
| | | | | Consistency | 0.028 |
| | | | | Discriminability | 0.023 |
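The relation between the local and global results is a simple roll-up: each global weight is the product of the weights along its path in the hierarchy (e.g. Interaction Support: 0.642 × 0.286 ≈ 0.183; Learnability: 0.183 × 0.302 ≈ 0.055). A minimal sketch using a subset of the published numbers:

```python
# Roll-up of local weights (Table 4) into global weights (Table 5).
# Each entry is (local weight, children); only the Manipulation branch's
# first two groups are shown for brevity. Tiny differences from the
# published figures are rounding effects.

tree = {
    "Manipulation": (0.642, {
        "Interaction Support": (0.286, {"Learnability": 0.302,
                                        "Memorability": 0.362,
                                        "Ease": 0.335}),
        "Function Support": (0.240, {"Flexibility": 0.300,
                                     "Efficiency": 0.354,
                                     "Effectiveness": 0.345}),
    }),
}

def global_weights(tree):
    """Multiply weights down the hierarchy: class -> group -> index."""
    out = {}
    for _cls, (w_cls, groups) in tree.items():
        for grp, (w_grp, indexes) in groups.items():
            out[grp] = w_cls * w_grp
            for idx, w_idx in indexes.items():
                out[idx] = w_cls * w_grp * w_idx
    return out

g = global_weights(tree)
# g["Interaction Support"] ~ 0.184, g["Learnability"] ~ 0.055
```

This reproduces the published global values to within rounding, which confirms how Table 5 is derived from Table 4.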
As a result, developing a simple interaction method for the haptic device that is easy to learn to use, and providing feedback and information about system status, will make the haptic device effective and efficient. Consequently, considering these core factors of the haptic device will improve its usability.
5 Conclusion

In this study, we offered a priority-based, quantitative evaluation generated from statistical analysis of qualitative evaluations by experts in the early design steps of the haptic interface model. In conclusion, a developer can design a user-centered haptic device with the important considerations in mind, and a heuristic evaluation of a haptic interface prototype is possible using the AHP model. This will have a great impact on the advancement of haptic interface design and improvement.
References

1. Marcus, A.: The next revolution: vehicle user interfaces. Interactions 11(1) (2004)
2. Rydström, A., Bengtsson, P., Grane, C., Broström, R., Agardh, J., Nilsson, J.: Multifunctional Systems in Vehicles: A Usability Evaluation. In: Proceedings of CybErg 2005, The Fourth International Cyberspace Conference on Ergonomics, Johannesburg. International Ergonomics Association Press (2005)
3. Grane, C., Bengtsson, P.: Menu Selection with a Rotary Device Founded on Haptic and/or Graphic Information. In: Proceedings of the First Joint Eurohaptics Conference and Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems. IEEE (2005)
4. Salvucci, D.D., Zuber, M., Beregovaia, E., Markley, D.: Rapid Prototyping and Evaluation of In-Vehicle Interfaces. In: CHI 2005, Portland, Oregon, USA (April 2-7, 2005)
5. Electronics Information Center: Market trend of Telematics. Knowledge Research Group report (November 2003)
6. Kirkpatrick, E., Douglas, S.A.: Application-based Evaluation of Haptic Interfaces. In: Proceedings of the 10th Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems. IEEE, New York (2002)
7. Kirkpatrick, E., Douglas, S.A.: A Shape Recognition Benchmark for Evaluating Usability of a Haptic Environment. In: Brewster, S., Murray-Smith, R. (eds.) Haptic HCI 2000. LNCS, vol. 2058, pp. 151–156. Springer, Heidelberg (2001)
8. Gartner Inc.: Automotive Telematics Overview and Forecast (2002)
9. Isaksson, J., Nordquist, J.: Evaluation of haptic interfaces for in-vehicle systems. IEA (2003)
10. Evans, M., Wallace, D., Cheshire, D., Sener, B.: An Evaluation of Haptic Feedback Modelling during Industrial Design Practice. Design Studies 26(5), 487–508 (2005)
11. Payette, J., Hayward, V., Ramstein, V., Bergeron, D.: Evaluation of a Force Feedback (Haptic) Computer Pointing Device in Zero Gravity. In: Proceedings of the ASME Dynamic Systems and Control Division, DSC-vol. 58 (1996)
12. Llaneras, R.E., Singer, J.P.: In-Vehicle Navigation Systems. In: 2nd International Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design (2003)
13. Richard, P.: Dextrous Haptic Interaction in Virtual Environments: Human Performance Evaluations. In: Proceedings of the IEEE International Workshop on Robot and Human Interaction, Pisa, Italy (September 1999)
14. Saaty, T.L.: The Analytic Hierarchy Process: Planning, Priority Setting, Resource Allocation. McGraw-Hill (1980)
15. Wall, S., Brewster, S.: Design of haptic user-interfaces and applications. Virtual Reality 9, 95–96 (2006)
How to Make Tailored User Interface Guideline for Software Designers

Ilari Jounila
University of Oulu
P.O. Box 3000, 90014 University of Oulu, Finland
[email protected]

Abstract. A large number of user interface guidelines and patterns have been developed by different researchers. These patterns and guidelines are, however, either too generic or too specific to use. In addition, the multitude of guides makes it difficult to find and use them effectively. Because of these problems, the existing guides are not useful enough, e.g., for software designers. This paper describes the experiences and findings of a case study project. As a result of an iterative development process, a tailored user interface guideline is presented. Another result was that the guideline was well received by the software designers.

Keywords: User interface guidelines, software designers.
1 Introduction

User interface design guidelines, or guidelines for short, are one form of conveying existing usability knowledge to software developers. Other similar tools besides guidelines also exist: patterns, standards, design rules, and conventions. The number of these tools has increased extensively, especially guidelines and patterns. Guidelines can be divided into platform-independent design guidelines and platform-specific style guides [3]. Examples of style guides are the Apple Human Interface Guidelines [1] and the Windows interface guidelines [9]. Examples of design guidelines are the ISO 9241 standard with parts 10 to 17 [4] and the Research-based Web Design and Usability Guidelines [6]. In addition, design patterns, such as Patterns for Effective Interaction Design by Tidwell [11] and Interaction Design Patterns by van Welie [12], and also the mistakes that software developers often make [5], can be grouped with design guidelines rather than style guides. All of these artefacts include a large number of different methods, which produces problems such as the management and use of patterns [2]. Another problem is the generality of the guidelines, which is remarked upon by software designers and usability specialists alike [10]. Thus, it is difficult to find which guidelines are most appropriate to a designer's specific situation [3, 8]. However, Mosier and Smith state that it is not appropriate to make more specific guidelines because doing so reduces their suitability to a variety of system design applications [10].

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 527–535, 2007. © Springer-Verlag Berlin Heidelberg 2007
528
I. Jounila
Several tools have been developed to make usability guides more accessible through different systems, for example a tool that manages multiple guidelines via the Internet or locally [8]. The approach of this paper is a case study of a project whose purpose was to create a tailored and useful user interface guideline for software designers.
2 Related Work

The case project, Kätsy, consisted of our usability research group and a company whose business activity is software development. The company contacted our usability researchers because of their growing interest in usability issues. In the first meeting, the company listed their interests in usability, such as usability knowledge in general and existing methods. They also had expectations regarding two issues: a usability evaluation of their existing web-based Content Management System (CMS), as well as the short-term and long-term advantages of usability for the company. The company had developed the system without knowledge of usability; e.g., their product development process had not included the participation of end users at all. The CMS consists of several modules such as content production, content editing, content updating, etc. A new version of the CMS was under development, and therefore the company was interested in how to improve the usability of the CMS before releasing it.

Improving the usability of the CMS consisted of several parts. First of all, our usability research group was trained by the company for a few hours to use the system. We also organised a workshop together with the company, in which the user groups of the system and their typical tasks were identified using a persona method. Based on a selected persona, usability test tasks were designed by our usability research group and the company's software designers. After the workshop, a specified test environment was constructed by the company for the purpose of the usability tests. Defining the current state of the system's usability started with an expert evaluation, conducted while the usability research group familiarised themselves with the use of the test environment. In addition, four usability test sessions, each with one participant, were conducted to gather user feedback on the system.
Three participants were end users of the tested product and one had used a similar product earlier. The test facilitator and observers were all from our research group. Test participants were briefed shortly about the test session situation and filled out a user profile questionnaire. After that, they had about five minutes to familiarise themselves with the system before beginning the session. The think-aloud method was used during all the sessions. Each of the four test sessions took about one hour and was recorded. Finally, an after-test questionnaire was filled out by each participant.

Findings from all the test sessions were analyzed and combined into a test report. The test report consisted mostly of the system's usability problems, but good design solutions were also included. The test report and a test video run-through were presented to the company in a test report meeting. The company was very satisfied with the results and the provided solutions. However, one thing came up in the meeting: the company needed some
How to Make Tailored User Interface Guideline for Software Designers
529
concrete guides and rules to aid user interface development. At the end of the project, our usability research group produced two deliverables: a content analysis document based on the expert evaluation and the findings of the usability test sessions, as well as a user interface guideline document based on a literature review, the expert evaluation, and the findings of the usability test sessions.
3 Development Process of the Guideline

The findings from the usability tests and the expert evaluation, together with the software development company's needs, served as the starting point for developing the user interface guideline. Generic design guidelines and design patterns were used as the basis for illustrating bad and good design solutions of the system. In this study, generic guidelines mean guidelines provided, e.g., in textbooks such as GUI Bloopers [5], but also the ISO 9241 standard with parts 10 to 17 [4] and the Research-based Web Design and Usability Guidelines [6]. In addition, design patterns, such as Patterns for Effective Interaction Design by Tidwell [11] and Interaction Design Patterns by van Welie [12], were also included in this study. The guideline was written in Finnish because the company is a small national organisation.

The guideline was developed iteratively based on empirical findings. The first version of the guideline was implemented as text and picture examples from the generic design guidelines and design patterns. However, the use of such general examples involved some problems: too general text descriptions and picture examples were not easy to understand and use. The company pointed out that they preferred fewer general pictures from the literature and more picture examples from their own system. After three iteration rounds and three meeting discussions (essentially evaluation sessions), the guideline was accepted by the company. Depending on the occasion, 4 to 5 researchers and two persons from the company, a software designer and a development manager, participated in each meeting session. In these sessions, all of the participants followed the content of the guideline on a wide screen. User feedback was gathered through observations by the researchers and discussions with the company. The final version of the guideline provided 28 individual guides.
Each of them included a text-based description of the problem or good design solution, sample picture(s) of the company's own system only, and a recommendation of a better design solution or a mention of an already good design solution in the existing system.

3.1 Iteration 1: Preliminary Version

The first iteration started with a literature review of existing guidelines, principles and patterns. Some exclusions were made because a large number of different guideline collections were found. It was decided to include only well-known guideline and pattern collections in this guideline (e.g., GUI Bloopers [5], the ISO 9241 standard [4], the Research-based Web Design and Usability Guidelines [7], Patterns for Effective Interaction Design by Tidwell [11], and Interaction Design Patterns by van Welie [12]). Another purpose of this iteration was to select only those guidelines that were not followed by the CMS.
The first version of the guideline was produced as text and picture examples from the original generic design guidelines and design patterns. This preliminary version consisted of only four guides because of the need to get user (company) feedback before doing too much time-consuming work. Figure 1 shows an example of an individual guideline. The structure of the guideline was presented as the title, a description of how the design should be done, and finally an example picture. The title (number 1 in Fig. 1) was translated into Finnish, but the English analogue was also included. The textual description (number 2 in Fig. 1) was translated into Finnish from the original source. The example picture was included in its original form. This example was taken from the Research-based Web Design and Usability Guidelines [6].
Fig. 1. An example of the first version of the guideline. (1) Title of the individual guideline, (2) longer description of the followed guide (translated from English into Finnish from the original source), (3) an example picture from the original source. Numbers (1-3) were added to this picture to clarify the structure of the example.
The first version of the guideline was presented in a meeting with the company. This version's general examples posed some problems: too general guidelines with general picture examples were not easy to understand and use. The company pointed out that they preferred fewer general pictures from the literature and more picture examples from their own system. Other feedback was that the guideline should form a concrete and logical whole, and that more individual guides should be included in the next version.
3.2 Iteration 2: Restructured Version

After the feedback on the preliminary version, we looked at how to structure the content of the guideline logically. The second version of the guideline was implemented in the same way as the first version, but using picture examples from the company's own system. Also, the structure of an individual guideline was made clearer and more logical with numbered guides. Each individual guideline followed the same structure: (1) a title of the guideline in numbered order, (2) a description of the followed guideline, (3) a description of the identified problem, (4) a description of the proposed solution, (5) an example picture. In this version, the number of individual guidelines increased. The restructured version consisted of eight guides and four proposed guides with only a title. The guideline was still at a proposal level in this iteration because of the need to get the company's feedback before continuing its development. Figure 2 shows an example of an individual guideline of the second iteration based on the ISO 9241-12 standard [4].
Fig. 2. An example of the second version of an individual guideline: Initial position for entry fields. (1) The numbered title, (2) description of the followed guide, (3) the identified problem in the company's system, (4) the proposed solution, (5) an example picture from the company's system. Numbers (1-5) were added to this picture to clarify the structure of the example.
The second version of the guideline was presented in a meeting with the company. This style of implementation was found quite clear and illustrative by the company. However, the company proposed increasing the number of individual guidelines, e.g., for pop-up menus, wizards, and error messages. After this iteration, it was also concluded that the structure of the individual guideline should be specified more precisely.
3.3 Iteration 3: Superfine Version

The primary focus of the third iteration was to increase the number of examples of different user interface elements. In addition to bad solutions, examples of well-implemented solutions of the system were also included in the guideline. The structure of the individual guideline was also revised. This version of the guideline consists of an abstract, an introduction, a table of contents and several chapters of guidelines. The guideline included 28 individual guides on 38 pages (using MS Word). The structure of an individual guideline was again made clearer and simpler. Each individual guideline followed the same structure: the title, an example from the system with a problem description and picture, and a guideline/solution for the problem. Figure 3 shows an example of an individual guideline of the third iteration based on the ISO 9241-13 standard [4].
Fig. 3. An example of the final version of an individual guideline: "Error prevention and error messages". (1) The numbered title, (2) an example from the company's system including the problem and picture, (3) guideline/solution for the problem. Numbers (1-3) were added to this picture to clarify the structure of the example.
This version of the guideline was very well received and appreciated by the company. The representatives of the company commented in the meeting (translated into English): "[the document is a] concrete guideline", "[this guideline is] superfine because of using examples of our own system", "we will also go through [the guidelines] with our business partner" and "the guidelines will ensure a basic quality of usability for our company".
The third iteration version of the guideline was accepted by the company. Only some misspellings had to be corrected for the final deliverable version.
4 The Tailored Guideline

The final version of the guideline is a concrete presentation for a specific user interface design in a specific company. In this project, the deliverable form of the guideline was a Word document sent by email, together with the same document as a paper printout. Of course, the deliverable form depends on the users' needs. The tailored guideline supports the development of user interface design in a small software development company.

4.1 Structuring the Model of the Guideline

The proposed structure of a guideline consists of the title with an identifying number, an example from the user's own system with a description of the problem or well-designed solution as well as a picture, and also a short description of the general guideline and a solution to the specific problem (a solution is not needed if the example is well designed). Figure 4 shows the simple model of the structure.
Fig. 4. A proposed simple model of an individual guideline
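The proposed structure of an individual guideline can be read as a small record type. The sketch below is one possible representation; the field names and the sample values are illustrative assumptions, not taken from the paper.

```python
# Minimal data model for the proposed guideline structure (numbered title,
# an example from the user's own system with a problem/well-designed
# solution and a picture, and an optional correction). Field names and the
# sample values are illustrative assumptions, not the paper's.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Guide:
    number: int
    title: str
    example_description: str        # problem or well-designed solution
    example_picture: str            # screenshot of the user's own system
    general_guideline: str          # short description of the generic rule
    solution: Optional[str] = None  # omitted when the example is well designed

g = Guide(
    number=1,
    title="Initial position for entry fields",
    example_description="Cursor does not start in the first entry field.",
    example_picture="screenshots/entry_form.png",
    general_guideline="Place the cursor in the first field (ISO 9241-12).",
    solution="Set initial focus to the first entry field of the form.",
)
```

Leaving `solution` optional mirrors the rule above that a solution is not needed when the example already shows a well-designed feature.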
4.2 Criteria for Making the Tailored Guideline

Proposed criteria for making the tailored guideline:

1. The generic guidelines have to be adapted to be close to the user (examples have to be from the user's own system, including a description of the problem with a picture and a guideline/solution for how to correct it)
2. Guidelines have to be concrete
3. Guidelines have to be extensive enough but not too long
4. Guidelines have to include bad solutions of the system as well as well-designed solutions
5. The deliverable form has to be decided case-specifically
6. Iteration is needed when developing a tailored user interface guideline
5 Discussion

One of the ideas of the case project was to provide long-term usability knowledge for the case company. Because of this, a tailored user interface design guideline was developed for the case company, although Mosier and Smith suggested that it is not appropriate to make specific guidelines [10]. However, this study suggests that specific guidelines are needed, at least for a small-sized company. Due to the large number of general guidelines, finding the right guidelines for specific needs causes problems for software designers. Also, general guidelines are often too general to be used in a specific context.

The tailored individual guideline was built with a simplified structure and a fairly short length. The simplified structure was also supported by the expectations of the company. An important point is to use examples from the company's own system to describe the problems. Another important point is that the picture examples in generic guidelines confused the developers; this is why sample pictures should come only from their own system. It was found that the form of the deliverable is case-specific. The deliverable could be a paper document, a Word document, a web page, etc. Thus, specific tools are not needed.

This research found that developing a tailored user interface guideline is time-consuming, because the development is based on the findings of product education by the company, a requirement specification workshop, an expert evaluation, and four usability tests. Also, the large number of existing general guidelines made it challenging to find appropriate general guidelines for this work. However, the discussions with the company between iteration rounds were useful for deciding what to include in the guideline. The proposed criteria are useful when developing a guideline for a company without knowledge of usability issues, but perhaps they could be too restrictive for a company with usability knowledge (e.g., the guideline should be substantially more extensive than the one developed in this work). The proposed simple model of an individual guideline seemed to work in this project. However, it needs more study in further research, and the initial criteria should be defined more specifically. In the future, it would be interesting to see the usefulness of the developed guideline in the case project after six to twelve months. This study was one approach to teaching guidelines to software developers by making a tailored guideline. Thus, future work will include studies on how to teach existing guidelines and patterns to software designers, but also to other groups such as students.
6 Conclusions

This research concluded that the adoption of a tailored user interface guideline is more appropriate for software developers than generic guideline collections due to
its understandability and expression. The most important thing is to include examples of the developers' own system in the guideline.

Acknowledgments. I would like to thank the Kätsy project for providing a research environment. I also thank Dr. Timo Jokela, Kari-Pekka Aikio, Niina Kantola, and Mauri Myllyaho for feedback and comments on the developed guideline. In addition, this work would not have been possible without the software designers at the software development organization.
References

1. Apple Computer Inc.: Apple Human Interface Guidelines (2006). Last accessed 2007-02-15, http://developer.apple.com/documentation/UserExperience/Conceptual/OSXHIGuidelines/index.html
2. Deng, J., Kemp, E., Todd, E.G.: Managing UI pattern collections. In: Proceedings of the 6th ACM SIGCHI New Zealand Chapter's International Conference on Computer-Human Interaction: Making CHI Natural, July 7-8, 2005, Auckland, New Zealand, pp. 31–38 (2005)
3. Henninger, S., Haynes, K., Reith, M.W.: A Framework for Developing Experience-Based Usability Guidelines. In: Proceedings of DIS '95, pp. 43–53. ACM Press, New York (1995)
4. International Standards Organization: ISO 9241: Ergonomic requirements for office work with visual display terminals. Geneva, Switzerland (1999)
5. Johnson, J.: GUI Bloopers: Don'ts and Do's for Software Developers and Web Designers. Morgan Kaufmann, San Francisco (2000)
6. Koyani, S.J., Bailey, R.W., Nall, J.R.: Research-based Web Design and Usability Guidelines. Dept. of Health & Human Services, National Institutes of Health Publication 03-5424, National Cancer Institute, Washington, DC (2006). Last accessed 2007-02-13, http://www.usability.gov/pdfs/guidelines.html
7. Mariage, C., Vanderdonckt, J.: Creating Contextualised Usability Guides for Web Sites Design and Evaluation. In: Proceedings of the 5th Int. Conf. on Computer-Aided Design of User Interfaces CADUI'2004 (Funchal, 12-16 January 2004). Kluwer Academics, Dordrecht (2004)
8. Mariage, C., Vanderdonckt, J., Pribeanu, C.: State of the Art of Web Usability Guidelines. In: Proctor, R.W., Vu, K.-P.L. (eds.) The Handbook of Human Factors in Web Design (Chapter 41). Lawrence Erlbaum Associates, Mahwah (2004)
9. Microsoft Corporation: The Windows interface guidelines for software designers. Microsoft Press, Redmond, WA (1995)
10. Mosier, J.N., Smith, S.L.: Application of Guidelines for Designing User Interface Software. Behaviour and Information Technology 5(1), 39–46 (January-February 1986)
11. Tidwell, J.: Designing Interfaces – Patterns for Effective Interaction Design (2006). Last accessed 2007-02-13, http://designinginterfaces.com/
12. Welie, M. van: Patterns in Interaction Design (2001). Last accessed 2007-02-13, http://www.welie.com/index.html
Determining High Level Quantitative Usability Requirements: A Case Study

Niina Kantola and Timo Jokela
P.O. Box 3000, 90014 Oulu University, Finland
{niina.kantola, timo.jokela}@oulu.fi
Abstract. High-level quantitative usability requirements were determined for a public health care system. The requirements determination process was iterative, and the requirements were refined step-by-step. The usability requirements are categorized first through the main user groups, then by the services, and finally by specific usability factors.

Keywords: Usability requirements, health care systems, quantitative requirements.
1 Introduction

It is generally agreed to be good project management practice to define quantitative requirements for system quality characteristics. Quantitative, measurable quality requirements provide a clear direction for the work and acceptance criteria for a development project. In practice, usability requirements are quite seldom among the quantitative requirements in development projects. One of the consequences of not defining usability requirements is that other objectives dominate and usability is considered only a secondary objective of a project. The obvious consequence is a product with usability problems.

Our case study is a system development project where the city of Oulu is the purchaser of a system, and a consortium of two software development companies will develop it. The system to be developed is a healthcare system. The goal is that it will be extensively used by citizens. The healthcare professionals (doctors, nurses, etc.) would naturally also be users of the system. To make usability a true issue in the project, it was decided at the beginning of the project that measurable, high-level usability requirements be determined. In this paper, we present how we approached usability requirements determination in order to define the requirements at a high level of abstraction but still in a measurable way, and what the results were.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 536–543, 2007. © Springer-Verlag Berlin Heidelberg 2007
2 Related Work

Usability cannot be directly measured, but it is related to several attributes that can be measured. However, measuring usability is an issue that poses many questions: What
Determining High Level Quantitative Usability Requirements: A Case Study
537
are we measuring? How many measures do we take? How do we present the measures? Topics like these were discussed, for example, in the special issue on measuring usability of Interactions magazine (Nov + Dec 2006). According to [5], discussions have recently recurred on which measures of usability are suitable and on how to understand the relation between different measures of usability.

The literature recognizes several usability attributes that can be measured. ISO 9241-11 [6] defines usability as "the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use". In the early phases of development, measures of effectiveness, efficiency and satisfaction should be selected, and acceptance criteria based on these measures established. These attributes are generally measured on different scales, such as task completion rates, average time to task completion and average task satisfaction scores. Acceptance criteria may include separate definitions of the target level and the minimum acceptable level [12]. It is also possible to use different scales, for example worst, planned, best and current levels [15]. Other measurable usability attributes may include learnability, memorability, error, affect, helpfulness and control [11, 9]. There are also attempts to standardise traditional usability metrics on a uniform scale (e.g., [14]).

Several measurable usability attributes exist; however, there are no clear guidelines on how the determination of measurable, quantitative requirements should be organized and managed. Jokela [7] points out that the existing literature mainly focuses on describing and exploring the concepts and formats related to the definition of usability and the contents of the usability requirements document. Some guidelines are presented, for example, by Wixon and Wilson [16], Nielsen [11] and Mayhew [10].
Further, only very few empirical research reports exist on quantitative usability requirements methods in practice. One of the few is by Bevan et al. [3], who conducted case studies on quantitative usability evaluations following the Common Industry Format for usability testing, CIF [1]. However, the methodological aspects are not discussed in detail in their report. Jokela et al. [8] describe a case study where quantitative requirements played a key role in the development of a new user interface for a mobile phone. Because of the limitations of existing methods, they developed tailored methods for determining and evaluating quantitative usability requirements. To support the process of defining usability requirements and usability criteria, a working group sponsored by the National Institute of Standards and Technology (NIST) has recently developed the Common Industry Specification for Usability – Requirements (CISU-R). It is still in draft form, but it aims to define the content of usability requirements rather than to require a specific process by which they are gathered [13]. CISU-R has three parts: the context of use, usability measures and the test method. Scenarios of use play an important role in the process. In the first part, scenarios are used to specify how users carry out their tasks in a specified context. In the second part, usability measures are provided for the defined scenarios of use. The tasks (scenarios) selected should be those that are most frequent or most critical to the business or the user [13]. Determining whether the quantitative requirements have been achieved can be done through a usability test. Also, user preference questionnaires provide a subjective
538
N. Kantola and T. Jokela
metric for the related usability attribute. Usability can also be quantitatively evaluated with theory-based approaches such as GOMS and the keystroke-level model (KLM) [4]. Hornbæk [5] has reviewed current practice in how usability is measured. He has also analysed problems with the measures of usability employed. According to his analysis, such problems include, for example, the following: measures of the quality of interaction are used in only a few studies; measures of learning and retention of how to use an interface are rarely employed; and measures of users’ satisfaction with interfaces are in disarray, while readily available validated questionnaires are ignored. Based on his review, he proposes several challenges with respect to measuring usability [5].
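The keystroke-level model mentioned above predicts expert task-execution time by summing standard operator durations. A minimal sketch (the operator values are the classic ones from Card, Moran and Newell [4]; the example operator sequence is a hypothetical illustration, not a task from this study):

```python
# Keystroke-Level Model (KLM): predict expert execution time by summing
# standard operator durations (classic values from Card, Moran & Newell).
OPERATORS = {
    "K": 0.20,  # press a key or button (average skilled typist)
    "P": 1.10,  # point with a mouse to a target on the display
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation for an action
}

def klm_time(sequence):
    """Total predicted time (seconds) for a string of KLM operators."""
    return sum(OPERATORS[op] for op in sequence)

# Hypothetical task: point to a field, click, move to keyboard,
# then type a four-digit value.
task = "MPKHMKKKK"
print(round(klm_time(task), 2))  # -> 5.2
```

Such back-of-the-envelope predictions complement empirical measures: they can be computed before any prototype exists, whereas completion rates and satisfaction scores require users.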
3 Flow of the Case Study The case under study is a health care system, the aim of which is to provide web-based health services to the citizens of a city. Because use of the services will be voluntary for the citizens, usability is a key success criterion for the system. Therefore, it is essential to explicitly define the usability requirements for the system. A typical way of defining usability requirements is through user task performance. Because there typically are many kinds of users and many user tasks, the number of usability requirements can easily become quite large. The requirements should not be too numerous, yet they should still adequately depict the required usability. In practice, the requirements were determined in a qualitative and iterative way. The steps were:
1. The available documentation of the project was examined.
2. The key persons behind the project idea were interviewed; these persons held various managerial positions in the city and at the development companies. The goal of the interviews was to gain an understanding of the planned use of the system and of the critical success factors.
3. Thereafter, an interpretation of the interviews was made, and a first proposal of usability requirements was produced. The requirements included (1) three main user categories and (2) two to three usability requirements for each category.
4. This first set of requirements was presented to a small working group of the project steering group. The discussion revealed that some updates were needed. For example, the appropriate number of user categories should be four (not three).
5. Based on the feedback from the working group, the requirements were revised.
6. The revised set of requirements was then presented to a larger steering group of the project, with a larger number of participants. Another problem in the requirements was noticed in the meeting. An additional requirement was added “on-line” during the meeting, and thereafter the requirements were approved.
In summary, the research method was an iterative and constructive process. An artifact (the requirements) was constructed, based on usability experience and on the data that had been gathered through documentation and interviews. The artifact was evaluated and refined twice before being accepted.
4 Result: The Usability Requirements As a result, we have defined a set of usability requirements. The requirements identify the main user categories, and a set of one to four quantitative usability requirements is defined for each category. The main user categories are:
• “Customers”, i.e. the citizens of the city
• “Professionals”, i.e. the healthcare personnel of hospitals and health centers
In the following, we discuss the usability requirements for the two main user categories separately.
The “Customer” Category
In the “Customer” category, the following main services of the health care system were identified:
• Proactive healthcare: information about proactive healthcare issues, such as weight control, nutrition, and physical training
• Occasional healthcare problems: information for self-assisted care (how to act in occasional healthcare problems such as occasional fever, flu, small accidents, etc.)
• Chronic diseases: support for self-assisted care for diseases such as diabetes, asthma and arterial hypertension
Four main user groups within the “Customer” category were identified:
• Parents of small children
• Young people (teenagers, students)
• Adults
• Seniors
The relationships between the services and the user groups are illustrated in Table 1. One can see that the single most important service/user-group segment is the “Chronic diseases” service used by “Seniors”. On the other hand, chronic diseases are, for example, quite rare in the younger user groups.

Table 1. Services and the users of the “Customer” category of the healthcare system. The higher the number, the more important the user group.

User category              Proactive healthcare   Occasional healthcare problems   Chronic diseases
Parents of small children  1                      2                                1
Teenagers, students        1                      2                                1
Adults                     2                      2                                2
Seniors                    2                      1                                3
The representative user groups for each service were chosen to be the ‘demanding’ ones (= if these user groups can use the service, then one can assume that the other user groups can use it, too)1:
• Proactive healthcare: adults (including parents of small children)
• Occasional healthcare problems: adults (including parents of small children)
• Chronic diseases: seniors
The usability requirements for the services “Occasional healthcare problems” and “Chronic diseases” are shown in Table 2 and Table 3. One can see, for example, the importance of a positive first-time experience in the “Chronic diseases” service: 9 users out of 10 should be able to use the system and have a positive first experience.
Table 2. Usability requirements: Occasional healthcare problems (user group: adults)

Criterion: Easy-to-find instructions
Description: The user finds instructions for those typical sicknesses and accidents that one can care for by him- or herself (a separate specification of those sicknesses and accidents exists)
Goal: 50% of users find the instructions without contacting the health care personnel
Measuring means: Usability tests; post-release follow-up
Table 3. Usability requirements: Chronic diseases (user group: seniors)

Criterion: First-time usage
Description: The user needs to experience the system as useful and easy to use
Goal: 9 users out of 10 can perform the routine tasks related to their sickness correctly, and find the experience positive
Measuring means: Usability test

Criterion: Every day use
Description: The users should continue to use the system on a daily basis
Goal: 9 users out of 10 regularly use the system
Measuring means: Follow-up studies
The service “Proactive health care” was identified as a separate service only later, and its goals and measuring means have not yet been determined (Table 4).

1 It is assumed, however, that the users have used the Internet (web).
Table 4. Usability requirements: Proactive health care

Criterion: Easy-to-find programs
Description: The user finds proactive health care programs appropriate to him/her
Goal: (to be determined)
Measuring means: (to be determined)

Criterion: First-time usage
Description: Taking the system into use must be very easy
Goal: (to be determined)
Measuring means: (to be determined)

Criterion: Every day use
Description: The users should continue to use the system on a daily basis
Goal: (to be determined)
Measuring means: (to be determined)
The “Professionals” Category
Several user groups were identified within this category:
• Doctors
• Nurses
• Public health nurses
• Other personnel
The usability requirements for these different user groups, however, were consolidated into one table, Table 5. At this stage, it was not found necessary to define the requirements separately for different services.

Table 5. Usability requirements: Professionals

Criterion: Learnability
Description: Can be learnt without training
Goal: 9 experienced professionals out of 10 learn how to correctly carry out the routine tasks without training
Measuring means: Usability test

Criterion: Efficiency
Description: The users need to be able to quickly carry out time-critical tasks
Goal: Users can carry out time-critical tasks (which need to be identified) within the pre-defined time limits
Measuring means: Usability test

Criterion: Subjective satisfaction
Description: Pleasant to use regularly
Goal: 9 users out of 10 rate the system 1 point (scale 1…7) more pleasant to use than a reference system (= a system widely used in hospitals and health care centers)
Measuring means: Satisfaction measurement questionnaire
5 Conclusions In this study, a natural way of determining the requirements was first through user categories. This is probably not very surprising: “who are your users” is the key question when designing for usability. In all, the quantitative usability requirements determined in this study fall into the following hierarchical categorization:
• First by the main user groups (“Customers”, “Professionals”)
• Then by the services (“Occasional healthcare problems”, “Chronic diseases”, etc.)
• Finally by specific usability factors (“Learnability”, “Subjective satisfaction”, etc.)
We find that the usability requirements determined in this study have some new features:
• The overall idea of determining ‘high level’ usability requirements. The requirements outlined in Section 4 are defined at a quantitative but abstract level, at the level of services. User-task-based usability requirements could be determined without a detailed user task analysis. The ‘routine tasks’ are not determined at this stage, and need to be determined later.
• This kind of hierarchical categorization is, to our knowledge, quite new. Typically, usability requirements are “just” a set of individual requirements [2].
• The types of appropriate usability requirements are quite different between the different services and user groups.
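The hierarchical categorization described above can be illustrated as nested data. This is a hypothetical sketch, not an artifact of the project; the group, service and criterion names are taken from the tables earlier in the paper, with goals and measures abbreviated:

```python
# Hierarchical usability requirements: user group -> service -> criterion.
# Illustrative sketch only; goals/measures abbreviated from Tables 2-5.
requirements = {
    "Customers": {
        "Occasional healthcare problems": {
            "Easy-to-find instructions": {
                "goal": "50% of users find instructions unaided",
                "measure": "usability tests, post-release follow-up",
            },
        },
        "Chronic diseases": {
            "First-time usage": {
                "goal": "9 of 10 users perform routine tasks correctly",
                "measure": "usability test",
            },
        },
    },
    "Professionals": {
        "(all services)": {
            "Learnability": {
                "goal": "9 of 10 learn routine tasks without training",
                "measure": "usability test",
            },
        },
    },
}

def criteria_for(group):
    """List the (service, criterion) pairs defined for a user group."""
    return [(svc, crit)
            for svc, crits in requirements[group].items()
            for crit in crits]

print(criteria_for("Customers"))
```

Encoding the hierarchy explicitly, rather than keeping a flat list of individual requirements, makes it easy to check coverage per user group and per service.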
6 Discussion of Results In this study, preliminary quantitative usability requirements for a public health care system were determined. Overall, this study is one of the few case studies on quantitative usability requirements. A meaningful set of quantitative, high-level usability requirements could be determined, which was not at all obvious at the beginning of the research. The requirements determination process was iterative, and the requirements were refined step by step. The usability requirements are categorized first by the main user groups, then by the services, and finally by specific usability factors. As research contributions, we identify (1) the idea of determining high-level usability requirements at the level of services; (2) the hierarchical organization of the requirements; and (3) the finding that the types of usability requirements may be quite different for different categories of users and services. One should understand that the approach for defining usability requirements described in this paper is not proposed to be applicable as such to other development contexts. For example, the authors ended up with quite a different set of usability requirements in the context of a development project for the user interface of a mobile phone [8]. Another limitation of this study is that the health care system is still very much under development, and we do not yet have data on the appropriateness and usefulness of the requirements. These issues will be the topic of future papers.
For practitioners, the results indicate that the appropriate set of usability requirements is dependent on the specific application and development context. One should try to define requirements that truly depict the usability of the system or product under development. Research on quantitative usability requirements is quite limited. There is room and need for different kinds of research efforts: from better theoretical understanding to the development of effective practical methods.
References
1. ANSI: Common Industry Format for Usability Test Reports. NCITS 354-2001 (2001)
2. Bevan, N.: Practical Issues in Usability Measurement. ACM Interactions 13(6), 42–43 (2006)
3. Bevan, N., Claridge, N., Athousaki, M., Maguire, M., Catarci, T., Matarazzo, G., Raiss, G.: Guide to Specifying and Evaluating Usability as Part of a Contract, version 1.0. PRUE project. Serco Usability Services, London: 47 (2002)
4. Card, S.K., Moran, T.P., Newell, A.: The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, Hillsdale (1983)
5. Hornbæk, K.: Current Practice in Measuring Usability: Challenges to Usability Studies and Research. International Journal of Human-Computer Studies 64(2), 79–102 (2006)
6. ISO/IEC: 9241-11 Ergonomic Requirements for Office Work with Visual Display Terminals (VDTs) – Part 11: Guidance on Usability. ISO/IEC 9241-11: 1998 (E) (1998)
7. Jokela, T.: Guiding Designers to the World of Usability: Determining Usability Requirements Through Teamwork. In: Seffah, A., Gulliksen, J., Desmarais, M. (eds.) Human-Centered Software Engineering. Kluwer HCI Series (2005)
8. Jokela, T., Koivumaa, J., Pirkola, J., Salminen, P., Kantola, N.: Methods for Quantitative Usability Requirements: A Case Study on the Development of the User Interface of a Mobile Phone. Personal and Ubiquitous Computing 10(6), 357–367 (2006)
9. Kirakowski, J.: The Software Usability Measurement Inventory: Background and Usage. In: Jordan, P.W., Thomas, B., Weerdmeester, B.A., McClelland, I.L. (eds.) Usability Evaluation in Industry, pp. 169–177. Taylor & Francis, London (1996)
10. Mayhew, D.J.: The Usability Engineering Lifecycle. Morgan Kaufmann, San Francisco (1999)
11. Nielsen, J.: Usability Engineering. Academic Press, San Diego (1993)
12. NIST: Proposed Industry Format for Usability Requirements. Draft version 0.62, 8 Aug 2004 (2004)
13. NIST: Common Industry Specification for Usability – Requirements (2006), http://zing.ncsl.nist.gov/iusr/ (retrieved 16.2.2007)
14. Sauro, J., Kindlund, E.: A Method to Standardize Usability Metrics Into a Single Score. In: Conference on Human Factors in Computing Systems, Portland, Oregon, USA, pp. 401–409. ACM Press, New York (2005)
15. Whiteside, J., Bennett, J., Holtzblatt, K.: Usability Engineering: Our Experience and Evolution. In: Helander, M. (ed.) Handbook of Human-Computer Interaction, pp. 791–817. North-Holland, Amsterdam (1988)
16. Wixon, D., Wilson, C.: The Usability Engineering Framework for Product Design and Evaluation. In: Helander, M., Landauer, T., Prabhu, P. (eds.) Handbook of Human-Computer Interaction, pp. 653–688. Elsevier Science B.V., Amsterdam (1997)
Why It Is Difficult to Use a Simple Device: An Analysis of a Room Thermostat Sami Karjalainen VTT, P.O. Box 1000, 02044 VTT, Finland
[email protected] Abstract. A diversity of usability problems with office thermostats was found in a preceding study. In this paper, the reasons behind the problems are studied by analysing a room thermostat. The analysis shows that a substantial amount of information is needed to use even a simple thermostat, and that the system image of the thermostat does not deliver this information. In the light of the analysis, it is not surprising that office occupants have serious problems with thermostats. Keywords: thermostat, knowledge, information needs, user interface design.
1 Introduction Even a simple device can be very difficult to use. Contextual techniques were used to gain an understanding of usability problems with thermostats in offices [1]. Twenty-seven office occupants in 13 Finnish buildings were interviewed in their offices. They were asked to show and tell how they use the thermostats. The results show that office occupants have serious problems in using room thermostats, although the thermostats are simple. The problems office occupants have with room thermostats are fundamental and diverse. Office occupants do not always know that they have a room thermostat in a room, because the equipment is not recognised at all or the purpose of the equipment is unclear. Room thermostats are commonly installed high up on a wall or in some other location where they are impossible or awkward to use. It is also common that occupants in offices think that room thermostats are for service personnel only and do not dare to touch the thermostats themselves. User interfaces of room thermostats are often difficult to understand and use. Users do not know how much the dial should be turned to get the desired effect on room temperature. They may not dare to touch thermostats because their effect is not known. The lights and other symbols on thermostats are commonly not understood correctly. Often it is not even known whether the thermostat is active or passive at the moment. When a user adjusts a thermostat, she or he has the goal of setting the room temperature to a desired level. One may think it is a very simple task to adjust the room temperature with a thermostat. However, as shown above, a diversity of usability problems with thermostats was found. J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 544–548, 2007. © Springer-Verlag Berlin Heidelberg 2007
This paper concentrates on the reasons behind the problems: why is it difficult to use a room thermostat? The paper presents an analysis of a room thermostat and of the knowledge a user must have to be able to use the room thermostat with effectiveness and efficiency. The analysis is based on the experiences gained in interviewing 27 office occupants in 13 Finnish offices. Twenty-three of the occupants had a room thermostat in their office. All the room thermostats were non-programmable and simple. The interviewees had been working in their present rooms from one-and-a-half months to more than ten years, but they still had serious problems with the thermostats in the offices.
2 A Typical Example of a Room Thermostat Many kinds of room thermostats have been designed. Many of them are very complex, and it is clear that they cannot be used without a manual. This study, however, concentrates on simple room thermostats. I have chosen a typical example of a room thermostat for closer examination (see Fig. 1). The model presented in Fig. 1 is common in Finnish offices. Several companies manufacture practically identical versions of the room thermostat.
Fig. 1. An example of a room thermostat
The room thermostat has a dial for adjusting room temperature set point. The scale presents no temperature values, but only the symbols "+" and "–". To increase the room temperature, the user should turn the dial to the "+" direction, and to decrease the room temperature the dial should be turned to the opposite direction ("–"). The room thermostat presents a light symbol in the upper right corner of the interface. If the light is red, the system is increasing the room temperature. Correspondingly, a green light means that the system is decreasing the room temperature at the moment. A blank light denotes a stable situation.
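The light-symbol semantics just described can be captured in a few lines. This is a hypothetical sketch of the mapping, not vendor firmware; the threshold parameter is an assumption for illustration:

```python
# Light symbol of the thermostat described above:
# red = system is increasing room temperature, green = decreasing,
# blank = stable situation. (Illustrative sketch, not vendor firmware.)
def light_symbol(setpoint, room_temperature, tolerance=0.5):
    """Return the light shown while the system drives toward the set point."""
    delta = setpoint - room_temperature
    if delta > tolerance:
        return "red"     # increasing the room temperature
    if delta < -tolerance:
        return "green"   # decreasing the room temperature
    return "blank"       # stable situation

print(light_symbol(23.0, 21.0))  # -> red
```

The simplicity of the mapping underlines the paper's point: the difficulty lies not in the device's logic but in the user's lack of access to it.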
The room thermostat can be connected to a cooling or heating system, or it can be shared with both systems. Most typically in offices it is connected to a cooling system, for example, to a cooled beam system or a fan convector system. A separate heating system typically exists in Finnish offices. The heating systems typically include thermostatic valves for user adjustment.
3 Information Needs for Using the Room Thermostat The use of the room thermostat (Fig. 1) is analysed in Table 1. The table presents the information needs and possible misunderstandings with the thermostat. It also presents the consequences of the misunderstandings.

Table 1. Information needs for use of the room thermostat

1. Information needed: What is the purpose of the device?
Correct knowledge: It is a user-adjustable thermostat.
Possible misunderstanding: The purpose of the device remains unclear. It is not recognised as being for temperature control.
Consequences of misunderstanding: The room thermostat is not used even in thermal discomfort.

2. Information needed: Are office occupants allowed to touch the device?
Correct knowledge: Yes. The room thermostat is for the use of occupants.
Possible misunderstanding: The room thermostat is for service personnel only.
Consequences of misunderstanding: As above.

3. Information needed: Is the room thermostat active or passive at the moment?
Correct knowledge: Depends on the cooling/heating system and the current conditions (e.g. season).
Possible misunderstanding: A passive thermostat is considered to be active, or the other way around.
Consequences of misunderstanding: Use of a passive system leads to dissatisfaction with the system (or to a placebo effect).

4. Information needed: What do "+" and "–" mean in the interface?
Correct knowledge: "+" means an increase and "–" a decrease in the room temperature set point.
Possible misunderstanding: "+" means an increase and "–" a decrease in cooling power.
Consequences of misunderstanding: The dial is turned (i.e. the room temperature set point is adjusted) in the wrong direction. This may lead to dissatisfaction with the system.

5. Information needed: How much should the dial be turned to get the desired effect on room temperature?
Correct knowledge: There is no clear answer, because that depends on the characteristics of the cooling/heating system and the current conditions.
Possible misunderstanding: The adjustable range of room temperature may be understood completely wrongly. For example, it may be thought that the room temperature is adjustable over a very large range.
Consequences of misunderstanding: The dial is turned (i.e. the room temperature set point is adjusted) too little or too much. This may lead to unnecessary adjustments and dissatisfaction with the system.
Table 1. (continued)

6. Information needed: (After adjusting the thermostat) Has the room temperature changed to the desired level, or is it still changing?
Correct knowledge: A red light means that the system is increasing the room temperature at the moment. A green light means that the room temperature is decreasing. A blank light means a stable situation.
Possible misunderstanding: The light is not recognised at all, or the meaning of the light symbol is not understood.
Consequences of misunderstanding: The user may think, for example, that the room temperature has reached the new level, although it is still changing. This may lead to unnecessary adjustments and dissatisfaction with the system.
All the misunderstandings presented in Table 1 are real and are taken from the contextual interviews with the occupants. Most of the misunderstandings are common in practice. The analysis shows that bad user interface design may easily lead to dissatisfaction with the system or to disuse of the system. The analysis also shows that a lot of information is needed to use even a simple thermostat. In the light of the analysis, it is not surprising that office occupants have serious problems with the thermostats and that the significance of the thermostats for thermal comfort is low, as was found in [1]. Some of the information (Table 1) can be gathered by trial and error. For example, the meaning of "+" and "–" should be easy to learn by experience. However, it is clear that office occupants need instructions for the use of the thermostats. Occupants need to understand, for example, whether the thermostat is connected to a cooling or a heating system. If the thermostat is connected to a cooling system, the user of the thermostat needs to know when the cooling system is active. It is, however, unrealistic to suppose that office occupants would spend their valuable time on learning the way in which the building works.
4 Incoherent Mental Models Norman [3] distinguishes three aspects of mental models: the design model, the user’s model and the system image. According to the design model, the designer creates the system image, the visible part of the system (the user interface including the labels and the documentation). The user is confronted with the system image. The user acquires all knowledge of the system from the system image, and the user’s model is the way the user perceives that the system operates. We can drive a car without an understanding of how it actually works. Similarly, we should be able to use thermostats with only limited knowledge of the cooling and heating systems. Unfortunately, the system image of the thermostat in Fig. 1 does not deliver the information that is needed to operate the thermostat. The design model is not consistent with the user’s model, as it should be. The designer has not had a realistic view of the users but has supposed that office occupants have knowledge that they do not have in reality. Misunderstandings with thermostats have earlier been reported by Kempton [2]. He analyzed folk theories of home heating control and found two common theories of
how a thermostat works: a feedback theory and a valve theory. In the feedback theory, a thermostat senses room temperature; in the valve theory that is not understood, and the thermostat dial is thought to work like a gas pedal, controlling the amount of heat.
5 Improving the Design of the Thermostat The analysis shows that for the successful use of the thermostat the user must have a lot of knowledge. Although room thermostats are common in offices, the office occupants do not have that knowledge. This has led to a situation where the thermostats are used very little. It is clear that the user interface of the thermostat in Fig. 1 could be improved considerably. First, the thermostat should clearly present its purpose. Identifiability can be enhanced by symbols that refer to temperature, e.g. a degree sign, a thermometer, or red and blue colours (denoting warm and cool). Many of the problems users have could be avoided by just two further modifications to the user interface. If the thermostat had an understandable temperature scale, there would be fewer problems in adjusting the thermostat. Another main improvement concerns the feedback the thermostat gives after a user adjustment. Users need to know whether the system is working to fulfil the request. The feedback is especially important since the rate of room temperature change is slow because of the thermal inertia of the building materials and of the cooling/heating system itself. Many thermostats do not give any feedback; the thermostat in question shows a light symbol when the room temperature is changing. However, the light symbols are not intuitively understandable, and learning is needed to understand their meaning. The feedback should be presented more clearly (in one way or another), for example by arrow symbols that show the direction of the temperature change.
6 Conclusion and Future Work Even a simple device can be very difficult to use if the user does not have the information needed for the use of the device. The designers often overestimate the knowledge the users have, and that overestimation leads to usability problems and dissatisfaction with the system or even disuse of the system, which is the case with the room thermostat analysed in this paper. No specific usability guidelines are available for room temperature controls in the literature. In future work I will concentrate on developing such a guideline. Acknowledgments. I thank Raino Vastamäki for the picture of the thermostat in Fig. 1.
References
1. Karjalainen, S., Koistinen, O.: User Problems with Individual Temperature Control in Offices. Building and Environment (in press)
2. Kempton, W.: Two Theories of Home Heat Control. In: Quinn, N., Holland, D.C. (eds.) Cultural Models in Language and Thought, pp. 222–242. Cambridge University Press, Cambridge (1987)
3. Norman, D.A.: The Design of Everyday Things. Basic Books, New York (1988)
Usability Improvements for WLAN Access Kristiina Karvonen and Janne Lindqvist Department of Computer Science and Engineering, Helsinki University of Technology, P.O. Box 5400, 02015 TKK, Finland {Kristiina.Karvonen, Janne.Lindqvist}@tml.hut.fi
Abstract. Wireless Local Area Networks (WLANs) have become a commonplace addition to the normal environments surrounding us. Based on IEEE 802.11 technology, WLANs can now be found in the workplace, at homes, and in many cities’ central district areas as open or commercial services. Access points in public areas are called “hotspots”. They provide Internet access in various types of public places such as shopping districts, cafés, airports, and shops. As the hotspots are being used by a growing user base that is also quite heterogeneous, their usability is becoming ever more important. As hotspots can be accessed by a number of devices differing in their capabilities, size, and user interfaces, achieving good usability in accessing the services is not straightforward. This paper reports a user study and usability analysis of WLAN access to discover users’ needs and suggests enhancements to address the usability problems in WLAN access. Keywords: WLAN, usability, user interface design, security, accessibility, authentication.
1 Introduction Wireless Local Area Networks (WLANs) have become a commonplace addition to the normal environments surrounding us during the last few years, turning these environments into truly ubiquitous ones. Based on IEEE 802.11 technology [16], WLANs can now be found not only in the workplace, but also at homes, and in many cities’ central district areas as open or commercial services. Access points in public areas are called “hotspots”. They provide Internet access in various types of public places such as shopping districts, cafés, and airports [25]. As the hotspots are being used by a growing user base that is also quite heterogeneous by nature, their users will differ greatly in their capabilities and their level of understanding of the services and network connectivity available. This means that the usability of these access points is becoming ever more important for providing truly universal access to everyone. Furthermore, being able to understand how secure the WLAN connections are at any given time, as well as how to authenticate to the service, may present major obstacles to the actual accessibility of the provided services for many potential and actual users of these connections. As the authors of [25] point out, users will demand improvements in the simplicity, interoperability, reliability, usability, and quality of experience when interacting with these services. At the same time, they J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 549–558, 2007. © Springer-Verlag Berlin Heidelberg 2007
expect their security to be in place and their privacy protected, and to be in control of what information is disclosed about them. A further difficulty in providing easy-to-use WLAN access is caused by the fact that hotspots can be accessed by a number of devices differing in their capabilities, size, and user interfaces. As a result, achieving good usability in accessing the WLAN and the services it can provide is by no means straightforward. In this paper, we will look into the current work done in this area, covering relevant user studies done on hotspots as well as on other types of WLAN access, such as public WLAN services or private home WLAN access, to discover users’ needs and the current usage of the access points, and the usability work done to enhance the current solutions. We will also present and evaluate the methodologies used to study the usability of hotspots and other types of WLAN access, and the usability issues in the controllability and visualisation embedded in current approaches. The main body of this work consists of a report on a user study and usability analysis conducted to determine the current level of usability of WLAN hotspot access. The work covers a representative selection of earlier usability work done in this area or areas related to it: the relevant user studies, and the usability studies of existing solutions and UIs. It will also look into how various usability methods have been applied to the study of the usability of WLAN hotspot access and discuss their feasibility. The novelty of the work lies in that it not only covers both the usability of the access points themselves and the usability issues in some of the most probable end devices used to access the WLANs, but also seeks to point out the usability of the security issues involved in this access. The paper is organized as follows: First, we will give a short presentation of usability and user-centred design in general with regard to mobile usage situations with small devices.
We then present the relevant work done in this area and discuss the state of the existing usability work. Next, we present our user field studies, in which we searched for and located publicly visible WLAN access points in several locations in two cities in Finland. Since providing access also means dealing with users' privacy and security issues, we complement the analysis with a short discussion on privacy and the usability of security in the area of WLAN access.
2 Background

2.1 Security Issues in WLAN Access

The standardized way to secure WLAN access is based on the link layer: the radio traffic between the access point and the user's device is encrypted. The first version of the WLAN link-layer security architecture, Wired Equivalent Privacy (WEP) [16], was proven to be insecure [12], and a working attack tool was quickly implemented and published [33]. Today, there are free, easy-to-use attack tools downloadable from the Internet, e.g., for Windows and Linux, that can passively break the WEP protection in a few seconds [2]. The WLAN vendor community decided to solve the problem without waiting for the IEEE standardization body and formed an alliance to correct it. The result of the alliance was Wi-Fi Protected Access (WPA), which corrects the deficiencies of WEP. The standard security scheme is IEEE 802.11i [17], which is
Usability Improvements for WLAN Access
also known as WPA2. Although WEP is very insecure, it is still widely used for backwards-compatibility reasons.

Since the radio network is a shared medium, anyone in the proximity of the WLAN network (in clear and unobstructed space, the WLAN signal can reach more than 500 meters) can receive the broadcast traffic. This allows for a technique called wardriving [8]: the attacker merely drives around the city looking for open or vulnerable networks. The equipment needed is, again, software downloadable from the Internet [4] and a laptop or even a PDA. A community of security professionals and hobbyists gathered worldwide data, which revealed that out of 228,537 access points found, 61.6% were not using link-layer encryption at all [3].

In addition to link-layer encryption, WLAN networks are secured in two other common ways: MAC-address-based authentication and application-layer (usually Web-based) authentication. MAC-address-based authentication is used to allow only known computers into the network. However, it practically provides protection only against benevolent network visitors: attackers can simply eavesdrop to learn which MAC addresses can access the network and reconfigure their WLAN devices accordingly. Web-based authentication is used to authenticate users to the network. The network might not even use link-layer encryption; instead, the authentication process is secured with TLS, as secure Web sites are. Before the connection is authenticated, all traffic originating from the unauthenticated WLAN device is forwarded to the authentication web page. On that page, the user is required to give a correct user name and password, and is then given access to the Internet. It is also common to bind the authentication to the particular MAC address of the device that performed the authentication, which leads to problems that we describe in the discussion section.
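The Web-based authentication flow described above has an observable signature: before the user authenticates, any Web request is redirected to the portal's login page. The following sketch shows the core of how a client could detect this, assuming the final URL after redirects has already been obtained (e.g., via `urllib.request.urlopen(probe_url).geturl()`); the probe URL used in the example is an arbitrary placeholder, not from the study:

```python
from urllib.parse import urlparse

def is_captive_portal(probe_url: str, final_url: str) -> bool:
    """If a probe request to a known, stable URL ends up on a different
    host, the network has likely redirected us to an authentication page."""
    return urlparse(probe_url).netloc != urlparse(final_url).netloc
```

Modern operating systems use essentially this trick, probing a well-known URL on connect, to pop up hotspot login pages automatically.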
2.2 User Studies and Usability Testing of WLAN Hotspot Access

The usability work done in areas related to the usability of WLAN access falls under several subcategories of work done in the field of Human-Computer Interaction (HCI). These include work in the areas of personal computing, pervasive computing, wireless computing, mobile HCI, and, of course, the usability of WLAN access itself. However, most user studies that we know of focus on hotspot/WLAN access in a specific area, most notably university campuses, such as [14], [15], [24], [30], [31], a building-wide local-area wireless network [34], or a scientific conference [6]. This work does not, however, really give insight into understanding the usability issues embedded in accessing public APs on the fly, since the users were not at all mobile in their usage behaviours. [9] have tracked how public APs were utilized in Manhattan, N.Y., finding some similarities with the campus studies: the users were repeatedly using the same APs, another proof of the relative "immobility" of the subjects. However, finding the APs was again not a part of this study. [5], [7] have identified the current challenges and future directions for wireless hotspots. They mention, among other things, "device independence" as one key goal in enhancing the usability of WLAN access; the need for this became obvious in our study also. A further challenge is presented by defining the user's identity when accessing hotspots: which attributes to use in order to preserve privacy while at the same time preserving accountability. Existing solutions for easy-to-use WLAN hotspot access include FriendZone, whose usage study [10] was, however, not conducted with real users. Another interesting
piece of work is presented by [32], who point out that in mobile usability research in general, work on understanding the interaction between a device and its physical environment is still scarce, and that better utilization of geographical information could prove beneficial to the locating and naming of WLAN APs as well. [31] deal directly with how hotspots are currently found, the key outcome being that "word of mouth" currently serves as the primary information source about the location of available hotspots. They identify several usability problems in finding WLAN APs. These include locating distant networks, notification of new hotspots, finding and accessing the strongest signal (an approach that could easily be used for malicious purposes as well, as reported in [29]), and getting information about hotspots. [11] write interestingly about free WLAN access as part of the wireless commons, and about the means to prevent its misuse.

The usability methods applied to studying WLAN hotspot access usability include questionnaires, as e.g. in [20], [21], usage tracking and analysis, e.g. [5], [10], [24], and observations and interviews of users [31]. On the basis of the related work, several usability issues can be detected in the following areas of WLAN usability:

• The multitude of devices, differing in form, input mode, processing power, battery life, and screen size/resolution/colour depth [7], [32].
• The relative immobility of users in how hotspots are currently used and why this is so [9], [26], [31], and how hotspots can be found [7], [31].
• Location privacy, tracking, and transactions [1], [13], [15], [21], [23].
• Tradeoffs between usability and security. For example, according to [23], users require that their transactions over public WLANs be safe, yet they want seamless, automated roaming without the need for manual sign-on. What form of authentication is best from a usability point of view [7]?
[25] concentrate on privacy enhancements for AP usage. Further, work on home users' network access behaviours can be found in, e.g., [19], who evaluated usage behaviours and UI expectations in a smart-home environment with several users over an extended period of time (six months), where part of the network access was via WLAN. Three devices, a PC, a mobile phone, and a media terminal, were tested, and UI prototypes for these devices were designed and adjusted according to user feedback. It became clearly evident that the user expectations for each device were different, with the mobile phone becoming the most used device for controlling the smart-home functionalities despite initial reluctance and suspicion towards its suitability for operating the home. These results may have repercussions for the work at hand: the initial resentment of the mobile phone's small terminal was later overcome, and this device came to be preferred. In practice, this means that users were willing to trade usability for mobility and personal possession in actual usage.
3 The Usability Study

3.1 Test Setting

We searched for and located publicly visible WLAN access points in several locations in two cities in Finland. The discovered access points were of several different kinds:
part of a publicly available WLAN provided by the city; WLAN access offered by a private vendor, such as a café, as a free service; a company WLAN offered for outside visitors; a private home WLAN; and a WLAN offered by a public institute, such as a local university. The user studies were done by two researchers, one a security expert and one a usability expert, with three types of end devices: an Apple iBook G4 laptop with the Mac OS X operating system, a Nokia 770 series PDA device, and a Nokia 9500.

3.2 Test Procedure

A cognitive walkthrough method was used to simulate the steps and mindset of an actual mobile user who would be moving within a district, trying to find and utilize existing WLAN APs. The cognitive walkthrough is a well-established usability methodology that has been used effectively, e.g., in the classic study by Whitten et al. [36]. As further methodology, expert analysis, also comprising a heuristic analysis as reported, e.g., by [27], was used to analyse the usability problems detected during use. The two test persons tested different locations dispersed across two cities in the Finnish capital region, Helsinki and Espoo. The searching and accessing procedures were repeated in each location with at least two different devices, in most cases with all three.
4 Discussion

On the basis of the study, we were able to detect several generic usability problems in how the current access points are provided and visualised to end users, regardless of the device used. These usability problems include the naming of the available WLANs, their actual availability, the visualisation of the security and access possibilities of the WLANs, as well as problems arising from control issues in managing connections due to the dynamism of the WLAN search functionalities in the tested devices. Next, we discuss each of these generic issues in detail.

Naming. In the study, it became obvious that there were no standard and intuitive ways of naming the various WLANs available. For the most part, the WLANs were named according to the service provider (the company; the university; the city), according to the manufacturer of the WLAN device used (Linksys; Motorola), according to the WLAN owner ("pete"), or with a generic location name ("home network"; "home base"). Possibly better naming policies could be derived from [22], where users were asked to name locations in a mobile use situation. The name classifications that came up in that study included 1) generic locations, 2) points of interest, and 3) geographical areas. Since these may be the natural and intuitive location names for users, even if the focus of that work is on presence information, utilizing these categorizations in WLAN AP naming might prove beneficial to the overall usability and understandability of WLAN access.

Actual availability. The WLAN access points found were often not actually available. Also, the lists were not updated to reflect which APs were currently available.
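One way a connection manager could address this would be to order the AP list by current reachability and recent use rather than alphabetically. A minimal sketch; the `AccessPoint` fields are illustrative assumptions, not an actual device API:

```python
from dataclasses import dataclass

@dataclass
class AccessPoint:
    name: str
    reachable: bool   # currently detected in a scan
    last_used: float  # timestamp of last successful connection, 0 if never

def order_ap_list(aps):
    """List reachable APs first; within each group, most recently used first."""
    return sorted(aps, key=lambda ap: (not ap.reachable, -ap.last_used))
```

Reachable APs then always sort ahead of stale entries (such as APs recorded in another city), and the most recently used network, typically the one the user wants, comes first.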
Visualisation of the security and access possibilities. The various types of locks associated with the different WLAN security features (WEP, WPA) were not intuitive to the users. Further, the WLAN APs were often visually suggestive of being openly accessible when in reality they were not. In many cases, such as when trying to find open access in a public place, it is futile to show users APs they cannot access at all. An improvement would be to allow the user to select a listing of only non-secured WLAN APs.

Managing the connections list. The tested devices had a built-in feature of constantly searching for available access points, regardless of the user actions conducted at the same time. It was not possible for the user to stop the search once a desired AP had been found. It was also not possible to organize the list of access points in any way except by naming policy. The only listing that was easily available (or available at all) to the users was an alphabetical listing based on the default or user-specified name of the AP. In many cases, this was the worst possible order, since the list contained all WLAN APs added at any point during usage, including APs found in another city, for example. A usability improvement would be more advanced ways of arranging the listed APs, for example according to preference or most recent use. Further, the search should be stoppable, and restartable, by user command.

In addition to the generic usability problems discovered, each end device tested also exhibited several device-specific usability issues that added to the issues embedded in WLAN access itself, including how connections are established, shown, maintained, managed, and accessed via the devices.

Nokia 770. The Nokia 770 was clearly the easiest to use for WLAN access, which is natural since WLAN is its major connectivity type, alongside short-range options such as Bluetooth connections via a mobile phone.
Access to WLAN was rather straightforward, with one-step access from the main screen to the connections. However, changing the connectivity settings was possible only via the control panel, not from the connection manager directly. A clear improvement would be to allow connectivity editing from the connection manager as well. Further, the dynamic search process, with no 'pause' possibility, made managing the found APs very cumbersome, since the list of connections was changing constantly. Adding a 'pause' button to the search would benefit the experienced usability of the search and of list handling. Further, the established and saved APs were listed in a single list in alphabetical order. Because of this, APs that were not accessible at the time could appear first in the list. A clear improvement would be to include multiple ways of organizing the list of APs, e.g., by recent use, location, or current availability. Further, even though the device showed via an icon whether an AP was in fact reachable, the user was able to select any AP, go all the way through the connection process, receive an acknowledgement of a successful connection, and only when opening, e.g., a Web browser get a notice of a failed network connection. A usability improvement would be to disable connections to APs that the system has detected as unreachable.

Nokia 9500 (Communicator). The Nokia 9500 Communicator offers two ways to start using a WLAN access network. The main screen shows a white W sign on the left if WLAN access is available. The first way is "EasyWLAN", where the user is shown a list
of available access points. However, the list does not provide any information about the access points, only the names of those currently available. To get more information, the user must explicitly configure the access point through the process shown in Figure 1. After the access point has been configured, the user can start, e.g., a Web browser. The user is then prompted to "Select an access point". By default, the list of available network connections also shows, e.g., GPRS and other configured network access options. The user thus must remember which one to connect to.
Fig. 1. The multiple steps required for handling WLAN access in Nokia 9500
With this device, the steps required for managing and starting WLAN access were quite cumbersome, consisting of multiple non-straightforward steps. Further, the language and terminology used in the UI were quite technical, effectively diminishing an average user's capability for any WLAN access management. A clear improvement in the usability of WLAN access with the Nokia 9500 would thus include at least making the UI language more user-friendly, as well as cutting down the steps required to form a WLAN connection in the first place. Further, major visualization enhancements would be desirable. The current icons used for the actual access and for the strength of the signal will probably be incomprehensible to most users; especially if WLAN usage is infrequent, no learning effect can be expected.

Apple iBook G4 laptop with Mac OS X. Maybe one of the biggest problems with the laptop, besides the obvious fact that a laptop-sized connection device is not truly feasible in the use situation described in this study, was that it would find only a small percentage of the available APs at each location compared with the other two devices. This ate into its reliability as a means of WLAN access in the first place, and left the user frustrated and without any connections, since they would not
be found. In usability there is a saying: "if functionality is not found by the user, it doesn't really exist". This truly holds for trying to get WLAN access with a laptop.

A further difficulty was presented by changing the connecting device. In the case of restricted access with a temporary username and password to a publicly available WLAN, such as the Helsinki city public WLAN, once the user had logged into the system with one device, it was not possible to change the end device. If the batteries ran out, or if the user decided to change to a device better suited for browsing the available services, the initial choice would effectively stop the user from accessing the service at all, since it was not possible to log out of the service.

4.1 On the Privacy and Usability of Security in the Area of WLAN Access

Privacy Enhancing Technologies (PET) address four essential requirements for privacy: anonymity, pseudonymity, unlinkability, and unobservability [25]. [25] describe the current state of privacy protection in existing solutions for WLAN access as "complex, non-adaptive, intrusive for the user and not context-aware". Further, these solutions use only a very limited set of possible user identification parameters for accountability, such as ID, address, or location. In their approach, medical information about the user, collected from a body sensor network, is protected by automatic filtering when the user is using a hotspot service. However, the system described also allows for user control by enabling the user to assign preferred levels of privacy to the data. The aspects of the data whose privacy needs to be protected include location, context, the identity of the user, and the private information available.
The user of the AP should, then, be able to choose to reveal these different types of information about himself in different amounts and different combinations in different situations, and in an easy fashion.
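Such per-situation disclosure control could be modelled as a simple profile lookup. The data types, situations, and disclosure levels below are illustrative assumptions, not details from [25]:

```python
from enum import Enum

class Disclosure(Enum):
    HIDDEN = 0
    COARSE = 1   # e.g. city-level location, pseudonym instead of identity
    FULL = 2

# Per-situation privacy profile: data type -> permitted disclosure level.
PROFILES = {
    "public_hotspot": {"location": Disclosure.COARSE,
                       "identity": Disclosure.HIDDEN,
                       "context":  Disclosure.HIDDEN},
    "home":           {"location": Disclosure.FULL,
                       "identity": Disclosure.FULL,
                       "context":  Disclosure.COARSE},
}

def allowed(situation: str, data_type: str) -> Disclosure:
    """Look up how much of a data type may be revealed in a situation;
    anything not explicitly permitted defaults to HIDDEN."""
    return PROFILES.get(situation, {}).get(data_type, Disclosure.HIDDEN)
```

Defaulting unknown situations and data types to the most restrictive level matches the principle that disclosure should be an explicit, easy user choice.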
5 Conclusions

On the basis of the usability analysis presented, it is clear that there are several serious usability issues in the current UIs for handling WLAN access management. Based on the analysis, we are in the process of implementing the suggested usability improvements on the Nokia 770, and we then intend to conduct extensive usability testing of the new design with real users.
References

1. Ackerman, M.S.: Privacy in pervasive environments: next generation labeling protocols. Pers. Ubiq. Comput. 8(6), 234–240 (2004)
2. Anon.: Aircrack-ng (2007), http://www.aircrack-ng.org/doku.php, referenced 15.2.2007
3. Anon.: The Official WorldWide WarDrive (2007), http://www.worldwidewardrive.org/wwwdstats.html, referenced 16.2.2007
4. Anon.: Wardriving Tools, Wardriving Software, Wardriving Utilities (2007), http://www.wardrive.net/wardriving/tools, referenced 16.2.2007
Usability Improvements for WLAN Access
557
5. Balachandran, A., Voelker, G.M., Bahl, P.: Wireless hotspots: current challenges and future directions. In: WMASH '03: Proc. of the 1st ACM international workshop on Wireless mobile applications and services on WLAN hotspots, pp. 1–9. ACM Press, New York (2003)
6. Balachandran, A., Voelker, G.M., Bahl, P., Rangan, P.V.: Characterizing user behavior and network performance in a public wireless LAN. In: SIGMETRICS '02: Proc. of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, pp. 195–205. ACM Press, New York (2002)
7. Balachandran, A., Voelker, G.M., Bahl, P.: Wireless hotspots: current challenges and future directions. Mob. Netw. Appl. 10(3), 265–274 (2005)
8. Berghel, H.: Wireless infidelity I: war driving. Comm. ACM 47(9), 21–26 (2004)
9. Blinn, D.P., Henderson, T., Kotz, D.: Analysis of a Wi-Fi hotspot network. In: Proc. of the 2005 workshop on Wireless traffic measurements and modeling, pp. 1–6. USENIX Association, Berkeley, CA, USA (2005)
10. Burak, A., Sharon, T.: Analyzing usage of location based services. In: Proc. of Human factors in computing systems, pp. 970–971. ACM Press, New York (2003)
11. Damsgaard, J., Parikh, M.A., Rao, B.: Wireless commons perils in the common good. Commun. ACM 49(2), 104–109 (2006)
12. Fluhrer, S., Mantin, I., Shamir, A.: Weaknesses in the Key Scheduling Algorithm of RC4. In: Vaudenay, S., Youssef, A.M. (eds.) SAC 2001. LNCS, vol. 2259. Springer, Heidelberg (2001)
13. Gruteser, M., Grunwald, D.: Enhancing location privacy in wireless LAN through disposable interface identifiers: a quantitative analysis. In: Proc. of the 1st ACM international workshop on Wireless mobile applications and services on WLAN hotspots, pp. 46–55. ACM Press, New York, USA (2003)
14. Henderson, T., Kotz, D., Abyzov, L.: The changing usage of a mature campus-wide wireless network. In: Proc. of the 10th annual international conference on Mobile computing and networking, pp. 187–201.
ACM Press, New York, USA (2004)
15. Hong, J.I., Ng, J.D., Lederer, S., Landay, J.A.: In: Proc. of the 2004 conference on Designing interactive systems: processes, practices, methods, and techniques, pp. 91–100. ACM Press, New York, USA (2004)
16. IEEE: 802.11-1999. Information technology. Telecommunications and information exchange between systems. Local and metropolitan area networks. Spec. req. Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. IEEE, New York (1999)
17. IEEE: 802.11i-2004. IEEE Standard for Information technology. Telecommunications and information exchange between systems. Local and metropolitan area networks. Spec. req. Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. Amendment 6: Medium Access Control (MAC) Security Enhancements. IEEE, New York (2004)
18. Kanter, T.G.: Going wireless, enabling an adaptive and extensible environment. Mob. Netw. Appl. 8(1), 37–50 (2003)
19. Koskela, T., Väänänen-Vainio-Mattila, K.: Evolution towards smart home environments: empirical evaluation of three user interfaces. Personal Ubiquitous Comput. 8 (2004)
20. Lederer, S., Mankoff, J., Dey, A.K.: Who wants to know what when? Privacy preference determinants in ubiquitous computing. In: CHI '03: extended abstracts on Human factors in computing systems, pp. 724–725. ACM Press, New York, USA (2003)
21. Lederer, S., Hong, J.I., Dey, A.K., Landay, J.A.: Personal privacy through understanding and action: five pitfalls for designers. Pers. and Ubiq. Comp. 8(6), 440–454 (2004)
22. Lehikoinen, J.T., Kaikkonen, A.: PePe field study: constructing meanings for locations in the context of mobile presence. In: Proc. of the 8th Conference on Human-Computer Interaction with Mobile Devices and Services, MobileHCI '06, vol. 159, pp. 53–60. ACM Press, New York (2006)
23. Matsunaga, Y., Merino, A.S., Suzuki, T., Katz, R.H.: Secure authentication system for public WLAN roaming. In: Proc. of the 1st ACM international workshop on Wireless mobile applications and services on WLAN hotspots, WMASH '03. ACM Press, New York, USA (2003)
24. McNett, M., Voelker, G.M.: Access and mobility of wireless PDA users. SIGMOBILE Mob. Comput. Commun. Rev. 7(4), 55–57 (2003)
25. Mitseva, A., Imine, M., Prasad, N.R.: Context-aware privacy protection with profile management. In: Proc. of the 4th international Workshop on Wireless Mobile Applications and Services on WLAN Hotspots, pp. 53–61. ACM Press, New York (2006)
26. Nicholson, A.J., Chawathe, Y., Chen, M.Y., Noble, B.D., Wetherall, D.: Improved access point selection. In: Proc. of the 4th international conference on Mobile systems, applications and services, pp. 233–245. ACM Press, New York (2006)
27. Nielsen, J.: Usability Engineering. Academic Press, Boston (1993)
28. Palmieri, A., Sigona, F.: A QoS management system for multimedia applications in IEEE 802.11 wireless LAN. In: Proc. of the 5th international Conference on Mobile and Ubiquitous Multimedia, MUM '06, vol. 193. ACM Press, New York, USA (2006)
29. Potter, B.: Wireless hotspots: petri dish of wireless security. Comm. ACM 49(6), 5 (2006)
30. Päykkänen, K., Räisänen, H., Isomäki, H.: Mobile studying and social usability on a wireless campus. In: Proc. of the 8th Conference on Human-Computer Interaction with Mobile Devices and Services, vol. 159, pp. 269–270. ACM Press, New York (2006)
31. Roto, V., Laakso, K.: Mobile guides for locating network hotspots. In: Workshop on HCI in Mobile Guides (2005)
32.
Ryan, C., Gonsalves, A.: The effect of context and application type on mobile usability: an empirical study. In: Proc. of the Twenty-eighth Australasian Conference on Computer Science, pp. 115–124. Australian Computer Society, Inc. (2005)
33. Stubblefield, A., Ioannidis, J., Rubin, A.D.: Using the Fluhrer, Mantin, and Shamir Attack to Break WEP. In: Proc. of the Network and Distributed System Security Symposium. Internet Society (2002)
34. Tang, D., Baker, M.: Analysis of a local-area wireless network. In: Proc. of the 6th annual international conference on Mobile computing and networking, pp. 1–10. ACM Press, New York, USA (2000)
35. Venkatesh, V., Ramesh, V., Massey, A.P.: Understanding usability in mobile commerce. Commun. ACM 46(12), 53–56 (2003)
36. Whitten, A., Tygar, J.D.: Why Johnny Can't Encrypt: A Usability Evaluation of PGP 5.0. In: Proc. of the 8th USENIX Security Symposium. USENIX (1999)
A New Framework of Measuring the Business Values of Software

In Ki Kim1, Beom Suk Jin2, Seungyup Baek3, Andrew Kim4, Yong Gu Ji3,∗, and Myung Hwan Yun1

1 Department of Industrial Engineering, Seoul National University
2 Department of Industrial and Information Engineering, Yonsei University
3 Department of Industrial Engineering, Pennsylvania State University
4 Ubiquitous Computing Laboratory, IBM Korea
{lookat2, mhy}@snu.ac.kr, {kbf2514jin, yongguji}@yonsei.ac.kr, [email protected], [email protected]
Abstract. A new framework for measuring the business values of software is presented. The business values of software are categorized into two groups: tangible and intangible benefits. An implicit approach is used to quantitatively measure the intangible benefits of software by introducing two concepts, product attribute and quality attribute. The approach can relate the quantitative values from a usability test to the qualitative, intangible benefits of the software. As an example, the proposed framework is applied to a software system in its development stage. We demonstrate the capability of the framework to quantitatively measure the intangible benefits of software, as well as the tangible benefit, by means of a usability test.

Keywords: Software, Business value, Product attribute, Quality attribute, Usability test.
1 Introduction

In the development stage of business application software, many project managers need to minimize the risk of investment failure. To do so, they usually conduct project reviews assessing the potential benefit of the software's usability in terms of business value. There have, however, been many qualitative explanations of this potential benefit, which are somewhat vague in essence. One reason is that the potential benefits of software usability are relatively harder to quantify than the costs. In addition, it is difficult to find hard evidence supporting the common-sense expectation that the ease of use of software leads to improved productivity, or to identify specific ease-of-use characteristics that truly make software easier to use for a majority of users, as cautioned by Fried [1]. Furthermore, when problems are recognized after implementation begins, they cost considerably more to find and fix than at the requirements and design stage [2].
∗ Corresponding author.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 559–568, 2007. © Springer-Verlag Berlin Heidelberg 2007
There are, however, few systematic approaches in software engineering to evaluating these problems early [3]. Therefore, we propose a framework for a quantitative measurement of the business value (BV) of software in conjunction with usability testing. To this end, an implicit approach is applied, introducing several concepts to relate the quantifiable portion of the business value to a usability test, which can provide tangible data and the related financial value.

The remainder of this paper is organized as follows. Section 2 briefly reviews related work. In Section 3, details of the proposed framework are described. The framework quantifies the intangible benefits among the business values by introducing two concepts, quality attribute and product attribute, and utilizing them to relate the tangible data from the usability test to the intangible value. Section 4 presents and discusses the results of quantitatively measuring the business value for an example software system, named D-software, which is in its development stage. This software system has been developed to support developers of RFID applications by providing a means of visual representation, namely a Graphical Composition Language (GCL). Finally, concluding remarks are drawn in Section 5.
2 Literature Review

According to Karat [4], usability engineering can help developers produce marketable products that will be useful to the organization, the users, and the customers of their products. In this paper, we take the assumption that usability itself has significant advantages from the business perspective. This assumption leads us to develop metrics for quantifying software usability.

The standard economic model, one of the traditional methods for estimating the value of software, assumes that software consists of various design attributes [3]. Under this assumption, the economic model compares the costs and benefits of the implementation properties to estimate the values of the design attributes. This model, however, requires thorough, long-term observation of the trajectory of sales and the cost structure to estimate the value of software. In today's competitive industry environment, this is often not practical or even possible. Thus, several alternative methods have been suggested to analyze the costs and benefits of software attributes in information systems [5].

Mantei and Teorey [6] tried to tie human factors into the software development lifecycle. They calculated the tangible benefit and cost of applying a human-factors approach to the software lifecycle by using the results of task analysis and usability testing. They found that, even on a small scale, task analysis and usability testing can greatly reduce the time and effort required to estimate the total tangible benefit and cost. In addition, a usability test performed prior to the release of the product to market also enables the stakeholders to understand the potential intangible benefits and costs (see [6] for details of the intangible benefit and cost).

Krishnan [7] utilized two concepts, QA (Quality Attribute) and PA (Product Attribute), to quantify the benefits from the results of task analysis and usability testing.
A QA can be regarded as an abstract value, such as capability or reliability, perceived by customers from a number of PAs. A QA can be defined conceptually, and the relative
A New Framework of Measuring the Business Values of Software
561
importance of each QA can be estimated from the business and organizational perspective. There is, however, no predefined set of QAs, nor are their importance values unique. The reason is that a new QA can always be introduced as the market environment changes, which accordingly modifies the original relative importance of the QAs. According to Bachmann and Bass [8], a QA arises from multiple interactions among multiple PAs and is reflected through specific user interfaces. In contrast to a QA, a PA is concrete and tangible; it can be a component and/or a function of the software. Each PA is closely related to the results of the usability test.
3 Methods

In this section, a framework is proposed for a quantifiable evaluation of the business value of software in conjunction with task analysis, questionnaires and usability testing.

3.1 Business Values of the D-Software

The survey and interview method is used to select the key business values of the D-software. First, a business value pool for RFID-related software is surveyed. Next, five factors for the D-software are identified in the list of business values through an interview with a sales expert (Table 1).

Table 1. Key business values

Business value | Proposition | Operational definition
Cost savings (BV1) | Tangible benefit | The amount of quantifiable savings in the system development and maintenance phases derived from usability-related factors
Ease (BV2) | Intangible benefit | Potential benefits due to ease of development, ease of use and ease of maintenance derived from usability-related factors
Quality (BV3) | Intangible benefit | Potential benefits in the quality of the system derived from usability-related factors
Flexibility (BV4) | Intangible benefit | Potential benefits regarding agile response to external environments and various user requirements, derived from usability-related factors
Extendibility (BV5) | Intangible benefit | Potential benefits that allow the system to be extended based on components, in light of usability-related factors
To develop a method for the quantifiable assessment of the key business values (BV_i, i = 1, …, 5), the values are categorized into two kinds of benefits according to whether they are financially measurable: 1) tangible benefits, which are financially measurable, and 2) intangible benefits, which are not.

3.2 Usability Evaluation Framework

The D-software is a component-based solution that facilitates RFID applications composed of various physical and logical devices such as RFID tags, readers, motion sensors and servers. Since the D-software is in the stage of development and revision, it is difficult to estimate and quantify the potential benefit of the software system due to a lack of information such as sales data and market feedback. Thus,
562
I.K. Kim et al.
the usability test and task analysis are conducted to quantify the benefit of the D-software. The hierarchical structure model of Seffah et al. [9] is utilized to quantify a user's performance in modeling an RFID application with the D-software (Fig. 1).
Fig. 1. Usability evaluation framework
From the usability test, which includes small tasks using the D-software, we can directly measure time, errors and subjective responses to the D-software. It is important to divide the whole process of use into subtasks, each matching a PA, at a sufficient level of detail. For the usability test, a use case is selected that is used in many commercial RFID-based solutions for supply chain management.

3.3 Measuring the Tangible Benefits of Software

For the tangible benefit, two categories in the standard cost structure of software development are considered as the main cost factors of developing an RFID application: 1) development cost and 2) maintenance cost. The development cost is estimated from the development scale, the number of people and the period. Using the estimated number of people and period, the total development cost (C_development) is computed as follows:

C_direct labor = C_engineer × N_people × N_day × N_month
C_overhead = C_direct labor × w_overhead
C_technology = (C_direct labor + C_overhead) × w_technology
C_development = C_direct labor + C_overhead + C_technology     (1)

where C_direct labor is the direct labor cost, C_engineer the unit cost of engineer payment, C_overhead the overhead expenses, C_technology the technology cost, N_people the number of people, N_day the number of average working days, N_month the number of months, w_overhead the ratio of C_overhead to C_direct labor, ranging from 110 to 120%, and w_technology the ratio of C_technology to the sum of C_direct labor and C_overhead, ranging from 20 to 40%. The maintenance cost per year (C_maintenance) is estimated using MD (Maintenance Difficulty) and C_development. The maintenance difficulty is computed using TMC (Total Maintenance Complexity), which is calculated by scoring the frequency of maintenance, the frequency of data manipulation, the interconnectivity to other systems, the
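To make Eq. 1 concrete, the cost roll-up can be sketched in a few lines of Python. All input figures below (engineer day rate, head count, duration, and the mid-range ratios w_overhead = 115% and w_technology = 30%) are hypothetical, chosen only for illustration:

```python
def development_cost(c_engineer, n_people, n_day, n_month,
                     w_overhead=1.15, w_technology=0.30):
    """Total development cost per Eq. 1.

    The ratios are hypothetical mid-range picks: w_overhead in 1.10-1.20,
    w_technology in 0.20-0.40, as stated in the text.
    """
    c_direct_labor = c_engineer * n_people * n_day * n_month
    c_overhead = c_direct_labor * w_overhead
    c_technology = (c_direct_labor + c_overhead) * w_technology
    return c_direct_labor + c_overhead + c_technology

# Example: 2 engineers at 100/day, 20 working days/month, for 3 months
print(round(development_cost(100, 2, 20, 3), 2))  # 33540.0
```

Here the direct labor cost is 12,000, overhead adds 13,800, and the technology cost adds 30% of their sum (7,740), giving 33,540 in total.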
A New Framework of Measuring the Business Values of Software
563
required knowledge and the divided transactions. Then, the C_maintenance induced by an RFID application is computed as follows:

MD(%) = 10 + (5 × TMC / 100)
C_maintenance = C_development × MD     (2)
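Equation 2 can be sketched the same way. The TMC score of 35 matches the D-software value reported later in Section 4.2, while the development cost figure is a hypothetical placeholder:

```python
def maintenance_cost(c_development, tmc):
    """Yearly maintenance cost per Eq. 2: MD(%) = 10 + 5*TMC/100."""
    md_percent = 10 + 5 * tmc / 100   # maintenance difficulty, in percent
    return c_development * md_percent / 100

# TMC = 35 gives MD = 11.75%; with a hypothetical C_development of 100,000:
print(maintenance_cost(100_000, 35))  # 11750.0
```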
3.4 Measuring the Intangible Benefits of Software

The benefit of a PA comes from two different aspects: the product and the task (T). Assuming a causal relationship between PA and the usability test, the benefit of PA_j can be expressed as follows:

B_PA^j = Σ_{k=1}^{K} (U_T^k × X_jk) / Σ_{j=1}^{J} Σ_{k=1}^{K} X_jk     (3)

where B_PA^j represents the benefit of PA_j; J and K are the numbers of PAs and tasks, respectively; U_T^k is the increased usability in a task T_k, i.e., the increased effectiveness, saved time or increased subjective satisfaction while conducting T_k; and X_jk is the random variable representing the relevance between each PA_j and T_k. The random variable is defined as follows:

X_jk = 1 if PA_j is fully used during a task T_k, and 0 otherwise.     (4)
Since the reference point of comparison is before the implementation of PA_j, U_T^k is expressed as a percentage. When PA_j is only partially used during a task T_k, it is highly recommended that T_k be divided further into subtasks T_k1 and T_k2, so that either subtask can be attributed to PA_j. In the end, the benefit of PA_j is the averaged sum of the error reduction, time savings or marginal satisfaction over each partitioned task involving PA_j. Task analysis serves here to divide the whole task into subtasks involving specific product attributes. From the estimated benefits of PA_j (j = 1, …, J) in Eq. 3, the intangible benefit of each QA is calculated as the linear combination

B_QA^l = Σ_{j=1}^{J} (B_PA^j × C_j^l)     (5)

where B_QA^l represents the intangible benefit of QA_l and C_j^l is the contribution of PA_j to QA_l, ranging from -1 to 1. The contribution is subjectively assessed by the stakeholders, who should have a thorough understanding of both the PAs and the concept of each QA. The range [-1, 1] is intuitive (a negative value means lessening the QA, and vice versa), as used by Kazman et al. [10]. The assessed contributions are then normalized between 0 and 1. Sometimes the benefit of a QA can exceed 100% because the assessed contributions of the PAs are independent of each other.
564
I.K. Kim et al.
The benefit of QA_l means the total reduced time and errors, or the increased satisfaction, from all the related product attributes. To integrate the benefits of the individual QAs into the overall intangible benefit, the relative importance of each QA should be assessed from the business perspective. It is advisable that a cross-functional group composed of project managers, sales persons and decision makers assess the relative importance of each QA collectively. The total intangible benefit (TIB) of the software is then estimated as

TIB = Σ_{l=1}^{L} (B_QA^l × I_QA^l)     (6)

where I_QA^l is the relative importance of QA_l, such that Σ_{l=1}^{L} I_QA^l = 1.
4 Results

4.1 Usability Test and Task Analysis of the D-Software

The usability test includes the evaluation of user satisfaction and performance through questionnaires and measurements in a small-group test. In particular, the test focuses on the modeling aspect of an RFID application using the D-software. The subject group is composed of 32 undergraduate and graduate students in software engineering (28 male, 4 female). Subjective responses are also collected via questionnaire. To set the reference point of usability, apart from the usability test, two experts, each with an MS degree in software engineering and over three years of experience in RFID programming, completed the same use case without the D-software (using manual coding). We analyzed the degree of user satisfaction (qualitative) and performance (quantitative) to understand the usability of the D-software through the usability evaluation model. From the observation of modeling with the D-software and with manual coding, the modeling stages can be categorized as follows from the results of the task analysis: 1) component searching, 2) component modeling, 3) component manipulation, 4) model searching, 5) error detection and correction, and 6) attaching related library supports. Table 2 below shows the metrics of the usability test.

Table 2. Metrics of usability testing

Metric | Description
Component searching time (CST) | Search time for choosing an appropriate component
Component manipulation time (CMT) | Time spent arranging a component
Component connection time (CCT) | Time spent connecting each component
Model searching time (MST) | Time the subject spends searching the conceptual model while constructing the system
Number of deleted components (NDC) | Number of components deleted during task completion
Frequency of code line deletion (FOCLD) |
Frequency of copy & paste of previously made code (FOCP) |
Frequency of adding description (FOAD) |
4.2 Tangible Benefit of the D-Software

Table 3 below shows the time consumed during each modeling stage, calculated from the measured times and frequencies, where τ_del is the unit time to delete a component or a line, τ_cp the unit time to copy and paste a block of code, and τ_des the unit time to add one line of description.

Table 3. Consumed time (sec) during each modeling stage

Modeling stage | D-software | Manual coding
Component searching (CST) | 372.10 | 2646.27
Component modeling (CCT) | 75.66 | 1549.00
Component manipulation (CMT) | 226.22 | 1261.17
Model searching (MST) | 77.60 | 716.17
Error correction (NDC×τ_del; FOCLD×τ_del) | 28.21 | 10.00
Copy & Paste / Descriptions (FOCP×τ_cp + FOAD×τ_des) | 0 | 463.00
The total modeling time is computed as the sum of the times consumed in the six modeling stages in the usability test. The total modeling time was 779.79 sec (0.217 hr) using the D-software and 6645 sec (1.846 hr) using manual coding, respectively. The total modeling time can be used to calculate the savings in the development cost C_development in Eq. 1. The daily working time is set to 8 hours. It is assumed that the decrease in productivity caused by increasing model size is relatively low in D-software modeling, so it is not applied to the D-software cost. For a conservative estimate of the cost savings, we assume that a general use case amounts to 30,000 lines of code. To eliminate the cost effects of the programming language, we assume that the GCL in the D-software is at a similar level of difficulty to Visual Basic, HTML, Delphi, etc., and that relatively little Java code is used for D-software modeling. Then the development cost C_development is calculated using Eq. 1. The overhead expenses C_overhead usually amount to 110-120% of the direct labor cost, and the technology cost C_technology to 20-40% of the direct labor cost plus overhead expenses. As a result, the savings in the development cost C_development achieved by the D-software compared to manual coding in C# are estimated to be greater than 93% in the use case. To estimate the maintenance cost C_maintenance of the D-software system relative to manual coding, the total maintenance complexity (TMC) is set to 35 for the D-software and 40 for manual coding in C#, respectively. The major difference between the D-software and manual coding in terms of operation and maintenance is that the D-software requires much less knowledge of the hardware devices and of the code itself. As a result, the TMC of the D-software is 5 points less than that of manual coding.
Here, we assume that the number of maintenance operations is below 4 per year (score = 0), that the number of data transactions is below 500,000 per year (score = 10), that the interconnections to other systems number more than 3 (score = 10), and that the divided transactions are in the integrated state (score = 10). We assume that the life cycle of the product is 5 years from the completion of development. The product is purchased at the completion of development, which is set to be year 0. Maintenance is repeated
every year except years 0 and 5. The discount rate is assumed to be 6% per year. The savings in the maintenance cost C_maintenance achieved by the D-software compared to manual coding in C# are estimated to be greater than 93% in the use case.

4.3 Intangible Benefit of the D-Software
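The maintenance-cost comparison above implicitly discounts a stream of yearly payments. Under the stated assumptions (maintenance occurs in years 1-4 only, 6% yearly discount rate), the present value of the maintenance stream can be sketched as follows; the yearly cost of 100 is a hypothetical placeholder, not a figure from the study:

```python
def present_value(yearly_cost, years, rate=0.06):
    """Discount a constant yearly maintenance cost paid in each year in `years`."""
    return sum(yearly_cost / (1 + rate) ** t for t in years)

# Maintenance occurs in years 1 through 4 (not in year 0 or the final year 5)
pv = present_value(100, range(1, 5))
print(round(pv, 2))  # 346.51
```

The same computation would be applied to the D-software and to manual coding, and the two present values compared to obtain the percentage saving.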
Before estimating the intangible benefit of the D-software, the key PAs and QAs should be identified. Five PAs, meaning the specific and representative characteristics embodied in the D-software, are identified by the developers: 1) GCL (Graphical Composition Language) (PA1), 2) state machine (PA2), 3) component library (PA3), 4) deployment (PA4), and 5) code generator (PA5). The survey and interview method is used to select the major QAs. First, candidate quality attributes used and emphasized in similar RFID-related software are collected. Next, four key QAs are identified through interviews with a sales expert and a project manager. The resulting list of QAs is identical to the four business values that constitute the intangible benefit, as shown in Table 1. The first step is to identify and quantify the relative importance of each QA_l, I_QA^l, in Eq. 6. The I_QA^l can be imposed from the managerial perspective; here, the value of each I_QA^l is obtained from a focus group interview (FGI) with a sales manager. According to the sales manager, ease of development, use and maintenance (QA1) is the most important of the four QAs; quality (QA2) and flexibility (QA3) are equally less important than QA1; and extendibility (QA4) has the least importance. From these comments, the values of I_QA^l for l = 1, …, 4 are obtained as follows:

Table 4. The relative importance of the four quality attributes

QA | Ease (QA1) | Quality (QA2) | Flexibility (QA3) | Extendibility (QA4)
Relative importance | I_QA^1 = 0.5 | I_QA^2 = 0.2 | I_QA^3 = 0.2 | I_QA^4 = 0.1
Next, the contribution of each PA_j to each QA_l is derived on a scale of 0 to +1 from the subjective evaluation by a programmer. The resulting J × L matrix, where J = 5 and L = 4, is the contribution matrix C_j^l for j = 1, …, 5 and l = 1, …, 4 in Eq. 5.

Table 5. The contribution of PA to QA

C_j^l | QA1 | QA2 | QA3 | QA4
PA1 | 0.70 | 0.40 | 0.70 | 0.50
PA2 | 0.70 | 0.20 | 0.70 | 0.20
PA3 | 0.80 | 0.10 | 0.70 | 0.50
PA4 | 0.60 | 0.20 | 0.60 | 0.70
PA5 | 0.90 | 0.30 | 0.20 | 0.20
We analyzed the results for user satisfaction (qualitative) and performance (quantitative) to understand the usability of the D-software through the usability evaluation framework. To simplify the analysis, the set of tasks is assumed to be identical to the set of modeling stages shown in Table 3. The increased usability in each task T_k, U_T^k, computed from the experiments, is as follows:

Table 6. The increased usability in each task T_k

Task | Increased usability
Component searching (T1) | U_T^1 = 0.85
Component modeling (T2) | U_T^2 = 0.95
Component manipulation (T3) | U_T^3 = 0.82
Model searching (T4) | U_T^4 = 0.89
Error correction (T5) | U_T^5 = -1.82
Copy & Paste / Descriptions (T6) | U_T^6 = 1.00
The random variable X_jk defined by Eq. 4 is shown in Table 7. The benefit of each PA, defined by Eq. 3, is computed using the U_T^k values in Table 6 and X_jk.

Table 7. The relevance between PA and T

X_jk | T1 | T2 | T3 | T4 | T5 | T6 | B_PA^j
PA1 | 1 | 1 | 1 | 0 | 1 | 0 | 0.05
PA2 | 0 | 1 | 0 | 1 | 1 | 1 | 0.06
PA3 | 1 | 1 | 0 | 1 | 0 | 0 | 0.16
PA4 | 0 | 1 | 0 | 0 | 0 | 1 | 0.11
PA5 | 0 | 1 | 1 | 0 | 1 | 1 | 0.06
The intangible benefit of each QA, defined by Eq. 5, is computed using the B_PA^j values in Table 7 and C_j^l in Table 5. We can then quantitatively compute the total intangible benefit (TIB) of the software, defined by Eq. 6, using I_QA^l in Table 4 and B_QA^l in Table 8. The resulting total intangible benefit of the D-software is 0.25.

Table 8. The benefit of QA

QA | QA1 | QA2 | QA3 | QA4 | TIB
B_QA^l | 0.33 | 0.09 | 0.27 | 0.21 | 0.25
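As a cross-check, Eqs. 3-6 can be replayed over the data in Tables 4-7. The small differences against the per-QA benefits in Table 8 stem from the two-decimal rounding of the intermediate B_PA values in Table 7:

```python
# Data transcribed from the paper's tables:
# U from Table 6, X from Table 7, C from Table 5, I from Table 4.
U = [0.85, 0.95, 0.82, 0.89, -1.82, 1.00]   # increased usability per task T1..T6
X = [[1, 1, 1, 0, 1, 0],                     # relevance X_jk, rows PA1..PA5
     [0, 1, 0, 1, 1, 1],
     [1, 1, 0, 1, 0, 0],
     [0, 1, 0, 0, 0, 1],
     [0, 1, 1, 0, 1, 1]]
C = [[0.70, 0.40, 0.70, 0.50],               # contribution C_j^l, columns QA1..QA4
     [0.70, 0.20, 0.70, 0.20],
     [0.80, 0.10, 0.70, 0.50],
     [0.60, 0.20, 0.60, 0.70],
     [0.90, 0.30, 0.20, 0.20]]
I = [0.5, 0.2, 0.2, 0.1]                     # relative importance I_QA^l

total_x = sum(map(sum, X))                   # denominator of Eq. 3 (= 17)
B_PA = [sum(u * x for u, x in zip(U, row)) / total_x for row in X]   # Eq. 3
B_QA = [sum(B_PA[j] * C[j][l] for j in range(5)) for l in range(4)]  # Eq. 5
TIB = sum(b * i for b, i in zip(B_QA, I))                            # Eq. 6

print([round(b, 2) for b in B_PA])  # [0.05, 0.06, 0.16, 0.11, 0.06]
print(round(TIB, 2))                # 0.25
```

The recomputed B_PA values match Table 7 and the total intangible benefit agrees with the 0.25 reported in Table 8.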
5 Concluding Remarks

This paper has presented a framework for measuring the business value of software. An implicit approach that introduces five product attributes and four quality attributes
makes it possible to quantitatively measure the intangible benefits of software, which have usually been assessed only qualitatively. Therefore, all kinds of business values, classified into tangible and intangible benefits, can be quantitatively measured, enabling project managers to evaluate software development projects in a quantitative way. The results for an example, the D-software tested in its development stage, showed that the proposed framework can be used to quantitatively evaluate the business value of software. This will help project managers reduce the risk of investment failure.
References

1. Fried, L.: Nine Principles for Ergonomic Software. Datamation 28(12), 163–166 (1982)
2. Boehm, B., Basili, V.R.: Software Defect Reduction Top 10 List. Computer 34(1), 135–137 (2001)
3. Scaffidi, C., Arora, A., Butler, S., Shaw, M.: A Value-Based Approach to Predicting System Properties from Design. In: 7th International Workshop on Economics-Driven Software Engineering Research, St. Louis, Missouri. ACM Press, New York (2005)
4. Karat, C.M.: Usability Engineering in Dollars and Cents. IEEE Software 10(3), 88–89 (1993)
5. Sassone, P.G.: Cost Benefit Analysis of Information Systems: A Survey of Methodologies. ACM SIGOIS Bulletin 9(2-3), 126–133 (1988)
6. Mantei, M.M., Teorey, T.J.: Cost/Benefit Analysis for Incorporating Human Factors in the Software Lifecycle. Communications of the ACM 31(4), 428–439 (1988)
7. Krishnan, M.S.: Cost, Quality and User Satisfaction of Software Products: An Empirical Analysis. In: CASCON '93, Toronto, Ontario. National Research Council of Canada (1993)
8. Bachmann, F., Bass, L.: Introduction to the Attribute Driven Design Method. In: 23rd International Conference on Software Engineering (ICSE 2001), Toronto, Ontario. IEEE Computer Society (2001)
9. Seffah, A., Donyaee, M., Kline, R.B., Padda, H.K.: Usability Measurement and Metrics: A Consolidated Model. Software Quality Journal 14(2), 159–178 (2006)
10. Kazman, R., Asundi, J., Klein, M.: Quantifying the Costs and Benefits of Architectural Decisions. In: 23rd International Conference on Software Engineering (ICSE 2001), Toronto, Ontario. IEEE Computer Society (2001)
Evaluating Usability Evaluation Methods: Criteria, Method and a Case Study

P. Koutsabasis, T. Spyrou, and J. Darzentas

University of the Aegean, Department of Product and Systems Design Engineering, Hermoupolis, Syros, GR-84100, Greece
Tel.: +30 22810 97100, Fax: +30 22810 97109
{kgp, tsp, idarz}@aegean.gr
Abstract. The paper proposes an approach to comparative usability evaluation that incorporates important relevant criteria identified in previous work. It applies the proposed approach to a case study of a comparative evaluation of an academic website employing four widely-used usability evaluation methods (UEMs): heuristic evaluation, cognitive walkthroughs, think-aloud protocol and co-discovery learning. Keywords: Usability evaluation methods, comparative usability evaluation, case study.
1 Introduction

There are many case studies of the application of usability evaluation methods (UEMs). However, knowledge about particular methods has not been extensively validated comparatively. Van den Haak et al. [17] note for the two widely used UEMs of think-aloud protocols and co-discovery learning that 'while there is a substantial body of literature which describes the methods as a tool for uncovering cognitive processes, hardly any research has focused on comparing the methods as a tool for usability testing'. The value of comparative studies of UEMs is indisputable, since they provide a consolidated understanding based on multiple usability evaluations. However, as noted in [4], 'researchers find it difficult to reliably compare UEMs because of a lack of standard criteria for comparison; standard definitions, measures, and metrics on which to base the criteria; and stable, standard processes for UEMs evaluation and comparison'. In addition, the validity and reliability of the results of the various UEMs have not been studied [1], while the methods themselves continue to evolve. This paper proposes an approach to comparative usability evaluation that incorporates important relevant criteria identified in previous work ([1] [2] [4] [5] [8] [9] [13] [16] [17]), and applies this approach to a case study of a comparative evaluation of an academic website employing four widely used UEMs: heuristic evaluation (HE), cognitive walkthroughs (CW), the think-aloud protocol (T-AP) and co-discovery learning (C-D).

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 569–578, 2007. © Springer-Verlag Berlin Heidelberg 2007
2 Related Work

A comparative usability evaluation involves multiple evaluators or evaluation teams that employ a single UEM or multiple UEMs to carry out parallel evaluations of the same target system. There are few comparative evaluations in the HCI literature. Hertzum and Jacobsen [5] present a comparative study of eleven UEM evaluations carried out with three of the four methods studied in this paper, namely CW, HE and T-AP. Their results show that the average agreement between any two evaluators who have evaluated the same system using the same UEM ranges from 5% to 65%, and that none of the three UEMs is in general more consistent than the others. Unfortunately, Hertzum and Jacobsen could not find studies in which heuristic evaluation was performed by evaluators who aggregated the results of their individual inspections into a group output (which is the case in our study). Heuristic evaluations are usually applied by a group of inspectors or users, and the individual results are then aggregated [12]. Van den Haak et al. [17] compare T-AP and C-D for testing the usability of online library catalogues. The UEMs were compared on four criteria related to digital libraries: the number and type of usability problems detected; the relevance of the problems detected; overall task performance; and participant experiences. The study involved 80 students. The main result was that the UEMs revealed similar numbers and types of problems that were equally relevant. Molich et al. [9] report the results of a comparative evaluation of a single web site (Hotmail) by nine professional teams. The goal of this study was to investigate the consistency of the results obtained. Each team was free to select its particular UEM and carry out the evaluation according to its own work practices.
The results of this evaluation are quite surprising: a large proportion (75%, or 232 of 310) of the usability problems identified were unique to the team that reported them, while only two usability problems of the target system were reported by six or more teams. Other comparative evaluations with different foci are presented in [3] [4] [6] and [10]. These comparative studies differ in terms of their goals and the criteria used to compare evaluator performance and/or UEMs. Of particular interest for comparative evaluation work is the work of Hertzum and Jacobsen [5], who investigate the evaluator effect in usability evaluations. The term denotes the fact that multiple evaluators evaluating the same interface with the same usability evaluation method detect markedly different sets of problems [6]. They propose three generic guidelines to minimize the evaluator effect:

▪ Be explicit on goal analysis and task selection.
▪ If it is important to the success of the evaluation to find most of the problems in a system, use more than one evaluator.
▪ Reflect on your evaluation procedures and problem criteria.

The work presented in this paper contributes to related work by synthesising a general set of criteria from previous work into a structured approach for comparative usability evaluations. Furthermore, it presents a case study of a comparative usability evaluation that provides various insights about the UEMs employed.
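The 'any-two agreement' figures cited from [5] can be operationalized as the overlap between the problem sets reported by two evaluators. A common choice, used here purely as an illustrative measure and not necessarily the exact formula in [5], is the ratio of shared problems to all problems either evaluator found, averaged over all pairs:

```python
from itertools import combinations

def any_two_agreement(problem_sets):
    """Average pairwise agreement |Pi & Pj| / |Pi | Pj| over all evaluator pairs."""
    pairs = list(combinations(problem_sets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

# Three hypothetical evaluators of the same system; problems labelled p1..p5
evaluators = [{"p1", "p2", "p3"}, {"p2", "p3", "p4"}, {"p1", "p5"}]
print(round(any_two_agreement(evaluators), 2))  # 0.25
```

Even in this tiny example the agreement is low (25%), which illustrates how figures in the 5-65% range reported by [5] can arise.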
3 A Structured Approach for Comparative Usability Evaluations: Criteria and Process

3.1 Criteria for Comparative Usability Evaluations

The criteria that can be taken into account in comparative usability evaluations can be distinguished by whether they refer to the evaluation target or to the UEMs themselves. An example of the first category is [17], who evaluate a web-based digital library focusing on layout, terminology, data entry and comprehensiveness. However, criteria related to the target system differ considerably between systems that follow different user interface paradigms. On the other hand, there are also generic criteria that refer to the UEMs and not the target system. Among these (a useful review is provided by [4]), the paper identifies the following as most important:

Realness (or relevance) refers to whether a usability finding is a real usability problem or not (or to what degree, i.e. a severe or an unimportant problem). According to [4], the realness of usability findings can be determined by: a) comparison with a standard usability problem list; b) expert review and judgement; and c) end-user review and judgement. Each approach has advantages and drawbacks regarding applicability, cost-effectiveness and trustworthiness. Further research in this respect includes severity ratings [11] and combinations of severity and probability of occurrence [15].

Validity (or accuracy) can be defined as the ratio of the number of real usability problems to the total number of findings (i.e. real problems and 'false alarms') for each application of a UEM [4] [16].

Thoroughness (or completeness) is identified in [4] and [16] as the ratio of the number of real usability problems found by the application of a UEM to the total number of usability problems that exist in the target system. Obviously, thoroughness requires that the total number of real problems be identified, through a detailed cross-examination of the results produced by all UEMs.

Effectiveness. The criterion of effectiveness for UEMs has been treated as synonymous with the thoroughness and validity of usability findings by most related work [2] [4] [8]; this is also in line with the definition of effectiveness in the ISO 9241 standard for usability as the 'accuracy and completeness with which users achieve specified goals'. Thus, the effectiveness of a UEM can be defined as the product of thoroughness and validity [4]. Some related work extends the definition of effectiveness with the predictive power of UEMs in relation to the uptake of usability findings by developer teams [7] [8]. The latter perspective has to cope with additional methodological considerations, not only about the persuasiveness of usability findings reporting, but also about the nature of the usability findings themselves: 'objective' usability problems (such as broken links in a web site) are far more likely to be addressed by development teams than 'subjective' findings (such as findings related to terminology), which are the most difficult to explain in usability reporting anyway.
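The three ratios above compose directly. The counts below are hypothetical, chosen only to show the arithmetic:

```python
def uem_scores(real_found, total_found, total_real):
    """Validity, thoroughness and effectiveness of one UEM application,
    following the definitions attributed to [4] and [16] in the text."""
    validity = real_found / total_found      # real problems among all findings
    thoroughness = real_found / total_real   # real problems found vs. all that exist
    return validity, thoroughness, validity * thoroughness

# A UEM reports 20 findings, 12 of which are real; cross-examination of all
# UEMs established 30 real problems in the target system.
v, t, e = uem_scores(real_found=12, total_found=20, total_real=30)
print(v, t, round(e, 2))  # 0.6 0.4 0.24
```

Note that effectiveness penalizes both false alarms (low validity) and missed problems (low thoroughness), so a method must do well on both to score highly.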
Consistency has been related to reliability [4] and repeatability [13]. In our work, we use a working definition of the consistency of UEMs in terms of repeatability: the extent to which multiple applications of different usability inspection methods produce 'reasonably similar' results. This working definition is similar to the approach of [9]. Again, trustworthy means of interpreting the similarity of usability findings are required; these may be addressed in the same ways as the realness problem.

3.2 Essential Process Steps for Carrying Out Comparative Usability Evaluations

The set-up and conduct of any comparative usability evaluation first of all needs to control, as far as possible, the aspects of the experiment that are related to the evaluator effect, and thus to conform with the guidelines proposed by [5]. Furthermore, the processing of results needs to ensure effective decision making about the realness or relevance of the results and about the similarity of the results obtained by the parallel usability evaluations. In order to address these issues, we propose the following guidelines for comparative usability evaluations:

Ensure common conditions when carrying out the usability evaluations: A number of issues related to the preparation and conduct of the parallel usability evaluations need to be addressed uniformly for each evaluation. In particular:

▪ Select evaluators that have a similar level of experience with usability evaluations. This can be achieved by selecting professional evaluators to carry out the experiments. When this is not possible and novice evaluators must participate, ensure that they work in teams and that they are closely supervised. Having more than one evaluator carry out a usability evaluation is also proposed by [5] to maximise the number of results obtained; when novice evaluators are employed, working in teams can also help them resolve issues about the conduct of the usability evaluation, provided they are supervised by an experienced evaluator.
▪ Assign UEMs to evaluators according to their experience. It is generally better to allow evaluators to select a method in which they have experience or feel most comfortable.
▪ Provide a common set of tasks to carry out. Unless a common set of tasks is provided, there is no way to ensure that the evaluators have examined the same, or at least similar, areas of the target system.
▪ Provide a common format for documenting and reporting usability findings. As Hartson et al. [4] remark, 'many UEMs are designed to detect usability problems but problem reporting is left to the developers/evaluators using the UEM... problem report quality will vary greatly according to the skills of the individual reporter at communicating complete and unambiguous problem reports.' Structured reporting of usability problems can significantly aid the processing of results, especially in the case of parallel usability evaluations.
Ensure effective decision making when processing the results of multiple usability evaluations: In particular, ▪ Select criteria for comparative usability evaluation: As discussed in related work there are various criteria that can be considered for comparative evaluations. We make use of the criteria list presented in section 3.1, in order to draw more general conclusions about UEMs. However, aspects related to the target system affect the performance of UEMs, such as the user interface paradigm (e.g. hypertext, WIMP, 3D, etc.) and the level of maturity of the target system (e.g. application or prototype). For example, it has been argued that usability inspection methods may be more appropriate for finding problems in the early design stage of an interactive system [3]. Therefore, any conclusions drawn need to be interpreted carefully in the context of the particular class of target systems. ▪ Select a decision criterion for relevance of usability findings: making decisions about the realness or relevance of usability findings has been addressed in various as discussed above. We have addressed the relevance problem in a two-stage approach: first, the evaluation teams provided as part of their documentation their argumentations upon each usability finding; secondly, all usability findings were rated by an experienced usability evaluator (the first author of the paper) upon a three-scale severity scheme: 0 – not a problem; 1: minor problem; 2: serious problem. ▪ Select a decision criterion for similarity of usability findings: when all evaluations are available, there is a need to go through the reports in order to identify the similarity of usability findings. Again, we followed an expert-based approach for this task, which is the most usual condition in parallel usability evaluations. In this case it is however generally advisable that more than one expert performs this task. 
Van den Haak et al. [17] used five experts to interpret the results of their comparative study. However, involving more than a single expert raises various practical problems: going through all evaluation reports, processing the large pool of data in terms of relevance and similarity, and resolving ambiguities and disagreements requires a lot of synchronous work. Therefore, like others (e.g. [9]), we used a single expert to go through the data.
4 A Case Study of Comparative Usability Evaluation

4.1 Evaluation Object

The web site evaluated is that of the Department of Product and Systems Design Engineering (www.syros.aegean.gr), University of the Aegean, which has been operating since September 2000. The web site was designed to address the emerging needs of the new department and has since been extended with web-based subsystems (both open source and in-house developments) that support administrative and teaching tasks.

4.2 Participants

The usability evaluations were carried out by MSc students of the department in partial fulfilment of their obligations for the course on interaction design.
574
P. Koutsabasis, T. Spyrou, and J. Darzentas
The students have a wide range of design backgrounds, having graduated from departments such as arts, graphic design, industrial engineering and information systems. Only two (out of a total of 27) students had limited experience of usability from their bachelor studies and had carried out a usability evaluation before. However, all students had considerable knowledge of the web site, since they had been using it regularly. Thus the selected subjects had a similar level of usability experience (novice) but a good knowledge of the target system. According to Nielsen [10], reporting in the context of heuristic evaluation, “usability specialists are better than non-specialists at performing heuristic evaluation, and ‘double experts’ with specific expertise in the kind of interface being evaluated perform even better”. Thus the subjects' lack of previous experience of usability evaluations was partly compensated by their good knowledge of the target system. Furthermore, the progress of the exercise was reviewed in weekly sessions with all teams in order to resolve queries and guide the smooth progress of the usability evaluations. Finally, the fact that the evaluation was carried out by teams rather than by single novice designers encouraged critical discussion and group decision making about the findings of the usability evaluation.

4.3 Tasks and Methods Selected

The evaluation teams were each assigned one of four usability evaluation methods: heuristic evaluation (HE, 3 teams), cognitive walkthroughs (CW, 3 teams), think-aloud protocol (T-AP, 3 teams) and co-discovery learning (C-D, 1 team), according to their degree of confidence in carrying out a usability evaluation with each of these methods. All four methods are widely used in industry and academia for usability evaluation.
The evaluation teams were provided with an analytic template for documenting the results, which included a table of contents for the usability report and a predefined categorisation of types of usability problems. The evaluation teams were to test the system by following two given user tasks:
▪ For a student, to locate information about a specific course: course description, instructor and online notes.
▪ For a visitor to the department, to locate the information necessary for visiting the department at Hermoupolis, Syros, Greece: the map of the town, accommodation information and travel information.
The evaluation teams were given a two-month period to organise, carry out and document the usability evaluation. Their main deliverables were the usability report and a presentation of their results in a discussion session open to all.

4.4 Results

Realness or relevance: The realness of usability findings (Table 1) is generally high for most methods, even reaching 100% in one application of HE. However, three applications of UEMs produced a rather large number of false (not real) usability findings: HE2, CW2 and C-D1. The fact that this variability appeared in three different UEMs leads to the conclusion that it cannot safely be related to intrinsic characteristics of the methods themselves, but rather to the inexperience of the evaluation teams.
Table 1. Realness of usability findings and severity ratings

UEMs    Usability   0: not a      1: minor      2: major      Real usability
        findings    problem       problem       problem       problems (1+2)
HE1     18          1  (5.6%)     6  (33.3%)    11 (61.1%)    17 (94.4%)
HE2     28          8  (28.6%)    8  (28.6%)    9  (32.1%)    17 (60.7%)
HE3     14          0  (0.0%)     4  (28.6%)    10 (71.4%)    14 (100.0%)
CW1     21          3  (14.3%)    4  (19.0%)    14 (66.7%)    18 (85.7%)
CW2     24          6  (25.0%)    4  (16.7%)    13 (54.2%)    17 (70.8%)
T-AP1   21          1  (4.8%)     6  (28.6%)    13 (61.9%)    19 (90.5%)
T-AP2   18          1  (5.6%)     3  (16.7%)    14 (77.8%)    17 (94.4%)
T-AP3   17          3  (17.6%)    4  (23.5%)    10 (58.8%)    14 (82.4%)
C-D1    39          10 (25.6%)    14 (35.9%)    15 (38.5%)    29 (74.4%)
Total   200                                                   162
Validity: The validity of UEMs (Table 2) can be measured directly from the process of identifying the realness (or relevance) of usability findings. Bearing in mind that the evaluator teams had little experience of usability evaluations, the validity of the UEMs was quite satisfactory, apart from the three applications of methods discussed above.

Table 2. Validity of Usability Evaluation Methods

UEMs    Usability findings   Real usability problems   Validity (%)
HE1     18                   17                        94.4%
HE2     28                   17                        60.7%
HE3     14                   14                        100.0%
CW1     21                   18                        85.7%
CW2     24                   17                        70.8%
T-AP1   21                   19                        90.5%
T-AP2   18                   17                        94.4%
T-AP3   17                   14                        82.4%
C-D1    39                   29                        74.4%
Thoroughness: Thoroughness is the total number of real usability problems identified by each UEM divided by the total number of real problems that exist in the system, taken here as the sum of unique real problems identified by all methods. Eight of the nine UEM applications demonstrated similar performance on the thoroughness measure (Table 3): they identified about 1/4 to 1/5 of the total number of usability problems found throughout the system. The last UEM (co-discovery learning) achieved an impressive (in comparison to the other UEMs) 41.4% of usability problems identified.
Effectiveness: The effectiveness of a UEM can be defined as the product of thoroughness and validity (Table 4). The effectiveness results varied widely:
▪ Five out of nine UEM applications identified about 1/4-1/5 of the total number of usability problems effectively (HE1: 22.9%; HE3: 20%; CW1: 22%; T-AP1: 24.6%; T-AP2: 22.9%)
▪ Another three out of nine identified about 1/6 of the total number of usability problems effectively (HE2: 14.7%, CW2: 17.2% and T-AP3: 16.5%).
▪ Only one UEM identified almost 1/3 of the total number of usability problems effectively (C-D1: 30.8%).
The overall results on the effectiveness of UEMs are unsatisfactory with regard to one of the central questions in usability evaluation: whether the application of a single UEM can identify a considerable proportion of usability problems. This was also shown by the comparative usability evaluation work of [9], which used professional design teams. A second interesting result, regarding the comparison of the UEMs themselves, is that the co-discovery learning method was significantly more effective than all other methods. It thus seems that this method helps young teams to perform significantly better than the other three methods. On the other hand, the fact that only one team selected this method limits the safety of this conclusion, which can be pursued further in other comparative usability evaluations.

Table 3. Thoroughness of Usability Evaluation Methods
UEMs    Real usability problems   Thoroughness (%)
HE1     17                        24.3%
HE2     17                        24.3%
HE3     14                        20.0%
CW1     18                        25.7%
CW2     17                        24.3%
T-AP1   19                        27.1%
T-AP2   17                        24.3%
T-AP3   14                        20.0%
C-D1    29                        41.4%

Total number of usability problems that exist in the system: 70
Table 4. Effectiveness of UEMs

UEMs    Effectiveness
HE1     22.9%
HE2     14.7%
HE3     20.0%
CW1     22.0%
CW2     17.2%
T-AP1   24.6%
T-AP2   22.9%
T-AP3   16.5%
C-D1    30.8%
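The three measures can be recomputed directly from the counts in Tables 1 and 2. The following Python sketch is for illustration only (it is not code from the study):

```python
# Recomputing validity, thoroughness and effectiveness (Tables 2-4) from
# the per-UEM counts: validity = real / reported findings; thoroughness =
# real / total real problems in the system; effectiveness = their product.
findings = {"HE1": 18, "HE2": 28, "HE3": 14, "CW1": 21, "CW2": 24,
            "T-AP1": 21, "T-AP2": 18, "T-AP3": 17, "C-D1": 39}
real = {"HE1": 17, "HE2": 17, "HE3": 14, "CW1": 18, "CW2": 17,
        "T-AP1": 19, "T-AP2": 17, "T-AP3": 14, "C-D1": 29}
TOTAL_REAL = 70  # unique real problems found by all nine applications together

for uem in findings:
    validity = real[uem] / findings[uem]
    thoroughness = real[uem] / TOTAL_REAL
    effectiveness = validity * thoroughness
    print(f"{uem}: validity {validity:.1%}, "
          f"thoroughness {thoroughness:.1%}, "
          f"effectiveness {effectiveness:.1%}")
# e.g. C-D1: validity 74.4%, thoroughness 41.4%, effectiveness 30.8%
```

Note that effectiveness discounts a method's coverage by the fraction of its findings that were false alarms, which is why HE2 (high findings count, low validity) ranks below methods that reported fewer but more accurate findings.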
Consistency: The consistency of UEMs was not satisfactory (Table 5). About half of the usability problems found (50.0%) were reported by the application of just one UEM. Furthermore, only a set of 16 problems (22.9% of the total, about 1/4-1/5) was found consistently by two teams. Conversely, not a single usability problem was identified by all UEMs.
Table 5. Consistency across UEMs

Total number of usability problems          70
... found by 9 teams / UEMs                  0    0.0%
... found by 8 teams / UEMs                  1    1.4%
... found by 7 teams / UEMs                  3    4.3%
... found by 6 teams / UEMs                  5    7.1%
... found by 5 teams / UEMs                  0    0.0%
... found by 4 teams / UEMs                  5    7.1%
... found by 3 teams / UEMs                  5    7.1%
... found by 2 teams / UEMs                 16   22.9%
... found by 1 team / UEM                   35   50.0%
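This consistency tally can be derived mechanically once every unique problem is mapped to the set of teams that reported it. The Python sketch below uses invented toy data, not the study's 70-problem data set:

```python
# Tallying how many unique problems were found by exactly k teams, as in
# Table 5. The problem ids and team assignments below are invented.
from collections import Counter

problem_reports = {            # unique problem -> teams that reported it
    "P1": {"HE1", "CW1", "T-AP1"},
    "P2": {"HE1"},
    "P3": {"C-D1", "HE2"},
    "P4": {"T-AP2"},
}
found_by = Counter(len(teams) for teams in problem_reports.values())
for k in sorted(found_by, reverse=True):
    share = found_by[k] / len(problem_reports)
    print(f"... found by {k} team(s): {found_by[k]} ({share:.1%})")
# Problems counted only once (k = 1) correspond to the 50.0% row of Table 5.
```

Building this mapping presupposes the similarity decisions discussed in section 3: two reports count as the same problem only after an expert has judged them similar.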
4.5 Discussion

The main conclusions that stem from the case study are that:
▪ The employment of a single method is not enough for comprehensive usability evaluation. If it is important to find most problems, parallel evaluations can be carried out.
▪ No method was found to be significantly more effective or consistent than the others.
▪ The realness and validity of evaluation results was considerably high for most teams, which speaks in favour of young designers' supervised participation in usability evaluations.
In the case study presented, we followed the proposed approach to inform current practice regarding the use of UEMs. The educational setting in which the case study was carried out imposed restrictions regarding the selection of evaluators (i.e. supervised teams of novice evaluators), the assignment of UEMs (i.e. only one team felt confident enough to carry out the usability evaluation following co-discovery learning) and the processing of results (i.e. an expert-based approach was followed to make final decisions about the relevance and similarity of the usability findings). On the other hand, the educational setting was convenient for a number of other reasons: UEMs were applied according to a common set of lecture notes; evaluators followed a common reporting format; and they followed the same tasks to evaluate the system. These conditions are hard to achieve in an industrial setting. For example, Molich et al. [9] performed a comparative usability evaluation in which the evaluator teams used different UEMs (actually combinations of UEMs that have evolved through practice) and different reporting templates.
5 Summary and Conclusions

Comparative usability evaluations are important for the thorough identification of usability problems and for the comparison of UEMs in particular contexts. The paper contributes to the understanding of criteria for comparative usability evaluation, both by providing a method for this task and by presenting a relevant case study
for a web-based system. It is envisaged that the approach taken can be applied to other comparative studies as well. The results of the case study can also inform the selection of UEMs, particularly when young designers need to be employed in comparative usability evaluations.
References

1. Andre, T.S., Hartson, H.R., Belz, S.M., McCreary, F.A.: The user action framework: a reliable foundation for usability engineering support tools. Int. J. Human-Computer Studies 54, 107–136 (2001)
2. Cockton, G., Woolrych, A.: Understanding inspection methods. In: Blandford, A., Vanderdonckt, J., Gray, P.D. (eds.) People and Computers XV, pp. 171–192. Springer, Heidelberg (2001)
3. Doubleday, A., Ryan, M., Springett, M., Sutcliffe, A.: A comparison of usability techniques for evaluating design. In: Proceedings of Designing Interactive Systems (1997)
4. Hartson, H.R., Andre, T.S., Williges, R.C.: Criteria for evaluating usability evaluation methods. International Journal of Human-Computer Interaction 15, 145–181 (2003)
5. Hertzum, M., Jacobsen, N.E.: The Evaluator Effect: A Chilling Fact about Usability Evaluation Methods. International Journal of Human-Computer Interaction 13(4), 421–443 (2001)
6. Jacobsen, N.E., Hertzum, M., John, B.E.: The evaluator effect in usability tests. In: Summary Proceedings of the ACM CHI 98 Conference, pp. 255–256. ACM Press, New York (1998)
7. John, B.E., Marks, S.J.: Tracking the effectiveness of usability evaluation methods. Behaviour and Information Technology 16(4/5), 188–202 (1997)
8. Law, E.L.-C., Hvannberg, E.T.: Analysis of strategies for estimating and improving the effectiveness of heuristic evaluation. In: Proceedings of NordiCHI 2004, Tampere, Finland (October 23-27, 2004)
9. Molich, R., Ede, M.R., Kaasgaard, K., Karyukin, B.: Comparative usability evaluation. Behaviour and Information Technology 23(1), 65–74 (2004)
10. Nielsen, J.: Finding Usability Problems Through Heuristic Evaluation. In: Proceedings of the CHI Conference on Human Factors in Computing Systems, pp. 373–380. ACM, New York (1992)
11. Nielsen, J.: Usability Engineering. Academic Press, San Diego (1993)
12. Nielsen, J.: Usability Inspection Methods. In: CHI ’94, Boston, Massachusetts (1994)
13. Öörni, K.: What do we know about usability evaluation? A critical view. In: Conference on Users in the Electronic Information Environments, Espoo, Finland (September 8-9, 2003)
14. Rosson, M.B., Carroll, J.M.: Usability Engineering: Scenario-Based Development of Human-Computer Interaction. Morgan Kaufmann, San Francisco (2002)
15. Rubin, J.: Handbook of Usability Testing. John Wiley & Sons, New York (1994)
16. Sears, A.: Heuristic Walkthroughs: Finding the Problems Without the Noise. International Journal of Human-Computer Interaction 9(3), 213–234 (1997)
17. Van den Haak, M.J., De Jong, M.D.T., Schellens, P.J.: Employing think-aloud protocols and constructive interaction to test the usability of online library catalogues: a methodological comparison. Interacting with Computers 16, 1153–1170 (2004)
Concept of Usability Revisited

Masaaki Kurosu

National Institute of Multimedia Education
[email protected]

Abstract. Based on a historical review, a new model of the concept structure of usability and satisfaction is proposed. As a proposer of user engineering, the author redefines the concept of usability, for which usability engineering is responsible, and links the concept of satisfaction to user engineering. The model is based on differentiating the objective characteristics of the artefact from the subjective impressions of the user.

Keywords: usability, satisfaction, usability engineering, user engineering.
1 Introduction

Since ISO13407 was standardized in 1999, usability engineering has entered a new era, and increasing attention has been paid to usability, or quality in use. At least in Japan, the concept of usability is based on ISO13407, which cites the definition of ISO9241-11. As the alias “big usability” suggests, it covers a wider range of quality than the “small usability” originally proposed by Nielsen. But the author has questioned the conceptual dependency among the sub-concepts of the ISO9241-11 definition. Here the author presents a revised version of the notion of usability and puts more emphasis on satisfaction as an ultimate goal for user engineering.
2 Goal Achievement

Usability concerns the goal achievement of users. Before achieving their goals, users are in a current state in which they have not yet obtained a satisfactory result. Between the goal state and the current state there is a certain distance to be conquered. With the help of some artefact, the user tries to minimize this distance and, hopefully, to eliminate it. But users frequently face difficulties while using the artefact. Sometimes they will have to go along a long path, and sometimes they will have to give up because of a lack of usability: the artefact may be difficult to operate, may be difficult to understand, or may not match the users' characteristics and/or the context of use. What is expected of usability engineering is to straighten and shorten the path to the goal. Incidentally, the author prefers to call it “user engineering” rather than “usability engineering” because of the narrow connotation of the word

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 579–586, 2007. © Springer-Verlag Berlin Heidelberg 2007
580
M. Kurosu
“usability”. As will be discussed later, the goal of our activity should not be limited to the traditional connotation of the term “usability” but should be broadened.
3 Concept of Usability

In this section, some major definitions of usability are reviewed and, finally, a new concept of usability and satisfaction is proposed.

Nielsen

The first formal and structural definition of the usability concept was given by Nielsen. As shown in the figure, usability is composed of such sub-concepts as learnability, efficiency, user retention over time, error rate and satisfaction. It should also be noted that utility is put aside as a concept mutually exclusive with usability. This concept structure may be related to the activity of Nielsen himself. As is well known, he proposed the heuristic evaluation method for evaluating usability, i.e., for detecting problems. Thus, for him, usability work is an activity to improve the negative aspects of the artefact found by the evaluation method. In other words, it is a “non-negative” concept of usability, aiming to improve the artefact from a minus level to the zero, or normal, level. Looking back at the history of usability engineering, it started from evaluation activities using usability testing, inspection methods, etc., so it is quite natural that Nielsen focused on evaluation and proposed such a concept structure of usability. But usability activity based on evaluation had some limitations. For one, the engineers and designers who designed the artefact would not easily accept the result of an evaluation, claiming that users can use the artefact if they follow the procedure designed for it. For another, managers would not put emphasis on evaluation-based usability activity, because just fixing defects does not contribute to sales. Of course, there were some engineers, designers, and
Concept of Usability Revisited
581
managers who could understand the significance of usability activity even though it is a “non-negative” approach. But most of them put their energy into the development of utility, i.e., functionality and performance. So it could be said that the “non-negative” concept of usability, sometimes called the “small usability”, is not sufficient, and something more should be considered.
ISO 9126

ISO9126 was standardized to define the quality of software. As can be seen in the figure, it lists many quality characteristics, of which usability is just one. In this standard, usability is considered to consist of understandability, learnability and operability.
582
M. Kurosu
It is reasonable that this standard included usability as one aspect of software quality, but its definition is narrow and insufficient.

ISO9241-11

An influential definition of usability was proposed by ISO9241-11. The definition clearly specifies that usability is related to goal achievement, and it puts emphasis on the context of use. The sub-concepts of usability are effectiveness, efficiency and satisfaction. It is important that effectiveness and efficiency relate not only to the “non-negative” aspects but also to the “positive” aspects of the artefact. Regarding effectiveness, the artefact can become usable by minimizing the difficulty of use; but at the same time, it can become usable by providing the function that will solve the user's problem and make it easier to achieve the goal. Regarding efficiency, usability can be improved by changing the interaction procedure to shorten the time of operation; but it could also be improved by providing a faster CPU. In this sense, the definition of usability of ISO9241-11 is not just “non-negative” but also “positive”. In other words, this definition includes both the usability and the utility of Nielsen's definition, and is almost the same as his definition of usefulness. This definition is sometimes called the “big usability”.
This definition also includes satisfaction as a promoting factor of the use of the artefact. But this point is a bit controversial. Effectiveness and efficiency are properties of the artefact, as is usability; satisfaction, however, is achieved as a result of those objective properties and is a subjective impression on the side of the user. Another point that should be looked at carefully is the use of the term “specified users”. It is quite natural that the manufacturer presupposes the user as the targeted
Concept of Usability Revisited
583
user. But it has frequently been observed that the profile of the targeted user was based on the engineers and designers themselves. Thus, the user profile is sometimes criticized as being a male, in his 30s, with a certain level of IT knowledge. As a result, the artefacts they design frequently become difficult for everyday people to use. This has been pointed out by those working in the fields of universal design and accessibility. Thus the term should be redefined to include every possible type of user. In any case, this definition of usability of ISO9241-11 was so influential that ISO13407, the core standard of usability, and other standards such as the CIF, ISO18529, ISO16982, ISO20282, etc. adopt it.

Jordan

Patrick Jordan put emphasis on pleasure and proposed a three-level concept structure: functionality at the first level and usability at the second. He put emphasis on pleasure as the third level, because the artefact must not only fulfil ease of use but also enhance the emotional aspects. This corresponds to the current trend that focuses on the emotional aspect of the artefact, as proposed by Norman. For his definition of usability he cited that of ISO9241-11, but this is a bit confusing: as mentioned above, that definition includes satisfaction as a part, so it is difficult to clearly differentiate satisfaction from pleasure. Furthermore, his definition is a bit too simple and does not refer to other aspects of the artefact.

Kurosu

Considering the insufficiencies of past definitions, Kurosu proposed a new hierarchical model of usability. He differentiated the objective properties of artefacts (on the left-hand side of the model) from the subjective characteristics of users (on the right-hand side).
In the left half, effectiveness and efficiency are included, influenced by utility and (small) usability. The former consists of functionality and performance; the latter consists of ease of operation and ease of cognition. Ease of operation was once the main target of usability activity, applying the methods and knowledge of human factors engineering and ergonomics. Ease of cognition later became the centre of concern of usability professionals, triggered by the advent of the computer and its applications, and was pursued by applying the methods and knowledge of cognitive psychology. Effectiveness, efficiency and satisfaction were regarded as sub-concepts of usability in ISO9241-11, but Kurosu limits the range of the usability concept to effectiveness and efficiency on the left-hand side of the model, while admitting the influence of these concepts on satisfaction. This is based on the notion that usability is a property of the artefact.
584
M. Kurosu
Besides effectiveness and efficiency, quality properties such as cost, safety and reliability are located among the properties of artefacts that may influence satisfaction. Other properties, such as re-usability, could be added to the list if necessary. On the right-hand side of the model, satisfaction is located as the top concept, and other subjective characteristics on the side of the user, such as pleasure, aesthetic impression, attachment, motivation, drive and value system, are described as influencing satisfaction. It is also suggested that satisfaction, the supreme goal of the artefact, may be related to the user experience (UX), customer satisfaction (CS) and quality of life (QOL).
In addition to this structure, Kurosu points out that it is important to include the spatial dimension and the temporal dimension. He puts emphasis on considering the diversity of user characteristics and the diversity of the context of use as a spatial expansion of the concept. This could be the basis of the concept of universal usability, as originally proposed by Shneiderman. He also introduces the temporal dimension and puts emphasis on long-term, or prolonged, use. This contrasts with earlier usability engineering, which focused on just the short-term use that can be evaluated in a usability testing situation. These points will be explained in detail in later sections.
Characteristics: Age, Generation, Gender, Physical traits, Mental traits, Educational background, Social status, Knowledge and skill, Language, Culture, Communication style, Cognitive style, Learning style, Functional insufficiency
Situation: Life style, Economical situation, Political situation, Emotional status, Geographical environment, Historical background, Urgency
Value: Preference, Political attitude, Religion, Tradition
4 Concluding Remarks

Based on the notion of goal achievement, a few notable definitions of usability were reviewed, and the concept of usability and the concept of satisfaction were redefined, putting more emphasis on user engineering. Although this idea is not quite new to the author, he is now confident that satisfaction is the ultimate goal of people (users) living in this world. Artefacts include invisible systems such as the educational system, the local government system, the banking system, and the transportation system. The author's concern has now been enlarged to consider how the educational system can satisfy people who are considering their life path and career path. In this sense, the usability of the educational system, which includes the hardware, the software, the humanware and their total system, should be inspected from the viewpoint of satisfaction. The system should support people in finally selecting their life path and empower them with the knowledge and skill to realize that goal. The effectiveness and the efficiency of the educational system are just a matter of usability and will not fulfil people's goals in life. In this sense, the author is now interested in pursuing the difference of artefacts in time and in place. He is conducting ethnographic research to find out how people invented and decided to use specific forms or patterns of artefacts to support their lives. This is called the “Artefact Development Theory” and will be presented at the next opportunity.
References

1. ISO13407, Human-centred design processes for interactive systems (1999)
2. ISO9241-11, Ergonomic requirements for office work with visual display terminals (VDTs). Guidance on usability (1998)
3. Jordan, P.W.: An Introduction to Usability. Taylor and Francis, London (1998)
4. Jordan, P.W., Thomas, B., Weerdmeester, B.A., McClelland, I.L.: Usability Evaluation in Industry. Taylor and Francis, London (1996)
5. Kurosu, M.: What is usability? HCD-Net News (2006)
6. Nielsen, J.: Usability Engineering. AP Professional (1993)
How to Use Emotional Usability to Make the Product Serves a Need Beyond the Traditional Functional Objective to Satisfy the Emotion Needs of the User in Order to Improve the Product Differentiator – Focus on Home Appliance Product

Liu Ning1 and Shang Ting2

1 Corporate Technology of Haier Group
[email protected]
2 ISAR User Interface Design
[email protected]

Abstract. A traditional definition of usability cites the successful attainment of some related goal within a specified period of time and with a minimum number of errors. Accordingly, most efforts have focused on the function of the product. At present, user-centred design is highly emphasized; in addition, entertainment-oriented products have received high attention from consumers. Whether or not a product can meet the emotional needs of the consumer is therefore significant for the brand. This paper provides a definition of emotional usability based on traditional usability research, introduces through a case study how Haier, one of the most famous home appliance companies, uses it during product development, and provides a process for applying emotional usability so that the product serves a need beyond the traditional functional objective and satisfies the emotional needs of the user, in order to improve the product differentiator.
1 Introduction

Nowadays, new product competition can be considered a silent war whose battlefield is the marketplace. A product that is easy to use and has trustworthy quality is no longer a powerful weapon by itself. In China, many customers pay a lot of attention to whether or not a product can project their taste, status and so on. Therefore, emotional needs have become more significant to customers. So, how to transfer those needs into emotional usability, so that the product serves a need beyond the traditional functional objective and satisfies the emotional needs of the user, is the key issue in improving the product differentiator. However, the truth is that one of the most difficult things is to figure out the way of thinking of human beings. From the cognitive perspective, humans always change their minds, and this is especially true for emotion, of which there are many kinds. In particular, emotion is the

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 587–593, 2007. © Springer-Verlag Berlin Heidelberg 2007
588
N. Liu and T. Shang
key when people make decisions: for example, people choose to die because of a certain belief; people choose to be loyal to their lover because they have fallen deeply in love. No doubt people's decisions to pay for a product are also partly influenced by emotion. This emotion can come from various origins, such as requirements, impressions, trust and so on. Sometimes people buy a product not because they need it, but just because of an experience or because they are deeply moved. In general, people's requirements of a product's performance usually include function, aesthetics and quality. Functional usability means that the people who use a product can do so quickly and easily to accomplish their own tasks. People will never pay for a product that makes them spend a great deal of time figuring out how it works, or that keeps causing errors when they use it. It is easy for customers to abandon these kinds of products. For aesthetics, a dictionary provides the definition “the beautiful, in taste and art”. This component includes the first and last impressions of the product. The role of aesthetics has already become a significant component of the product. But sometimes a product that is aesthetically pleasing cannot meet customers' needs or is hard to use. This paper will explain in depth the definition of emotional usability and introduce how one of the most famous home appliance companies addresses emotional usability in new product development.
2 Emotional Usability
What, then, is emotional usability? Let us talk about usability first. Usability contains two parts. One is functional usability, which focuses on how easily and efficiently the product can be used. The other is emotional usability, which refers to the degree to which a product is desirable or serves needs beyond traditional functional usability. Take the refrigerator as an example. Around 1985, at the very beginning of the refrigerator's history in China, Chinese people regarded it simply as a product that could make food last longer. Later this was no longer enough: customers wanted the fridge to keep food fresher, save energy, maintain humidity and so on. At present, however, the refrigerator is not just a home appliance; Chinese customers expect it to serve as a nice piece of furniture in their apartment. Traditional appearance options, such as white and silver colors, are obviously lacking in attractiveness for these customers, even when the products offer satisfactory quality and function. The same applies to many types of consumer electronics, such as washing machines, TVs and IT products. Does this mean quality is no longer important or tempting for Chinese customers? Absolutely not. Let us look back at how people interact with products. Figure 1 describes the cognitive judgment of a product. Normally, customers interact with a product through the five senses: smell, touch, taste, hearing and sight. For home appliance products, however, touch, sight and hearing are the main senses through which customers interact. Hearing relates mainly to performance and quality, such as the degree of noise. Parts of touch, and particularly sight, are related to the customer's emotional reaction. Sight concerns the aesthetic aspects and whether or not
Focus on Home Appliance Product
589
the product can serve the user's needs, through appearance, user interface design and function. Touch relates more to the process of interacting with the product, that is, to its usability. Certainly, aesthetics is not the only factor that can change human emotion. Functional requirements imply that the product should meet people's latent needs and offer an enjoyable interactive process, not merely ease of use. All of these factors strongly influence human emotion. Sometimes a product is unique enough, but if it cannot meet customers' needs it will still meet its Waterloo; the history of industrial design is full of well-designed products that failed commercially. As mentioned above, serving latent needs beyond functional usability is the key point. The most important thing is to address, in the product, the needs that customers do not realize they have but really do need. In this respect, understanding customers' way of thinking is essential.
[Figure 1: the five senses (smell, touch, taste, hearing, sight) mediate the interaction with the product and feed into the user's cognitive judgment]
Fig. 1. Cognitive judgment for the product
Nowadays, many Western companies are trying to win the huge Chinese market. However, there is a deep gap between Chinese and Western people, primarily the distinct difference between Western and Chinese culture, and these differences are reflected in the emotional aspects. The core of Chinese culture is to be implicit and moderate, and to be afraid of losing face or of making other people feel embarrassed. These are some of the important reasons why many Chinese pay attention to products that can express their status, taste and
personality. In contrast, simplicity and straightforwardness are key points of Western culture; Western people pay more attention to whether a product is simple and practical. Returning to the refrigerator case: Chinese customers feel proud when guests come to their apartment and see a refrigerator with attractive features, and they like to show a newly bought mobile phone to their friends even when the old one is not bad at all. People feel bored, or out of date, using a refrigerator with a traditional design or an old model of mobile phone, even if the product has good quality. Consequently, a product that successfully addresses Western people's emotional usability might not work in China.
3 The Process of Applying Both Functional and Emotional Usability at Haier
Haier is one of the most famous home appliance companies in China and has achieved compelling success there. The initial secret of Haier's success was meticulous after-sales support; at present, a further secret is that Haier always addresses Chinese customers' latent needs. Haier has a practical process for addressing emotional usability in new products. First, let us introduce the principles of Haier design.
3.1 The Principles of Haier Design
In 2002, Haier developed a compact washing machine for summer clothes and other small garments, which a normal washing machine could not wash quickly because the load is too light. This product broke the rule that washing machines are relatively hard to sell in summer, and it won the G-Mark award in the Japanese design competition. In 2006, a Haier air conditioner won the iF international industrial design competition in Germany.
In fact, the principle of Haier's design is that the product must satisfy the user's needs, and not only those needs that have already been discovered: the most important part is to keep exposing the user's latent needs. The central concepts of Haier's emotional usability are two related product attributes. First, the
product must have appropriate aptness and strong first and lasting impressions; in other words, the product should catch the user's attention quickly and maintain it longer. Second, the product should eliminate the "machine feeling", which includes fear and a lack of humanity, for the user.
3.2 Process
At Haier, the usability team is involved from the very beginning of new product development. Its responsibilities include discovering users' needs, building the user interface design, and conducting usability testing and other types of testing. In summary, the usability team plays a key role early in the new product development process. Figure 2 shows a snapshot of these activities.
Fig. 2. The process of combining functional and emotional usability
By conducting user research on previous products, Haier obtains direct customer feedback. To some extent, this also gathers user requirements for new products. Haier likewise conducts other types of research to discover latent customer needs. These needs are then passed to the new product development team to develop new concepts. In fact, the most important steps in discovering users' emotional usability needs are carried out in the pre-development stage or even earlier, because customers usually cannot provide very useful suggestions directly; mostly, customers do not really know what they want. Through various studies in the pre-development stage, Haier gathers answers about "why" rather than "how" from customers, and the research team keeps working to figure out the story behind those reasons in order to discover people's latent needs. For example, in 2003 China suffered from SARS, especially in the southern part of the country. The disease can be transmitted easily, even through the air, and Chinese people were seriously afraid of it. Through user research, the product team found that when people washed clothes, they used to sterilize them after the washing machine had finished, but the clothes then acquired a terrible smell from the disinfectant. Because no washing machine was equipped with a disinfection function, some people used disinfectant to wash the clothes first by hand and then
put them into the washing machine. The reason people washed clothes in this unusual way was obviously SARS. This latent need was quickly identified by the new product development team, and Haier soon launched the first washing machine with a disinfection function on the Chinese market. The product brought huge profits to Haier. In this case, it is easy to see that the product development team properly translated a customer need into a new product, achieving the first aim of Haier's emotional usability: the product must be engaging. Customers can be deeply moved by a unique design, or because the product provides something they have never seen but really need. Without doubt, aesthetic appeal is also important. Once a project reaches the concept stage, the product design and user interface design teams provide design concepts based on those needs. After a new concept is finished, Haier sends the prototype to the laboratory for one-to-one acceptance testing, or uses other research methods to evaluate the product's features. The opinions collected from customers are sent directly to the product development team so that the product can be improved quickly. Based on users' reactions to the product, the company can also evaluate whether the product catches people's eyes, covering product design, usability, the user interface and even the internal structure.
Take the red refrigerator as an example. After its design stage was finished, Haier conducted competitive testing in China between this model and comparable models from Samsung, Panasonic and LG. The result was that this model received considerable praise from customers, and once Haier launched it, it achieved considerable success in the market. Meanwhile, Haier also sent this model to Germany to participate in the iF competition in 2006, and it won the award. In order to better achieve the emotional usability aims, especially eliminating fear of the machine, Haier emphasizes user interface design from the beginning of new product development. In fact, achieving functional usability is just a baseline goal for Haier. The interface design team combines functional usability and visual appeal so that people are engaged with the product through an easy and enjoyable interaction process between product and customer. The user interface design
concept is finally submitted to usability testing with customers to evaluate whether it achieves the usability aims.
• Weak points of the process
There is no omnipotent weapon in the world. This process has already had a fantastic impact on Haier's new product development, but it also has negative aspects. First, human cognition keeps changing, and so do the opinions customers provide to the company; sometimes what they say has not been carefully considered. Second, usability was adopted from the West and includes many methods, which brings us back to the previous issue: the West differs greatly from China. Chinese people tend to be silent and implicit, and to consider the company's face carefully while research is being conducted, which is bad for the research. Besides, many methods that work well in the West do not work ideally in China, and poor research results might seriously mislead the company's decisions. Therefore, finding proper methods and localizing them is another key issue now.
4 Conclusions
To summarize, the usability program at Haier is new and still growing. Haier has actually just started the fundamental usability process, including an early and continual focus on the consumer and the refinement of an interaction design process that relies heavily on research and prototyping. Based on this process, Haier integrates the different departments into the new product development process in order to improve differentiation in China. As the Haier design group's principle states, emotional usability is the key to improving product differentiation and eventually achieving success in the marketplace.
Towards Remote Empirical Evaluation of Web Pages' Usability Juan Miguel López1, Inmaculada Fajardo2, and Julio Abascal1 1
Laboratory of Human-Computer Interaction for Special Needs (LHCISN) Computer Science Faculty. University of the Basque Country Manuel Lardizabal 1; Donostia - San Sebastian
[email protected],
[email protected] 2 Cognitive Ergonomics Group Department of Experimental Psychology. University of Granada Cartuja Campus; Granada
[email protected] Abstract. The functional description of EWEB, a tool for automatic empirical evaluation of web navigation, is presented in this document. EWEB supports naïve evaluators for designing experiments which contain experiment type (within-subject, factorial, etc.), web logs to be captured (time, visited pages, etc.), task models (search task, free navigation) and surveys (questionnaires, card sorting) to be performed by experimental participants. EWEB stores navigational data preserving the experiment structure and supports data analysis and interpretation, with the possibility of generating usability metrics. Requiring minimal installation on client computer, EWEB can be used for both lab evaluation and remote evaluation in multiple browsers. One empirical web study, designed and performed by means of EWEB, is described in order to illustrate its validity as a research tool. Keywords: web usability experiments, log capturing and analyzing, web navigation metrics.
1 Introduction
The recursively invoked virtue of the hypertext and hypermedia documents composing the World Wide Web is their flexibility to be navigated, which offers users the possibility of accessing information in the way they want and according to their necessities or knowledge. In linear documents, by contrast, users have to follow the sequence established by the author. Paradoxically, as [1] pointed out, this flexibility of hypertext and hypermedia has usually led to "hyperchaos", which is often related to the lack of adherence to usability design guidelines and empirical testing. In fact, empirical testing has allowed observing hypertext/hypermedia interaction and establishing models and theories from which it is possible to derive usability design guidelines [2]. However, despite its potential rewards, empirical testing is frequently neglected because it is considered time consuming and costly in terms of training, experimental participants and material resources (e.g. laboratory, computers, etc.). For J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 594–603, 2007. © Springer-Verlag Berlin Heidelberg 2007
Towards Remote Empirical Evaluation of Web Pages' Usability
595
that reason, tools for facilitating users' recruitment and for automating the process of designing, registering and interpreting web usability experiments are essential. With the aim of assisting researchers, there exist tools that automate these processes separately. Many useful tools for capturing user behavior during web interaction can be found in the literature. On the one hand, tools such as [3], [4] or [5] store the data generated by the HTTP-level communication between a web server and a browser on a client machine. On the other hand, there are tools for capturing data from the user interface, in this case a local web browser, such as [6], [7], [8], [9] and [10]. A subgroup in this category comprises tools that use a new or modified client browser specifically prepared for storing user navigation information, such as [11], [12] and [13]. There also exist a number of tools for automating the administration of knowledge elicitation tasks (mainly card sorting) in the Web context, such as [14], [15], [16] and [17]. However, these tasks are not usually integrated into automated capture tools. [15] is an exception that combines automation of event capture and card sorting tasks, so that researchers can design the task by introducing the concepts to be classified by users and gathering the result of this categorization. Once user actions are registered, the next step consists of analyzing and interpreting them. Among the tools that allow analyzing registered web navigation information are [10], which provides a comparison between task models and user behaviour, [13] and [18]. Other analysis tools such as [19] and [20] do not register user logs but provide interesting navigation metrics, for instance the degree of disorientation (the L index of Lostness [21]), the St index of linearity and/or the Cp index of complexity of the web navigation route followed by the user [22].
In summary, it can be stated that there are numerous tools which automate some of the processes involved in an empirical evaluation of website usability, mainly capturing and analysis. However, tools aiding the process of designing a web experiment are fewer and incomplete. Furthermore, to our knowledge there are no tools that facilitate jointly conducting the processes of designing complex web experiments, capturing user interaction, and analyzing and interpreting the captured data. This fact may lead evaluators to acquire and invest a large amount of resources in learning how to use a different tool for each process, which paradoxically would hinder rather than lighten the empirical evaluation of web navigation processes. Given the deficits found, and with the aim of meeting the need for a single tool covering all the previously mentioned aspects, the EWEB tool was developed. EWEB (an acronym of Experimentation in the WEB) is a tool for the automatic empirical evaluation of web navigation. EWEB supports naïve evaluators in creating experiments which specify the experiment type (within-subject, factorial, etc.), the web logs to be captured (time, visited pages, etc.), the task models (search task, free navigation) and the surveys (questionnaires, card sorting, etc.) to be performed by experimental participants. Additionally, EWEB stores web navigation data preserving the experiment structure and supports data analysis and interpretation, with the possibility of generating usability metrics such as Lostness or similarity to the optimum path [21]. Finally, EWEB can be used for both lab evaluation and remote evaluation in multiple browsers, requiring minimal installation on the client computer.
596
J.M. López, I. Fajardo, and J. Abascal
2 EWEB Tool: Technical and Functional Description
The EWEB tool (Experimentation in the WEB) consists of three different modules (see Figure 1): the Experimental Session Design module, the User Guidance and Monitor module, and the Analysis module.
Fig. 1. Architecture of EWEB tool
An experimenter defines a session using the design module, which creates an XML file as output. This file defines the experiment session. The User Guidance and Monitor module uses the XML file to conduct the session on the user's computer while monitoring navigational data. These data are stored in a remote repository, from which they can be analyzed according to the experiment session. Each part of the architecture is described next.
2.1 Experimental Session Design Module
Any experiment can be described as a study that investigates the effect of X on Y. Therefore, when a web experiment is carried out, it must be decided which variables are going to be manipulated (X, or independent variables), controlled (extraneous
variables) and observed (Y, or dependent variables), and in which way. For instance, let's consider a company that wants to evaluate the impact of the background colour (blue, white or green) of its website on the time users require to complete a search task. In the design module of EWEB, the evaluator must add one independent variable (background colour) and specify its levels (blue, white or green). Furthermore, the evaluator must decide whether all users will perform the search task for all three levels of the independent variable (within-subject design) or each user will perform it for just one level (between-groups design). As a result of these initial steps, the design module calculates the number of experimental groups and conditions automatically. In our example, the between-groups design would have three experimental conditions and three experimental groups: the evaluator must recruit three different groups of users, each of which performs the search task with a unique background colour level. If the evaluator selected the within-subject design, the number of experimental conditions would also be three, but there would be only one experimental group, since all users perform all three experimental conditions. The next step is to define the task models and the procedure for each experimental condition. Currently, EWEB implements two types of web navigation tasks (search and free navigation) and two types of surveys (card sorting and questionnaires).
2.1.1 Search Task and Free Navigation Task
The search task consists of users searching for a series of targets on the website within a time limit. The design module allows configuring the instructions, the number of search tasks, the time limit for the task, the target URL, the initial URL for the task, the data to be logged (accuracy, time to find the target, pages accessed, total time per page, order of pages accessed) and the order of the search trials (random or fixed).
This last point is very important in experimentation in order to prevent practice or fatigue effects, which may mask the effects of the independent variables. The free navigation task consists of asking users to navigate freely through the web, starting from a specific web page, for a given time. The design module allows configuring the initial URL, the time limit (if any) and the navigational data to be logged.
2.1.2 Surveys
The card sorting task is used as a knowledge elicitation task; it has been used extensively in Cognitive Psychology and Artificial Intelligence to study user learning or the so-called mental model [23]. It consists of asking users to sort cards which contain task-relevant concepts. The output is a vector or matrix of user data that can be compared graphically or statistically with a theoretical or expert matrix. The design module allows evaluators to introduce the task-relevant concepts and to specify the theoretical vector. The questionnaire option allows evaluators to design a set of questions to be filled in by users. The number and type of questions (true-false, forced choice, scale, etc.) and the presentation format can be designed in this module. Finally, although the general instructions for the experimental conditions are not user tasks, they can be designed within each condition in order to facilitate the description of the procedure.
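The automatic calculation of experimental conditions and groups described at the start of Sect. 2.1 can be sketched as follows. This is an illustrative Python sketch, not EWEB's actual Java implementation; the function and parameter names are ours:

```python
from itertools import product

def design_summary(factors, within):
    """Count experimental conditions and groups for a factorial design.

    factors: dict mapping each independent variable to its list of levels.
    within: set of variables manipulated within-subject; the remaining
    variables are between-groups (each participant sees only one level).
    """
    level_lists = list(factors.values())
    # One experimental condition per combination of factor levels.
    conditions = [dict(zip(factors, combo)) for combo in product(*level_lists)]
    # Participants are split into groups only along between-groups variables.
    groups = 1
    for var, levels in factors.items():
        if var not in within:
            groups *= len(levels)
    return len(conditions), groups

# The paper's example: one IV "background colour" with three levels.
print(design_summary({"background colour": ["blue", "white", "green"]},
                     within=set()))                      # between-groups: (3, 3)
print(design_summary({"background colour": ["blue", "white", "green"]},
                     within={"background colour"}))      # within-subject: (3, 1)
```

With a within-subject design the single group performs all conditions, which is exactly why randomizing the trial order matters.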
Continuing with the previous example, if the imagined company sells books, the speed with which users are able to find their targets is a relevant usability index. Therefore, if the evaluator started with the "blue background" experimental condition, he/she could select the search task. In the design module interface, the evaluator would select the number of search tasks (e.g. two search trials), the time limit (e.g. 20 seconds) and the initial and target URLs. As some experimental conditions can be identical to previously defined ones, EWEB allows the evaluator to copy tasks and procedures and later change some of their parameters. Finally, the procedure, that is, the presentation order of the sets of tasks for a specific experimental condition, can also be defined by the evaluator as random or fixed. For instance, if an experimental condition includes two tasks, search and card sorting, and the evaluator selects a fixed order, he/she must indicate which one is performed first. This module produces as output an XML file with a specific format for the experiment design created; all the variables used and their conditions are coded in this file. The specification file is later used by the User Guidance and Monitor Module, which assigns the tasks users have to perform based on its information, and by the Analysis Module, to facilitate the analysis of all user evaluation data.
2.2 User Guidance and Monitor Module
In order to perform an experiment, this module must be downloaded and run locally on the user's computer. As the module is developed with Java technology, the only requirement for the user's machine is that a Java Virtual Machine is installed; if so, the module runs by means of Java Web Start technology. The module is composed of two parts: the User Guidance Module and the Monitor Module.
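To give a feel for how the specification file produced by the design module might be consumed by the User Guidance Module, here is a sketch that parses a hypothetical XML design and derives the task order for one condition. EWEB's real schema is not reproduced in this paper, so the element and attribute names below are invented, and the sketch is in Python rather than EWEB's Java:

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical, simplified experiment specification; EWEB's real schema differs.
SPEC = """
<experiment design="within-subject">
  <condition id="blue-background" procedure="random">
    <task type="search" timeLimit="20" trials="2"/>
    <task type="cardsorting"/>
  </condition>
  <condition id="white-background" procedure="fixed">
    <task type="search" timeLimit="20" trials="2"/>
    <task type="questionnaire"/>
  </condition>
</experiment>
"""

def tasks_for(condition_id, xml_text, rng=random):
    """Return the ordered task list for one experimental condition."""
    root = ET.fromstring(xml_text)
    for cond in root.iter("condition"):
        if cond.get("id") == condition_id:
            tasks = [t.get("type") for t in cond.iter("task")]
            if cond.get("procedure") == "random":
                # Randomize per participant to counter practice/fatigue effects.
                rng.shuffle(tasks)
            return tasks
    raise KeyError(condition_id)

print(tasks_for("white-background", SPEC))  # fixed order: ['search', 'questionnaire']
```

A condition marked `procedure="random"` would yield a fresh permutation of its tasks for each participant, as described in Sect. 2.2.1.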
2.2.1 User Guidance Module
The User Guidance Module is driven by the experimental design created by the experimenter in the previous stage: an XML file describing the design of the experiment is received as input. According to the design, the tasks to be performed by the user and their order are established. For instance, if the procedure of an experimental condition is defined as random, all tasks related to it will be randomized when the user passes through that condition. The tasks to be performed are prepared and executed based on the given experimental design. Instructions for the different groups of tasks are also provided by the experiment file, as are the texts for the error or OK messages that may appear depending on user actions.
2.2.2 Monitor Module
The Monitor Module is executed locally on the client machine, and its goal is to monitor all information related to user interaction while the given tasks are performed. In order to ensure that the evaluation takes place in a realistic scenario, data recovery must be carried out in such a way that the user notices no difference from the web navigation he/she normally performs in his/her browser; therefore, this module defines no user interface. As almost all current browsers allow connecting to the web through a proxy server, this module uses a proxy, or intermediate software, to route all the
incoming and outgoing web traffic of the client browser, so that relevant user navigation data can be captured. This approach permits the proxy to be used with almost all existing browsers and operating systems; in fact, EWEB has been successfully tested with browsers such as Internet Explorer, Mozilla, Mozilla Firefox and Konqueror. The modularity of this approach allows rapid adaptation of the module to new browsers or new versions of supported browsers. The module activates and deactivates proxy navigation automatically, so that the user does not notice any change in the browser configuration. The mechanism is different for each browser, so a different piece of code has been developed for each one. If the user's browser already has a proxy configured, a hierarchy is created in the Monitor Module's proxy so that incoming and outgoing web traffic is rerouted to the previously defined proxy. When the user session ends, the browser settings are restored so that the user can navigate with the previously defined proxy. The use of cached information is disabled to ensure that all users perform the evaluation under the same conditions, because cached web pages can affect the results of the experiments. As user navigation data are recovered locally by the proxy, there is no problem with network latency, and the recovered data are more accurate than data obtained at a remote proxy server (millisecond accuracy can be achieved). Data from the tasks performed by users can be stored either in a remote repository or in a local file, depending on the information specified in the experiment. All recovered data are stored and labeled according to the design.
2.3 Analysis Module
Once users' data are stored in the remote data repository after an experimental session, either directly or by adding local files manually, they can be analyzed using this module.
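As a rough illustration of the kind of navigational data such a proxy yields and this module consumes, the raw log can be modelled as timestamped page events from which the per-page totals and the order of pages accessed are derived. This is a sketch under our own assumptions about the log format, not EWEB code:

```python
def summarize_navigation(events):
    """Summarize a proxy navigation log.

    events: chronological list of (timestamp_ms, url) pairs, one per page
    request captured by the proxy; the final event only marks the end of
    the session, so its page accumulates no time.
    Returns (order of pages accessed, total time per page in ms).
    """
    order, totals = [], {}
    for (t, url), (t_next, _) in zip(events, events[1:]):
        if not order or order[-1] != url:
            order.append(url)  # record the access order, collapsing reloads
        totals[url] = totals.get(url, 0) + (t_next - t)
    return order, totals

log = [(0, "/home"), (3656, "/products"), (9000, "/home"), (12000, "/end")]
order, totals = summarize_navigation(log)
print(order)   # ['/home', '/products', '/home']
print(totals)  # {'/home': 6656, '/products': 5344}
```

Millisecond timestamps match the accuracy the local proxy is said to achieve; a remote proxy would add network latency to each `t_next - t` interval.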
In addition, data stored in the repository can be reformatted as plain text so that they can be directly imported into tools such as Excel or Statistica for statistical analysis. The type of analysis that can be performed differs for each task type. In the case of the search task, this module allows the analysis of parameters such as the number of correct trials per user (target found), the average number of correct trials per experimental condition, the total and average elapsed time to find a target per user or per experimental condition, and the similarity of the user's path to the task model's optimal one. The last parameter can be evaluated by means of the Lostness metric [21] per trial and subject, or the average Lostness per trial. The Lostness index ranges between 0 and 1: the greater the value, the greater the user's lostness. For the free navigation task, this module allows calculating the total time the user needed to navigate through the website (if there was no time limit) and a matrix of the transitions between actions, for the analysis of the user's navigation strategies [22] or of the coherence between accessed nodes [24].
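For concreteness, one common formulation of the Lostness metric from the literature [21] can be coded as below. We assume this is the variant EWEB computes, since the paper does not spell out the formula; note that in this formulation the index can in principle slightly exceed 1 for extremely inefficient paths:

```python
from math import sqrt

def lostness(visited, r_optimal):
    """Lostness for one search trial.

    visited: chronological list of pages the user accessed.
    r_optimal: R, the number of pages on the optimal path to the target.
    Uses L = sqrt((N/S - 1)^2 + (R/N - 1)^2), where S is the total number
    of pages visited (revisits included) and N the number of distinct
    pages visited. 0 means a perfect path; higher values mean greater
    disorientation.
    """
    s = len(visited)        # S: total nodes visited, revisits included
    n = len(set(visited))   # N: different nodes visited
    return sqrt((n / s - 1) ** 2 + (r_optimal / n - 1) ** 2)

print(lostness(["a", "b", "c"], 3))                           # optimal path: 0.0
print(round(lostness(["a", "b", "a", "c", "a", "d"], 2), 3))  # wandering: 0.601
```

A user who follows the optimal path exactly (S = N = R) scores 0, while revisiting pages (S > N) or visiting many pages off the optimal route (N > R) drives the index up.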
3 Case Study
The material for this illustration comes from an experimental study carried out by [25] with the aim of comparing ten different websites in terms of accessibility (measured
with the metrics proposed by [26]) and usability (measured by the accuracy, effectiveness and satisfaction of searching for information on the Web). EWEB was used to design, capture and analyze the experimental data. Twenty volunteers participated in the experiment (fourteen women and six men), with an average age of 25 years. They were asked to search for 54 targets in ten different websites, each indicated in turn (six searches per website). The presentation order of the websites and of the searches within each website was randomized across participants. By means of the design module, the researchers introduced the number and types of variables, and the program automatically calculated the number of experimental conditions and showed them to the researchers. In this case, the "Website" independent variable was manipulated within-subject with ten levels, one for each website to be studied; therefore, there were ten experimental conditions. Then, for each experimental condition they had defined, the experiment designers selected the tasks users should perform, in this case a search task and a satisfaction questionnaire. Finally, the researchers defined the number and characteristics of each of the search trials: instructions, time limit, target, randomization, etc. Figure 2 shows a piece of the XML file that EWEB generated as a result of this process.
Fig. 2. A piece of the XML file which contains the experiment characteristics
Once the researchers had defined the tasks and procedures of each of the ten experimental conditions, they started the User Guidance and Monitor Module on the local computer of the experimental participant. The researchers then selected the experiment and specified the participant identifier and experimental group (the list of groups was automatically generated by EWEB). Since Website was a within-subjects variable, there was just one experimental group, and all participants had to perform the searches and the satisfaction questionnaire for each of the ten websites. The experiment was controlled by the User Guidance Module, which asked participants to perform the tasks according to the specified procedure (in this case, the administration of the websites and search targets was randomized for each participant). In the meantime, the Monitor Module monitored and saved participants' actions while they navigated.
The data were saved into a remote repository based on the structure of the experiment. The data report was grouped by task, experimental condition, measures, etc. Table 1 shows the accuracy and efficiency data of one participant in the three trials of the search task for one of the ten websites. Results were merged and exported to a statistics program in order to perform the required analyses.

Table 1. Data calculated by EWEB for Participant 1 in the condition “W3C website”

User Code: Participant 1    Search Task    Experimental Condition: W3C website

Trial   Response Time (ms)   Target Found   Lostness
0       3656                 1              0.25
1       60000                1              0.780625
2       60000                0              0
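The lostness values in Table 1 are derived from the recorded navigation behaviour. Assuming EWEB follows Smith's definition of lostness [21] (the exact parameterization EWEB uses is not spelled out here), the measure combines the total number of nodes visited (S), the number of distinct nodes among them (N), and the minimum number of nodes required to reach the target (R):

```python
from math import sqrt

def lostness(total_visited, unique_visited, minimum_required):
    """Smith's lostness measure:
        L = sqrt((N/S - 1)**2 + (R/N - 1)**2)
    where S is the total number of nodes visited, N the number of
    distinct nodes visited, and R the minimum number of nodes that
    must be visited to reach the target. L = 0 means optimal
    navigation; the theoretical maximum is sqrt(2)."""
    s, n, r = total_visited, unique_visited, minimum_required
    return sqrt((n / s - 1) ** 2 + (r / n - 1) ** 2)

# A perfectly direct search (every visited node distinct and on the
# shortest path) yields zero lostness:
print(lostness(3, 3, 3))  # 0.0
```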
3.1 Results

The results of one website were removed from the analysis because data collection failed for some users. Consequently, the accessibility of the remaining nine websites was calculated by means of [26] and compared to the usability metrics (search time, percentage of targets found, lostness and satisfaction) calculated automatically by EWEB. The results showed that the nine websites differed significantly in usability and accessibility and, more interestingly, that the accessibility and usability metrics were not correlated and produced different website rankings (see [25] for a detailed description of these experimental results). Therefore, based on the results, it was concluded that technical web accessibility is not a good predictor of usability.
4 Conclusions and Future Work

The experimental study carried out by [25] illustrates that EWEB jointly automates the processes of experimental design, data recording and data analysis. The design module automatically generates and interprets a complex XML file containing the characteristics of the experiment, without requiring researchers to be experts in XML. The design module safeguards the requirements for manipulation and control of an experimental study. In addition, the use of the experimental design facilitates the identification and analysis of users’ data by storing them according to the patterns defined in the experiment. Since the experimental design and the number of experimental conditions are defined in the design module, results are displayed as a function of the experimental structure. Identifying users by experimental condition has an additional benefit: it prevents confusing different users with a single user, or vice versa, as can happen when IP addresses are used for user identification. Another advantage is that EWEB automatically calculates metrics such as accuracy and users' lostness in the search task by comparing user behaviour against the task models provided by the researcher in the design module. This means that EWEB can analyze high-level user behaviour, not only isolated events. In short, EWEB provides great versatility while reducing the evaluators’ investment of
602
J.M. López, I. Fajardo, and J. Abascal
time and resources. From a technical point of view, the use of Java technology allows the implementation of an easily portable, multiplatform tool that eliminates network latency when measuring users' response times. Future work must consider the inclusion of new tasks and metrics such as the efficiency rating (E), the confidence rating (C), St (index of linearity) and Cp (index of strategy-related complexity) [22]. In addition, it would be worthwhile to improve the data display by introducing graphical representations of the information to facilitate visual analysis for the evaluators.
References

1. Shneiderman, B.: Designing the User Interface: Strategies for Effective Human-Computer Interaction, 2nd edn. Addison-Wesley, Reading, MA (1992)
2. Salmerón, L., Cañas, J.J., Kintsch, W., Fajardo, I.: Are expert users always better searchers? Interaction of expertise and semantic grouping in hypertext search tasks. Behaviour and Information Technology 24(6), 471–475 (2005)
3. AccessWatch: Retrieved February 2007 from http://www.accesswatch.com/
4. Analog: Retrieved February 2007 from http://www.analog.cx/
5. WebTrends: Retrieved February 2007 from http://www.webtrends.com/
6. Ellis, R.D., Jankowski, T.B., Jasper, J.E., Tharuvai, B.S.: Listener: a tool for client-side investigation of hypermedia navigation behavior. Behavior Research Methods, Instruments & Computers 30(6), 573–582 (1998)
7. Etgen, M., Cantor, J.: What does getting WET (Web Event-logging Tool) mean for web usability? In: Proceedings of the 5th International Conference on Human Factors and the Web, Gaithersburg (1999), http://zing.ncsl.nist.gov/hfweb/proceedings/etgen-cantor/index.html
8. Scholtz, J., Laskowski, S.: Developing usability tools and techniques for designing and testing web sites. In: Proceedings of the 4th Conference on Human Factors and the Web, Basking Ridge, NJ (1998), available at http://zing.ncsl.nist.gov/WebTools/
9. Gonzalez, M.: ANTS: An Automatic Navigability Testing tool for hypermedia. In: Proceedings of the Eurographics Multimedia’99 Workshop, Milan, Italy. Springer, Vienna, Austria (2000)
10. Paganelli, L., Paternò, F.: Intelligent analysis of user interactions with web applications. In: Proceedings of ACM IUI 2002, San Francisco, CA, pp. 439–445 (2002)
11. HotJava: Retrieved February 10, 2007 from http://java.sun.com/products/archive/hotjava/index.html
12. WebWindow: Retrieved February 10, 2007 from http://www.javio.com/webwindow/webwindow.html
13. Edmonds, A.: Uzilla: A new tool for web usability testing. Behavior Research Methods, Instruments and Computers 35(2), 194–201 (2003)
14. WebSort: Retrieved February 2007 from http://www.websort.net/
15. WebCAT: Retrieved February 8, 2007 from http://zing.ncsl.nist.gov/WebTools/WebCAT/overview.html
16. UzCardsort: Retrieved February 8, 2007 from http://uzilla.mozdev.org/cardsort.html
17. Harper, M.E., Jentsch, F.G., Berry, D., Lau, H.C., Bowers, C., Salas, E.: TPL-KATS card sort: A tool for assessing structural knowledge. Behavior Research Methods, Instruments and Computers 35(4), 577–584 (2003)
18. Carmel, E., Crawford, S., Chen, H.: Browsing in hypertext: a cognitive study. IEEE Transactions on Systems, Man, and Cybernetics 22(5), 865–884 (1992)
19. Richter, T., Naumann, J., Noller, S.: LOGPAT: A semi-automatic way to analyze hypertext navigation behavior. Swiss Journal of Psychology 62(2), 113–120 (2003)
20. Brunstein, A., Naumann, A., Krems, J.F.: The Chemnitz LogAnalyzer: A tool for analyzing data from hypertext navigation research. Behavior Research Methods 37(2), 232–239 (2005)
21. Smith, P.A.: Towards a practical measure of hypertext usability. Interacting with Computers 8, 365–381 (1996)
22. McEneaney, J.E.: Graphical and numerical methods to assess navigation in hypertext. International Journal of Human-Computer Studies 6(5), 761–786 (2001)
23. Cañas, J.J., Antolí, A., Barquier, P., Castillo, A., Fajardo, I., Gámez, P., Salmerón, L.: Representación mental de los conceptos, objetos y personas implicados en una tarea realizada en una interfaz [Mental representation of the concepts, objects and people involved in a task performed through an interface]. Inteligencia Artificial 16, 107–113 (2002)
24. Foltz, P.W., Kintsch, W., Landauer, T.K.: The measurement of textual coherence with Latent Semantic Analysis. Discourse Processes 25, 285–307 (1998)
25. Arrue, M., Fajardo, I., López, J.M., Vigo, M.: Interdependence between technical web accessibility and usability: its influence on web quality models. International Journal of Web Engineering and Technology 3(3), 307–328 (2007)
26. Arrue, M., Vigo, M., Abascal, J.: Quantitative metrics for web accessibility evaluation. In: Lowe, D.G., Gaedke, M. (eds.) ICWE 2005. LNCS, vol. 3579, Springer, Heidelberg (2005)
Mixing Evaluation Methods for Assessing the Utility of an Interactive InfoVis Technique Markus Rester1 , Margit Pohl1 , Sylvia Wiltner1 , Klaus Hinum2, Silvia Miksch3 , Christian Popow4 , and Susanne Ohmann4 1
Institute of Design and Assessment of Technology, Vienna University of Technology, Austria
[email protected] 2 Institute of Software Technology and Interactive Systems, Vienna Univ. of Technology, Austria 3 Department of Information and Knowledge Engineering, Danube University of Krems, Austria 4 Department of Child and Adolescent Psychiatry, Medical University of Vienna, Austria
Abstract. We describe the results of an empirical study comparing an interactive Information Visualization (InfoVis) technique called Gravi++ (GRAVI), Exploratory Data Analysis (EDA) and Machine Learning (ML). The application domain is the psychotherapeutic treatment of anorectic young women. The three techniques are supposed to support the therapists in finding the variables which influence success or failure in therapy. To evaluate the utility of the three techniques we developed, on the one hand, a report system which helped subjects to formulate and document, in a self-directed manner, the insights they gained when using the three techniques. On the other hand, focus groups were held with the subjects. The combination of these very different evaluation methods prevents jumping to false conclusions and enables a comprehensive assessment of the tested techniques. The combined results indicate that the three techniques (EDA, ML, and GRAVI) are complementary and should therefore be used in conjunction. Keywords: Information Visualization, Evaluation, Utility, Focus Groups, Insight Reports, Methodology.
1 Introduction

Several authors have pointed out the importance of evaluation studies of Information Visualization (InfoVis) techniques (see e.g. [1], [2], [3]). In the past few years usability studies concerning visualization techniques have become more frequent, and valuable information about the design of such systems has been gathered. Nevertheless, as [4] mentions, there is still too little systematic information about the specific strengths and weaknesses of the features of InfoVis techniques. Studies presenting data from practical experiences with InfoVis techniques can help to develop a more systematic framework to support the decision of which InfoVis technique to use in a given context. Medical data is a very interesting application area for Information Visualization. One of the reasons for this is the complex and time dependent character of these data. For such data, interesting InfoVis techniques have been developed in the past few years.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 604–613, 2007.
© Springer-Verlag Berlin Heidelberg 2007
Mixing Evaluation Methods for Assessing the Utility
605
In the following, we will describe a study analyzing several different methods used to assess the therapeutic treatment of anorectic young women. During the therapy process a large amount of highly complex data is collected. Statistical methods are not suitable for analyzing these data because of the small sample size, the high number of variables and the time dependent character of the data. The data come from extensive questionnaires the young women and their parents have to fill in several times before, during and after the therapy. These questionnaires cover topics such as the young women’s propensity for depression, their social behavior or their attitude towards eating. The therapists want to find patterns in the young women’s behavior and try to isolate the specific factors influencing success or failure in the therapy (predictors). InfoVis techniques might be a valuable way to represent these data, but in agreement with the therapists we also chose two other potential techniques (Machine Learning and Exploratory Data Analysis). Until now, evaluation in Information Visualization has centered around two variables: time and error. This approach has been criticized recently [5]. For many applications, the measurement of time and errors is too narrow. Many visualization methods support extensive exploration processes and the formulation of hypotheses. For an exploration process, the measurement of time does not make sense, and in the context of the development of hypotheses, errors in a narrow sense do not occur. In an ill-structured domain with no clear-cut results, such as psychotherapy, other approaches are necessary. Therefore, the concept of insights was introduced to make the results of exploration processes based on InfoVis techniques more tangible [6]. Unfortunately, there is no agreed-upon definition of insight, although cognitive psychology has dealt with this topic quite extensively (see e.g. [7]).
Most authors define ’insight’ in a quite pragmatic manner. In addition, there are no general frameworks for categorizing insights. [8] points out that a starting point might be to use user tasks such as, for example, finding clusters or extreme values. There are some general cognitive activities which often appear as insight categories, such as finding detailed, factual information, identifying clusters, making generalizations, identifying changes over time, etc. [6,9]. We developed our own classification system, partly based on the generic categories described above and partly adapted to the specific task for which our visualization method was developed. Finding predictors plays an important part in the therapists’ work; therefore, it is a central category of our analysis. Developing a theoretical framework for the concept of insights and the definition of relevant categories of analysis will be an important area of future research.
2 Compared Techniques

An interactive InfoVis technique named Gravi++ (GRAVI) was developed to support the therapists and clinicians in exploring the multidimensional, abstract, and time dependent data [10]. GRAVI is based on a spring metaphor. The questions from the questionnaires are positioned on a circle. The icons representing the anorectic young women are arranged within this circle depending on the strength of attraction of the questions. The questions function, to a certain extent, like magnets. The final position of the patients’ icons is a combination of the forces of all given answers on the questions
606
M. Rester et al.
Fig. 1. GRAVI: Interactive InfoVis-Tool for Exploration of Multi-Dimensional Time Dependent Data (Typical Screenshot). Concept of Spring-Based Positioning Leads to Formation of Clusters.
(see Fig. 1). GRAVI uses animation to deal with the time dependent data. The position of the patients’ icons changes over time, which allows analyzing and comparing the changing values. Various visualization options are available, like Star Glyphs and attraction rings to communicate the exact value of each answer, or traces to show the paths of the patients’ icons over all time steps. We decided to compare GRAVI with the following techniques used so far for analyzing the data: Exploratory Data Analysis (EDA) and algorithms of Machine Learning (ML). In the case of EDA, boxplots, histograms, scatterplots, and statistical measures were used (e.g., Fig. 2). The ML algorithms were a C4.5 decision tree (e.g., Fig. 3) and a Support Vector Machine (SVM) trained by Sequential Minimal Optimization (SMO). Exploratory Data Analysis was developed by Tukey [11] and is based on statistics. It helps users to review and analyze data on a descriptive level. Tukey thought that the emphasis on statistical testing might be too narrow an approach; he therefore suggested EDA as a way to formulate hypotheses and assess assumptions. Subjects were given printouts of these techniques. Machine Learning is an area of AI concerned with the development of algorithms that enable computers to ’learn’. A Machine Learning technique learns from observed examples or data. In general, there are two types of machine learning algorithms: supervised
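The spring-based placement can be approximated by a weighted-centroid computation: with ideal zero-rest-length springs whose stiffness equals the strength of each answer, the equilibrium position of an icon is the attraction-weighted average of the question anchors on the circle. The sketch below is a simplified illustration of this idea, not the actual Gravi++ implementation:

```python
from math import cos, sin, pi

def place_icon(answers):
    """Position a patient icon inside the circle of question anchors.
    `answers` maps each question to an attraction strength (e.g. a
    normalized questionnaire answer). Anchors are spaced evenly on the
    unit circle; the icon settles at the attraction-weighted centroid,
    which is the equilibrium of zero-rest-length springs whose
    stiffnesses are the attraction strengths."""
    questions = sorted(answers)
    n = len(questions)
    anchors = {q: (cos(2 * pi * i / n), sin(2 * pi * i / n))
               for i, q in enumerate(questions)}
    total = sum(answers.values())
    if total == 0:
        return (0.0, 0.0)  # no attraction: icon rests at the center
    x = sum(answers[q] * anchors[q][0] for q in questions) / total
    y = sum(answers[q] * anchors[q][1] for q in questions) / total
    return (x, y)

# A patient answering all four questions with equal strength settles
# near the center of the circle:
print(place_icon({"q1": 1.0, "q2": 1.0, "q3": 1.0, "q4": 1.0}))
```

Animating over time steps then amounts to recomputing the position for each questionnaire wave and interpolating between the results.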
Fig. 2. Exploratory Data Analysis (EDA) Sample: Boxplots
and unsupervised. In the case of supervised learning, a priori knowledge about the data is used, whereas in unsupervised learning no prior information is given regarding the data or the output. We utilized two supervised schemes using WEKA [12]: a Support Vector Machine trained with the Sequential Minimal Optimization algorithm [13,14] and a pruned C4.5 decision tree [15]. The output of these two techniques was again available to the subjects as printouts.
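For readers unfamiliar with the ML techniques used: C4.5 grows a decision tree by repeatedly splitting on the attribute that most reduces class entropy (C4.5 itself uses the normalized gain ratio plus pruning). The following sketch computes this information gain on invented toy data; the attribute and class names are hypothetical and not taken from the study's questionnaires:

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a sequence of class labels."""
    counts = Counter(labels)
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in counts.values())

def information_gain(rows, labels, attr):
    """Entropy reduction achieved by splitting `rows` on attribute
    `attr` -- the criterion at the heart of C4.5-style tree induction."""
    base = entropy(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
    return base - remainder

# Toy data: therapy outcome vs. two invented questionnaire attributes.
rows = [{"mood": "low", "social": "yes"}, {"mood": "low", "social": "no"},
        {"mood": "high", "social": "yes"}, {"mood": "high", "social": "no"}]
labels = ["fail", "fail", "success", "success"]
print(information_gain(rows, labels, "mood"))    # 1.0: perfectly predictive
print(information_gain(rows, labels, "social"))  # 0.0: uninformative
```

A tree learner would split on "mood" first here; repeating the procedure on each resulting subset yields the kind of tree shown in Fig. 3.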
3 Evaluation Methods

An extensive evaluation of InfoVis has to take place at different stages. Important areas of interest can be: usability evaluation, insight study, case study, and transferability assessment (see [16] for details). For the results of a usability evaluation of GRAVI see [17]. The methods used in the insight study were insight reports [16] and focus groups (cf. [18]). A sample of 32 subjects participated in the study. They were computer science students and can therefore be described as domain novices. They accordingly received a comprehensive introduction to the domain (data, real users’ tasks, etc.) and introductions to the three techniques to be used. The evaluation with insight reports took place in a laboratory setting and lasted 155 minutes overall. Equal time was allotted to the three techniques (GRAVI, EDA, ML). Subjects were divided into three groups which used the three techniques in different orders, so that every technique was used once in first, second, and third place (MEG, EGM, GME).
Fig. 3. Machine Learning (ML) Sample: C4.5 Decision Tree
The subjects used a report system to formulate and document their findings during the exploration process in a self-directed manner. Whenever an insight occurred, they had to generate a report with this system. The following data was collected: material used, description of the finding, and a confidence rating. The insight reports were later classified along the following dimensions: the complexity of each insight, its plausibility, and whether the insight was elaborated in more detail and, if so, whether this elaboration was sound or invalid (see Fig. 4). Focus groups can give interesting insights into users’ attitudes and experiences, although they do not provide representative results [18]. [19] reports that focus groups are especially valuable for evaluating InfoVis techniques, as they are able to uncover unexpected problems that cannot be perceived by other research methods. In this sense, they can be an interesting complementary approach to other, more systematic methods. Focus groups with the same subjects were therefore held a week after the laboratory session. They lasted about 100 minutes each. Eight questions were discussed (e.g., ease of use and utility, major strengths and weaknesses, similarity and difference of insights gained with the different techniques, appropriateness of combined use). The value of
Fig. 4. Insight Report Documented by Subject with Classification and Categorization Options for Investigators
this method is that it reveals subjective impressions on questions not asked before and gives a different perspective on, as well as arguments for, the interpretation of the data collected in the experiment. The discussion guideline consisted of eight questions. The first four questions had to be discussed by the subjects for each of the three techniques (GRAVI, EDA, ML) separately. Afterwards, four more questions concerning all three techniques were addressed:

1. Appropriateness of the allowed time.
2. Ease of use and usefulness of the technique for gaining insights.
3. Overall confidence in insights gained with the technique.
4. Major strength and weakness of the technique.
5. Similarity and difference of gained insights using different techniques.
6. Assumed comprehension rates of the complex matter with each technique.
7. Appropriateness of combined use of the three techniques.
8. Order for best possible comprehension of the data.
4 Results

4.1 Insight Reports

The 32 subjects documented an overall of 876 reports. In the classification process we defined 805 different insights, which were assigned 2166 times to the reports. A statistical analysis of the collected data was carried out. In-depth details of these results are currently under review and will be published in the near future.
To sum up, the results could lead to the conclusion that ML is not a recommendable technique: the subjects’ confidence ratings were low, the complexity of the gained insights was low, and few predictors were found. On the other hand, GRAVI performed very well concerning insights with high domain value (finding predictors), and confidence ratings were generally high. EDA lies somewhere in between. Histograms and scatterplots are well known, but the interpretation of boxplots and statistical measures requires some familiarity with these techniques. EDA seems especially suited – or, more precisely, it was utilized in particular – to analyzing single values of individual patients at specific time steps. This may also be the reason why there were fewer wrong arguments with EDA. In contrast, there were many wrong arguments with ML.

4.2 Focus Groups

Appropriateness of the Allowed Time. Concerning ML, the subjects’ statements clearly show a connection to the position of ML in the order of used techniques: when ML was the first technique, all subjects stated that there was too little time for the tasks. If ML followed GRAVI, the allowed time was rated appropriate. Using ML last led to the assessment that there was too much time left. Typical explanatory statements were that subjects were not familiar with ML, that ML is not a suitable technique to start with unless one is already a domain expert, that ML is complex and confusing, and that it yielded no new insights that had not already been gained with the other two techniques. In general, the time allowed while using EDA was predominantly rated appropriate; again, only when it was used as the first technique would the subjects have needed more time for the tasks. The subjects pointed out their familiarity with EDA; only the statistical measures were criticized as difficult to interpret.
The ratings for GRAVI are similar to those for ML, though not as pronounced, and follow the position of GRAVI: if GRAVI was used as the first technique, subjects would have needed more time to become familiar with both the technique and the domain. For GRAVI in second position there was a trend towards too much time being available for the tasks. When GRAVI was used as the last technique, subjects rated the allowed time appropriate.

Ease of Use and Usefulness of the Technique for Gaining Insights. ML had the lowest scores regarding usefulness for gaining insights: 55% of all statements made by the subjects belong to the lowest category on this scale. Once again, unfamiliarity with and the complexity of ML led to a high level of uncertainty. The assessment of EDA was twofold: scatterplots and histograms scored very well, whereas boxplots and statistical measures were not rated as useful. The former were favored for their simplicity and for being visualizations; the latter were criticized for being complicated and, in the case of statistical measures, additionally for not being visualizations. For GRAVI, too, the subjects appreciated some elements and disapproved of others. The interactivity of this technique in general, and its powerful capability to handle the time dependent data in particular, were rated as very useful. Various visual details, like the poor visibility of missing data, were mentioned as hindering usefulness.
Overall Confidence in Insights Gained with the Technique. ML received an even worse assessment in the focus groups compared to the ratings given in the lab setting: 65.6% of the statements rated ML in the category “low confidence”. This high ratio is most likely due to peer pressure in one of the three groups, where all 12 participants rated ML unanimously (low confidence). Interestingly, EDA scored better than GRAVI in the focus groups. One possible explanation is that EDA received a lot of high ratings in the focus group of those who used EDA last. So we probably have, on the one hand, a form of learning effect leading to more domain expertise, which also affects the confidence in observations; on the other hand, EDA was the only technique the subjects were rather familiar with. It is therefore all the more noteworthy that GRAVI received only a few more ratings in the “low confidence” category than EDA.

Major Strength and Weakness of the Technique. Although the subjects could not make much use of ML, they believe that for ML experts this technique allows for very concise and valid insights. There was a strong appreciation of the automaticity of the calculations and a high level of faith in the correctness of the results. The latter was also raised by the often positively mentioned confusion matrix, a self-assessment of correctness provided by the ML algorithms. The visualization of the decision tree was rated a plus, whereas the formula produced by SMO was mentioned as confusing. The mentioned strengths of EDA were: visual elements (scatterplots, histograms), simplicity, familiarity, and clarity of the displayed data. The lack of interactivity, the impossibility of comparing patients and/or groups of patients, and problems with the exploration of time dependency are the downsides of EDA. GRAVI impressed by its interactivity, its many options to visualize data in different ways, its handling of time dependent data, its simplicity, and its intuitive interface.
Subjects saw the major weakness of GRAVI in the fact that visualizing much data rapidly leads to cluttered displays. The need to check and re-check possible insights with different constellations is also important; otherwise false conclusions could easily be drawn.

Similarity and Difference of Gained Insights Using Different Techniques. The majority of subjects reported that they found the same insights with the three different techniques; almost two thirds of the statements made fell into this category. Nevertheless, the level of detail of the insights varied.

Assumed Comprehension Rates of the Complex Matter with Each Technique. ML was shown to contribute very little to the comprehension of the provided data. This is in clear accordance with the subjects' earlier statements. EDA and GRAVI, on the other hand, could be utilized well by the subjects.

Appropriateness of Combined Use of the Three Techniques. 45% of statements put on record that the combined use of ML, EDA, and GRAVI makes perfect sense, because all three techniques offer different views on the data and therefore facilitate a deeper understanding and extensive exploration. Another 45% of statements pleaded for the omission of ML due to its marginal contribution to comprehension of the data for subjects who were not familiar with this complex technique.
Order for Best Possible Comprehension of the Data. There were almost as many preferred orders of using the three techniques as there were subjects. But there are also some major similarities in the statements: ML is not suitable as the first technique but is more useful for rechecking insights gained with other techniques. GRAVI, and also parts of EDA (the simple visual parts: histograms and scatterplots), are viable techniques for a first exploration of the data. Another interesting outcome of the discussion was that the different techniques should not be used sequentially, as in the laboratory setting, but simultaneously. The different views they provide on the data could add much more value in this way.
5 Conclusion

The use of diverse evaluation methods enables different views on the technology under investigation. Whereas insight reports can reveal strengths and weaknesses in the form of summative tests followed by statistical analysis, focus groups often give reasons and additional subjective opinions of the subjects, and therefore also ensure the correct interpretation of the former. The outcome of the insight reports could lead to the conclusion that ML is not a recommendable technique because of low confidence ratings, the low complexity of the gained insights, and the small number of predictors found. On the other hand, GRAVI performed very well: there were many insights with high domain value (predictors) and with high confidence ratings. EDA seems especially suited to analyzing single values of individual patients at specific time steps. The outcome of the focus groups shows that GRAVI is useful for gaining insights with a high confidence rating because of its flexibility through interactivity, the ability to explore several dimensions simultaneously, and the straightforward navigation within the time dependent data. Moreover, subjects rated GRAVI an appropriate visualization tool. ML should be omitted unless there is enough expertise with this technique; if there is, it can and probably will be a powerful technique for gaining insight. EDA rapidly leads to insights (although rather basic ones) due to the general familiarity with this technique. Combining these results, we see that all three techniques offer different views on the data, and therefore a combined use will likely lead to more insight and comprehension.

Acknowledgments. The project “Interactive Information Visualization: Exploring and Supporting Human Reasoning Processes” is financed by the Vienna Science and Technology Fund [Grant WWTF CI038]. Thanks to Bernhard Meyer for the collaboration in the classification process.
References

1. Chen, C.: Empirical evaluation of information visualizations: an introduction. Int. J. Human-Computer Studies 53(5), 631–635 (2000)
2. Plaisant, C.: The challenge of information visualization evaluation. In: Costabile, M.F. (ed.) Proceedings of the Working Conference on Advanced Visual Interfaces, pp. 109–116. ACM Press, New York (2004)
3. Tory, M., Möller, T.: Human factors in visualization research. IEEE Transactions on Visualization and Computer Graphics 10(1), 72–84 (2004)
4. Spence, R.: Information Visualization. ACM Press, New York (2001)
5. Stasko, J.: Evaluating information visualizations: Issues and opportunities (position statement). In: Bertini, E., Plaisant, C., Santucci, G. (eds.) BEyond time and errors: novel evaLuation methods for Information Visualization – Proceedings of BELIV’06, Venice, Italy, pp. 5–7 (2006)
6. Saraiya, P., North, C., Duca, K.: An insight-based methodology for evaluating bioinformatics visualizations. IEEE Transactions on Visualization and Computer Graphics 11(4), 443–456 (2005)
7. Eysenck, M.W., Keane, M.T.: Cognitive Psychology. A Student’s Handbook. Psychology Press, Taylor and Francis Group, London, New York (2005)
8. North, C.: Toward measuring visualization insight. IEEE Computer Graphics and Applications 26(3), 6–9 (2006)
9. Lanzenberger, M.: The Interactive Stardinates – An Information Visualization Technique Applied in a Multiple View System. PhD thesis, Vienna University of Technology, Vienna, Austria (September 2003)
10. Hinum, K., Miksch, S., Aigner, W., Ohmann, S., Popow, C., Pohl, M., Rester, M.: Gravi++: Interactive information visualization to explore highly structured temporal data. Journal of Universal Computer Science 11(11), 1792–1805 (2005)
11. Tukey, J.W.: Exploratory Data Analysis. Addison-Wesley, Reading, MA (1998)
12. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann, San Francisco, CA (2005)
13. Platt, J.: Fast training of support vector machines using sequential minimal optimization. In: Schoelkopf, B., Burges, C., Smola, A. (eds.) Advances in Kernel Methods – Support Vector Learning, pp. 185–210. MIT Press, Cambridge (1998)
14. Keerthi, S.S., Shevade, S.K., Bhattacharyya, C., Murthy, K.R.K.: Improvements to Platt’s SMO algorithm for SVM classifier design. Neural Computation 13(3), 637–649 (2001)
15. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco, CA (1993)
16. Rester, M., Pohl, M., Hinum, K., Miksch, S., Popow, C., Ohmann, S., Banovic, S.: Methods for the evaluation of an interactive infovis tool supporting exploratory reasoning processes. In: BELIV ’06: Proceedings of the 2006 AVI Workshop on BEyond Time and Errors, pp. 32–37. ACM Press, New York (2006)
17. Rester, M., Pohl, M., Hinum, K., Miksch, S., Ohmann, S., Popow, C., Banovic, S.: Assessing the usability of an interactive information visualization method as the first step of a sustainable evaluation. In: Proc. Empowering Software Quality: How Can Usability Engineering Reach These Goals?, Austrian Computer Society, pp. 31–44 (2005)
18. Kuniavsky, M.: User Experience: A Practitioner’s Guide for User Research. Morgan Kaufmann, San Francisco (2003)
19. Mazza, R.: Evaluating information visualization applications with focus groups: the CourseVis experience. In: BELIV ’06: Proceedings of the 2006 AVI Workshop on BEyond Time and Errors, pp. 1–6. ACM Press, New York (2006)
Serial Hanging Out: Rapid Ethnographic Needs Assessment in Rural Settings
Jaspal S. Sandhu1,∗, P. Altankhuyag2, and D. Amarsaikhan3
1 College of Engineering, University of California, Berkeley, USA
[email protected]
2 Asian Development Bank, Ministry of Health, Mongolia
3 Postgraduate Institute, Health Sciences University of Mongolia
Abstract. This paper presents an ethnographic method for assessing user needs in designing for rural settings. “Serial Hanging Out” consists of short-term participant observation with multiple, independent informants. The method is characterized by: (1) its short-term nature, (2) the use of participant observation supported by specific field techniques, and (3) the emphasis on user needs for design. It is discussed in relation to similar methodological work in associated fields. To ground the discussion, the method is presented in the context of ongoing work to develop improved information systems to support rural health workers in Mongolia. Keywords: participant observation, ethnography, design, qualitative methods, user needs, rural, Mongolia.
1 Introduction Rapid ethnographic methods play a critical role in human-centered design. They have been applied extensively not only in human-computer interaction (HCI) and computer-supported cooperative work (CSCW) [1][2][3], but also in product design [4][5], and marketing and consumer research [6][7]. The common motivation for the use of these methods across various disciplines is that they provide a much richer understanding of people, in context and from their own perspective. Blomberg et al. succinctly describe the key dimensions of ethnographic research: (1) it takes place in natural settings, (2) it is holistic, i.e. understanding is framed in systems larger than the immediate context, (3) it is descriptive, and (4) it strives to consider the member’s own perspective [8]. Rapid methods are necessary because projects operate under severe time and budgetary constraints; however, holism and the member’s perspective are often sacrificed in order to operate within these constraints [9]. While ethnographic methods have been used for requirements generation [10], the emphasis here is on design innovation in unfamiliar environments, specifically rural communities in the context of international development. The focus is not only on how technology can ∗
Corresponding author.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 614–623, 2007. © Springer-Verlag Berlin Heidelberg 2007
be designed, but also on whether technology makes sense in the first place. In either case, the objective is to design holistic and systemic interventions that target real development problems. Such design innovation requires a deep understanding of target users, which is challenging to obtain using rapid methods. Dourish criticizes ethnographic practice in HCI on the basis that it tends to neglect the interpretive nature of ethnography [9]. Achieving a deep understanding and preserving the interpretive nature of ethnographic research are typically correlated with long-term fieldwork, but it is unrealistic to propose long-term fieldwork in most applied design settings. The presented work addresses this issue through longer engagements with individuals than are typical in applied design settings, and by operating from an interpretive perspective, namely one that sits “between two worlds or systems of meaning – the world of the ethnographer … and the world of cultural members” [11]. In HCI, Rapid Ethnography [12] and Quick and Dirty Ethnography [3] have been presented, but these methods have not been sufficiently evaluated in practice [13]. While these forms of ethnographic research focus on a place – offices and air traffic control rooms, respectively – the method proposed here is focused on the individual-as-anchor; this makes more sense in many rural applications and is a major reason for framing this method as one for rural settings. Rapid Ethnography and Quick and Dirty Ethnography share an emphasis on focused studies. Such a premature focus may blind researchers to critical information in relatively unfamiliar, rural settings. Although most ethnographic work is cross-cultural to some degree, work in rural communities for international development represents an extreme because those involved in conducting the research often come from very different cultural backgrounds, even if they are from in-country.
The settings of international development are necessarily more diverse than business settings, where “through years of experience, trained ethnographers build up a great deal of knowledge … about those segments of the population who are reliably of interest to business” [14]. Given the relative unfamiliarity of the context to those conducting the research, informal methods [15] are required, as is a need to “approach social life with a wide-angle lens” [16]. Prior methodological work in HCI has not focused on rural settings, so there is an opportunity for HCI to contribute to related work in the international development community [17][18]; however, the proposed methodology – although related to HCI and innovation – is more concerned with meeting specific development objectives than it is with achieving novel technology gains.
2 The Method The proposed method for understanding user needs in the context of designing information systems for rural communities is Serial Hanging Out (SHO): sequential, short-term (2-4 days) participant observation with multiple, independent informants. The participant observation techniques are more sophisticated than the phrase
“hanging out” suggests.1 Still, the metaphor of “hanging out” captures the essence of the work: participant observation is “a way to collect data in naturalistic settings … [to] observe and/or take part in the common and uncommon activities of the people being studied” [19]. SHO is related to Sanjek’s [20] network method in urban ethnography. His “network-serials” consisted of intensive interviewing of informants in order “to describe the behavior and purposes of the members … and to chart the range of interaction settings.” A key similarity between Sanjek’s network serials and SHO is that they focus on informants as “anchors”, providing access to different activities, interactions, relationships, and actors. Despite the similarities, there are several key differences. First, the core field method in SHO is short-term participant observation rather than interviewing. Sanjek’s suggestion of “direct behavioral recording” as a plausible alternative to “intensive interviewing” highlights another important difference related to the respective goals of the two methods. Network-serials are intended to chronicle behavior, but SHO uses the concept of serial interactions and social networks to explore the interactions themselves. It is not simply interactions occurring in a particular time and place with particular actors that are of interest in SHO, but it is the quality, content, and meaning of those human interactions. Moreover, SHO is not limited to interpreting interactions and movements – in particular, other elements of understanding arise from contextual, ethnographic interviews. In practical terms, SHO uses multiple teams of 1-2 researchers each in order to conduct a study. The need for parallel teams is driven by the significant time investment in each participant, not only the time with the informant, but also the transport time (often significant in rural areas, especially with disparate informants) and the time spent synthesizing field data. 
As Millen states, international research may require multiple researchers on a single team in order to “help with language and local cultural issues” [12]. With respect to parallel teams, he notes that multiple researchers can observe different groups. Researchers will always have some influence on the situations in which they are involved. This is understood and is in fact an integral part of interpretive research; however, the influence of more than two researchers is so disruptive that two is the maximum number of researchers to be involved with a single informant. SHO avoids Geertz’s criticism of "hit-and-run" ethnography [21] by concentrating on specific informants, methods, and themes of inquiry. Further, the focus on informants-as-anchors, the longer engagement, and the emphases on holism and member’s point-of-view distinguish SHO from contextual inquiry [22].
3 Context This paper presents a theoretical argument for SHO that informs the design of an applied research study in rural Mongolia. While the method draws from the primary 1
The phrase is adopted from Geertz [21] who cites Clifford as originating the phrase “deep hanging out”. Clifford’s tone is intended as an affront to traditional ethnography, but Geertz picks up the phrase, dusts it off, and wears it proudly.
author’s past experience with design research in rural settings [23][24], it is being formally implemented and evaluated in the context of this Mongolia study. Mongolia is unique among developing countries in that one-quarter of the overall population is nomadic or semi-nomadic, and rural areas are extremely sparsely populated [25]. By many standards, the Mongolian health infrastructure is highly developed; however, facilities and human resources are increasingly limited in providing effective healthcare for rural populations that lie beyond aimag (provincial) capitals. Bagiin baga emch, rural health workers, provide services at the bag (smallest administrative unit) level by traveling from household to household by motorcycle or horse; however, there are significant unmet needs in continuing training of these paraprofessionals [26] and in providing support for their work practice. This research focuses on understanding the lives of bagiin baga emch in order to design improved information systems to support their work. The primary field research is being undertaken in partnership with an Asian Development Bank (ADB) project: “Information and Communication Technology for Improving Rural Health Services in Mongolia” (JFICT 9053-MON). The objective of this project is to improve the health of vulnerable, rural populations – especially mothers and young children – by using ICT (information and communication technology) tools to support health services delivery. Part of this project involves providing PDAs (personal digital assistants, or handheld computers) to bagiin baga emch, primarily in order to support data collection. PDA deployment and associated training began in spring 2007. In addition, this work will contribute to an International Development Research Centre (IDRC) pilot project that will provide PDAs to rural health workers in Nepal (via HealthNet Nepal) and Mongolia (Health Sciences University of Mongolia).
The intention of providing PDAs is to support continuing education and decision-making at the point of care.
4 Sampling Sampling issues are fundamental to ethnographic research, particularly in rural settings where potential users of systems may be geographically and culturally dispersed. In SHO, maximum variation sampling [27] is preferred because it provides maximal coverage of perspectives, behaviors, practices, interactions, and activities. Maximum variation sampling is purposeful selection of informants representing a broad range on key dimensions. In the Mongolia case, such dimensions include geographic zone (e.g. arid steppe, forest steppe, mountain, desert), transport (motorcycle, horse, camel), and work experience. Bagiin baga emch are a formal part of the health care system, so the researchers have access both to the entire sampling frame and to key intermediaries at the aimag (provincial) and soum (county) levels. In other settings, access to informants, or information about them, may be more difficult to obtain. In such cases, other sampling strategies [27] may be more suitable. The goal of this method is as much to define intracultural variation [28] in apparently homogeneous groups as it is to define cultural patterns. By understanding variation in
meanings and practices within our sample of bagiin baga emch, we can develop strategies that have more universal appeal and can even take advantage of knowledge owned by a subset of the population. It can be argued in this regard that von Hippel’s work with lead users is primarily motivated by intracultural variation [29].
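As a rough computational analogy, maximum variation sampling can be viewed as a coverage problem: keep adding the informant who contributes the most not-yet-covered values on the key dimensions. The sketch below is illustrative only; the candidate pool, the trait names, and the maximum_variation_sample helper are all hypothetical, and in practice the selection is a purposeful, judgment-driven process rather than an algorithm.

```python
# Illustrative greedy sketch of maximum variation sampling: choose k informants
# so that the selected set covers as many distinct (dimension, value) pairs as
# possible across the key dimensions (zone, transport, experience).
# The candidate pool and trait values below are hypothetical.

def maximum_variation_sample(candidates, k):
    """Greedily select up to k candidates maximizing coverage of dimension values."""
    selected, covered = [], set()
    pool = list(candidates)
    for _ in range(min(k, len(pool))):
        # Pick the candidate adding the most not-yet-covered (dimension, value) pairs.
        best = max(pool, key=lambda c: len(set(c["traits"].items()) - covered))
        selected.append(best)
        covered |= set(best["traits"].items())
        pool.remove(best)
    return selected

candidates = [
    {"id": "A", "traits": {"zone": "forest steppe", "transport": "horse", "experience": "novice"}},
    {"id": "B", "traits": {"zone": "desert", "transport": "camel", "experience": "veteran"}},
    {"id": "C", "traits": {"zone": "forest steppe", "transport": "motorcycle", "experience": "veteran"}},
    {"id": "D", "traits": {"zone": "mountain", "transport": "horse", "experience": "mid-career"}},
]

sample = maximum_variation_sample(candidates, 3)
print([c["id"] for c in sample])
```

A purely similarity-based sample (e.g. A and C, who share a zone) would cover fewer dimension values than the greedy choice, which is the intuition behind sampling for breadth rather than typicality.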
5 In the Field Regarding specific field techniques, pens and paper notebooks will be used for field notes since, from prior experience, researchers will be highly mobile. The recommendations of Emerson et al. [30] will be followed closely for developing field jottings. In addition to their recommendations, sketching and diagramming will be used as mnemonic devices. Opportunistic digital audio recording will be used to capture unstructured interviews. Digital photographs and short digital video clips will be recorded on a limited basis to supplement observations and interviews. The primary output of an encounter with an informant will be a narrative – blending realist and impressionist styles [11] – supported by annotated photographs. The audio and video will be used primarily to support the construction of these narratives, but will also remain available for secondary analysis in later phases of the design process. The narratives will be written immediately after leaving the field in order to support maximal recall. In the Mongolia work, this will typically mean writing the narratives at soum health centers, within 24-48 hours of leaving the informant. To this point, SHO has been presented as a cleanly operationalized process; however, as with other ethnographic enterprises, this is simply not the case. Significant field preparation is required and has been undertaken with the Mongolia project. Review of previous research with bagiin baga emch and other rural health workers was an initial step in beginning to understand the culture and work conditions of bagiin baga emch. This is an imperative stage in the process although the literature may sometimes be non-existent [3] or misleading [15]. For the lone foreign researcher on this team, language preparation was critical (as it was for Sanjek’s work in Accra, Ghana [20]). 
The essential nature of language – and some basic cultural understanding – is why it is important for this work to be done by in-country researchers, or at the very least, in close collaboration with them. Other field preparation has included key informant interviews, observation of continuing training in aimag capitals, and pilot testing of the field protocol.
6 Time There are many factors to consider in the design of rapid ethnographic research, but time-in-the-field is often a primary concern (Table 1). Although Table 1 seems to indicate significant differences in time of engagement, the differences are in fact less dramatic, as the cumulative sum of fieldwork days is greater for those methods that involve multiple sites or informants (Rapid Ethnography, Serial Hanging Out).
Table 1. Sample time recommendations for different rapid ethnographic methods

Author              Method                        Time of Engagement                                  Unit of Analysis
Millen [12]         Rapid Ethnography             1 day or less                                       Per site
Hughes et al. [3]   Quick and Dirty Ethnography   4 weeks each study, multiple studies over 3 years   Single site
Beebe [18]          Rapid Assessment Process      4-40 days                                           Single site
Handwerker [31]     Quick Ethnography             3-90 days                                           Single site
Sandhu et al.       Serial Hanging Out            2-4 days                                            Per informant
In the business world, “ethnographies (read: participant observation) can last a half a day or even less. How is this possible? Ethnographers working in business are generally PhDs and typically manage this seemingly impossible feat by applying their methodological skill and accrued knowledge of theories of human behavior and social interaction” [14]. There are two problems in applying such logic to international development: (1) such familiarity does not exist for rural international development, even for trained people who are from in-country, and (2) such an attitude shifts the power from informant to ethnographer – the informant becomes a subject, rather than a participant, in the research. In Mongolia, the selection of 2-4 days per informant is motivated by the nature of bagiin baga emch activities – some activities take much longer than half a day, such as monthly visits to households (2-3 days), visits to soum health centers (1-3 days), and summons (duudlaga) to a patient’s home (half to full day). In addition, the strategy is to maximize the cost-benefit given the relatively high time and monetary investment in rural travel.
7 Multiple Researchers and External Reliability Issues of external reliability2 are a primary concern in SHO given the use of multiple, parallel researchers. LeCompte and Goetz indicate that ethnographers “enhance the external reliability of their data by recognizing and handling five major problems” [32]. These five problems and the mechanisms for managing them in SHO are presented below: 1. Researcher status: Mobility along the participant-observer spectrum is limited by selecting similar informants (all bagiin baga emch), by selecting a research team with comparable abilities to one another, and by making explicit the desired status among the research team. 2. Informant choices: The informants are similar along several dimensions since they represent a single class of users (bagiin baga emch). 3. Social context: An engagement of 2-4 days will ensure access to multiple social settings and actors. 2
“External reliability addresses the issue of whether independent researchers would discover the same phenomena or generate the same constructs in the same or similar settings” [32].
4. Analytic constructs: A single field protocol will be used by all field researchers. 5. Data collection/analysis: The field protocol and periodic team meetings will be used to manage data collection and preliminary analysis in the field, while a structured, team-based process will be used during the data analysis phase.
8 Data Analysis Following the model of Griffin and Hauser [33], the narratives resulting from the SHO will be analyzed by teams of researchers and students, in this case students from the Health Sciences University of Mongolia. Part of the motivation for doing so is to evaluate the effectiveness of this method in uncovering user needs. User needs are one, but by no means the only, way to bridge ethnography and design. Urban and Hauser define needs as “statements in the words of the customer that describe the benefits they need, want, or expect to get from a product” [34]. SHO extends this definition in three ways. First, while statements are important, other aspects of ethnographic work are also included. Second, products include services, not just artifacts or technological systems. Third, in international development more than in business development, the products or services being developed may not exist or may exist in radically different forms, so wants and expectations may be difficult to obtain. The needs analysis is a team-based process of identifying both explicit (stated) and implicit (latent or unarticulated) needs from the narratives. Upon completion of the needs analysis, all needs will be merged and redundant needs will be removed. The team will then use affinity diagramming3 to create a hierarchy of needs. Such a hierarchy makes the process of translating user needs into novel design concepts more tractable. Finally, the user needs will be tied to the context from which they came to preserve the richness of the design research. The resulting needs and associated data will be used to generate novel design concepts. The concepts will then be prototyped and tested with users, whether or not the prototyped systems are “technological”.
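The merge-and-group steps of the needs analysis can be sketched in code. All need statements and theme labels below are hypothetical; in a real affinity diagramming session the themes emerge bottom-up from the team's discussion rather than from a predefined lookup table, so the mapping here merely simulates such an outcome.

```python
# Illustrative sketch (hypothetical data): merge duplicate need statements,
# then group the remaining needs under emergent themes to form a two-level
# hierarchy, as in affinity diagramming.

from collections import OrderedDict

raw_needs = [
    "offline access to treatment guidelines",
    "offline access to treatment guidelines",   # redundant statement, merged below
    "faster reporting of monthly household data",
    "battery life that lasts a multi-day trip",
    "reference doses for common pediatric drugs",
]

# 1. Merge: drop redundant statements while preserving first-seen order.
unique_needs = list(OrderedDict.fromkeys(raw_needs))

# 2. Affinity grouping: each need is assigned to a theme the team agreed on.
themes = {
    "offline access to treatment guidelines": "point-of-care reference",
    "reference doses for common pediatric drugs": "point-of-care reference",
    "faster reporting of monthly household data": "data collection",
    "battery life that lasts a multi-day trip": "field robustness",
}

hierarchy = {}
for need in unique_needs:
    hierarchy.setdefault(themes[need], []).append(need)

for theme, needs in hierarchy.items():
    print(theme, "->", needs)
```

The resulting theme-to-needs mapping is the kind of hierarchy that makes translation into design concepts tractable: designers can address a whole theme (e.g. point-of-care reference) rather than isolated statements.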
9 Summary and Assessment Serial Hanging Out (SHO) is a rapid, ethnographic method for uncovering user needs in rural settings. This method is particularly well-suited to the study of a geographically dispersed class of users [20], as may be found in rural institutions (in the present case, Mongolia’s rural health care system). Moreover, by selecting informants as anchors, and by experiencing interactions over a multi-day period, a rich sample of interactions and activities can be included in the development of ethnographic narratives. Although the method draws from Sanjek’s network-serial method, SHO is less concerned with behavioral mapping than it is with using spatial movements and interactions as a scaffolding for ethnographic inquiry. Also, as 3
Affinity diagramming is a team-based process of grouping ideas, in this case user needs, based on the ideas themselves, rather than external categories. It is also known as the KJ method after Japanese ethnologist Kawakita Jiro, the inventor of the method.
opposed to network-serials, SHO emphasizes emic rather than etic4 perspectives, and operates in an interpretive frame of reference. Although there are some similarities to other rapid methods in HCI [3][12], this work is different in that it emphasizes the use of multi-day participant observation and in that it has a less specific focus at the outset. In any case, none of these methods has been substantially evaluated in practice [13]. By formally implementing this method in the context of applied research in the rural Mongolian health sector, this method can be evaluated in situ, providing evidence as to its usefulness and to the key elements of study design. The evaluative component of this research will address the efficacy, efficiency, and quality of the methods. Details of the evaluation and procedures for evaluation will be presented in future publications. Although this research will have relevance to the design of information systems, it is also expected to have utility beyond design [9], in both applied and theoretical senses. The ethnographic results should provide a deeper understanding of bagiin baga emch for those developing health strategy for the bag/soum level in Mongolia and should also provide a unique view into the work culture of rural health professionals in a particular place and time. This plurality is a main driver of this research. Acknowledgements. Thanks to those who have partnered in, or supported, the prior field research which serves as the basis for the concepts in this paper: Jonathan Hey, Catherine Newman, Alice M. Agogino, Teresa DeAnda, Jessica Granderson, Expedita Ramirez, and Kirk R. Smith. Countless discussions with colleagues, mentors, and friends (and a few non-academic strangers) have been instrumental in the development of these ideas. Special thanks on this front to Judd Antin, Michael Barry, Sara Beckman, Griff Coleman, Peter Lyman, and AnnaLee Saxenian.
Preliminary fieldwork in Mongolia was supported by a Foreign Language Area Studies grant. The current research is funded by a Fulbright Fellowship and an NSEP Boren Fellowship. Mahad Ibrahim and Andrei Marin provided invaluable feedback on early drafts of this article. Finally, none of this would be possible without the cooperation of past and current research participants, who have invited us into their homes and daily lives. To them we are most indebted.
References 1. Gilmore, D.: Business: Understanding and Overcoming Resistance to Ethnographic Design Research. Interactions, 9(3) (May 2002) 2. Wasson, C.: Ethnography in the Field of Design. Human Organization 59(4), 377–388 (2000) 3. Hughes, J., Rodden, T., King, V., Anderson, H.: The Role of Ethnography in Interactive Systems Design. ACM Interactions 2(2), 56–65 (1995) 4. Rosenthal, S.R., Capper, M.: Ethnographies in the Front End: Designing for Enhanced Customer Experiences. Journal of Product Innovation Management 23, 215–237 (2006) 4
Emic refers to terms or concepts meaningful to the cultural member, while etic refers to terms or concepts meaningful to the external researcher.
5. Squires, S., Byrne, B. (eds.): Creating Breakthrough Ideas: The Collaboration of Anthropologists and Designers in the Product Development Industry. Bergin and Garvey, Westport, Connecticut (2002) 6. Mariampolski, H.: Ethnography for Marketers: A Guide to Consumer Immersion. Sage, Thousand Oaks, California (2006) 7. Arnould, E.J., Wallendorf, M.: Market-Oriented Ethnography: Interpretation Building and Marketing Strategy Formulation. Journal of Marketing Research 31(4), 484–504 (1994) 8. Blomberg, J., Burrell, M., Guest, G.: An Ethnographic Approach To Design. In: Jacko, J.A., Sears, A. (eds.) The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications, pp. 964–986. Lawrence Erlbaum Associates, Mahwah, New Jersey (2003) 9. Dourish, P.: Implications for Design. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Montréal, Québec, Canada, 541-550 (2006) 10. Sommerville, I., Rodden, T., Sawyer, P., Bentley, R., Twidale, M.: Integrating Ethnography into the Requirements Engineering Process. In: Proceedings of IEEE International Symposium on Requirements Engineering, San Diego, California, pp. 165– 173 (1993) 11. Van Maanen, J.: Tales of the Field. University of Chicago Press, Chicago (1988) 12. Millen, D.R.: Rapid Ethnography: Time Deepening Strategies for HCI Field Research. In: Proceedings of the Conference on Designing Interactive Systems, New York City, New York, pp. 280-286 (2000) 13. Kujala, S.: User Involvement: A Review of the Benefits and Challenges. Behaviour and Information Technology 22(1), 1–16 (2003) 14. Plowman, T.: Ethnography and Critical Design Practice. In: Laurel, B. (ed.) Design Research: Methods and Perspectives, pp. 30–38. MIT Press, Cambridge, Massachusetts (2003) 15. Agar, M.: The Professional Stranger: An Informal Introduction to Ethnography. Academic Press, New York (1980) 16. Spradley, J.P.: Participant Observation. Holt, Rinehart and Winston, New York (1980) 17. 
Chambers, R.: The Origins and Practice of Participatory Rural Appraisal. World Development 22(7), 953–969 (1994) 18. Beebe, J.: Rapid Assessment Process. Altamira Press, Walnut Creek, California (2001) 19. Dewalt, K.M., Dewalt, B.R.: Participant Observation. Altamira Press, Walnut Creek, California (2002) 20. Sanjek, R.: A Network Method and Its Uses in Urban Ethnography. Human Organization 37(3), 257–268 (1978) 21. Geertz, C.: Deep Hanging Out. The New York Review of Books 45(16) (October 1998) 22. Beyer, H., Holtzblatt, K.: Contextual Design: Defining Customer-Centered Systems. Morgan Kaufmann, San Francisco (1998) 23. Sandhu, J.S., Hey, J., Newman, C., Agogino, A.M.: Informal Health and Legal Rights Education in Rural, Agricultural Communities using Mobile Devices. In: Proceedings of IEEE International Conference on Advanced Learning Technologies, Kaohsiung, Taiwan, pp. 988–992 (2005) 24. Granderson, J., Sandhu, J.S.: Efficiency and Design of Improved Woodburning Cookstoves in the Guatemalan Highlands. Technical Report Max-05-1, School of Public Health, University of California, Berkeley (2005) 25. Ebright, J.R., Altantsetseg, T., Oyungerel, R.: Emerging Infectious Diseases in Mongolia. Emerging Infectious Diseases 9(12), 1509–1515 (2003)
26. Directorate of Medical Service, Department of Human Resource Development: Survey Report on Training Needs of Bag Feldshers. Ulaanbaatar, Mongolia (2004) 27. Patton, M.Q.: Qualitative Evaluation and Research Methods. Sage, Newbury Park, California (1990) 28. Pelto, P.J., Pelto, G.H.: Anthropological Research: The Structure of Inquiry. Cambridge University Press, Cambridge, United Kingdom (1978) 29. von Hippel, E.: The Sources of Innovation. Oxford University Press, New York (1988) 30. Emerson, R.M., Fretz, R.I., Shaw, L.L.: Writing Ethnographic Fieldnotes. University of Chicago Press, Chicago, Illinois (1995) 31. Handwerker, W.P.: Quick Ethnography. Altamira Press, Walnut Creek, California (2001) 32. LeCompte, M.D., Goetz, J.P.: Problems of Reliability and Validity in Ethnographic Research. Review of Educational Research 52(1), 31–60 (1982) 33. Griffin, A., Hauser, J.R.: The Voice of the Customer. Marketing Science 12(1), 1–27 (1993) 34. Urban, G.L., Hauser, J.R.: Design and Marketing of New Products. Prentice Hall, Englewood Cliffs, New Jersey (1993)
Effectiveness of Content Preparation in Information Technology Operations: Synopsis of a Working Paper
A. Savoy1 and G. Salvendy2
1 Purdue University, West Lafayette, IN, USA
[email protected]
2 Purdue University, West Lafayette, USA and Tsinghua University, Beijing, China
[email protected]
Abstract. Content preparation is essential for web design [25]. The objective of this paper is to establish a theoretical foundation for the development of methods to evaluate the effectiveness of content preparation in information technology operations. Past studies identify information as the dominant concern of users, and delivery mechanism as a secondary concern [20]. The best presentation of the wrong information results in a design with major usability problems and does not aid the user in accomplishing his task. This paper shifts the focus of existing usability evaluation methods. It attempts to fill the void in usability literature by addressing the information aspect of usability evaluation. Combining the strengths of content preparation and usability evaluation yields major implications for a broad range of IT uses. Keywords: Content preparation, World Wide Web, Usability.
1 Introduction The internet has evolved into an essential part of everyday life for a variety of purposes: education, communication, and entertainment [1]. Accordingly, Human Computer Interaction (HCI) research has made great strides in the design of interfaces. Past research identifies information as the dominant concern of users, and delivery mechanism as a secondary concern [20]. Yet, most studies concentrate solely on presentation and/or mode of communication. The best presentation of wrong information still results in a design with major usability problems and does not aid the user in accomplishing his task. Therefore, effectiveness of interface design and Information Technology (IT) communication with humans is a function of what is presented (i.e. content preparation) and how well the ‘what’ is presented (i.e. usability evaluation). Usability is a common term appearing in numerous publications in the area of HCI. In addition, usability evaluation is not a new topic of interest. Over the years, researchers have discussed fundamental issues of formative and summative evaluation approaches, needs for objective measures, and optimal numbers of evaluators. Information aspects are rarely mentioned in usability definitions. Fewer than five percent of usability evaluation studies mention information [10].
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 624–631, 2007. © Springer-Verlag Berlin Heidelberg 2007
This paper unveils a theoretical foundation for the development of methods to evaluate effectiveness of content in IT operations. Combining this with existing methods of usability evaluation will provide an overall effectiveness evaluation of IT systems with which humans interact. It attempts to fill the void in usability literature by addressing the information aspect of usability evaluation.
2 Content Preparation The concept of content preparation emerged from a conference panel discussion in 2002. This panel assessed the preparation of content and its management for four elements of web design: knowledge elicitation, information organization, information retrieval, and information presentation [21]. Identifying the specific information needed by users and/or customers is a main goal. Content preparation is a fairly new concept in web design, with relevance to e-business and cross-cultural design issues [16]. Due to its infancy, no evaluation tool for this concept and its principles has been studied or developed. Further, no information structure that would aid the development of an evaluation tool has been documented. This study aims to equip designers with end-user information requirements. The focus of content preparation evaluation and traditional usability evaluation differ considerably. The former concentrates on “what” information is needed and provided. The latter evaluates presentation and functionality [16]. Both evaluations are important to the design of human-centered interfaces. Currently, only usability testing is practiced. If content preparation evaluations were used as a supplement to traditional usability testing, interface evaluations would be more comprehensive. A similar concept of content preparation was established in earlier years. Its principles were captured in the production of catalogs, newspapers, product manuals, and paper-based bank statements. Studies have investigated differences between traditional and non-traditional media: printed versus online catalogs, printed versus online magazines [17], printed versus online newspapers, printed versus online references, and printed versus online presentation of information. As a result, the differing characteristics of printed and online materials influence modifications/additions for web-based content preparation.
3 Literature Review Many research studies have addressed the issues of data and information quality [24], [14], [19]. In the Information Systems (IS), Human-Computer Interaction (HCI), and Web-design areas, there have been different approaches for classifying important factors and evaluating quality. 3.1 Information Quality The majority of IS studies are founded upon the effectiveness of databases, search and retrieval algorithms, and content management systems [12]. Research in these areas tends to include an overall approach. In hopes of discovering a construct defining useful content for general use, studies in IS were investigated.
626
A. Savoy and G. Salvendy
Research pertaining to information/data quality has been conducted covering traditional and non-traditional contexts of use and presentation. Common classifications of the dimensions include Quality Category, Assessment Class, Criteria, and Dimensions. The Quality Category classification stems from an information quality framework developed to allow IS managers to better understand and meet the needs of their information consumers [24], [12]. This framework consists of the following four categories: Intrinsic information quality, Representational information quality, Accessibility information quality, and Contextual information quality. The first three categories refer to general credibility issues. Intrinsic information quality suggests that information should have an independent quality [11]. This category is defined further with related dimensions, among which accuracy is labeled the most important [22], [11]. It could be viewed as General Content information quality. Representational information quality is concerned with the format and presentation of information. This category deals with the visual design (i.e., the “how” aspect) of web design, which is not the focus of this paper. Accessibility information quality focuses on security and privacy [12]. This category could be viewed as Trust information quality. The final category is in line with this paper’s objective. Contextual information quality denotes that consumers’ need for information may differ according to their tasks. However, general categories (Relevancy, Value-added, Timeliness, and Amount of data) can be used for classification. Content requirements for these categories were not mentioned; however, this demonstrated the use of general categories for classification. Findings from IS provided details on aspects of credible information rather than useful information. While useful information should be credible, credible information is not necessarily useful.
A clear definition of useful content cannot be deduced from the studies in this area alone. 3.2 Internet Domains Content preparation emphasizes information that will aid users in their decisions and tasks. Although studies show that user preferences and information needs change according to their purposes for using the web or across website domains [3], [26], the desire for useful information is common to all domains. A number of studies are restricted to specific domains, including E-commerce, Entertainment, Education, Advertisement, and Medical. The results of each study provide in-depth analysis of which aspects/features were important to usability and customer satisfaction; rarely is information content addressed. E-commerce is ranked among the top two domains for visitation and use. Online shopping surfaced as a by-product of the internet and produces billions of dollars in revenue [8]. Customers need specific information that will aid in their decision-making. However, similar research often does not consider specific elements of information for e-commerce websites [8]. Due to the vast number of competitors, it takes more than simple convenience to persuade a shopper to buy from a particular website. Detlor et al. (2003) was one of the first studies to address the challenge of identifying specific information elements of e-commerce websites. Their research
concerning browse vs. search for pre-purchase online information seeking is highly referenced. The elements were categorized into three groups: Product, Retailer, and Interface related dimensions of information content. The construct was developed after conducting research on consumers’ information preferences across browsing and searching activities. Some of Detlor et al.’s (2003) elements are listed in Table 1.

Table 1. Essential Domain Specific Information Components

Domain: E-commerce (Detlor, 2003)
Information components: Aesthetics, Product Specification, Price, Reliability, Delivery, Purchase Advice, Retailer Service, Retailer Reputation, Brand, Retailer Policy, Product Alternative, Availability, Manufacturer

Domain: Advertisement (Resnik & Stern, 1977)
Information components: Price-Value, Quality, Availability, Special Offer, Packaging, Guarantees, Company-Sponsored Research, Performance, Components, Taste, Nutrition, New Ideas, Safety, Independent Research

Domain: Education (Griffin, 1999)
Information components: Admissions, Alumni, Facilities, FAQs, Placement, Programs, Board Members, Awards, Calendar, Gallery, News, Speakers, Faculty, Course Descriptions
Online advertisements have boosted revenues “from $1.9 billion in 1988 to $4.6 billion in 1999 alone” [8]. In addition, the role of information has established itself as a central factor in many discussions of how advertising works [9]. Advertising has many models and theories dedicated to explaining how a consumer searches for functional information to assist decision-making during the purchasing process. Advertisement research has roots that date back to the 1970s. Research conducted in the marketing area addressed the needs for certain types of information in ads. Its focus on information cues has the closest relation to the proposed conceptual model. Information cues are defined as categories of information that are potentially useful to consumers [9]. The majority of the studies cited refer to cross-cultural television advertisements. The most referenced set of cues is accredited to Resnik and Stern (1977). These cues spawned from the results of a content-driven investigation of 378 television commercials. The set originally included 14 cues (refer to Table 1). The long-standing validity of Resnik & Stern’s (1977) information cues provides a strong foundation for the development of a definitional framework of useful content for IT operations. According to studies conducted by Zhang et al. (2001), the education domain ranks among the top two (paired with E-commerce) based on user familiarity. After 9/11, universities noticed a dramatic drop in campus visits and needed a new method of recruitment. This and other factors have influenced the onset of e-recruitment techniques and tools. Among those tools, websites are the primary one. Now university websites contain vast amounts of information to aid visitors in performing different tasks: selecting a university or course, retrieving personal records, and paying bills.
Most research in this area addresses the website as a recruitment tool. Prospective students view university websites for information to assist in their school selection process. Therefore, the preparation of the website content should deliver information that markets the school appropriately [8]. Griffin (1999) conducted a qualitative analysis of the content provided on 16 web sites. Evaluation of these websites over a two-week period generated the list of informational cues cited in Table 1.
4 Conceptual Model All domains should consider content preparation in the design of their websites. E-commerce, Advertising, and Education have a relatively strong internet presence. However, there are other domains (i.e., Financial, Government, Medical, and Entertainment) attempting a transition to online environments. For example, e-government is the attempt to make government more citizen-friendly with well-designed websites [15]. The new domains need to identify and assess the content and functionality necessary to motivate their audience to use these websites [15]. Chan & Swatman (2002) attempted to improve university recruitment websites with lessons learned from e-commerce. They conducted a review comparing universities and their use (or potential use) of methods from the e-commerce domain. This comparison was founded on the application of Ho’s framework (1997) to the websites of universities in Australia and Hong Kong SAR. The results identified 15 different information components for education websites. This research ties the information needs of the education domain with those of the e-commerce domain. Again, this demonstration of transferability encourages the development of a definitional framework of useful information for IT operations that is not limited by domain. The elements listed by domains as important information elements are not mutually exclusive. They can be integrated to form an information structure that would provide a baseline definition for useful content. The categories have to be selected appropriately to capture all the elements and have relevance to web-based interfaces. Inspecting the elements discovered in the literature review, a subjective analysis suggested classification by eight categories:
1. Site Information – Information concerning the overall perceived quality of the information provided by the website. Information content should be frequently updated and users should be aware of when these updates occur [2].
2. Transaction Information – Information explaining different aspects of the purchasing process. This is supported by the E-commerce and Education domains. Users want to make informed purchase decisions [5], [22], [6].
3. Company Information – Information providing details on the many characteristics of a company. All domains support this predicted factor. The internet allows anyone to conduct e-business; therefore, prospective consumers require information about company characteristics [4], [7], [6].
4. Security Information – Information describing measures implemented in the website to ensure that the transfer and storage of personal data are secure. The majority of websites request personal information. Websites should describe their efforts to secure users’ information [2].
[Fig. 1. Conceptual Model — the figure maps the information needed by users (generalization) to the information developers should provide, specialized by domain. Its categories and example elements are: Site (Revision Date, Creation Date, Date of Next Update); Product (Aesthetics, Price, Availability); Shipping (Delivery Date, Tracking Number, Shipping Cost); Company (Name, Mission, Sponsors); Security (Payment Information Security); Customer Service (Help, Contact Information, Refund Policy); Transaction (Taxes, Payment Methods, Quantity); Membership (Account Status, History, Personal Information).]
5. Product Information – Information providing details about products and/or services. This predicted factor has the most content requirements. It is important because obtaining products/services is the main purpose of interaction between the user and the interface [6], [4].
6. Customer Service Information – Information describing purchase assistance and/or after-sales support. Some aspects of traditional shopping (i.e., customer service) must be retained by E-commerce. Users are concerned with services during and after the sale [7], [2].
7. Shipping Information – Information explaining the shipping process, payments, and tracking options. The content components in this predicted factor extend product awareness beyond the initial interaction with the website [22], [6].
8. Membership Information – Information pertaining to customer account status, fees, purchase history, and preferences. Most sites allow users to register accounts, which affords the desired customized web experience [9], [7], [23].
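To make the eight-category structure concrete, the following sketch encodes it as a simple checklist with a coverage function. This is an illustrative aid only, not the evaluation tool the paper proposes; the element names follow Figure 1, and the `coverage` function and the example site content are hypothetical.

```python
# Illustrative checklist (not the authors' tool): the eight content
# categories of the conceptual model, with example elements from Fig. 1.
CONTENT_MODEL = {
    "Site":             ["Revision Date", "Creation Date", "Date of Next Update"],
    "Transaction":      ["Taxes", "Payment Methods", "Quantity"],
    "Company":          ["Name", "Mission", "Sponsors"],
    "Security":         ["Payment Information Security"],
    "Product":          ["Aesthetics", "Price", "Availability"],
    "Customer Service": ["Help", "Contact Information", "Refund Policy"],
    "Shipping":         ["Delivery Date", "Tracking Number", "Shipping Cost"],
    "Membership":       ["Account Status", "History", "Personal Information"],
}

def coverage(provided):
    """Fraction of model elements found on a site, per category."""
    report = {}
    for category, elements in CONTENT_MODEL.items():
        found = [e for e in elements if e in provided]
        report[category] = len(found) / len(elements)
    return report

# hypothetical content inventory of a site under evaluation
site_content = {"Price", "Availability", "Refund Policy", "Taxes"}
print(coverage(site_content)["Product"])  # 2 of 3 Product elements present
```

A real instrument would of course weight elements by domain relevance rather than treat all categories uniformly.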
The challenge is to develop a definitional framework depicting the characteristics of useful information. Figure 1 illustrates the classification of the specific content elements noted as essential in the literature review. Useful content is defined as information that is needed to aid a user in accomplishing his/her task. The conceptual model portrays the information needed by users, which is the same information that developers should provide. This illustration will equip web designers with a guide to basic content preparation for any domain. There are general and specific views of the information elements. Please note that only a portion of the specific components are captured in Figure 1. Moreover, this model serves as a framework for the development of an evaluation tool for the effectiveness of content preparation. The tool will evaluate the developer’s interpretation and implementation of the content guide.
5 Conclusion Usability evaluation has established its ability to improve a wide range of interactive systems over the years. However, fewer than five percent of these methods have addressed the information aspect of interface design for IT operations. Content preparation has been documented as an essential phase of website design [25]. The information provided by a website is a dominant concern of users [20]. Therefore, a tool for evaluating information content is greatly needed to assess the developer’s implementation and interpretation of content preparation guidelines. The lack of literature in this area prevents the immediate construction of such a tool. A clear structure of content-specific elements was necessary for its development. This paper delivers such a structure as the theoretical foundation for the development of methods to evaluate the effectiveness of content preparation in IT operations.
References 1. Akoglu, C., Ozcan, O.: Usability evaluation of architecture based web sites. In: Proceedings of the Tenth International Conference on Human-Computer Interaction, 22-27 June 2003, Heraklion, Crete, Greece, pp. 743-747 (2003) 2. Alexander, J.E., Tate, M.A.: Web wisdom: how to evaluate and create information quality on the Web. Lawrence Erlbaum, Mahwah, NJ (1999) 3. Baierova, P., Tate, M., Hope, B.: The impact of purpose for web use on user preferences for web design features. In: Proceedings of the 7th Pacific Asia Conference on Information Systems, 10-13 July 2003, Adelaide, South Australia, pp. 1853-1872 (2003) 4. Barnes, S., Vidgen, R.: Assessing the quality of auction web sites. In: Proceedings of the 34th Hawaii International Conference on System Sciences, 3-6 January 2001, Maui, HI, p. 7055 (2001) 5. Chan, E.S.K., Swatman, P.M.C.: Web content and design: a review of e-Commerce/e-Business program sites. In: Proceedings of the 13th Australasian Conference on Information Systems, 4-6 December 2002, Melbourne, Australia, pp. 49-60 (2002) 6. Detlor, B., Sproule, S., Gupta, C.: Pre-purchase online information seeking: Search versus browse. Journal of Electronic Commerce Research 4(2), 72–84 (2003) 7. Gehrke, D., Turban, E.: Determinants of successful website design: relative importance and recommendations for effectiveness. In: Proceedings of the 32nd Annual Hawaii International Conference on System Sciences, 5-8 January 1999, Maui, HI (1999)
8. Greer, J.: Evaluating the credibility of online information: a test of source and advertising influence. Mass Communication and Society 6(1), 11–28 (2003) 9. Griffin, G.: A typology of online positioning strategies among creative programs. Available online at: http://www.ciadvertising.org/studies/student/99_fall/phd/griffin/ online paper/abstract.html (accessed 9 January 2006) (1999) 10. Hornbaek, K.: Current practice in measuring usability: Challenges to usability studies and research. International Journal of Human-Computer Studies 64(2), 79–102 (2006) 11. Huang, H., Lee, Y., Wang, R.: Quality Information and Knowledge. Prentice-Hall, Upper Saddle River (1999) 12. Ives, B., Olson, M.H., Baroudi, J.J.: The measurement of user information satisfaction. Communications of the ACM 26(10), 785–793 (1983) 13. Jones, M.Y., Pentecost, R., Requena, G.: Memory for advertising and information content: Comparing the printed page to the computer screen. Psychology and Marketing 22, 623–648 (2005) 14. Katerattanakul, P., Siau, K.: Measuring information quality of web sites: development of an instrument. In: Proceedings of the 20th International Conference on Information Systems, 12-15 December 1999, Charlotte, NC, pp. 279-285 (1999) 15. Krauss, K.: Testing an e-government website quality questionnaire: a pilot study. In: Proceedings of the 5th Annual Conference on World Wide Web Applications, 10-12 September 2003, Durban, South Africa (2003) 16. Liao, H., Proctor, R., Salvendy, G.: Content preparation for cross-cultural e-commerce: a review. Behaviour and Information Technology (2006) 17. Lu, M.Y.: Evaluating and selecting online magazines for children [Electronic Version]. Eric Digest. Available online at http://www.indiana.edu/ reading/ieo/digests/d180.html (accessed 25 March 2006) (2003) 18. Mueller: An analysis of information content in standardized vs. specialized multinational advertisements. Journal of International Business Studies 22(1) (1st Quarter), 23–39 (1990) 19. Pipino, L.L., Lee, Y.W., Wang, R.Y.: Data quality assessment. Communications of the ACM 45(4), 211–218 (2002) 20. Pitt, L.F., Watson, R.T., Kavan, C.B.: Service quality: a measure of information systems effectiveness. MIS Quarterly 19(2), 173–187 (1995) 21. Proctor, R.W., Vu, K.P.L., Salvendy, G.: Content preparation and management for web design: eliciting, structuring, searching, and displaying information. International Journal of Human-Computer Interaction 14(1), 25–92 (2002) 22. Resnik, A., Stern, B.L.S.: An analysis of information content in television advertising. Journal of Marketing (January), pp. 50-53 (1977) 23. Salvendy, G., Fang, X.: Siemens report: guidelines and rules for design of e-business. Purdue University (2001) 24. Strong, D.M., Lee, Y.W., Wang, R.Y.: 10 potholes in the road to information quality. Computer 30(8), 38–46 (1997) 25. Vu, K., Proctor, R.W.: Web site design and evaluation. In: Salvendy, G. (ed.) Human Factors and Ergonomics, 3rd edn., John Wiley and Sons, Inc., New York, NY (2006) 26. Zhang, P., von Dran, G., Blake, P., Pipithsuksunt, P.: Important design features in different website domains: an empirical study of user perceptions. e-Service Journal 1(1), 77–91 (2001)
Traces Using Aspect Oriented Programming and Interactive Agent-Based Architecture for Early Usability Evaluation: Basic Principles and Comparison Jean-Claude Tarby1, Houcine Ezzedine2, José Rouillard1, Chi Dung Tran2, Philippe Laporte1, and Christophe Kolski2 1
Laboratoire LIFL-Trigone, University of Lille 1, F-59655 Villeneuve d’Ascq Cedex, France {jean-claude.tarby, jose.rouillard, philippe.laporte} @univ-lille1.fr 2 LAMIH – UMR8530, University of Valenciennes and Hainaut-Cambrésis, Le Mont Houy, F-59313 Valenciennes Cedex 9, France {houcine.ezzedine, chidung.tran, christophe.kolski} @univ-valenciennes.fr
Abstract. Early evaluation of interactive systems is currently the subject of numerous research efforts. Some of them aim at explicitly coupling design and evaluation through various software mechanisms. We describe in this paper two approaches to early evaluation that exploit new technologies and paradigms. The first approach is based on aspect-oriented programming; the second one proposes an explicit coupling between an agent-oriented architecture and evaluation agents. These two approaches are globally compared in this paper. Keywords: Human-computer interaction, Early evaluation, Usability, Traces, Agent-based architecture, Aspect oriented programming.
1 Introduction Interactive systems evaluation has been a very rich research and application domain since the seventies. To evaluate how people use interactive applications, many techniques and methods can be applied [18,19,22]. Some of them are widely used in companies and universities. New methods, or variants of methods, appear and are tested progressively, in line with new needs and specificities arising from the emergence of information and communication sciences and technologies. Among them, automatic and semi-automatic methods and tools are considered promising (see [16,17]). This paper concerns early usability evaluation, which is also currently particularly studied in the HCI literature. We present in this paper two complementary approaches contributing to early evaluation during the first stages of a project. It is a question of explicitly coupling design and evaluation by exploiting innovative technologies and paradigms. This coupling aims to produce traces of how interactive applications are used, in order to evaluate their utility and usability, for example by comparing prescribed and J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 632–641, 2007. © Springer-Verlag Berlin Heidelberg 2007
Traces Using Aspect Oriented Programming and Interactive Agent-Based Architecture
633
effective tasks. The first approach exploits the paradigm of aspect-oriented programming to integrate trace mechanisms into interactive applications. The concept of trace has been the subject of various studies in HCI [16]. The second approach proposes an explicit coupling between the agents constituting an agent-based architecture and several evaluation agents. These two approaches are first described; then, they are compared.
2 First Approach for Early Usability Evaluation: Injection of Trace Mechanisms by Aspects 2.1 Aspect-Oriented Programming Aspect-Oriented Programming (AOP) is a programming paradigm that appeared in the mid-nineties at Xerox PARC. AOP should be perceived as an extension of Object-Oriented Programming: complementary generic mechanisms significantly improve the separation of concerns within applications [14]. In a traditional approach, the business objects locally manage their technical constraints (identification/authentication, security, transactions, data integrity...). The duplication of these crosscutting elements in class methods leads to a phenomenon of dispersion and interlacing of system-level concerns and increases the complexity of the code. AOP allows the modularization of these elements through the addition of a new dimension of modularity, the aspect. The scope of the crosscutting concerns supported by AOP exceeds that of current solutions such as EJBs. Join point, advice, aspect, and pointcut are the principal concepts introduced by AOP:
• A join point represents a particular location in the flow of the program instructions (beginning or end of a method execution, read or write access to a field...).
• Advices are methods which are activated when precise join points are reached: the weaving mechanism inserts the advice calls into the initial code either statically (at compile time) or dynamically (during execution). An advice can execute before, after, or around the join point.
• An aspect is a module which associates advices with join points by means of pointcuts.
• Pointcuts are used to define the set of join points that will activate an advice. Furthermore, a pointcut allows capturing the execution context of join points.
For a method call, this context includes the target object, the arguments of the method, and a reference to the returned object, all of which is most useful for injecting trace mechanisms. Based on the principle of inversion of control (IoC), AOP thus extracts the dependences on technical concerns from the business code by locating them in aspects and managing them from outside through the weaving mechanism. It consequently becomes possible to focus on business logic.
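The advice/join-point idea can be approximated in a few lines of Python, even though the paper's implementation uses AspectJ. In this hypothetical sketch, the "join point" is a method call, the wrapper plays the role of an around advice, and rebinding the method stands in for weaving; the `Account` class and the trace format are illustrative, not taken from the paper.

```python
# Minimal, hypothetical analogue of AOP "around" advice: the trace
# concern stays outside the business class and is woven in by rebinding
# the method, so the business code itself remains untouched.
import functools

trace_log = []

def around_advice(cls, method_name):
    """Weave a tracing wrapper around one join point (a method call)."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):           # advice body
        trace_log.append(("before", method_name, args))
        result = original(self, *args, **kwargs)  # proceed with the join point
        trace_log.append(("after", method_name, result))
        return result

    setattr(cls, method_name, wrapper)

class Account:                                    # business code: no trace logic
    def __init__(self):
        self.balance = 0
    def deposit(self, amount):
        self.balance += amount
        return self.balance

around_advice(Account, "deposit")                 # the "weaving" step
Account().deposit(50)
print(trace_log)  # [('before', 'deposit', (50,)), ('after', 'deposit', 50)]
```

Unlike AspectJ, this weaving happens only at run time and offers no pointcut language, but it conveys why the business class needs no knowledge of the tracing concern.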
Moreover, AOP proposes the mechanism of introduction. The latter allows the modification of classes, interfaces, or even existing aspects: it is possible to inject a method or an attribute into a class, to add an inheritance relation, or to specify that a class implements a new interface. For example, with the objective of automatically sorting a collection of Java class instances, an aspect can declare that the class implements the Comparable interface and inject the required compareTo method into it. 2.2 Traces by Aspects Thanks to the principle of separation of concerns, AOP can inject trace mechanisms into existing applications (cf. Figure 1, step c) by writing aspects (step d) which, on the one hand, listen to user actions, method calls, changes in data values, etc., and on the other hand produce the traces. These aspects are then weaved with the initial code (step e), which remains intact. The code produced by weaving then contains the initial code and the code of the aspects (step f). The initial application can be used completely normally without the aspects, or be traced with them (step g). The trace mechanism can thus be disengaged without any effect on the initial code.
[Fig. 1. Injection of trace mechanisms by aspects — the figure shows the application to be traced (c), the trace aspects and trace formats written with AspectJ (d), the weaving of the aspects with the initial application (e, f), the traced interactions (g), the generated traces (h), and their data analysis (i, j).]
To produce a trace we need three types of information: the data to be traced, when to produce the trace, and where to store it. Traced data mainly relate to the functional core (and consequently the associated tasks) and the user interface (actions from the user, but also displayed data…). For example, it is possible to trace the beginning, the end, or the interruption of a task, the opening of a window, the selection in a dropdown list, etc. Because our work is use-oriented, it is easier to trace the actions of the user when the functional core and the user interface are built from a task-oriented
design method. Thus, if the application is designed with an evaluation-oriented approach as presented in [23], it is easy to recover other data such as the execution context of the tasks, the role of the user (in CSCW, for example), etc. Most of the time, traces are produced when a method is called or at the end of its execution, and these methods may be associated with tasks. AOP provides all the required services for the production of traces (cf. the before and after keywords in AOP). Moreover, it is very easy to parameterize the production of traces, for example to produce them in a dedicated thread, or only if a condition is true. Today the traces are generated in XML files (step h) whose contents are parameterized by a set of formats also written in XML (step d). This allows us to generate traces in different formats while emitting the same information from the traced application. Although we favor traces in XML format, the external definition of formats will make it possible to generate very compact textual files (not XML). With our approach the exploitation of traces is facilitated because we choose the data that we want to trace, as well as the format of the result, contrary to approaches based on log files. The analysis of traces (step i) produces statistics, task models (step j), filtered information, etc. This side of our work is not presented in this paper. At the moment this analysis is done after the production of traces, but we plan to perform real-time analysis in the future (for adaptation of the application, to advise the user, etc.). Our work is similar to works such as [2,5,6,9,10,11,24]. It uses AspectJ [4], but it could be done with other languages supporting AOP such as [3,21,25].
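The idea of trace records whose shape is driven by an externally defined format can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the format dictionary, field names, and events are invented for the example.

```python
# Illustrative sketch: the same stream of trace events can be serialized
# into different XML layouts by swapping the external format definition.
import xml.etree.ElementTree as ET

# assumed format specification (stands in for the paper's XML format files)
TRACE_FORMAT = {"root": "trace", "event": "event",
                "fields": ["task", "action", "timestamp"]}

def to_xml(events, fmt=TRACE_FORMAT):
    """Serialize a list of event dicts according to an external format."""
    root = ET.Element(fmt["root"])
    for ev in events:
        node = ET.SubElement(root, fmt["event"])
        for field in fmt["fields"]:
            node.set(field, str(ev[field]))
    return ET.tostring(root, encoding="unicode")

# hypothetical events captured by trace aspects
events = [{"task": "open_file", "action": "click",  "timestamp": 0},
          {"task": "open_file", "action": "select", "timestamp": 2}]
print(to_xml(events))
```

Changing `TRACE_FORMAT` (element names, retained fields) changes the output file without touching either the traced application or the event-producing aspects, which is the point of externalizing the format.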
3 Second Approach for Early Usability Evaluation: Interactive Agent-Based Architecture and Evaluation Module 3.1 Agent-Oriented Architecture for Interactive Systems Several architecture models have been put forward by researchers over the past twenty years. Two main types of architecture can be distinguished: architectures with functional components (Language, Seeheim, Arch, and their derived models) and architectures with structural components (PAC and its derived models [7], the MVC model (Model-View-Controller, from Smalltalk) and its recent evolutions, AMF and its variants [20], H4 [8]…). The classic models of interactive systems distinguish three essential functions (presentation, control, and application). Some models (such as the Seeheim and Arch models) consider these three functions as three distinct functional units. Other approaches use structural components; in particular, the so-called distributed or agent-based approaches suggest grouping the three functions together into one unit, the agent. These architecture models share the same principle, based on a separation between the system (application) and the interface. Thus, an architecture must separate the application and the interface, define a distribution of the services of the interface, and define an exchange protocol. The interest of separating the interface from the application is to facilitate modifications to the interface without
[Fig. 2. An agent-oriented architecture for interactive systems — the figure shows the application, the application agents, the dialogue control agents, and the interface agents, arranged in that order between the application and the user.]
touching the application. Figure 2 proposes a comprehensive framework for such an architecture [12,15], showing a separation into three functional components, called respectively: interface with the application (connected to the application), dialogue controller, and presentation (this component being in direct relation with the user). These three components group together agents:
− the application agents, which handle the domain concepts and cannot be directly accessed by the user. One of their roles is to ensure the correct functioning of the application and the real-time dispatch of the information necessary for the other agents to perform their tasks;
− the dialogue control agents, also called mixed agents, which provide services for both the application and the user. They are intended to guarantee coherency in the exchanges from the application towards the user, and vice versa;
− the interactive agents (or interface agents), which, unlike the application agents, are in direct contact with the user (they can be seen by the user). These agents coordinate among themselves in order to intercept the user commands and to form a presentation which allows the user to gain an overall understanding of the current state of the application. In this way, a window may be considered as an interactive agent in its own right; its specification describes its presentation and the services it is to perform.
3.2 Principle of Coupling Between an Agent-Based Architecture and Evaluation Agents Our starting objective was to propose a tool for collecting objective data, adapted to agent-based interactive systems. This tool corresponds to an electronic informer; it consists of a program, invisible to the user (of the system to be evaluated), which captures and records all the interactions (actions of the operator and reactions of the system) in a database.
The exploitation of this database then aims at providing the evaluator with data and statistics enabling him/her to draw conclusions about various aspects of utility and usability.
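The exploitation step described above can be sketched in a few lines of Java. This is our own minimal illustration, not the authors' tool: the record type, method names and event labels are hypothetical, and an in-memory list stands in for the database of recorded interactions; the point is only how timestamped trace records yield evaluator-facing statistics such as task duration or the number of help requests.

```java
import java.util.*;

// Minimal sketch (hypothetical names): each trace record pairs an event
// label with a timestamp; evaluator-side statistics are derived from the log.
public class TraceStats {
    record TraceEvent(String label, long timestampMs) {}

    // Duration between the first and last recorded event of a task.
    static long taskDurationMs(List<TraceEvent> trace) {
        if (trace.isEmpty()) return 0;
        return trace.get(trace.size() - 1).timestampMs() - trace.get(0).timestampMs();
    }

    // Count occurrences of a given event type, e.g. help requests or errors.
    static long count(List<TraceEvent> trace, String label) {
        return trace.stream().filter(e -> e.label().equals(label)).count();
    }

    public static void main(String[] args) {
        List<TraceEvent> trace = List.of(
            new TraceEvent("task:start", 0),
            new TraceEvent("help:request", 4_000),
            new TraceEvent("task:end", 10_000));
        System.out.println(taskDurationMs(trace));        // 10000
        System.out.println(count(trace, "help:request")); // 1
    }
}
```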
Traces Using Aspect Oriented Programming and Interactive Agent-Based Architecture
637
Fig. 3. Principle of coupling between agent-based architecture of the interactive system and its evaluation [26]
Since this informer is dedicated to the evaluation of agent-based interactive systems, it must be closely related to the architecture of the system to evaluate [13,26]. We are particularly interested in the interactive agents. This electronic informer (figure 3) consists of several informer agents derived from the architecture of the system to evaluate, and more particularly from the multi-agent system handling presentation. It is based primarily on the acquisition of information and data specific to the system to be evaluated (actions of the user and reactions of the system). These data make it possible to rebuild the tasks really carried out by the user (a posteriori mode) and to confront them with the model of the tasks to be carried out (a priori mode), according to the confrontation principles described in [1]. Suppose a presentation module made up of 6 interactive agents (each able to interact with the user): 6 evaluation agents will be instantiated and connected to the interactive agents. During the interactions with the user, the 6 evaluation agents memorize in real time the data concerning the interaction between the user and the 6 interactive agents. After the tasks have been carried out, these data are analyzed automatically; using a specific user interface dedicated to the evaluator, the data are then presented to him/her off-line. They can range from a low level, corresponding to simple user or system events, to higher levels (for example, the task level). Examples are available in [26].
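The coupling principle of one evaluation agent per interactive agent can be rendered as a simple observer pattern. This is our own hedged sketch, not the authors' C++ implementation: the interfaces and class names below are invented for illustration, and the "agents" are reduced to plain objects so that the essential idea stands out — the evaluation agent silently records every interaction of the interactive agent it is connected to, without changing that agent's behaviour.

```java
import java.util.*;

// Sketch of the coupling in Fig. 3 (hypothetical names, not the paper's code):
// one evaluation agent per interactive agent, recording observed interactions.
public class AgentCoupling {
    interface InteractionListener { void onInteraction(String agent, String event); }

    // An evaluation agent memorizes, in order, the events of the agent it observes.
    static class EvaluationAgent implements InteractionListener {
        final List<String> log = new ArrayList<>();
        public void onInteraction(String agent, String event) {
            log.add(agent + ":" + event);
        }
    }

    // An interactive agent notifies its attached evaluation agent on each
    // user action, while its own services remain unchanged.
    static class InteractiveAgent {
        final String name;
        final InteractionListener listener;
        InteractiveAgent(String name, InteractionListener l) { this.name = name; this.listener = l; }
        void userAction(String event) { listener.onInteraction(name, event); }
    }

    public static void main(String[] args) {
        EvaluationAgent eval = new EvaluationAgent();
        InteractiveAgent window = new InteractiveAgent("window1", eval);
        window.userAction("click:ok");
        System.out.println(eval.log); // [window1:click:ok]
    }
}
```

With six interactive agents, six such `EvaluationAgent` instances would be created and attached, mirroring the example in the text.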
4 Comparison Between the Two Approaches
A comparison of the two approaches is given in Table 1. The two approaches have common objectives: to gather data to compare predicted tasks and actual activity, and to highlight utility and usability problems. The ways of obtaining these data differ between the approaches.
638
J.-C. Tarby et al.
From the point of view of integration into software engineering, the two approaches require particular specifications. Approach 1 (the AOP approach) needs to know the methods and data that can be traced, as well as the formats of the traces; this information can be collected during the specification phase or after the implementation. Approach 2 (the agent approach) requires the specification of the elements of the interactive system and of the evaluation agents. No particular architectural design is required for the AOP approach, but the agent approach requires that the design of the interactive system architecture be based on interface agents, as well as the establishment of connections between the interactive agents and the evaluation agents. Concerning the implementation, the AOP approach automatically generates the code of the aspects and weaves it with the initial code of the application to be traced; the agent approach requires programming the services of the interactive system agents and the evaluation agents.
From the user-centred evaluation point of view, in addition to the fact that the two approaches can be coupled with other techniques such as interviews, eye tracking, etc., they use different modes to gather data: with the AOP approach, data are automatically collected by executing the code issued from weaving the aspects into the initial application code; with the agent approach, data are collected by the evaluation agents observing the interactions between the interface agents and the user. To be collected with the AOP approach, data must be accessible through a method (in the object-oriented programming sense); this method can be public, inherited, etc. Time is accessible in the same way. Data collected with the agent approach are potentially multiple (cf. Table 1). In their current versions the approaches use different languages: the AOP approach uses Java and AspectJ; the agent approach is based on C++.
In the future, it is expected that the AOP approach will be extended to other languages supporting AOP, such as PHP, C++, etc., and that the agent approach will use Java. Concerning the types of application, the AOP approach can currently trace any application written in Java and supporting AspectJ. However, the traced applications are today mainly interactive applications (WIMP1 applications). In the future, it is planned that the AOP approach will be applied to information systems, distance learning applications, and mobile applications. The agent approach is currently applied to information systems used in a context of supervision of a bus and tramway network. In the future, it should target any type of information system. The advantage of these two approaches is that they provide principles and mechanisms facilitating and prompting early evaluation. In addition, the AOP approach keeps the initial code intact, thus allowing the realization of the application and the realization of the trace mechanisms to be conducted in parallel and/or serially. The disadvantages are as follows. With the agent approach, it is for the moment difficult to define the optimal number of evaluation agents (the first version contained one evaluation agent per interface agent; the new version will contain only one); there is also a need for new user interface design methods envisaging a coupling between interface agents and evaluation agents. To be more effective, the AOP approach needs
1 Window, Icon, Mouse, Pull-down menu.
Table 1. Comparison between the two approaches

| | AOP approach: injection of mechanisms of traces by aspects | Agent approach: coupling of interface agents and a module of automatic acquisition |
Traditional stages of software engineering:
| Preliminary or feasibility study | Explicit consideration of early evaluation in the project | Explicit consideration of early evaluation in the project |
| Specification | Specification of: interactive system, parameters to be traced, formats of traces | Specification of: interactive system agents, evaluation agents |
| Architectural design | (empty) | Design of the interactive system architecture based on interface agents; connections between interactive system agents and evaluation agents |
| Coding | Generation of the code of the aspects and weaving between the code to be traced and the aspects | Coding of the services of the interactive system agents and evaluation agents |
User-centred evaluation (simultaneously with other possible methods: interviews, eye tracking, questionnaire, etc.):
| Interaction data gathering | Execution of the weaved code | Espionage by the evaluation agents of the interactions between interface agents and the user |
| Collected data | Any data accessible by a method (in the meaning of object-oriented programming) + time | User and system events, errors, time of task execution, unused objects, number of help requests… |
| Goals | Depends on how traces are exploited: gathering data to compare predicted tasks and real activities, highlighting problems of utility and usability… (both approaches) | |
| Languages — current | Java with AspectJ | C++ |
| Languages — intended | Any language supporting AOP | Java |
| Types of application — current | WIMP applications | Information systems used in a context of supervision of a network of bus and tramway |
| Types of application — intended | Information systems, distance learning applications, mobile applications | Information systems |
design methods integrating aspects for the evaluation. That means, for example, that any potentially traceable data must be accessible through object methods.
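The interception idea behind the AOP approach can be illustrated without the AspectJ weaver, whose `ajc` compiler a self-contained example cannot assume. The sketch below is our own analogue using a plain JDK dynamic proxy (all names are hypothetical): every call on the traced interface is logged with its method name and a timestamp before being delegated, leaving the original code untouched — the same property the text attributes to aspect weaving, and a concrete instance of "data must be accessible by a method".

```java
import java.lang.reflect.*;
import java.util.*;

// Dynamic-proxy analogue of trace injection (illustrative, not the paper's
// generated AspectJ code): calls are logged, then delegated unchanged.
public class TraceProxy {
    interface Sale { int total(int units, int price); }

    static <T> T traced(Class<T> iface, T target, List<String> log) {
        return iface.cast(Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[]{iface},
            (proxy, method, args) -> {
                // record the traced method name together with a timestamp
                log.add(method.getName() + "@" + System.currentTimeMillis());
                return method.invoke(target, args); // delegate unchanged
            }));
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Sale sale = traced(Sale.class, (units, price) -> units * price, log);
        System.out.println(sale.total(3, 10)); // 30
        System.out.println(log.size());        // 1
    }
}
```

With AspectJ proper, the equivalent would be a generated aspect whose pointcut matches the traced methods; the proxy merely makes the interception mechanics visible in standard Java.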
5 Conclusion
The early evaluation field is the subject of active research in the HCI community. For our part, we work on two complementary approaches. The first is based on aspect
oriented programming; it allows the injection of trace mechanisms into existing applications. The second is based on new possibilities offered by agent-based approaches; it aims at ensuring a coupling between agent-based architectures and evaluation agents. Although directed towards the same objectives in terms of evaluation, these two approaches have different characteristics, advantages and disadvantages, which were compared in this paper. For both approaches, the research perspectives are numerous: it is important to study adapted design methods, to improve the current mechanisms, and to test them in various application domains.
Acknowledgments. The present research work has been supported by the "Ministère de l'Education Nationale, de la Recherche et de la Technologie", the "Région Nord Pas-de-Calais" and the FEDER (Fonds Européen de Développement Régional) during the projects SART, MIAOU and EUCUE. The authors gratefully acknowledge the support of these institutions.
References
1. Abed, M., Ezzedine, H.: Vers une démarche intégrée de conception-évaluation des systèmes Homme-Machine. Journal of Decision Systems 7, 147–175 (1998)
2. Aksit, M., Bergmans, L., Vural, S.: An object-oriented language-database integration model: the Composition-Filters approach. In: Madsen, O.L. (ed.) ECOOP 1992. LNCS, vol. 615, pp. 372–395. Springer, Heidelberg (1992)
3. aoPHP, Aspect Oriented PHP, http://www.aophp.net
4. AspectJ project, http://www.eclipse.org/aspectj/
5. Balbo, S., et al.: Project WAUTER (Website Automatic Usability Testing EnviRonment), http://wauter.weeweb.com.au
6. Champin, P.-A., Prié, Y., Mille, A.: MUSETTE: Modeling USEs and Tasks for Tracing Experience. In: Ashley, K.D., Bridge, D.G. (eds.) ICCBR 2003. LNCS, vol. 2689, pp. 279–286. Springer, Heidelberg (2003)
7. Coutaz, J.: PAC, an object-oriented model for dialog design. In: Bullinger, H.-J., Shackel, B. (eds.) Interact'87, 2nd IFIP International Conference on Human-Computer Interaction, September 1–4, Stuttgart, Germany, pp. 431–436 (1987)
8. Depaulis, F., Jambon, F., Girard, P., Guittet, L.: Le modèle d'architecture logicielle H4: principes, usages, outils et retours d'expérience dans les applications de conception technique. Revue d'Interaction Homme-Machine (RIHM) 7, 93–129 (2006)
9. Ducasse, S., Gîrba, T., Wuyts, R.: Object-oriented legacy system trace-based logic testing. In: Proceedings of the 10th European Conference on Software Maintenance and Reengineering (CSMR 2006). IEEE Computer Society Press, Washington (2006)
10. Egyed-Zsigmond, E., Mille, A., Prié, Y.: Club (Trèfle): a use trace model. In: Ashley, K.D., Bridge, D.G. (eds.) ICCBR 2003. LNCS, vol. 2689, pp. 146–160. Springer, Heidelberg (2003)
11. El-Ramly, M., Stroulia, E., Sorenson, P.: Mining system-user interaction traces for use case models. In: Proceedings of the 10th International Workshop on Program Comprehension (IWPC'02), Paris, France (27–29 June 2002)
12. Ezzedine, H., Kolski, C., Péninou, A.: Agent oriented design of human-computer interface. Application to supervision of an urban transport network. Engineering Applications of Artificial Intelligence 18, 255–270 (2005)
13. Ezzedine, H., Trabelsi, A., Kolski, C.: Modelling of an interactive system with an agent-based architecture using Petri nets, application of the method to the supervision of a transport system. Mathematics and Computers in Simulation 70, 358–376 (2006)
14. Filman, R., Elrad, T., Clarke, S., Aksit, M.: Aspect-Oriented Software Development. Addison-Wesley Professional, London (2004)
15. Grislin-Le Strugeon, E., Adam, E., Kolski, C.: Agents intelligents en interaction homme-machine dans les systèmes d'information. In: Kolski, C. (ed.) Environnements évolués et évaluation de l'IHM, IHM pour les SI 2, Éditions Hermes, Paris, pp. 207–248 (2001)
16. Hilbert, D.M., Redmiles, D.F.: Extracting usability information from user interface events. ACM Computing Surveys 32, 384–421 (2001)
17. Ivory, M., Hearst, M.: The state of the art in automated usability evaluation of user interfaces. ACM Computing Surveys 33, 173–197 (2001)
18. Jacko, J.A., Sears, A.: The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications (Human Factors and Ergonomics). Lawrence Erlbaum Associates, London (2002)
19. Nielsen, J.: Usability Engineering. Academic Press, Boston (1993)
20. Ouadou, K.: AMF: un modèle d'architecture multi-agents multi-facettes pour Interfaces Homme-Machine et les outils associés. Ph.D. Thesis, Ecole Centrale de Lyon (1994)
21. PHPAspect, http://phpaspect.org/
22. Sweeney, M., Maguire, M., Shackel, B.: Evaluating user-computer interaction: a framework. International Journal of Man-Machine Studies 38, 689–711 (1993)
23. Tarby, J.-C.: Evaluation précoce et conception orientée évaluation. In: Proceedings of ErgoIA (Biarritz, France, 11–13 October 2006), ESTIA and ESTIA.Innovation, Biarritz, pp. 343–346 (2006)
24. The Compose* project, http://janus.cs.utwente.nl:8000/twiki/bin/view/Composer/
25. The Java Aspect Components (JAC) project, http://jac.objectweb.org/
26. Trabelsi, A.: Contribution à l'évaluation des systèmes interactifs orientés agents: application à un poste de supervision du transport urbain (in French). PhD Thesis, University of Valenciennes and Hainaut-Cambrésis, Valenciennes, France (2006)
Usability and Software Development: Roles of the Stakeholders Tobias Uldall-Espersen and Erik Frøkjær Department of Computing, University of Copenhagen Universitetsparken 1, DK-2100 Copenhagen {tobiasue, erikf}@diku.dk
Abstract. Usability is a key issue when developing software, but how to integrate usability work and software development continues to be a problem that the stakeholders must face. This study aims at developing a more coherent and realistic understanding of the problem, based on 14 interviews in three case studies. The results indicate that usability during software development has to be considered with both a user interface focus and an organizational focus. Techniques to support the uncovering of organizational usability, in particular, are lacking in both human-computer interaction and software engineering. Further, the continued engagement of stakeholders who carry the vision about the purpose of change stands out as a critical factor for the realization of project goals.
1 Introduction
Integrating usability work into software development is not easy [3]. It requires a thorough understanding of usability work methods and software development practices to reach a proper integration, but this understanding seems insufficient when aiming at improving end product usability. Despite heavy investments in information technology we observe deficiencies in practical usability work and a significant lack of impact [4]. Even current research fails to explain why [7]. This paper reports from a study combining an organizational and an individual approach to understanding and exploring the problem. By selecting this approach we seek an understanding of how organizational issues and stakeholders in the organization influence end product usability.
2 Method
We have conducted an interview study to explore how usability work and software development are handled under different organizational settings. We looked at cases in three companies where useful and usable software was developed. From each company four stakeholders covering the roles of users, software developers, technical project managers, and business decision makers were interviewed. To prevent influencing the respondents up front, they were not informed about our special interest in usability. In one of the companies usability was forced into the software
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 642–651, 2007. © Springer-Verlag Berlin Heidelberg 2007
development process by two stakeholders: a graphical user interface designer and a business representative responsible for requirement specification, test planning and user education. These two persons were interviewed as well. The main research question was how practitioners in software development projects work with usability, and what we can learn from their practices. All interviews had the same interview guide as their starting point, but there were significant differences in how they progressed. The interview guide covered four themes: (1) the software development process; (2) software quality; (3) developing usable software; (4) general experiences with development of usable and useful software products. During the interviews themes 1 and 3 were given most attention, and themes 1, 2 and 3 were all discussed based on one specific software development project significant to the interviewees and their organization. Each interview took 60–90 minutes. The interviews were transcribed and analyzed using elements from grounded theory [5]. During the analyses we looked for information that directly or indirectly related to usability, for instance statements about stakeholders' perception of usability, descriptions of usability-related activities, and non-usability-related issues that influenced end product usability.
2.1 Usability as a Concept
Our data suggest that usability is treated with different goals in mind in the various development projects and their organizational contexts. This leads us to look further into the relevance and practical conditions of conducting usability work in software development projects, in order to examine the various stakeholders' roles and the possible risks regarding realization of the full potential of the solution.
The ISO 9241-11 standard defines usability as: "The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use." Using this definition, usability depends on four variables: a product, specified users, specified goals, and a specified context of use. Following our organizational approach, we observed how specified goals had significant influence on the handling of usability. We found this important since these goals existed more or less autonomously of the product, the users and the context of use; three variables on which traditional usability work often has a special focus. Various stakeholders formulated goals, and their direct or indirect roles shaped each case significantly. We found it useful to distinguish between two groups of stakeholders: the users, i.e. persons who interact with the system, and the other stakeholders, i.e. persons who are directly or indirectly affected by the system or have important interests regarding it. Our data suggest that usability work is oriented towards two different dimensions, which are related to the various goals in the development project, among the stakeholders, and in the organization. The two dimensions found were: (1) usability work oriented towards the user interface or user interests, which we refer to as user interface usability; (2) usability work oriented towards the organization or other stakeholders, which we refer to as organizational usability. We observed incidents where the interests of the two usability orientations coincided and incidents where they differed, which supports our assumption about the importance of analyzing these two dimensions.
3 Results
The cases had both strong similarities and differences. All projects were based on web technology, and all were considered quite successful by the interviewees. In relation to their organizations the developed applications were innovative, and they both influenced and were influenced by their organizations. All the systems had various user groups and groups of people who were influenced by the systems. The systems were all initiated centrally, and anchoring the systems locally in the organizations was a challenge. By nature the systems were very different. Two systems were custom-developed by external contractors, and an in-house development team developed one system.
Case 1: Development of a new insurance sales tool. This case regards the development of a new sales tool for two groups of users: insurance agents and customer service persons. The tool was developed in-house over a period of 18 months. At its peak, 25 employees were working on the project. About 400–500 employees would be using the tool. The two user groups had significantly different requirements, as the insurance agents were selling at the customers' locations, typically in their homes, and the customer service persons serviced customers over the phone. The project management team did not consider it possible to make two different interfaces, and considerable efforts were made to make one suitable interface. The sales tool was built as a front-end to two large insurance administration systems, and it was a challenge to avoid letting administrative procedures inform the design. A customer-centred approach was taken and all possible stakeholders were involved. The aim was to ensure the users the best possible tool, and the main improvements were a better quality of data and an improved general view of the customers and their households. The company had a strongly centralized organization rooted at the head office, but employees at five regional offices generated the majority of the sales.
A main challenge was to avoid that the tool became "another head office idea", and a considerable effort was made to ensure that the tool was firmly anchored locally. The project was innovative and utilized new technology, such as wireless access to the back-end systems and other relevant systems, e.g. the national civil registration number register. The new technology also caused severe technological and usability problems. The company did not use a formal software development method, and usability was not prioritized initially in the project. Two stakeholders strongly insisted on taking usability seriously, and they gradually succeeded in making usability a significant and comprehensive part of the project. The project management team took a risk by yielding control of the process and allowing anyone involved in the project to have an opinion and express it. The software developer described the space for communication this way: "We had our arguments and we have been bloody angry at each other, close to physical fights, but it is like that to integrate systems if you ask me, and I find it great that we could ... we really could go directly to each other and say that this is really annoying. Can't we ... I think this is foolish ... but I think this is foolish ... why aren't you done now ... why shall I be done now, and so on. There we really had a very close collaboration." So, space was made for rewarding discussions and iterations, but the downside was that much decision making became very time consuming.
Case 2: Developing a new IT platform for a political organization. This case regards the development of a new IT-platform for a political organization. The
IT-platform was custom-developed by an IT-contractor in close co-operation with the central office of the organization. The co-operation continued over several years, during which components were continuously delivered and put into use. The project team consisted of six or seven persons from the contractor and the customer's organization. The organizational leaders had strong visions about modernizing the organization, and the new IT-platform was a key tool to fulfil this vision. There were strong economic incentives in the project as well. The IT-platform should serve two purposes. First, it should replace an existing but outdated communication platform used by 2,000–3,000 members; otherwise a costly renewal of the license to the old communication platform was needed, which was not a realistic option. Introducing a new platform should also help open up the organization and make it more attractive to new members. The new platform included an advanced CMS system available to all members (about 50,000) and specific tools for running effective and professional election campaigns. Second, the IT-platform should serve as a new tool for membership administration, which would be decentralized and handed over to the local chapters of the organization. Membership administration includes issues like collection of dues, signing members up for courses and the national congress, and internal polling functionality. The contractor applied a highly agile and strongly business process oriented approach to the development. This was a key success factor, since external events periodically removed the customer's focus completely from the project and changed the short-term goals. A very special contract was made between the contractor and the customer's organization. No formal requirement specification was agreed upon, but a vision was developed, thoroughly discussed in the management group, and written down.
The customer's project manager describes it this way: "We ended up writing up a two-page contract and some enclosures, which essentially stated that we could put the deliveries into use when we were satisfied, and when we did so we paid. The whole issue of accepting that they had delivered what we needed was handed over to us, by stating 'our experience is, that you only pay if you are satisfied, so let us put that into the contract.' Thus, it was completely up to us to decide when things were approved, but it could not be put into use before it was accepted. This model does not work in all projects, but it was extremely operational in relation to what we were going through." The agreement was governed by a fairness principle ensuring that the customer's organization and the contractor treated each other respectfully, and this converted potential conflicts into win-win situations. According to one of the key persons, "enlightened despotism" dominated within the customer's organization, and only three stakeholders were thoroughly involved in the project.
Case 3: Developing a coherent physical and electronic department store. This case regards the development of a new website for a department store with a number of local houses. The website was developed by an external contractor specialized in user-centred web development. The customer's organization did little to involve itself in the project. The customer considered the solution to be a high-class web solution, and it was technically efficient, but it was poorly anchored in the customer's organization. The contractor's information architect experienced the lack of anchoring this way: "... they might not all have much notion about what this website should be used for, and they also had different positions. The commercial
manager had another position than the marketing manager, who had another position than the loyalty manager. And then ... they need to clarify it internally, and then they can come to us, because we are going to make something they can use for what they have agreed the system to be used for." The project was completed within five months, and five persons from the contractor were core project members. The unique aspect of this case was the idea of creating a coherent solution where the physical and electronic worlds supplemented each other, in order to maintain a leading role in the physical department store market in Denmark and, if possible, also establish a position in the web-shop market. Two different goals were formulated. The first goal was to enable department store customers to buy articles in a traditional web-shop, and this was given most attention by the development team. This was a limited success, since only about 1 out of 1000 articles from the physical stores was available in the web-shop when it opened. It proved to be a non-trivial task to add articles to the web-shop and to ensure that the organization was able to handle the logistics. The second goal was to present information and to inspire potential customers to buy articles in the physical shops, which was the primary goal according to the business representative. A large effort was put into unifying these two goals. A combined physical and web-based fashion magazine was created, and when searching for products at the website, the search function returned information including the physical placement of the articles in the department stores. The development process was split into three phases, each sold individually to the customer. This was an efficient way to keep the project on track, but some economic surprises did occur. Most significant was the surprise when the cost of following the strict HTML 1.0 standard was summed up.
This standard had not previously been followed, and the budget for the HTML development was blown without adding significant quality to the usability of the solution. Furthermore, the customer wanted neither to pay for a thorough analysis of the target group, i.e. department store customers, nor for a final user test. These cost savings watered down the user-centred process.
3.1 Cross-Analyzing the Cases
Our data suggest three different approaches across the cases, which we use as starting points for analyzing and comparing them. Each approach seems to have, or could have, a significant impact on the usability of the end product. The approaches are: (1) the existence or development of living visions or organizational goals in the organizations; (2) the technology used to implement the system and the technical context in which it was implemented; (3) the shaping of the software development process.
Approach 1: The existence or development of living visions or organizational goals in the organizations. All three cases were influenced by visions or organizational goals, but the effect of these was very different. In case 1, two main goals were important. (1) Tying customers closer to the company by selling products from more than one branch of the company. This goal was pursued by making the tool customer-centred and by making it easy to refer customers to other branches. (2) Following best practices when selling insurance products. This was done by never leaving the customers with obvious needs that were not treated in the sales process. The treatment was documented in the printed policy
and signed by the customer. This was done to harmonize the expectations between the customers and the company, and thereby avoid disappointed and complaining customers if an insurance event happened. The redesign of the printed policies introduced a problem with the clarity of the policy, since a normal policy that was handed over and signed by the customer ran to about 18–20 pages. Since the old tool produced a three-page policy, this change directly influenced the sales process. In case 2 there was a clear vision about modernizing and opening the organization to make it more attractive to new or younger members. Modernizing included revising the administrative processes in order to save money and strengthen the campaign machinery. For example, the new platform included a web-based publication module where members, from a set of templates, could create folders and posters and send them directly to the printing house without dealing with colour formats and other technical issues. One key to opening the organization was the design of an individual entry page called 'my page'. My page should give the members easy access to discussion boards, mailing lists, and relevant homepages, but the page suffered from a lack of user interface usability. It provided too much information and was difficult to use. This problem could be explained by a significant disagreement among the stakeholders about its purpose, functionality and design. In case 3 the buyer had a set of visions that was not clearly absorbed by the project team, and some of the project members expressed doubts about the realism of fulfilling the visions. The website should inspire customers and attract them to the physical department stores, and should help build and maintain customer loyalty. Two means supported this.
First, the company developed an electronic and physical fashion magazine, which included various articles about fashion, showed various shopping articles, and linked to other text articles on the website. Second, the buyer introduced a special search concept: when customers searched for an article or a brand, the search result displayed the available articles of that brand and where to physically find them in the department stores. Based on the three cases, we observe how the fulfilment of visions and goals in a project is strongly influenced by organizational usability. In all three cases the systems were important tools for creating loyalty or solidarity, but different approaches were chosen. In case 1 the utilization of the visions grew out of the comprehensive involvement of the various stakeholders, through workshops and formal or informal evaluations. In case 2 the design of the contract was an important factor in letting the understanding of organizational usability develop, while the design and redesign of business processes were an important tool for its realization. The small project team with tightly cooperating members was well qualified for the job. In case 3 only a few key persons from the customer’s organisation understood the concept that was implemented, and they did not succeed in making the solution an integral part of the organisation. Furthermore, our data suggests that successful realization of visions and goals depends on a thorough and coherent understanding of the users and the situation of use. Thus, inadequate user interface usability constitutes a significant risk of not fulfilling the visions and goals.

Approach 2: The technology used to implement the system and the technical context in which it is implemented. All three cases relied on web technology and were dependent on the technical context, but the technical impact on usability was very
648
T. Uldall-Espersen and E. Frøkjær
different. One important commonality across the cases was the centralized architecture, which made it easy and relatively inexpensive to fix errors and roll out new, corrected versions of the software. Compared to traditional software development, the test efforts were reduced because of the easy access to fixing problems. In cases 1 and 2, less attention was directed at the deliveries when they were first put into use, and the organizations thereby failed to profit fully from the centralized architecture. The tool in case 1 was a Java application running on a number of Citrix servers, accessed through a traditional wired network or a high-speed mobile phone connection. At an early workshop the users were asked “What can we do to make your everyday better?” This provided important information about possible improvements of the tool, such as how online access to the national civil registration register could help the users form the household quickly and correctly while visiting the customers. The online abilities also made data validation possible through integration with the back-end systems. This drastically reduced the number of errors that required intervention from other employees after the sales were finalized. However, the wireless setup had a major performance problem, and it took up to 17 minutes to print the policy, which preferably should be signed by the customer during the visit. Case 2 relied on a component-based, service-oriented architecture. This architecture made the solution extremely flexible to expand and modify, and supported fast adaptation to changes in the short-term goals of the organization. For example, components of the existing infrastructure were easily integrated into the new solution, which made the solution usable from an early stage in the overall development process, and the ability to adapt quickly to changing goals proved very useful when internal and general elections were announced. Case 3 took the most conservative approach to technology.
The customer’s main focus was on getting a stable solution, which they got. The contractor put a lot of effort into delivering a strictly HTML 1.0 compliant solution. This did not have a clear influence on the usability of the end product, but it increased the cost of the solution significantly. Integration of the web-shop with the existing enterprise resource planning system was a major issue, which was postponed since the customer’s IT department lacked resources to assist with this work. This left the administrative and logistic processes to be carried out more or less manually, and thereby exposed to human error. This caused concerns among the stakeholders and would have been a major problem in the organization had the web-shop been a large success. The technological comparison suggests a number of things. First, the ability to integrate with other systems can have a huge effect on both user interface usability and organizational usability, and failing to integrate can have severe consequences for the organization. Our data suggests that successful integration depends on continuously bringing experts together. Second, discovering and utilizing the technological abilities can be a learning process that needs space and time. Relying on well-known technology and solution patterns reduces the risk of technical issues, but might also reduce innovation in the solution and in the organization, which can reduce both the user interface usability and the organizational usability. New technology can be used to evolve usability and increase the usefulness of the end product, but at a greater risk. Third, relying on specific technologies and standards can introduce limitations, formal and informal. This can be a reasonable overall decision, but the consequences for usability are hard to anticipate.
Approach 3: The shaping of the software development process. In our three cases we see three very different software development processes. Case 1 relied on a human centred development process. The team aimed at putting the customer at the centre of the tool. All possible stakeholders within the company were involved, and anyone on the team was entitled to have an opinion and share it. Occasionally this made the process very time consuming and demanding to handle. The result of the development process was a solid all-round sales tool, where different orientations of usability were considered. Neither the user interfaces nor the processes were optimized, but both were designed well. Through a number of iterations involving various users, most parts of the user interface were tested before the final user tests. Case 2 followed a business process centred development process. The main focus was on identifying important business processes, describing the processes in detail, identifying stakeholders in the processes, and then implementing the processes. All main design activities started with drawing up and analyzing the involved processes, and the project organization saw it as their main task to “add electric current to the business processes”. The positive outcome of the process-oriented development was a system that supported a variety of processes in the organization and was well integrated with existing and new processes and components. However, it also resulted in a non-optimized user interface with serious flaws. Case 3 had a user centred development process as its starting point. The user centred process was reduced due to economic limitations, since the customer did not want to pay for a target group analysis or a user test. This decision was inconsistent with the contractor’s advice. In the development process, focus was on the front-end of the system, and the back-end was only minimally adjusted to the customer’s organization.
The customer took only minimal part in the development project, and although the contractor paid some attention to the organizational issues, the integration with the existing business did not work well and introduced a serious risk to the project. The comparison of the three different development processes suggests two main issues regarding usability. First, a process-oriented approach favours organizational usability, while a user centred approach, mainly considering direct users, favours user interface usability. The human centred approach of case 1, aiming at considering all possible stakeholders, places itself in between by promoting both organizational usability and user interface usability. Second, the human centred approach required lots of resources because of the broad discussions, which was deliberately avoided in cases 2 and 3. In both cases 2 and 3 the project managers were clearly aware of the risk of overloading the project and refrained from involving users in specific situations, while the project manager in case 1 aimed at ensuring that ‘the user involvement did not get out of hand’.
4 Discussion

We discuss possible means to improve the integration of usability work and software development, based on the three approaches.

Approach 1: The existence or development of living visions or organizational goals in the organizations. We find that the main issues regarding this approach are: (1) How is a living vision established, evolved, and maintained throughout the
development process? (2) How are visions and goals transformed into concrete and usable systems design? (3) How is the usability of the systems design evaluated together with the visions? Participatory IT Design [2] and Contextual Design [1] suggest how to develop and utilize visions in systems design, but how to evolve, maintain and evaluate the visions and goals is not discussed. In our cases the visions and goals are initially anchored among the non-technical stakeholders, and it becomes their task as vision carriers to maintain and propagate the visions to the entire set of stakeholders, and particularly to anchor the visions and keep them alive together with the key technical stakeholders. This is, for example, carried out through workshops, which are also used as a place where visions and goals can inform the concrete systems design. Cases 1 and 2 include a number of critical decision points where the intervention of the vision-carrying stakeholders was necessary to retain focus on the overall project goals, also in situations where fast and comprehensive reordering of priorities was urgent. We do not see this issue discussed in either the usability literature or the software engineering literature. Since goals and visions seem to have great influence on organizational usability, an iterative process with evaluations and redesigns taking shape in accordance with the visions might be a way to better support organizational usability and thereby better realize the full potential of the solution.

Approach 2: The technology used to implement the system and the technical context in which it was implemented. We find that the main issues regarding this approach are: (1) How do we best realize the technological possibilities regarding usability? (2) How do we visualize and evaluate the consequences of the technological choices regarding usability? (3) How do we evaluate the technical implementation regarding usability before it is too late?
Both Participatory IT Design [2] and Contextual Design [1] suggest that technology and the technical context are important when planning and designing new IT systems, but the need for ongoing evaluation during development is not covered. Our cases show that key stakeholders are aware of how technology can support usability work, for example by making it easy and inexpensive to update web-based software on central servers, which should make it possible to fix a number of usability issues at a reasonable cost. Unfortunately, our data also shows that this possibility is not properly utilized, since focus shifts to other important tasks, even when an insufficient or even defective system is put into use. Furthermore, it might be more difficult than anticipated to upgrade a system after a large number of users have taken it into use. We also observe how rigidly relying on standards can introduce new risks if they are not necessary and coherent with the visions. Adhering to standards can demand considerable scarce resources and divert focus from more critical issues.

Approach 3: The shaping of the software development process. We find that the main issues regarding this approach are: (1) How is the development process organized? (2) How do the stakeholders stay engaged in the development process? (3) What tools are advantageous and profitable to apply? We have not yet seen a process taking both organizational usability and user interface usability into account in a controlled and efficient manner. This applies both to the involvement of stakeholders and to the use of methods and techniques. So far, methods and techniques in HCI primarily support user-interface-oriented usability. This is visible, for instance, in the many evaluation techniques such as Heuristic Evaluation, Cognitive Walkthrough and
Think Aloud Tests. Techniques for uncovering organizational usability issues are far fewer and less commonly used [6].
5 Conclusions

The study reports on three interview-based case studies of software development projects in which important web-based applications were implemented. We have aimed at describing different stakeholders’ contributions through a cross-analysis of the development projects. In all three cases the stakeholders appear as individuals without an archetypical role. They all have positions, interests, and competences that make them important individual contributors. The cases show how end product usability depends on various factors in the software development project, such as the presence of living visions, the technological choices, and the applied software development processes. Important usability contributors are found both at the user interface usability level and at the organizational level. While many techniques for developing user interface usability are employed, techniques to support the uncovering of organizational usability are lacking. Particularly important are the vision carriers, who are able to keep the project on track with a clear focus on the organizational usability issues when plans have to be adjusted. Descriptions of work practices and techniques supporting this task are rare, both in human-computer interaction and software engineering.

Acknowledgments. This work is part of the USE-project (Usability Evaluation & Software Design) funded by the Danish Research Agency through the NABIIT Programme Committee (Grant no. 2106-04-0022).
References

1. Beyer, H., Holtzblatt, K.: Contextual Design. Morgan Kaufmann, San Francisco (1998)
2. Bødker, K., Kensing, F., Simonsen, J.: Participatory IT Design. The MIT Press, Cambridge, Massachusetts (2004)
3. Juristo, N., Windl, H., Constantine, L.: Introducing usability. IEEE Software, 20–21 (2001)
4. Landauer, T.K.: The Trouble with Computers. MIT Press, Cambridge, MA (1995)
5. Strauss, A., Corbin, J.: Basics of Qualitative Research. SAGE Publications, Thousand Oaks, CA (1998)
6. Vredenburg, K., Mao, J., Smith, P., Carey, T.: A survey of user-centered design practice. In: Proc. CHI 2002, Minneapolis, Minnesota, USA (2002)
7. Wixon, D.: Evaluating usability methods: why the current literature fails the practitioner. interactions 10(4), 29–34 (2003)
Human Performance Model and Evaluation of PBUI

Naoki Urano¹ and Kazunari Morimoto²

¹ SHARP Corporation, Nagaikecho 22-22, Abeno-ku, Osaka-shi, Osaka 545-8522, Japan
[email protected]
² Graduate School of Science and Technology, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto-shi, Kyoto 606-8585, Japan
[email protected]

Abstract. We analyze and discuss a human performance model for PBUI (Push-Based User Interface) in this paper. PBUI is a user interface method in which a user performs a desired task by selecting a target object that usually represents the task itself. The candidate objects are sequentially and automatically presented to the user by the system. When the target object is presented, the user selects it by a simple action such as pushing a button. In this paper, we propose a human performance model of PBUI and discuss the characteristics of PBUI. We also evaluate the performance of PBUI by comparing it with GUI.

Keywords: user interface model, performance model, PBUI (push-based user interface).
1 Introduction

A graphical user interface (GUI) is the most commonly used human-computer interface these days. In a GUI, a graphical pointer and graphical elements (windows, icons, graphical widgets, and menus) are important media for human-computer interaction. GUI elements can be manipulated with the graphical pointer, which indicates the location of a pointing device such as a mouse. A mouse is a typical pointing device that can be used to select and move objects. A major advantage of GUIs is that they make computer operation more intuitive, in contrast to command line interfaces (CLI) [1], which use only text and are accessed solely through a keyboard. GUI users need to know how to manipulate the mouse, the graphical pointer, icons, graphical widgets, menus, windows, etc. In addition, users need to know the application very well to perform a desired task. For example, to search for a photo image among many images in a GUI window with a scroll bar, you need to know how to manipulate the scroll bar to find the desired image. Scroll bars are usually designed as a long rectangular area on one or two sides of the window area, containing a bar that can be dragged along a trough to move the body of the window, as well as two arrows at either end.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 652–661, 2007.
© Springer-Verlag Berlin Heidelberg 2007

It might be easy
Human Performance Model and Evaluation of PBUI
653
for ordinary PC users to manipulate such scroll bars, but it is often hard for non-computer users, beginners, novice users, naïve users, or disabled users [2]. The following cases make a GUI difficult to use.

1. Users do not know how to use a graphical pointer and manipulate GUI elements to perform a task.
2. Users have difficulty using a graphical pointer or cannot manipulate GUI elements smoothly to perform a task.
3. Users are in a limited environment where they cannot use a graphical pointer to manipulate GUI elements.

In case 1, they are usually beginners, novice users, or naïve users. Note that there are many non-computer users in the world. They usually do not have the opportunity to take lessons in how to use computer devices, so it is sometimes difficult for them to manipulate computer devices actively. However, it is common for them to watch the display screen and react to it, much as they watch TV. Therefore, if we provide a passive user interface requiring fewer active operations, they might be able to use a computer more effectively. In case 2, the users are physically handicapped and have difficulty using graphical input devices freely. If we provide a user interface that incorporates a simpler input device, they can use a computer more smoothly. In case 3, users cannot use graphical input devices because of a difficult situation; it is sometimes hard to manipulate windows or widgets in a limited environment. For example, your hands are not free to use complex input devices when you are driving an automobile. In the above cases, a simpler and easier user interface than GUI helps users perform tasks that they cannot perform easily in GUI. We propose the push-based user interface (PBUI). It has the following characteristics.

1. A simpler input device
2. Fewer input operations
3.
A passive user interface

Simple input devices with fewer input operations are important for beginners and novice users. In PBUI, the user performs an arbitrary task by just pushing a button to select a target object presented by the system. We call this a one-button interface. The target object is usually represented as an icon, image, or graphic whose meaning the user can easily recognize. The passive user interface is an important feature of PBUI: it guides the user to a designated task without any active input from the user. The interface prompts the user to select the target object representing a task that the system presents; in other words, the system pushes a suggestion to the user. That is why we call it a push-based user interface. In most GUIs, users need to manipulate the input device actively: they move the pointer to the menu bar, open menus, point at the chosen menu, and click to perform a task. Unless the user takes some action, the system does not change its state and the display does not change at all. In PBUI, users do not need to act first. Instead, they wait for the chosen menu or object to be presented, and select it by pushing the button when the designated
654
N. Urano and K. Morimoto
menu or object is displayed. The system changes the display to guide the user through a task. In the photo image searching application above, PBUI displays the photo images (i.e. candidate objects) sequentially, rather than displaying a group of photo images and waiting for the user's active input of pointing, dragging, scrolling, picking, etc. In PBUI, users just wait until the desired photo image (i.e. the target object) is displayed. It is unnecessary to know how to use GUI manipulators such as scroll bars. Rather than the user manipulating the windows, the candidate objects change automatically, and the user selects the target object when it is displayed. In this paper we present a human performance model for PBUI and important factors of PBUI, and discuss the following issues.

1. What is the right duration time for displaying candidate objects?
2. What is the right number of candidate objects to display simultaneously?
3. Is it affected by the complexity of objects?
4. What is an effective way to display candidate objects to the user?
5. How does performance compare with GUI?
2 PBUI Human Performance Model

There are many possible ways to display candidate objects, and PBUI's performance depends strongly on how the objects are displayed to the user. If objects are displayed one by one, each is easy for the user to recognize and select, but it takes much time to reach the target object; presenting candidate objects one at a time is very inefficient. On the other hand, if many candidate objects are displayed at once, it is very difficult for the user to find the right target among the many candidates, and the user often misses it even though it has been displayed. To maximize efficiency, we need a human performance model for PBUI. The total time to perform a task consists of perception time, cognition time, and motion time, based on the model human processor [3]. The task operation proceeds as follows.

1. Visual stimulus: candidate objects are displayed.
2. Perception: the user perceives the candidate objects.
3. Cognition: the user recognizes the target object.
4. Motion: the user pushes a button to select the object.
The displaying time of an object is determined by the perception, cognition, and motor processes. The perception time, Tperception, is the time needed for the user to perceive that candidate objects are displayed. The cognition time, Tcognition, is the time needed for the user to recognize the meaning of an object. The motion time, Tmotion, is the time needed for the user to react to the target object and select it. Thus the total time, Ttotal, needed for the user to decide on selecting the target object is the sum of these times, as represented in equation (1).

Ttotal = Tperception + Tcognition + Tmotion    (1)
It is important for an application using PBUI to design an appropriate duration for displaying candidate objects. We assume that the perception time and the motion time are constant for a particular PBUI system; the duration must therefore be designed carefully around the cognition time to maximize human performance.
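As a minimal illustration, equation (1) can be written directly in code. The component times used in the example are hypothetical placeholders, not values measured in the paper.

```python
# Sketch of equation (1): total time for the user to decide on selecting
# the target object. The example values are illustrative assumptions.

def total_selection_time(t_perception: float, t_cognition: float,
                         t_motion: float) -> float:
    """Ttotal = Tperception + Tcognition + Tmotion (equation 1)."""
    return t_perception + t_cognition + t_motion

# Hypothetical component times in seconds:
t_total = total_selection_time(t_perception=0.10, t_cognition=0.35,
                               t_motion=0.15)
```

In the model, only the cognition term varies with the objects being displayed, which is why the displaying duration must be tuned around it.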
3 Experiments

3.1 Displaying Time

In this section, we investigate the right duration time for displaying candidate objects. To find a reasonable displaying time for an application, we need to know how long it takes to process a candidate object. We had carried out related experiments before [4], using the simple objects shown in Figure 1.
Fig. 1. Simple objects used in the experiments
Experiments were carried out as follows.

1. A target object is presented to the test subject, who memorizes it.
2. Candidate objects are randomly displayed in front of the test subject.
3. If the target object is displayed, the test subject selects it by moving a finger from the home position to it. If the target object is not displayed, he or she simply releases a finger from the home position to move to the next display.

The average time to process a candidate object was 0.42 sec. The average time to process a target object, i.e. when the presented candidate object was the target, was 0.61 sec. We experimented with 8, 16, and 24 object cases. In the 8 object case, 8 objects were displayed simultaneously in a 2 × 4 layout (2 rows and 4 columns); correspondingly, candidate objects were displayed 4 × 4 for 16 objects and 4 × 6 for 24 objects. More time is needed to process multiple candidate objects on the same display. Figure 2 shows the results. When the target object is present on the display screen, the processing time is less than when it is absent, because the test subject does not need to examine all the candidate objects. The equation below shows the elements of the time required for the user to process one display.

Tprocess = Tperception + n*Tcognition + Teye(n) + Tmotion    (2)

n: number of objects displayed simultaneously
Tperception: time required to perceive the display change
Tcognition: time required to recognize whether a candidate object is the target
Teye(n): time required to move the eyes from one candidate object to another
Tmotion: time required to select the target or to move to the next display
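Equation (2) can be sketched as follows. Since the paper does not specify the form of Teye(n), the linear eye-movement cost below is our own simplifying assumption, and all parameter values are hypothetical.

```python
# Sketch of equation (2): time to process one display of n candidate objects.
# Teye(n) is modeled as linear in the number of eye movements (n - 1),
# which is an assumption on our part; the paper leaves its form unspecified.

def display_process_time(n: int, t_perception: float, t_cognition: float,
                         t_eye_per_move: float, t_motion: float) -> float:
    """Tprocess = Tperception + n*Tcognition + Teye(n) + Tmotion (equation 2)."""
    t_eye = t_eye_per_move * (n - 1)  # assumed linear eye-movement cost
    return t_perception + n * t_cognition + t_eye + t_motion
```

With a single object (n = 1) the eye-movement term vanishes and equation (2) reduces to equation (1), which is a useful sanity check on the model.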
Fig. 2. Displaying time for different numbers of objects (displaying time in seconds vs. number of displaying objects, for the target-object and no-target-object cases)
3.2 Appropriate Number of Objects

Based on the experiment and equation explained in the previous section, we designed an experiment to investigate the optimum number of simultaneously displayed objects. In this experiment, we measured the total time to perform the task for each number of simultaneously displayed objects, and the test subject processes all candidate objects to select the target object. The total time required for the task is represented in equation (3).

Ttask = (N/n)(Tperception + n*Tcognition + Teye(n) + Tmotion)    (3)

N: total number of candidate objects
Fig. 3. Task time (total time to perform a task) for different numbers of displaying objects (task time in seconds vs. number of displaying objects)
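The trade-off captured by equation (3) can be sketched numerically. The parameter values and the super-linear form of Teye(n) below are illustrative assumptions, chosen only to show the shape of the curve: per-display overhead penalises very small n, while scanning and cognition costs penalise crowded displays.

```python
# Sketch of equation (3):
# Ttask = (N/n) * (Tperception + n*Tcognition + Teye(n) + Tmotion)

def task_time(N, n, t_perception, t_cognition, t_eye, t_motion):
    """Total task time: number of displays (N/n) times the per-display cost."""
    return (N / n) * (t_perception + n * t_cognition + t_eye(n) + t_motion)

# Assumed super-linear scanning cost; not taken from the paper.
t_eye = lambda n: 0.02 * n ** 1.5

# Evaluate for the display sizes used in the experiment (N = 100 objects).
times = {n: task_time(100, n, 0.2, 0.3, t_eye, 0.3) for n in (1, 8, 16, 24)}
```

Under these assumed parameters the one-object case is by far the slowest, mirroring the qualitative pattern reported in the experiment below; the actual optimum depends on the measured component times.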
In this experiment, the total number of candidate objects was 100. Six test subjects participated, and each had 80 trials for each number of displaying objects. The average times required for the task were 56.05 sec for 1 displayed object, 11.42 sec for 8 displayed objects, 16.32 sec for 16 displayed objects, and 15.06 sec for 24 displayed objects, as depicted in Figure 3. The results show that the optimum number of simultaneously displayed objects is between 8 and 16.

3.3 Complexity of Objects

We discuss how the complexity of objects affects the number of displayed objects that yields good performance. We had carried out related experiments before [5][6], using objects similar to those depicted in Figure 4. The complexity was set based on the number of vectors included in a picture; in Figure 4, the complexity increases from left to right.
Fig. 4. Sample objects with different complexity (complexity 1, 2, and 3, left to right)
Experiments showed that, as expected, the user needs more time to recognize an object of high complexity. However, the complexity does not affect the optimum number of simultaneously displayed objects: if the optimum number is n, displaying n objects applies to any complexity of objects. The area that an object occupies on the screen might affect the recognition time; we suggest that the object be large enough for the user to recognize.

3.4 Effective Way to Display and Performance Compared with GUI

In this section we discuss what an effective way to display objects in PBUI is. We compare two types of PBUI: an automatic paging user interface and an automatic scrolling user interface. The display designs are depicted in Figure 5. The circle represents the position where a candidate object, such as a graphical object in Figure 4, is placed. The automatic paging system was tested as follows.

1. When the user's finger is placed on the home position, 12 candidate objects are displayed for the displaying duration time. The duration time is deduced from equation (2), based on a preliminary experiment determining the right values for each element of the equation for the objects.
2. After displaying the page for the set time, the system automatically displays the next page containing the next 12 candidate objects.
Fig. 5. Display designs of PBUI (automatic paging, left; automatic scrolling, right)
3. When the test subject finds the target object, he or she moves the finger to the target object to select it.
4. The total time to perform the task is measured.

In the automatic scrolling system, candidate objects are presented to the user differently: instead of changing the whole page at once, the objects are smoothly moved to the right, and each object is displayed for a set time. The method of selecting the target object is the same. Table 1 shows the average time to perform a task in which the target object appears on the tenth page or equivalent. The automatic paging and automatic scrolling interfaces showed about equal performance at every complexity. We do not conclude that one PBUI is better than the other, but it has been reported that the accuracy of selecting the right target differs [6]: the accuracy of automatic scrolling is better than that of automatic paging. We suppose that in automatic paging the user has to move the eyes actively from one object to the next within a page, and it is easy to miss the target object because the user has to scan all objects in the page within the time set by the system. In automatic scrolling, rather than actively moving the eyes, the user's eyes stay at roughly the same vertical line while scanning all candidate objects, so it is easy to scan all objects and the user seldom misses the target. We think the accuracy difference comes from the user's scanning ability, and thus that automatic scrolling is suitable for naïve users. This is consistent with our assumption that PBUI is a user interface for beginners, novice users, or naïve users.

Table 1. The results of task performance for PBUI on different complexity of objects
Complexity                  1        2        3
Automatic Paging (sec)      29.394   33.897   35.734
Automatic Scrolling (sec)   28.811   34.422   37.516
[Figure: task time (sec), 0-60, versus complexity of objects (1-3) for automatic paging, automatic scrolling, manual paging, and manual scrolling]
Fig. 6. Graph of task performance for PBUI and GUI
3.5 Performance Compared with GUI

We provided two pilot GUIs for comparison: a manual paging user interface and a manual scrolling user interface. The display designs are depicted in Fig. 7. In the manual paging interface, the test subject must move a finger to the arrow to go to the next page, which contains the next 12 candidate objects. In the manual scrolling interface, the test subject must drag the scroll bar to navigate in the window and find the target object. Table 2 shows the resulting average times.
[Figure: manual paging layout (object grid with Next Page and Home buttons) and manual scrolling layout (object list with scroll bar)]

Fig. 7. Display designs of GUI
N. Urano and K. Morimoto

Table 2. Task performance for GUI at different object complexities
Complexity   Manual Paging (sec)   Manual Scrolling (sec)
1            34.766                43.738
2            38.192                43.341
3            40.336                53.041
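As a quick sanity check on the comparison that follows, the mean times from Tables 1 and 2 can be averaged over the three complexity levels. This is an illustrative script; only the numbers are taken from the tables, while the dictionary labels are ours:

```python
# Mean task times (sec) from Tables 1 and 2, one list per interface,
# ordered by object complexity 1, 2, 3.
times = {
    "automatic paging":    [29.394, 33.897, 35.734],
    "automatic scrolling": [28.811, 34.422, 37.516],
    "manual paging":       [34.766, 38.192, 40.336],
    "manual scrolling":    [43.738, 43.341, 53.041],
}

# Average over complexity levels and rank the four interfaces.
means = {ui: sum(t) / len(t) for ui, t in times.items()}
for ui, m in sorted(means.items(), key=lambda kv: kv[1]):
    print(f"{ui}: {m:.2f} s")
```

The first three interfaces land within about 5 s of one another on average, while manual scrolling comes out clearly slowest, matching the conclusion drawn in the text.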
The performance of the PBUI automatic paging and automatic scrolling interfaces is roughly equal to that of the manual paging interface. All three outperform the manual scrolling interface, the style widely used in GUI photo applications.
4 Conclusions

This paper explains the characteristics of PBUI and identifies its important design factors. PBUI is an alternative user interface for the users discussed in the introduction. We presented a human performance model as an equation and, based on it, discussed the important factors of PBUI: the duration for which each object is displayed, the number of objects displayed simultaneously, the complexity of the objects, and the displaying method. We summarize our answers to the issues raised in this paper as follows.

1. What is the right duration for displaying candidate objects? The duration should be given by Eq. 2.
2. What is the right number of candidate objects to display simultaneously? For a simple application such as an image explorer, between 8 and 16.
3. Is the number of simultaneously displayed candidate objects affected by the complexity of the objects? The experiments show it is independent of object complexity.
4. What is an effective way to display candidate objects? There are many possibilities; the automatic scrolling interface is a typical PBUI in which users do not need to scan the objects actively.
5. How does performance compare with GUI? It depends on the application. If the application is very simple, such as an image explorer, PBUI performs as well as or better than GUI. It is important to find applications for which PBUI is well suited.

As future work, we have to show that PBUI is an effective user interface for real applications.
References

1. Margone, S., Shneiderman, B.: A study of file manipulation by novices using commands versus direct manipulation. In: Twenty-Sixth Annual Technical Symposium, pp. 154-159. ACM, Washington, DC (1987)
2. Maulsby, D.L., Witten, I.H.: Inducing programs in a direct manipulation environment. In: Proc. CHI '89 Conference on Human Factors in Computing Systems, pp. 57-62. ACM, New York (1989)
3. Card, S.K., Moran, T.P., Newell, A.: The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, Mahwah (1983)
4. Takekuni, T., Urano, N., Morimoto, K., Kurokawa, T.: Proposal of Push-Based User Interface and its operating characteristics. In: 2003 Japan Ergonomics Society Kansai Branch Conference Proceedings, pp. 146-149 (2003)
5. Takekuni, T., Urano, N., Morimoto, K., Kurokawa, T.: A study on number of objects in Push-Based User Interface. In: Human Interface Symposium Proceedings, pp. 109-112 (2004)
6. Li, Q., Urano, N., Morimoto, K., Kurokawa, T.: A study of the visual Push-Based User Interface that considers practicality. In: The 7th Human Media Workshop Proceedings (2006)
Developing Instrument for Handset Usability Evaluation: A Survey Study

Ting Zhang, Pei-Luen Patrick Rau, and Gavriel Salvendy
Department of Industrial Engineering, Tsinghua University, Beijing 100084, China
[email protected] Abstract. Handset is transforming from a traditional cellular phone to an integrated content delivery platform for communications, entertainment and commerce. Their increasing capabilities and value-added features provide more utilities, and at the same time, make the design more complicated and the device more difficult to use. An online survey was conducted to measure user’s perspective of the usability level of their current handset using a psychometric type of instrument. A total of 9 usability factors were derived from the results of exploratory factor analysis. The total percentage variance explained by these 9 factors of the overall variance of the data was 65.20%. The average internal consistency in this study is 0.70. Keywords: Handset; Usability; Usability measurements; Usability factors; Instrument; Survey.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 662-671, 2007.
© Springer-Verlag Berlin Heidelberg 2007

1 Introduction

Users' subjective perceptions of handset usability have a great impact on the successful adoption and use of applications. Users of consumer products care about usability not only in terms of performance but also in terms of impressions and emotions [1]. It is generally agreed that subjective aspects should be given more weight than they used to be. Many studies have reflected subjective usability aspects in terms of perceived ease of use and perceived usefulness in the evaluation of user acceptance [2], or of behavioral and emotional factors in the design of consumer products [3, 4]. Although those studies have attempted to extend the usability concept, most of these definitions or models do not include all major aspects of usability [5]. It is essential to refine the measures of usability, particularly with regard to the unique handheld-related usability issues. To that end, many methodologies for evaluating the usability of mobile applications have been applied to emphasize subjective aspects and extend the dimensions of usability. Zhang and Adipat [6] suggested that the choice of research methodology and data collection techniques for the usability evaluation of a mobile application should depend on the specific research issues and objectives. In the present study, a questionnaire survey was employed for three reasons. 1) A vast number of variables, as well as the diversity of
end-users, need to be considered when measuring handset usability. Survey research allows large amounts of data to be gathered with relatively little effort and supports broad generalization of results [7]. 2) Mobility is difficult to simulate in a laboratory setting because of the changing context; the use of such devices while doing other work also has implications for determining the context of use for usability testing [8]. 3) As an inquiry method, the questionnaire survey plays a major role in subjective measurement. Past reviews have indicated a lack of survey studies and psychometric instruments that simultaneously measure multiple key concepts in the quality of experience of software systems [7, 9]. Furthermore, in many cases the questions of standardized instruments are not specific enough to investigate handsets [10]. To fill this gap, the present study developed a usability instrument comprising specific design elements and structured usability factors unique to handset devices. The objective of this study is to develop an instrument to measure the perceived usability of handset products. The research focused on two questions: (1) What are the most important usability factors indicating a handset's overall perceived usability? (2) How do these factors contribute to overall perceived usability? The approach used in this study is expected to provide an innovative and systematic methodology for explaining and measuring the usability of handsets.
2 Literature Review

2.1 Usability of Handset

Many studies have focused on usability testing of individual handset design issues such as keystrokes [11, 12], content presentation [13, 14], battery duration [15], and menu structure [16, 17]. However, few contribute to identifying usability dimensions and design factors across multiple features with respect to subjective feelings about handset use. Chuang, Chang, and Hsu [18] examined the relationship between users' preferences for mobile phones and their form (hardware) design elements; participants judged 26 mobile phone designs on a preference rating scale for 11 image words. Han et al. [1] measured 88 specific design elements for 36 products using a measurement checklist. Another related study investigated the relationships among the design features of cell phones according to 1,006 college students' preference ratings for their current cell phone [19]. Five design issues that significantly impact users' overall satisfaction were identified: calling-related features, personal preferences, portability, durability, and aesthetics. All of those studies were based on currently available cell phone models and focused on frequently used features, so the results can only tell us which of the current designs is best. Mobile technology, however, is advancing very quickly, and it is necessary to consider advanced features and functions that are uncommon now but will soon be popular. In further research, Ling, Hwang, and Salvendy [20] investigated the relationship between five advanced features and users' preference
levels. The results showed that a color screen, voice-activated dialing, and an Internet browsing feature strongly predict users' satisfaction levels. We must be aware that advanced features change quickly, which complicates usability research. For instance, the color screen and camera may have been attractive features a few years ago; today most cell phones have a color screen, and the camera has become a must-have feature for most mobile phone users.

2.2 Handset Usability Dimensions/Factors

Menu structure and user interfaces are critical design features related to basic communication tasks: text entry, dialing and messaging, calendar, etc. The usability of these basic features can strongly influence users' overall satisfaction with the product, so many usability criteria have been studied for them. Efficiency, effectiveness, simplicity (complexity), learnability, consistency, feedback, and memorability are essential and often addressed [21-26]. Ziefle [26] also indicated that predictability, familiarity, and generalizability (transfer of knowledge from one interaction to similar ones) are crucial criteria for users selecting mobile phones. Ji, Park, Lee, and Yun [27] developed a usability checklist consisting of five groups of usability factors for mobile phone user interfaces, based on 21 usability principles. Efficient retrieval of information involves the organization of information, the method of accessing it, and the form of delivery. Information must be presented so that it is both easy to obtain and easy to mentally integrate, because navigation tasks on a handset place heavy cognitive demands on the user's short-term memory owing to limited screen size, scrolling requirements, and slower processing [21].
Research on user performance with information retrieval largely focuses on efficiency (speed and errors) and effectiveness (quality of outputs) [16, 22, 28]. Providing direct access to focused, valuable content and simple hierarchies increases efficiency and decreases keystrokes and text entry [16]. Another study on efficient information retrieval [29] introduced a mobile phone feature named two-phase fetch for retrieving mail via Internet services. Other critical usability factors include network connectivity, flexibility [30, 31], and personalization [32]. The emerging mobile commerce (m-commerce) technology promises exciting possibilities, but the user experience and acceptance of this technology are still poorly understood. Bruner and Kumar [33] applied the technology acceptance model (TAM) to the consumer context and found that the "fun" attribute contributes to consumer adoption of handheld Internet devices even more than perceived usefulness. It has been strongly suggested that personalization is essential to creating a positive mobile experience [34, 35]. Tarasewich [36] suggested that context of use and security must be taken into account during the design and use of mobile commerce applications, which are affected by changing environmental conditions. Malloy, Varshney, and Snow [37] argued that the reliability (dependability) of the wireless network infrastructure is necessary for the success of mobile commerce.
Koutsiouris, Vlachos, and Vrechopoulos [38] provided an evaluation framework tailored to mobile music services that integrates key variables (factors) of the user-mobile interaction process from both business and technical perspectives. Location awareness provides mobile users with topical and personal content that may increase the appeal of mobile guides in different application fields. Based on the results of seven field studies, Kaasinen [39] identified utility and user trust as usability factors that strongly affect user acceptance of location-aware mobile guides. Fithian et al. [40] indicated that privacy is the primary concern when using location-aware technologies, and that integration (with other functions, e.g., calling), understandability (of icons, labels, options, and how the application works), and feedback (of system status or confirmation of important actions) are crucial determinants of users' performance and satisfaction with mobile location-aware applications. Beyond the technology aspects, Ciavarella and Paternò [41] address usability criteria for the graphical UI design of mobile guide applications with five concerns: the web metaphor, navigation feedback, orientation support in the surrounding environment, minimal graphical interaction, and no redundancy in input commands. Howell, Love, and Turner [42] investigated the effect of interface metaphor and context of use (private/public) on the usability of a hierarchically structured, speech-activated mobile city guide service. The results showed that visualization of the metaphor-based service significantly affected participants' attitudes.
3 Survey Study

A psychometric instrument was developed to gather end-users' subjective perceptions of the usability of their current handset. First, handset usability dimensions were carefully selected, filtered, and integrated from various resources [3, 6, 27, 43, 44]. Initial items were then generated from a series of published usability instruments [2, 45-51] and modified with particular consideration for the identified handset usability dimensions. The first version of the instrument consisted of 98 items, each of which was first examined by the authors for inter-item correlation and centrality to the concept of usability. Critiques of the items were then obtained from three PhD students and three master's students familiar with the research topic and instrument design; 10 items were deleted and wording was modified accordingly. The final version of the scale includes 88 items, plus one global scale (item 89) measuring the perceived usability of the respondent's current handset. Respondents indicate how strongly they agree or disagree with each item on a scale from 1 (strongly disagree) to 7 (strongly agree). The global scale can be used to analyze the criterion-related validity of the instrument. Not all questions apply to all handsets; if a question does not apply to a respondent's handset, the respondent marks it "Not Applicable" (N/A).
The survey was implemented online using HTML forms and distributed to a broad sample of individuals in China. Participants were recruited via personal contact, email, university BBS messages, blogs, and Web forum announcements. Demographic information including age, gender, job, and education level was collected, along with handset experience and the manufacturer and model of each respondent's current handset. On entering the survey website, respondents were first instructed to read a short introduction to handset devices and a general definition of usability. They then filled in a background questionnaire covering the demographic information, handset experience, and the manufacturer and model of their current handset, after which the list of items was given.
4 Results Analysis

4.1 Respondents

A total of 408 users participated in the survey. Prior to the analysis, 42 cases were deleted because of incomplete or inconsistent responses or repeated submissions. In particular, we checked how many respondents fully agreed or fully disagreed with both items in each of two pairs of items expressing opposite views (items 3 and 18, and items 37 and 46). Five respondents did so in both pairs and were removed from the later analysis. After these procedures, 361 valid cases (143 males and 218 females) remained. Respondents averaged 24.9 years of age (SD = 3.10 years), 4.6 years of handset experience (SD = 1.53 years), and 2.9 years of experience with their current handset (SD = 1.34 years). Over 22 manufacturers were represented in this survey; the four most popular covered 71.2% of the sample: Nokia (34.9%), Motorola (15.8%), Samsung (13.3%), and Sony Ericsson (7.2%).

4.2 Factor Analysis

First, item response means were evaluated to determine whether a large percentage of responses created a "floor" or "ceiling" effect, which occurs when many individual scores fall at one or both extreme ends of the scale, suggesting that the scale may not have captured the actual variability in responses. All survey items had mean responses greater than 2.4 and less than 5.6, so no items were excluded for floor or ceiling effects. A series of exploratory factor analyses was then run to identify the factor structure of the 88-item instrument: the sample of 361 responses was examined using principal component factor analysis with equamax rotation.
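The floor/ceiling screening described above is easy to automate. A minimal sketch, assuming responses are stored as a respondents × items array on the 1-7 scale with N/A answers encoded as NaN (the 2.4/5.6 cut-offs are the ones reported in the text; the data here is synthetic):

```python
import numpy as np

def screen_floor_ceiling(responses, low=2.4, high=5.6):
    """Return item means and a keep-mask; items whose mean response
    falls outside (low, high) are flagged for exclusion."""
    means = np.nanmean(responses, axis=0)  # N/A (NaN) answers ignored
    keep = (means > low) & (means < high)
    return means, keep

# Synthetic data: 100 respondents x 3 items; the third item shows a
# clear ceiling effect (everyone answers 7).
rng = np.random.default_rng(0)
data = np.column_stack([
    rng.integers(3, 6, size=100).astype(float),  # mid-range item
    rng.integers(1, 8, size=100).astype(float),  # full-range item
    np.full(100, 7.0),                           # ceiling item
])
data[0, 1] = np.nan  # one N/A answer, ignored by nanmean
means, keep = screen_floor_ceiling(data)
print(keep)  # only the ceiling item is flagged (keep == False)
```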
An item reduction procedure was carried out using four commonly employed criteria: 1) eliminate items with factor loadings less than 0.50 on all factors or greater than 0.50 on two or more factors [52]; 2) eliminate single-item factors [53]; 3) the Cronbach's alpha of a factor should not decrease substantially when an item within
that factor is dropped [54]; and 4) the derived structure should be simple and easy to interpret [55, 56]. After an iterative sequence of factor analysis and item reduction, the final instrument consisted of 29 items, as shown in Table 1. A total of 9 factors were derived, together explaining 65.20% of the overall variance of handset usability. The first two factors, satisfaction (how satisfied the user is with the product and how much they enjoy it) and controllability (the user's ability to regulate, control, and operate the product), accounted for one third of the total variance; the first five factors accounted for almost 50%. The internal consistencies of the 9 factors ranged from 0.60 to 0.84, with an average of 0.70, an acceptable level.

Table 1. Factors, eigenvalues, percentage of variance explained, and internal consistencies
Factor             Initial Eigenvalue  Cumulative Variance Explained (%)  No. of Items  Internal Consistency (Cronbach's Alpha)
1 Satisfaction     6.5718              22.66                              5             0.84
2 Controllability  2.9111              32.70                              3             0.76
3 Effectiveness    1.8337              39.02                              3             0.74
4 Frustration      1.4759              44.11                              3             0.72
5 Customizability  1.4419              49.08                              3             0.67
6 Navigation       1.3064              53.59                              3             0.66
7 Attractiveness   1.2602              57.93                              4             0.64
8 Helpfulness      1.0908              61.70                              3             0.66
9 Consistency      1.0165              65.20                              2             0.60
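The internal consistencies in Table 1 are Cronbach's alpha values. For reference, alpha can be computed from a respondents × items matrix as follows (a self-contained sketch on synthetic data, not the survey data):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances /
    variance of the summed scale), for a respondents x items matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Three items driven by one latent score give a high alpha; three
# independent noise items give an alpha near zero.
rng = np.random.default_rng(1)
latent = rng.normal(size=500)
correlated = np.column_stack(
    [latent + 0.5 * rng.normal(size=500) for _ in range(3)])
noise = rng.normal(size=(500, 3))
print(round(cronbach_alpha(correlated), 2))  # well above the 0.70 rule of thumb
print(round(cronbach_alpha(noise), 2))       # close to 0
```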
5 Conclusion and Discussion

The proposed instrument for handset usability testing was partially derived from the experimental and theoretical base outlined in the published literature. The biggest difference between the proposed instrument and previously published ones is that its factors and items were selected with special consideration for handset characteristics. Its internal consistency in the present survey study is acceptable, but the construct validity and discriminant validity need further evaluation. The contribution of this study is both theoretical and practical. Few studies focus on measuring subjective perceptions of handset usability, and most published instruments are limited to the traditional dimensions of software usability. Furthermore, the methodology of identifying relationships between usability factors and design features has not been systematically addressed. The results and methodology proposed here address those themes and contribute to the practice of handset design in industry. Despite increasing efforts on the side of
technology development, there is a lack of in-depth inquiry into the underlying phenomena; the concepts of mobility and mobile users are still poorly understood. The approach used in this study is expected to provide an innovative and systematic methodology for explaining and measuring the usability of handsets. The survey results may accumulate a base of knowledge on this topic and help designers recognize false assumptions and better ground their design choices. The study was limited in several aspects. First, the instrument was tested only in Chinese, which raises a problem of semantic validity due to translation, and not all items were selected from established instruments; the content validity and criterion-based validity need to be tested in the future. Second, the sample size is not large enough for further statistical analysis; the nomological validity should be validated using structural equation modeling (SEM) in the future. Finally, the test-retest reliability of the instrument should be evaluated. Moreover, because of the diversity of manufacturers and of models within each manufacturer, it is difficult to perform statistical analysis on individual models owing to the small sample size for each. Further usability experiments should be conducted to extract more specific design guidelines for improving specific features.
References

1. Han, S.H., et al.: Evaluation of product usability: development and validation of usability dimensions and design elements based on empirical models. International Journal of Industrial Ergonomics 26(4), 477-488 (2000)
2. Davis, F.D.: Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly 13(3), 318-340 (1989)
3. Han, S.H., et al.: Usability of consumer electronic products. International Journal of Industrial Ergonomics 28(3-4), 143-151 (2001)
4. Jordan, P.W.: Human factors for pleasure in product use. Applied Ergonomics 29(1), 25-33 (1998)
5. Jokela, T., et al.: The standard of user-centered design and the standard definition of usability: Analyzing ISO 13407 against ISO 9241-11. In: Proceedings of the Latin American Conference on Human-Computer Interaction, Rio de Janeiro, Brazil. ACM Press, New York (2003)
6. Zhang, D., Adipat, B.: Challenges, methodologies, and issues in the usability testing of mobile applications. International Journal of Human-Computer Interaction 18(3), 293-308 (2005)
7. Kjeldskov, J., Graham, C.: A review of mobile HCI research methods. In: Chittaro, L. (ed.) Mobile HCI 2003. LNCS, vol. 2795, pp. 317-335. Springer, Heidelberg (2003)
8. Scholtz, J.: Usability evaluation (2004) [cited 2006 Oct. 27]; available from http://www.itl.nist.gov/iad/IADpapers/2004/Usability%20Evaluation_rev1.pdf
9. van Schaik, P., Ling, J.: Five psychometric scales for online measurement of the quality of human-computer interaction in Web sites. International Journal of Human-Computer Interaction 18(3), 309-322 (2005)
10. Lee, Y.S., et al.: Systematic evaluation methodology for cell phone user interfaces. Interacting with Computers 18(2), 304-325 (2006)
11. Klockar, T., et al.: Usability of mobile phones. In: Proceedings of the 19th International Symposium on Human Factors in Telecommunications, Berlin, Germany (2003)
12. Ziefle, M., Bay, S., Schwade, A.: On keys' meanings and modes: The impact of different key solutions on children's efficiency using a mobile phone. Behaviour & Information Technology 25(5), 413-431 (2006)
13. Bederson, B.B., et al.: A fisheye calendar interface for PDAs: Providing overviews for small displays. In: Proceedings of CHI '03 Conference on Human Factors in Computing Systems. ACM Press, Ft. Lauderdale, Florida, USA (2003)
14. Bederson, B.B., et al.: DateLens: a fisheye calendar interface for PDAs. ACM Transactions on Computer-Human Interaction (TOCHI) 11(1), 90-119 (2004)
15. Bloom, L., et al.: Investigating the relationship between battery life and user acceptance of dynamic, energy-aware interfaces on handhelds. In: Mobile Human-Computer Interaction - MobileHCI 2004, Proceedings, pp. 13-24 (2004)
16. Buchanan, G., et al.: Improving mobile internet usability. In: Proceedings of the 10th International Conference on World Wide Web. ACM Press, Hong Kong (2001)
17. Ziefle, M., Bay, S.: Mental models of a cellular phone menu: Comparing older and younger novice users. In: Mobile Human-Computer Interaction - MobileHCI 2004, Proceedings, pp. 25-37 (2004)
18. Chuang, M.C., Chang, C.C., Hsu, S.H.: Perceptual factors underlying user preferences toward product form of mobile phones. International Journal of Industrial Ergonomics 27(4), 247-258 (2001)
19. Ling, C., Hwang, W., Salvendy, G.: A survey of what customers want in a cell phone design. Behaviour & Information Technology (2005)
20. Ling, C., Hwang, W., Salvendy, G.: Diversified users' satisfaction with advanced mobile phone features. Universal Access in the Information Society 5(2), 239-249 (2006)
21. Albers, M.J., Kim, L.: User Web browsing characteristics using palm handhelds for information retrieval. In: Professional Communication Conference, 2000. Proceedings of 2000 Joint IEEE International and 18th Annual Conference on Computer Documentation (IPCC/SIGDOC 2000). IEEE, Cambridge, MA (2000)
22. Chittaro, L., Dal Cin, P.: Evaluating interface design choices on WAP phones: navigation and selection. Personal and Ubiquitous Computing 6(4), 237-244 (2002)
23. Christie, J., Klein, R.M., Watters, C.: A comparison of simple hierarchy and grid metaphors for option layouts on small-size screens. International Journal of Human-Computer Studies 60(5-6), 564-584 (2004)
24. Kjeldskov, J., Stage, J.: New techniques for usability evaluation of mobile systems. International Journal of Human-Computer Studies 60(5-6), 599-620 (2004)
25. Marila, J., Ronkainen, S.: Time-out in mobile text input: The effects of learning and feedback. In: Chittaro, L. (ed.) Mobile HCI 2003. LNCS, vol. 2795, pp. 91-103. Springer, Heidelberg (2003)
26. Ziefle, M.: The influence of user expertise and phone complexity on performance, ease of use and learnability of different mobile phones. Behaviour & Information Technology 21(5), 303-311 (2002)
27. Ji, Y.G., et al.: A usability checklist for the usability evaluation of mobile phone user interface. International Journal of Human-Computer Interaction 20(3), 207-231 (2006)
28. Jones, M., et al.: Improving Web interaction on small displays. Computer Networks 31(11-16), 1129-1137 (1999)
29. Rao, H., et al.: iMail: a WAP mail retrieving system. Information Sciences 151, 71-91 (2003)
30. Watters, C., Duffy, J., Duffy, K.: Using large tables on small screen display devices. International Journal of Human-Computer Studies 58(1), 21-37 (2003)
31. Watters, C., Zhang, R.: PDA access to Internet content: Focus on forms. In: Proceedings of the 36th Annual Hawaii International Conference on System Sciences (HICSS'03) - Track 4. IEEE Computer Society, Hawaii (2003)
32. Anderson, C.R., Domingos, P., Weld, D.S.: Personalizing web sites for mobile users. In: Proceedings of the 10th International Conference on World Wide Web. ACM Press, Hong Kong (2001)
33. Bruner, G.C., Kumar, A.: Explaining consumer acceptance of handheld Internet devices. Journal of Business Research 58(5), 553-558 (2005)
34. Ho, S.Y., Kwok, S.H.: The attraction of personalized service for users in mobile commerce: an empirical study. ACM SIGecom Exchanges 3(4), 10-18 (2002)
35. Venkatesh, V., Ramesh, V., Massey, A.P.: Understanding usability in mobile commerce. Communications of the ACM 46(12), 53-56 (2003)
36. Tarasewich, P.: Designing mobile commerce applications. Communications of the ACM 46(12), 57-60 (2003)
37. Malloy, A.D., Varshney, U., Snow, A.P.: Supporting mobile commerce applications using dependable wireless networks. Mobile Networks and Applications 7(3), 225-234 (2002)
38. Koutsiouris, V., Vlachos, P., Vrechopoulos, A.: Developing and evaluating mobile entertainment applications: The case of the music industry. In: Rauterberg, M. (ed.) ICEC 2004. LNCS, vol. 3166, pp. 513-517. Springer, Heidelberg (2004)
39. Kaasinen, E.: User acceptance of location-aware mobile guides based on seven field studies. Behaviour & Information Technology 24(1), 37-49 (2005)
40. Fithian, R., et al.: The design and evaluation of a mobile location-aware handheld event planner. In: Proceedings of Human-Computer Interaction with Mobile Devices and Services: 5th International Symposium, Mobile HCI 2003, Udine, Italy (2003)
41. Ciavarella, C., Paternò, F.: Design criteria for location-aware, indoor, PDA applications. In: Dignum, F.P.M., Cortés, U. (eds.) Agent-Mediated Electronic Commerce III. LNCS (LNAI), vol. 2003, pp. 131-144. Springer, Heidelberg (2001)
42. Howell, M., Love, S., Turner, M.: The impact of interface metaphor and context of use on the usability of a speech-based mobile city guide service. Behaviour & Information Technology 24(1), 67-78 (2005)
43. Folmer, E., Bosch, J.: Architecting for usability: a survey. Journal of Systems and Software 70(1-2), 61-78 (2004)
44. Hornbaek, K.: Current practice in measuring usability: Challenges to usability studies and research. International Journal of Human-Computer Studies 64(2), 79-102 (2006)
45. Chin, J.P., Diehl, V.A., Norman, K.L.: Development of an instrument measuring user satisfaction of the human-computer interface. In: Proceedings of SIGCHI '88, Washington, DC. ACM/SIGCHI, New York (1988)
46. Lewis, J.R.: Psychometric evaluation of the post-study system usability questionnaire: the PSSUQ. In: Proceedings of the Human Factors Society 36th Annual Meeting. Human Factors Society, Atlanta, GA (1992)
47. Kirakowski, J., Corbett, M.: SUMI: The software usability measurement inventory. British Journal of Educational Technology 24(3), 210-212 (1993)
48. Lewis, J.R.: IBM computer usability satisfaction questionnaires: Psychometric evaluation and instructions for use. International Journal of Human-Computer Interaction 7(1), 57-78 (1995)
Developing Instrument for Handset Usability Evaluation: A Survey Study
Part III
Understanding Users and Contexts of Use
Tips for Designing Mobile Phone Web Pages for the Elderly

Yoko Asano1, Harumi Saito1, Hitomi Sato1, Lin Wang2, Qin Gao2, and Pei-Luen Patrick Rau2

1 Cyber Solutions Laboratories, Nippon Telegraph and Telephone Corporation, 1-1 Hikarinooka, Yokosuka-Shi, Kanagawa, 239-0847, Japan
{asano.yoko, saito.harumi, sato.hitomi}@lab.ntt.co.jp
2 Department of Industrial Engineering, Tsinghua University, Shunde Building, Tsinghua University, Beijing, 100084, P.R. China
[email protected], [email protected], [email protected]

Abstract. This paper proposes tips for designing Web pages appropriate for the elderly. The characteristics of mobile phone Web pages and the effects of aging are elucidated. The elderly had difficulty in reading text, finding the focus, operating pages and input, and, in some cases, understanding the contents. Tips for designing Web pages that are appropriate for the elderly are proposed based on our observations.

Keywords: mobile phone Web pages, Web design, the elderly, aging effect.
1 Introduction

The mobile phone is becoming ubiquitous in Japan; its coverage now exceeds 85.3% [1]. A more worrying trend is that 20% of the population of Japan is over 65 years old [2], and this generation has very low adoption rates of IT equipment compared to other generations. Fortunately, the mobile phone is more widely used than the PC [1]. Two reasons are that mobile phones are initially cheaper than PCs, and that they have fewer functions and operations than PCs, so users find it easier to learn how to use them. Moreover, the telephone call operation of the mobile phone is similar to that of the wired telephone that the elderly are used to. The elderly mainly use the telephone call and mail functions of the mobile phone, while young people use mobile Web access as well as telephone calls and mail [3]. The reasons for the rare use of mobile Web access by the elderly have been reported [3]. The most significant reason is that the mobile Web interface is difficult for them to use. Other key reasons are that characters are too small to read and manuals are difficult to understand. The aging effects of the elderly and the problems peculiar to mobile phones are also factors. The Web must be seen as an important communication medium for the elderly because they have lower rates of contact with society. To this end, it is necessary to design mobile phone Web pages that are easier for the elderly to use.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 675–680, 2007. © Springer-Verlag Berlin Heidelberg 2007
This paper proposes tips for designing Web pages that are appropriate for the elderly. The characteristics of mobile phone Web pages and the characteristics of the elderly are surveyed. Behavior of the elderly when accessing the mobile phone Web is observed. Tips for designing Web pages appropriate for the elderly are then proposed based on the results of the observation.
2 Characteristics of Mobile Phone Web

There are differences in the design and operation of mobile phone Web services around the world, and also among mobile phones in Japan. We discuss here the common characteristics of mobile phone Web services in Japan. The three major characteristics are shown in Table 1: the display is small; the interface is quite inflexible (e.g., font size and color are fixed); and there are few keys to operate.

Small Display. The small display triggers many negative effects. Only a little information can be displayed at a time. Therefore, text is apt to be closely displayed and lists are tightly aligned. Abbreviated words, symbols, and icons are frequently used to shorten the text.

Many Restrictions on Interface Flexibility. Character size and font cannot be changed. Carriage returns and figures are used to indicate paragraphs, and the layout is apt to be simple. Color variation is often used to indicate information structure, but only the background color and character color can be changed. The cursor focus is generally indicated by color reversal, and the color tones differ with the terminal type or usage. Moreover, the cursor only jumps from link to link.

Table 1. Characteristics of mobile phone Web

Small display: Only a little information can be displayed at once. Texts are closely displayed. Lists are aligned. Abbreviated words, symbols, and icons are frequently used.

Interface restrictions: Character size and font cannot be changed. Color variation is often used in lieu of other formatting techniques. The cursor point is generally indicated by color reversal. The cursor jumps from link to link.

Few keys available: The cursor jumps to the next link on down-key input even if the link is to the right of the current cursor position. The display mode must be changed to input characters. The mode key must be pushed many times to change the input character type. The same key must be pushed many times to input one character.
Few Keys. The average mobile phone has only twenty keys or so, and all operations must be executed through these keys. The cursor is moved to the next link by the down key regardless of the direction of the next link; even if the links are horizontally aligned, the user must push the down key to move the cursor to the next one. Another big problem is the difficulty of inputting characters. The display mode must be changed to input characters. In Japanese, we use many types of characters: hiragana, katakana, Chinese characters, Latin letters, numerals, and so on. We may have to push the mode key many times to change the input character type. Moreover, more than 50 kinds of syllabic characters must be accessed through only twenty keys. We sometimes have to push the same key five times to input one character.
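The multi-tap entry scheme described above can be sketched as follows. This is an illustrative example only, not vendor code: the key-to-character table uses the familiar Latin-letter keypad layout for readability, whereas Japanese handsets assign rows of kana to the keys in the same manner.

```python
# Hypothetical sketch of multi-tap text entry: repeated presses of one
# key cycle through the characters assigned to that key.
MULTITAP = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def char_for_presses(key: str, presses: int) -> str:
    """Return the character produced by pressing `key` `presses` times."""
    chars = MULTITAP[key]
    # Extra presses wrap around to the first character again.
    return chars[(presses - 1) % len(chars)]

# Entering "s" requires four presses of the 7 key; each extra press is
# one more chance for an input error, which helps explain why character
# entry imposes such a high load on elderly users.
print(char_for_presses("7", 4))  # -> s
```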
3 Aging Effects of the Elderly

Many abilities of the elderly decrease with age. Table 2 shows the aging effects of the elderly as related to the usage of mobile phones. They are divided into three groups: effects related to physical ability, cognitive ability, and mental load.

Physical Ability. The most important aging effect is poor sight. Eighty percent of people in their sixties and ninety percent of those in their seventies suffer from cataracts. Cataracts have several symptoms: most patients suffer a decrease in eyesight, everything appears fogged with a yellow tint, and the ability to distinguish contrast also decreases [4]. Motor ability also declines, so the elderly are not good at detailed work.

Cognitive Ability. The most important impact on cognitive ability is a drop in the ability to make distinctions. The elderly take time to discover and recognize information. They are apt to have difficulty recognizing the difference between two things even when both are displayed simultaneously. Moreover, it is difficult for them to perceive a change over time.

Table 2. Aging effects of the elderly

Physical ability: Eyesight decreases. Everything appears to have a yellow tint. The ability to distinguish contrast decreases. Weak at detailed work.

Cognitive ability: Takes time to discover and recognize information. Tends to have difficulty in recognizing differences. Tends to have difficulty in perceiving things that change over time. Poor at forming mental models. Difficulty in memorizing and retrieving information. Poor spatial ability.

Mental load: Decline in motivation and understanding.
The elderly are also poor at forming mental models. They perceive their operations and information as low-level elements, so they cannot memorize and retrieve them easily. This also impairs their spatial ability.
Mental Load. The decline in motivation to do anything causes a decline in understanding. Moreover, the elderly tend to give up easily when they are confronted with a challenge.
4 Behavior of the Elderly Using the Mobile Web

We conducted an experiment to observe how the elderly interacted with some existing mobile phone Web sites. The behavior of the elderly and their comments were collected.

4.1 Method of Experiment

Subjects. Ten subjects participated in the experiment: four male and six female. All subjects were over 55 years old; the average age was 65.4. All subjects had experience in using the telephone call function of mobile phones, five of them had used the mail function, and only one had accessed a mobile phone Web site.

Equipment. One of the most popular mobile phones, the P901i made by NTT DoCoMo, was used in the experiment. It had been on the market for one year and eight months. Its screen displays only 12 by 12 characters.

Objects. Thirteen mobile Web sites were used in the experiment: two portal sites, five electronic commerce sites, two air ticket reservation sites, three stock exchange sites, and one broadcasting service site.

Tasks. The subjects were instructed to access all the mobile Web sites and perform a task specific to each site. For example, they were instructed to search for a specific flight and reserve a seat on the air ticket reservation site. They were also asked to remark on the Web design and the problems encountered while using the sites.

Observation data. The behavior of the subjects as they used the mobile Web sites and their remarks were captured. Particular attention was paid to the behavior and remarks made when they committed some error or were at a loss.

4.2 Results and Considerations

The significant results are shown in Table 3. They were related to visibility, focus recognition, understanding, operation, and page structure.

Visibility. Most subjects remarked that the pages were not easy to read because the characters were too closely packed and their visual acuity was weak. We noted that white backgrounds made reading easier because the contrast between the characters and the white background tended to be high. In some cases, subjects did not recognize scrolling and blinking objects; it was hard for the elderly to recognize objects that changed rapidly, and they gave up easily.

Focus Recognition. In many cases, the subjects took a long time to identify the focus point. This is because there were many color combinations and they could not easily identify which combination indicated the focus point. In some cases, the focus area was too small to identify easily because short words were used. In other cases, the focus color matched the surrounding color so well that users could not identify the focus easily.
Understanding. In many cases, the subjects skipped abbreviated words, foreign words, symbols, and icons because it was difficult for them to understand their meanings. Moreover, characters drawn in icons or figures were too small for the elderly to recognize. Several similar misinterpretation problems were observed; many subjects thought, wrongly, that red text always meant a warning. This indicates a poor ability to form mental models.

Table 3. Significant problems encountered by the elderly

Visibility: Closely packed characters are difficult to read. A white background color made reading easier. Scrolling and blinking objects were sometimes skipped.

Focus recognition: Subjects could not easily discern which color was used for focus highlighting. The focus area was too small to recognize easily because short words were used.

Understanding: Subjects skipped abbreviated words, foreign words, symbols, and icons because they were difficult to understand. They misunderstood red as always indicating a warning.

Operation: Subjects were at a loss when entering characters. They often tried to jump to the next link (to the right of the cursor) by wrongly pressing the right key instead of the down key.

Page structure: Subjects were apt to try to understand the content based only on the information displayed on the screen at one time. They sometimes had difficulty in choosing among options when not all of the choices could be displayed at the same time.
Operation. Most subjects were at a loss when entering characters. This was because they failed to form a mental model of the character-input operation. Moreover, many input operations imposed high loads on the elderly. Many erroneous operations were observed when the subjects intended to move the cursor to the next link; they tried to use the right key instead of the down key. This was caused by the mismatch between the direction of cursor movement and that of the operation key.

Page Structure. Most subjects tried to understand the content based only on the information displayed on the current screen; they were not good at remembering content. They sometimes had difficulty in choosing one of several options when not all of the choices could be displayed at the same time. Moreover, they often got lost on mobile phone Web sites because they could not remember how they had reached the present page.
5 Tips for Designing Mobile Web for the Elderly We propose tips for designing mobile phone Web pages for the elderly based on characteristics of mobile phone Web service, the aging effects of the elderly, and our observations of their behavior.
Layout. Use color variation and carriage returns to format content into chunks of information that are easier to follow. However, try to keep the information density high so that as much information as possible can be seen without scrolling. Moreover, present all choices on one screen so that users can compare them at once.

Visibility. Do not use scrolling or blinking text, because such changes tend to be too fast for the elderly to recognize.

Color. Use only enough color variation to make the information structure understandable; too many meaningless colors hinder recognition of which color combination indicates the focus of attention. Only high-contrast color combinations should be used to indicate highlighting.

Words. Do not use abbreviated words, foreign words, symbols, or icons for important words or links, because the elderly tend to skip over unfamiliar symbols. Moreover, do not use short words for linked text, so that the focus of attention stands out.

Operation. Try to minimize the number of characters that must be input, because input operations impose high loads on the elderly. Choosing one of a few options is easier for them.
6 Conclusion

The characteristics of mobile phone Web pages and the effects of aging were elucidated. The phones' small displays, interface restrictions, and few available keys caused many problems when combined with the aging effects on the physical ability, cognitive ability, and mental load of the elderly. We found that the significant problems in using mobile phone Web pages were related to visibility, focus recognition, understanding, operation, and page structure. Tips for designing Web pages appropriate for the elderly were proposed based on the results of our observations. The proposed tips suit the development of mobile Web sites that are applicable to various users, from the young to the old.

Acknowledgments. We would like to thank Ms. Mamiko Mori for conducting and managing the observations. She also gave us many valuable recommendations.
References 1. Cabinet Office (ed.): Statistics of coverage of durable goods. Annual Report of Consumption Trend Investigation, March (2006) 2. Statistics Bureau (ed.): Information on the 2005 Population Census of Japan, Ministry of Internal Affairs and Communications (2006) 3. Mobile Society Research Institute (ed.): White Paper on Mobile Society, NTT Publishing, pp. 57–61 (2006) 4. Okajima, K., Takase, M.: Computerized Simulation and Chromatic Adaptation Experiments Based on a Model of Aged Human Lens. Optical Review 8(1), 64–70 (2001)
The Role of Task Characteristics and Organization Culture in Non-Work Related Computing (NWRC)

Gee-Woo Bock, Huei-Huang Kuan, Ping Liu, and Hua Sun

National University of Singapore, Department of Information Systems, School of Computing, 3 Science Drive 2, Singapore 117543
{bockgw, mkuan}@comp.nus.edu.sg
Abstract. Many organizations have scrambled to put control measures and discipline systems in place to deter employees from engaging in NWRC. Since control measures and discipline systems alone are insufficient to curb NWRC in the workplace, we propose to integrate the control perspective with task characteristics and organization culture. We thus examine the following research questions: How does the amount of NWRC control mechanisms affect employees' NWRC behavior under different task characteristics? Does a match between the disciplinary approach and organization culture lead to more effective NWRC management? Two separate studies of full-time employees in various organizations revealed three important findings. First, NWRC control mechanisms are ineffective under a high degree of task non-routineness. Second, the fit between discipline systems and organization culture leads to higher employee satisfaction with NWRC management, which in turn leads to less time spent on NWRC. Third, there is no single best NWRC discipline system for every organization.

Keywords: Non-Work Related Computing, Task Characteristics, Organization Culture, Fit.
1 Introduction

Organizational IS resources have become more critical in empowering employees to perform their tasks and meet organizational goals. However, access to these computing resources may easily be abused, with employees using them during office hours for non-work purposes such as sending personal emails and online chatting [1]. The prevalent use of IS resources for personal purposes has become a serious problem for organizations [2]. According to Lim [3], non-work related activities may cost a company US$1 billion and lead to a productivity loss of 30% to 40%. The negative ramifications of non-work related computing (NWRC) have led organizations to put policies and guidelines in place to discourage employees from engaging in NWRC [4]. Such measures have failed to significantly curb NWRC activities, which still persist in the workplace [3, 5]. Drawing upon the concept of fit in the strategic management literature [6], it may be crucial to synthesize the perspective of control (control mechanisms and discipline systems) with the perspectives of task
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 681–690, 2007. © Springer-Verlag Berlin Heidelberg 2007
characteristics and organization culture in NWRC management. Thus, the research questions of this study are: How does the amount of NWRC control mechanisms affect employees' NWRC behavior under different task characteristics? Does a match between the disciplinary approach and organization culture lead to more effective NWRC management?
2 Literature Review

A number of terms have been used synonymously with NWRC, such as "junk computing" and "cyberloafing". Since we are interested in various forms of NWRC that involve Internet access, we define NWRC in this study as using the Internet during office hours for personal e-commerce (e.g., watching stock prices), personal communication (e.g., instant messaging with MSN), Internet browsing (e.g., reading news on the Internet), downloading files for personal purposes (e.g., movies and music), and Internet gaming (e.g., Yahoo! Games) [1]. As NWRC behavior is performed at the expense of organizational resources, much research has uncovered the negative impact of NWRC on corporate productivity [5]. Bock and Ho [5] attributed this result to the interruptive nature of NWRC activities, which their study found to stem predominantly from emails and instant messaging. Personal emails and instant messaging are forms of distraction for employees, and the recovery time compounded from both these interruptions may result in a great deal of wasted time, leading to lower job performance.

2.1 NWRC Control Mechanisms

Control mechanisms exist to discourage employees from engaging in NWRC in the workplace [4]. Sharma and Gupta [7] cited that 45% of all firms and 17% of Fortune 100 companies use monitoring software of various kinds. The same study reported that software that records employees' keystrokes is used by companies such as Exxon and the U.S. State Department. Much of the research on control mechanisms is based on General Deterrence Theory [8], which suggests that organizational control approaches can deter employees' abuse of computing resources by increasing the perceived costs of computer abuse. Although many of these controls have been employed by numerous organizations, they have failed to significantly lessen NWRC behavior.
Even with tighter controls in place, NWRC still persists in organizations in the banking industry, which is equipped with strong security controls [2]. Evidently, the implementation of control mechanisms alone is not sufficient to guarantee the success of NWRC management. Organizations are unable to apply General Deterrence Theory adequately in their environments because it does not cover all the factors affecting the effectiveness of NWRC management. In particular, many studies based on General Deterrence Theory did not consider employees' task characteristics in the implementation of the control mechanisms.
2.2 Discipline Systems

Besides control mechanisms, several IS studies have also suggested discipline systems to deal with the misuse of organizational computing resources [2]. Discipline can be meted out in several ways: organizations are reported to discipline computer abusers through internal sanctions such as suspension and dismissal, or even to report offenses to third parties such as the police or FBI. There are predominantly two kinds of disciplinary systems: progressive and positive [9]. Progressive discipline provides that increasingly serious punishments be meted out to members of the organization who fail to behave acceptably. Positive discipline, on the other hand, retains the idea of making progressively serious contacts with an employee when work problems arise, but eliminates the haste to use punishment as a means of getting the employee to adhere to rules and regulations. Instead, it seeks to prevent problems through formal and informal managerial practices and the recognition of good performance. Employees are allowed to participate in the disciplinary decision-making process, so they may be more responsible for their own behaviors and more willing to follow disciplinary policies [10]. As such, researchers generally believe that positive discipline is more effective than progressive discipline in curbing employees' misconduct [9, 10].

2.3 Fit

Fit plays an essential role in strategic management and can be defined as a theoretically defined match between two variables [6]. Fit as matching is specified without reference to a criterion variable, although its effect on a set of criterion variables can subsequently be examined. The Task-Technology Fit model [11] argues that the fit between users' task requirements and their usage of organizational IS may play a pivotal role in determining the effectiveness of organizational policies and measures.
Goodhue [11] proposed that information systems (systems, policies, IS staff) have a positive impact on performance only when there is correspondence between their functionality and the task requirements of users. Therefore, we examine whether the fit between task characteristics and control mechanisms does indeed help to enhance NWRC management in organizations (Study 1). Prior research suggests that the match between discipline system and organization culture can help to curb NWRC. Crow and Hartman [12] revealed that health care organizations which neglect the detrimental elements of their culture can find themselves at risk of poor employee relations and ineffectiveness in applying discipline. Schwartz and Davis [13] have recommended that management should consider the cultural risk of implementing strategies. The lack of fit between organization culture and the discipline system may result in employees’ resistance towards the system, ultimately leading to the failure of the discipline system. Thus, we examine whether the match between disciplinary system and organizational culture has an impact on NWRC management (Study 2).
3 Study 1

Study 1 was conducted to examine the effects of the fit between control mechanisms and task characteristics on NWRC management. In this study, tasks are broadly defined as the actions carried out by individuals in turning inputs into outputs [14]. The Task-Technology Fit model suggests that task-technology fit leads to greater performance of the technology [11]. Goodhue [11] measured a two-dimensional construct of task characteristics: non-routineness (non-repetitive and non-analyzable search behavior) and interdependence (reliance on other organizational units). This paper focuses on these two dimensions of task characteristics because they are closely related to the requirement for information-processing capabilities. As the objective of NWRC management is to reduce NWRC behavior, the dependent variable in this study is NWRC behavior.

3.1 Task Characteristics-Control Mechanisms Fit

Task non-routineness is defined in this study as the level of structuredness, analyzability, difficulty, and predictability of a task [14]. Tasks with a high degree of non-routineness require employees to engage in intensive analysis, discussion, and research in order to minimize task uncertainty and find a solution. Internet browsing can provide an immense amount of information as well as useful resources to support non-routine tasks. Belanger and Slyke [15] found that a certain amount of playful use of the Internet can lead to learning that may be of value to the organization. Thus, certain NWRC activities such as Internet browsing may be perceived as useful for non-routine tasks, since these tasks require employees to acquire more skills, knowledge, and up-to-date information. If the NWRC control mechanisms within the organization are tight, employees may perceive them as a barrier to increasing their job performance and are hence more likely to ignore them.
Hypothesis 1: The higher the task non-routineness, the weaker the negative effect of NWRC control mechanisms on employees' NWRC behavior.

Task interdependence is defined in this study as the degree to which a task is related to other organizational units and the extent to which coordination with other organizational units is required [16]. Coordination between interdependent parties needs to be supported by the organization's information systems. Van de Ven et al. [17] found that departmental communication increased as interdependence among employees increased. Instant messaging software and email provide a good communication platform for employees. As these tools match the communication requirements of tasks with high interdependence, prohibiting them would conflict with employees' task requirements. If NWRC control mechanisms within the organization are tight, employees may perceive them as a barrier to communicating with other employees for work purposes and are more likely to ignore them.

Hypothesis 2: The higher the task interdependence, the weaker the negative effect of NWRC control mechanisms on employees' NWRC behavior.
3.2 Methodology

A survey was carried out to test the proposed hypotheses. We targeted full-time employees with easy access to the Internet at work. Forty organizations were contacted, and 26 of them participated in the survey. 250 questionnaires were distributed either by mail or in person. After deleting responses with missing data, there were 167 valid responses (effective response rate = 66.8%). All scale items were operationalized at the individual level. NWRC behavior was measured in terms of self-reported time spent on NWRC [1]. Respondents were assured that any information they provided would be kept confidential, to minimize underreporting of NWRC behavior. Before proceeding to test the hypotheses, we tested the validity of the measures. Convergent validity was shown by item-total correlation coefficients above 0.40. Cronbach's alpha coefficients ranged from 0.60 to 0.81, which shows acceptable reliability for exploratory research [18]. To test discriminant validity, factor analysis with Varimax rotation was performed, and the loadings showed that the constructs are distinct from one another.

3.3 Results and Data Analysis

Results were analyzed using SPSS 13.0. In performing hierarchical regression, we first added the amount of control mechanisms as a predictor, followed by a task characteristic, and finally the interaction term. The hierarchical regression equations show that neither NWRC control mechanisms alone nor NWRC control mechanisms together with non-routineness have a significant impact on NWRC behavior. When the interaction variable is added, the effects of control mechanisms and the interaction variable are significant at p = 0.012 and p = 0.035, respectively. ∆R2 is 0.027 and the F statistic is 4.52, which is well above 1 [19]. This shows that task non-routineness moderates the relationship between NWRC control mechanisms and NWRC behavior. Thus, H1 is supported. The hierarchical regression equations also show that neither NWRC control mechanisms alone nor NWRC control mechanisms together with interdependence have a significant effect on NWRC behavior. When the interaction variable is added, the effects of control mechanisms and the interaction variable are insignificant at p = 0.341 and p = 0.561, respectively. ∆R2 is 0.002 and the F statistic is 0.34, which is below 1 [19]. This indicates that H2 is not supported.
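The hierarchical moderated regression procedure described above can be sketched as follows. This is a minimal illustrative reconstruction, not the authors' analysis (they report using SPSS 13.0): the data are simulated, and the variable names (control, nonroutine, nwrc_time) are hypothetical.

```python
# Sketch of hierarchical (moderated) regression via ordinary least
# squares: fit nested models and test the interaction term's delta R^2.
import numpy as np

rng = np.random.default_rng(0)
n = 167  # number of valid responses reported in the study

control = rng.normal(size=n)     # amount of NWRC control mechanisms
nonroutine = rng.normal(size=n)  # task non-routineness (moderator)
# Simulated outcome with a built-in interaction effect, for illustration.
nwrc_time = 0.2 * control + 0.3 * control * nonroutine + rng.normal(size=n)

def r_squared(predictors, y):
    """R^2 of an OLS fit of y on the given predictors (plus intercept)."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

# Step 1-2: main effects only; step 3: add the interaction term.
r2_main = r_squared([control, nonroutine], nwrc_time)
r2_full = r_squared([control, nonroutine, control * nonroutine], nwrc_time)

delta_r2 = r2_full - r2_main
# F test for the single added term: df1 = 1, df2 = n - 4
# (three predictors plus the intercept in the full model).
f_stat = delta_r2 / ((1 - r2_full) / (n - 4))
print(f"delta R^2 = {delta_r2:.3f}, F = {f_stat:.2f}")
```

An F statistic well above 1 for the interaction step is what supports a moderation claim such as H1; an F below 1, as in the interdependence analysis, does not.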
The hierarchical regression equations also show that NWRC control mechanisms or NWRC control mechanisms along with interdependence are insignificant on NWRC behavior. When the interaction variable is added, the effects of control mechanisms and the interaction variable are insignificant at p = 0.341 and p = 0.561 respectively. ∆R2 is 0.002 and F statistic is 0.34, which is below 1 [19]. This indicates that H2 is not supported. 3.4 Discussion of Results The first finding shows a significant result for the interaction of task non-routineness and the amount of NWRC control mechanisms. The ineffectiveness of NWRC control mechanisms under high task non-routineness may result from the perceived usefulness of NWRC. Tasks of high non-routineness are typically more difficult to accomplish and in order to accomplish these tasks, the use of up-to-date information and various resources are absolutely essential. Since Internet browsing is the most efficient and effective way to get information in today’s workplace, any restrictions on the Internet browsing would affect the accomplishment of these tasks. As they are required to browse the Internet to complete their tasks, tight control simply brings
G.-W. Bock et al.
inconvenience to their jobs, and they would still continue to engage in NWRC because of the usefulness of the Internet for both their jobs and their personal agenda. Thus, control mechanisms would be ineffective under a high degree of non-routineness. Task interdependence may not be a significant moderator between the amount of NWRC control mechanisms and NWRC behavior for two reasons. With high task interdependence, success at the workplace is contingent on one another’s output. As such, employees may devote more time to accomplishing their tasks so that their work unit can achieve its collective objectives effectively, instead of spending time engaging in NWRC. To collaborate with others within the organization, employees may use substitutes for communication such as phone calls and/or face-to-face meetings, which would be more effective for completing tasks. Thus the control mechanisms for instant messaging software have little conflict with interdependence.
4 Study 2

Study 2 focuses on examining the fit between discipline systems and organization culture. In this study, we include another dependent variable, satisfaction with NWRC management, as managers are also interested in finding out whether the fit between discipline system and organization culture can improve satisfaction with NWRC management. Organizational culture has pervasive effects on an organization and is defined as a socially constructed, cognitive reality that is rooted in deeply held perceptions, values, beliefs or expectations that are shared by, and are unique to, a particular organization [20]. Although there are several classifications of organization cultures in the previous literature, the Organization Culture Index [21] is the most appropriate for this study, as it provides three distinct dimensions of organizational culture to minimize ambiguity: bureaucratic, innovative and supportive. Bureaucratic cultures are hierarchical and compartmentalized. They are usually based on control and power, with clear lines of responsibility and authority. Innovative cultures are exciting and dynamic. They are creative places to work in, filled with challenge and risk. Supportive cultures are warm and “fuzzy” places to work in, where employees are friendly and helpful to each other.

4.1 Organization Culture-Discipline System Fit

The management literature suggests that the fit between organization culture and discipline system is crucial for management. Any management idea (including discipline systems), no matter how good it is, will not work in practice if it does not fit the culture [13]. Commanducci [22] also stressed that even management simulations require a proper fit with company culture. The disciplinary system adopted in an organization is a type of management practice that can affect employees’ satisfaction with the system and their NWRC behavior.
Satisfaction with the discipline system is defined as a generalized positive or negative evaluation of the discipline system (adapted from [23]). The fit between organization culture and the discipline system may result in stronger employee compliance with the discipline system [12], which can ultimately lead to greater satisfaction with NWRC management and less NWRC behavior.
The Role of Task Characteristics and Organization Culture in NWRC
Hypothesis 3: Employees of an organization with a disciplinary system closely matched with its organizational culture will have higher satisfaction with its NWRC management.

Hypothesis 4: Employees of an organization with a disciplinary system closely matched with its organizational culture will engage in less NWRC behavior.

4.2 Methodology

A survey was carried out to test the proposed hypotheses. As in Study 1, we targeted full-time employees who have easy access to the Internet at their workplace. 182 questionnaires from 30 organizations were collected, of which 174 were valid for data analysis (effective response rate = 95.6%). All scale items are operationalized at the individual level. NWRC behavior was operationalized in terms of self-reported time spent on NWRC [1]. As in Study 1, respondents were assured that any information they provided would be kept confidential.

The validity of the measures was also tested. Convergent validity was shown by item-total correlation coefficients above 0.40. Cronbach’s alpha coefficients ranged from 0.84 to 0.91, which shows acceptable reliability for exploratory research [18]. To test discriminant validity, factor analysis with Varimax rotation was performed; the loadings showed that the constructs are distinct from one another.

A cluster analysis was conducted to identify homogeneous groups of cultural profiles. Three types of organization cultures were identified, as the complete linkage dendrogram suggested a partition into 3 clusters [24]. Within each cluster, t-tests showed significant differences in the means of one culture dimension from the rest of each cluster (see Table 1).

Table 1. Summary of t-test Results within each Culture Cluster

                                    Mean of Cluster 1   Mean of Cluster 2      F
Bureaucratic (df 1:70, F > …)             1.82                1.29           101.46
Innovative (df 1:40, F > 12.61)           1.89                1.62            18.47
Supportive (df 1:60, F > 11.97)           1.83                1.26           142.42
Conclusion: Positive Discipline / Progressive Discipline
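The two reliability checks used in both studies, corrected item-total correlations (for convergent validity) and Cronbach's alpha, can be computed as below. This is an illustrative sketch on synthetic scale data, not the study's dataset; the single-factor data-generating assumption is ours.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic 5-item scale for 174 respondents: one latent factor plus noise
n, k = 174, 5
latent = rng.normal(size=(n, 1))
items = latent + 0.8 * rng.normal(size=(n, k))

def cronbach_alpha(X):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = X.shape[1]
    item_var = X.var(axis=0, ddof=1).sum()
    total_var = X.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def corrected_item_total(X):
    """Correlation of each item with the sum of the remaining items."""
    total = X.sum(axis=1)
    return np.array([np.corrcoef(X[:, j], total - X[:, j])[0, 1]
                     for j in range(X.shape[1])])

alpha = cronbach_alpha(items)
r_it = corrected_item_total(items)
print(alpha, r_it)
```

Item-total correlations above 0.40 and alpha in the ranges reported (0.60-0.81 and 0.84-0.91) are the thresholds the papers cite for exploratory research [18].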
4.3 Results and Data Analysis

The hypotheses of Study 2 were analyzed with MANOVA and ANOVA. We conducted a test of between-subjects effects with MANOVA to analyze which individual dependent variables contribute to the significant multivariate effect. A Bonferroni-type adjustment was applied to control experiment-wise error and decrease the chance of Type I error [24]. In this paper, the adjusted alpha is 0.025 (0.05/2). Using this alpha level, we have a significant univariate main effect on satisfaction with NWRC management, with F = 3.882 (p = 0.022), only when organizational culture and disciplinary system are considered together, supporting H3. However, the univariate main effect on the time spent on NWRC is not significant, with F = 0.641 (p = 0.528).

We also analyzed whether the disciplinary system has a significant effect on satisfaction with NWRC management within each culture group using one-way ANOVA (see Table 2). For the bureaucratic and supportive culture groups, the satisfaction levels between the two disciplinary approaches differ with F = 6.36 and F = 3.20 respectively, significant at p < 0.10. For the innovative culture group, the satisfaction levels between the two disciplinary approaches differ with F = 8.90, significant at p < 0.05. These findings are consistent with the MANOVA results.

Table 2. Summary of one-way ANOVA Test Results within each Culture Cluster

                                              Mean of Satisfaction
                                             Positive     Progressive
                                             Discipline   Discipline    F_SAT
Bureaucratic (df 1:70, F > 2.79, α = 0.10)     4.02          4.65        6.36
Innovative (df 1:40, F > 4.08, α = 0.05)       4.48          3.63        8.90
Supportive (df 1:60, F > 2.79, α = 0.10)       4.8           4.1         3.20
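The per-cluster comparison in Table 2 is a one-way ANOVA with two groups, whose F statistic can be computed directly. The satisfaction scores below are invented for illustration, not the study's data:

```python
import numpy as np

def oneway_f(*groups):
    """One-way ANOVA F statistic: between-group over within-group mean squares."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    n_total = sum(len(g) for g in groups)
    grand = np.concatenate(groups).mean()
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical satisfaction scores (1-5 scale) for the two discipline groups
# within one culture cluster -- illustrative values only
positive = np.array([4.0, 4.5, 4.2, 3.8, 4.6, 4.4])
progressive = np.array([3.4, 3.8, 3.1, 3.6, 3.3, 3.9])
f_stat = oneway_f(positive, progressive)
print(f_stat)
```

The computed F is then compared against the critical value for the cluster's degrees of freedom at the chosen alpha, as in the "F >" columns of Table 2.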
4.4 Discussion of Results

The combination of organizational culture and NWRC disciplinary approach has a significant effect on employee satisfaction with NWRC management (H3). We also examined which discipline system fits best with which organization culture by looking at satisfaction. From Table 2, positive discipline is more accepted in
innovative and supportive cultures, with higher means of satisfaction (4.48 and 4.8 respectively) than progressive discipline (3.63 and 4.1 respectively). Progressive discipline is more accepted in bureaucratic cultures, with higher satisfaction than positive discipline. This shows that positive discipline does not always produce better results than progressive discipline, although theoretically positive discipline is superior, as it encourages employees to participate in the management process.

The fit between organizational culture and NWRC discipline has no significant impact on time spent on NWRC (H4). This may be explained by the reasons why employees engage in NWRC, which include rational decision-making factors such as normative awareness regarding NWRC, the influence of peer acquiescence, and unconscious factors such as habit [25]. However, post-hoc linear regression shows that satisfaction with NWRC management negatively affects time spent on NWRC (p = 0.002). Thus, the fit between organizational culture and NWRC discipline system can exert an indirect effect on NWRC behavior through employees' satisfaction with NWRC management.
5 Conclusion and Directions for Future Research

Studies 1 and 2 offer some insights for NWRC management in organizations. First, the universal enforcement of control mechanisms, or simply increasing the amount of disciplinary action against offenders, cannot reduce NWRC behavior. From the first study, differentiating sanctions according to task non-routineness is essential. The Internet activities of employees engaged in non-routine tasks should not be controlled rigidly. Instead, practitioners need to emphasize the balance of NWRC activities in employees’ daily use of the Internet. From the second study, there is no single best discipline system for every organization. The fit between the discipline system and the organization culture leads to greater satisfaction with NWRC management and eventually to less time spent on NWRC.

This paper also offers contributions to research. It is a pioneering research effort to consider the effects of task characteristics and organization culture on NWRC management. The majority of the NWRC literature has focused on examining the antecedents of NWRC behavior (e.g., [25]) or elucidating the consequences of NWRC in organizations [5]. Although “fit” has occupied a central role in the strategic management field [6], it had never been incorporated into NWRC research. However, since this research has not considered the role of industry norms and national culture, future research can extend this paper by examining their effects on NWRC.
References

1. Siau, K., Nah, F.F.H., Teng, L.: Acceptable Internet Use Policy. Communications of the ACM 45, 75–79 (2002)
2. Lee, J., Lee, Y.: A Holistic Model of Computer Abuse within Organizations. Information Management & Computer Security 10, 57–63 (2002)
3. Lim, V.K.G.: The Moderating Effects of Neutralization Technique on Cyberloafing and Organizational Justice. In: Proceedings of the Academy of Management Conference, Denver (2002)
4. Urbaczewski, A., Jessup, L.M.: Does Electronic Monitoring of Employee Internet Usage Work? Communications of the ACM 45, 80–83 (2002)
5. Bock, G.W., Ho, S.L.: Non-Work Related Computing (NWRC): Is there a Productivity Payoff? Accepted and forthcoming in Communications of the ACM
6. Venkatraman, N.: The Concept of Fit in Strategy Research: Toward Verbal and Statistical Correspondence. Academy of Management Review 14, 423–444 (1989)
7. Sharma, S.K., Gupta, J.T.N.: Improving Workers’ Productivity and Reducing Internet Abuse. Journal of Computer Information Systems 44, 74–78 (2003)
8. Beccaria, C.: On Crimes and Punishments. Bobbs-Merrill, Indianapolis (1963)
9. Osigweh, C.A.B., Hutchison, W.R.: Positive Discipline. Human Resource Management 28, 367–383 (1989)
10. King, K.N., Wilcox, D.E.: Employee-Proposed Discipline: How Well is it Working? Public Personnel Management 32, 197–209 (2003)
11. Goodhue, D.L.: Understanding User Evaluations of Information Systems. Management Science 41, 1827–1844 (1995)
12. Crow, S.M., Hartman, S.J.: Organizational Culture: Its Impact on Employee Relations and Discipline in Health Care Organizations. The Health Care Manager 21, 22–28 (2002)
13. Schwartz, H., Davis, S.: Matching Corporate Culture and Business Strategy. Organizational Dynamics 10, 30–48 (1981)
14. Perrow, C.: A Framework for the Comparative Analysis of Organizations. American Sociological Review 32, 194–208 (1967)
15. Belanger, F., Slyke, C.V.: Abuse or Learning? Communications of the ACM 45, 64–65 (2002)
16. Thompson, J.D.: Organizations in Action. McGraw-Hill, New York (1967)
17. Van de Ven, A.H., Delbecq, A.L., Koenig, R.: Determinants of Coordination Modes within Organizations. American Sociological Review 41, 322–338 (1976)
18. Nunnally, J.: Psychometric Theory. McGraw-Hill, New York (1967)
19. Carte, T.A., Russell, C.J.: In Pursuit of Moderation: Nine Common Errors and Their Solutions. MIS Quarterly 27, 479–501 (2003)
20. Hofstede, G., Neuijen, B., Ohayv, D.D., Sanders, G.: Measuring Organizational Cultures: A Qualitative and Quantitative Study across Twenty Cases. Administrative Science Quarterly 35, 286–316 (1990)
21. Litwin, G.H., Stringer, R.A.: Motivation and Organizational Climate. Harvard University Press, Cambridge, Massachusetts (1968)
22. Commanducci, M.: Training Can Be Fun: Management Simulations Require Proper Fit with Company Culture. Canadian HR Reporter 11, 15 (1998)
23. Kidwell, R.E., Bennett, N.: Employee Reactions to Electronic Control Systems. Group and Organization Management 19, 203–218 (1994)
24. Coakes, S.J., Steed, L.G.: SPSS Analysis without Anguish: Version 10.0 for Windows. Wiley, Brisbane (2001)
25. Lee, O.K., Lim, K.H., Wong, W.M.: Why Employees do Non-work Related Computing: An Exploratory Investigation through Multiple Theoretical Perspectives. In: Proceedings of the Hawaii International Conference on System Sciences, Hawaii (2005)
Searching for Information on the Web: Role of Aging and Ergonomic Quality of Website

Aline Chevalier, Aurélie Dommes, Daniel Martins, and Cécile Valérian

University of Paris X-Nanterre, Cognitive Processes and Interactive Behaviours Laboratory, 200 avenue de la République, 92001 Nanterre cedex, France
{aline.chevalier,adommes,daniel.martins}@u-paris10.fr
Abstract. Despite the rapid growth in the number of websites, there are still many ergonomic problems that hinder the cognitive activities of web users. As cognitive aging is generally associated with a decrease in working memory capacity, inhibition failure and a slowing of the speed of processing, we argue that aging may have negative effects on information search activities, especially when the website contains ergonomic problems. In the present experimental study, we compare younger and older web users’ performance while searching for information in two websites: one that follows ergonomic recommendations and another with ergonomic problems. The results show that aging had negative consequences on users’ information search activities (longer times to find information, more steps required to find it, and more cognitive resources involved in the activity). These consequences are more pronounced for the non-ergonomic website than for the ergonomic one.

Keywords: Information search, Cognitive load, Ergonomics, Aging.
1 Introduction

Over the last few decades, the number of websites has increased; navigating the Web has become a daily activity for numerous individuals, who use it to communicate, exchange information or search for information (e.g., planning a trip). In parallel, older adults are one of the fastest-growing demographics among Web users (for the USA, see [24]). Older adults navigate the Web to find health information, to plan travel and for e-mail [21], and more generally to search for information (e.g., [2], [29], [33]). Although websites commonly seem to be easy for everyone to use, numerous studies involving younger users showed that websites are still considered difficult to use and to access (e.g., [4], [16]). The problem becomes even more complicated for older web users. Because of age-related declines in cognitive abilities, the difficulties in searching for, and especially finding, information on the Web could be particularly important. Besides, older adults appear to be less familiar with these technologies than younger adults [17]. Moreover, earlier studies showed higher anxiety in older than in younger adults while using computers, as well as an underestimation of their own computer competence and knowledge [6], [20].

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 691–700, 2007. © Springer-Verlag Berlin Heidelberg 2007
A. Chevalier et al.
Until recently, few researchers have been interested in studying and determining the cognitive strategies and difficulties that older users experience while searching for information (see, e.g., [2], [33]). Nevertheless, studies must be conducted in order to better understand the cognitive processes and difficulties involved in searching for information on the Web. Towards this end, we conducted an experimental study with younger and older web users. This study aimed at determining the influence of aging and of the ergonomic quality of websites on information search activity. The following section provides an overview of the relations between information search activity, cognitive load and cognitive aging. Section 3 presents the experimental study. The results obtained are discussed in Section 4.
2 Searching for Information: Cognitive Load and Aging

During the 1980s and 1990s, several attempts at modeling the cognitive activities involved in searching for information were made (see [1], [9], [11]). The first models described this activity as cyclical: the individual defines a (cognitive) goal, selects an information category, extracts information and integrates it with previously extracted information; the individual repeats this cycle over and over until s/he reaches her/his search goal. Nevertheless, these models do not explain why users fail when searching for information. Rouet and Tricot ([25]; then [34]) defined a more complete cognitive model that includes different factors, such as the degree of precision of the user's objective (vague vs precise), the extraction of unique or various sources of information, and the experience of users. This model is close to those developed for searching in electronic information systems by Marchionini et al. [19] and Shneiderman et al. [32], with one main difference: the latter models did not consider, for instance, the specific differences between the Web and bibliographical database systems. The model developed by Rouet and Tricot describes an information search activity which is both cyclical (like Guthrie's model [11]) and close to text comprehension, problem-solving and decision-making activities. Accordingly, when an individual searches for information, s/he elaborates a cognitive goal, selects a set of information, extracts information and integrates it with previous information, restarting this cycle until the search goal is reached. Therefore, searching for information consists in transforming a representation of an information need into a request; its formulation depends on the contents and the constraints imposed by the system (here, the website).
Next, the individual has to choose, among the sources supplied (e.g., the list of links or items presented on a website), those relevant to her/his information search, by evaluating them against her/his representation of the goal. If the document is relevant, the individual deepens her/his search; if the document contains only partially relevant or very little relevant information, the individual generally modifies her/his strategy and thus her/his request. Ultimately, if the document is irrelevant, the individual changes her/his request, and sometimes also her/his representation of the goal. According to this model, searching for information is a very complex cognitive activity that requires many cognitive resources in working memory and thus imposes a high cognitive load.
Searching for Information on the Web
Sweller [31] distinguishes two kinds of cognitive load:
1. Intrinsic cognitive load is linked to the task at hand. It depends on the difficulty of the content to be learned and on the amount of information that the individual has to process simultaneously in working memory. Intrinsic cognitive load decreases as knowledge in long-term memory increases. Consequently, a high intrinsic cognitive load corresponds either to highly complex material or to an individual's low expertise.
2. Extraneous cognitive load is linked to the presentation of information, which influences the cognitive resources involved.

Intrinsic and extraneous cognitive loads are additive, creating a total cognitive load [22]. As just indicated, working memory plays a central role in information search, since it allows the individual to temporarily maintain and process several pieces of available information and goals [3]. Working memory is also one of the factors that cognitive aging research has identified as a possible predictor of age-related declines in performance observed on a wide variety of cognitive tasks, such as reading or problem-solving. Older adults may experience difficulties in temporarily keeping and processing several pieces of information (for a review, see [35]): the amount of information that can be simultaneously processed and stored in working memory appears to decrease with aging. According to Hasher and Zacks [12], it is not so much the size of working memory that matters, but the way in which the information in working memory is managed with regard to the goal of the task at hand. Age-related differences in memory and other cognitive functions have been attributed to a decline in attentional inhibitory control over the contents of working memory (for a review see [18]). Older adults may be less able than younger adults to suppress and inhibit irrelevant information. Irrelevant information could overload working memory and thus interfere with the task to be performed.
Inhibition failures associated with aging have been observed in numerous cognitive activities, such as memory, language comprehension and reasoning [12], [18]. Last but not least, one of the most widespread theories of cognitive aging postulates a generalized decrease in the speed of executing processes, independently of the type or structure of the information being processed [28]. Many studies show that this slowing of processing accounts for a significant portion of the age-related variance on a large number of cognitive tasks. Salthouse [28] suggested that two mechanisms underlie this effect: (1) according to the limited time mechanism, the cognitive operations essential to the success of an activity are not all correctly executed by the elderly; they are executed too slowly to be entirely accomplished in the assigned time, because much of the available time is taken up by early processes. (2) According to the simultaneity mechanism, the outcomes of early processes may be lost before they can be used by later processes. Decline in working memory capacity, inhibition failure and a generalized decrease of processing speed are the main effects of aging on cognition, and seem to explain the decline in older adults' performance observed on numerous simple as well as more complex cognitive tasks, such as information search.
3 Experimental Study

3.1 Research Problem and Objectives

Information searching requires many cognitive resources, which depend on the individual's cognitive capacities and age, as well as on the characteristics and constraints of the website used. Among web interface characteristics, ergonomic quality plays a central role. Indeed, if the website does not fit the users' cognitive capacities, the cognitive load should increase. The possible consequences are overload and lostness on the Web [10], [30]; for instance, Nielsen [23] noticed that half of Web searches fail. Moreover, because of age-related changes in cognitive functioning, accessing websites could become more complex for older users, especially when websites do not fit the users' needs, i.e., sites incorporating (many) ergonomic violations. Accordingly, this study aims at determining the influence of the users' age and the ergonomic quality of a website on:
(a) The time necessary to find information;
(b) The number of steps (i.e., the number of hyperlinks visited by the participants) required to find information;
(c) The amount of cognitive resources involved in finding information;
(d) The participants' usability satisfaction with regard to the visited site.

3.2 Procedure

Forty novice web users (in line with [13]) participated in this study: twenty younger users (M = 31 years old) and twenty older users (M = 64 years old). All of the participants had the same educational background (Bachelor degree) and used the Internet occasionally for information search and e-mail. Two versions of the same website were created, presenting an e-shop selling music products:
• An ergonomic version that was consistent with ergonomic recommendations for web interfaces (ergonomic site, hereinafter referred to as ES, see Figure 1).
• A non-ergonomic site (hereinafter referred to as NES, see Figure 2) that included the main ergonomic problems identified by designers and users in a previous study [7].
Fig. 1. Homepage of the ES
Fig. 2. Homepage of the NES
We chose an e-commerce music site (selling CDs, show tickets, etc.) since such products are bought on-line by many people and do not require specific knowledge linked to the site content. The study was divided into two stages:

Stage 1: Participants had to search for information to answer three questions successively presented from the homepage; each question had only one correct answer. The order of question presentation was counterbalanced across participants. The participants' navigation activities (visited pages, etc.) were recorded for analysis. To respect the ergonomic recommendations and to allow comparison of search times between the two sites, two steps (i.e., two hyperlinks) were necessary to find each of the three correct answers. To measure cognitive load, participants had to react to auditory signals (from the Tholos software developed by [5]) while searching for information, by pressing a pedal with their foot (their hands remained free to use the computer). This made it possible to determine an average reaction time (in milliseconds). Each participant's baseline reaction time, measured during a training phase, was subtracted from the reaction times measured during the experimental task (information search task), thus providing "reaction time interference scores" (RT). Such scores allowed us to measure the participants' cognitive resources: the greater the reaction time, the more cognitive resources were involved (for more details see [5]).

Stage 2: After searching for information, participants had to freely navigate the website in order to answer a usability satisfaction questionnaire including seventeen statements (based on the WAMMI, [14]). For each statement, participants had to indicate their degree of agreement on a 5-point scale; the more satisfactory the participants judged the site, the closer to 5 the grade was.

3.3 Results

All of the results are presented in Table 1.
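The dual-task measure from Stage 1 reduces to a simple subtraction: the probe reaction time under search load minus the baseline reaction time. A minimal sketch with invented reaction times (the real data are not reproduced here):

```python
import numpy as np

# Hypothetical reaction times (ms) to the auditory probes -- illustrative only
baseline_rt = np.array([420.0, 450.0, 410.0, 440.0, 430.0])   # training phase
task_rt = np.array([580.0, 610.0, 640.0, 590.0, 630.0])       # during the search task

# Interference score: mean probe RT under dual-task load minus the baseline mean.
# A larger score indicates more cognitive resources consumed by the search task.
interference = task_rt.mean() - baseline_rt.mean()
print(interference)  # 180.0 ms for these illustrative values
```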
Statistical analyses (ANOVAs) were conducted with Age (younger vs older users) and Site (ES vs NES) as factors. The analyses considered, in the following order, search time and number of steps (§3.3.1), cognitive load (§3.3.2) and usability satisfaction (§3.3.3).

3.3.1 Times and Number of Steps Necessary to Find Targeted Information

The time necessary to find targeted information (in sec.) was calculated from the moment the participant saw the homepage to the moment s/he said s/he had found the information. All of the participants succeeded in finding the three correct answers. The younger users needed significantly less time to find information than the older users (F(1,36) = 11.151; p …)

… >> β (α has to be executed before β). Fig. 1 presents a typical task tree, describing in a rather simplistic way the agenda of a meeting that consists of three talks by users a, b, c (represented by task nodes A, B, and C, respectively) and a discussion (task node D). The talks can be given in any order, which is specified by the temporal relation order independency (“|=|”). The discussion can only be performed after all talks were presented. This is specified by the enabling relation (“>>”). Hierarchical task models are used to specify the behavior of users interacting with a software system. They make it possible to describe the basic temporal structure of
M. Giersich et al.
Fig. 1. Task model specifying the schedule of a meeting: A |=| B |=| C >> D
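The semantics of the Fig. 1 task model can be made concrete by enumerating its valid execution sequences: the talks A, B, C may occur in any order (|=|), while D is enabled (>>) only after all three. A small sketch (our own illustration, not part of the original tooling):

```python
from itertools import permutations

# Valid execution orders for the task model of Fig. 1: talks A, B, C are
# order-independent, and the discussion D is enabled only after all talks.
talks = "ABC"
valid_sequences = [list(p) + ["D"] for p in permutations(talks)]
for seq in valid_sequences:
    print("".join(seq))
print(len(valid_sequences))  # 6
```

Each of the 3! = 6 orderings of the talks, followed by D, is a legal trace; any sequence placing D earlier violates the enabling relation.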
compound activities. For inferring the activity of a user from sensor data, we need additional information: a specification of how probable a certain execution sequence is. Next, we will look at a current approach to this problem.
3 Inferring Intentions
As outlined above, computing the user’s current activity from sensor data requires a task model that allows one to make statements about the plausibility of sensor data given a specific activity. A system can then try to identify the user’s current task by selecting the task whose action sequence is most plausible with respect to the observed sensor data. Bayesian filtering for identifying a user’s current task has been successfully used in several projects that aim at supporting user activities in classrooms, meeting rooms, and office environments [4,5,6]. Here, dynamic Bayesian networks (DBNs) are increasingly investigated for modeling a user’s activities [7,8]. In our own work, we look at using DBNs for inferring the current task and actions of a team of users. Given (noisy and intermittent) sensor readings of the team members’ positions in a meeting room, we are interested in inferring the team’s current objective – such as having a presentation delivered by a specific team member, a moderated brainstorming, a round table discussion, a break, or the end of the meeting. The basic structure of the DBN we propose for modeling the activities of such a team is given in Fig. 2. In general, a DBN consists of a sequence of time slices, where each time slice describes the possible state of a system at a given time t. A time slice consists of a set of nodes that represent the system’s state variables at that time. State variables may be connected through directed causal links. A connection such as X → Y means that the current value of Y depends on the current value of X. This dependency is described by a conditional probability table (CPT), such as

                X = 0   X = 1
P(Y = 0 | X)     0.9     0.3
P(Y = 1 | X)     0.1     0.7
which in this example says that, in case X is 1, the value of Y will be 0 with a probability of 0.3 and it will be 1 with a probability of 0.7. (If X is 0, Y will be 1 with a probability of 0.1 and 0 with a probability of 0.9.)
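Such a CPT can be represented directly as an array whose rows are conditional distributions; sampling a child given its parent is then a draw from the corresponding row. A minimal sketch of this idea (our illustration, not the authors' implementation):

```python
import numpy as np

# CPT for P(Y | X) from the example above: rows index X, columns index Y.
cpt = np.array([[0.9, 0.1],    # X = 0: P(Y=0) = 0.9, P(Y=1) = 0.1
                [0.3, 0.7]])   # X = 1: P(Y=0) = 0.3, P(Y=1) = 0.7

rng = np.random.default_rng(2)

def sample_y(x):
    """Draw Y given X by sampling from the corresponding CPT row."""
    return rng.choice(2, p=cpt[x])

# Each row must be a proper probability distribution
assert np.allclose(cpt.sum(axis=1), 1.0)

draws = [sample_y(1) for _ in range(10000)]
print(np.mean(draws))  # should be close to P(Y=1 | X=1) = 0.7
```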
Task Modeling and Human Behavior Recognition
Fig. 2. Two-sliced dynamic Bayesian network (DBN) modeling team intention inference. It shows the intra-slice dependencies between observable (double-contoured) and hidden variables, as well as the inter-slice dependencies between consecutive states. (For each slice t, the network contains the team node T_t and the per-user nodes G_t^(i), U_t^(i), A_t^(i), and S_t^(i), i ∈ {a, b, c}.)
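The filtering idea behind such a DBN can be illustrated on a drastically simplified stand-in: a single hidden "team intention" variable observed through a noisy position sensor, updated with the standard forward (predict-correct) recursion. All probabilities below are assumptions for the example, not values from the paper's model:

```python
import numpy as np

# Simplified stand-in for the DBN of Fig. 2: hidden team intention
# (0 = presentation, 1 = discussion) observed via a noisy position sensor
# (0 = someone at the speaker stand, 1 = everyone seated).
transition = np.array([[0.9, 0.1],    # P(intention_t | intention_{t-1})
                       [0.05, 0.95]])
observation = np.array([[0.8, 0.2],   # P(sensor reading | intention)
                        [0.3, 0.7]])

def forward_filter(obs_seq, prior=np.array([0.5, 0.5])):
    """Return P(hidden state | observations so far) after each observation."""
    belief = prior.copy()
    beliefs = []
    for o in obs_seq:
        belief = observation[:, o] * (transition.T @ belief)  # predict, then correct
        belief /= belief.sum()
        beliefs.append(belief.copy())
    return np.array(beliefs)

# Three "speaker stand" readings followed by three "seated" readings
beliefs = forward_filter([0, 0, 0, 1, 1, 1])
print(beliefs[-1])
```

After the first three readings the belief favors "presentation"; after the three "seated" readings it shifts to "discussion", which is exactly the kind of intention estimate the full DBN computes from the S-nodes.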
Causal links may connect nodes within a time slice; they may also connect nodes between time slices – the latter is used to express the fact that the state at time t depends on the previous state at time t − 1. (We here consider only DBNs that are first-order Markovian – i.e., where the state at time t depends only on the state at time t − 1 and no earlier states.) The classical problem for Bayesian networks in general, and for DBNs specifically, is that not all node values may be known at a given time. Some nodes may be directly observable, but other nodes may be hidden. Bayesian inference then tries to infer the probability distribution over the hidden nodes’ values from the values of the known (observable) nodes. In our network given in Fig. 2, only the nodes labeled S are observable (they represent sensor data reporting the position of a user); the other ones are hidden. With this network, we try to model the behavior of a team of three users during a meeting. At the top level, the team node T_t represents the current team intention. The team’s intention at time t depends on what the team has already achieved (T at time t − 1, T_{t−1}), and what the users i are currently trying to achieve (the U_t^(i) nodes, i ∈ {a, b, c}). If all users have achieved their individual assignment for the current team intention, the team T will adopt a new intention. This may cause new assignments to the users. The G_t^(i) nodes represent these – possibly new – assignments. So at each time slice, the team looks at what the users have achieved so far and then decides what the users should do next. The CPT of node T therefore represents the negotiation process by which the team members agree on the next joint activity. For instance, if the team decides that the next activity should be the presentation of user a, it would
1114
M. Giersich et al.
assign to user a the task to go to the speaker stand and deliver his speech, while users b and c would be assigned the task to take a seat in the audience. Whether a user i has achieved his assignment at time t – given by U_t^(i) – depends on the user’s current action A_t^(i) and his previous assignment G_{t−1}^(i). The A_t^(i) nodes record the current state of the user’s action (e.g., the user’s current position and velocity in case he has to reach the speaker stand in order to achieve his assignment). What the user is doing at time t depends on his previous action and assignment, A_{t−1}^(i) and G_{t−1}^(i). Finally, the sensor observations of user i at time t – the nodes S_t^(i) – depend on the user’s activities at that time. Note that these sensor nodes are the only observable nodes in our model: we cannot directly look into the minds of the users to observe their joint intention. Rather, we take the available sensor data – the set of S_t^(i) values for the times up to t – and try to find from these the sequence of values for T_s, s ∈ {1 . . . t}, that best explains the observed data; that is, we try to estimate the team’s negotiations from the observable behavior of the team members. Once a probabilistic model is available, it allows us to infer user and team intentions. A substantial question now is of course: Where do we get such a model? Is it necessary to create it manually, from scratch, or can we synthesize it from existing information – such as from a hierarchical task model? This question will be addressed in the next section.
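The filtering computation such a DBN performs can be sketched for a radically simplified stand-in model: one hidden intention variable, one discrete sensor reading per time slice, and invented numbers. All states and probabilities below are purely illustrative, not taken from the paper:

```python
# Illustrative sketch (not the authors' system): exact forward filtering
# for a simplified model with ONE hidden intention variable and one
# discrete sensor reading per time slice. All numbers are invented.

states = ["talk_A", "talk_B"]                        # hypothetical team intentions
trans = {"talk_A": {"talk_A": 0.8, "talk_B": 0.2},   # P(T_t | T_{t-1})
         "talk_B": {"talk_A": 0.1, "talk_B": 0.9}}
obs_model = {"talk_A": {"near_stand": 0.7, "seated": 0.3},  # P(S_t | T_t)
             "talk_B": {"near_stand": 0.2, "seated": 0.8}}

def filter_step(belief, observation):
    """One Bayes-filter step: predict with trans, then correct with obs_model."""
    predicted = {s: sum(belief[p] * trans[p][s] for p in states) for s in states}
    unnorm = {s: predicted[s] * obs_model[s][observation] for s in states}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}

belief = {"talk_A": 0.5, "talk_B": 0.5}              # uniform prior
for s_t in ["near_stand", "near_stand", "seated"]:
    belief = filter_step(belief, s_t)                # belief approximates P(T_t | S_1..t)
```

The real model adds per-user assignment and action nodes between T and S, but the predict–correct cycle over hidden state is the same.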
4  Synthesizing Probabilistic Models
In order to define a complete probabilistic model, sub-models have to be provided for the following three aspects:

– How a team produces a sequence of joint intentions (Team model)
– Which actions a user performs in response to a joint intention, i.e., an assignment (User model)
– Which sensor data are caused by which actions (Sensor model)

We will now look at the first topic and discuss how task models such as task-trees can be used to simplify the definition of such a model. First, the CPT of a T node, given T_{t−1}.history = h and T_{t−1}.activity = α, basically looks as follows (with i ∈ {a, b, c}):

                                 ∀i: U_t^(i).done = true    ∃i: U_t^(i).done = false
P(T_t.history = h ∪ {α})         1                          0
P(T_t.history = h)               0                          1
P(T_t.activity = α)              0                          1
P(T_t.activity = ξ), ξ ≠ α       mmodel(T_t.history, ξ)     0
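The CPT above can also be read as a small piece of code. The following sketch is our illustration, not the authors' implementation: histories are frozensets of finished activities, and mmodel is passed in as a function mapping a history to a distribution over candidate next activities ξ.

```python
# Illustration (ours, not the authors' implementation): the CPT of the
# team node T as a function. `history` is a frozenset of finished
# activities, `activity` is the current joint activity α, and `mmodel`
# maps a history to a distribution over candidate next activities.

def team_cpt(history, activity, all_done, mmodel):
    """Return P(T_t.history, T_t.activity | T_{t-1}, U_t) as a dict."""
    if all_done:  # ∀i: U_t^(i).done = true
        new_history = history | {activity}
        # record the finished activity, negotiate a new one via mmodel
        return {(new_history, xi): p for xi, p in mmodel(new_history).items()}
    # ∃i: U_t^(i).done = false — keep the history, continue activity α
    return {(history, activity): 1.0}
```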
Task Modeling and Human Behavior Recognition    1115

[Fig. 3 shows the generated Markov model as a graph over the execution histories ∅, {A}, {B}, {C}, {A, B}, {A, C}, {B, C}, {A, B, C}, and {A, B, C, D}, with transition probabilities such as ∅→{A}: 0.9, ∅→{B}: 0.09, ∅→{C}: 0.01; {A}→{A, B}: 0.9, {A}→{A, C}: 0.1; {B}→{A, B}: 0.99, {B}→{B, C}: 0.01; {C}→{A, C}: 0.91, {C}→{B, C}: 0.09; and probability 1 on the remaining transitions into {A, B, C} and {A, B, C, D}.]

Fig. 3. Markov model of agenda-driven team process
The history slot of the T node records the team’s previous activities¹; the activity slot holds the team’s current goal, which the users try to jointly achieve through their individual assignments. If all users are done with their assignments, the team will add the current activity to its history and then choose a new activity ξ. Otherwise, it will continue its current activity α. In this CPT, mmodel is the essential point. mmodel is a function that, given a history h and an action ξ, yields the probability that the team will try ξ after h. mmodel thus describes our knowledge about what a team will most probably do in a certain situation. For instance, if the possible actions of a team are {A, B, C, D} and if we know that the team has an agenda stating the sequence of actions A, B, C, D, then mmodel should assign the highest probability to action B when given the history {A} – modeling the bias that a team tends to follow its agenda. However, mmodel should also assign non-zero probabilities to the other actions in order to account for the possibility of deviations from the agenda. A possible model for a simple four-step agenda stating “A, B, C may come in any order, but most probably in the order A, B, C, while D must be the last action” is given in Fig. 3. mmodel essentially specifies a Markov model where the states are partial histories and the edges are transitions between histories. The problem here is how such a model can be specified efficiently: the number of states (histories) grows exponentially in the number of available actions! Our proposal for solving this problem is to utilize hierarchical task models for defining the structure and transition probabilities of a DBN. Specifically, we use an annotated CTT graph for generating an initial proposal for mmodel.
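One way to picture mmodel concretely is as an explicit table over histories. The sketch below is our illustration of the four-step agenda example, with histories represented as frozensets of completed actions and probabilities rounded as in Fig. 3:

```python
# Illustration (ours): mmodel written out as an explicit Markov model over
# execution histories, here represented as frozensets of completed actions.
# The numbers follow the four-step agenda example (rounded as in Fig. 3).

H = frozenset
mmodel = {
    H(""):    {"A": 0.90, "B": 0.09, "C": 0.01},
    H("A"):   {"B": 0.90, "C": 0.10},
    H("B"):   {"A": 0.99, "C": 0.01},
    H("C"):   {"A": 0.91, "B": 0.09},
    H("AB"):  {"C": 1.0},
    H("AC"):  {"B": 1.0},
    H("BC"):  {"A": 1.0},
    H("ABC"): {"D": 1.0},
}

def next_history_dist(h):
    """P(h ∪ {ξ} | h) for every candidate action ξ that may follow h."""
    return {h | {xi}: p for xi, p in mmodel[h].items()}
```

Writing the table out by hand like this is exactly what does not scale – the table has one row per history – which motivates generating it from an annotated task-tree instead.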
Basically, a task model M defined over a set of actions A specifies a directed acyclic graph on possible execution histories h ∈ 2^A, with the additional constraint that (h, h′) ∈ M ⇒ ∃α ∈ A : h ∪ {α} = h′.
¹ Given a set of actions A, an execution history is a set of actions that have already been performed. The set of all execution histories is the power set of A, which we denote by 2^A. (This model makes the simplifying assumption that the exact sequence of actions is not important; however, it is easy to change the history model to a sequence model.)
This means: in a task model M, a history h′ directly results from a history h through the execution of a single action α. The empty history ∅ is the root of this graph². For a given history h, the set C(h) denotes the set of actions that may directly follow this history. C(h) is defined as follows:

C(h) = { α ∈ A | (h, h ∪ {α}) ∈ M }

Clearly, the graph M directly represents the structure of a corresponding Markov model. At this point, the question to be addressed is how to provide initial proposals for the transition probabilities of this Markov model. The idea is to allow the developer of a task model to annotate a task-tree with additional information from which these initial proposals for the transition probabilities can be derived. One straightforward approach is to annotate each sibling action α with a “priority” prio(α), a number that indicates how important an early execution of this node is in relation to the other siblings – such as outlined in Fig. 4. For independently ordered tasks (order relation |=|), the priorities indicate the probabilities of being executed first. Then, for a given history h and a possible extension ξ ∈ C(h), the probability of a transition from h to h ∪ {ξ} is calculated from the priorities by:

P(h ∪ {ξ} | h) = prio(ξ) / Σ_{α ∈ C(h)} prio(α)
The resulting Markov model for the hierarchical task model in Fig. 4 is the one shown in Fig. 3. In this example, the probability that a meeting starts with the presentation of A is 0.9. Accordingly, it is 0.09 for presentation B. The probability that a meeting starts with the third talk, C, is 0.01. If the meeting has started with talk B, the probabilities for the two possible following transitions to {B, C} and {A, B} are given by

prio(C) / (prio(A) + prio(C)) ≈ 0.01   and   prio(A) / (prio(A) + prio(C)) ≈ 0.99.

Note that the most probable path through the generated Markov model is indeed the one following the agenda: ∅ → {A} → {A, B} → {A, B, C} → {A, B, C, D}. Also, if an action is taken out of order, the Markov model specifies that the team will try to return to the agenda: when the meeting has been started with B, the most probable following action will be to return to the planned sequence by executing A. So the generated Markov model represents the intuition behind the task-tree annotations. We have just shown that a proposal for a probabilistic model of user behavior can be generated from an annotated hierarchical task model. Therefore, we claim
² If histories are represented by sequences instead of sets, this graph is a tree.
Task Modeling and Human Behavior Recognition    1117

[Fig. 4 shows the extended task model as a task-tree: the sibling tasks A, B, and C carry the priority annotations 90, 9, and 1 and are connected by the order-independence operator |=|, while the final task D follows via the enabling operator >>.]

Fig. 4. Extended task model specifying the schedule of a meeting
that it is at least possible to exploit established user interface design methodology – task-tree modeling – for additionally enabling some aspects of proactive assistance in ubiquitous computing systems. In the following, we discuss some of the research issues that arise from this approach.
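The priority-to-probability rule of Sec. 4 is simple enough to state in a few lines of code. The sketch below is ours, with the |=| / >> structure of the Fig. 4 example hard-coded in `candidates`; it reproduces the probabilities quoted above and recovers the agenda path greedily:

```python
# Sketch (ours, not the authors' code): the priority-to-probability rule,
# with the task structure of Fig. 4 hard-coded: A, B, C are
# order-independent siblings (|=|) with priorities 90, 9, 1, and D (>>)
# may only run after all of them.

prio = {"A": 90, "B": 9, "C": 1, "D": 1}  # D's priority never competes

def candidates(h):
    """C(h): the actions that may directly follow history h."""
    return ["D"] if h == {"A", "B", "C"} else [a for a in "ABC" if a not in h]

def p_next(h, xi):
    """P(h ∪ {ξ} | h) = prio(ξ) / Σ over α in C(h) of prio(α)."""
    return prio[xi] / sum(prio[a] for a in candidates(h))

# Greedily following the most probable transition recovers the agenda:
h, path = set(), [set()]
while h != {"A", "B", "C", "D"}:
    xi = max(candidates(h), key=lambda a: p_next(h, a))
    h = h | {xi}
    path.append(set(h))
```

For instance, `p_next(set(), "A")` evaluates to 90/100 = 0.9, and `p_next({"B"}, "A")` to 90/91 ≈ 0.99, matching the figures in the text.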
5  Discussion and Outlook
With the simple strategy outlined above, we have shown that it is possible to synthesize Markov models from annotated hierarchical task-trees that capture the probability of task execution in a team. An interesting question now is how well the intended Markov model can be specified by the priority annotations. Not all possible distributions for the transition probabilities can be generated from these task-tree annotations (after all, the number of priority annotations is much smaller than the number of transitions in the Markov model). However, we think it is sufficient if the generated model is approximately correct: the exact transition probabilities are not known in advance anyway. They have to be learned from the observation of real user behavior. The generated probabilities only have to be exact enough to permit a system reasonable behavior right from the start, before training data is available (of course, the better the initial estimate, the less training data will be required). The salient point is a useful definition of “reasonable” and “approximate”. We think it is possible to provide such definitions; this will be part of our future research. The specific task-tree annotations and the accompanying probability computation given in Sec. 4 implicitly assume that the team uses a specific agenda management strategy. Specifically, the synthesized model assumes that a team prefers to execute actions in the order of their original priority, independent of the history. We call this a return-to-agenda strategy. Sometimes, teams might use other strategies. One example is to continue with the successor: if the meeting had started with talk B, the most probable next action would be C. This means the original “successor” of the action actually executed is also the most probable following activity. Another strategy of a team could be to stick to a timetable and to execute each action as close as possible to the original plan.
Different strategies may require different annotations to a hierarchical task model. (For instance, in the case of the continue-with-successor strategy, one needs to
provide the priority annotations at the parent task level rather than at the sibling level.) In addition, it is conceivable to provide a set of mechanisms for inheriting such annotations within a task-tree. In order to render task-tree annotations usable, the set of annotation mechanisms has to be kept as small as possible. Therefore, we need to identify a set of annotations that allows specifying the typical team strategies for agenda management with sufficient precision. Future research has to identify which strategies are typical, what precision is required, and which set of task-tree annotations is able to capture the required information in a way that is usable for real-life problems. To summarize, we have shown that task models are an important tool for addressing salient challenges of mobile and ubiquitous system design: adaptivity and proactive assistance. The models employed today in these areas use quite different modeling concepts, which leads to duplication of work. We have then argued that it is possible to automatically synthesize one kind of model (probabilistic behavior models) by adding simple annotations to the other kind (hierarchical task models). Therefore, it seems possible to generate proactive assistance for every application that already provides a suitably elaborate task model. We think that further research in this area is important for evaluating the potential of this approach and for rendering its benefit accessible to ubiquitous system development practice.
References

1. Card, S.K., Moran, T.P., Newell, A.: The Psychology of Human-Computer Interaction. Lawrence Erlbaum, Mahwah (1983)
2. Mori, G., Paterno, F., Santoro, C.: CTTE: Support for developing and analyzing task models for interactive system design. IEEE Trans. Softw. Eng. 28(8), 797–813 (2002)
3. Fox, D., Hightower, J., Liao, L., Schulz, D., Borriello, G.: Bayesian filtering for location estimation. IEEE Pervasive Computing 2, 24–33 (2003)
4. Franklin, D., Budzik, J., Hammond, K.: Plan-based interfaces: Keeping track of user tasks and acting to cooperate. In: Proceedings of the 7th International Conference on Intelligent User Interfaces, pp. 79–86. ACM Press, New York (2002)
5. Bui, H.: A general model for online probabilistic plan recognition. In: IJCAI ’03: Proceedings of the 18th International Joint Conference on Artificial Intelligence, pp. 1309–1315 (2003)
6. Duong, T.V., Bui, H.H., Phung, D.Q., Venkatesh, S.: Activity recognition and abnormality detection with the switching hidden semi-Markov model. In: CVPR ’05: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 838–845. IEEE Computer Society Press, Los Alamitos (2005)
7. Patterson, D.J., Liao, L., Fox, D., Kautz, H.A.: Inferring high-level behavior from low-level sensors. In: Dey, A.K., Schmidt, A., McCarthy, J.F. (eds.) UbiComp 2003. LNCS, vol. 2864. Springer, Heidelberg (2003), citeseer.ist.psu.edu/article/patterson03inferring.html
8. Patterson, D.J., Liao, L., Gajos, K., Collier, M., Livic, N., Olson, K., Wang, S., Fox, D., Kautz, H.A.: Opportunity Knocks: A system to provide cognitive assistance with transportation services. In: UbiComp 2004, pp. 433–450. Springer, Heidelberg (2004)
A Pattern-Based Framework for the Exploration of Design Alternatives

Tibor Kunert¹ and Heidi Krömker²

¹ Siemens AG, Automation and Drives, P.O. Box 31 80, 91050 Erlangen, Germany
[email protected]
² Technical University of Ilmenau, Institute of Media Technology, P.O. Box 10 05 65, 98694 Ilmenau, Germany
[email protected]

Abstract. Design patterns serve the documentation and sharing of proven solutions for recurring design problems. Additionally, patterns can provide guidance on design alternatives. In this paper we present a pattern-based framework to support the designer in the exploration and evaluation of design alternatives and their tradeoffs. Based upon the systematic identification of recurring design problems, solution alternatives, and their tradeoffs, the framework consists of a generic hierarchy of design problems and solution alternatives as well as of two generic interaction design pattern templates. The presented framework can be used to specify design problems and existing solutions for a specific platform or application domain, to think about design alternatives, and to develop new solutions. In addition, it can be used to structure interaction design pattern collections. The approach is illustrated by a case study for interactive television applications.

Keywords: Interaction design patterns, design patterns, design tradeoffs, interactive television.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1119–1128, 2007. © Springer-Verlag Berlin Heidelberg 2007

1 Introduction

Designing interactive applications includes choosing between design alternatives. Choosing the most suitable alternative is usually based upon an evaluation of the specific advantages and disadvantages, or tradeoffs, of the different design alternatives [3, 12, 9]. A good interaction design solution is a balanced compromise between all requirements and constraints. For an informed design decision, first the alternatives for a given design task need to be clear. Then the specific advantages and disadvantages, or tradeoffs, of the individual design alternatives need to be understood by the designer in order to choose the most appropriate one. However, often neither all available design alternatives nor their specific design tradeoffs are explicitly analyzed, leading to insufficiently informed design decisions. With the objective of supporting informed design decisions based upon the systematic analysis of design alternatives and their tradeoffs, this paper follows a design pattern based approach. The design pattern approach is well suited for the analysis and evaluation of design alternatives because it is based upon the distinction of proven
and unproven design alternatives. Furthermore, the pattern approach is well suited for the analysis of the design tradeoffs of a particular design alternative because it is also based upon the idea that a proven design solution is an appropriate compromise between different competing forces. In the pattern approach, a design problem is the result of competing forces [8]. If there are no competing forces, there is no design problem and no guidance is required. The forces referred to in design pattern terminology can be understood as different requirements and constraints, and the solution approach described in design patterns is a proven way to balance them out. In addition, as a form of design guidance, design patterns provide a format to explicitly state and discuss the specific advantages and disadvantages of a particular design solution. In this way design patterns are well suited to support informed decision making by the designer [5, 8]. Other forms of design guidance, e.g. design principles, guidelines and style guides, have less potential in this regard because they do not provide leads on how to choose between design alternatives or how to solve conflicts between competing forces. Besides the pattern approach, other approaches have been suggested to document interaction design alternatives and their tradeoffs, especially claims analysis [3, 12] and impact analysis [10]. However, in this paper a pattern-based approach for the exploration of design alternatives is followed to take advantage of the strong focus of patterns on the link between design problem and solution.
2 Design Patterns

Design patterns describe successful solutions for recurring design problems [1, 6, 2, 8, 14, 15]. Design patterns usually do not describe isolated solutions but are structured hierarchically, forming a pattern collection or language. The hierarchical order reflects the scope of the design problems addressed: patterns on conceptual design issues are at the top, while patterns on implementation-related design issues are at the bottom. Within a pattern language, different relations between individual patterns exist. Some complement each other, and others describe alternative solution approaches. The design pattern approach is based upon the distinction of proven and unproven design solutions. In the pattern literature, the quality criterion of long-term reliability in the real context of use is commonly applied. Especially for interaction design patterns, usability has been suggested instead as a substitute due to the relatively short existence of interaction design solutions [8]. Using usability [7] as the quality criterion of design solutions has the advantage that it can be evaluated experimentally with prototypes. Thus emerging design patterns for new technologies, e.g. ubiquitous computing, mobile applications or interactive television, can be identified early in the technology lifecycle. Without the availability of design guidance supporting usability and user acceptance early in the technology lifecycle, a new technology may never get further than infancy. More specifically, for new technologies the potential of design patterns is seen, e.g., in serving as a tool to accelerate usability knowledge sharing in a practice-oriented format and to avoid the establishment of insufficient design standards [4, 13]. Although it is widely recognized by pattern authors that it is difficult to identify recurring design problems, no systematic method has been
presented for it so far. Instead, the selection of design problems addressed by existing interaction design pattern collections seems rather arbitrary, leaving it open whether all design problems are covered within a particular pattern collection.
3 Method

The objective is to support informed design decisions through the development of a pattern-based framework for the exploration of design alternatives. The framework is based upon two basic steps:

1. Identification of recurring design problems and solution alternatives.
2. Exploration of the solutions’ design tradeoffs.

As a result of these steps, a framework was developed that consists of two components:

• A generic hierarchy of design problems and solution alternatives.
• Two generic templates for interaction design patterns based upon the hierarchy of design problems and solution alternatives.

The framework presented in this paper is generic and applicable to any platform and application domain. In the following, the conducted steps are described in detail. The results of the individual steps are illustrated using interactive digital television (iTV) applications as an example. Although iTV applications can be as diverse as electronic program guides, voting and betting applications, enhanced or personalized news, educational applications, games and shops, the hardware for user interaction usually is a standard TV or set-top box remote control.

3.1 Identification of Design Problems and Solution Alternatives

Method for the Identification of Design Problems. Solution alternatives are based upon design problems. To be able to systematically explore solution alternatives, the corresponding design problems need to be identified systematically as well. The objective was to identify recurring interaction design problems. To make the coverage of design problems as complete and as diverse as possible, different methods were applied. On the one hand, the perspectives of the users of the applications as well as of the application designers were considered. On the other hand, for each of these two perspectives theoretical as well as empirical analyses were conducted.
To cover the user perspective, a theoretical context of use analysis [7] as well as an empirical user task and user requirements analysis [10] were undertaken. For the designer perspective, a literature analysis of existing platform-specific design guidance as well as an empirical analysis of the designers’ requirements for platform-specific design guidance, using interviews with usability experts for that platform, were conducted. These different methods deliver different types of results. However, they can all be used to identify and specify recurring interaction design problems. Following this approach, design problems from a user perspective are based upon particular context of use specifics and/or user requirements and user tasks. In other words, the design problem is how to cope with particular context of use specifics as well as how to support particular user tasks and user requirements. The designer perspective on the
other hand can provide partly different design problems based upon deficits of existing guidance and/or the designers’ requirements regarding problems to be addressed, in order to better support the application design process.

Method for the Identification of Solution Alternatives. In the application design process, different solution alternatives exist for each design problem. Solution alternatives represent different solution approaches to a particular design problem. After the recurring interaction design problems had been identified, the possible solution alternatives were explored. The objective was to find as many solution approaches to each identified design problem as possible. However, design problems, and thus solution alternatives, exist on different levels. After the decision for a certain solution alternative, several design decisions are required regarding its concrete design. Each design alternative can be broken down into different components, each one representing another lower-level design problem. Many of these components in turn consist of different design variables that need to be decided upon in the design process. To provide meaningful guidance for the solution of a design problem, each of these components and design variables needs to be addressed. To identify solution alternatives for iTV applications, an analysis of existing iTV applications as well as a theoretical analysis of potential alternatives was carried out. Also, design solutions implemented on other platforms or in other application domains were considered as solution alternatives for iTV design problems. The design alternatives thus comprised implemented as well as not yet implemented solutions.

Results. The identified design problems and solution alternatives can be hierarchically classified, resulting in a hierarchy of design problems and solution alternatives.
The objective was to provide a hierarchical overview of design problems and alternatives to support informed design decisions. In addition, a hierarchical structure of design problems facilitates their mapping to solutions within a pattern collection and can serve as the structure of the pattern collection itself. Each step conducted to classify the design problems and solution alternatives brought forth results at a different hierarchy level (Table 1).

Table 1. Steps for the hierarchical classification of design problems and solution alternatives and the corresponding results (hierarchy levels)

Classification step                                                          | Result (hierarchy level)
Identification of recurring high level design problems                       | Design problems (level 1)
Identification of design alternatives for each high level design problem     | Solution alternatives (level 1)
Identification of the components for each design alternative                 | Design problems (level 2)
Identification of the design variables for each design alternative component | Design problems (level 3)
Identification of possible values for each design variable                   | Solution alternatives (level 3)
Only for the design problems on level 2 are no directly corresponding alternatives assigned, because these problems are broken down into different design variables. Fig. 1 shows the developed generic hierarchy of design problems and solution alternatives. In the naming convention of Fig. 1, the numbers in the boxes represent design problems (1, 2, 3, …) while the letters represent solution alternatives (a, b, c, …).
Fig. 1. Generic hierarchy of design problems and solution alternatives
Examples for Interactive TV Applications

• Identification of recurring high level design problems: e.g. Page layout, Navigation, and Text input.
• Identification of solution alternatives for each high level design problem: Solution alternatives for the high level design problem “Text input” are Hardware QWERTY keyboard, On-screen QWERTY keyboard, On-screen alphabetic keyboard, Mobile phone keyboard on remote control, On-screen mobile phone keyboard, and Scrolling alphabet.
• Identification of the components for each solution alternative: Components of the solution alternative “On-screen alphabetic keyboard” are e.g. Keyboard presentation, Character input, Indication of selected character, Cursor presentation, and Character correction.
• Identification of the design variables for each solution alternative component: Design variables of the solution alternative component “Cursor” are e.g. its default position and its presentation.
• Identification of the possible values for each design variable: Possible values for the design variable “Presentation” of the solution alternative component “Cursor” are e.g. static and blinking.

For iTV applications the developed hierarchy of design problems and solution alternatives is illustrated for the design problem “Text input” (Fig. 2).
Fig. 2. Hierarchy of interaction design problems and solution alternatives for the interactive TV design problem “Text input”
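A hierarchy like the one in Fig. 2 can be held as plain nested data. The sketch below is our illustration (not from the paper): it fills in only the entries actually named in the text, and the default-position value is an assumed placeholder.

```python
# Illustration (ours): the three hierarchy levels for the design problem
# "Text input" as nested dicts: solution alternatives -> components ->
# design variables -> possible values. Only entries named in the text are
# filled in; the default-position value is an assumed placeholder.

text_input = {
    "Hardware QWERTY keyboard": {},
    "On-screen QWERTY keyboard": {},
    "On-screen alphabetic keyboard": {
        "Keyboard presentation": {},
        "Character input": {},
        "Indication of selected character": {},
        "Character correction": {},
        "Cursor": {
            "Default position": ["first character"],  # assumed placeholder
            "Presentation": ["static", "blinking"],
        },
    },
    "Mobile phone keyboard on remote control": {},
    "On-screen mobile phone keyboard": {},
    "Scrolling alphabet": {},
}
```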
3.2 Exploration of Design Tradeoffs

Each solution alternative has specific advantages and disadvantages, or tradeoffs, that need to be considered and evaluated when designing a particular application. The suitability and selection of an alternative for a particular application depends, e.g., upon certain context of use specifics: specific user characteristics or a specific use environment can render certain solution alternatives unsuitable. Therefore, to support informed design decisions, the specific tradeoffs of the individual solution alternatives were analyzed and specified. For iTV applications, usability tests were used to identify the tradeoffs of different design alternatives as well as to identify proven solutions to the identified design problems. Tested were existing iTV applications as well as iTV prototypes developed especially for the construction of an iTV interaction design pattern collection. The prototypes systematically implemented different solution alternatives for particular problems based upon the identified hierarchy of design problems and solution alternatives. For the evaluation of the prototypes, comparative usability studies were conducted using a within-subjects design. For design problems for which only one design alternative was available, conventional usability tests were carried out. For both test types, qualitative as well as quantitative measures for effectiveness, efficiency and satisfaction were applied [7]. After all tasks had been completed, a task-based post-test semi-structured qualitative interview was conducted with the test participants regarding critical incidents and specific strengths and weaknesses of the tested design alternative. Especially valuable for the identification of specific design tradeoffs were the qualitative results of the thinking-aloud protocols and the post-test interviews. The two test types resulted in different types of results.
On the one hand, comparative usability studies were used to create a usability ranking of the different evaluated solution alternatives based upon the quantitative as
well as the qualitative measures. On the other hand, both test types were suited to identifying specific advantages and disadvantages of the tested solutions.
4 Documentation of the Results as Design Patterns

The identified design problems and solution alternatives, as well as the tradeoffs of the solution alternatives, can be documented in the form of an interaction design pattern collection. The objective is to support informed design decisions by sharing design and usability knowledge amongst designers. The results of the previous analyses can be used to structure the pattern collection as well as to guide the definition of pattern templates.

4.1 Structure of the Pattern Collection

The structure of the interaction design pattern collection can be based upon the identified hierarchy of design problems and solution alternatives (Fig. 1). However, referring to the top level of the hierarchy is sufficient to structure the resulting design pattern collection, which is characterized by only two hierarchy levels (Table 2). Levels 2 and 3 of the hierarchy do not need to be represented in the structure of the pattern collection but can be addressed within the patterns on level 2. The objective of limiting the structure of the pattern collection to two hierarchy levels is to create an easy-to-use pattern collection and thereby support acceptance by designers.

Table 2. Classification steps for design problems and alternatives, the corresponding levels in the hierarchy of design problems and alternatives, and the corresponding hierarchy level in the pattern collection

Classification step for design problems and solution alternatives | Level in hierarchy of design problems and alternatives | Level in pattern hierarchy
High level design problems                                        | Design problems (level 1)       | Interaction design patterns level 1
Solution alternatives for the high level design problems          | Solution alternatives (level 1) | Interaction design patterns level 2
Components of the solution alternatives                           | Design problems (level 2)       | (addressed within the level 2 patterns)
Design variables for the solution alternative components          | Design problems (level 3)       | (addressed within the level 2 patterns)
Possible values for the design variables                          | Solution alternatives (level 3) | (addressed within the level 2 patterns)
4.2 Design Pattern Templates

To address the identified high-level design problems (hierarchy level 1 in Fig. 1) as well as the lower-level design problems (hierarchy levels 2 and 3 in Fig. 1), two
T. Kunert and H. Krömker
different design pattern templates were created. While pattern template 1 addresses the question "What solution alternative to choose?", pattern template 2 addresses the question "How to design a specific solution alternative?". Template 1 is used for the patterns on level 1 and template 2 for the patterns on level 2 (Table 2). Templates 1 and 2 differ only with regard to the content and structure of their "Problem" and "Solution" sections. The other pattern sections (Name, Examples, Context, Evidence, Related patterns) are identical for both templates. Patterns using template 1 provide an overview of the solution alternatives to a particular design problem (design problems on hierarchy level 1 in Fig. 1) by specifying and defining the alternatives and by listing their individual tradeoffs in the form of advantages and disadvantages in the problem section (Table 3). In the solution section of template 1 patterns, concrete guidelines specify when to use which solution alternative.

Table 3. Interaction design pattern template 1 (naming convention refers to Fig. 1)

Name: Name of the addressed high-level design problem (problem on hierarchy level 1), e.g. Design problem 1
Examples: One screenshot of each solution alternative
Context: Description of the use of this design pattern in the application design workflow
Problem: Solution alternative 1a: specification and definition. Solution alternative 1b: specification and definition. Solution alternative 1n: specification and definition. For each solution alternative (1a, 1b, ..., 1n), its advantages (…) and disadvantages (…).
Solution: Use solution alternative 1a for/when … Use solution alternative 1b for/when … Use solution alternative 1n for/when …
Evidence: Empirical usability test results and references to literature
Related patterns: Other design patterns of this collection that should be considered as well
Patterns using template 2 provide, in the problem section, an overview of the design problems associated with the concrete design of a particular solution alternative (design problems on hierarchy levels 2 and 3 in Fig. 1). The solution alternative components and design variables addressed in the "Problem" section of the patterns following template 2 are identical to the ones specified in Fig. 1. In addition, also in the pattern's problem section, they specify the possible values for the design variables of the corresponding solution alternative components, together with their tradeoffs (Table 4).
Table 4. Interaction design pattern template 2 (naming convention refers to Fig. 1)

Name: Name of the addressed solution alternative (hierarchy level 1), e.g. Solution alternative 1b
Examples: Screenshots of different design approaches for this solution alternative
Context: Description of the use of this design pattern in the application design workflow
Problem: Solution alternative component 1b.1 (design problem on hierarchy level 2): Design variable 1b.1.1 (problem on hierarchy level 3): specification of possible variable values and their trade-offs. Design variable 1b.1.2 (problem on hierarchy level 3): specification of possible variable values and their trade-offs. Solution alternative component 1b.2 (problem on hierarchy level 2): Design variable 1b.2.1 (problem on hierarchy level 3): specification of possible variable values and their trade-offs. Design variable 1b.2.2 (problem on hierarchy level 3): specification of possible variable values and their trade-offs.
Solution: Solution alternative component 1b.1: For design variable 1b.1.1 use the value … (alternative on hierarchy level 3). For design variable 1b.1.2 use the value … (alternative on hierarchy level 3). Solution alternative component 1b.2: For design variable 1b.2.1 use the value … (alternative on hierarchy level 3). For design variable 1b.2.2 use the value … (alternative on hierarchy level 3).
Evidence: Empirical usability test results and references to literature
Related patterns: Other design patterns of this collection that should be considered as well
5 Conclusion

The developed generic hierarchy of design problems and solution alternatives has proven to be a suitable basis for the development of generic interaction design pattern templates aimed at supporting designers in the exploration and evaluation of design alternatives and their tradeoffs. After developing an interaction design pattern collection for iTV applications by applying the presented framework, we can conclude that the framework supports the pattern creation process. The presented framework can not only be used to structure interaction design pattern languages, but also to guide and structure the development of the pattern content. Patterns corresponding to this framework describe the different solution alternatives for a problem and discuss their specific tradeoffs, thus supporting informed design decisions. In addition, the framework may also support the discovery of new design solutions that have not been implemented yet. As raised by Chung et al. [4], a systematic pattern-based framework may facilitate the structured search for new solutions, as accomplished by Mendeleev's periodic table in chemistry. Future research is required to evaluate patterns developed using this framework regarding their practical utility in the application design process. The evaluation would need to distinguish between different phases of the application design process, because the patterns might provide different levels of support for different design phases. In addition, it needs to be investigated what level of maturity a platform or application domain in general requires for patterns to be of support in the design process.
Acknowledgement. This work is supported by the EC 6th Framework IST NoE “3DTV” under Grant 511568.
References

1. Alexander, C., Ishikawa, S., Silverstein, M., et al.: A Pattern Language: Towns – Buildings – Construction. Oxford University Press, New York (1977)
2. Borchers, J.: A Pattern Approach to Interaction Design. Wiley, Chichester, UK (2001)
3. Carroll, J.M.: Making Use: Scenario-Based Design of Human-Computer Interactions. MIT Press, Cambridge, MA (2000)
4. Chung, E.S., Hong, J.I., Lin, J., Prabaker, M.K., Landay, J.A., Liu, A.L.: Development and Evaluation of Emerging Design Patterns for Ubiquitous Computing. In: Proc. of Designing Interactive Systems (DIS) 2004, pp. 233–242. ACM Press, New York (2004)
5. van Duyne, D.K., Landay, J.A., Hong, J.I.: The Design of Sites: Patterns, Principles, and Processes for Crafting a Customer-Centered Web Experience. Addison-Wesley, Boston, MA (2003)
6. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns. Addison-Wesley, Reading, MA (1995)
7. ISO 9241-11: Ergonomic Requirements for Office Work with Visual Display Terminals (VDTs) – Part 11: Guidance on Usability. International Organization for Standardization (ISO), Geneva (1998)
8. Mahemoff, M.J., Johnston, L.J.: Pattern Languages for Usability: An Investigation of Alternative Approaches. In: Proc. Australian Computer Human Interaction Conference OZCHI '98, pp. 132–139. IEEE Computer Society, Adelaide, Australia (1998)
9. Nielsen, J.: Usability Engineering. Morgan Kaufmann/Academic Press, Boston (1993)
10. Preece, J., et al.: Human-Computer Interaction. Addison-Wesley, Wokingham, UK (1994)
11. Preece, J., Rogers, Y., Sharp, H.: Interaction Design: Beyond Human-Computer Interaction. Wiley, New York (2002)
12. Rosson, M.B., Carroll, J.M.: Usability Engineering: Scenario-Based Development of Human-Computer Interaction. Morgan Kaufmann, San Francisco (2002)
13. Saponas, T.S., Prabaker, M.K., Abowd, G.D., Landay, J.A.: The Impact of Pre-Patterns on the Design of Digital Home Applications. In: Proc. of Designing Interactive Systems (DIS) 2006, pp. 189–198. ACM Press, New York (2006)
14. Tidwell, J.: Designing Interfaces. O'Reilly, Sebastopol, CA (2005)
15. van Welie, M., van der Veer, G.C.: Pattern Languages in Interaction Design. In: Rauterberg, M., et al. (eds.) Proc. of IFIP INTERACT '03, pp. 527–534. IOS Press, Amsterdam (2003)
Tasks Models Merging for High-Level Component Composition

Arnaud Lewandowski¹, Sophie Lepreux², and Grégory Bourguin¹

¹ Laboratoire d'Informatique du Littoral (LIL), 50 rue Ferdinand Buisson, F-62100 Calais, France
{lewandowski,bourguin}@lil.univ-littoral.fr
² University of Valenciennes, LAMIH, Le Mont-Houy, F-59313 Valenciennes Cedex 9, France
[email protected]

Abstract. As users become more and more demanding about the software environments they use, they need environments offering them the possibility to integrate new tools in response to their emerging needs. However, most high-level component composition solutions remain out of reach for users. Building on an innovative approach that tends to provide more understandable components, we propose in this paper a new mechanism to assist high-level component composition. This approach realizes the composition by assembling tasks models. The assistance we propose is based on an adaptation of tree algebra operators and is able to automatically merge tasks trees in order to assist the integration of high-level components into a more global environment.
1 Introduction

Many theoretical and empirical studies have already demonstrated the emerging nature of users' needs towards their activities and the environments supporting them [0,0]. Accordingly, many research works tend to integrate into software environments the mechanisms suited to support these emerging needs, and to give users the possibility to make these environments evolve. One solution is to allow users to integrate tools inside their environment, i.e. to compose high-level software components. In order to be efficient and accurate, such integration should be fine-grained and dynamic. Despite the great amount of work and the advances that have been made in the field of component integration and composition, one must admit that the available solutions are still generally complex and aimed at a public of software development experts. The purpose of our work is to facilitate the fine-grained and dynamic integration of tools (or high-level components) inside a global environment. To reach this goal, we can identify two aspects on which we have to work. Firstly, we should be able to provide more understandable components. Secondly, we should provide some automated or semi-automated assistance for composing such components. In this paper, we particularly focus on this latter aspect. In the first part of the paper, however, we introduce the Task Oriented (TO) approach, which tends to provide more understandable components. This innovative approach proposes a way to construct high-level components that could be

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1129–1138, 2007.
© Springer-Verlag Berlin Heidelberg 2007
more easily integrated afterwards, especially through the use of tasks models. Following this approach, we propose to realize the integration of high-level components inside a global environment through the assembling of their individual tasks models into the more global task of the integrating environment. Though interesting, this proposition raises some questions. We particularly focus in this paper on the means that could help or assist the merging of several tasks models. In the second part of the paper, we present a solution developed during previous work on composing XML trees by means of specific operators. Since the tasks models of our components are described in XML documents, we propose in the third part of the paper to adapt the XML tree composition solution to the merging of several tasks models. This proposition tends to assist the composition of high-level components that have been developed according to the TO approach. Finally, we illustrate this proposition with an example of high-level component composition through the assisted merging of individual tasks models.
2 The Task-Oriented Design Approach

Software component composition is a large and complex research area, and many technical solutions try to address it. For example, distributed components such as CORBA components [0], EJB (Enterprise JavaBeans) [0], or Web Services [0] have been conceived with the perspective of their future integration. Some of them are associated with composition languages [0] that allow the fine-grained integration of these components or services inside software applications. One can notice that such technical solutions are exclusively usable by software development experts, especially because of their complexity, their implementation cost and the specificity of the techniques used [0]. However, these different methods follow the same principle: it is possible to dynamically discover objects on the Internet, to instantiate them, to discover their public methods and eventually their event channels, and to determine how the global environment will integrate and pilot them (through specific method calls). Even if very useful, these mechanisms mainly address the technical dimension of the problem. Indeed, finely and dynamically integrating a tool not only supposes that we are able to use it, but also that we understand how to use it. And even if some documentation supports – like the Javadoc for JavaBeans components, or WSDL descriptions for Web services – may exist, this semantic problem still remains. Every developer has been faced with this problem: introspecting the list of the public methods of a specific component, even with their documentation, is generally not sufficient to finely realize its integration: knowing the methods does not tell you – for instance – in which order you have to call them for the component to work properly. The Task Oriented approach is intended to fill this semantic gap in high-level components and to raise the abstraction level of their composition.
Our approach considers that each high-level component aims at supporting a specific kind of activity. Our goal is to provide the means to contextualize the many tools or components involved in the realization of a global task. In other words, the global environment has to manage what we call the inter-activities [0], i.e. the links
existing between the activities supported by the many tools integrated in a global environment in order to support a composed, global activity. We consider then that each component supports the task it has been designed for. Indeed, the designer of a specific tool or high-level component has created the underlying mechanisms and its interface in order to propose an adequate support for a specific generic task. Thus, a mailing component supports the realization of mailing tasks; a chat component supports synchronous discussion activities; etc. So we can consider that contextualizing a tool is equivalent to contextualizing an existing task within the frame of a more global task, such as co-writing an article for HCI International, where a mailing tool may be associated with a word processor, a chat and other tools. In order to facilitate this contextualization and to address the dynamic integration problems, we propose to make better use of the component's tasks model, a kind of missing link that generally disappears between the design stage and the delivered code. Actually, tasks models are generally used at the beginning of the software development process. But their use progressively fades during the process and finally disappears behind an object-oriented design approach inspired by the computer engineering background. This classical software design approach tends to transform tasks models into object models, from which the class-based structure of the produced component implicitly emerges. The original tasks model is swamped, implicitly inscribed in the complexity of the produced source code. Indeed, task-oriented approaches are rarely used – or even not used at all – during the design and development cycle, namely after the requirements collection and analysis.
Nevertheless, at the stages where they are used, tasks models often serve, as shared objects, to support better communication between the many actors (including the future users) involved in the complex software development process. Tasks models also contain useful information describing the functioning of the tool and serve a better understanding of it. The Task Oriented design approach tends to keep the benefits of tasks models during the whole software development process, and even during composition or integration stages. In order to facilitate high-level component composition, the TO approach proposes to include the tasks model of a component within it [0]. This approach consists in the explicit preservation of the links between the functional source code and the tasks model it is based on. Figure 1 summarizes part of the architecture of a high-level component developed according to the TO approach – or TO component. Not only does the embedded tasks model add some semantics to the component and help in its understanding, but it can also be used for its integration. Thus, the developer of a specific component can specify on its tasks model which parts (or subtasks) can be "shunted", i.e. realized by the global environment. For example, in the chat component of Figure 1, the "connect" subtask could be shunted by the global environment calling the connect() method with the right arguments, which may have the effect of 'realizing' the "validate" subtask, and as a consequence the "connect" subtask, skipping the corresponding interface. The purpose of this paper is not to describe in detail how these links between the functional code and the tasks model are kept inside the TO component. Briefly, the tasks model, described in an XML document, contains some information about the tasks that can be realized through the call of a specific method by any global environment integrating this component.
For such tasks, a specific field indicates which method it corresponds to. These specific methods are
grouped together into a kind of wrapper class so that the global integrating environment can do the appropriate calls to pilot the component. Such TO components are then developed with their future integration in mind, since the designers/developers are able to put in them the appropriate methods that will shunt some of their subtasks and then adapt their behavior.
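To make this idea concrete, an embedded tasks model with task-to-method links can be sketched as below. The XML shape and the `method` attribute are our own illustrative assumptions, not the actual TO component format; they only mimic the described "specific field" that ties a shuntable task to a wrapper method.

```python
import xml.etree.ElementTree as ET

# Hypothetical tasks model embedded in a TO chat component.
# The "method" attribute marks subtasks that the global
# environment may shunt by calling the named wrapper method.
TASKS_MODEL = """
<task name="chat">
  <task name="connect" method="connect">
    <task name="enter_login"/>
    <task name="enter_password"/>
    <task name="validate" method="connect"/>
  </task>
  <task name="use"/>
</task>
"""

def shuntable_tasks(xml_text):
    """Return (task name, wrapper method) pairs for every
    subtask the global environment can realize itself."""
    root = ET.fromstring(xml_text)
    return [(t.get("name"), t.get("method"))
            for t in root.iter("task") if t.get("method")]

print(shuntable_tasks(TASKS_MODEL))
```

A global environment could iterate over such pairs and invoke the corresponding wrapper methods instead of presenting the component's own connection interface.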
Fig. 1. Architecture of a TO component
Therefore, high-level component composition can be realized through the composition, or merging, of several tasks models into a more global one, supposing that these components are "TO components" that include their tasks models. The global environment, managing the global task, can integrate the many tools by integrating their tasks models. One of the benefits of this approach is that it raises the abstraction level required for assembling components. It removes the need to look at the public methods of the component and to understand how to call them in order to properly integrate it, since this information is obtained through the tasks model. The purpose of this paper is to propose means that can help this integration of tasks models and the merging of parts of them. The solution we propose is inspired by results obtained in the domain of XML tree composition applied to the merging of graphical user interfaces. We now introduce these previous results before presenting how we use them for merging TO components.
3 XML Tree Composing

A tasks model can be expressed in XML, and the XML document can be associated with an algebraic tree. We propose to use tree algebra to manipulate the XML document in which the tasks model is written. The TAX model (Tree Algebra for XML) [0] defines a data tree as a rooted, ordered tree, such that each node carries data (its label) in the form of a set of attribute-value pairs. Each node has a special, single-valued
attribute called tag whose value indicates its type. A node may have a content attribute representing its atomic value. Each node also has a virtual attribute called pedigree, drawn from an ordered domain. The pedigree carries the history of "where it comes from". Pedigrees play a central role in grouping, sorting and duplicate elimination. Originally proposed for database management, this model is also well suited to manipulating XML documents from the Human-Computer Interaction domain. In [0], for instance, some operators (such as the Union, Fusion, Selection, Difference and Equals operators) have been adapted in order to manipulate Graphical User Interfaces (GUIs) defined with the UsiXML UIDL [0], which is based on XML. Besides, a plug-in (ComposiXML) has been developed for the GrafiXML editor to compose GUIs [0]. The following example illustrates how this principle of XML tree composing, based on the TAX model and applied to GUIs, works. Figure 2 shows the tree representation of a Union operator applied to two input interfaces. This result is computed from two XML trees in the case of a horizontal union (this layout precision is specific to the Concrete User Interface operator). The input interfaces and the resulting one (Final User Interface on the Java platform) are presented in Figure 3. The Union operator creates a new window, whose width is equal to the sum of the two input windows' widths. It also adds a box in the tree to indicate the new type of layout (horizontal in this example). Then the duplicates are deleted; for instance, as the two buttons Save and Close appear on each form, they will be deleted from one of them (this choice – which ones will be deleted – is made by the user). To do this, the algorithm uses the tag "default value" and the associated content to compare the elements. It uses the pedigree to know whether the parent elements are repetitive.
In the example, the box (with type = horizontal) is repetitive if its children are two buttons with default values equal to Save and Close; to determine this, the pedigree is used.
Fig. 2. The tree resulting from the application of a Union operator on two interfaces
We think that this principle based on tree algebra can also be applied at a higher level, the tasks model level. That is what we now illustrate through an example, showing how the XML tree composing approach can help address the TO component composition problem.
Fig. 3. Union of two user interfaces without repetition of the common part. The resulting interface is described by the tree of Figure 2.
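The Union with duplicate elimination described above can be sketched with a much-simplified data-tree model. The `Node` class and the comparison on tag and content are our own illustration of the description, not the TAX or ComposiXML implementation; in particular, the pedigree field is only carried along, not used for the full repetition test TAX performs.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    tag: str                       # node type, e.g. "button"
    content: Optional[str] = None  # atomic value, e.g. "Save"
    pedigree: str = ""             # provenance: which input tree it came from
    children: List["Node"] = field(default_factory=list)

def union(a: Node, b: Node) -> Node:
    """Union of two GUI trees: wrap both under a horizontal box,
    then drop children of b that duplicate children of a
    (same tag and content, as in the Save/Close example)."""
    seen = {(c.tag, c.content) for c in a.children}
    kept = [c for c in b.children if (c.tag, c.content) not in seen]
    box = Node(tag="box", content="horizontal",
               children=a.children + kept)
    return Node(tag="window", children=[box])

form1 = Node("form", pedigree="left",
             children=[Node("button", "Save"), Node("button", "Close"),
                       Node("field", "Name")])
form2 = Node("form", pedigree="right",
             children=[Node("button", "Save"), Node("button", "Close"),
                       Node("field", "Address")])
merged = union(form1, form2)
print([c.content for c in merged.children[0].children])
```

The duplicated Save and Close buttons of the second form are discarded, while the distinct Name and Address fields are both kept under the new horizontal box.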
4 Tasks Models Merging Using Composition Operators

4.1 The Composition Problem

First, we start from the assumption that we have at our disposal two components – a chat tool and a shared whiteboard – that have been developed according to the Task-Oriented design approach. The architecture of each of these two final stand-alone components is then similar to the one illustrated in Figure 1. As we said in the previous part, the use of tasks models should ease the composition of such components. Indeed, assembling such components, in which the tasks model is linked to the functional code, can be realized through the integration, or linkage, of their individual tasks models into a more global tasks model, the one of the global environment. As each individual tasks model is linked to the code of the component it describes, the global environment will be able to know which methods to call in order for this component to be properly integrated. The introduction of the composition principle based on tree algebra provides assistance for realizing this integration, by helping to merge several subtasks.
Fig. 4. Similar parts between the Chat tasks model and the Whiteboard tasks model. Thanks to the tree algebra composition approach, we want to assist the merging of these two subtasks so that the global environment integrating the two components may take charge of the two connection processes simultaneously.
The embedded tasks models of these two high-level components appear in Figure 4. Both may include a specific and similar connection subtask corresponding to a specific interface asking the user for his/her login and password (and possibly other information specific to each component, like the channel for the chat, or the board to join for the whiteboard). If we integrate these tools into the global environment without any specific merging, the environment will just launch the tools without any particular configuration, and the user will have to identify him/herself twice (once for the chat, and once for the whiteboard, since they both have their own similar and required "connect" or "identification" subtask). We propose a mechanism to assist the integration of these tasks models by merging the appropriate subtasks.

4.2 Tasks Models Composing Using Tree Algebra

We can imagine several scenarios of composition. In this paper, we focus on the following one: we want the global environment to perform the connection phase (that means the "connect" subtask for the chat, and the "identification" subtask for the whiteboard) at the very beginning; after that, the user will be able to use both tools in parallel. If we look at the tasks models of these two components (see Figure 4), we understand that assisting this scenario consists in: 1) extracting the two subtasks or sub-trees tied to the connection processes; 2) merging these two subtasks into one; and finally 3) plugging into the global tasks model the three resulting sub-models (the one containing the result of the merging, and the tasks models of the two components). In order to realize this, we adapt the tree composing approach presented before.
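The three steps of this scenario can be sketched as a small driver over element trees. The task names and the XML shape are our assumptions for illustration, not the actual models of the two components:

```python
import xml.etree.ElementTree as ET

# Hypothetical, much-reduced tasks models of the two TO components.
CHAT = ET.fromstring(
    '<task name="Chat"><task name="connect"/><task name="use"/></task>')
BOARD = ET.fromstring(
    '<task name="Whiteboard"><task name="identification"/><task name="draw"/></task>')

def extract(tree, name):
    """Step 1: detach the connection subtask from a component's model."""
    sub = tree.find(f"./task[@name='{name}']")
    tree.remove(sub)
    return sub

def merge(a, b):
    """Step 2: fold the two similar connection subtasks into one task."""
    merged = ET.Element("task", name="connect")
    merged.extend(list(a) + list(b))  # keep the subtasks of both, in order
    return merged

# Step 3: plug the merged task first (enabling relation), then both models.
connect = merge(extract(CHAT, "connect"), extract(BOARD, "identification"))
root = ET.Element("task", name="global")
root.extend([connect, CHAT, BOARD])
print([t.get("name") for t in root])
```

Placing the merged connection task before the two component models reflects the enabling relationship: both tools become usable only after the single connection step.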
Fig. 5. Part of the XML tree obtained by transformation of the tasks model of the chat component. Some examples of the key concepts used during tree algebra transformation appear on it: (a) a tag, (b) a tag’s content, and (c) a pedigree.
The tasks models are described and stored in XML documents. The first step consists in transforming them into XML trees. Figure 5 illustrates this transformation applied to the tasks model of the chat component. These XML trees
serve as the basis for the assisted composition. According to our scenario, we want to merge the Connect subtask of the chat and the Identification subtask of the whiteboard. From the tree algebra and the work we presented in part 3 about GUI composition, this corresponds to the Union operator. There are many ways to implement a Union operator between two trees. The most immediate possibility consists in creating a new tree whose root's identifier is equal to the name of the global task; then a subtask is created and the two input trees (corresponding to the tasks models of the two TO components) are simply plugged into this subtask. However, this basic solution does not merge the similar connection subtasks of the components; it does not manage the problem of the repeated connection process. A second alternative consists in deleting one of the two similar subtasks. This was the choice made for the implementation of the Union operator in the frame of GUI composition (cf. part 3). In this case, two solutions are possible: either we delete the connect subtask of the chat, or the identification subtask of the shared whiteboard. But with both solutions, a problem remains: in the global environment, the user should be able to use the two components in parallel. If we delete one of the two similar subtasks – the connect subtask of the chat, for example – and do every necessary connection in the remaining one – the identification subtask of the whiteboard – the resulting tree is not coherent; indeed, the chat task may be initiated before the connection step, which now stands in the whiteboard. The enabling relationship between the "connect" and the "use" subtasks of the chat forbids this solution. The other solution – keeping the chat connection and removing the whiteboard identification – presents the same problem. The algorithm we propose in order to implement the Union operator between two tasks trees is the following.
It first creates a new subtree that will contain the merging of the two similar subtasks. This merging is possible since both connection subtasks have a similar structure. The algorithm imports one of them into the new tree, and adds to it the missing subtasks appearing in the second one, according to their order in the corresponding input tree. In order to respect the enabling relationship between the connection subtask and the use subtask of each component, the created task must be placed before the integration of the components' individual tasks models. Once the two subtasks have been merged, only one interface will be presented to the final user at the beginning, and s/he will fill in the form only once. The task generated by the merging takes charge of the necessary method calls on the two components in order for their own "connect" subtasks to be effectively realized. This is possible because each TO component contains the links existing between some tasks of its model and specific methods of its code (cf. part 2). The final resulting behavior is illustrated in Figure 6 by the tasks model of the global environment, which integrates the result of the merging process and the tasks models of the two components. As we can notice, the enabling relationship is kept between the merged subtask that realizes the connection processes and the parallel use of the two TO components. This simple example illustrates what kind of merging this approach is able to realize, depending on the choices of the user who finally does the component assembling. This proposition assists the integration of individual tasks models. If the person in charge of this integration specifies which subtasks are similar and should be merged (in our example, the two "connect" subtasks), our solution is able to merge these subtasks into one task in the global model, and possibly merge the corresponding interfaces too.
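The merging step itself can be sketched as follows, under the assumption that subtasks are matched by name; the field names (login, password, channel, board) are invented for illustration and do not come from the actual components:

```python
import xml.etree.ElementTree as ET

# Hypothetical connection subtasks of the two components.
CHAT_CONNECT = ET.fromstring(
    '<task name="connect">'
    '<task name="login"/><task name="password"/><task name="channel"/>'
    '</task>')
BOARD_IDENT = ET.fromstring(
    '<task name="identification">'
    '<task name="login"/><task name="password"/><task name="board"/>'
    '</task>')

def merge_subtasks(first, second):
    """Import the first subtask, then append the subtasks of the
    second one that are missing, preserving their input order."""
    merged = ET.fromstring(ET.tostring(first))  # deep copy of the first tree
    present = {t.get("name") for t in merged}
    for t in second:
        if t.get("name") not in present:
            merged.append(t)
    return merged

merged = merge_subtasks(CHAT_CONNECT, BOARD_IDENT)
print([t.get("name") for t in merged])  # login, password, channel, board
```

The shared login and password fields appear once in the merged task, while the component-specific channel and board fields are both preserved, so a single connection form can serve both tools.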
Fig. 6. Part of the tasks model of the global environment after the merging process. It first contains the tasks model resulting from the merging process, followed by the two adapted tasks models of the integrated components.
5 Conclusion and Future Work

Providing means that allow users to adapt their software environments by finely and dynamically integrating high-level components is truly a challenge. A first problem is that components are generally hard to understand. We have presented the Task-Oriented approach, which proposes a new way to construct more understandable high-level components, especially through the use of tasks models. According to this approach, assembling TO components then means composing tasks models. In order to assist this integration, we have proposed a mechanism inspired by previous work on tree algebra applied to GUI composition. TO components' individual tasks models are described in XML documents. Assembling tasks models can then be seen as assembling XML trees, and tree operators can be adapted to this specific domain. We have illustrated this approach with an example: the integration of two TO components with the automatic merging of two similar subtasks they share. The tasks models are transformed into trees, on which we apply a Union operator that merges the identified similar subtasks. The resulting tasks models are then integrated into the global model of the integrating environment. Even if this example seems very specific, it can be extended to other operators and technologies. The only requirement is that the high-level components have to be developed according to the TO approach. We now pursue our efforts in order to generalize this approach and provide a more efficient assistance to high-level component composition by users.

Acknowledgments. The present research work has been supported by the "Ministère de l'Education Nationale, de la Recherche et de la Technologie", the "Région Nord-Pas-de-Calais" and the FEDER (Fonds Européen de Développement Régional) during the projects MIAOU and EUCUE. The authors gratefully acknowledge the support of these institutions.
The authors also thank Jean Vanderdonckt for his contribution concerning UsiXML.
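The merging of similar subtasks summarized above can be pictured as a recursive Union over task trees. The following is only a hedged sketch in JavaScript: the tree representation (plain objects with `name` and `children`) and the name-based similarity test are illustrative assumptions, not the paper's actual algorithm or its UsiXML encoding.

```javascript
// Sketch of a Union operator over two task trees: subtrees whose root
// tasks are judged similar (here: same name, an assumed criterion) are
// merged recursively; everything else is kept from both trees.
function union(a, b) {
  if (a.name !== b.name) {
    // Nothing similar at the root: keep both trees under a common root.
    return { name: "root", children: [a, b] };
  }
  var merged = { name: a.name, children: a.children.slice() };
  b.children.forEach(function (child) {
    var match = merged.children.filter(function (c) {
      return c.name === child.name;
    })[0];
    if (match) {
      // Similar subtasks are merged rather than duplicated.
      merged.children[merged.children.indexOf(match)] = union(match, child);
    } else {
      merged.children.push(child);
    }
  });
  return merged;
}
```

With this sketch, merging two components that share an "Authenticate" subtask yields a single "Authenticate" subtree containing the union of their children.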
A. Lewandowski, S. Lepreux, and G. Bourguin
References
1. Blevins, D.: Overview of the Enterprise JavaBeans Component Model. In: [5], pp. 589–606 (2001)
2. Clerckx, T., Luyten, K., Coninx, K.: The Mapping Problem Back and Forth: Customizing Dynamic Models while Preserving Consistency. In: TAMODIA 2004, Prague, Czech Republic (November 15–16, 2004)
3. Cubranic, D., Murphy, G.C., Singer, J., Booth, K.S.: Learning from project history: a case study for software development. In: Proc. of CSCW 2004, pp. 82–91. ACM Press, New York (2004)
4. Ferris, C., Farrel, J.: What are web services? Comm. of the ACM 46(6), 31 (2003)
5. Heineman, G.T., Councill, W.T. (eds.): Component-Based Software Engineering: Putting the Pieces Together. Addison-Wesley Longman Publishing Co., Inc., Boston (2001)
6. Jagadish, H.V., Lakshmanan, L.V.S., Srivastava, D., Thompson, K.: TAX: A Tree Algebra for XML. In: Ghelli, G., Grahne, G. (eds.) DBPL 2001. LNCS, vol. 2397, pp. 149–164. Springer, Heidelberg (2002)
7. Lepreux, S., Vanderdonckt, J., Michotte, B.: Visual Design of User Interfaces by (De)composition. In: Proc. of DSV-IS 2006, Dublin, Ireland (July 26–28, 2006). Springer, Heidelberg (2006)
8. Lewandowski, A., Bourguin, G.: Inter-activities management for supporting cooperative software development. In: Nilsson, et al. (eds.) Advances in Information Systems Development: Bridging the Gap between Academia and Practice, vol. 1, pp. 155–167. Springer, Heidelberg (2005)
9. Lewandowski, A., Bourguin, G., Tarby, J.C.: Les Modèles de Tâches pour la Contextualisation des Composants. In: Proc. of the 10th Intern. Conf. Ergo'IA, Bidart/Biarritz, France, pp. 147–154 (October 11–13, 2006)
10. Limbourg, Q., Vanderdonckt, J., Michotte, B., Bouillon, L., López, V.: UsiXML: a Language Supporting Multi-Path Development of User Interfaces. In: Bastide, R., Palanque, P., Roth, J. (eds.) Engineering Human Computer Interaction and Interactive Systems. LNCS, vol. 3425, pp. 200–220. Springer, Heidelberg (2005)
11. Suchman, L.: Plans and Situated Actions. Cambridge University Press, Cambridge (1987)
12. Van der Aalst, W.: Don't go with the flow: Web services composition standards exposed. Trends & Controversies, Jan/Feb 2003 issue of IEEE Intelligent Systems (2003)
13. Wang, N., Schmidt, D.C., O'Ryan, C.: Overview of the CORBA component model. In: [5], pp. 557–571 (2001)
Application of Visual Programming to Web Mash Up Development Seung Chan Lim, Sandi Lowe, and Jeremy Koempel MAYA Design 2730 Sydney St Building 2 Suite 300 Pittsburgh, PA 15203, USA {slim, lowe, koempel}@maya.com
Abstract. The ongoing adoption of the latest Web development patterns, such as AJAX, is helping to enhance the user experience on the Web. Moreover, there is now API-based support from various vendors that allows the seamless fusion of disparate data sources into a single application. However, the barrier for Web designers to integrate such features into their Web applications remains high, which hampers a wider proliferation of such novel Web applications. In this paper, we conduct an experiment to see whether visual programming is appropriate for allowing Web designers to integrate the aforementioned features. For the experiment, we have developed a prototype, tentatively named WIPER, that allows Web designers to incorporate pre-built JavaScript components into live Web pages using drag-and-drop. We combined rapid revision with usability testing to iteratively advance our prototype. Working with users, we have learned that, with some targeted refinements, the visual programming paradigm can be very effective in achieving our goal. Keywords: Visual Programming, Dataflow Architecture, JavaScript, Rapid Prototyping, End-User Programming.
1 Introduction
Large-scale adoption of the AJAX pattern [1], coupled with sophisticated use of JavaScript, has given a significant boost to the deployment of rich Web applications. Rich Web applications typically execute server-resident programs to retrieve data without refreshing the Web page. When the data retrieved from the server returns, only the necessary parts of the page are updated. Compared to the traditional model, which refreshes the entire page, this technique results in a richer and more responsive user experience. Good examples of such rich Web applications include Gmail, Google Maps, and Flickr. The proliferation of such techniques has also drawn end-users' attention to using JavaScript and Web services to come up with new applications of their own. This trend is illustrated by the recent onslaught of numerous Web mash ups. Web mash ups are Websites or Web applications that seamlessly combine content from more than one source into an integrated experience [2]. For example, there are currently several hundred Web mash ups that integrate Google Maps with third-party data to provide visualizations of geospatial information [3].
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1139–1148, 2007. © Springer-Verlag Berlin Heidelberg 2007
Building such rich Web applications typically involves extensive use of JavaScript. JavaScript is used to modify the parts of the Web page that need updating, and also to retrieve remote data using patterns such as AJAX. Although several third-party JavaScript code samples exist for download, integrating and configuring them has proven difficult for Web designers. In this paper, we conduct an experiment to see if the paradigm of visual programming can help Web designers more easily employ modern Web development techniques such as AJAX and other scripted real-time interactions. We define Web designers as those who have enough skill and inclination to mock up pages, but who struggle with or avoid the programming required to add rich interactions to their designs. Visual programming is a graphical paradigm that has been tested under various conditions to facilitate end-user programming [4,5,6,7,8,9]. We combined rapid iteration with usability testing to advance our prototype visual programming tool, tentatively named WIPER. Through the course of this work we tested the usability of the tool with fourteen Web designers by asking them to complete an identical Web development task. The task involved the creation of a rich Web application that allows page visitors to access, display and interact with data retrieved from a remote server using the AJAX pattern. Incremental design modifications were made to our prototype after each user test until consecutive user tests indicated that users were having little difficulty accomplishing the task at hand. Each user session was preceded by a brief demonstration that illustrated dragging components, consulting the help text, wiring components and testing the page.
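The retrieve-and-partial-update cycle of the AJAX pattern described in the introduction can be sketched in a few lines of JavaScript. This is a minimal illustration, not WIPER code: the URL "/photos/count" and the element id "photo-count" are assumptions invented for the example.

```javascript
// Minimal AJAX sketch: fetch data from the server asynchronously and
// update only the affected element, leaving the rest of the page intact.
function refreshPhotoCount() {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/photos/count", true); // true = asynchronous request
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      // Only this one element is rewritten; no full-page refresh occurs.
      document.getElementById("photo-count").textContent = xhr.responseText;
    }
  };
  xhr.send(null);
}
```

The key contrast with the traditional model is that the browser never reloads the page; the callback rewrites a single DOM node when the response arrives.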
2 Related Work
Visual programming has been a topic of interest to many researchers for over two decades. Examples of visual programming environments span from the early works of Pygmalion [10] and Prograph [11] to VIPERS [12], LabView [13] and Interstacks [14]. WIPER builds on the interaction paradigms found in these systems. The most popular examples of Web programming tools include Macromedia Dreamweaver [15] and Adobe GoLive [16]. Although these WYSIWYG editors help style and lay out the contents of a Web page in a visual manner, they do not provide any visual means to integrate and configure third-party JavaScript components that provide rich interaction patterns and/or access remote services via AJAX. Examples of work focused on the design of a GUI allowing access to remote services include efforts by Wolber et al. [17] as well as Mosconi et al. [18]. The work by Wolber et al. proposed a WYSIWYG editor that could directly access relational databases. The work by Mosconi et al., named "Alligator" [18], used the visual programming paradigm to allow the design of Web application flow involving the execution of URL-addressable server-resident programs. Neither of the two efforts involved the incorporation of rich interaction patterns. Other examples of end-user programming tools intended for the Web include systems such as DENIM [19] and the image-oriented tools proposed by Takao
Shimomura [20]. DENIM focused primarily on the site-map design phase of Web development, and the work by Takao Shimomura concentrated on the traditional full-page refresh paradigm. Although there has been extensive research into systems intended to script dataflow among disparate large-grain Web applications [21,22,23,24,25], none have focused on providing small-grain dataflow among JavaScript components within a Web page. Recent efforts by Datamashup [26] also showcase interesting solutions for providing end-users with the ability to more easily build Web mash ups.
3 Initial Design
WIPER can be launched directly from the FireFox Web browser. The initial design allowed right-clicking on the content area of the Web page and then clicking "Edit HTML using WIPER" in the context menu to spawn WIPER in a new window (Fig. 1). JavaScript components are integrated into the Web page by simply dragging a component from the library (the white-colored sidebar shown in Fig. 1) and dropping it onto the light blue canvas region of the WIPER interface. Once a component is integrated, it can be made to communicate with other components using the traditional wiring paradigm seen in other visual programming environments. Wiring is also done by drag and drop: dragging on an output terminal of a component dispenses a wire, and dropping the wire on the input terminal of another component creates a connection (Fig. 2). There are three in-context interfaces provided by the visual programming tool: a text-based property editor, a text-based HTML editor, and a help documentation viewer (Fig. 2).
4 Implementation
The foundation of WIPER is the JavaScript Dataflow Architecture (JDA) [27]. JDA allows Web applications to be built from JavaScript black-box components that can pass messages to and from one another via arbitrarily complex dataflow channels. WIPER is built as an extension to the FireFox Web browser. There are three main advantages to this approach. First, users can access the tool to edit the Web page they are currently viewing without having to launch another application. Second, it allows WIPER to directly manipulate the target Web page in real time. Third, users can immediately test the resulting Web page as viewed in the FireFox Web browser. WIPER prefixes all modifications it makes to the HTML of the target Web page with well-documented comments, and keeps them cleanly separated from the pre-existing content. The resulting HTML is also free of any automatically generated JavaScript. This makes it easier to open the resulting HTML in a text or WYSIWYG editor for further editing.
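The idea of black-box components exchanging messages over dataflow channels can be sketched as follows. This is a hedged illustration of the general dataflow style, not JDA's actual API: the `Component` constructor, the terminal names, and the wiring calls are all assumptions made for the example.

```javascript
// Black-box components with named input/output terminals; wires forward
// messages from an output terminal to the input terminal of another
// component, as in the wiring paradigm described in Section 3.
function Component(name, handler) {
  this.name = name;
  this.handler = handler; // maps (inTerminal, message) to an output, if any
  this.wires = {};        // outTerminal -> list of [targetComponent, inTerminal]
}
Component.prototype.connect = function (outTerminal, target, inTerminal) {
  (this.wires[outTerminal] = this.wires[outTerminal] || [])
    .push([target, inTerminal]);
};
Component.prototype.send = function (outTerminal, message) {
  (this.wires[outTerminal] || []).forEach(function (w) {
    w[0].receive(w[1], message);
  });
};
Component.prototype.receive = function (inTerminal, message) {
  var result = this.handler(inTerminal, message);
  if (result !== undefined) {
    this.send(result.terminal, result.value); // propagate downstream
  }
};

// Example wiring: a username source feeds a query component, whose
// response flows on to a display sink.
var received = [];
var source = new Component("source", function () {});
var query = new Component("query", function (terminal, msg) {
  return { terminal: "response-out", value: "photos-of-" + msg };
});
var sink = new Component("sink", function (terminal, msg) {
  received.push(msg);
});
source.connect("username-out", query, "username-in");
query.connect("response-out", sink, "data-in");
source.send("username-out", "alice");
```

The "-in"/"-out" terminal naming mirrors the explicit naming scheme the authors later adopt for component labels.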
Fig. 1. WIPER launched in a new window
Fig. 2. Users can look up help documentation for any given component
5 Method
We designed a think-aloud usability test consisting of three parts. First, we provided a brief tutorial of the tool and gave the user the opportunity to ask questions about it. Next, we provided the user with a pre-existing, plain-vanilla Web page: "My Cool Blog". The first task was as follows: "This is your cool blog. You want to let your blog visitors flip through the photos from your Flickr account on your blog page." This task required the user to connect several components and change both their property configurations and their HTML configurations, and it seemed to successfully familiarize the user with the application. When the user had progressed through this task, we asked them to: "Please move this feature to the bottom of the right-hand column." This task required that the user shift their focus away from the original WIPER prototype and complete the task by manipulating the HTML and CSS.
We modified the prototype immediately when we identified interface problems. This gave us the opportunity to verify whether our solutions were having the desired effect.
6 Rapid Design Iterations
In this section we detail the observations made through the course of fourteen user tests and four design iterations.
6.1 Observation: Clarify the Relationship Between Parts
Our first user test indicated that it could be difficult for users to form a correct mental model of how WIPER affects the HTML and CSS of the Web page. Our initial design had separated the resulting Web page from the editing environment (Fig. 1). Web designers could add components to the page with the prototype environment, but were required to edit the HTML and CSS files to reposition or resize the components; hence the second task, "Please move this feature to the bottom of the right-hand column." The Web designers we tested had a difficult time understanding the capabilities and limitations of the prototype versus the design work they were required to perform manually.
6.2 Observation: Visual Programming: Advantages and Responsibilities
All of our users made positive remarks about the visual programming aspects of the system. They liked the ability to drag and drop components onto the canvas and the immediate visual feedback provided by our tool. However, the visual programming metaphor places an additional responsibility on the tool designers. Users expected that dragging a component to a particular place on the canvas would result in a related position on the Web page. For example, when a user dropped a component on the bottom right of the canvas, they expected it to show up on the bottom right of the Web page.
6.3 Refinement: Side-by-Side View
Users' difficulty understanding the relationship between parts, together with their expectations of more direct manipulation, led us to three important changes. The first change is the side-by-side design view, which increases the users' visibility of the system. When the WIPER interface is invoked, a new frame slides in from the left, pushing the Web browser frame to the right.
The resulting interface presents two frames situated side-by-side. For reference, the original Web browser view can be seen in Fig. 3, and the same view after WIPER has been invoked can be seen in Fig. 4. As seen in Fig. 4, the frame on the left houses the visual programming environment, and the frame on the right contains the current Web page. The side-by-side view was chosen for two reasons. First, the proximity between the visual programming environment and its output allows users to receive real-time
feedback as they modify the page. Second, the side-by-side view allows for better use of screen real estate. At the most popular screen resolutions, a floating window housing the visual programming environment proved unwieldy due to its size. In the side-by-side view, the user can resize either frame horizontally to take up more or less space. The component library was moved to the bottom of the screen. To return to the normal Web browser view, users can click on the red "x" situated to the right of the left-hand frame.
6.4 Refinement: Screen Overlay and the Use of Halos
Our second significant change is the use of a screen overlay and halos to help users understand that they are in an editing mode and to provide a visual connection between the components in the left-hand panel and their resulting display elements on the Web page. As seen in Fig. 4, when WIPER opens, a semi-transparent screen overlay is cast on top of the original Web page. The screen overlay is designed to achieve two goals. First, it indicates to users that the Web page they are working on has entered a non-interactive editing mode. Second, it provides a platform on which halos can be rendered around any visualized JavaScript components incorporated thus far. The color of the halos on the Web designer's page (right-hand side) matches the corresponding halos on the JavaScript component counterparts in the visual programming environment (left-hand side). This design is intended to help users better associate the component as displayed in the visual programming environment with its rendering in the Web browser. This color coordination was used throughout the interface, even when the in-context interfaces were accessed. This helped users remember which component they were editing and still retain the visual connection between the component and its rendering on the Web page. Relocation and resizing performed on the screen overlay move the corresponding HTML elements in real time.
6.5 Refinement: Explicit Naming Scheme
To correctly communicate the real behavior of the components, we first removed the rollover effect so that the labels would no longer look clickable. Next, we relabeled the component terminals to better communicate their function. For example, on component A the "username" label became "username-in," indicating that the username should originate from some other component rather than from modifying component A. Similarly, "response" became "response-out," indicating that the response should be sent out of the component rather than be wired to loop within it. These changes appeared to solve the problem; we did not observe it in any of the subsequent user tests.
Fig. 3. Original Web page as viewed in FireFox
Fig. 4. FireFox with WIPER open
6.6 Observation: Consistent, Immediate Feedback Is Crucial
In our third iteration, we were able to diagnose a feedback problem, now that some of the other problems appeared to be resolved. We realized that our prototype was misleading users with inconsistent behavior. In iterations 2 and 3, users were provided with some immediate feedback; for example, changing a text label would render instantly on the resulting page. Other actions required users to manually exit the editing mode and refresh the page. Users interpreted the lack of real-time feedback as a sign that they had performed an incorrect action, and would continue to modify their work, often "breaking" work that was in fact correct and only needed to be tested.
6.7 Refinement: Provision of Real-Time Feedback
One of the most important lessons learned during the iterative design process was to make sure that users are provided with consistent real-time feedback for all of their actions. When we implemented this change we observed considerable improvement in users' comfort with the task. We updated the prototype so that any dataflow configuration changes, component additions and/or customizations are also reflected immediately. This means that the Web browser always displays the most up-to-date result of user modifications at any given moment. To test the Web page, for example, to check whether a button click responds as the designer intended, he or she closes the visual programming environment and returns to the normal Web browser view. The last few users did not exhibit any problem understanding the need to exit editing mode to test interactive elements like button behavior.
7 Conclusion and Future Work
Throughout this research, we were able to incorporate several important refinements into our prototype visual programming tool. By the end, two consecutive users succeeded in the task with very little difficulty. Our work has shown that, with targeted refinements, the paradigm of visual programming can successfully help Web designers take advantage of modern Web techniques such as AJAX and other scripted real-time interaction capabilities. Our work also opens up interesting avenues for further exploration. As the success of the visual programming paradigm in the large relies heavily on the provision of a sizable component market, the question of scale is certainly unavoidable. The ability to gracefully scale the component library remains an issue. As the number of components grows and their variety expands, the issue of message formats will arise. More specifically, it will be important for the tool to provide adequate feedback on whether the output originating from one component is appropriate to use as input to another. We can envision the use of visual cues to indicate such compatibility, coupled with additional real-time feedback from the tool that mimics the attract/repel behavior of a magnet. A feature of JDA not addressed by this prototype is the notion of hierarchical component encapsulation and decomposition, which also directly affects the optimal use of the screen real estate dedicated to the visual programming canvas. We believe one of the best approaches to managing the complexity of large-scale applications is hierarchical decomposition. Further, being able to encapsulate a number of components in one larger package would be a novel way to introduce refactoring. We believe it is important to provide features in the tool that directly address these two paradigms. The direct manipulation feature provided by the tool currently only addresses fixed positions and dimensions.
Although one can easily load the HTML modified by the tool into a text or WYSIWYG editor to take advantage of the dynamic layout capabilities of HTML, additional work will be needed to provide such functionality directly from the tool's interface.
As a minor hindrance to the overall experience, the in-context editing interfaces could be further improved by using columnar input fields, as found in other tools such as Visual Basic, instead of the free-form text editor currently provided.
References
1. Garrett, J.: Ajax: A New Approach to Web Applications. http://www.adaptivepath.com/publications/essays/archives/000385.php
2. Wikipedia: Mashup (Web Application Hybrid). http://en.wikipedia.org/wiki/Mashup_%28web_application_hybrid%29
3. Google Maps Mania. http://googlemapsmania.blogspot.com/
4. Johnston, W.M., Hanna, J.P., Millar, R.J.: Advances in dataflow programming languages. ACM Comput. Surv. 36(1), 1–34 (2004)
5. Myers, B.A.: Visual programming, programming by example, and program visualization: a taxonomy. In: Mantei, M., Orbeton, P. (eds.) Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Boston, MA, USA, April 13–17, 1986). CHI '86, pp. 59–66. ACM Press, New York (1986)
6. Meyer, R.M., Masterson, T.: Towards a better visual programming language: critiquing Prograph's control structures. In: Meinke, J.G. (ed.) Proceedings of the Fifth Annual CCSC Northeastern Conference (Ramapo College of New Jersey, Mahwah, New Jersey, United States). Consortium for Computing Sciences in Colleges, pp. 181–193 (2000)
7. Schmucker, K.J.: Rapid prototyping using visual programming tools. In: Tauber, M.J. (ed.) Conference Companion on Human Factors in Computing Systems: Common Ground (Vancouver, British Columbia, Canada, April 13–18, 1996). CHI '96, pp. 359–360. ACM Press, New York (1996)
8. Whitley, K.N., Blackwell, A.F.: Visual programming: the outlook from academia and industry. In: Wiedenbeck, S., Scholtz, J. (eds.) Papers Presented at the Seventh Workshop on Empirical Studies of Programmers (Alexandria, VA, USA). ESP '97, pp. 180–208. ACM Press, New York (1997)
9. Wilcox, E.M., Atwood, J.W., Burnett, M.M., Cadiz, J.J., Cook, C.R.: Does Continuous Visual Feedback Aid Debugging in Direct-Manipulation Programming Systems? In: Proceedings of CHI '97 (Atlanta, GA). ACM/SIGCHI (1997)
10. Smith, D.C.: Pygmalion: a Creative Programming Environment. Doctoral Thesis. UMI Order Number: AAI7525608 (1975)
11. Cox, P.T., Mulligan, I.J.: Compiling the graphical functional language PROGRAPH. In: Proceedings of the 1985 ACM SIGSMALL Symposium on Small Systems (Danvers, Massachusetts, United States). SIGSMALL '85, pp. 34–41. ACM Press, New York (1985)
12. Bernini, M., Mosconi, M.: VIPERS: a data flow visual programming environment based on the Tcl language. In: Costabile, M.F., Catarci, T., Levialdi, S., Santucci, G. (eds.) Proceedings of the Workshop on Advanced Visual Interfaces (Bari, Italy, June 1–4, 1994). AVI '94, pp. 243–245. ACM Press, New York (1994)
13. National Instruments LabView. http://www.ni.com/labview
14. MAYA Interstacks. http://www.maya.com/web/what/papers/maya_interstacks_scripting.pdf
15. Macromedia Dreamweaver. http://www.macromedia.com/dreamweaver
16. Adobe GoLive. http://www.adobe.com/golive
17. Wolber, D., Yingfeng, S., Yih Tsung, C.: Designing Dynamic Web Pages and Persistence in the WYSIWYG Interface. In: Proceedings of IUI '02 (San Francisco, CA). ACM/SIGCHI/SIGART, New York, pp. 228–229 (2002)
18. Mosconi, M., et al.: http://www2003.org/cdrom/papers/poster/p326/XHTML/p326-mosconi.html
19. Lin, J., Newman, M.W., Hong, J.I., Landay, J.A.: DENIM: Finding a Tighter Fit between Tools and Practice for Web Site Design. In: Proceedings of CHI '00 (The Hague, The Netherlands). ACM/SIGCHI, New York, pp. 510–517 (2000)
20. Shimomura, T.: A Page-Transition Framework for Image-Oriented Web Programming. ACM SIGSOFT Software Engineering Notes 29(2), 10 (2004)
21. Bauer, M., Dengler, D.: InfoBeans - Configuration of Personalized Information Services. In: Proceedings of IUI '99 (LA, USA), pp. 153–156 (1999)
22. Davis, H.C., Hall, W., Heath, I., Hill, G., Wilkins, R.: Towards an Integrated Information Environment with Open Hypermedia Systems. In: Proceedings of ECHT '92 (Milan, Italy), pp. 181–190 (1992)
23. Ito, K., Tanaka, Y.: A Visual Environment for Dynamic Web Application Composition. In: Proceedings of SIGWEB '03 (Nottingham, UK). ACM/SIGWEB, New York, pp. 184–193 (2003)
24. Kistler, T., Marais, H.: WebL - A Programming Language for the Web. In: Proceedings of WWW7 (Brisbane, Australia). Computer Networks 30(1-7), 259–270 (1998)
25. Sahuguet, A., Azavant, F.: Building Intelligent Web Applications Using Lightweight Wrappers. Data & Knowledge Engineering 36(3), 283–316 (2001)
26. Datamashup. http://datamashup.com
27. Lim, S.C., Lucas, P.: JDA: a step towards large-scale reuse on the web. In: Companion to the 21st ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications (Portland, OR, USA, October 22–26). OOPSLA '06, pp. 586–601. ACM Press, New York (2006)
Comprehensive Task and Dialog Modelling Víctor López-Jaquero and Francisco Montero Laboratory on User Interaction & Software Engineering (LoUISE) University of Castilla-La Mancha, 02071 Albacete, Spain {victor, fmontero}@dsi.uclm.es
Abstract. Task modelling has proven useful as a basis for user interface (UI) design. Although different models have been proposed, the ConcurTaskTrees (CTT) notation has without doubt become the most widespread notation for task model specification. However, this notation suffers from a lack of modularity, making the creation and modification of real-world applications a cumbersome process. In this paper we describe a notation, inspired by CTT, that allows the designer to specify, in an intuitive manner, both the tasks the user is supposed to perform through the user interface and the dialog between the user and the user interface. Furthermore, the notation makes use of an abstract operation set to help in the automatic or semi-automatic generation of a user interface that conforms to the specified model. Keywords: User interface design, abstract user interfaces, task models, dialog models.
1 Introduction
The design of user interfaces has become a discipline of capital importance for most software enterprises. Usable user interfaces increase users' satisfaction with an application, and therefore increase its potential success. UI development is one of the main design challenges in the creation of an application, since the interface must support the system's acceptance and be accessible and usable for everyone. By involving users from the very beginning of the design process and focusing on usability, and not just on technology, designers have tried to address this difficult challenge. Although different user interface design approaches have been used [13] throughout the short, but intense, history of graphical user interfaces, the model-based approach [17] has received the widest attention from both the user interface research community and industry during the last decade. This growing interest in model-based development is not limited to user interface design; it also extends to the development of the whole application, as proposed in the Model-Driven Architecture (MDA) proposal. Model-based approaches take as input a requirements specification that is converted into different declarative models. The most widely used ones are the task, user, domain, dialogue and presentation models, although currently there is no standard describing which models should be used. These declarative
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1149–1158, 2007. © Springer-Verlag Berlin Heidelberg 2007
models are used to automatically or semi-automatically generate a user interface compliant with the requirements captured in them. The transformation from a set of declarative models into a running user interface can be achieved following different approaches. Nevertheless, the most widely used approaches take as the cornerstone of their design process either a task model [18][15], a domain model [1], or both [11]. In this paper a visual notation for task model specification is introduced that is used within a model-based approach, AB-UIDE [11], as the main model guiding the whole proposed model-based design method. This notation takes inspiration from the ConcurTaskTrees notation [15], introducing greater modularity and including dialog modelling. The visual notation has been designed in a fashion close to the UML statecharts notation [7], to make it easier for the large community of UML practitioners to get into model-based user interface development.
2 From Domain Model to User Interface Generation
The domain model encapsulates the important entities of a particular application domain together with their attributes, methods and relationships. Within the scope of UI development, it describes the objects that the user requires in order to carry out his tasks. Most applications rely on a database to achieve their objectives. This data dependency inspired the creation of some projects aimed at automatically generating a user interface from the data the application was supposed to handle. Examples of such projects were Janus [1] and Teallach [6]. Although these domain-based approaches are useful for quickly generating a user interface to access some data, the usability of the resulting user interface is rather low. These domain-based user interface generation approaches produce complex user interfaces, because users can see many elements at the same time. Moreover, since user tasks are not considered, the dialog within the user interface is rather limited and constrained, producing quite static user interfaces. Another drawback of domain-based user interface generation approaches is the lack of a proper grouping of the elements the user requires to perform a task, which reduces the productivity of the application. Next, we elaborate on the task-driven model-based approach we use, and the arguments that directed us towards this solution.
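The domain-based generation style described in this section can be sketched as a simple mapping from the attribute types of a domain entity to widget kinds. This is an illustrative sketch only: the entity, the type-to-widget table, and the function names are assumptions, not the actual rules used by Janus or Teallach.

```javascript
// Domain-based UI generation in miniature: every attribute of the domain
// entity gets a widget chosen purely from its data type, with no regard
// for the user's tasks (the limitation discussed in the text).
var widgetForType = {
  string: "text field",
  integer: "spinner",
  boolean: "checkbox",
  date: "date picker"
};
function generateForm(entity) {
  return entity.attributes.map(function (attr) {
    return { label: attr.name, widget: widgetForType[attr.type] || "text field" };
  });
}

// An assumed domain entity for illustration.
var client = {
  name: "Client",
  attributes: [
    { name: "name", type: "string" },
    { name: "age", type: "integer" },
    { name: "active", type: "boolean" }
  ]
};
var form = generateForm(client);
```

Note how the sketch surfaces every attribute at once and imposes no dialog structure, which is exactly why the resulting interfaces tend to be cluttered and static.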
3 From Task Model to User Interface Generation
The task model describes those tasks the user is allowed to perform through the user interface. This task model can be expressed in many different ways, some coming from the Software Engineering community, such as UML statecharts, activity or use case diagrams [16] or Petri nets [2], and some more specific to the human-computer interaction community, such as CTT or the User Action Notation (UAN) [8]. The derivation of a user interface from a task model adds an additional view to the design process: the user. Thus, taking into account which tasks the user is allowed to
Comprehensive Task and Dialog Modelling
1151
perform and the temporal relationships between those tasks, it is possible to increase the overall system usability, for example by grouping related widgets or by hiding all or most of the information that is irrelevant for the current task. Therefore, relying on a task model for user interface generation, rather than just on a domain model, is an important step forward to improve the usability of the user interfaces built by applying model-based techniques.
3.1 Task and Domain: A Marriage of Convenience
A task model by itself is not enough to generate a high-quality user interface; additional information is required. Although the task model includes information regarding which tasks the user is supposed to carry out with the application, it does not include information regarding the data that those tasks require in order to be performed. Thus, we find it necessary to relate the tasks to the data they require. Therefore, a marriage between task and domain model is needed in order to generate a good user interface. For instance, if the user is supposed to perform an input task where the data type of the input data is integer, the generation process should generate a set of widgets appropriate for that kind of task and for that data type. Another fact that influences the generation process within a model-based approach relying on task and domain models is the cardinality of the domain object the tasks are related to. For instance, consider that the user asks the system to show the phone numbers for a client. Obviously, in this case the cardinality between the output task “Show phone numbers” and the method of the domain object returning those phone numbers is one-to-many (1,*). Therefore, the generation process should generate a set of widgets able to show a set of data entries (for instance a list box).
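A selection rule of this kind can be sketched as follows (the widget names and the `select_widget` signature are our own illustration of the idea, not the paper's implementation):

```python
def select_widget(task_type, data_type, cardinality):
    """Pick an abstract widget from the task type, the data type and the
    cardinality of the related domain-object method."""
    if cardinality == (1, "*"):
        return "list box"    # one-to-many: a widget showing a set of entries
    if task_type == "input" and data_type is int:
        return "spin box"    # appropriate for integer input tasks
    return "text field"      # fallback for other single-value tasks

# "Show phone numbers" is related to a method with (1,*) cardinality.
widget = select_widget("output", str, (1, "*"))
```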
In some approaches this last situation is modelled by specifying a single output task with one-to-one cardinality and marking that task as repetitive, but some tweaking is required to make it work properly.
3.2 Dialog Modelling
By relating the task model with the domain model we get some more valuable extra information for the generation of the user interface. However, for us it is not enough. For instance, if a task model is created in CTT it is possible to describe the tasks the user will be allowed to perform and the temporal constraints between those tasks. However, it is not that easy to describe the dialog between the system and the user, that is to say, to describe situations where different branches are available depending on the actions the user takes. In figure 1 an example is shown where a user is authenticating in a webmail application (e.g., GMail, http://gmail.google.com) by providing a username and a password. The CTT notation is powerful and useful to analyse the authentication task and to describe which data the user should provide and how he interacts with the system. Nevertheless, it is not flexible enough to embrace the interaction in all its dimensions. For instance, if the designer wants to specify a task model describing the behaviour exhibited in
1152
V. López-Jaquero and F. Montero
figure 1, it is really hard to specify all the situations arising from the user’s interaction. In the example in the figure, it is hard to specify the possible error states that a login task can produce (informing the user about a wrong password, a wrong username, or both).
(a) User login in a webmail application.
(b) The user enters a wrong password or username.
Fig. 1. Authentication task in a webmail application
These kinds of exceptional situations, shown in this tiny example, have a direct impact on the usability of a software product and are among the limitations of current task models that we overcome in our method. Next, our technique for task and dialog modelling is described.
4 A Comprehensible Task and Dialog Modelling Approach
Because of the facts enumerated in the previous section, a task model has been devised that takes inspiration from ConcurTaskTrees [15], the well-known technique within the human-computer interaction community, from UML statecharts diagrams [7], and from the Canonical Abstract Prototypes [5]. Next, the most prominent features of the task-modelling approach introduced will be described in depth.
4.1 Closer to UML
In this approach, we bring human-computer interaction task modelling techniques closer to the huge mass of UML practitioners by adopting a notation much like UML statecharts diagrams. A task or subtask model specification begins with a circle whose background colour is black (starting state). The end of the specification is indicated by a circle whose background colour is black surrounded by another circle (see figure 2), as in UML statecharts diagrams (final state).
Fig. 2. Tiny example for a task model specification (a Login task followed by a repeatable (1,*) Perform operation task, between a starting and a final state)
Each state in the diagram can be either a task or an action. Tasks can be further refined, while actions are elemental tasks that cannot be further refined. Tasks are represented by using the same symbol that is used for a state in statecharts diagrams (see figure 3a). To represent actions, the states have been stereotyped; the stereotype used is called “action” (see figure 3b). All these tasks and actions have a set of properties to better describe the purpose and presentation of the intended goal. Tasks, actions, the starting state and the final state are linked by transitions. A transition from a state S to another state R means that the task control flow can go from state S to state R if the condition given in the label of the transition is met (sequentially). The available labels for the transitions are detailed later on in the paper.
Fig. 3. Representation of tasks (a, e.g. Login) and actions (b, e.g. Enter pin)
4.2 Modularity
One of the most interesting things in modelling any complex system is modularity. For instance, in UML the designer can split the design into packages to better organize the structure and readability of the models. Moreover, modularity also allows the designer to create a complex model where some parts are underspecified while other parts are fully specified. In our approach modularity has also been taken into account. First, the designer can design the general task structure (in our method this structure is derived from a previous enriched use case model capturing the initial requirements). The tasks created for that general structure are then refined either with new tasks or with the actions that will allow the user to carry out those tasks. In figure 4 an example of this kind of modularity is depicted. The task Login (representing the login task at a bank ATM) has been decomposed into two actions: Enter login/card and Enter pin. Notice that the resulting actions also have a starting and a final state that represent the beginning and the end of the Login task. The designer is allowed to edit/refine a task by double-clicking on the task to be edited. By right-clicking, the properties of either a task or an action can be edited.
Fig. 4. Refining a task (Login is refined into the concurrent actions Enter login/card and Enter pin)
4.3 LOTOS-Based Temporal Relationships
This notation has been enriched by including the same LOTOS [9] operators used within CTT. The graphical notation for these operators has been adapted to make it more easily usable within the statecharts notation used. For instance, in figure 2 the repeat operator is represented as a transition from a state to the same state. Between the parentheses the designer can specify the minimum and maximum number of times that the task can be repeated. By default, a transition between two states means a sequential temporal relationship between those two states. Notice that in figure 4 Enter login/card and Enter pin have a concurrent temporal relationship. Therefore, both actions can be executed concurrently in no particular order. The graphical representation used for concurrency between two tasks is the same one used in CTT.
4.4 Detailed Dialog Modelling with Abstract Tools
Constantine [4][5] proposed a set of abstract tools that represent, in an abstract manner, the complete set of actions that can be performed in a user interface. This set of actions was devised after years of experimenting and gathering feedback from developers. Constantine uses this set of abstract tools to represent canonical user interfaces (abstract user interfaces). In our approach these abstract tools are applied to express the dialog between the tasks and the actions, that is to say, which actions from the user or the system are required to take a transition in our task model from one state to another. A state can have several outgoing transitions; thus, the task flow control will choose the right transition according to the actions taken by the user or the system. For instance, figure 5 models a scenario where a bank client wants to make a deposit.
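Before walking through that scenario, the two temporal semantics used so far can be sketched (sequential as the default transition meaning, concurrent as between the actions of figure 4; the function name and data layout are our own assumptions):

```python
def enabled(actions, done, relationship):
    """Return the actions that may be executed next.

    A plain transition means a sequential relationship, so only the first
    unfinished action is enabled; a concurrent relationship (as between
    Enter login/card and Enter pin) enables all unfinished actions.
    """
    pending = [a for a in actions if a not in done]
    return pending if relationship == "concurrent" else pending[:1]

# Concurrently related actions can be performed in no particular order.
acts = ["Enter login/card", "Enter pin"]
```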
The client first enters the amount to deposit and puts the money in the ATM slot (waiting for the ATM machine to acknowledge that the client put the right amount of money in the slot is expressed as a post-condition for the Enter amount action
that must hold before the Enter amount action is considered to be finished). If the post-condition does not hold, an error is raised and the transition for the error abstract action is taken. Otherwise, the system takes the transition labelled start. For this transition to be taken, the user is required to enter the amount to deposit and to confirm/accept it.
Fig. 5. Example of dialog modelling (from Enter amount, the start transition leads to Show balance and the error transition to Invalid Amount)
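The dialog of figure 5 can be sketched as a transition table keyed by the abstract action labelling each outgoing transition (a sketch under our own naming, not the paper's tool support):

```python
def next_state(state, postcondition_holds, transitions):
    """Choose the outgoing transition of `state`: the start transition when
    the action's post-condition holds, the error transition otherwise."""
    label = "start" if postcondition_holds else "error"
    return transitions[state][label]

# Figure 5: a valid amount leads to the balance; otherwise to an error state.
transitions = {"Enter amount": {"start": "Show balance",
                                "error": "Invalid Amount"}}
```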
In short, by adding the abstract tools to the specification of our task model we provide the designer with a powerful tool to make dialog modelling easier and more intuitive. Moreover, it also makes the generation/transformation process much easier, by providing additional meaningful information regarding the transitions from one task to another and the type of operation required from the user in order to force each transition.
4.5 Task and Action Properties
Tasks and actions need to be described in order to provide meaningful information for the generation process. In our approach a set of predefined properties has been defined (see figure 6), although it can be extended by the designer with custom properties. A priori it is almost impossible to cover every potential property a designer might need to apply the transformation process leading to the generation of the final user interface, since different sets of properties are required to apply different heuristics or transformational approaches. Each task or action has a descriptive name; a description in natural language that states the goal of the task or action; and a type that specifies whether it is abstract, for input, for output, etc. They also have a frequency attribute that stores how often the designer expects the task/action to be executed by the user/system. This attribute is quite useful in finding a good task layout in the final generated user interface. The precondition attribute includes a set of expressions that must evaluate to true for the task to be enabled. The expressions are evaluated as a logical program: the precondition fails whenever any of its expressions is not successfully evaluated. Postconditions work in a similar manner; in this case, however, all the expressions must evaluate to true before a transition to another state is taken.
Resulting from the evaluation of either the precondition or the postcondition, error exceptions can be raised. These error exceptions can be handled within the dialog modelling by means of the error
abstract tool. For the specification of both precondition and postcondition attributes, OCL (Object Constraint Language) [19] is used. OCL is widely used among UML practitioners to express constraints in a variety of UML diagrams. Finally, any task can be represented at the abstract level either in a FreeContainer or in a Container [16]. The difference between the two kinds of containers is that the first one is a root container that cannot be included within any other container. Designers can choose to leave this attribute blank and postpone the decision until the generation process.

Task name: Login
Description: The bank customer enters the card or the login.
Type: Abstract
Frequency: High
Precondition: NULL
Postcondition: Customer.checkLogin()
Presentation: FreeContainer
Action name: EnterLogin_card
Description: The user types in the login or enters the card.
Type: Input
Frequency: High
Precondition: NULL
Postcondition: Customer.currentLogin != “”

Action name: EnterPin
Description: The user types in the password for the login.
Type: Input
Frequency: High
Precondition: Customer.currentLogin != “”
Postcondition: Customer.currentPassword != “”

Fig. 6. Properties for the Login task and its associated actions
Notice that the expressions used in either the precondition or the postcondition attribute can include any valid OCL expression. In these expressions the designer can also make use of any public method or attribute of the classes defined in the domain model.
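The conjunction semantics of pre- and postconditions can be sketched as follows (Python lambdas stand in for the OCL expressions; the attribute names follow figure 6, the function name is our own):

```python
def holds(expressions, context):
    """Evaluate a pre- or postcondition as a conjunction: it fails as soon
    as any of its expressions does not evaluate to true."""
    return all(expr(context) for expr in expressions)

# Stand-ins for the OCL expressions of the EnterPin action (figure 6).
precondition = [lambda c: c["currentLogin"] != ""]
postcondition = [lambda c: c["currentPassword"] != ""]

ctx = {"currentLogin": "alice", "currentPassword": ""}
# The action is enabled, but its postcondition does not hold yet, so the
# corresponding error exception would be raised in the dialog model.
```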
5 Conclusions
Task modelling has become the cornerstone of model-based user interface design. Different approaches have been proposed, but CTT seems to be the most widely used.
Nevertheless, CTT is not as widely used as it should be, because of a lack of tools integrating this task modelling technique into the whole development process, and because it is quite far from what most developers are used to when modelling their applications: UML. In this paper a task modelling approach is introduced that takes inspiration from the strong points of CTT to create a graphical notation akin to UML statecharts diagrams, bringing HCI community modelling techniques closer to the huge mass of UML practitioners. Moreover, the notation has been enriched with abstract tools to provide an easy and clear notation for the dialog between the system and the user. Although model-based user interface design approaches have reached some degree of maturity, they are not used by developers as much as they should be. To build a bridge between the HCI research community and developers we need to devise notations able to attract developers towards good practices for user interface design. In this paper, we have tried to take another step forward in building the bridge between both communities.
Acknowledgments. This work is partly supported by the Spanish PAI06-0093-8836, CICYT TIN2004-08000-C03-01 and PCC05-005-1 grants.
References
1. Balzert, H., Hofmann, F., Kruschinski, V., Niemann, C.: The JANUS Application Development Environment - Generating More than the User Interface. In: CADUI 1996, pp. 183–208 (1996)
2. Bastide, R., Palanque, P.A.: Implementation Techniques for Petri Net Based Specifications of Human-Computer Dialogues. In: CADUI 1996, pp. 285–302 (1996)
3. Calvary, G., Coutaz, J., Thevenin, D., Limbourg, Q., Bouillon, L., Vanderdonckt, J.: A Unifying Reference Framework for Multi-Target User Interfaces. Interacting with Computers 15(3), 289–308 (2003)
4. Constantine, L.L., Lockwood, L.A.D.: Software for Use. Addison-Wesley, London, UK (1999)
5. Constantine, L.: Canonical Abstract Prototypes for Abstract Visual and Interaction Design. In: Jorge, J.A., Jardim Nunes, N., Falcão e Cunha, J. (eds.) DSV-IS 2003. LNCS, vol. 2844, Springer, Heidelberg (2003)
6. Griffiths, T., Barclay, P., McKirdy, J., Paton, N., Gray, P., Kennedy, J., Cooper, R., Goble, C., West, A., Smyth, M.: Teallach: A Model-Based User Interface Development Environment for Object Databases. In: Proceedings of UIDIS’99, pp. 86–96. IEEE Press, New York (1999)
7. Harel, D.: Statecharts: A Visual Formalism for Complex Systems. Science of Computer Programming 8, 231–274 (1987)
8. Hartson, R., Gray, P.: Temporal Aspects of Tasks in User Action Notation. Human-Computer Interaction 7, 1–45 (1992)
9. Information Processing Systems - Open Systems Interconnection - LOTOS - A Formal Description Technique Based on Temporal Ordering of Observational Behaviour. ISO/IS 8807 (1988)
10. Limbourg, Q., Vanderdonckt, J., Michotte, B., Bouillon, L., López Jaquero, V.: UsiXML: A Language Supporting Multi-Path Development of User Interfaces. In: Bastide, R., Palanque, P., Roth, J. (eds.) Engineering Human Computer Interaction and Interactive Systems. LNCS, vol. 3425, Springer, Heidelberg (2005)
11. López Jaquero, V., Montero, F., Molina, J.P., González, P., Fernández Caballero, A.: A Seamless Development Process of Adaptive User Interfaces Explicitly Based on Usability Properties. In: Bastide, R., Palanque, P., Roth, J. (eds.) Engineering Human Computer Interaction and Interactive Systems. LNCS, vol. 3425, Springer, Heidelberg (2005)
12. Montero, F., López Jaquero, V., Vanderdonckt, J., González, P., Lozano, M.D.: Solving the Mapping Problem in User Interface Design by Seamless Integration in IdealXML. In: 12th International Workshop on Design, Specification and Verification of Interactive Systems (DSV-IS 2005), Newcastle upon Tyne, England, July 13-15. Springer, Heidelberg (2005)
13. Myers, B., Hudson, S.E., Pausch, R.: Past, Present, and Future of User Interface Software Tools. ACM Trans. Comput.-Hum. Interact. 7(1), 3–28 (2000)
14. Oeschger, I., Murphy, E., King, B., Collins, P., Boswell, D.: Creating Applications with Mozilla. O’Reilly (September 2002)
15. Paternò, F., Mancini, C., Meniconi, S.: ConcurTaskTrees: A Diagrammatic Notation for Specifying Task Models. In: Interact’97, pp. 362–369. Chapman & Hall, Sydney, Australia (1997)
16. da Silva, P.: Object Modelling of Interactive Systems: The UMLi Approach. Ph.D. Thesis, University of Manchester, UK (2002)
17. Puerta, A.R.: A Model-Based Interface Development Environment. IEEE Software, pp. 40–47 (1997)
18. Reichart, D., Forbrig, P., Dittmar, A.: Task Models as Basis for Requirements Engineering and Software Execution. In: Proceedings of the 3rd Annual Conference on Task Models and Diagrams (TAMODIA ’04), vol. 86, pp. 51–58. ACM Press, New York (2004)
19. Warmer, J., Kleppe, A.: The Object Constraint Language: Precise Modeling with UML. Object Technology Series. Addison-Wesley, London, UK (1999)
Structurally Supported Design of HCI Pattern Languages Christian Märtin and Alexander Roski Augsburg University of Applied Sciences Faculty of Computer Science Baumgartnerstr. 16 D-86161 Augsburg, Germany {maertin, roski}@informatik.fh-augsburg.de
Abstract. HCI pattern languages represent an important software engineering concept and offer proven design and architectural solutions to developers of interactive systems and user interface designers. However, due to their poor organizational structure, many existing pattern languages are not clear and easy enough to use to let developers quickly find appropriate patterns for solving their current design problems. In order to raise pattern language usability, there is a need for a sound definition of the hierarchical structure of pattern languages and a rule-based workflow for constructing future pattern languages. The structural approach presented in this paper provides the designer with a technique to ensure the development of efficient and usable pattern languages.
Keywords: Pattern Language, HCI, Structured Hierarchy, Regulated Links.
1 Introduction
Since Christopher Alexander [1,2] introduced the concept of patterns and pattern languages, numerous pattern catalogues have been developed for various disciplines. Software engineering and object-oriented modeling profited a lot from design patterns and the respective pattern languages [9]. In recent years patterns and pattern languages have also become valuable resources for design-related aspects of HCI [7]. Pattern applications in HCI cover a broad spectrum of design areas. They are used for controlling the model-based design of interactive systems, for the various challenges of cross-platform user interface development [10], or for the integration of user experience aspects into the design of websites with good usability [8]. Despite the large number of existing pattern languages, only a few have a well-thought-out structure that eases the pattern selection and application process for the user, i.e. the developer of interactive systems or web-based applications. This paper presents an approach which provides the architects and designers of pattern languages with possibilities to ensure that their pattern languages will be efficient, easy to use and without any logical gaps. This also means that a potential pattern language user will find the required information in a very short time and always in the same methodical way. The application of our hierarchically structured approach should simplify the development of further pattern languages for all disciplines.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1159–1167, 2007. © Springer-Verlag Berlin Heidelberg 2007
1160
C. Märtin and A. Roski
2 Structured Pattern Language Design Approach
2.1 Dividing the Language into Hierarchical Steps
Dividing a pattern language into several planes or levels is a well-known approach that was already presented in [4]. But whether the requirements for finding patterns, e.g. for a complex web-based software system, can really be covered by as few as four specific planes of patterns is still an unanswered question. We believe that it is not always the best solution to avoid additional planes by combining the respective multi-faceted patterns in one of the few defined planes. There should rather be an opportunity to decompose the overall problem class covered by the pattern language into different pattern language planes. For this reason we present an approach without
Fig. 1. Separation into different hierarchical steps
strongly defined layers, but with methodical steps for including patterns with different levels of abstraction and grades of detail or refinement. These grades represent the part of the problem that is solved by the current pattern in relation to the underlying problem class of the whole language. Each level contains a collection of patterns that together solve all of the design requirements for a given part of the overall problem at a specific level of detail. This means that the first and/or top pattern of the language describes a relatively abstract solution which represents the main problem class of the language at 100%. Each of the following patterns stands for a smaller percentage of the overall problem, but gives a more concrete or specific solution for the represented part of the main problem class. Thereby the number of refinement or diversification steps is variable and not limited. Figure 1 illustrates this concept. The pattern at the top will always represent the full 100% of the underlying problem class modelled by the whole language. As you can see, the subsequent patterns only cover smaller parts of the main problem class. It does not matter whether they cover 50% or only 10% (see part A of the figure); in both cases the two patterns on the same level are required to solve the partial problem. In part B of the figure you might assume that all patterns in the second step represent 33.33%, but this is not necessarily the case: the overall percentage covered by the patterns in one plane can be smaller than that of the parent pattern. Part C emphasizes that patterns at a deeper level present a higher grade of detail or refinement and consequently only cover a smaller percentage of the main problem class than their predecessor. However, it is important that all patterns within the same plane represent the same level of detail. All deeper patterns represent the problem and its solution at a more concrete level.
These requirements certainly constitute the greatest challenge when developing a pattern language with a good logical organization. The question is: how can you separate the problems targeted by the patterns without knowing which other problems might still occur and which grade of detail they will correspond to? To solve this problem a simple method can be used. Starting at the general problem class, with the abstract pattern at the root of the pattern language graph, you decompose the problem class into several smaller problems and their patterns. In the beginning you will not know yet whether the decomposed patterns correspond to the same level of detail or not. However, if you continue with the decomposition process you will find new problems which cover smaller percentages of the higher patterns’ problems, and a pattern tree is created. In case you want to link an existing pattern to a newly created pattern which is on the same or even on a higher level, you have detected a gap in the pattern hierarchy. This gap implies that either the pattern you want to link to is not diversified into enough subpatterns, and you have to fill in an additional subpattern, or that the new problem corresponds to a more detailed pattern with less percentile coverage at a deeper level. In principle the resulting pattern hierarchy should provide the pattern language user with a simple and clear navigation structure. Additionally he or she will be able to better estimate the size of an existing problem within the overall application context. On the other hand, by applying this problem decomposition process, the pattern language developer has a way to find possible logical gaps in the language.
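The invariant behind figure 1, namely that the patterns below a parent never cover more of the problem class than the parent itself, can be checked mechanically. The following sketch uses a data layout of our own invention:

```python
def coverage_ok(children_of, share):
    """Check the hierarchy invariant: the patterns below a parent may
    together cover at most the parent's own share of the problem class."""
    return all(sum(share[c] for c in kids) <= share[parent]
               for parent, kids in children_of.items())

children_of = {"root": ["A", "B"], "A": ["A1", "A2"]}
share = {"root": 100, "A": 50, "B": 10, "A1": 25, "A2": 20}  # percentages
# A plane may cover less than its parent (part B of figure 1): 25+20 <= 50.
```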
2.2 A System for Connecting the Patterns
To be useful for the system developer, a pattern language needs a well and conveniently constructed system of connections between its patterns in order to allow systematic navigation through the pattern hierarchy during the development process. However, when is a system of connections constructed conveniently?
Fig. 2. Overview of the links of Tidwell's Pattern Language
Figure 2 shows a visualization of all the links in an existing pattern language for user interface development [6]. Looking at it, you may doubt whether this language is constructed conveniently. By using the links, a user will move from pattern to pattern, but it can happen that he or she will not find some patterns useful for the design. Some of the patterns are completely isolated from the rest of the language. We think that writing patterns totally cut off from their environment may lead to suboptimal and inefficient design. On the other hand, we also find it useless to write patterns which are connected to nearly all other patterns in the environment. We suspect that the connection structure of this example language was created that way because it was not clear to the language designer which trace a user would take through the pattern language. In an interactive system design project of a certain size it is an absolute necessity for the designer to have a design plan. Instead of jumping from one place of construction to another, he or she should systematically follow this construction plan from the overall content structure, to the logical structure, the abstract presentation structure, the concrete presentation structure, down to the specific layout, positioning and coloring aspects. Consequently, an appropriately constructed system of connections for such a comprehensive pattern language could be useful. Because of this we propose special restrictions or rules for possible links within a pattern language.
Fig. 3. Graph of possible and not possible links
During problem decomposition, links to subsequent patterns in the next layer of the same part of the pattern tree are established implicitly. However, we also allow referencing patterns at a deeper level; we call this kind of connection special links. For such connections we have special rules to follow. In figure 3 you can see the three possible classes of links: the implicit links, painted in black, the special links (green, or possible) and the non-allowed links (red, or not possible). The red lines with the numbers 1 and 2 are forbidden because they are drawn within the same tree and therefore not necessary. The line with the number 3 is not allowed because it refers to a pattern within the same step.
2.3 Possible Run Through
In a run through, a potential user will walk through the pattern language in a predefined way. In order to follow a structured construction plan, we also provide the user with a structured way through the language when designing an interactive application. As figure 4 shows, there is a global entrance point that represents the abstract pattern for the general problem class of the whole pattern language. Of course it can happen that a user is looking for a pattern which is somewhere at an inner level
Fig. 4. Overview of a systematic run through including special-links
of the pattern language. In order to find the relevant pattern, the user has to search for it by following the problem decompositions from the top to the bottom of the pattern graph, i.e. from the most abstract to the most detailed patterns. By selecting the correct nodes, the user will automatically be guided to the required pattern. After the user has found the specified problem, he or she will be in the phase of problem diversification. In this phase the advantages of the described structure become evident. In the phase of problem diversification the user wants to solve a problem, for example the problem described in the pattern with the letter i. While the user retrieves the solution information contained in the pattern, he or she will realize that additional useful information is available (b, h and e) referring to special parts of the current problem. Continuing this procedure in each of the referenced patterns, the user will reach the deepest level of this tree and consequently will be fully informed about how to solve the current problem most efficiently and effectively. The green (lighter colored) arrows represent the basic run through which is possible with the implicit connections. The blue ones (labelled (4,5), (5,8), (15,8)) show some logical links, which are automatically created when the user follows the predefined way. So it is not necessary to explicitly establish such relationships during the design of the pattern language. This also includes the connections with special links (dotted line).
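The link rules of section 2.2 can be sketched as a validity check. The pattern names and the parent/level tables below are our own illustration, anticipating the web design example of the next section:

```python
def special_link_allowed(parent, level, src, dst):
    """Apply the link rules: a special link must target a strictly deeper
    step (no links within the same step or upwards) and must not stay
    inside the source's own subtree, where implicit links already exist."""
    if level[dst] <= level[src]:
        return False                    # forbidden: same step or higher
    node = parent.get(dst)
    while node is not None:             # walk dst's ancestors
        if node == src:
            return False                # same subtree: already implicit
        node = parent.get(node)
    return True

parent = {"Portal": "Site", "Community": "Site", "Shop": "Portal",
          "Communication": "Community", "Forum": "Communication"}
level = {"Site": 0, "Portal": 1, "Community": 1, "Shop": 2,
         "Communication": 2, "Forum": 3}
```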
3 Example of a Web Design Pattern Language Implementation
To demonstrate the approach we present a typical situation during the development of a pattern language. In this example you will notice how easy it is to avoid a logical gap by simply arranging the patterns on the different hierarchical levels. All this is made possible by just following the rules defined above.
Fig. 5. Situation of avoiding a logical gap
The given situation is a pattern language for web design which has only five levels at the moment. At the top, as the global entry, there is a pattern that gives an overview of possible categories of websites together with category-independent design
knowledge. On the next layer down, the developer has already written patterns about a community site, a portal site and a personal site. These three patterns represent the same grade of detail and are therefore located on the same hierarchical step. For the community site, there are two more patterns below, which describe how to design a chat and a forum. In the portal domain, there is a special pattern for online-shopping, which further references some more detailed search-function patterns. Now, a developer might wish to express that an online-shop could also include a forum. In this case he or she will realize that it is not possible to establish a link (special link) between the “online-shop” pattern and the “forum” pattern, because both patterns reside on the same level and links of this kind are not allowed. As it is a fact that online-shops can include a forum, the developer knows that the “forum” and the “chat” pattern have to be shifted down one level. This is the only way to create a reference from the online-shopping pattern, but it also implies that a new pattern has to be inserted at the former position of both patterns. For this simple dilemma, he or she has to find a suitable pattern which will fit into the new gap. “Communication” would be such a possible pattern, because it qualifies as a parent pattern for “Forum” and “Chat” and as a child pattern for “Community”. Communication, e.g., might serve as an overview pattern for possible communication functionalities, which may sometimes be quite helpful in order to understand the various reasons for the different communication techniques. Without this new pattern, a logical gap would have existed, and for a novice webpage designer it would have been more difficult to follow the way through the pattern language. Figure 5 shows the whole process at one glance. At the beginning, only the patterns “Chat” and “Forum” (red 1,2) were residing in the left column directly under the “Community site” pattern.
Then these patterns were shifted down one level (red 3) and the “Communication” pattern (red 4) was inserted. Finally, the developer is able to establish the link from the online-shopping pattern to the “Forum” pattern (red 5). Using our structural approach, a pattern-language developer will clearly notice from the beginning that “Chat” or “Forum” should not reside on the same level as the online-shopping pattern. During the complex design of a new pattern language, the language developer might otherwise easily overlook such simple logical dependencies between the patterns of the language. With the help of the structured design approach presented in this paper, this will not occur. The same approach can also be used for changing existing pattern languages in a structurally consistent way, as well as for merging several pattern languages into one.
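The restructuring rule described above can be sketched as a small level check. This is an illustrative sketch only, not the paper's tool: the `Pattern` class, its `linkTo` helper, and the numeric levels are hypothetical, and the rule encoded is simply that a special-link may only point to a pattern on a deeper hierarchy level.

```javascript
// Hypothetical sketch of the structural rule: a special-link may only
// point from a pattern to a pattern on a deeper hierarchy level.
class Pattern {
  constructor(name, level) {
    this.name = name;
    this.level = level; // 0 = top of the pattern-language hierarchy
    this.links = [];
  }
  // Try to establish a special-link; rejected for same or higher levels.
  linkTo(target) {
    if (target.level <= this.level) {
      return false; // same level (or above): link not allowed
    }
    this.links.push(target);
    return true;
  }
}

const onlineShop = new Pattern("Online-Shop", 2);
const forum = new Pattern("Forum", 2);

// Linking fails while both patterns reside on level 2 ...
console.log(onlineShop.linkTo(forum)); // false

// ... so "Forum" is shifted down one level (a new "Communication"
// pattern would fill the resulting gap), and the link succeeds.
forum.level = 3;
console.log(onlineShop.linkTo(forum)); // true
```

The check makes the logical dependency visible at design time, which is exactly what the structured approach provides to a novice pattern-language developer.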
4 Conclusion

The structured design approach presented here paves the way for the correct and efficient use of the resulting pattern languages by application developers and user interface designers. As the structure implements only the implicit and other regulated links, the system will remain consistent when changes or expansions to a language are made. The user will always maintain an overview of the interrelations between the patterns of the language and will be guided only to the necessary information. Thus, pattern languages that use the presented hierarchical structuring approach might add a new level of design experience to the HCI community.
Structurally Supported Design of HCI Pattern Languages
1167
Two evaluation prototypes of pattern languages that use our structural approach are currently under development at our lab. One language is targeted at patterns for designing and evaluating web applications. The other is dedicated to the organization of patterns for the semi-automatic design of user interfaces for industrial automation applications.
References

[1] Alexander, C., Ishikawa, S., Silverstein, M.: A Pattern Language. Oxford University Press, Oxford, UK (1977)
[2] Alexander, C.: The Timeless Way of Building. Oxford University Press, Oxford (1979)
[3] van Duyne, D.K.: The Design of Sites: Patterns, Principles and Processes for Crafting a Customer-Centered Web Experience, 1st edn. Addison-Wesley, Reading, MA (2002)
[4] van Welie, M., Traetteberg, H.: Interaction Patterns in User Interfaces. In: 7th Pattern Languages of Programs Conference, August 13-16, 2000, Allerton Park, Monticello, USA (2000)
[5] Tidwell, J.: Interaction Design Patterns. In: Proceedings of the Pattern Languages of Programming Conference (PLoP98) (1998)
[6] Tidwell, J.: http://www.designinginterfaces.com
[7] Marcus, A.: Patterns within Patterns. Interactions 11(2), 28–34 (2004)
[8] Tiedtke, T., Krach, T., Märtin, C.: Multi-Level Patterns for the Planes of User Experience. In: Proc. of HCI International 2005: Theories, Models and Processes in HCI, vol. 4, Las Vegas, Nevada, USA, July 22-27, 2005. Lawrence Erlbaum Associates, Mahwah, NJ (2005)
[9] Gamma, E., et al.: Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading, MA (1995)
[10] Seffah, A., Forbrig, P.: Multiple User Interfaces: Towards a Task-Driven and Patterns-Oriented Design Model. In: Forbrig, P., Limbourg, Q., Urban, B., Vanderdonckt, J. (eds.) DSV-IS 2002. LNCS, vol. 2545, pp. 118–132. Springer, Heidelberg (2002)
Integrating Authoring Tools into Model-Driven Development of Interactive Multimedia Applications

Andreas Pleuß and Heinrich Hußmann

Department of Computer Science, University of Munich, Munich, Germany
{pleuss, hussmann}@cip.ifi.lmu.de
http://www.medien.ifi.lmu.de
Abstract. The Multimedia Modeling Language (MML) is a platform-independent modeling language for model-driven development of interactive multimedia applications. Using models provides several advantages, like well-structured applications and better coordination of the different developer groups involved in the development process. However, the creative tasks – like the graphical design of the user interface and the design of media objects – are better supported by traditional informal methods and tools. In particular, multimedia authoring tools such as Adobe Flash are well established for multimedia application development. In this paper we show how MML and authoring tools can be integrated, using Flash as an example. To this end, we transform the MML models into code skeletons which can be loaded directly into the Flash authoring tool to perform the creative design tasks and finalize the application. In that way, the strengths of models and authoring tools are combined. The paper shows the required level of abstraction for the models, introduces a metamodel and a suitable code structure for the Flash platform, and finally presents the transformation.
1 Introduction
Through the growing pervasion of everyday life with computers, many application areas appear in which rich, comfortable, and possibly entertaining user interfaces become more and more natural. In this paper we deal with “multimedia user interfaces”, which make intensive use of different kinds of media – like audio, video, graphics, and animation – and provide sophisticated user interfaces adapted individually to the user’s tasks and information. In particular, we address highly interactive systems which may include complex application logic. Classical examples are e-learning or training applications, simulations, or computer games. Additional new application areas are, for instance, home-entertainment systems or infotainment systems in cars. Such interactive multimedia applications are often developed using authoring tools such as Adobe Flash, which includes the programming language ActionScript. Such tools provide excellent support for the creative development tasks.

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1168–1177, 2007.
© Springer-Verlag Berlin Heidelberg 2007
However, they lack support for structuring the application. ActionScript code can be scattered all over the application and is very difficult to maintain. Furthermore, there is very little support for teamwork and for coordination between the different developer groups for user interface design, software design, and media design. The need for better support of software engineering principles in multimedia application development is clearly stated by various publications in this area (e.g. [1]). To address these issues, we proposed in [2,3] a modeling language for model-driven development of multimedia applications called the Multimedia Modeling Language (MML). Our idea is to combine this approach with the advantages of the existing tools. Thus, during the design phase, only the overall structure and behavior of the application is specified in the MML models. Detailed behavior and the concrete visual design, however, should be created in the authoring tools. We achieve this by generating code skeletons from the MML models which can be loaded directly into the authoring tools for the implementation phase. The code skeletons contain placeholders which then have to be filled in and arranged within the tool. In this paper we demonstrate the feasibility of this concept using the Flash authoring tool as target platform. The paper is structured as follows: In section 2 we briefly summarize MML. Section 3 introduces the target platform, Adobe Flash, and presents an overview of the main ideas of the approach. Section 4 elaborates a suitable code structure for Flash applications, summarizes the transformation, and shows how the resulting documents are processed within the authoring tool.
2 MML
The Multimedia Modeling Language (MML) is a platform-independent language for model-driven development of multimedia applications. It supports a design phase for multimedia applications and allows generating code skeletons for different platforms. The language is based on UML 2.0 and integrates concepts from different approaches in user interface and multimedia modeling. Based on the results of requirements analysis – like user task models [4] and storyboards – four kinds of models are provided: the structural model, the scene model, the abstract user interface model (which is enhanced to a media user interface model), and the interaction model. In the following we briefly summarize the MML models, referring as an example to a Jump’n Run gaming application like those at [5]. For further details on MML please see [2,3]. The structural model describes the structure of the application logic (domain model) in terms of an extended UML class diagram. The classes from the domain model are referred to as application entities. They can be associated with media components. For example, a character in a Jump’n Run game is represented by an animation. If required, the inner structure of the media components can be defined. This is only necessary when the inner structure is relevant for other parts of the application. An example is a character in a Jump’n Run game: its legs should be animated when the character moves. Thus, the legs have to be realized
as movable parts of the animation, and in some cases it must also be possible to access them from the application logic. As such issues often concern different developer groups – usually the software designer and the media designer – it is important to define them in the model. The scene model defines the scenes of the application and the transitions between them in terms of an adapted UML state chart. A scene represents a specific state of the application’s user interface and is an abstraction of the screen concept in graphical user interfaces. Scenes in a Jump’n Run game are, for instance, the menu, the game, and the highscore. Through their dynamic character, caused by temporal media objects, multimedia scenes have an inner state specified by attributes, as well as operations to affect their state. In particular, they have so-called entry-operations to initialize the scene and exit-operations to clean it up. The abstract user interface model describes for each scene the user interface in terms of abstract user interface components (AUI components). For the AUI components in MML we reuse the concepts provided by user interface modeling approaches (e.g. [6,7]). The set of AUI components currently supported in MML includes input components, output components, action components, and different specializations of them. The AUI components required within a scene can be derived, for instance, from task models and are usually specified by the user interface designer. As a core concept of MML, we enhance the abstract user interface with relationships to the media components from the structural model: some of the AUI components can be realized by one or more of the media components. Obviously, output components are often realized by media components. But input components can be realized by media components as well, since, e.g., an animation can be clicked or dragged and dropped. Furthermore, the abstract user interface model is enhanced with sensors.
A sensor represents an event caused by a temporal media component, like collision sensors for animations (triggering an event when an animation is moved over another object on the screen) or time sensors for videos (triggering an event when the video has reached a specific point on its timeline). The enhanced abstract user interface model is referred to as the media user interface model. Finally, the interaction model describes for each scene how the AUI components and the sensors from the media user interface model trigger operations from the structural model. This model is an adapted UML activity diagram where the actions are restricted to operation calls.
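The collision-sensor concept can be illustrated with a minimal sketch. The `CollisionSensor` class, the bounding-box test, and the `loseLife`-style callback are all hypothetical names chosen for illustration; the only aspect taken from the paper is that a sensor on a temporal media component triggers an operation call on the application logic.

```javascript
// Minimal sketch of an MML-style collision sensor (names illustrative):
// when two visual objects overlap, the sensor triggers an operation
// call on the application logic, as the interaction model specifies.
function overlaps(a, b) {
  return a.x < b.x + b.width && b.x < a.x + a.width &&
         a.y < b.y + b.height && b.y < a.y + a.height;
}

class CollisionSensor {
  constructor(subject, obstacle, onCollision) {
    this.subject = subject;
    this.obstacle = obstacle;
    this.onCollision = onCollision; // restricted to an operation call
  }
  // Polled once per frame by the scene.
  check() {
    if (overlaps(this.subject, this.obstacle)) this.onCollision();
  }
}

// Usage: the hero touching an enemy triggers a (hypothetical) operation.
const hero  = { x: 10, y: 10, width: 20, height: 20 };
const enemy = { x: 25, y: 15, width: 20, height: 20 };
let lives = 3;
const sensor = new CollisionSensor(hero, enemy, () => { lives -= 1; });
sensor.check();
console.log(lives); // 2
```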
3 General Approach for the Target Platform
For the transformation from MML models to Flash code skeletons we follow the concepts of model-driven development (see e.g. [8]), such as explicit metamodels (to define the models) and explicit, modular transformations between them. MML is defined using a MOF-compliant metamodel. We use the Eclipse Modeling Framework (EMF) for its implementation. The transformations are
[Figure 1 diagram: a FlashDocument (backgroundColor, frameRate, height, width) owns one or more Timelines consisting of Layers and Frames, and one Library of named Items (BitmapItem, SoundItem, FontItem, VideoItem, and SymbolItem with the subtypes MovieClip, Shape, and Text). Frames hold Elements, in particular Instances (SymbolInstance, ComponentInstance) of library items. ActionScript (ASScript, Class) can be attached to frames and symbol items.]
Fig. 1. Extract of Flash metamodel including the main elements of Flash documents
defined using the Atlas Transformation Language (ATL), a declarative language close to the OMG standard QVT (Queries, Views and Transformations) [9]. The transformation is performed in two steps: first, the MML models are transformed into Flash models. To this end, we specify a Flash metamodel, which is presented in this section. The actual mapping from the platform-independent MML concepts to the platform-specific concepts of Flash is performed during this transformation. In a second step, we transform the Flash model into the final code skeletons. This is mainly a straightforward transformation. However, it is more complex than conventional code generation (like the transformation from a Java model into Java code), as we aim to generate files for the Flash authoring tool. The resulting files can be loaded and processed directly in the tool, using its sophisticated support for the creative design tasks. This requires, on the one hand, strict compliance with the authoring tool’s internal document structures and, on the other hand, a solution for producing the corresponding binary files. In this section we summarize the capabilities of the Flash authoring tool and the resulting general structure of Flash applications. The main concepts (in the following denoted in italics) are reflected in the simplified extract of our Flash metamodel in figure 1. Afterwards we explain our general approach for creating code for the authoring tool. On that basis we propose in section 4 a mapping from MML models to a suitable Flash application skeleton. The Flash authoring tool was originally developed for the creation of graphics and animations. The tool is timeline-based, i.e., the temporal dimension of animations and behavior is represented by a timeline consisting of several frames. A frame owns a two-dimensional space (called the stage, not part of the metamodel) where 2D vector graphics (shapes and text) and other media objects can be placed.
A third dimension (z-axis) is realized by layers, which define which object is topmost when several objects overlap each other on the stage. An
animation means that some graphic changes (e.g. its position) over time, i.e., over the frames of a timeline. A symbol is a complex graphical object (often the term movie clip is used as a synonym, as the movie clip is the most important type of symbol). A symbol contains a timeline which can contain (in its frames) the same kinds of content as the main timeline. This means that a symbol may contain arbitrarily complex content, even symbols and animations. Thus, symbols and animations can be hierarchically nested to arbitrary depth. Each Flash document contains a library which holds all media objects of the document. When a symbol is created or any media object is imported into the authoring tool, it is automatically added to the library. The items in the library can be instantiated multiple times in one or more frames of the timeline. An instance usually has an instance name and a location within a frame. Since version 4, Flash has included a scripting language called ActionScript, which has evolved continuously. Flash MX 2004 introduced ActionScript 2, which supports the object-oriented concept of classes. Classes have to be specified in separate ActionScript class files. In particular, it is now possible to attach ActionScript classes to symbols in the library of a Flash document. This is a very interesting opportunity, as a symbol and its associated code then together form a complex object consisting of programming logic and a (possibly very complex) visual representation. The associated ActionScript class automatically has access to all properties of the symbol, as if they were class properties, including visual elements nested inside the symbol. Furthermore, events on the symbol (e.g. mouse clicks) can be processed in the class by just specifying corresponding event-handler operations. Such a connection between symbols and ActionScript classes is an important concept which we use intensively in our generated code (see section 4).
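The symbol/class connection can be sketched as follows. In real ActionScript 2, the class would live in its own `.as` file (`class HeroAnimation extends MovieClip { ... }`) attached to a library symbol, with `MovieClip` and event dispatch provided by the Flash player. The sketch below uses plain JavaScript class syntax with a stubbed `MovieClip` so it is self-contained; the `HeroAnimation` class, its `legs` part, and the `onPress` handler are hypothetical examples.

```javascript
// Stub standing in for Flash's built-in MovieClip base class.
class MovieClip {
  constructor() { this._x = 0; this._y = 0; }
}

// Hypothetical class that would be attached to a "heroAnimation"
// library symbol. Nested symbol parts (like "legs") and events (like
// onPress) become accessible as if they were class members.
class HeroAnimation extends MovieClip {
  constructor() {
    super();
    this.legs = new MovieClip(); // nested, movable part of the symbol
  }
  onPress() {          // event handler: called on mouse click in Flash
    this._x += 5;      // move the hero ...
    this.legs._x += 5; // ... and its nested legs along with it
  }
}

const hero = new HeroAnimation();
hero.onPress();
console.log(hero._x, hero.legs._x); // 5 5
```

The point of the mechanism is exactly this pairing: the visual symbol is designed in the authoring tool, while its behavior lives in the attached class.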
The file format for Flash documents is a proprietary binary format with the file extension FLA. For execution, the files are compiled into SWF files which run within the Flash player, available as a plugin for browsers. SWF is an open format, but as it is a compiled format, SWF files cannot be edited comfortably within the authoring tool. Hence, for our purposes we aim to generate FLA files. To solve the problem of creating the proprietary FLA files, we use the extension mechanism of the Flash authoring tool. Extensions must be specified in JavaScript and allow automating every action which can be done manually in the authoring tool, e.g. creating symbols. For that purpose the tool provides a kind of document object model, similar to that in browsers for HTML documents. These scripts must have the file extension JSFL and can be executed either within the authoring tool or from the command line (if the Flash authoring tool is installed on the system). We use this mechanism to generate FLA files: we generate a JSFL file which can be executed on the command line and then creates the FLA content according to the Flash model (figure 2). A core problem in Flash is the low support for structuring applications. The program flow of an application can be determined, for instance, by ActionScript code, by the timeline, or by a combination of both. ActionScript code can be attached to symbols, symbol instances, and frames. The sources on Flash
[Figure 2 diagram: the MML model is mapped by an ATL transformation to a Flash model, from which a JSFL file and ActionScript class files are generated; executing the JSFL file produces the FLA files.]
Fig. 2. Approach for the overall transformation
in the literature and on the web provide various frequently used patterns for many different small-scale problems, but there exists no common solution for the overall structure of Flash applications. An important contribution in this direction is provided by [10], who applies several object-oriented patterns in Flash – e.g. the Model-View-Controller pattern (MVC) – and presents a framework for the overall application structure. However, this approach is restricted to ActionScript code and completely omits the usage of the authoring tool, and is hence not suitable for our purposes. On the other hand, the feedback on the web (e.g. in forums like http://flashforum.de) on books like this, as well as the latest changes to Flash provided by Adobe, show the general demand for better support of software engineering principles in tools like Flash.
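The JSFL generation step described above (a generated script that, when run, creates the FLA content) can be sketched as a simple template emitter. This is an illustrative sketch, not the paper's ATL-based generator: the model element (`heroAnimation`) is made up, and the emitted JSFL calls (`fl.createDocument`, `library.addNewItem`, `addNewRectangle`, `fl.saveDocument`) are approximations of the authoring tool's extensibility API.

```javascript
// Sketch of the JSFL-emission step: given an element of the Flash
// model, emit the JSFL commands that would create the corresponding
// FLA content when executed by the Flash authoring tool.
function emitJsfl(mediaComponent) {
  const lines = [
    "var doc = fl.createDocument();",
    // A movie clip encapsulating a placeholder rectangle is created in
    // the library so it can be referenced and later replaced by hand.
    `doc.library.addNewItem("movie clip", "${mediaComponent.name}");`,
    "doc.addNewRectangle({left: 0, top: 0, right: 100, bottom: 100}, 0);",
    `fl.saveDocument(doc, "file:///${mediaComponent.name}.fla");`
  ];
  return lines.join("\n");
}

console.log(emitJsfl({ name: "heroAnimation" }));
```

Executed on the command line with the authoring tool installed, such a script would produce the binary FLA file without the generator ever touching the proprietary format itself.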
4 Transformation and Resulting Flash Code Skeletons
In this section we describe how to transform the platform-independent MML models into useful code skeletons in Flash. As described in section 3, the literature on Flash provides various patterns, but there is no common solution for an overall structure of Flash applications that includes both ActionScript and the features of the authoring tool. Thus, two issues have to be addressed: first, we need to identify a suitable structure for Flash applications. Then, the concrete mapping from MML model elements into this structure has to be defined. Our proposed structure is based on the following considerations: The most important requirement for the Flash application structure is the usage of the authoring tool for creating and editing visual objects. Hence we generate FLA files which contain placeholders (annotated rectangles, see figure 5) for the media components and the AUI components. Besides, the application structure should be well structured using common concepts, to avoid restricting it to a specific purpose or size and to enable an easy understanding of the generated code. Thus, we use object-oriented ActionScript code for the non-visual parts of the application. We make use of ActionScript 2’s class support and place all ActionScript code into separate class files. As proposed e.g. by [10], we use the MVC pattern to structure the ActionScript code. For the connections between
[Figure 3 diagram: for each scene of the application, a FLA file (Scene1) containing movie clips for the AUI components (AUI_Component1), with ActionScript classes Scene1 and AUI_Component1 attached; once for the whole application, a sharedMedia folder with one FLA file per media component (Media1, containing the movie clip MediaComponent1), referenced from the scenes only if an AUI component is realized by a media object; and ActionScript classes for the application entities (ApplicationEntity1) forming the model.]
Fig. 3. General structure of Flash applications resulting from the transformation. The names of the artifacts indicate the MML model elements which they result from.
the visual elements in the authoring tool and the corresponding ActionScript code, we apply the ability to attach ActionScript classes to movie clips. User interface objects of other kinds are simply encapsulated into movie clips. To support teamwork, we divide the FLA part of the application into many small FLA files, since efficient version management of a single FLA file is usually not possible because FLA is a binary file format. To support the development of large applications, we provide a package structure for the ActionScript classes and a folder structure for the FLA files. Figure 3 shows an overview of the resulting structure. The element names in figure 3 indicate the MML model elements from which they are derived during the transformation. The ActionScript classes for the application entities contain the class properties derived from the MML class diagram. For operations we generate only the operation signature, as the operation body is not specified in MML. We believe that the operation bodies are specified more efficiently directly and manually in the target language, using the platform-specific constructs and libraries. The classes generated from application entities correspond to the ‘model’ in terms of the MVC pattern. The ActionScript classes for the scenes contain operations which perform the transitions between the scenes according to the MML scene model. The ActionScript classes for the AUI components contain event-handler operations (depending on the type of AUI component). They correspond to the ‘controller’ in terms of the MVC pattern. For each media component in MML we generate a separate FLA file containing a movie clip in its library which encapsulates a placeholder. The movie clip has a name attached which can be used to refer to it from other files. This ensures that a media component can be reused multiple times within an application, as is possible in MML.
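To make the skeleton structure concrete, the following sketch shows what generated class stubs of this kind might look like. The class and member names (`Player`, `GameScene`, `gotoHighscore`) are hypothetical, and JavaScript class syntax stands in for the separate ActionScript 2 class files that the generator would actually emit.

```javascript
// 'Model' in MVC terms: generated from an MML application entity.
// Only the operation signature is generated; the body is a placeholder
// to be implemented manually in the target language.
class Player {
  constructor() { this.name = ""; this.score = 0; }
  addScore(points) { /* TODO: implement operation body */ }
}

// Scene class ('controller' side): entry/exit operations plus
// operations performing the transitions of the MML scene model.
class GameScene {
  constructor(app) { this.app = app; }
  entry() { /* initialize the scene */ }
  exit()  { /* clean up the scene */ }
  gotoHighscore() {            // transition from the MML scene model
    this.exit();
    this.app.currentScene = "Highscore";
  }
}

// Usage: a transition generated from the scene model changes the
// (here trivially stubbed) application state.
const app = { currentScene: "Game" };
new GameScene(app).gotoHighscore();
console.log(app.currentScene); // Highscore
```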
If the media component is a kind of graphic or animation, the placeholder will usually be filled in directly in the Flash authoring tool. For instance, for an animation heroAnimation in a Jump’n Run application, a FLA document heroAnimation is generated which contains in its library a movie clip heroAnimation containing a placeholder (figure 4(a)). The generated movie clip
Fig. 4. Replacing the movie clip generated for the media component heroAnimation: (a) placeholder in the library; (b) after double-click; (c) replacing with custom content
Fig. 5. Screenshot from the Flash authoring tool showing a FLA document generated for the scene Game. (The window is reduced to the most important elements.)
can be edited in the authoring tool as easily as any other manually created movie clip: a double-click on the movie clip opens its content on the stage (figure 4(b)), where it can be replaced by any graphic or animation using the authoring tool’s various editing capabilities (figure 4(c)). Other media objects (which cannot be created in Flash) are imported from the file system into the movie clip. The FLA files generated for the scenes contain the actual user interface of the application (the different ‘screens’). They contain the elements generated for the AUI components from the MML model. Figure 5 shows as an example a screenshot of the Flash authoring tool after loading the FLA document generated for a scene Game of a Jump’n Run application. Each AUI component is represented by a movie clip (in the library and on the stage) which encapsulates its specific content. This allows us to directly associate it with the corresponding ActionScript class. AUI components which are not realized by media components are mapped to conventional Flash widget components (placed inside the encapsulating movie clip). In this case the encapsulating movie clip has no visual representation of its own on the stage besides the contained widgets. The widgets are labeled with the element name. In figure 5, there are three (invisible) movie clips: one for the output component playerName containing a generated text label, one for the output component playerScore, also containing a generated text label, and one
for the action component exit containing a generated button. The movie clips representing the AUI components can encapsulate multiple widgets if necessary, for example a text field and a related text label. As explained in section 3, the elements nested inside a movie clip can be accessed from the associated ActionScript class as if they were class properties. If the AUI component is realized by a media component (in the MML model), the generated code uses another ability of the Flash authoring tool: to reuse a movie clip in multiple documents, it can be referenced by movie clips in other FLA documents. In this case the destination movie clip retains its original name and properties, but its contents are replaced with those of the referenced movie clip. Changes in the referenced movie clip also appear in the referencing movie clip. We use this mechanism to reuse the movie clips generated for the media components (e.g. heroAnimation in figure 4) in one or more scenes. For instance, the scene Game in figure 5 contains a movie clip heroAnimation which references the heroAnimation from heroAnimation.fla. In the screenshot, the referenced heroAnimation has already been edited, while enemyAnimation and platformGraphics still contain their default placeholder rectangles.
5 Conclusion
In this paper we presented a transformation from MML, a language for model-driven development of interactive multimedia applications, to code skeletons for the widespread professional authoring tool Adobe Flash. Further technical contributions of the paper are the MOF-compliant Flash metamodel and the proposed general structure for Flash applications. As they are independent of the modeling approach, they can be reused for other projects which aim to make use of the Flash authoring tool, e.g. web-engineering approaches which aim to generate rich internet applications (e.g. [11]). Currently, approaches in this area usually use frameworks like Flex, but they do not support individual user interfaces created in the Flash authoring tool. Our approach builds on existing concepts from the literature where possible. In particular, we use the abstract user interface model, which is common to many approaches in the field of user interface modeling [12]. Thus, it is possible to combine the multimedia-specific aspects of our approach with, for example, concepts for context-sensitive user interfaces as presented in [6]. Our work also builds on concepts from [13], an existing modeling approach for multimedia applications. However, to our knowledge none of these approaches aims at generating code skeletons for an authoring tool like Flash. MML and the Flash metamodel are implemented using the Eclipse Modeling Framework (EMF). Currently no custom MML editor exists, but there is an extension for the UML tool MagicDraw which allows creating MML models. The transformations are specified with ATL (see section 3). First user tests with the presented concepts were performed in several student projects, mainly in the lecture “multimedia programming”, where students developed in teams of 5
to 6 persons (relatively complex) multimedia applications with MML (see [5]). The lessons learned from these practical projects are already integrated into the current version of MML and the Flash code structure. All in all, the paper provides a general proof of concept for the integration of models and authoring tools and shows the required level of abstraction for the models. This results in a combination of the strengths of both technologies: well-structured applications and better-coordinated cooperation of developers through models, as well as excellent support for the creative design through established authoring tools. In general, the idea of integrating modeling with more informal techniques and tools for the creative development tasks might be another step towards a better integration of software engineering and human-computer interaction.
References

1. Hirakawa, M.: Do Software Engineers Like Multimedia? In: IEEE International Conference on Multimedia Computing and Systems (ICMCS). IEEE (1999)
2. Pleuß, A.: Modeling the User Interface of Multimedia Applications. In: Wang, F. (ed.) FORTE 2005. LNCS, vol. 3731, Springer, Heidelberg (2005)
3. Pleuß, A.: MML: A Modeling Language for Interactive Multimedia Applications. In: 7th IEEE International Symposium on Multimedia (ISM 2005). IEEE (2005)
4. Paternò, F., Mancini, C., Meniconi, S.: ConcurTaskTrees: A Diagrammatic Notation for Specifying Task Models. In: Interact ’97, Chapman & Hall, Sydney, Australia (1997)
5. University of Munich: Lecture Multimedia-Programmierung, Summer Term 2006 (2006) http://www.medien.ifi.lmu.de/studiengang-neu/galerie/mmp-ss06/
6. Van den Bergh, J., Coninx, K.: Towards Modeling Context-Sensitive Interactive Applications. In: SoftVis 2005, ACM Press, New York (2005)
7. Constantine, L.L.: Canonical abstract prototypes for abstract visual and interaction design. In: DSV-IS 2003. LNCS, vol. 2844, Springer, Heidelberg (2003)
8. Kleppe, A., Warmer, J., Bast, W.: MDA Explained. Addison-Wesley, Reading (2003)
9. Jouault, F., Kurtev, I.: On the architectural alignment of ATL and QVT. In: Proceedings of the 2006 ACM Symposium on Applied Computing (SAC), ACM, New York (2006)
10. Moock, C.: Essential ActionScript 2.0. O’Reilly Media (2004)
11. Bozzon, A., Comai, S., Fraternali, P., Carughi, G.T.: Capturing RIA concepts in a web modeling language. In: WWW 2006, ACM, New York (2006)
12. Calvary, G., Coutaz, J., Thevenin, D., Limbourg, Q., Souchon, N., Bouillon, L., Florins, M., Vanderdonckt, J.: Plasticity of user interfaces: A revised reference framework. In: TAMODIA, INFOREC Publishing House, Bucharest (2002)
13. Sauer, S., Engels, G.: UML-based behavior specification of interactive multimedia applications. In: HCC ’01, IEEE (2001)
A Survey on Transformation Tools for Model Based User Interface Development

Robbie Schaefer

Paderborn University, C-LAB, Fürstenallee 11, 33102 Paderborn, Germany
[email protected]

Abstract. As a wide variety of interaction devices and modalities has to be supported by user interface developers, model-based user interface development is getting increasing attention. Especially if context- and user-awareness come into play, handcrafting a user interface is rendered almost impossible. In model-based user interface development, usually several models are applied to describe different aspects of the user interface or to provide varying levels of detail. The relations between the models representing those levels of abstraction are established through transformations, a concept which is also applied in software engineering with the Model Driven Architecture (MDA). In this paper we review several transformation systems and discuss their applicability for model-based user interface development. Keywords: User Interface Engineering, Model Driven Architecture, Model Based User Interface Development, Transformation Tools.
1 Introduction

Model Driven Architecture and Model Based User Interface Development are quite similar, since both use models to describe static and dynamic system properties on different levels of abstraction and use transformations from one model to another. In the Model Driven Architecture (MDA) [19], transformations between different models have been identified to be of key importance [9], and in order to classify the different model transformations, an extensive taxonomy has been proposed by Czarnecki and Helsen [7]. However, classical MDA approaches have been lacking sound models for the engineering of user interfaces, although some UI development methodologies have been aligned with the OMG standards, for example the UML-based architecture Wisdom [14] and an MDA-compliant environment around UsiXML-based tools [22]. The Cameleon Reference Framework [5] provides a consensus on the types of UI models used for the different levels of abstraction, namely the tasks and concepts, the abstract user interface (AUI), the concrete user interface (CUI) and the final user interface (FUI). As with the MDA, transformation tools have to be used to move from one layer of abstraction to another or to adapt these models to different contexts of use. The transformation methods used in current practice are as manifold as the number of involved models, their representations and tools: Some approaches work on the

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1178–1187, 2007. © Springer-Verlag Berlin Heidelberg 2007
models while others work on their representations; some are integrated in the modeling tool while others are applied externally; some are observable and modifiable while others are hardcoded and not accessible. For this reason, we review several transformation systems and discuss their applicability for model-based user interface development. The considered transformation approaches and tools are graph transformations (GT) as applied in UsiXML [13], ATL [10], TXL [6], 4DML [4], UIML's internal transformation capability [1], XSLT [11], GAC [8] and RDL/TT [18].
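The layered development path sketched above, from the task model over AUI and CUI down to the FUI, can be pictured as a chain of transformation functions. The following Python sketch is purely illustrative: the dictionary-based "models", the context flag and all function names are our own invention, not part of the Cameleon framework or any surveyed tool.

```python
# Toy models: each abstraction level is a plain dictionary.
# Reification = moving toward the final UI; each step is a transformation.

def task_to_aui(task_model):
    # Map each leaf task to an abstract, modality-independent interactor.
    return {"interactors": [{"task": t, "kind": "input"} for t in task_model["tasks"]]}

def aui_to_cui(aui_model, context):
    # Choose concrete widgets depending on the context of use.
    widget = "voice_prompt" if context["modality"] == "voice" else "text_field"
    return {"widgets": [{"for": i["task"], "widget": widget} for i in aui_model["interactors"]]}

def cui_to_fui(cui_model):
    # Final step: code generation (here: a crude XHTML-like string).
    return "".join('<input name="%s"/>' % w["for"] for w in cui_model["widgets"])

task_model = {"tasks": ["enter_name", "enter_city"]}
fui = cui_to_fui(aui_to_cui(task_to_aui(task_model), {"modality": "gui"}))
```

Each of the surveyed tools automates one or more of these arrows; the survey below asks how, and on which representation, they do so.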
2 Selection and Comparison Criteria

Transformations are an essential part of many domains in computer science and applied computing, for example the transformation of programs, data and models. Since the focus of this survey is on transformations used for the engineering of user interfaces, a selection had to be made from the plethora of transformation tools. Even if we constrained ourselves to model transformation approaches used in the MDA, as done in [7], we would end up with a comparison of more than twenty candidates along a rich set of comparison features. However, model-based user interface development took a variety of paths in the past and provided several models and transformations beyond those used in software engineering. For this reason we did not only select model transformation tools from the MDA but also transformation tools which are common practice in model-based UI development, as well as approaches which have interesting properties that may be exploited in engineering user interfaces and in fact have been used for that purpose. In order to compare the selected transformation tools, we did not go into the same level of detail as [7] but took a rather practical approach and looked at several criteria which are of particular importance in model-based UI development. First of all, the programming model is compared: While the distinction between imperative and declarative programming does not allow evaluating the expressiveness of an approach, it is important with respect to the UI developer's familiarity with one or the other style. Furthermore, we looked at the capabilities to transform models, XML and code. The distinction between model and XML transformation is needed, since many UI models are described in XML and also a lot of FUIs are XML-based, e.g., XHTML, XForms, WML, VoiceXML.
Another important aspect is whether the transformation approach is capable of generating code beyond XML-based FUIs, while the ability of complex mapping, as opposed to linear mapping, is evidence for the expressiveness of the approach. Furthermore, we looked at the extensibility and parameterizability of the tools, which make the transformation possibilities more versatile, especially with the plasticity of user interfaces for different devices, contexts, users and modalities in mind.
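The distinction between model and XML transformation matters in practice: a tool may rewrite the model itself (objects and their relations), or only a serialized representation of it. A small Python sketch of the two views of the same abstract UI; all element names and the dictionary encoding are invented for illustration:

```python
import xml.etree.ElementTree as ET

# View 1: the model as an object structure, which is what model
# transformation tools (ATL, graph transformations) operate on.
aui_model = {"kind": "group", "children": [{"kind": "textInput", "id": "name"}]}

# View 2: the same model serialized to XML, which is what XML
# transformation tools (XSLT, GAC, RDL/TT) see instead.
def serialize(node):
    e = ET.Element(node["kind"], {k: v for k, v in node.items() if k not in ("kind", "children")})
    for c in node.get("children", []):
        e.append(serialize(c))
    return e

xml_view = ET.tostring(serialize(aui_model), encoding="unicode")
```

A tool of the second group can thus process a UI model at all only if the model comes with (or is given) such an XML serialization.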
3 Selected Transformation Approaches

Before comparing the selected transformation languages, we will give a brief introduction to each of them and show how they have been used for the
transformation of user interface models or representations, and as such would fit within a model-based user interface development process.

3.1 Graph Transformations (GT) in UsiXML

A formal, purely declarative approach to model transformations is established with graph transformations as shown in [3], since many models can be designed with an underlying structure of directed graphs. Graph transformations are quite common for tools in the MDA domain, for example in AToM3 [12]. For this survey, we selected UsiXML as a candidate which uses GT, since UsiXML is specifically designed for the multi-path development of user interfaces and is one of the first approaches proven to be MDA-compliant [22]. The models UsiXML builds on are based on graphs, and therefore the model mappings of UsiXML are specified with graph transformations, which consist of a set of transformation rules [13]. Each rule consists of a Left Hand Side (LHS) matching a graph G, a Negative Application Condition (NAC) that must not match G, and a Right Hand Side (RHS) which is the result of the transformation. The LHS may be augmented with additional attributes to further constrain the matches, thus adding to the expressiveness. Since graph transformations allow mappings between any models that are based on a graph, UsiXML allows reification, abstraction and translation between the models. The only limitations of this approach concern the construction (or reengineering) of the FUI, since the FUI usually is not represented as a graph. Translations between two different FUI formats are also neither possible nor intended with UsiXML.

3.2 ATL

Another language used for model transformation is ATL [10]. ATL follows a hybrid approach in the sense that the user can select whether to use ATL purely declaratively or to employ imperative features in addition.
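To make the LHS/NAC/RHS rule scheme of the graph transformations from Sect. 3.1 concrete, here is a deliberately simplified Python sketch. The triple-set graph encoding, the edge labels and the rule itself are our own toy construction, not UsiXML's actual rule engine:

```python
# A directed, labeled graph as a set of (source, label, target) triples.
graph = {
    ("t1", "type", "task"),
    ("t2", "type", "task"),
    ("t2", "mappedTo", "aio2"),   # t2 has already been reified
}

def apply_rule(g):
    """One reification rule: every node matched by the LHS (a 'task' node)
    that does NOT match the NAC (it already has a 'mappedTo' edge) gets
    the RHS added (a mapping edge to a fresh abstract interaction object)."""
    g = set(g)
    lhs = {s for (s, l, t) in g if l == "type" and t == "task"}
    nac = {s for (s, l, t) in g if l == "mappedTo"}
    for n in sorted(lhs - nac):
        g.add((n, "mappedTo", "aio_" + n))   # RHS: the result of the rewrite
    return g

result = apply_rule(graph)
```

Here the NAC prevents the rule from firing twice on the same node, which is exactly its role in UsiXML's rule sets.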
The declarative aspect is provided by the approach of matching rules, where a source pattern is described through a set of source types and an OCL expression which constrains the source types. The target pattern is constructed in a similar way by specifying a set of target types from the target meta-model and a set of bindings which are used to initialize the features of the target types. While this declarative approach is very straightforward, it may be hard to specify more complex rules. For this case, ATL offers to add an action block with imperative constructs to the rules, or even allows calling external code for the logic. Since ATL operates on the models themselves, not on representations of the models such as XML serializations, it is not suited as a transcoding tool for other purposes but only for model transformations. ATL has been successfully applied for the model-driven engineering of plastic user interfaces [21].

3.3 TXL

TXL [6] is a transformation language which is designed for multiple purposes, especially for the transformation of programming languages, and is not constrained to any source or target format. This is established through two components:
• A specification of the structure to be transformed, based on grammars in Backus-Naur Form.
• A set of transformation rules based on pattern/replacement pairs and functional programming.

Since the rules are specified in a functional way and the first part of a TXL specification only describes the grammatical structure, TXL can also be considered a mostly declarative language. TXL has also been proven to be capable of model transformations [15]. In fact the grammar support allows taking the final step from a concrete model to a representation in a programming language, which is not possible with graph transformations.

3.4 4DML

The transformation language 4DML (four-dimensional markup language) [4] was originally developed in order to adapt web content for people with special needs and is therefore considered here. It is designed to transform different notations and as such serves a similarly rich application domain as TXL. But while TXL is intended for transforming programming languages, which can be represented as a syntax tree, 4DML supports the transformation of data which comes in a matrix-like structure. The transformation of 4DML documents is done purely declaratively through the definition of a source pattern matching and the definition of a target model. While 4DML seems to be strong in transforming between completely different languages where the source is organized in an n-dimensional structure, it is a bit artificial to impose a matrix structure on documents which are organized as trees or graphs.

3.5 UIML Peers

The User Interface Markup Language (UIML) [1] is an XML-based language to describe all relevant aspects of a user interface, such as structure, style, content and behavior. A genuine aspect of UIML is the capability to define connections to the backend logic and to provide a vocabulary which maps UIML to other UIML instances or target languages.
The latter two aspects are covered in the "peers" section of UIML and provide the transformation features considered in this survey. The "presentation" section of UIML includes mappings of classes (and components) and their properties to target format constructs, while the "logic" section is used to manage the connection to the application logic. A presentation section usually has a name, which allows different presentation sections to be provided for different target formats. For example, there may be a presentation section for VoiceXML and one for HTML. Mappings of classes and their properties are not necessarily restricted to XML-based formats but may also be mapped to, e.g., Java constructs. Since UIML's mapping facility matches class names and provides new values for the matched objects, it can be regarded as declarative. The mapping is, however, only linear and therefore too simple to support complex restructuring tasks. This also results in limited use when it comes to model transformation. The obvious advantage of the UIML approach is that the abstract UI representation and the transformation to the FUI can be specified in the same language. Therefore the model-based approach using
UIML as presented in [2] uses UIML-internal mappings from the AUI to the CUI and from the CUI to the FUI, but requires an external transcoding approach from the task level to the AUI.

3.6 XSLT

The transformation language XSLT [11] is designed for the purpose of transforming XML-based input to textual (mostly XML-based) output. The input of an XSLT program is a set of XML documents. The output can be XML or plain text; with plain text output, an XSLT processor can generate languages different from XML. An XSLT definition defines a set of template rules which associate patterns with templates. Each rule consists of a matching pattern, optional mode and priority attributes, and a template. Matching pattern expressions are defined by a subset of the XPath language and are evaluated with respect to a currently processed (matched) node or the root node. The matching process considers the node's name, attributes, location in the tree and position in the list, and results in a set of nodes that can be used to provide parameters for the template or as a base for further matching. XPath supports the processing of node-sets and covers five basic types: booleans, numbers, strings, node sets, and tree fragments. Processing usually starts at the root node. When a pattern is successfully matched, the pattern is associated with the template, the template (construction pattern) is recursively executed, the mode is possibly changed, and matching is optionally continued from each matched node. For execution, XSLT provides variables and parameters which can be passed between template rules. For pattern processing, XSLT provides literals, constants, variables, and keys (for cross-referencing), with conditions, list iterations, recursion, sorting, and numbering as control structures. For advanced processing, XSLT offers a powerful set of built-in string functions for creation, deletion, replacement, copying, and concatenation.
While the foundation of XSLT processing lies in functional programming, it also allows imperative constructs such as iterations and conditions. Therefore XSLT can be considered a hybrid approach.

3.7 GAC

The General Adaptation Component (GAC) [8] has been developed to make web applications more adaptable. In contrast to the other presented transformation languages, GAC provides explicit means to reference context data to control the adaptation process and is able to modify the contextual data. Since its purpose is to adapt web content, it is able to process HTML and XML in general. The architecture of GAC is notable in that it does not use XSLT to describe the transformation rules – which is otherwise quite common practice in that domain – but provides an RDF-based configuration of the adaptation process. The GAC configuration consists of rules which are bound to conditions. The rules can be of two types: for adaptation and for updating the usage context. The adaptation rules allow deletion and substitution of XML fragments as well as separation – the process of sourcing fragments out and making them accessible via links – and the inverse process. As the rules provide clear instructions about which operations to perform when a condition holds, we consider this approach to be more imperative.
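Returning to the template-rule mechanism of Sect. 3.6 (match a node, instantiate a template, recurse), its processing model can be imitated in a few lines of Python. This stand-in uses ElementTree and made-up "template rules"; it illustrates only the dispatch-and-recurse idea, not XSLT syntax or semantics in full:

```python
import xml.etree.ElementTree as ET

# "Template rules": element name -> function producing output, possibly recursing.
def apply_templates(node, rules):
    rule = rules.get(node.tag, rules["*"])
    return rule(node, rules)

rules = {
    # Match an abstract 'textInput' and construct an XHTML input element.
    "textInput": lambda n, r: '<input name="%s"/>' % n.get("id"),
    # Match a 'group' and recurse into its children (like xsl:apply-templates).
    "group": lambda n, r: "<div>%s</div>" % "".join(apply_templates(c, r) for c in n),
    # Default rule: just recurse without emitting anything.
    "*": lambda n, r: "".join(apply_templates(c, r) for c in n),
}

aui = ET.fromstring('<ui><group><textInput id="name"/><textInput id="city"/></group></ui>')
html = apply_templates(aui, rules)
```

This is exactly the pattern an AUI-to-FUI stylesheet exploits: one rule per abstract element, with recursion driving the traversal.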
3.8 RDL/TT

The Rule Description Language for Tree Transformation (RDL/TT) [18] evolved from a domain-specific language for adapting web content to different devices into a transcoding language for multiple purposes, including context-dependent transformations of XML-based UI descriptions. RDL/TT employs a Java-oriented syntax to define the transformation rules, which operate on the DOM tree of the XML document. It defines simple search patterns based on tag names or collections of tags, with complex restructuring rules on the found matches. A notable property of RDL/TT is the use of variables which may convey contextual information, allowing different flows of transcoding operations for varying preferences, target platforms and contexts of use. The transcoding rules are specified in an imperative manner and provide several control structures, such as branches and loops, together with calls to predefined transcoding functions. The set of transcoding functions is extensible by compiling additional transcoding libraries into the tool, which for example has been used to include image processing rules to adapt visual content besides the user interface itself. In practice, RDL/TT has been used for context-based adaptation of web content in [17] and with a generic user interface format in [16].
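RDL/TT itself uses a Java-oriented syntax, but the role of its context variables, steering which transcoding operations fire on the DOM tree, can be sketched language-neutrally. The rule set and the context keys below are invented for illustration; they are not RDL/TT's actual functions:

```python
import xml.etree.ElementTree as ET

def transcode(doc, context):
    """Context-dependent transcoding on a DOM-like tree: the same rule set
    yields different output for different target platforms."""
    root = ET.fromstring(doc)
    for parent in root.iter():
        for child in list(parent):
            # Rule: drop images on devices that cannot render them.
            if child.tag == "img" and not context["supports_images"]:
                parent.remove(child)
            # Rule: shorten long labels on small screens.
            if child.tag == "label" and len(child.text or "") > context["max_label"]:
                child.text = child.text[:context["max_label"]]
    return ET.tostring(root, encoding="unicode")

doc = '<ui><img src="logo.png" /><label>Destination city</label></ui>'
phone = transcode(doc, {"supports_images": False, "max_label": 4})
```

Feeding a different context dictionary to the same rules produces a different concrete UI, which is the parameterizability criterion of Sect. 2 in miniature.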
4 Comparison and Discussion

Table 1 shows the support of the different transformation characteristics for the languages we discussed. If a feature is supported, it is marked with a "+" in the table; if not, it is marked with a "-". If a supported feature is put in brackets, this means that it is in principle supported (maybe with some additional effort), but that the language is not specifically designed for that property.

Table 1. Comparison of general transformation language properties
Feature              | ATL | GT  | TXL | 4DML | XSLT | GAC | UIML | RDL
---------------------|-----|-----|-----|------|------|-----|------|-----
Declarative          |  +  |  +  |  +  |  +   |  +   |  -  |  +   |  -
Imperative           |  +  |  -  |  -  |  -   |  +   |  +  |  -   |  +
Model Transformation |  +  |  +  | (+) | (+)  | (+)  | (+) | (+)  | (+)
XML Transformation   |  -  |  -  | (+) | (+)  |  +   |  +  |  -   |  +
Code Transformation  |  -  |  -  |  +  | (+)  |  -   |  -  |  -   |  -
Code Generation      |  -  |  -  |  +  |  +   | (+)  |  -  |  +   | (+)
Complex Mapping      |  +  |  +  |  +  |  +   |  +   |  +  |  -   |  +
Extensible           |  +  |  -  |  -  |  -   |  -   |  -  |  -   |  +
Parameterizable      |  -  |  -  |  -  |  -   |  -   |  +  |  -   |  +
While the distinction between declarative and imperative programming says nothing about the capabilities of a transcoding language, it may be a selection criterion for programmers who feel more familiar with one of these programming models. On the other hand, a clear distinction between declarative and imperative transformation systems is not always possible. While, for example, graph transformations (GT) are clearly declarative, the declarative aspects of TXL are a bit diluted. For this reason, the marks for declarative and imperative indicate the strongest tendencies. With respect to the ability to transform models (and as such to transform UI models), only ATL and graph transformations are really designed for it. However, tools which operate on XML documents are capable of processing the XML representations of the models, so XSLT, GAC and RDL/TT are in principle capable of model transformations. For TXL it is also possible, but here the structure must first be established through an according grammar; 4DML likewise has to establish the structure first, which actually holds for any type of input to 4DML. The ability for model transformation is poorest in the UIML peers section. In fact it is not designed for model transformation at all, but it at least allows the transformation from the AUI to the CUI and to the FUI; the latter transformation process is the actual goal of the peers section. Since graph transformations and ATL are designed to work on the models only, they are not capable of processing general XML documents, let alone arbitrary code.¹ XSLT, RDL/TT and GAC only work on XML documents and as such are not usable for code transformation as done with TXL, although XSLT and RDL/TT can at least produce non-XML code out of an XML document. The code generation ability is, however, more evident in 4DML, TXL and UIML.
With respect to code, UIML peers is, on the other hand, just able to match UIML elements, but is very well suited to produce output for different target languages. Apart from UIML peers, all considered languages allow complex mappings, which means restructuring the source of the operation; UIML only provides a linear one-to-one mapping. While RDL/TT is both extensible with additional functionality and parameterizable, only few of the other approaches come with these features: ATL allows extensions by calling native operations, and GAC is able to process and change context information. While in GAC the context modification happens within the rules, it is separated in RDL/TT: the context information is fed to variables, but processing and modifying the context information is performed with a different rule set.
¹ Of course, XML document trees can be interpreted as graphs and as such are potentially subject to graph transformations, but we look explicitly at embedded graph transformations as in UsiXML.

5 Conclusion

In this paper, we compared several transformation tools and languages, with respect to a selected set of criteria we considered important for model-based user interface development, from a rather practical view. For this reason we included the programming model, since it may be the premier selection criterion for a developer who is familiar with either declarative or imperative programming. We also identified different levels of transformations: model-to-model transformations, transformations on XML representations of models, and code transformations, while the ability to generate code with a transformation tool is of equal importance in order to complete the modeling pipeline. While the capability of specifying complex transformations is very important for most applications, extensibility and parameterizability are more important for a subset of user interface development tasks, for example for building context-dependent applications. Given the variety of modeling tasks, it is impossible to definitely recommend or dismiss one of the compared candidates, which exhibit different strengths and weaknesses for different applications. For purely model-driven approaches, graph transformations and ATL will be good choices, but the XML-processing tools will also do if the model representations come with an XML syntax. The most problems can be seen with 4DML, since a matrix structure has to be established first, which is rather unnatural for user interface models, although 4DML indirectly supports most of the required features. The UIML peers section, on the other hand, provides only very few features but has proven to be very strong in connecting to specific target toolkits. In short, the choice of the transformation tool largely depends on the models, their applied representation and the targeted application. Sometimes a combination of different transformation approaches is advisable, for example when graph transformations are used for model-to-model transformations on the higher levels of abstraction but the last step towards the final UI requires code generation.
In addition, many user interface modeling tools come with internal transformations which are neither observable nor controllable. However, their models may be of interest to the user interface developer and are often available with an XML schema. Therefore it makes sense to access the desired models with a self-developed rule set for one of the XML-processing approaches (XSLT, GAC, RDL/TT and, to a lesser extent, TXL and 4DML) to bypass the tool's internal fixed transformation and, for example, produce transformations to new targets or improve the tool's transformation results. To further compare the performance, the code size of the transcoding rule representations, the ease of definition and other specific aspects of the transformation tools, more detailed tests are required, something we did for UIML, XSLT and RDL/TT in the past [16]. Furthermore, an evaluation against the design features developed in [11] is appropriate to get a denser picture and provide user interface developers as well as modeling tool developers a higher level of detail for their choice of transformation approach.
References

1. Abrams, M., Helms, J.: User Interface Markup Language (UIML) Specification, Working Draft 3.1. OASIS (2004)
2. Ali, M.F., Pérez-Quiñones, M., Abrams, M.: Building Multi-Platform User Interfaces with UIML. In: Multiple User Interfaces – Cross-Platform Applications and Context-Aware Interfaces, pp. 95–118. John Wiley & Sons, New York (2004)
3. Andries, M., Engels, G., Habel, A., Hoffmann, B., Kreowski, H.-J., Kuske, S., Plump, D., Schürr, A., Taentzer, G.: Graph transformation for specification and programming. Science of Computer Programming 34(1), 1–54 (1999)
4. Brown, S.S.: Conversion of notations. Technical report, University of Cambridge (2004)
5. Calvary, G., Coutaz, J., Thevenin, D., Limbourg, Q., Bouillon, L., Vanderdonckt, J.: A Unifying Reference Framework for Multi-Target User Interfaces. Interacting with Computers 15(3), 289–308 (2003)
6. Cordy, J.R.: The TXL Source Transformation Language. Science of Computer Programming 61, 190–210 (2006)
7. Czarnecki, K., Helsen, S.: Classification of Model Transformation Approaches. In: OOPSLA'03 Workshop on Generative Techniques in the Context of Model-Driven Architecture (2003)
8. Fiala, Z., Houben, G.-J.: A Generic Transcoding Tool for Making Web Applications Adaptive. In: Proceedings of the CAiSE'05 Forum. CEUR Workshop Proceedings (2005)
9. Gerber, A., Lawley, M., Raymond, K., Steel, J., Wood, A.: Transformation: The Missing Link of MDA. In: Corradini, A., Ehrig, H., Kreowski, H.-J., Rozenberg, G. (eds.) ICGT 2002. LNCS, vol. 2505, pp. 90–105. Springer, Heidelberg (2002)
10. Jouault, F., Kurtev, I.: Transforming Models with ATL. In: Bruel, J.-M. (ed.) MoDELS 2005. LNCS, vol. 3844, pp. 128–138. Springer, Heidelberg (2006)
11. Kay, M.: XSL Transformations (XSLT) Version 2.0, W3C Working Draft. World Wide Web Consortium (2002)
12. de Lara, J., Vangheluwe, H.: AToM3: A Tool for Multi-formalism and Meta-modelling. In: Kutsche, R.-D., Weber, H. (eds.) ETAPS 2002 and FASE 2002. LNCS, vol. 2306, pp. 174–188. Springer, Heidelberg (2002)
13. Limbourg, Q., Vanderdonckt, J., Michotte, B., Bouillon, L., López-Jaquero, V.: UsiXML: A Language Supporting Multi-path Development of User Interfaces. In: Bastide, R., Palanque, P., Roth, J. (eds.) Engineering Human Computer Interaction and Interactive Systems. LNCS, vol. 3425, pp. 200–220. Springer, Heidelberg (2005)
14. Nunes, N.J., e Cunha, J.F.: Wisdom – A UML Based Architecture for Interactive Systems. In: Palanque, P., Paternò, F. (eds.) DSV-IS 2000. LNCS, vol. 1946, pp. 191–205. Springer, Heidelberg (2001)
15. Paige, R., Radjenovic, A.: Towards Model Transformations with TXL. In: First International Workshop on Metamodelling for MDA, pp. 162–177 (2003)
16. Plomp, J., Schaefer, R., Mueller, W.: Comparing Transcoding Tools for Use with a Generic User Interface Format. In: Extreme Markup Languages (2002)
17. Schaefer, R., Mueller, W., Dangberg, A.: Fuzzy Rules for HTML Transcoding. In: Hawaii International Conference on System Sciences, HICSS 35 (2002)
18. Schaefer, R., Mueller, W., Dangberg, A.: RDL/TT – A Description Language for the Profile-Dependent Transcoding of XML Documents. In: Proceedings of the First International ITEA Workshop on Virtual Home Environments (2002)
19. Soley, R., the OMG Staff Strategy Group: Model Driven Architecture. White Paper. Object Management Group (2000)
20. Sottet, J.-S., Calvary, G., Favre, J.-M.: Towards Model Driven Engineering of Plastic User Interfaces. In: Model Driven Development of Advanced User Interfaces (MDDAUI '05). CEUR Workshop Proceedings, vol. 159 (2005)
21. Vanderdonckt, J.: A MDA-Compliant Environment for Developing User Interfaces of Information Systems. In: Pastor, Ó., Falcão e Cunha, J. (eds.) CAiSE 2005. LNCS, vol. 3520, pp. 16–31. Springer, Heidelberg (2005)
A Task Model Proposal for Web Sites Usability Evaluation for the ErgoMonitor Environment

André Luis Schwerz¹, Marcelo Morandini², and Sérgio Roberto da Silva¹

¹ Informatics Department – State University of Maringá, Av. Colombo, 5790 – Bloco 19 – Centro – Maringá – PR, Brazil
² School of Arts, Sciences and Humanities – University of São Paulo, Av. Arlindo Bettio, 1000 – Ermelino Matarazzo – Brazil
{
[email protected],
[email protected],
[email protected]}
Abstract. In this paper we present a task model for the usability monitoring environment called ErgoMonitor. ErgoMonitor performs a usability evaluation of websites through the selective collection and analysis of log file data referring to the real interactions that are established between final users and a web interface. Nevertheless, ErgoMonitor depends on the prior identification of the users' expected behaviors, and this activity is conducted by a usability specialist who must observe the website's characteristics or be assisted by a previous diagnosis from a traditional usability evaluation in order to define which tasks (behaviors) should be inserted into the evaluation script. For this purpose, we developed a mechanism to register the expected user behaviors, conceiving the Monitoring Tasks and Behaviors Model. This mechanism enables ErgoMonitor to perform website usability evaluations based on server log files.

Keywords: Interactive Systems Usability Evaluation; Human-Computer Interaction; Web Sites; Server Log Files.
1 Introduction

The main quality of a user interface (UI) is its usability, which establishes how much systems are designed so as to be easy to use and to learn [1], [6]. Thus, usability is the quality of use that an interactive system presents to its (intended) users, so that they can reach a specific set of objectives and tasks in a particular work environment [3]. Usability evaluation (UE) methods have several purposes, for example: to aid future Web UI projects in reaching the necessities of the user; to identify and diagnose problems; and to evaluate the implementation (comparing with other systems, and also for the acceptance test). The data collected during these UEs can be qualitative (descriptions that characterize the usability) or quantitative (measures that quantify the usability). Some UE methods for interactive systems found in the literature [1], [4], [9], [7] approach the qualitative aspects of the evaluated UI and are not objective enough to quantify the efficiency, effectiveness and user satisfaction when using the UI. Winckler et al. [10] propose a Web UI evaluation method based on the elaboration of the navigational model using the notation of

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1188–1197, 2007. © Springer-Verlag Berlin Heidelberg 2007
StateWebCharts [2], in confrontation with the scenarios obtained from the task model, represented by the diagrammatic notation proposed in [6]. This method addresses the evaluation of a project's initial phase; however, it still needs tools to aid in the UE phase, analyzing how the users interact with the Web UI based on the developed scenarios. For the application of a UE with the support of ErgoMonitor, it is necessary that a specialist in usability and in the application domain, whom we call the evaluator-operator, specifies the tasks that compose the evaluation script. However, in this environment, the evaluator-operator must identify the tasks of the script manually in the server log files. This work is very slow and susceptible to errors when the log files have a great amount of hits. Hence, this paper presents a proposal for a tool that reduces the work of the evaluator-operator in identifying the tasks in server log files. This tool allows the evaluator-operator to elaborate the Monitoring Task and Behavior Model that will be used by ErgoMonitor to perform the usability monitoring. Section 2 presents an overview of the main steps of the ErgoMonitor environment. Section 3 describes the proposal of the Monitoring Task and Behavior Model as an improvement of ErgoMonitor. Section 4 demonstrates how the Monitoring Task and Behavior Model can be useful for the cleaning of server log files. Section 5 presents and discusses the performance results of ErgoMonitor on an institutional website.
2 ErgoMonitor

The ErgoMonitor environment was proposed as a usability monitoring system for Web UIs through the collection and analysis of hits from the log files referring to the real interactions between the users and the Web UI. Based on selected data from the log files, and through models of expected behaviors, the tool calculates taxes and metrics that quantify the usability of the website. All these procedures are performed in a way that is imperceptible and invisible to the users. Thus, the user interacts normally with the Web UI and, while these interactions occur, the server stores the data relating to these interactions in its log files. Later, ErgoMonitor collects and analyzes these files to elaborate usability measures of the Web UI. Figure 1 illustrates an overview of the components of the ErgoMonitor environment. As can be seen, the environment is divided between components that represent the activities performed by the evaluator-operator and components that represent the activities performed by the ErgoMonitor software. In the next paragraphs we describe each of these components.

Monitoring Analysis – In this activity the evaluator-operator observes the web application and selects which tasks must compose the Monitoring Tasks and Behavior Model. In this activity the evaluator-operator can rely on a traditional UE to define which tasks must be included in the evaluation script.

LogControl – This component is composed of a group of programs that perform a cleaning of the server log files and an accounting of each user behavior based on the Monitoring Tasks and Behavior Model. The expected output of LogControl is the Verified Tasks and Behavior Model, which is the support for the determination of the taxes and metrics. LogControl will be described in Section 4.
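As an illustration of the kind of accounting LogControl has to do, the following Python sketch counts how often an expected page sequence (one task from the model) occurs in a user's session reconstructed from the server log. The log format, page names and function name are invented for illustration and are not ErgoMonitor's actual ones:

```python
def count_task_occurrences(session_hits, task):
    """Count non-overlapping occurrences of the expected page sequence
    `task` inside one user's ordered list of requested pages."""
    count, i = 0, 0
    while i <= len(session_hits) - len(task):
        if session_hits[i:i + len(task)] == task:
            count += 1
            i += len(task)          # non-overlapping match
        else:
            i += 1
    return count

# One session reconstructed from the server log (ordered hits of one user).
session = ["/home", "/search", "/results", "/search", "/results", "/checkout"]
searches = count_task_occurrences(session, ["/search", "/results"])
```

Doing this by hand over thousands of hits is exactly the slow, error-prone work the proposed tool is meant to remove from the evaluator-operator.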
A.L. Schwerz, M. Morandini, and S.R. da Silva
Fig. 1. An overview of ErgoMonitor
Usability Taxes and Metrics Determination – This step determines the taxes and metrics that quantify a Web UI. A tax is an intermediate measure that qualifies the interaction. A metric represents a usability measure as established by the ISO 9241 standard [3] (basically metrics of effectiveness and efficiency). Depending on the resources of the interface, ErgoMonitor can supply the evaluator-operator with the taxes and metrics shown in Table 1.

Historic Parameters Database – This component is a database that stores information from previously performed UEs. The stored data can be compared with the data obtained from users' interactions with a new Web UI (when a re-design is needed, for instance), thereby verifying the contribution of the modifications made in the new Web UI. Moreover, an evaluator with experience of similar UIs can establish values judged permissible for the users' interactions. Through the verified and permissible values, the evaluator can then elaborate a monitoring report that may confirm the existence of previously diagnosed usability problems or indicate new problems to be analyzed later.

Table 1. Usability taxes and metrics generated by the ErgoMonitor environment

Usability Taxes:
• amount of Verified Behaviours (VB)
• amount of Successful Behaviours (SB)
• amount of SB with Help (SH)
• amount of SB with Error (SE)
Usability Metrics:
• Rate of Efficiency = SB / VB
• Mean Time to Task = (Σ timeSB) / SB
• Incidents in Success Rate = (SH + SE) / SB
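The Table 1 measures can be computed directly from the behavior counts. A minimal sketch in Python (the function name and argument layout are ours; ErgoMonitor itself is a Delphi/Java tool):

```python
def usability_metrics(vb, sb, sh, se, times_sb):
    """Compute the Table 1 metrics from the behavior counts:
    vb = amount of Verified Behaviours, sb = Successful Behaviours,
    sh = SB with Help, se = SB with Error, times_sb = completion
    time of each successful behaviour (e.g., in seconds)."""
    return {
        "rate_of_efficiency": sb / vb if vb else 0.0,
        "mean_time_to_task": sum(times_sb) / sb if sb else 0.0,
        "incidents_in_success_rate": (sh + se) / sb if sb else 0.0,
    }

m = usability_metrics(vb=10, sb=8, sh=1, se=2,
                      times_sb=[30, 42, 28, 35, 40, 33, 29, 31])
# m["rate_of_efficiency"] == 0.8, m["mean_time_to_task"] == 33.5
```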
A Task Model Proposal for Web Sites Usability Evaluation
Report Generation – The evaluator-operator receives the usability taxes and metrics calculated by ErgoMonitor and, from the permissible parameters, can detect possible usability problems in the website.

We showed that ErgoMonitor is a system able to monitor the real interactions of real users who are actually using the website in their own work environments. This defines the focus of the UE: usability problems that can effectively occur, even without full knowledge of the conditions of the context of use involved in such problems. However, the environment had only been defined and systematized, and the evaluator-operator was forced to identify the tasks in the server log files manually. Moreover, the cleaning mechanism for log files had not yet been implemented. The next sections present these innovations.
3 The Monitoring Tasks and Behaviors Model

The result of the monitoring analysis activity is the Monitoring Tasks and Behaviors Model. Morandini defined that the evaluator-operator is responsible for modeling the tasks that ErgoMonitor uses to perform a UE by analyzing users' behaviors when they interact with these tasks. However, the evaluator-operator had to manipulate the server log files to identify the accesses to the first and last pages of a task by means of special characters. This work is extremely difficult for websites that have a large number of tasks; it may therefore introduce modeling errors and demands a high workload from the evaluator-operator.

3.1 Task Specification Problem Formalization

A task is an objective associated with an ordered set of actions that can satisfy that objective in the appropriate contexts [8]. Translated into our context, we consider a task to be a sequence of steps that the user must perform to reach a specific objective. The evaluator-operator must therefore determine the steps that compose each task, establishing its beginning and end. Each of these steps is an access to a webpage and is identified by a URL.

Formally, we define a task k as a quadruple tk = <S, po, pf, δ>, where S is a finite set of webpages (of size n) that may be visited by the user during the task, po is the first webpage, pf is the last webpage, which identifies the success of the task, and δ: S → S is the transition function that identifies the change from a webpage pi to another webpage pj, with i, j ≤ n. We then define the Monitoring Tasks and Behaviors Model as TM = {t1, t2, ..., tm}, where m is the number of tasks in the model. Recall that each webpage is identified by a URL and that the server log files store, sequentially, the list of URLs that constitutes the access history of each user.
A common task in a website could be "User Register", which could be analyzed by the evaluator-operator as illustrated in Figure 2.
Fig. 2. Example of a task
When users want to register themselves in the website, they must access index.html, then register.php, and finally success.html. However, upon finding a problem in register.php, the user may access the help page (help.php). If the problem persists and the user insists on registering, he/she will be directed to the error page (error.php). The user only succeeds in his/her objective upon reaching the success page (success.html). Formally, we can define the example of Figure 2 as:
• S = {index.html, register.php, help.php, error.php, success.html}
• po = index.html
• pf = success.html
• δ: S → S, with the transitions:
  • index.html → register.php;
  • register.php → error.php;
  • register.php → help.php;
  • register.php → success.html;
  • error.php → register.php; and
  • help.php → register.php.
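The quadruple above maps directly onto a small data structure. A sketch in Python (our own encoding, not part of ErgoMonitor; since register.php has several successors, we represent δ as a set of (pi, pj) pairs rather than a single-valued function):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    """A task t_k = <S, p_o, p_f, delta> as defined in Section 3.1."""
    pages: frozenset        # S: webpages that may be visited during the task
    first: str              # p_o: the first webpage
    last: str               # p_f: the webpage that identifies task success
    transitions: frozenset  # delta, encoded as (p_i, p_j) pairs

user_register = Task(
    pages=frozenset({"index.html", "register.php", "help.php",
                     "error.php", "success.html"}),
    first="index.html",
    last="success.html",
    transitions=frozenset({
        ("index.html", "register.php"),
        ("register.php", "error.php"),
        ("register.php", "help.php"),
        ("register.php", "success.html"),
        ("error.php", "register.php"),
        ("help.php", "register.php"),
    }),
)
```

A Monitoring Tasks and Behaviors Model TM is then simply a collection of such Task values.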
3.2 An Overview of the Registration of the Monitoring Tasks and Behaviors Model

The main focus of this work is to develop a mechanism through which the ErgoMonitor evaluator-operator may register and manipulate the task model. We developed an application using Borland Delphi 6 (currently being converted to the Java language, due to portability requirements) with a MySQL 4.1 database, in which the evaluator-operator can register the website to be evaluated and then register its webpages through their URLs. With this information defined, the evaluator-operator can register the tasks that he/she wants to evaluate. Figure 3 presents the class diagram of the Monitoring Tasks and Behaviors Model of ErgoMonitor. The classes in dark gray and light gray are presented in the next section and do not belong to the Monitoring Tasks and Behaviors Model.
Fig. 3. Class Diagram of Monitoring Tasks and Behaviors Model
4 LogControl

LogControl is a set of programs that are executed sequentially. These programs receive the Monitoring Tasks and Behaviors Model and the server log files and generate the Verified Tasks and Behaviors Model. The stages that compose LogControl are:

• Cleaning – the elimination of unnecessary information from the log files, leaving only the URLs of the pages accessed by the users;
• Search – the search for occurrences of the URLs that appear in the Monitoring Tasks and Behaviors Model specified in the previous stage; and
• Organization – the manipulation of the data to organize the occurrences of behaviors in the log files, facilitating the determination of the usability taxes and metrics.

4.1 The Cleaning Mechanism

The purpose of the cleaning mechanism is to eliminate from the log files the accesses that are unnecessary for the UE. To carry out this process, the cleaning mechanism receives the server log file and the Monitoring Tasks and Behaviors Model and generates a filtered log file. We can consider a website as a set of web resources R = {r1, r2, r3, ..., rv}, where v is the number of resources available on the website, and a set U = {u1, u2, u3, ..., uw} representing all the users that access the website. Thus, it
is possible to represent a hit in the log files as li = {ui, d, h, c, ri}, where ui ∈ U, ri ∈ R, d is the access date, h is the access hour, and c is the status code that identifies whether the request succeeded or failed. When users visualize a page, they are visualizing the set of resources that the page comprises. Thus, we can represent a pageview as pi = {ri1, ri2, ..., rip}, where rij ∈ R and p ≤ v. Here ri1 is the resource requested by the user, and the others are embedded resources, which are requested by the web browser. On the other hand, the Monitoring Tasks and Behaviors Model presents, for each task, a sequence of URLs, that is, a sequence of pages that must be visited by the user. Since a task k is represented by the quadruple tk = <S, po, pf, δ>, where S is the set of webpages that compose the task, we can write S = {p1, p2, ..., pn}, where n is the maximum number of pageviews for a task. We can now clean the log file so that each record represents not an access to a resource but an access to a page. A hit in the filtered log file is thus represented as lfi = {ui, d, h, c, pi}. Therefore, we can define the cleaning problem as: given a log file with hits li and a Monitoring Tasks and Behaviors Model TM, transform the hits of this file into hits lfi, in such a way that every hit in the file represents a pageview.

4.2 The Search and the Organization

The search is a process that identifies the users' behaviors in the log files, and the organization is a process that scores these behaviors. The expected output of these processes is the Verified Tasks and Behaviors Model, with the following information:
• task accesses;
• task successes;
• task unsuccesses;
• tasks with error page accesses; and
• tasks with help page accesses.
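A minimal sketch of the cleaning stage of Section 4.1, assuming log lines in the format of the Table 2 fragment (the regular expression and function names are ours; a real log would also carry the status code c):

```python
import re

# IP, [date:time], "requested URL" -- the format of the Table 2 fragment
LOG_RE = re.compile(r'^(\S+) - - \[([^\]]+)\] "([^"]+)"')

def clean(log_lines, task_pages):
    """Keep only hits whose URL is a monitored page, discarding embedded
    resources (images, animations, ...). task_pages is the union of the
    sets S of all tasks in the Monitoring Tasks and Behaviors Model."""
    filtered = []
    for line in log_lines:
        m = LOG_RE.match(line)
        if m:
            url = m.group(3).lstrip("/")
            if url in task_pages:
                filtered.append((m.group(1), m.group(2), url))
    return filtered

sample = [
    '201.22.90.195 - - [30/Nov/2005:09:50:11] "/index.html"',
    '201.22.90.195 - - [30/Nov/2005:09:50:12] "/images/menu.jpg"',
    '201.22.90.195 - - [30/Nov/2005:09:51:50] "/register.php"',
]
pages = {"index.html", "register.php", "help.php", "error.php", "success.html"}
filtered = clean(sample, pages)
# keeps index.html and register.php; the embedded menu.jpg is discarded
```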
A hit i of the filtered server log file was defined as lfi = {ui, d, h, c, pi}. On the other hand, a task k was defined by the quadruple <S, po, pf, δ>, where δ is the function that defines the transitions between the pages of task k. To produce the Verified Tasks and Behaviors Model, the program must search the filtered log file for the first transition of the task, that is, a transition from page po to some page of the set S allowed by the transition function δ. After finding this first transition, it marks one access for the task and continues the analysis until it finds the last page pf (task success) or a transition that does not match the transition function δ of the task (task unsuccess). In the Monitoring Tasks and Behaviors Model each page belongs to a type (content, error or help; see Figure 3). Thus, we can determine when a user accessed an error page and/or a help page during his/her interaction with the Web UI.
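The search and organization just described can be sketched as follows (a simplification: one user's page sequence, no error/help-page scoring; all names are ours):

```python
from collections import namedtuple

# A task as in Section 3.1; delta is encoded as a set of (p_i, p_j) pairs.
Task = namedtuple("Task", "first last transitions")

def verify_task(pageviews, task):
    """Count task behaviors in one user's ordered list of pageviews:
    an access starts at a transition out of task.first; it is a success
    if task.last is reached, an unsuccess at the first non-matching
    transition."""
    counts = {"access": 0, "success": 0, "unsuccess": 0}
    i = 0
    while i < len(pageviews) - 1:
        if pageviews[i] == task.first and (pageviews[i], pageviews[i + 1]) in task.transitions:
            counts["access"] += 1
            j = i + 1
            while pageviews[j] != task.last:
                if j + 1 >= len(pageviews) or (pageviews[j], pageviews[j + 1]) not in task.transitions:
                    counts["unsuccess"] += 1
                    break
                j += 1
            else:
                counts["success"] += 1   # reached p_f
            i = j
        i += 1
    return counts

user_register = Task("index.html", "success.html", {
    ("index.html", "register.php"), ("register.php", "error.php"),
    ("register.php", "help.php"), ("register.php", "success.html"),
    ("error.php", "register.php"), ("help.php", "register.php")})

# The cleaned page sequence of the Table 2 fragment (two attempts):
session = ["index.html", "register.php", "success.html",
           "index.html", "register.php", "error.php",
           "register.php", "success.html"]
counts = verify_task(session, user_register)
# counts == {"access": 2, "success": 2, "unsuccess": 0}
```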
In Figure 3, the classes painted in dark gray implement these processes; they are briefly described below:

• File – represents the filtered server log file;
• File_Page – an associative class that represents the pages (together with IP, date and time) that appear in the filtered server log file; and
• Evaluation – represents the execution of an evaluation; its attributes hold the computed behaviors.

To exemplify the operation of LogControl, Table 2 presents a fragment of a log file with possible accesses to the task illustrated in Figure 2. Note that the accesses in bold are those that the cleaning process identifies from the task illustrated in Figure 2.

Table 2. Fragment of a log file with accesses to the User Register task

201.22.90.195 - - [30/Nov/2005:09:50:11] “/index.html”
201.22.90.195 - - [30/Nov/2005:09:50:12] “/images/menu.jpg”
201.22.90.195 - - [30/Nov/2005:09:50:13] “/images/logo.jpg”
201.22.90.195 - - [30/Nov/2005:09:50:13] “/images/menu_register.jpg”
201.22.90.195 - - [30/Nov/2005:09:50:13] “/images/animation.swf”
201.22.90.195 - - [30/Nov/2005:09:51:50] “/register.php”
201.22.90.195 - - [30/Nov/2005:09:51:50] “/images/logo.jpg”
201.22.90.195 - - [30/Nov/2005:09:53:12] “/success.html”
201.22.90.195 - - [31/Nov/2005:10:30:11] “/index.html”
201.22.90.195 - - [31/Nov/2005:10:30:12] “/images/menu.jpg”
201.22.90.195 - - [31/Nov/2005:10:30:13] “/images/logo.jpg”
201.22.90.195 - - [31/Nov/2005:10:30:13] “/images/menu_register.jpg”
201.22.90.195 - - [31/Nov/2005:10:30:13] “/images/animation.swf”
201.22.90.195 - - [31/Nov/2005:10:31:12] “/register.php”
201.22.90.195 - - [31/Nov/2005:10:31:13] “/images/logo.jpg”
201.22.90.195 - - [31/Nov/2005:10:33:12] “/error.php”
201.22.90.195 - - [31/Nov/2005:10:33:40] “/register.php”
201.22.90.195 - - [31/Nov/2005:10:33:40] “/images/logo.jpg”
201.22.90.195 - - [31/Nov/2005:10:34:12] “/success.html”
Fig. 4. The search and organization processes in the fragment of filtered log file
In Figure 4 we present the result of the search and organization processes applied to the fragment of filtered log file of Table 2. Figure 4 also shows the users' behaviors when interacting with the task. These behaviors compose the Verified Tasks and Behaviors Model, which is the output of LogControl.
5 Experiment

We used the ErgoMonitor environment with the additional modules to perform a UE on the website of Construtora Cidade Verde (CCV)1. We collected the log files for the months of December 2005, January 2006 and March 2006, which together amount to 22.6 MB of text files with more than 113 thousand hits. The evaluator-operator generated the Monitoring Tasks and Behaviors Model with 64 tasks covering the website's main functionalities. After registering the Monitoring Tasks and Behaviors Model, we performed the log file cleaning, which resulted in a set of filtered log files containing 3,442 hits in 638 KB, a considerable reduction in both size and number of hits. This confirms that manual identification of the tasks in the log files is practically unfeasible, given the large number of hits and the high number of tasks. Finally, Table 3 presents the results of some of the tasks evaluated on the CCV website. The data regarding accesses to error and help pages could not be calculated because the website has no pages of these types.

Table 3. Results of tasks evaluated on the website of CCV

Task                          Access   Success   Unsuccess
New Building                      53         9          44
Building under Construction      200        76         124
News                              12        12           0
Useful Links                     366       366           0
6 Conclusions

In this paper we presented a mechanism to register the Monitoring Tasks and Behaviors Model. This mechanism provides the evaluator-operator with an interface to register the tasks that he/she wishes to evaluate. We also showed that ErgoMonitor is an environment able to monitor the usability of a website. Previously, however, ErgoMonitor could not be applied to websites with a large volume of data in their log files and a large number of tasks to be evaluated, because the evaluator-operator had to dedicate great effort to identifying the tasks in the log files manually. Once the Monitoring Tasks and Behaviors Model has been produced, the evaluator-operator provides the log files that he/she wishes to use in the evaluation. LogControl receives these log files and, with the support of the Monitoring Tasks and Behaviors Model, cleans them by eliminating the data that are unnecessary for the UE. In addition, LogControl
1 This website can be accessed at http://www.construtoracidadeverde.com.br and presents the portfolio of a building construction company.
identifies the users' behaviors in the filtered log files to obtain the Verified Tasks and Behaviors Model. We also verified that the implementation of these approaches made ErgoMonitor able to perform quantitative UEs on large websites while diminishing the workload of the evaluator-operator.
References

1. Downton, A.: Engineering the Human-Computer Interface. McGraw-Hill, London (1992)
2. Harel, D.: Statecharts: A Visual Formalism for Complex Systems. Science of Computer Programming, 231–274 (1987)
3. ISO 9241: Ergonomic Requirements for Office Work with Visual Display Terminals, Part 11: Usability Statements. Draft International Standard ISO 9241-11 (1993)
4. Lea, M.: Evaluating User Interface Designs. In: User Interface Design for Computer Systems, pp. 134–167. Chichester (1988)
5. Nielsen, J.: Projetando Web Sites – Designing Web Usability. Editora Campus, 416 p. (2000)
6. Paternò, F., Mancini, C., Meniconi, S.: ConcurTaskTrees: A Diagrammatic Notation for Specifying Task Models. In: INTERACT '97, pp. 362–369. Chapman & Hall, Sydney (1997)
7. Sears, A.: Layout Appropriateness: A Metric for Evaluating User Interface Widget Layout. IEEE Transactions on Software Engineering 19(7), 707–719 (1993)
8. Storrs, G.: The Notion of Task in Human-Computer Interaction. In: HCI '95 – 10th Annual Conference of the British Human-Computer Interaction Group, University of Huddersfield, UK (1995)
9. Treu, S.: User Interface Evaluation: A Structured Approach. Plenum Press, New York (1994)
10. Winckler, M., Palanque, P., Farenc, C., Pimenta, M.: Task-Based Assessment of Web Navigation Design. In: Proc. of TAMODIA 2002 – Task Models and Diagrams for User Interface Design, Bucharest, Romania (2002)
Model-Driven Architecture for Web Applications Mohamed Taleb1, Ahmed Seffah2, and Alain Abran3 1
Human-Centered Software Engineering Group, Concordia University Montreal, Quebec, Canada Telephone: +1 514 848 2424 ext. 7165
[email protected] 2 Human-Centered Software Engineering Group, Concordia University Montreal, Quebec, Canada Telephone: +1 514 848 2424 ext. 3024
[email protected] 3 Software Engineering Department & Information Technology, École de Technologie Supérieure (ÉTS), Montreal, Quebec, Canada Telephone: +1 514 848 2424 ext. 3024
[email protected]

Abstract. A number of Web design problems continue to arise, such as: (1) decoupling the various aspects of Web applications (for example, business logic, the user interface, navigation and information architecture); and (2) isolating platform specifics from the concerns common to all Web applications. In the context of a proposal for a model-driven architecture for Web applications, this paper identifies an extensive list of models aimed at providing a pool of proven solutions to these problems. The models span several levels of abstraction, such as the business, task, dialog, presentation and layout models. The proposed architecture shows how several individual models can be combined at different levels of abstraction into heterogeneous structures, which can be used as building blocks in the development of Web applications.

Keywords: Models, Model-driven architecture, Software engineering, Web applications, MDA, architecture.
1 Introduction

The concept of model has become a major paradigm of software engineering, whose use represents a significant step forward in terms of level of abstraction, durability, generality, evolvability, etc. New issues arise, related to rebuilding the activities of software engineering around this new approach (architecture driven by models). They join more traditional issues concerning the very foundations of the concepts of model and meta-model and their construction and validation (architecture of models).

J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1198–1205, 2007. © Springer-Verlag Berlin Heidelberg 2007

The Internet and its languages offer major opportunities for developing a new generation of architectures for Web software systems, the latest of which are highly
interactive, platform-independent, and run in the client Web browser across a network. This paper aims to provide a pool of proven solutions to many recurring Web design problems. Examples of such problems include: (1) decoupling the various aspects of Web applications, such as business logic, the user interface, navigation and information architecture; and (2) isolating platform-specific issues from the concerns common to all Web applications.

In this paper, the definition of software architecture from [1] is adopted: "the structure of the subsystems and components of a software system and the relationships between them typically represented in different views to show the relevant functional and non functional properties." This definition introduces the main architectural elements (for instance, subsystems, components and connectors) and covers the ways to represent them, including both functional and non-functional requirements, by means of a set of views.

A pool of proven solutions is proposed here, in the form of an architecture and related models for a model-driven architecture for Web applications, to address these problems. These individual models can then be combined at different levels of abstraction into heterogeneous structures, which can be used as building blocks in the development of these applications.

This paper is organized as follows: Section 2 introduces related work on model-oriented architectures in general, such as the Model-Driven Architecture; Section 3 describes the model-oriented architecture proposed here and some models that we have identified and formalized; finally, Section 4 presents a summary and directions for future work.
2 Related Work

The concept of model plays a central role in most scientific disciplines, some of which even consider it the "confluence of sciences". In computer science, models have always existed, but for a long time they were relegated to a secondary role compared to the source code, which still plays a dominant role in industry. This tendency is now being reversed, and the concept of model is passing from a "contemplative" status (interpreted by humans) to a "productive" status (interpreted by processors). Moreover, whereas models used to be employed only in the design phase, and more recently in the development phase, they must from now on be embedded in the software itself to allow its dynamic evolution. The systematic, dynamic and tool-supported use of models leads us today to consider model-driven software engineering architecture as a major paradigm of software engineering.

2.1 MDA Model

Models are commonly used to represent complex systems flexibly. Models can be viewed at many levels of abstraction, and complementary model views can be combined to give a more intelligible, accurate view of a system than any single model alone. Meservy and Fenstermacher [2] claim that many software development experts have long advocated using models to understand the problem that a system seeks to
address; yet development teams commonly employ models only in the early stages of a project. Often, once construction begins, the teams leave the models behind and never update them to reflect their changing conceptions of the project. Most software developers would agree that modeling should play a role in every project [2]. However, there is no clear consensus on what that role should be, how developers should integrate modeling with other development activities, and who should participate in the modeling process [2].

In 2001, the Object Management Group introduced the Model-Driven Architecture (MDA) initiative as an approach to system specification and interoperability based on the use of formal models (i.e., definite and formalized models) [3, 4, 5, 6, 7]. The main idea of MDA is to specify the business logic in the form of abstract models. These models are then mapped (partly automatically), according to a set of transformation rules, to different platforms. The models are usually described in UML in a formalized manner, which can be used as input for tools that perform the transformation process. The main benefit of MDA is the clear separation of the fundamental logic behind a specification from the specifics of the particular middleware that implements it. In other words, the MDA approach separates the specification of the operation of a system from the details of the way the system uses the capabilities of its platform. This architectural separation of concerns constitutes the foundation of MDA and serves three main goals: portability, interoperability and reusability [8, 9, 10].
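The PIM-to-PSM mapping can be illustrated with a toy transformation rule (entirely our own example, far simpler than a real UML-based MDA tool): an abstract entity description is mapped to a platform-specific artifact, here SQL DDL.

```python
def pim_to_sql(entity, attributes):
    """Toy transformation rule from a platform-independent entity model
    (name + typed attributes) to a platform-specific model (SQL DDL).
    A rule targeting another platform would reuse the same PIM."""
    type_map = {"string": "VARCHAR(255)", "int": "INTEGER", "date": "DATE"}
    cols = ",\n  ".join(f"{name} {type_map[t]}" for name, t in attributes)
    return f"CREATE TABLE {entity} (\n  {cols}\n);"

ddl = pim_to_sql("Customer", [("name", "string"), ("birthdate", "date")])
```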
The MDA approach comprises three main steps:

• specifying the system independently of the platform that supports it;
• specifying target platforms; and
• transforming the system specification into a specification for a particular platform.

In short, MDA makes a sharp distinction between models of:

• the business (the Computation-Independent Model, or CIM), sometimes called a domain model;
• the business model independent of any specific technology (the Platform-Independent Model, or PIM); and
• a model that is tied to the business and uses platform-specific code (the Platform-Specific Model, or PSM).

Two other steps can be integrated into the MDA development process:

• Capture requirements in a CIM. The Computation-Independent Model captures the domain without reference to a particular system implementation or technology; the CIM would remain the same even if the system were implemented mechanically rather than in computer software, for example.
• Deploy the system in a specific environment. Here, the question is how to deploy the system on several specific platforms and environments.

2.2 Compositional Structured Component Model

The authors of [11] presented the CSCM model, designed to allow the construction of software components with variable lists of functionalities selected according to
components’ composition descriptor instances at runtime. The capability offered by CSCM components to select the required functionalities tackles the issue of excessive unwanted functionality. Furthermore, software maintenance, modification and reuse can be significantly eased and simplified. According to the authors, the power of CSCM components [11] can be efficiently exploited in the development of software application families. Software application families are most likely to reuse coarse- to large-grained software components across families of applications with different functional configurations and capabilities.
3 The Proposed Architecture

3.1 Overview

To tackle some of the weaknesses identified in related work, this section proposes a five-tier, model-driven generic classification schema for a Web software architecture.

3.2 Model Taxonomy

A taxonomy of models is proposed next. Examples of models are also presented, to illustrate the need to combine several types of models to provide solutions to complex problems at the five architectural levels. This list is not exhaustive: there is no doubt that more models are needed, and that others have yet to be discovered. A number of Web models have been suggested; for example, the OMG's Model-Driven Architecture [3, 4, 5, 6, 7], Si Alhir's Understanding the Model Driven Architecture (MDA) [8], Paternò's Model-Based Design and Evaluation of Interactive Applications [9], Vanderdonckt's Task Modelling in Multiple Contexts of Use [10], Msheik's Compositional Structured Component Model: Handling Selective Functional Composition [11], and Puerta's Modeling Tasks with Mechanisms [12]. In our work, we investigate how these existing collections of models can be used as building blocks within the context of the proposed five-layer architecture. Which models, at which level, solve which problems is the question we try to answer. An informal survey conducted in 2004 by the HCSE Research Group at Concordia University identified at least five types of Web models that can be used to create a model-oriented Web software architecture. Some examples of the proposed models are presented below.

3.3 Domain Model

The Domain Model is sometimes called a business model. This model encapsulates the important entities of an application domain together with their attributes, methods and relationships [13]. Within the scope of user interface development, it defines the objects and functionalities accessed by the user via the interface.
Such a model is generally developed using the information collected during the business and functional requirements stage. It defines the list of data and the features or operations to be performed in different manners, i.e., by different users on different platforms. The
first model-based approaches used a domain model to drive the user interface at runtime. These domain models describe the application in general and include some information specific to the user interface. For example, the domain model [13] includes:
• a class hierarchy of the objects that exist in the application;
• properties of the objects;
• actions that can be performed on the objects;
• units of information (parameters) required by the actions; and
• pre- and post-conditions for the actions.
In their basic form, domain models represent the important entities together with their attributes, methods and relationships. This kind of domain model corresponds to the object model of recent object-oriented software development methods. Consequently, the only practical way to integrate user interface and system development is the simultaneous use of the data model. That is why recent model-based approaches include a domain model known from software engineering methods. Four other models are then derived from this model: the task, dialog, presentation and layout models.

3.4 Task Model

This model describes how activities can be performed to reach the user's goals when interacting with an interactive system [9]. Using task models, designers can develop integrated descriptions of the system from a functional and an interactive point of view. Task models are typically hierarchical decompositions of tasks and subtasks into atomic actions [10]. In addition, the relationships between tasks are described in correlation with the execution order or the dependencies between peer tasks. Tasks may carry attributes describing their importance, their duration of execution and their frequency of use. For our purposes, we can reuse the following definition: a task is a goal together with the ordered set of tasks and actions that would satisfy it in the appropriate context [13]. This definition makes explicit the intertwined nature of tasks and goals. Actions are required to satisfy goals. Furthermore, the definition allows the decomposition of tasks into subtasks, with some ordering among the subtasks and actions. To complete this definition we need to add the definitions of goal, action and artifact. A goal is an intention to change or maintain the state of an artifact (based on [13]). An action is any act that has the effect of changing or maintaining the state of an artifact (based on [13]). An artifact is an object that is essential for a task.
Without this object, the task cannot be performed; the state of this artifact is usually changed in the course of task performance. Artifacts are real things existing
in the context of task performance, in the business. Artifacts are modeled as objects and represented in the business model, which implies a close relationship between the task model and the business model. With these definitions, we can derive the information that must be represented in a task model. According to [13], one task description includes:

• one goal;
• a non-empty set of actions or other tasks that are necessary to achieve the goal;
• a plan of how to select actions or tasks; and
• a model of an artifact that is influenced by the task.

Consequently, the development of the task model and of the domain model are interrelated. One of the goals of model-based approaches is to support user-centered interface design; therefore, they must enable the user interface designer to create the different task models. One other model is then derived from this model: the domain model.

3.5 Dialog Model

This model provides dialog styles for achieving tasks and proven techniques for the dialog. The dialog model defines the navigational structure of the user interface. It is a more specific model and can be derived in good part from the more abstract task, user and business object models. A dialog model is used to describe the human-computer conversation. It specifies when the end user can invoke commands, functions and interaction media, when the end user can select or specify inputs, and when the computer can query the end user and present information [14]. In other words, the dialog model describes the sequencing of input tokens and output tokens and their interleaving; it describes the syntactic structure of human-computer interaction. The input and output tokens are lexical elements. In particular, this model specifies the user commands, interaction techniques, interface responses and command sequences permitted by the interface during user sessions. Two other models are then derived from this model: the domain and task models.
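The sequencing that a dialog model captures is naturally expressed as a state machine; a minimal sketch (the states and events form a hypothetical login dialog of our own, not one from the paper):

```python
def run_dialog(transitions, start, events):
    """Drive a dialog model given as a state machine: transitions maps
    (state, event) -> next state. An event with no transition leaves
    the dialog in its current state (an illustrative choice)."""
    state = start
    for event in events:
        state = transitions.get((state, event), state)
    return state

# Which commands the end user may invoke in each state of the dialog:
login_dialog = {
    ("start", "open_form"): "form",
    ("form", "submit_ok"): "done",
    ("form", "submit_bad"): "form",
    ("form", "cancel"): "start",
}
final = run_dialog(login_dialog, "start",
                   ["open_form", "submit_bad", "submit_ok"])
# final == "done"
```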
3.6 Presentation Model

The presentation model describes the visual appearance of the user interface [13]. This model exists at two levels of abstraction: the abstract and the concrete presentation model. It defines the appearance and the form of presentation of the application on the Web page. The model provides solutions for how the contents or the related services can be visually organized into working surfaces, the effective layout of multiple information spaces, and the relationships between them. It defines the physical and logical layout suitable for specific Web pages such as home pages, lists and tables. A presentation model describes the constructs that can appear on an end user's display, their layout characteristics, and the visual dependencies among them. The
1204
M. Taleb, A. Seffah, and A. Abran
displays of most applications consist of a static part and a dynamic part. The static part includes the presentation of standard widgets such as buttons, menus, and list boxes. Typically, the static part remains fixed during run-time of the interactive system, except for state changes such as enabled/disabled or visible/invisible. The dynamic part displays application-dependent data that typically changes during run-time (e.g., the application generates output information; the end user constructs application-specific data). The abstract presentation model provides an abstract view of a generic interface that corresponds to the task and dialog models. Three other models (the domain, task and dialog models) are then derived from this model.

3.7 Layout Model

The layout model is realized as a concrete instance of an interface. This model consists of a series of user interface components that define the visual layout of the user interface and the detailed dialogs for a specific platform and context of use. Many concrete instances of the layout model may be derived from the presentation and dialog models. The layout model provides conceptual models and architectures for organizing the underlying content across multiple pages, servers, databases and computers. This model is concerned with the look and feel of Web applications and with the construction of a general drawing area (e.g., a canvas widget); all output inside a canvas must be programmed using a general-purpose programming language and a low-level graphics library. Four other models (the domain, task, dialog and presentation models) are then derived from this model.
4 Summary and Future Work

In this paper, we have identified and proposed five categories of models, with examples, for a model-driven architecture for Web applications that resolves many recurring Web design problems, including: (1) decoupling the various aspects of Web applications, such as business logic, the user interface, navigation and information architecture; and (2) isolating platform-specific problems from the concerns common to all Web applications. Our discussion has focused on how to specify a model-driven architecture using particular models: domain, task, dialog, presentation and layout. Future work will include the classification of each model and its illustration in UML class and sequence diagrams. Next, transformation rules and relationships between the models will have to be defined so that the models can be combined into a new software development methodology, based on the pattern categories that we defined and formalized in our previous work and on the models proposed in the present paper.
References
1. Buschmann, F., Meunier, R., Rohnert, H., Sommerlad, P., Stal, M.: A System of Patterns: Pattern-Oriented Software Architecture. John Wiley & Sons, Chichester, West Sussex, England (1996)
2. Meservy, T.O., Fenstermacher, K.D.: Transforming Software Development: An MDA Road Map. IEEE Computer 38(8), 52–58 (2005)
3. ORMSC White Paper, ormsc/05-04-01: A Proposal for an MDA Foundation Model, V00-02, OMG Group (2005), http://www.omg.org/docs/ormsc/05-04-01.pdf
4. Dsouza, D., Kinetium: Model-Driven Architecture and Integration: Opportunities and Challenges, OMG Group (2001), ftp://ftp.omg.org/pub/docs/ab/01-03-02.pdf
5. Soley, R., and the OMG Staff Strategy Group: Model-Driven Architecture, OMG Group (2000), ftp://ftp.omg.org/pub/docs/omg/00-11-05.pdf
6. Mukerji, J.: Model Driven Architecture (MDA) – Technical Perspective, document ormsc/2001-07-01, Architecture Board, ORMSC, OMG Group (2001), http://www.omg.org/docs/omg/01-07-01.pdf
7. Miller, J., Mukerji, J.: MDA Guide Version 1.0.1, OMG doc. omg/2003-06-01 (2003), http://www.omg.org/docs/omg/03-06-01.pdf
8. Alhir, S.S.: Understanding the Model Driven Architecture (MDA). Methods & Tools 11(3), 17–24 (2003), http://www.methodsandtools.com/archive/archive.php?id=5
9. Paternò, F.: Model-Based Design and Evaluation of Interactive Applications. Springer, Heidelberg (2000)
10. Vanderdonckt, J., Limbourg, Q., Souchon, N.: Task Modelling in Multiple Contexts of Use. In: Forbrig, P., Limbourg, Q., Urban, B., Vanderdonckt, J. (eds.) DSV-IS 2002. LNCS, vol. 2545, pp. 77–95. Springer, Heidelberg (2002)
11. Msheik, H., Abran, A., Lefebvre, E.: Compositional Structured Component Model: Handling Selective Functional Composition. In: IEEE 30th EUROMICRO Conference, pp. 74–81 (2004)
12. Puerta, A.R., Tu, S.W., Musen, M.A.: Modeling Tasks with Mechanisms. International Journal of Intelligent Systems 8 (1993)
13. Schlungbaum, E.: Model-Based User Interface Software Tools: Current State of Declarative Models. Technical Report 96-30, Graphics, Visualization and Usability Center, Georgia Institute of Technology; CADUI'96 workshop, Namur, Belgium (1996)
14. Puerta, A.R.: A Model-Based Interface Development Environment. IEEE Software 14, 41–47 (1997)
HCI Design Patterns for PDA Running Space Structured Applications Ricardo Tesoriero, Francisco Montero, María D. Lozano, and José A. Gallud Laboratory of User Interaction and Software Engineering, Albacete Research Institute of Informatics, University of Castilla-La Mancha 02071 Albacete, Spain {ricardo, fmontero, mlozano, jgallud}@dsi.uclm.es
Abstract. Nowadays, mobile activities such as m-commerce, m-learning, etc., are increasingly being adopted. Information availability will be a key feature of future applications. Public spaces such as shops, libraries and museums often do not have enough information available to visitors, mainly due to physical space constraints. In this context, PDAs provide a balance between physical dimensions and processing power that is capable of supporting Augmented and Immersive Reality (A&IR) features. However, they have several limitations (e.g., limited screen space). Two usability evaluations of a PDA application currently running at the MCA (the Cutlery Museum of Albacete, Spain) suggested several improvements. To make these solutions reusable, this paper presents a collection of HCI design patterns for PDAs that run this kind of Space Structured Application (SSA). Keywords: Information presentation, Interaction design, HCI standards, Graphical user interface, Architectures for interaction, Computer-augmented environment, Computer-mediated virtual spaces, Interaction techniques, platforms and metaphors.
1 Introduction

Nowadays, mobile activities such as m-commerce [5] [8], m-learning [9], etc., are increasingly being adopted. New mobile activities are being explored; one of them is supporting Augmented and Immersive Reality (A&IR) features in public physical spaces (shops, libraries, museums, etc.), which do not have enough information available for visitors because of physical space constraints. An example is presented in [6] at the Cutlery Museum of Albacete, Spain, known as the MCA. The MCA is an emblematic institution of the city that currently uses an electronic device to guide visitors through the museum. A study involving a comparative evaluation of different platforms for augmenting museums and art galleries was presented as part of [3]. As a result, the PDA tended to outperform the other prototypes, such as the Tablet PC and HMD (Head-Mounted Display). PDAs have several limitations, for example the modest amount of real estate available on the screen. Thus, it is essential to simplify the User Interface (UI) as much as possible.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1206–1215, 2007. © Springer-Verlag Berlin Heidelberg 2007
To address this situation, design patterns [1] [2] [7] seem to be an appropriate tool for providing generic solutions that enable designers to solve these problems [8] [12] [14] [15] [16] [17]. This paper is organized as follows. First, based on usability reports performed on a concrete application [12], we define an environment composed of a set of applications (Space Structured Applications) in which these patterns are valid. We then define categories that group the patterns according to the problems they solve. For space reasons we describe only one pattern per category. Finally, conclusions and future work are presented.
2 Space Structured Applications

The definition of Space Structured Application was a consequence of usability evaluations [12] performed on a production system, the MCA application [6]. Analyzing the usability evaluations, we detected some problems related to HCI on a PDA running the MCA system, and designed solutions for these problems. We then noticed that these solutions were general enough to be reused in other applications that share the MCA system's characteristics. So, we characterized these applications as Space Structured Applications (SSAs), or m-space applications.

An SSA models physical spaces (buildings, floors, rooms, etc.) that contain extra information about the objects dwelling in these places. Users can use this metaphor to browse information or to locate an object or space physically. The main aim of an SSA is to improve the availability of information about physical spaces that is otherwise unavailable due to, for instance, physical space restrictions. Besides, objects that are not physically available may be available virtually. An object virtually represented in an SSA may provide context for a physical object, and vice versa. This contextualized information leads to a richer user experience. Public spaces such as museums, libraries and shops may be modelled by SSAs; a key characteristic of SSAs is that the physical space is part of the application, and the physical position of objects is used to address extra information about them. Since most SSAs model public spaces, accessibility becomes a key issue in these applications. The following list summarizes key aspects that should be covered by SSAs:
1. User position and orientation are essential to keep the virtual and physical spaces synchronized.
2. Most SSA users are visitors, so the effort needed to browse information should be minimized.
3. An SSA may be used to guide people through the physical space instead of just letting them browse it, thus taking an active interaction role too.
4. Accessibility is a key issue to address, because an SSA may help disabled people interact with their environment.
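To make the space metaphor concrete, the hierarchy an SSA models, and the per-level user position of key aspect 1, could be sketched as follows. This is an illustrative sketch only; the class and attribute names are ours, not the authors'.

```python
class Space:
    """A node in the SSA hierarchy: building, floor, room, showcase, ..."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.subspaces = []
        self.objects = {}            # physical objects -> extra information
        if parent is not None:
            parent.subspaces.append(self)

    def position(self):
        """The user's position expressed on each level of the hierarchy."""
        node, levels = self, []
        while node is not None:
            levels.append(node.name)
            node = node.parent
        return list(reversed(levels))

# An object not physically on display can still be addressed virtually:
museum = Space("MCA")
floor1 = Space("Floor 1", museum)
room_a = Space("Knife Room", floor1)
room_a.objects["navaja-17"] = "19th-century folding knife (not on display)"
```

Being in `room_a` then yields the per-level position `["MCA", "Floor 1", "Knife Room"]`, which is exactly the information needed to keep virtual and physical space synchronized.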
3 Related Work

There are many examples of pattern languages successfully applied in HCI. One of the first pattern collections applied to HCI was Common Ground: A Pattern Language for Human-Computer Interface Design [13]. This collection presents loosely coupled patterns organized in two levels: primary patterns and sublanguages. As an evolution of this catalogue, Designing Interfaces: Patterns for Effective Interaction Design [14] added new patterns to its predecessor. Other collections and languages followed, for instance Martijn van Welie's Interaction Design Patterns [17], which organizes patterns according to interface type (Web Design Patterns, GUI Design Patterns and Mobile UI Design Patterns). Another interesting book related to HCI patterns is Jan Borchers's A Pattern Approach to Interaction Design [4], and more recently The Design of Sites [15] presented a complete collection of patterns oriented to Web site applications. Finally, design patterns have also been applied to mobile applications; for instance, in [10] we can find classes of patterns (as pattern languages) related to mobile interaction.
4 Proposal of Categories and Design Patterns

The aim of the pattern language proposed in this paper is not as general as Tidwell's [13] [14], but not as concrete as Gamma's [7]: it presents solutions for a specific environment without providing a concrete implementation. We based our pattern characterization on Roth's [10]. However, we modified this proposal, which was itself based on [7], to use the following characteristics: name, synopsis, context, forces, solution, consequences, implementation, example and schematic description. The collection of HCI design patterns is organized into four categories; each category focuses on related problems. We defined the categories to ease the identification of the best patterns to apply to certain problems. For space reasons we give a summary of each category and a brief description of each pattern, and present only one pattern of each category in detail. The whole catalog of design patterns is available in a technical report [12].

4.1 Orientation Patterns

This category introduces HCI patterns that help users get oriented in a physical space. These patterns improve virtual/physical synchronization of the space in order to locate users within it. They cope with the issues described in point 1 of the SSA characteristics. The patterns belonging to this category are the following:
1. You Are Here (aka Address): A user tries to identify a space somehow. Public spaces are usually identified by names, so these should be supplied to the user. This pattern is widely used on the Web.
2. Multi-Layer Map: Sometimes users need to know their physical position within a space. Physical spaces are structured as a hierarchy, and user position can be determined by the user's position on each level.
3. Signs: This pattern helps users get oriented when they spend a long time in a space and get lost there. A sign is used to synchronize the virtual and physical space.
4. Door at Back: This pattern helps users get oriented when a space transition occurs. A space transition happens when a user moves virtually and physically from one space to another, for instance from one room to another.
As an example of this category we present Door at Back.

Door at Back
1. Synopsis: Spaces are graphically represented by maps. Large buildings have several rooms. As all the rooms of a building do not fit on screen at once, each one is represented by a different screen, producing space transitions when a user moves from one space to another.
2. Context: Users pass through different rooms while visiting buildings. When users move from one space to another, a transition occurs on screen.
3. Forces: This interface transition leads to user disorientation between the physical and virtual space.
4. Solution: Virtual space orientation is usually represented by a map. This map should be automatically oriented so that the door the user entered through is at the bottom of the screen. The door should be clearly marked, e.g., with an arrow pointing in the direction the user entered, as seen in Fig. 1.
5. Consequences: The user gets oriented in the space by recognizing at first sight the view of the room from the moment of entering it.
6. Schematic Description:
Fig. 1. Sample of “Door at back” pattern
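The automatic map orientation in the solution amounts to a single rotation. A minimal sketch (ours, under the assumption of compass-style headings with 0° = north, applied to a north-up map rotated counter-clockwise):

```python
def door_at_back_rotation(entry_heading_deg):
    """
    Counter-clockwise rotation (in degrees) to apply to a north-up map so
    that the direction the user entered in points to the top of the screen,
    leaving the door they came through at the bottom.
    """
    return entry_heading_deg % 360

# Entering a room while walking east (heading 90): rotate the map 90 degrees
# counter-clockwise, so east points "up" and the door sits at the bottom edge.
```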
7. Related Patterns: It can be used jointly with Address. Map orientation may be combined with layout changes depending on map shape (Layout patterns). Based on the W3C Common Sense Suggestions for Developing Multimodal User Interfaces principles [18], this pattern focuses on:
• Satisfying real-world constraints, taking into account physical and environmental suggestions (physical space orientation is treated by this pattern).
• Communicating clearly, concisely, and consistently with users (an arrow represents the user's entrance direction).
• Making users comfortable by easing the load on the user's short-term memory (the arrow helps users get "back to the basics", i.e., the moment they entered the room).
We can relate this pattern to Tidwell's HCI patterns [13] [14], where the following sublanguages may apply: Go Back One Step and Go Back to a Safe Place (the arrow may be used to go to a safe place to orient the user); Bookmark (the entrance is automatically marked as a safe place); and Remembered State (at the time the user entered the room). From the point of view of Van Welie et al. [16], we can relate this pattern to the feedback (the user gets oriented based on a previously known position) and visibility (user guidance) problems. It thus improves learnability and memorability.

4.2 Layout Patterns

Layout patterns were introduced to organize SSAs. Screen resolution on mobile devices is restricted, while the amount of information to be displayed increases due to the virtual/physical space relationship and the objects' extra information. Point 2 of the SSA characteristics can be designed for using the following patterns in this category:
• Landscape: This pattern proposes using the PDA in landscape orientation.
• Vertical-Horizontal Layout: Modify the application layout according to the information to be displayed.
• Layout Transition: It shows the layout change transition.
As an example of this category we present Vertical-Horizontal Layout.

Vertical-Horizontal Layout
1. Synopsis: The information displayed on a portable device's screen should be optimized because screen space is scarce.
2. Context: Usually, there are two types of information to be displayed: main information (information that fulfils the screen's objective) and secondary information (additional information needed to perform other operations).
3. Forces: The shape and size of the main information vary. For instance, maps, photos and videos may be displayed in portrait or landscape.
4. Solution: To optimize screen visualization for the main information, the screen layout is changed to fit the main information as well as possible. Secondary information is displayed "around" the main information so that it remains available.
5. Consequences: The primary information is optimized to fit the screen, and the secondary information is displayed in the available space.
6. Schematic Description:
Fig. 2. Sample of “Vertical-Horizontal Layout” pattern
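A sketch of the choice the pattern describes (illustrative only; function and variable names are ours): scale the main information to fit the screen, then place the secondary information in the larger leftover strip.

```python
def layout_for(main_w, main_h, screen_w, screen_h):
    """Return (scale, placement) for the main information on a given screen."""
    scale = min(screen_w / main_w, screen_h / main_h)   # fit without cropping
    leftover_w = screen_w - main_w * scale              # strip beside the content
    leftover_h = screen_h - main_h * scale              # strip below the content
    placement = "beside" if leftover_w >= leftover_h else "below"
    return scale, placement
```

On a 320×240 landscape screen, a tall portrait photo leaves its secondary strip beside it, while a wide map leaves the strip below it.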
7. Related Patterns: This pattern is closely related to Landscape and Layout Transition. It satisfies the following principles of the W3C Common Sense Suggestions for Developing Multimodal User Interfaces [18]:
• Communicate clearly, concisely, and consistently with users by switching presentation modes when information is not easily presented in the current mode. The screen layout adapts the interface to the main information. It keeps the interface as simple as possible, changing the control layout instead of the controls themselves.
• Make users comfortable by reducing the learning gap of a new user interface.
The related Tidwell sublanguages [13] [14] are: Disabled Irrelevant Things (although secondary items are not disabled, they are not treated at the same level of relevance as the main information) and Good Defaults (the default layout changes according to the main information to be displayed). If we analyze this pattern from the perspective of [16], the problems it addresses are related to Conceptual Model and Natural Mapping (the user knows exactly how to perform operations if he or she had previous experience with the interface before the layout transformation). The usability issues we try to cope with are learnability and memorability.

4.3 Guide Patterns

Design patterns in this category are used to model routes and paths that may be followed to guide users through a physical space based on user preferences. Point 3, and to some extent point 1, of the SSA characteristics are addressed here. The patterns belonging to this category are the following:
• Free Will Navigation: This pattern provides a method to access spaces at any level of the application using the cursor keys only.
• Routes: The Routes pattern provides routes that focus a visit on user preferences.
As an example of this category we present Free Will Navigation.

Free Will Navigation (aka Up-Down and Left-Right, or No Guide)
1. Synopsis: Virtual space navigation is performed with the cursor keys only.
2.
Context: Usually, people using SSAs do not have both hands free (e.g., they may carry baggage), so they should be able to hold and operate the device with one hand only.
3. Forces: Space navigation, one of the most important operations performed by this kind of application, should be easy to perform with one hand and quick to learn.
4. Solution: To cope with this navigation problem we propose controlling navigation with the cursor buttons. The Left and Right keys navigate across space levels (inter-level): the Right button goes one level into the selected space (if a piece is selected on a showcase, pressing the Right button goes into the selected piece), while the Left button goes one level up (if a showcase is being shown, pressing the Left button goes to the room enclosing this showcase). The Up and Down buttons navigate across spaces on the same level (intra-level), selecting a subspace within the same space; they change the selection among labelled items. Labelling the actions that represent the cursors on screen provides action feedback to the user. See Fig. 3.
5. Consequences: The user is aware of the navigation destination when using the cursor keys. If the proposed control is accepted as a standard for SSAs, the learning gap will be minimized. A disadvantage of using labels is that they may obscure the map.
6. Schematic Description:
Fig. 3. Sample of “Free Will Navigation” pattern
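The key handling described in the solution can be sketched as follows (a toy illustration with assumed names, not the MCA implementation): Left/Right move between levels of the space hierarchy, Up/Down cycle the selection among subspaces of the current space.

```python
class Space:
    def __init__(self, name, parent=None):
        self.name, self.parent, self.subspaces = name, parent, []
        if parent is not None:
            parent.subspaces.append(self)

class Navigator:
    def __init__(self, root):
        self.current = root          # space currently shown
        self.selected = 0            # index of the highlighted subspace

    def key(self, k):
        subs = self.current.subspaces
        if k == "RIGHT" and subs:                   # inter-level: into selection
            self.current, self.selected = subs[self.selected], 0
        elif k == "LEFT" and self.current.parent:   # inter-level: one level up
            self.current, self.selected = self.current.parent, 0
        elif k == "UP" and subs:                    # intra-level: previous item
            self.selected = (self.selected - 1) % len(subs)
        elif k == "DOWN" and subs:                  # intra-level: next item
            self.selected = (self.selected + 1) % len(subs)
        return self.current.name
```

All four keys operate relative to the current space, so the same key always means the same thing regardless of the level, which is what makes the scheme quick to learn with one hand.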
7. Related Patterns: The main relationship is with the Landscape layout pattern, because portable devices such as PDAs can be used with one hand only if they are held in landscape position. It is also related to Right-Left Handed Users. With respect to the W3C Common Sense Suggestions for Developing Multimodal User Interfaces principles [18], this pattern satisfies the following:
• Satisfy real-world constraints by assigning the cursor keys to the most common operation in this kind of application. It also applies physical suggestions by requiring one hand only instead of both hands.
• Communicate clearly, concisely, and consistently with users by using the same keys throughout the navigation system regardless of space level (keeping the interface simple).
According to Tidwell's sublanguages [13] [14], this pattern is related to: Optional Detail On Demand (users access information according to space level); Short Description (information about navigation is displayed on screen); and Convenient Environment Actions (people usually go only one level up or down). Finally, from Van Welie's perspective [16], the problems it addresses are related to Visibility (user guidance: navigation can be used to guide users across the building), Affordance (it uses the space metaphor) and Feedback (operations are labelled). The usability issues we try to cope with are learnability and memorability. Note: We propose this pattern as a standard way of navigating across SSAs.

4.4 Accessibility Patterns

The accessibility category groups patterns that can be applied to improve application access for disabled people. Patterns related to point 4 of the SSA characteristics are grouped here:
• Space Audio Perception: A voice tells the user which space has been selected.
• Right-Left Handed Users: It adapts an SSA designed with the Landscape pattern for use by right- or left-handed people.
• Zoom: It provides controls to change the font size when users are reading documents.
As an example of this category we present Right-Left Handed Users.

Right-Left Handed Users
1. Synopsis: This pattern adapts the system to the user's most skilled hand.
2. Context: People usually do not have the same skill in both hands. So, if an application has to be used with one hand only, it is logical that operations be performed with the skilled hand.
3. Forces: Right- versus left-handed users.
4. Solution: The solution relies on two measures: mirroring the screen horizontally, and changing the cursor control behaviour (Up-Down, Left-Right).
5. Schematic Description:
Fig. 4. Sample of “Left-Right handed users” pattern
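The two measures of the solution can be sketched as a small input-adaptation layer (names are illustrative, and the exact key remapping is our assumption about the behaviour change described above):

```python
# Cursor-key inversion for left-handed use, per the pattern's solution.
SWAPPED = {"UP": "DOWN", "DOWN": "UP", "LEFT": "RIGHT", "RIGHT": "LEFT"}

def adapt_input(key, x, y, screen_w, left_handed):
    """Remap a key press and a stylus/tap position for left-handed use."""
    if not left_handed:
        return key, x, y
    # Mirror the screen horizontally and invert the cursor controls.
    return SWAPPED.get(key, key), screen_w - 1 - x, y
```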
6. Related Patterns: With respect to the W3C Common Sense Suggestions for Developing Multimodal User Interfaces principles [18], this pattern satisfies the following:
• Satisfy real-world constraints by using the easiest mode available on the device to perform each task.
• Communicate clearly, concisely, and consistently with users by making commands consistent; following organizational suggestions keeps the interface simple.
Relating this pattern to Tidwell's [13] [14], we found it is related to Convenient Environment Actions (actions are adjusted to the user's perspective). This pattern improves flexibility by providing explicit control. It also improves learnability and memorability.
5 Conclusions

As technology advances and new types of applications emerge, a set of applications (SSAs, or m-space applications) arises. These applications have new characteristics and constraints that define them. Problems such as user position and orientation, information browsing, visitors' guidance and accessibility should be managed. To solve these problems, we have proposed an HCI design pattern language.
Patterns have been grouped into four categories that solve related problems, providing a useful way to identify them. One of the most relevant contributions of this paper is the proposal of a standard control for navigating across virtual spaces using the cursor keys only (the "Free Will Navigation" pattern). Finally, from our perspective, SSAs are not mature enough, and more research is needed to achieve natural HCI within these environments. Thus, we propose some future work in this field to improve HCI in this environment. Our work in progress is currently focused on the evaluation of these patterns. Usability evaluation tests are being designed to measure usability before and after these patterns are applied. We are also considering performing these tests both inside and outside an HCI lab and comparing the measurements. To improve HCI in SSAs, sensors could make manual navigation with the cursor keys unnecessary. To provide this functionality, there are many technologies currently available (RFID, barcodes, IrDA, etc.) that could be used to build location-aware applications. Finally, the catalogue presented in this paper is not complete, so the addition of new patterns is planned.
Acknowledgements. We would like to thank the Spanish CICYT project TIN2004-08000-C03-01 for funding this work, which was also supported by grant PCC-05-005-1 from the JCCM.
References
1. Alexander, C., Ishikawa, S., Silverstein, M., Jacobson, M., Fiksdahl-King, I., Angel, S.: A Pattern Language: Towns, Buildings, Construction. Oxford University Press, New York (1977)
2. Alexander, C.: The Timeless Way of Building. Oxford University Press, New York (1979)
3. Baber, C., Bristow, H., Cheng, S.L., Hedley, A., Kuriyama, Y., Lien, M., Pollard, J., Sorrell, P.: Augmenting Museums and Art Galleries. In: INTERACT '01, The International Federation for Information Processing, Tokyo, Japan, pp. 439–447 (2001)
4. Borchers, J.: A Pattern Approach to Interaction Design. John Wiley & Sons, New York (2001) ISBN-10: 0471498289, ISBN-13: 978-0471498285
5. Elliot, G., Phillips, N.: Mobile Commerce and Wireless Computing. Addison-Wesley, London, UK (2004) ISBN-10: 0201752409, ISBN-13: 978-0201752403
6. Gallud, J.A., Penichet, V.M.R., Argandeña, L., González, P., García, J.A.: Digital Museums: A Multi-Technological Approach. In: HCI International 2005. Lawrence Erlbaum Associates, Las Vegas, USA (2005)
7. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns: Abstraction and Reuse in Object-Oriented Designs. In: Nierstrasz, O. (ed.) ECOOP 1993. LNCS, vol. 707. Springer, Heidelberg (1993)
8. Gary, S., Simon, S.: A Service Management Framework for M-Commerce Applications. Mobile Networks and Applications 7, 199–212 (2002) ISSN: 1383-469X
9. Georgiev, T., Georgieva, E., Smrikarov, A.: M-Learning – A New Stage of E-Learning. In: International Conference on Computer Systems and Technologies, CompSysTech '2004 (2004)
10. Roth, J.: Patterns of Mobile Interaction. Personal and Ubiquitous Computing 6(4). Springer, London (September 2002) ISSN 1617-4909 (Print), 1617-4917 (Online)
11. Tesoriero, R., Lozano, M.D., Gallud, J.A., Penichet, V.M.R.: Evaluating the User Experience of PDA-Based Software Applied in Art Museums. In: 3rd Conference on Web Information Systems and Technologies, WebIST 2007, Barcelona, Spain (2007)
12. Tesoriero, R., Lozano, M.D., Gallud, J.A., Montero, F.: HCI Design Patterns for SSA to PDA Applications in Art Museums. Technical Report (2007), https://www.dsi.uclm.es/trep.php?&codtrep=DIAB-07-01-2
13. Tidwell, J.: Common Ground: A Pattern Language for Human-Computer Interface Design (1999), http://www.mit.edu/~jtidwell/common_ground_onefile.html
14. Tidwell, J.: Designing Interfaces: Patterns for Effective Interaction Design. O'Reilly (November 2005) ISBN-10: 0-596-00803-1, ISBN-13: 978-0596008031, http://designinginterfaces.com/
15. Van Duyne, D.K., Landay, J.A., Hong, J.I.: The Design of Sites. Addison-Wesley Professional (July 2002) ISBN-10: 020172149X, ISBN-13: 978-0201721492
16. Van Welie, M., Trætteberg, H.: Interaction Patterns in User Interfaces. In: Pattern Languages of Programs Conference (PLoP 2000), Allerton Park, Monticello, Illinois, USA (2000)
17. Van Welie, M.: Interaction Design Patterns (2007), http://www.welie.com/patterns/index.html
18. W3C Working Group: Common Sense Suggestions for Developing Multimodal User Interfaces. W3C Note, September 11, 2006, http://www.w3.org/TR/mmi-suggestions/
Task-Based Prediction of Interaction Patterns for Ambient Intelligence Environments Kristof Verpoorten, Kris Luyten, and Karin Coninx Hasselt University, Expertise Centre for Digital Media and transnationale Universiteit Limburg Wetenschapspark 2, 3590 Diepenbeek, Belgium {kristof.verpoorten, kris.luyten, karin.conix}@uhasselt.be
Abstract. In this paper we introduce a monitoring system to support the user in executing tasks in an ambient intelligence environment. In contrast with traditional environments, the goal of the user cannot always be defined beforehand, but is determined while the user interacts with the environment. The monitor observes the user's activities and learns to correlate a set of user actions with a goal. The system maps activities to a task model and reuses these models to take appropriate actions when similar user actions are observed later. Keywords: task patterns, activity patterns, ambient intelligence environment, pro-active agent system.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1216–1225, 2007. © Springer-Verlag Berlin Heidelberg 2007

1 Introduction

Ambient intelligence environments are extremely complex environments when it comes to engineering interactive systems to support users in them. Traditional approaches assume the goals of the user are known when designing the interactive system, so a task analysis and modelling stage can be used to determine the interactions between the user and the system. This often leads to a set of predefined dialogs that the user navigates through in a particular order and interacts with to reach the envisioned goals. A dialog describes the interactions between the user and the user interface and contains the representation of a set of tasks that are supported during the same period of time. In an ambient intelligence environment, a user is seldom constrained to a certain set of interactions required to complete a set of tasks, and it is often difficult to perceive dialogs that frame the interaction. In this paper we want to offer a way to cope with the degrees of freedom the user has in such an environment while still supporting the user in a sensible way in completing the tasks at hand. This requires the creation of a pro-active system that supports the user in reaching his or her goals while requiring only partial information about the different models. We present a novel approach that builds a task model out of patterns based on the user activities that are observed. A special-purpose agent monitors the user's actions and reactions to certain events and the context in which these events occur. In order to do this, the environment needs to be represented in a way that can be processed by
Task-Based Prediction of Interaction Patterns for Ambient Intelligence Environments
the agent. Our system uses a graph-based representation of the environment, where each node represents a resource (device) in the environment or a property of a resource. The edges specify the relations between these resources or properties; a relation that is often used is the distance between resources. The graph contains all relevant data about the user's environment. At the moment, relevant data is limited to information about other devices in the neighbourhood of the user's device. The node representing the user's device also contains a link to information about the device's internal context (running programs, battery level, ...). The combination of this information with the environment information is enough for the agent to support the user's tasks.

The system's goal is to be self-learning: by observing the actions of the user and the associated goals, it creates patterns of interaction that can lead to such a goal. For the actual learning, decision trees and a sliding window are used [7]. The decision trees capture relevant information about the environment and how it affects the user's goals. The sliding window compares a set of actions of the user to previous interaction patterns to determine the next action that needs to be supported and how it contributes to the task at hand. The task and goals provide a context for the pattern. By using decision trees and the sliding window simultaneously, we increase the likelihood that a correct action is selected and can anticipate new user behaviour that was not encountered before. A self-learning system often makes incorrect choices and invokes incorrect behaviour that hinders rather than supports the user. The main problem in an ambient environment is that there is no standard way to give feedback to the system and correct its behaviour.
For this purpose we investigate the use of predefined design patterns in these environments [10] that allow the user to interact with the self-learning system. In this paper we consider the usage of patterns that goes beyond the traditional Alexandrian definition of architectural or design patterns [15]. Besides HCI-oriented task and feature patterns, the recognition of patterns in activity data plays an important role in our approach.
2 Related Work

To standardize the way the user can interact with a ubiquitous computing environment, Chung et al. have developed and evaluated several design patterns for ubiquitous computing [10]. They developed an initial pattern language for ubiquitous computing consisting of 45 pre-patterns describing application genres, physical-virtual spaces, interaction and systems techniques for managing privacy, and techniques for fluid interactions. Each of their pre-patterns consists of a name and a letter-number pair, the pattern's background, the problem the pattern addresses, the solution(s) to the problem, and references to related work. Their patterns tend to focus on high-level issues, such as user needs, rather than on specific user interfaces and interaction techniques. An evaluation showed that the pre-patterns helped new and experienced designers unfamiliar with ubiquitous computing to avoid design problems. These patterns are of great use to us when developing our context-sensitive pro-active application.
K. Verpoorten, K. Luyten, and K. Coninx
Software applications are becoming more and more complex, and so are user interfaces. Therefore, Sinnig et al. argue that a disciplined form of reuse is needed for user interface development [17]. They explored the combination of components and patterns for this purpose. Patterns help in the reuse of well-known and proven design solutions, and play a significant role throughout the whole UI development process. Components embody reusable solutions at the implementation level. As a case study, they discussed the development of a web application for selling and managing an IP-phone service using the PCB (pattern- and component-based) process. This research shows it is possible to build a UI using patterns.

Pattie Maes [11] suggests the idea of agents acting as personal assistants. These agents acquire their competence by learning from the user, as well as from agents assisting other users. A few prototypes were built using this technique, including agents that provide assistance with meeting scheduling, e-mail handling, news filtering, and entertainment selection. The problem with this approach is that each agent monitors only one program (agenda, e-mail client, ...). We think more useful results can be obtained by monitoring the entire environment and the user's personal device.

Charles Isbell et al. [12] try to predict which task the user wants to execute with a remote control. They deploy a user interface that allows the user to execute a task using only one remote to control several devices. They describe tasks as clusters of similar commands, which are often used together. With the data recorded from two users over several weeks they were able to collect enough interactions to divide all available commands into several clusters. Each cluster represents a task the user is performing, during which he or she uses the commands in the cluster. They predict the next task the user is likely to perform by looking at the previous tasks he or she has performed.
That way the user interface of the remote is adapted to support the probable next task, avoiding a cluttered UI with too many buttons. The way they predict the next task has some similarities with our sliding-window approach (section 4.1). Another resemblance to our approach is that when our agent is not entirely sure about the next action to take, it will not execute an action on behalf of the user, but will adapt the UI on the user's device to make it easier for the user to execute the task if desired. However, in contrast to us, they only focus on devices that can be controlled using a remote. Our system is designed to support the user with almost everything he or she can do using a PDA. Our system will also take over simple tasks from the user once it has had enough training, while the remote developed by Isbell et al. only adapts the UI of the remote to help the user execute his or her task.

Byun and Cheverst propose the utilization of context history together with user modelling and machine learning techniques to create pro-active applications. In [13] they describe an experiment to examine the feasibility of their approach for supporting pro-active adaptations in the context of an intelligent office environment. The application uses two different approaches to obtain pro-activity: pro-active rule-based adaptation, and pro-active modelling adaptation. With the first approach, users have to reconfigure the system as their preferences change. The second approach automatically adapts those predefined rules when observation makes clear that the user's preferences have changed. In order to learn the patterns of the user's behaviour, decision trees are used. Decision trees have the advantage of being much easier to
understand for designers than other machine learning techniques, such as neural networks. Byun's conclusion is that context history has a concrete role in supporting pro-active adaptation in ubiquitous computing environments. This work supports our decision to use context history as our main source of data when trying to predict the user's next action. Both our decision tree and sliding window techniques (section 4.1) use context history.

Obviously, a pro-active agent takes some control away from the user. Is the user willing to make this sacrifice? Barkhuus and Dey have examined this very question in [14]. Their conclusion is that the user is willing to accept a large degree of autonomy from applications as long as the application's usefulness is greater than the cost of limited control.
3 Environment Representation

Our system represents the environment of the user in such a way that it is understandable to a computer program. The system monitors the environment continuously (through a set of sensors) and stores all relevant information in a graph-like structure. An example of such a graph is shown in Fig. 1.
Fig. 1. An environment graph representing the user's device and several other devices in the vicinity
Each node of the graph represents a resource (device) in the environment or a property of a resource and the edges represent relations between the resources. The node representing the user's device will be in the center of the graph. It will have several outgoing edges to all other devices in the vicinity that can be of any relevance to the user's context. It will also have an outgoing edge to a node representing the
device's internal context. The internal context node contains all relevant information about the device itself (running programs, battery level,...). The representation of the environment is sufficient for our current needs, but it is still work in progress and will be further extended when necessary (e.g. by using the CoDAMoS ontology [16]).
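The environment graph described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the node and relation names (`user-pda`, `internal-context`, `distance`) and the distance unit are assumptions made for the example.

```python
class Node:
    """A resource (device) or property in the environment graph."""
    def __init__(self, name, properties=None):
        self.name = name
        self.properties = properties or {}   # e.g. running programs, battery level
        self.edges = {}                      # neighbour name -> relation data

    def connect(self, other, **relation):
        # Add a symmetric edge annotated with relation data (e.g. distance).
        self.edges[other.name] = relation
        other.edges[self.name] = relation

# The user's device sits at the centre, linked to nearby resources and to
# an internal-context node holding device-local information.
pda = Node("user-pda")
internal = Node("internal-context",
                {"running_programs": ["agenda"], "battery_level": 0.8})
printer = Node("printer")
pda.connect(internal)
pda.connect(printer, distance=3.5)   # assumed distance in metres

# Query the often-used distance relation to find nearby resources.
nearby = [n for n, rel in pda.edges.items()
          if rel.get("distance", float("inf")) < 5]
print(nearby)   # → ['printer']
```

The agent would traverse such a graph when deciding which resources are relevant to the user's current context.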
4 Patterns in Ambient Intelligence Environments

4.1 Activity Patterns

At the most fundamental level, patterns are considered as reoccurring sequences of user (inter)actions inside the ambient intelligence environment. In our approach, on the syntactical level, the universe of interactions is an alphabet (from now on referred to as the interaction alphabet). Each letter in that alphabet is an atomic interaction. A word is a composition of letters of the interaction alphabet, and can be related to a sequence of activities that directly contributes to one of the tasks that can be executed in the ambient intelligence environment. The length of such a word, and thus of the sequence of actions, is limited by the maximum number of interactions that are required to complete a dialog (introduced in section 1). A word is always executed in a particular context. This means it is related to a (set of) task(s) from the task model and to a (set of) node(s) from the environment graph. The former indicates that the interactions are done to complete a task; the latter that these interactions make use of resources and have an execution context.

Based on the long-standing ideas of processor instruction prediction [5] and more recent work using Markov models [6], we think the dialog model that guides the user through the user interface can be composed dynamically. Three different stages need to be tackled here:

1. feature extraction from the observations,
2. determining the next actions that can follow the observed set of actions,
3. predicting the likelihood of each of these actions.

We describe each step in more detail in the next paragraphs. When the user is interacting with the environment, the agent continuously monitors her or his actions executed in the ambient environment. Each action is encoded as a letter from the interaction alphabet and added to a list containing all executed actions.
After a while, interesting patterns will start to occur in that list: several actions will frequently group together and form recognizable subsets (interaction words) in the list. The system can look for these subsets in the user's actions by comparing "partial words" already executed by the user, and use them to determine the next action the user is likely to execute. While the user is interacting with the ambient environment, her or his actions are monitored by the system. With each action the user executes, it searches for previously recorded interaction patterns similar to the user's activity. It does this by creating a window containing the last x actions executed by the user (a partial word). The system uses several window lengths between a maximum and a minimum. It first tries to find reoccurring patterns using the maximum window length. Of course, with a long window it will be harder to find a match, but the
likelihood of the action predicted from it will be higher. When the system does not find a match, it starts looking for matches with shorter windows, thus predicting actions with a lower likelihood. This goes on until the system finds a match or reaches the minimum window length. When a matching window is found (Fig. 2), the action that occurs directly after the match is probably the next action the user is going to execute. The reasoning behind this is simple: when the x previous actions are part of a previously recorded word in the list, one can predict the next actions by completing the word. Currently we use a simple sliding-window approach, since we assume the actions that belong to an interaction word are always executed sequentially. One can imagine more complex pattern-matching algorithms if parallel actions in an interaction word are allowed: in that case the sequence becomes less important than the actual occurrence of an action in a word.
Fig. 2. Finding a matching pattern using the "sliding window" technique
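The window-shrinking match described above can be sketched as follows. The window bounds and the single-letter action alphabet are assumptions made for illustration; the paper does not fix concrete values.

```python
def predict_next(history, recent, max_w=5, min_w=2):
    """Predict the user's next action from the action history.

    history: list of all executed actions (letters of the interaction alphabet).
    recent:  the most recent actions, from which the partial word is taken.
    Returns (predicted_action, matched_window_length) or None.
    """
    for w in range(max_w, min_w - 1, -1):       # try the longest window first
        if len(recent) < w:
            continue
        partial = recent[-w:]                   # the "partial word"
        # Slide the window over earlier history; the action directly after
        # a match completes the interaction word and is the prediction.
        for i in range(len(history) - w):
            if history[i:i + w] == partial:
                return history[i + w], w        # longer match = higher likelihood
    return None

history = list("abcxabcyab")
print(predict_next(history, list("abc")))   # → ('x', 3)
```

A longer matched window yields a higher-confidence prediction; when no long match exists, the loop falls back to shorter windows, e.g. `predict_next(history, list("yab"))` matches only the length-2 word `"ab"` and predicts `'c'` with lower confidence.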
A second machine learning method used by our system is decision trees. Instead of relating the user's actions to previously executed actions, they are related to the user's environment context at the time of execution. Each time the user executes an action, it is added to the decision tree together with the current context. The next time a similar context occurs, the agent is able to predict the expected action.

Both the sliding-window and the decision-tree techniques are used simultaneously when the agent tries to predict the user's actions. By using both techniques together, we can estimate the prediction's likelihood: when both techniques have the same outcome, the prediction is more likely to be correct than when the algorithms disagree. The agent has to be able to estimate the likelihood of the predicted pattern. It is important that this is done as accurately as possible, because wrong predictions will make the user lose trust in the system. Because we use both sliding windows and decision trees, there are two ways to calculate a prediction's likelihood. First, when using the sliding window, the length of the partial word that matches a word in the list can be used to indicate the likelihood: the longer the word, the more likely the predicted action will be executed. Secondly, the decision tree can also be used to calculate the likelihood: each time the agent "sees" a situation it has already encountered before, the likelihood of the action executed in that situation is incremented.
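One plausible way to combine the two likelihood estimates is sketched below. The combination rule, the normalisation constants and the penalty for disagreement are our own assumptions; the paper only states that agreement between the techniques makes a prediction more likely.

```python
def combine(window_pred, tree_pred, window_len=0, tree_count=0,
            max_window=5, seen_cap=10):
    """Fuse sliding-window and decision-tree predictions into (action, confidence).

    window_len: length of the matched partial word (longer = more likely).
    tree_count: how often this context/action pair has been seen before.
    max_window, seen_cap: assumed normalisation constants.
    """
    if window_pred is None and tree_pred is None:
        return None
    w_conf = window_len / max_window if window_pred is not None else 0.0
    t_conf = min(tree_count, seen_cap) / seen_cap if tree_pred is not None else 0.0
    if window_pred == tree_pred:
        # Agreement between the two techniques raises the overall likelihood.
        return window_pred, min(1.0, w_conf + t_conf)
    # Disagreement: keep the stronger predictor, at reduced confidence.
    if w_conf >= t_conf:
        return window_pred, w_conf * 0.5
    return tree_pred, t_conf * 0.5

print(combine("print", "print", window_len=4, tree_count=8))   # → ('print', 1.0)
```

When confidence falls below some threshold, the agent would adapt the UI rather than act on the user's behalf, in line with the behaviour described in section 2.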
4.2 Task Patterns

The previous section discussed the first level of patterns used in our approach: patterns that are unlike other HCI or design patterns but have more in common with data patterns as they occur in machine learning. This section considers a more classic kind of HCI pattern: the task pattern. A task pattern offers a solution to a particular problem (e.g. find an item in a list) for a given context; it is a reoccurring solution for a problem we encounter while doing task analysis. However, in the situation described above there is no predefined task model that describes the system; rather, there is a set of observed user activities and available resources. We discuss how to progress toward dynamically selecting task patterns based on the observed activity patterns. This type of task pattern is also referred to as a feature pattern: a pattern that describes the activities of the user while using a certain feature of a system and that is part of a bigger task model [8]. The activity pattern and the related environment graph describe the context of the pattern.

Based on previous work [9], the user actions included in the activity pattern are interactions with services accessible from the user environment. Interaction with these services is done using the interaction resources available in the user environment and is transparent to the end-user. The service-interaction description included in the services contains a feature pattern that can be combined with other feature patterns to obtain a structured task model for the predicted goal of the user. Figure 3 gives an overview of this approach. Each service has a feature pattern attached that serves as a building block for an overall task model when a combination of services is being used. Each action from the activity is related to a service that is offered in the environment.
A set of actions implies a set of services being accessed, and each service has its own specific interaction behaviour described by a task specification. Since a task specification can be considered equivalent to a dialog or behaviour model (depending on the notation used), the different dialog models of the services involved, and of the services related to the predicted actions, can be merged into one dialog model. Once this is completed, the next user interface elements the user is likely to interact with can be found by following the different possible paths in the dialog model. Figure 3 shows exactly this approach: notice that for the predicted actions there is also a dialog model, merged with the previous dialog model by an edge that allows the user to progress from using the presentations for services X and Y toward using the presentation of service Z. The edges between dialog models belonging to different action words are added automatically and depend on how the task models (feature patterns) of the services are merged. For example, suppose the top node of the task model for service X is A and for service Y it is B. Since X and Y are accessed together, these top nodes are merged as (A ||| B), while a merge with the top node C from service Z results in (A ||| B) >> C, since service Z is only used after services X and Y.
Fig. 3. On the left, the set of services accessible in the environment with their associated dialog models; on the right, the action words with the user activities and their corresponding services. The combination of services leads to a combined dialog model guiding the user through the user interface.
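The merge of the services' top nodes can be illustrated with a minimal sketch. The operator symbols (||| for interleaving, >> for enabling) follow the paper's example; representing the task expressions as plain strings is an assumption made purely for illustration.

```python
def merge_parallel(patterns):
    # Services accessed together: interleave their top nodes with |||.
    return "(" + " ||| ".join(patterns) + ")"

def merge_sequential(first, later):
    # A service used only afterwards is appended with the enabling operator >>.
    return first + " >> " + later

# Services X and Y are accessed together (top nodes A and B),
# service Z (top node C) is only used after them:
xy = merge_parallel(["A", "B"])
model = merge_sequential(xy, "C")
print(model)   # → (A ||| B) >> C
```

This reproduces the (A ||| B) >> C expression from the running example; a real implementation would of course merge task trees rather than strings.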
5 Current Status and Future Work

Currently, both the sliding window and the decision trees are implemented. The system is not yet able to detect the environment context itself; at the moment, tests are done by simulating the environment with a software component. The sliding window and decision trees are trained while simulating the environment, just as if the environment were detected by the system itself. The agent also suggests actions when it is able to make a reliable prediction and tries to help the user execute that prediction by adapting the user interface to the user's needs. Future work includes replacing the software component that simulates the environment by one that can actually detect the environment context using the sensors available to the system; these are standard sensors present on any modern device (WiFi, Bluetooth, ...). The usage of task and feature patterns and their corresponding dialog models is also still work in progress. In previous work we already developed some basic components to support this approach, such as service annotation [9] and task prediction [6].
6 Conclusions

In this paper we introduced a monitoring system to support the user in an ambient intelligence environment. The system is able to make sense of the environment context and learns the user's expectations in certain contexts. The next time it encounters a similar environment, it will try to support the user by either executing the action on
her or his behalf, or by adapting the user interface to support the user's interactions in the ambient environment. Patterns are used both for monitoring and predicting actions and for adapting the user interface.

Acknowledgments. Part of the research at EDM is funded by EFRO (European Fund for Regional Development), the Flemish Government and the Flemish Interdisciplinary Institute for Broadband Technology (IBBT). Funding for this research was also provided by the Fund for Scientific Research Flanders (F.W.O. Vlaanderen), FWO project nr. G.0461.05.
References

1. Baldonado, M., Chang, C.-C.K., Gravano, L., Paepcke, A.: The Stanford Digital Library Metadata Architecture. Int. J. Digit. Libr. 1, 108–121 (1997)
2. Bruce, K.B., Cardelli, L., Pierce, B.C.: Comparing Object Encodings. In: Ito, T., Abadi, M. (eds.) TACS 1997. LNCS, vol. 1281, pp. 415–438. Springer, Heidelberg (1997)
3. van Leeuwen, J. (ed.): Computer Science Today. LNCS, vol. 1000. Springer, Heidelberg (1995)
4. Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs, 3rd edn. Springer, Heidelberg, New York (1996)
5. Patterson, D.A., Hennessy, J.L.: Computer Architecture: A Quantitative Approach, 2nd edn. Morgan Kaufmann Publishers (1996) ISBN 1-55860-329-8
6. Rigole, P., Clerckx, T., Berbers, Y., Coninx, K.: Task-Driven Automated Component Deployment for Ambient Intelligence Environments. Elsevier Journal on Pervasive and Mobile Computing (in press)
7. Mitchell, T.M.: Machine Learning. McGraw-Hill Science (1997) ISBN 0-07-042807-7
8. Javahery, H., Seffah, A., Engelberg, D., Sinnig, D.: Migrating User Interfaces between Platforms Using HCI Patterns. In: Multiple User Interfaces: Multiple Devices, Cross-Platform and Context-Awareness, ch. 12. Wiley (2003)
9. Vermeulen, J., Vandriessche, Y., Clerckx, T., Luyten, K., Coninx, K.: Service-interaction Descriptions: Augmenting Services with User Interface Models. In: Proc. of EHCI-HCSE-DSVIS'07 (March 2007)
10. Chung, E.S., Hong, J.I., Lin, J., Prabaker, M.K., Landay, J.A., Liu, A.L.: Development and evaluation of emerging design patterns for ubiquitous computing. In: Proceedings of the 2004 Conference on Designing Interactive Systems: Processes, Practices, Methods, and Techniques (Cambridge, MA, USA, August 1–4, 2004). DIS '04, pp. 233–242. ACM Press, New York (2004)
11. Maes, P.: Agents that reduce work and information overload. In: Human-Computer Interaction: Toward the Year 2000, pp. 811–821 (1995)
12.
Isbell Jr., C.L., Omojokun, O., Pierce, J.S.: From devices to tasks: automatic task prediction for personalized appliance control. Personal and Ubiquitous Computing 8(3–4), 146–153 (2004)
13. Byun, H.E., Cheverst, K.: Utilising context history to support proactive adaptation. Applied Artificial Intelligence 18(6), 513–532 (2004)
14. Barkhuus, L., Dey, A.K.: Is context-aware computing taking control away from the user? Three levels of interactivity examined. In: Dey, A.K., Schmidt, A., McCarthy, J.F. (eds.) UbiComp 2003. LNCS, vol. 2864, pp. 149–156. Springer, Heidelberg (2003)
15. Alexander, C.: A Pattern Language: Towns, Buildings, Construction (Center for Environmental Structure Series). Oxford University Press (1977) ISBN 0-19-501919-9
16. Preuveneers, D., Van den Bergh, J., Wagelaar, D., Georges, A., Rigole, P., Clerckx, T., Berbers, Y., Coninx, K., Jonckers, V., De Bosschere, K.: Towards an extensible context ontology for ambient intelligence. In: European Symposium on Ambient Intelligence, pp. 148–159 (November 2004)
17. Sinnig, D., Javahery, H., Forbrig, P., Seffah, A.: Patterns and Components for Enhancing Reusability and Systematic UI Development. In: Proceedings of HCI International, Las Vegas, USA (2005)
Patterns for Task- and Dialog-Modeling

Maik Wurdel¹, Peter Forbrig¹, T. Radhakrishnan², and Daniel Sinnig²

¹ Software Engineering Group, Department of Computer Science, University of Rostock, Albert-Einstein-Str. 21, 18051 Rostock, Germany
{maik.wurdel, pforbrig}@informatik.uni-rostock.de
² Department of Computer Science, Concordia University, 1455 De Maisonneuve Blvd. West, H3G 1M8, Montreal, Canada
{krishnan, seffah, d_sinnig}@cs.concordia.ca
Abstract. The term Context of Use has received much attention in HCI in recent years. In this paper, the integration of context information into task models is described. The notion of context is formulated and used to annotate the task model. The reuse of such context-sensitive task models in the form of task patterns is also examined.

Keywords: task modeling, context-sensitivity, task patterns, context of use.
1 Introduction

The development of UIs is complex and requires the integration of different disciplines. Model-based UI development has gained much attention from various researchers [5, 6, 8, 9], due to its ability to foster the integration of different viewpoints into the early stages of the development process. In general it tackles the problem of UI development by using different declarative models and the relationships between these models. The task model, as a description of the tasks and goals, is a commonly accepted starting point for model-based UI development processes. Other models that have to be taken into account describe the environmental circumstances of the execution of tasks. The Context of Use (CoU), as an abstraction of these circumstances, influences the tasks a user has to fulfill; note that some tasks might not be useful or possible in a certain context. Adapting the UI according to the context can improve human-computer interaction by providing a UI tailored to the specific CoU.

In this paper, we demonstrate how context models are integrated into a model-based UI development process, with particular emphasis on the task model. A definition of the term context and a formalization are given, which are later used to enrich task models. Finally, we illustrate how context-sensitive task patterns can be used as building blocks for the creation of context-sensitive task models, and how generic model fragments can be reused by means of patterns.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1226–1235, 2007. © Springer-Verlag Berlin Heidelberg 2007

2 Model-Based Development and Task Models

Modeling is understood as the abstraction of a real system by removing details that are irrelevant at the current level of abstraction [3]. Modeling helps software designers to
manage complexity by abstracting from low-level implementation details. In HCI, there are different kinds of UI models that can be taken into account to describe the various facets of the UI. Among these models, the task model has gained special attention, as it often forms the starting point from which UI development is initiated. Task models describe the tasks (and sub-tasks) a user has to execute to achieve a certain goal. A task is a hierarchical structure which expresses the activities a user has to accomplish to fulfill the task. A goal is understood as the result a user wants to obtain after the task execution. Task modeling is a user-centered approach, and thus task model designers concentrate on users and capture their activities. Considering how a user can reach a goal using a certain software system can foster usability: even without using task models in UI generation processes, they help capture usability requirements, since understanding the task world of the user can lead to a better UI design.

Model-based UI development describes the process of (semi-)automated UI generation using a set of declarative models which cover all important aspects of the envisioned UI. Most model-based UI approaches specify the user, task, domain (application), platform, dialog, layout and/or presentation model [7, 8]. Model-based UI development can be seen as a series of model transformations, where abstract models (e.g. task, user, domain model) gradually evolve into more concrete models (e.g. dialog, layout, presentation model) which finally result in the implementation of the UI.

Since the design of UI models is complex and error-prone, tool support is needed to carry out model-based UI development efficiently. Especially tedious tasks can be supported or automated. Furthermore, tool support is able to hide technical details of the technologies used, so that the design is made at a conceptual level.
A model-based UI development environment can consequently be understood as a software system which helps software designers to execute a certain model-based UI development process. Typical functions of such an environment are the design, validation, and animation of the model instances. Furthermore, the environment should provide decision-making assistance and involve the end-user in all stages of the development. The generation of prototypes based on the designed models helps to evaluate the decisions made and to integrate the stakeholders' needs in early stages of development; design decisions can be reconsidered based on the given feedback.
Fig. 1. Model-based UI development process and its steps
Fig. 1 shows the general rationale of a model-based UI development process. It starts with an analysis of the goals and tasks, which results in an overall task model. This step will be further detailed in a subsequent section. Next, the resulting task model
has to be adapted to the current context by taking into consideration aspects of the user, the end-device and the environment. This refined task model is less complex, since tasks unnecessary in this context are already filtered out. Based on the information in the task model, the dialog is constructed. It specifies groupings of tasks into dialog views and defines transitions between the various dialog views. At this stage of the process an abstract prototype can already be generated (more details on this generation process can be found in [4]), which exemplifies the general application flow of the later UI. This prototype is based on the designed dialog structure and the temporal relationships of the involved tasks. The creation of the dialog structure is followed by the definition of the presentation and layout model: the former associates interaction elements (e.g. buttons, text fields, labels) with the various tasks, whereas the latter describes the arrangement and the look & feel of these elements.

Having described our model-based UI development methodology, we now discuss the different types of task models that may be involved in the various development steps. In general we distinguish between three types of task models:

1. the task model of the problem domain (analysis task model),
2. the context-sensitive task model of the envisioned software system,
3. the task model of the software system for a particular CoU (context-insensitive).

The analysis of the domain of interest results in the analysis task model (1.). It reflects the knowledge of the domain of how a problem can be tackled in a general way, independent of a software system [4]. The analysis is performed in close cooperation with the domain expert. After considering which problems should be solved by the envisioned software system, a corresponding envisioned task model is designed (2.).
It is a boiled-down version of the previous task model and omits tasks which will either not be tackled by the envisioned software system or do not relate to the software system itself. This model has to reflect the behavior of the envisioned interactions between the user and the software [4] and describes the task world for all platforms, users and environments. The context-insensitive task model, on the other hand, is the task model which has to be fulfilled by a certain user using a certain device in a particular environment (3.). It is a filtered version of the previous task model: during the transformation only the tasks applicable in a particular context are considered. Note that a context-sensitive task model describes a set of context-insensitive task models; thus, a context-insensitive task model can be seen as an instance of the corresponding context-sensitive task model. The next section will clarify the term CoU and propose a model which is used to annotate task models. Later on we will come back to the different types of task models to reflect context dependencies.
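The filtering step that derives a context-insensitive task model from a context-annotated one can be sketched as follows. The tuple representation of tasks, the condition functions and the example task names are assumptions made for illustration.

```python
def filter_task(task, context):
    """Keep only the subtree of tasks whose context condition holds.

    task: (name, condition, children), where condition maps a CoU dict to bool.
    Returns the filtered (name, children) tree, or None if the task itself
    is not applicable in this context.
    """
    name, condition, children = task
    if not condition(context):
        return None
    kept = [c for c in (filter_task(ch, context) for ch in children)
            if c is not None]
    return (name, kept)

always = lambda ctx: True
# A context-sensitive task model: "print map" only applies on a desktop.
model = ("plan trip", always, [
    ("print map", lambda ctx: ctx["device"] == "desktop", []),
    ("check route", always, []),
])

print(filter_task(model, {"device": "pda"}))
# → ('plan trip', [('check route', [])])
```

Each distinct CoU thus yields one context-insensitive instance of the same context-sensitive model, matching the set-of-instances view described above.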
3 Context of Use

With the advent of mobile and ubiquitous computing, the development of interactive systems has become increasingly complex. The interactive behavior of the system needs to be adapted to a wide range of people with different skills and
using different end-devices. In addition, the usage of mobile devices is not bound to a predefined location of use, and hence diverse environmental factors need to be taken into account as well. In this paper we summarize the entirety of influential factors under the term Context of Use (CoU): the context of use is any information that can be used to characterize the situation of the environment, the user and the device of a software system, and that is regarded as relevant for the interaction between the system and the user.
Fig. 2. Decomposition of the CoU into sub-models
From our point of view, a holistic approach has to cover the device, the user of the system, and the environment of the system and the user. There is consensus that these models interrelate [1]. According to our previous definition we propose the categorization illustrated in Fig. 2.

3.1 Meta-model

In this section we propose a generic meta-model for the CoU model. It consists of variables and expressions. More precisely, the CoU model is defined by a set of variables, where each variable has a unique name and a domain of discrete values. Furthermore, an order is defined on the values of the domain. Based on these variables, expressions are defined by comparisons of variables and Boolean logic.

Brief Example. Based on the proposed meta-model of the CoU, we exemplify its application for the sub-model "Device". As depicted below, it consists of a set of representative variables with pre-defined domains.

Variables:
CPU (low, medium, high)
Memory (1024Mb)
Bandwidth (low, medium, high)
Input capabilities (pen, cell phone keyboard, keyboard)
Monitor resolution (
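The meta-model above (variables with ordered discrete domains, and Boolean expressions built from comparisons over them) can be sketched in a few lines. The class name, the comparison helper and the chosen thresholds are assumptions made for illustration, not part of the proposed meta-model's notation.

```python
class Variable:
    """A CoU variable: a unique name plus an ordered domain of discrete values."""
    def __init__(self, name, domain):
        self.name = name
        self.domain = list(domain)   # ordered from low to high

    def at_least(self, threshold):
        # A comparison using the order defined on the domain; returns a
        # predicate over a CoU assignment (a dict of variable name -> value).
        idx = self.domain.index(threshold)
        return lambda cou: self.domain.index(cou[self.name]) >= idx

cpu = Variable("CPU", ["low", "medium", "high"])
bandwidth = Variable("Bandwidth", ["low", "medium", "high"])

# An expression combining comparisons with Boolean logic:
# CPU >= medium AND Bandwidth >= medium
expr = lambda cou: (cpu.at_least("medium")(cou)
                    and bandwidth.at_least("medium")(cou))

print(expr({"CPU": "high", "Bandwidth": "medium"}))   # → True
print(expr({"CPU": "low", "Bandwidth": "high"}))      # → False
```

Such expressions could then serve as the context annotations attached to tasks, deciding which tasks remain in a given CoU.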