Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Alfred Kobsa University of California, Irvine, CA, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen University of Dortmund, Germany Madhu Sudan Massachusetts Institute of Technology, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Gerhard Weikum Max-Planck Institute of Computer Science, Saarbruecken, Germany
5610
Julie A. Jacko (Ed.)
Human-Computer Interaction New Trends 13th International Conference, HCI International 2009 San Diego, CA, USA, July 19-24, 2009 Proceedings, Part I
13
Volume Editor Julie A. Jacko University of Minnesota Institute of Health Informatics MMC 912, 420 Delaware Street S.E., Minneapolis, MN 55455, USA E-mail:
[email protected] Library of Congress Control Number: 2009929048 CR Subject Classification (1998): H.5, I.3, I.7.5, I.5, I.2.10 LNCS Sublibrary: SL 3 – Information Systems and Application, incl. Internet/Web and HCI ISSN ISBN-10 ISBN-13
0302-9743 3-642-02573-0 Springer Berlin Heidelberg New York 978-3-642-02573-0 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. springer.com © Springer-Verlag Berlin Heidelberg 2009 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper SPIN: 12705719 06/3180 543210
Foreword
The 13th International Conference on Human–Computer Interaction, HCI International 2009, was held in San Diego, California, USA, July 19–24, 2009, jointly with the Symposium on Human Interface (Japan) 2009, the 8th International Conference on Engineering Psychology and Cognitive Ergonomics, the 5th International Conference on Universal Access in Human–Computer Interaction, the Third International Conference on Virtual and Mixed Reality, the Third International Conference on Internationalization, Design and Global Development, the Third International Conference on Online Communities and Social Computing, the 5th International Conference on Augmented Cognition, the Second International Conference on Digital Human Modeling, and the First International Conference on Human Centered Design. A total of 4,348 individuals from academia, research institutes, industry and governmental agencies from 73 countries submitted contributions, and 1,397 papers that were judged to be of high scientific quality were included in the program. These papers address the latest research and development efforts and highlight the human aspects of design and use of computing systems. The papers accepted for presentation thoroughly cover the entire field of human–computer interaction, addressing major advances in the knowledge and effective use of computers in a variety of application areas. This volume, edited by Julie A. Jacko, contains papers in the thematic area of Human–Computer Interaction, addressing the following major topics: • • • • •
Novel Techniques for Measuring and Monitoring Evaluation Methods, Techniques and Tools User Studies User Interface Design Development Approaches, Methods and Tools
The remaining volumes of the HCI International 2009 proceedings are: • • • • •
Volume 2, LNCS 5611, Human–Computer Interaction––Novel Interaction Methods and Techniques (Part II), edited by Julie A. Jacko Volume 3, LNCS 5612, Human–Computer Interaction––Ambient, Ubiquitous and Intelligent Interaction (Part III), edited by Julie A. Jacko Volume 4, LNCS 5613, Human–Computer Interaction - Interacting in Various Application Domains (Part IV), edited by Julie A. Jacko Volume 5, LNCS 5614, Universal Access in Human–Computer Interaction––Addressing Diversity (Part I), edited by Constantine Stephanidis Volume 6, LNCS 5615, Universal Access in Human–Computer Interaction––Intelligent and Ubiquitous Interaction Environments (Part II), edited by Constantine Stephanidis
VI
Foreword
• • • • • • • • • • •
Volume 7, LNCS 5616, Universal Access in Human–Computer Interaction––Applications and Services (Part III), edited by Constantine Stephanidis Volume 8, LNCS 5617, Human Interface and the Management of Information––Designing Information Environments (Part I), edited by Michael J. Smith and Gavriel Salvendy Volume 9, LNCS 5618, Human Interface and the Management of Information––Information and Interaction (Part II), edited by Gavriel Salvendy and Michael J. Smith Volume 10, LNCS 5619, Human Centered Design, edited by Masaaki Kurosu Volume 11, LNCS 5620, Digital Human Modeling, edited by Vincent G. Duffy Volume 12, LNCS 5621, Online Communities and Social Computing, edited by A. Ant Ozok and Panayiotis Zaphiris Volume 13, LNCS 5622, Virtual and Mixed Reality, edited by Randall Shumaker Volume 14, LNCS 5623, Internationalization, Design and Global Development, edited by Nuray Aykin Volume 15, LNCS 5624, Ergonomics and Health Aspects of Work with Computers, edited by Ben-Tzion Karsh Volume 16, LNAI 5638, The Foundations of Augmented Cognition: Neuroergonomics and Operational Neuroscience, edited by Dylan Schmorrow, Ivy Estabrooke and Marc Grootjen Volume 17, LNAI 5639, Engineering Psychology and Cognitive Ergonomics, edited by Don Harris
I would like to thank the Program Chairs and the members of the Program Boards of all thematic areas, listed below, for their contribution to the highest scientific quality and the overall success of HCI International 2009.
Ergonomics and Health Aspects of Work with Computers Program Chair: Ben-Tzion Karsh Arne Aarås, Norway Pascale Carayon, USA Barbara G.F. Cohen, USA Wolfgang Friesdorf, Germany John Gosbee, USA Martin Helander, Singapore Ed Israelski, USA Waldemar Karwowski, USA Peter Kern, Germany Danuta Koradecka, Poland Kari Lindström, Finland
Holger Luczak, Germany Aura C. Matias, Philippines Kyung (Ken) Park, Korea Michelle M. Robertson, USA Michelle L. Rogers, USA Steven L. Sauter, USA Dominique L. Scapin, France Naomi Swanson, USA Peter Vink, The Netherlands John Wilson, UK Teresa Zayas-Cabán, USA
Foreword
Human Interface and the Management of Information Program Chair: Michael J. Smith Gunilla Bradley, Sweden Hans-Jörg Bullinger, Germany Alan Chan, Hong Kong Klaus-Peter Fähnrich, Germany Michitaka Hirose, Japan Jhilmil Jain, USA Yasufumi Kume, Japan Mark Lehto, USA Fiona Fui-Hoon Nah, USA Shogo Nishida, Japan Robert Proctor, USA Youngho Rhee, Korea
Anxo Cereijo Roibás, UK Katsunori Shimohara, Japan Dieter Spath, Germany Tsutomu Tabe, Japan Alvaro D. Taveira, USA Kim-Phuong L. Vu, USA Tomio Watanabe, Japan Sakae Yamamoto, Japan Hidekazu Yoshikawa, Japan Li Zheng, P.R. China Bernhard Zimolong, Germany
Human–Computer Interaction Program Chair: Julie A. Jacko Sebastiano Bagnara, Italy Sherry Y. Chen, UK Marvin J. Dainoff, USA Jianming Dong, USA John Eklund, Australia Xiaowen Fang, USA Ayse Gurses, USA Vicki L. Hanson, UK Sheue-Ling Hwang, Taiwan Wonil Hwang, Korea Yong Gu Ji, Korea Steven Landry, USA
Gitte Lindgaard, Canada Chen Ling, USA Yan Liu, USA Chang S. Nam, USA Celestine A. Ntuen, USA Philippe Palanque, France P.L. Patrick Rau, P.R. China Ling Rothrock, USA Guangfeng Song, USA Steffen Staab, Germany Wan Chul Yoon, Korea Wenli Zhu, P.R. China
Engineering Psychology and Cognitive Ergonomics Program Chair: Don Harris Guy A. Boy, USA John Huddlestone, UK Kenji Itoh, Japan Hung-Sying Jing, Taiwan Ron Laughery, USA Wen-Chin Li, Taiwan James T. Luxhøj, USA
Nicolas Marmaras, Greece Sundaram Narayanan, USA Mark A. Neerincx, The Netherlands Jan M. Noyes, UK Kjell Ohlsson, Sweden Axel Schulte, Germany Sarah C. Sharples, UK
VII
VIII
Foreword
Neville A. Stanton, UK Xianghong Sun, P.R. China Andrew Thatcher, South Africa
Matthew J.W. Thomas, Australia Mark Young, UK
Universal Access in Human–Computer Interaction Program Chair: Constantine Stephanidis Julio Abascal, Spain Ray Adams, UK Elisabeth André, Germany Margherita Antona, Greece Chieko Asakawa, Japan Christian Bühler, Germany Noelle Carbonell, France Jerzy Charytonowicz, Poland Pier Luigi Emiliani, Italy Michael Fairhurst, UK Dimitris Grammenos, Greece Andreas Holzinger, Austria Arthur I. Karshmer, USA Simeon Keates, Denmark Georgios Kouroupetroglou, Greece Sri Kurniawan, USA
Patrick M. Langdon, UK Seongil Lee, Korea Zhengjie Liu, P.R. China Klaus Miesenberger, Austria Helen Petrie, UK Michael Pieper, Germany Anthony Savidis, Greece Andrew Sears, USA Christian Stary, Austria Hirotada Ueda, Japan Jean Vanderdonckt, Belgium Gregg C. Vanderheiden, USA Gerhard Weber, Germany Harald Weber, Germany Toshiki Yamaoka, Japan Panayiotis Zaphiris, UK
Virtual and Mixed Reality Program Chair: Randall Shumaker Pat Banerjee, USA Mark Billinghurst, New Zealand Charles E. Hughes, USA David Kaber, USA Hirokazu Kato, Japan Robert S. Kennedy, USA Young J. Kim, Korea Ben Lawson, USA
Gordon M. Mair, UK Miguel A. Otaduy, Switzerland David Pratt, UK Albert “Skip” Rizzo, USA Lawrence Rosenblum, USA Dieter Schmalstieg, Austria Dylan Schmorrow, USA Mark Wiederhold, USA
Internationalization, Design and Global Development Program Chair: Nuray Aykin Michael L. Best, USA Ram Bishu, USA Alan Chan, Hong Kong Andy M. Dearden, UK
Susan M. Dray, USA Vanessa Evers, The Netherlands Paul Fu, USA Emilie Gould, USA
Foreword
Sung H. Han, Korea Veikko Ikonen, Finland Esin Kiris, USA Masaaki Kurosu, Japan Apala Lahiri Chavan, USA James R. Lewis, USA Ann Light, UK James J.W. Lin, USA Rungtai Lin, Taiwan Zhengjie Liu, P.R. China Aaron Marcus, USA Allen E. Milewski, USA
Elizabeth D. Mynatt, USA Oguzhan Ozcan, Turkey Girish Prabhu, India Kerstin Röse, Germany Eunice Ratna Sari, Indonesia Supriya Singh, Australia Christian Sturm, Spain Adi Tedjasaputra, Singapore Kentaro Toyama, India Alvin W. Yeo, Malaysia Chen Zhao, P.R. China Wei Zhou, P.R. China
Online Communities and Social Computing Program Chairs: A. Ant Ozok, Panayiotis Zaphiris Chadia N. Abras, USA Chee Siang Ang, UK Amy Bruckman, USA Peter Day, UK Fiorella De Cindio, Italy Michael Gurstein, Canada Tom Horan, USA Anita Komlodi, USA Piet A.M. Kommers, The Netherlands Jonathan Lazar, USA Stefanie Lindstaedt, Austria
Gabriele Meiselwitz, USA Hideyuki Nakanishi, Japan Anthony F. Norcio, USA Jennifer Preece, USA Elaine M. Raybourn, USA Douglas Schuler, USA Gilson Schwartz, Brazil Sergei Stafeev, Russia Charalambos Vrasidas, Cyprus Cheng-Yen Wang, Taiwan
Augmented Cognition Program Chair: Dylan D. Schmorrow Andy Bellenkes, USA Andrew Belyavin, UK Joseph Cohn, USA Martha E. Crosby, USA Tjerk de Greef, The Netherlands Blair Dickson, UK Traci Downs, USA Julie Drexler, USA Ivy Estabrooke, USA Cali Fidopiastis, USA Chris Forsythe, USA Wai Tat Fu, USA Henry Girolamo, USA
Marc Grootjen, The Netherlands Taro Kanno, Japan Wilhelm E. Kincses, Germany David Kobus, USA Santosh Mathan, USA Rob Matthews, Australia Dennis McBride, USA Robert McCann, USA Jeff Morrison, USA Eric Muth, USA Mark A. Neerincx, The Netherlands Denise Nicholson, USA Glenn Osga, USA
IX
X
Foreword
Dennis Proffitt, USA Leah Reeves, USA Mike Russo, USA Kay Stanney, USA Roy Stripling, USA Mike Swetnam, USA Rob Taylor, UK
Maria L.Thomas, USA Peter-Paul van Maanen, The Netherlands Karl van Orden, USA Roman Vilimek, Germany Glenn Wilson, USA Thorsten Zander, Germany
Digital Human Modeling Program Chair: Vincent G. Duffy Karim Abdel-Malek, USA Thomas J. Armstrong, USA Norm Badler, USA Kathryn Cormican, Ireland Afzal Godil, USA Ravindra Goonetilleke, Hong Kong Anand Gramopadhye, USA Sung H. Han, Korea Lars Hanson, Sweden Pheng Ann Heng, Hong Kong Tianzi Jiang, P.R. China
Kang Li, USA Zhizhong Li, P.R. China Timo J. Määttä, Finland Woojin Park, USA Matthew Parkinson, USA Jim Potvin, Canada Rajesh Subramanian, USA Xuguang Wang, France John F. Wiechel, USA Jingzhou (James) Yang, USA Xiu-gan Yuan, P.R. China
Human Centered Design Program Chair: Masaaki Kurosu Gerhard Fischer, USA Tom Gross, Germany Naotake Hirasawa, Japan Yasuhiro Horibe, Japan Minna Isomursu, Finland Mitsuhiko Karashima, Japan Tadashi Kobayashi, Japan
Kun-Pyo Lee, Korea Loïc Martínez-Normand, Spain Dominique L. Scapin, France Haruhiko Urokohara, Japan Gerrit C. van der Veer, The Netherlands Kazuhiko Yamazaki, Japan
In addition to the members of the Program Boards above, I also wish to thank the following volunteer external reviewers: Gavin Lew from the USA, Daniel Su from the UK, and Ilia Adami, Ioannis Basdekis, Yannis Georgalis, Panagiotis Karampelas, Iosif Klironomos, Alexandros Mourouzis, and Stavroula Ntoa from Greece. This conference could not have been possible without the continuous support and advice of the Conference Scientific Advisor, Prof. Gavriel Salvendy, as well as the dedicated work and outstanding efforts of the Communications Chair and Editor of HCI International News, Abbas Moallem.
Foreword
XI
I would also like to thank for their contribution toward the organization of the HCI International 2009 conference the members of the Human–Computer Interaction Laboratory of ICS-FORTH, and in particular Margherita Antona, George Paparoulis, Maria Pitsoulaki, Stavroula Ntoa, and Maria Bouhli. Constantine Stephanidis
HCI International 2011
The 14th International Conference on Human–Computer Interaction, HCI International 2011, will be held jointly with the affiliated conferences in the summer of 2011. It will cover a broad spectrum of themes related to human–computer interaction, including theoretical issues, methods, tools, processes and case studies in HCI design, as well as novel interaction techniques, interfaces and applications. The proceedings will be published by Springer. More information about the topics, as well as the venue and dates of the conference, will be announced through the HCI International Conference series website: http://www.hci-international.org/
General Chair Professor Constantine Stephanidis University of Crete and ICS-FORTH Heraklion, Crete, Greece Email:
[email protected] Table of Contents
Part I: Novel Techniques for Measuring and Monitoring Automatic Method for Measuring Eye Blinks Using Split-Interlaced Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kiyohiko Abe, Shoichi Ohi, and Minoru Ohyama
3
A Usability Study of WebMaps with Eye Tracking Tool: The Effects of Iconic Representation of Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ¨ Ozge Ala¸cam and Mustafa Dalcı
12
Feature Extraction and Selection for Inferring User Engagement in an HCI Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stylianos Asteriadis, Kostas Karpouzis, and Stefanos Kollias
22
Informative or Misleading? Heatmaps Deconstructed . . . . . . . . . . . . . . . . . Agnieszka (Aga) Bojko
30
Toward EEG Sensing of Imagined Speech . . . . . . . . . . . . . . . . . . . . . . . . . . . Michael D’Zmura, Siyi Deng, Tom Lappas, Samuel Thorpe, and Ramesh Srinivasan
40
Monitoring and Processing of the Pupil Diameter Signal for Affective Assessment of a Computer User . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ying Gao, Armando Barreto, and Malek Adjouadi
49
Usability Evaluation by Monitoring Physiological and Other Data Simultaneously with a Time-Resolution of Only a Few Seconds . . . . . . . . K´ aroly Hercegfi, M´ arton P´ aszti, Sarolta T´ ov¨ olgyi, and Lajos Izs´ o
59
Study of Human Anxiety on the Internet . . . . . . . . . . . . . . . . . . . . . . . . . . . . Santosh Kumar Kalwar and Kari Heikkinen
69
The Research on Adaptive Process for Emotion Recognition by Using Time-Dependent Parameters of Autonomic Nervous Response . . . . . . . . . Jonghwa Kim, Mincheol Whang, and Jincheol Woo
77
Students’ Visual Perceptions of Virtual Lectures as Measured by Eye Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yu-Jin Kim, Jin Ah Bae, and Byeong Ho Jeon
85
Toward Constructing an Electroencephalogram Measurement Method for Usability Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Masaki Kimura, Hidetake Uwano, Masao Ohira, and Ken-ichi Matsumoto
95
XVI
Table of Contents
Automated Analysis of Eye-Tracking Data for the Evaluation of Driver Information Systems According to ISO/TS 15007-2:2001 . . . . . . . . . . . . . . Christian Lange, Martin Wohlfarter, and Heiner Bubb Brain Response to Good and Bad Design . . . . . . . . . . . . . . . . . . . . . . . . . . . Haeinn Lee, Jungtae Lee, and Ssanghee Seo An Analysis of Eye Movements during Browsing Multiple Search Results Pages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yuko Matsuda, Hidetake Uwano, Masao Ohira, and Ken-ichi Matsumoto Development of Estimation System for Concentrate Situation Using Acceleration Sensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Masashi Okubo and Aya Fujimura Psychophysiology as a Tool for HCI Research: Promises and Pitfalls . . . . Byungho Park Assessing NeuroSky’s Usability to Detect Attention Levels in an Assessment Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Genaro Rebolledo-Mendez, Ian Dunwell, Erika A. Mart´ınez-Mir´ on, Mar´ıa Dolores Vargas-Cerd´ an, Sara de Freitas, Fotis Liarokapis, and Alma R. Garc´ıa-Gaona Effect of Body Movement on Music Expressivity in Jazz Performances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mamiko Sakata, Sayaka Wakamiya, Naoki Odaka, and Kozaburo Hachimura
105 111
121
131 141
149
159
A Method to Monitor Operator Overloading . . . . . . . . . . . . . . . . . . . . . . . . . Dvijesh Shastri, Ioannis Pavlidis, and Avinash Wesley
169
Decoding Attentional Orientation from EEG Spectra . . . . . . . . . . . . . . . . . Ramesh Srinivasan, Samuel Thorpe, Siyi Deng, Tom Lappas, and Michael D’Zmura
176
On the Possibility about Performance Estimation Just before Beginning a Voluntary Motion Using Movement Related Cortical Potential . . . . . . . Satoshi Suzuki, Takemi Matsui, Yusuke Sakaguchi, Kazuhiro Ando, Nobuyuki Nishiuchi, Toshimasa Yamazaki, and Shin’ichi Fukuzumi
184
Part II: Evaluation Methods, Techniques and Tools A Usability Evaluation Method Applying AHP and Treemap Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Toshiyuki Asahi, Teruya Ikegami, and Shin’ichi Fukuzumi
195
Table of Contents
Evaluation of User-Interfaces for Mobile Application Development Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Florence Balagtas-Fernandez and Heinrich Hussmann User-Centered Design and Evaluation – The Big Picture . . . . . . . . . . . . . . Victoria Bellotti, Shin’ichi Fukuzumi, Toshiyuki Asahi, and Shunsuke Suzuki
XVII
204 214
Web-Based System Development for Usability Evaluation of Ubiquitous Computing Device . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jong Kyu Choi, Han Joon Kim, Beom Suk Jin, and Yonggu Ji
224
Evaluating Mobile Usability: The Role of Fidelity in Full-Scale Laboratory Simulations with Mobile ICT for Hospitals . . . . . . . . . . . . . . . . Yngve Dahl, Ole Andreas Alsos, and Dag Svanæs
232
A Multidimensional Approach for the Evaluation of Mobile Application User Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jos´e Eust´ aquio Rangel de Queiroz and Danilo de Sousa Ferreira
242
Development of Quantitative Usability Evaluation Method . . . . . . . . . . . . Shin’ichi Fukuzumi, Teruya Ikegami, and Hidehiko Okada
252
Reference Model for Quality Assurance of Speech Applications . . . . . . . . . Cornelia Hipp and Matthias Peissner
259
Toward Cognitive Modeling for Predicting Usability . . . . . . . . . . . . . . . . . . Bonnie E. John and Shunsuke Suzuki
267
Webjig: An Automated User Data Collection System for Website Usability Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mikio Kiura, Masao Ohira, and Ken-ichi Matsumoto ADiEU: Toward Domain-Based Evaluation of Spoken Dialog Systems . . . Jan Kleindienst, Jan Cuˇr´ın, and Martin Labsk´ y Interpretation of User Evaluation for Emotional Speech Synthesis System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ho-Joon Lee and Jong C. Park Multi-level Validation of the ISOmetrics Questionnaire Based on Qualitative and Quantitative Data Obtained from a Conventional Usability Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jan-Paul Leuteritz, Harald Widlroither, and Michael Kl¨ uh What Do Users Really Do? Experience Sampling in the 21st Century . . . Gavin S. Lew Evaluating Usability-Supporting Architecture Patterns: Reactions from Usability Professionals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Edgardo Luzcando, Davide Bolchini, and Anthony Faiola
277 287
295
304 314
320
XVIII
Table of Contents
Heuristic Evaluations of Bioinformatics Tools: A Development Case . . . . Barbara Mirel and Zach Wright A Prototype to Validate ErgoCoIn: A Web Site Ergonomic Inspection Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Marcelo Morandini, Walter de Abreu Cybis, and Dominique L. Scapin
329
339
Mobile Phone Usability Questionnaire (MPUQ) and Automated Usability Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Young Sam Ryu
349
Estimating Productivity: Composite Operators for Keystroke Level Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jeff Sauro
352
Paper to Electronic Questionnaires: Effects on Structured Questionnaire Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Anna Trujillo
362
Website Designer as an Evaluator: A Formative Evaluation Method for Website Interface Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chao-Yang Yang
372
Part III: User Studies Building on the Usability Study: Two Explorations on How to Better Understand an Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Anshu Agarwal and Madhu Prabaker Measuring User Performance for Different Interfaces Using a Word Processor Prototype . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tanya R. Beelders, Pieter J. Blignaut, Theo McDonald, and Engela H. Dednam
385
395
Evaluating User Effectiveness in Exploratory Search with TouchGraph Google Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kemal Efe and Sabriye Ozerturk
405
What Do Users Want to See? A Content Preparation Study for Consumer Electronics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yinni Guo, Robert W. Proctor, and Gavriel Salvendy
413
“I Love My iPhone... But There Are Certain Things That ‘Niggle’ Me” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Anna Haywood and Gemma Boguslawski
421
Table of Contents
Acceptance of Future Technologies Using Personal Data: A Focus Group with Young Internet Users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Fabian Hermann, Doris Janssen, Daniel Schipke, and Andreas Schuller Analysis of Breakdowns in Menu-Based Interaction Based on Information Scent Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yukio Horiguchi, Hiroaki Nakanishi, Tetsuo Sawaragi, and Yuji Kuroda E-Shopping Behavior and User-Web Interaction for Developing a Useful Green Website . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Fei-Hui Huang, Ying-Lien Lee, and Sheue-Ling Hwang Interaction Comparison among Media Internet Genre . . . . . . . . . . . . . . . . . Sang Hee Kweon, Eun Joung Cho, and Ae Jin Cho Comparing the Usability of the Icons and Functions between IE6.0 and IE7.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chiuhsiang Joe Lin, Min-Chih Hsieh, Hui-Chi Yu, Ping-Jung Tsai, and Wei-Jung Shiang
XIX
431
438
446
455
465
Goods-Finding and Orientation in the Elderly on 3D Virtual Store Interface: The Impact of Classification and Landmarks . . . . . . . . . . . . . . . . Cheng-Li Liu, Shiaw-Tsyr Uang, and Chen-Hao Chang
474
Effects of Gender Difference on Emergency Operation Interface Design in Semiconductor Industry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hunszu Liu
484
Evaluating a Personal Communication Tool: Sidebar . . . . . . . . . . . . . . . . . Malena Mesarina, Jhilmil Jain, Craig Sayers, Tyler Close, and John Recker
490
“You’ve Got IMs!” How People Manage Concurrent Instant Messages . . . Shailendra Rao, Judy Chen, Robin Jeffries, and Richard Boardman
500
Investigating Children Preferences of a User Interface Design . . . . . . . . . . Jamaliah Taslim, Wan Adilah Wan Adnan, and Noor Azyanti Abu Bakar
510
Usability Evaluation of Graphic Design for Ilmu’s Interface . . . . . . . . . . . . Tengku Siti Meriam Tengku Wook and Siti Salwa Salim
514
Are We Trapped by Majority Influences in Electronic Word-of-Mouth? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yu Tong and Yinqing Zhong
520
XX
Table of Contents
Leveraging a User Research Framework to Guide Research Investments: Windows Vista Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Gayna Williams A Usability Evaluation of Public Icon Interface . . . . . . . . . . . . . . . . . . . . . . Sungyoung Yoon, Jonghoon Seo, Joonyoung Yoon, Seungchul Shin, and Tack-Don Han
530 540
Part IV: User Interface Design Little Design Up-Front: A Design Science Approach to Integrating Usability into Agile Requirements Engineering . . . . . . . . . . . . . . . . . . . . . . . Sisira Adikari, Craig McDonald, and John Campbell
549
Aesthetics in Human-Computer Interaction: Views and Reviews . . . . . . . Salah Uddin Ahmed, Abdullah Al Mahmud, and Kristin Bergaust
559
Providing an Efficient Way to Make Desktop Icons Visible . . . . . . . . . . . . Toshiya Akasaka and Yusaku Okada
569
An Integration of Task and Use-Case Meta-models . . . . . . . . . . . . . . . . . . . R´emi Bastide
579
Model-Based Specification and Validation of User Interface Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Birgit Bomsdorf and Daniel Sinnig
587
A Position Paper on ’Living Laboratories’: Rethinking Ecological Designs and Experimentation in Human-Computer Interaction . . . . . . . . . Ed H. Chi
597
Embodied Interaction or Context-Aware Computing? An Integrated Approach to Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Johan Eliasson, Teresa Cerratto Pargman, and Robert Ramberg
606
Supporting Multidisciplinary Teams and Early Design Stages Using Storyboards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mieke Haesen, Jan Meskens, Kris Luyten, and Karin Coninx
616
Agent-Based Architecture for Interactive System Design: Current Approaches, Perspectives and Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . Christophe Kolski, Peter Forbrig, Bertrand David, Patrick Girard, Chi Dung Tran, and Houcine Ezzedine BunBunMovie: Scenario Visualizing System Based on 3-D Character . . . Tomoya Matsuo and Takashi Yoshino Augmented Collaborative Card-Based Creative Activity with Digital Pens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Motoki Miura, Taro Sugihara, and Susumu Kunifuji
624
634
644
Table of Contents
Usability-Engineering-Requirements as a Basis for the Integration with Software Engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Karsten Nebe and Volker Paelke Design Creation Based on KANSEI in Toshiba . . . . . . . . . . . . . . . . . . . . . . Yosoko Nishizawa and Kanya Hiroi High-Fidelity Prototyping of Interactive Systems Can Be Formal Too . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Philippe Palanque, Jean-Fran¸cois Ladry, David Navarre, and Eric Barboni
XXI
652 660
667
RUCID: Rapid Usable Consistent Interaction Design Patterns-Based Mobile Phone UI Design Library, Process and Tool . . . . . . . . . . . . . . . . . . . Avinash Raj and Vihari Komaragiri
677
The Appropriation of Information and Communication Technology: A Cross-Cultural Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jose Rojas and Matthew Chalmers
687
UISK: Supporting Model-Driven and Sketch-Driven Paperless Prototyping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Vin´ıcius Costa Villas Bˆ oas Segura and Simone Diniz Junqueira Barbosa
697
Beyond the User Interface: Towards User-Centred Design of Online Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Marcin Sikorski
706
Designing for Change: Engineering Adaptable and Adaptive User Interaction by Focusing on User Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bruno S. da Silva, Ariane M. Bueno, and Simone D.J. Barbosa
715
Productive Love: A New Proposal for Designing Affective Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ramon Solves Pujol and Hiroyuki Umemuro
725
Insight into Kansei Color Combinations in Interactive User Interface Designing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . K.G.D. Tharangie, Shuichi Matsuzaki, Ashu Marasinghe, and Koichi Yamada Learn as Babies Learn: A Conceptual Model of Designing Optimum Learnability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Douglas Xiaoyong Wang Time-Oriented Interface Design: Picking the Right Time and Method for Information Presentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Keita Watanabe, Kei Sugawara, Shota Matsuda, and Michiaki Yasumura
735
745
752
XXII
Table of Contents
Enabling Interactive Access to Web Tables . . . . . . . . . . . . . . . . . . . . . . . . . . Xin Yang, Wenchang Xu, and Yuanchun Shi
760
Integration of Creativity into Website Design . . . . . . . . . . . . . . . . . . . . . . . . Liang Zeng, Robert W. Proctor, and Gavriel Salvendy
769
Part V: Development Approaches, Methods and Tools YVision: A General Purpose Software Composition Framework . . . . . . . . Ant˜ ao Almada, Gon¸calo Lopes, Andr´e Almeida, Jo˜ ao Fraz˜ ao, and Nuno Cardoso Collaborative Development and New Devices for Human-Computer Interaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hans-J¨ org Bullinger and Gunnar Brink
779
789
Orchestration Modeling of Interactive Systems . . . . . . . . . . . . . . . . . . . . . . . Bertrand David and Ren´e Chalon
796
An Exploration of Perspective Changes within MBD . . . . . . . . . . . . . . . . . Anke Dittmar and Peter Forbrig
806
Rapid Development of Scoped User Interfaces . . . . . . . . . . . . . . . . . . . . . . . Denis Dub´e, Jacob Beard, and Hans Vangheluwe
816
PaMGIS: A Framework for Pattern-Based Modeling and Generation of Interactive Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . J¨ urgen Engel and Christian M¨ artin
826
People-Oriented Programming: From Agent-Oriented Analysis to the Design of Interactive Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Steve Goschnick
836
Visualization of Software and Systems as Support Mechanism for Integrated Software Project Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Peter Liggesmeyer, Jens Heidrich, J¨ urgen M¨ unch, Robert Kalckl¨ osch, Henning Barthel, and Dirk Zeckzer Collage: A Declarative Programming Model for Compositional Development of Web Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bruce Lucas, Rahul Akolkar, and Charlie Wiecha Hypernetwork Model to Represent Similarity Details Applied to Musical Instrument Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tetsuya Maeshiro, Midori Maeshiro, Katsunori Shimohara, and Shin-ichi Nakayama
846
856
866
Table of Contents
Open Collaborative Development: Trends, Tools, and Tactics . . . . . . . . . . Kathrin M. Moeslein, Angelika C. Bullinger, and Jens Soeldner
XXIII
874
Investigating the Run Time Behavior of Distributed Applications by Using Tiny Java Virtual Machines with Wireless Communications . . . . . . Tsuyoshi Miyazaki, Takayuki Suzuki, and Fujio Yamamoto
882
OntoDesk: Ontology-Based Persistent System-Wide Undo on the Desktop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . David Nemeskey, Buntarou Shizuki, and Jiro Tanaka
890
Peer-to-Peer File Sharing Communication Detection System with Traffic Mining and Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Satoshi Togawa, Kazuhide Kanenishi, and Yoneo Yano
900
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
911
Automatic Method for Measuring Eye Blinks Using Split-Interlaced Images Kiyohiko Abe1, Shoichi Ohi2, and Minoru Ohyama3 1
College of Engineering, Kanto Gakuin University, 1-50-1 Mutsuura-higashi, Kanazawa-ku, Yokohama, Kanagawa 236-8501, Japan 2 School of Engineering, Tokyo Denki University, 2-2 Kandanishiki-cho, Chiyoda-ku, Tokyo 101-8457, Japan 3 School of Information Environment, Tokyo Denki University, 2-1200 Muzaigakuendai, Inzai-shi, Chiba 270-1382, Japan
[email protected],
[email protected],
[email protected] Abstract. We propose a new eye blink detection method that uses NTSC video cameras. This method utilizes split-interlaced images of the eye. These split images are odd- and even-field images in the NTSC format and are generated from NTSC frames (interlaced images). The proposed method yields a time resolution that is double that in the NTSC format; that is, the detailed temporal change that occurs during the process of eye blinking can be measured. To verify the accuracy of the proposed method, experiments are performed using a high-speed digital video camera. Furthermore, results obtained using the NTSC camera were compared with those obtained using the high-speed digital video camera. We also report experimental results for comparing measurements made by the NTSC camera and the high-speed digital video camera. Keywords: Eye Blink, Interlaced Image, Natural Light, Image Analysis, HighSpeed Camera.
1 Introduction The blinking of the eye is related to factors such as human cognition, fatigue, and depressed consciousness; many studies have investigated eye blinking in relation to these factors. Most conventional methods for the measurement of the eye blink analyze eye images (images of the eye and its surrounding skin) captured by a video camera [1], [2], [3]. The NTSC video cameras that are commonly used are capable of detecting eye blinks; however, it is difficult for these cameras to measure the detailed temporal change occurring during the process of eye blinking, because eye blinks occur relatively fast (within a few hundred milliseconds). Therefore, a high-speed camera is required for an accurate measurement of the eye blink [3]. NTSC video cameras capture moving images at 60 fields/s and these field images are mixed with images that have a frame rate of 30 frames/s (fps) to field interlaced images. In this paper, we propose a new method for measuring the eye blink that uses J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 3–11, 2009. © Springer-Verlag Berlin Heidelberg 2009
4
K. Abe, S. Ohi, and M. Ohyama
NTSC video cameras. This method utilizes split-interlaced images of the eye captured by an NTSC video camera. These split images are odd- and even-field images in the NTSC format and are generated from NTSC frames (interlaced images). The proposed method yields a time resolution that is twice that in the NTSC format. Therefore, the detailed temporal change that occurs during the process of eye blinking can be measured. To verify the accuracy of the proposed method, we performed experiments using a high-speed digital video camera. Thereafter, we compared results obtained using the NTSC cameras with those obtained using the high-speed digital video camera. This paper also presents experiments that evaluate the proposed automatic method for measuring eye blinks.
2 Open-Eye Area Extraction Method by Image Analysis In general, eye blinks are estimated by measuring the open-eye area [2] or on basis of characteristics of specific moving points between the upper and lower eyelids [3]. Many of these methods utilize image analysis. It is possible to measure the wave pattern of eye blinks if the entire process of an eye blink is captured [3]. Furthermore, the type of eye blink and/or its velocity can be estimated on the basis of this wave pattern. However, it is difficult to measure the wave patterns of eye blinks by using video cameras that are commonly used for measuring eye blinks because the resulting eye images include high noise content owing to the change in light conditions. We have developed a new method for measuring the wave pattern of an eye blink. This method can be used with common indoor lighting sources such as fluorescent lights, and it can measure the wave pattern automatically. Hence, our proposed measurement method can be used under a variety of experimental conditions. In this method, the wave pattern is obtained by counting the number of pixels in the openeye area of the image as captured by a video camera. This image is enlarged for capturing the detailed eye image. We have proposed an algorithm for extracting the open-eye area in a previous study [4]. It utilizes color information of eye images. We have adapted the algorithm to our proposed method for elucidating the wave pattern of eye blink measurement. This algorithm has been developed for our eye-gaze input system, in which it compensates and traces head-movement [5]. Furthermore, the algorithm has been used under common indoor sources of light for a prolonged period. Hereafter, we describe in detail our image-processing algorithm for extracting the open-eye area. 2.1 Binarization Using Color Information on Image Many methods have been developed for the purpose of skin-color extraction; these methods are primarily focused on facial image processing, including those that utilize color information on a facial image. They mostly determine threshold skin-color values statistically or empirically [6]. We have developed an automatic algorithm for estimating thresholds of skin-color. Our algorithm can extract the open-eye area from the eye image on the basis of the skin-color.
Automatic Method for Measuring Eye Blinks Using Split-Interlaced Images
5
Using our algorithm, skin-color threshold is determined by the histogram of the color-difference signal ratio of each pixel—Cr/Cb—that is calculated from the YCbCr image transformed from the RGB image. The histogram of the Cr/Cb value has 2 peaks indicating skin area and open-eye area. The Cr/Cb value indicated by the minimum value between the 2 peaks is designated as the threshold for open-eye area extraction. 2.2
Binarization by Pattern Matching Method
The method described in Subsection 2.1 can extract the open-eye area almost completely. However, the results of this extraction sometimes leave deficits around the corner of eye, because the Cr/Cb value around the corner of eye is similar to the value on skin in certain subjects. To resolve this problem, we have developed a method for open-eye extraction without deficits by combining 2 extraction results. One of them is a binarized image using color information, as described in Section 2.1. The other extraction result is a binarized image using light intensity information, which includes in the extraction result the area around the corner of the eye. Binarization using light intensity information utilizes the threshold estimated by a pattern matching method, which determines the matching point by using the color information of the binarized image as reference data. Hence, the threshold level is estimated automatically. The original image and the extracted open-eye area image are shown in Fig. 1(a) and Fig. 1(b).
(a)
(b)
Fig. 1. Original eye image (a) and extracted open-eye area (b)
3 Measurement Method of Wave Patterns of Eye Blinks Using Split-Interlaced Images Commonly used NTSC video cameras output interlaced images. One interlaced image has 2 field images, which are designated as odd or even fields. If an NTSC camera captures a fast movement such as an eye blink, there is a great divergence between the captured odd- and even-field images. Therefore, the area around eyelids on the captured image has comb-like noise. This phenomenon occurs because of mixing of 2 field images of the fast movement of eyelids. An example of interlaced images during eye blinking is shown in Fig. 2. To describe this phenomenon most clearly, Fig. 2 has been captured at low resolution (145 × 80 pixels).
6
K. Abe, S. Ohi, and M. Ohyama
If one interlaced image is split by scanning even- and odd-numbered lines separately, 2 field images are generated. Thus, the time resolution of the motion images doubles, but the amount of information in the vertical direction decreases by half. These field images are captured at 60 fields/sec, and the NTSC interlaced moving images are captured at 30 fps; therefore, this method yields a time resolution that is double that available in the NTSC format. The duration of a conscious blink is a few hundred milliseconds; therefore, it is difficult to measure accurately the wave pattern of an eye blink by using NTSC cameras. However, the detailed wave pattern of an eye blink can be measured by using our proposed method. The split-interlaced images are shown in Fig. 3. The 2 eye images shown in Fig. 3 are enlarged in a vertical direction and were generated from the interlaced image shown in Fig. 2. Our proposed method measures the wave patterns of eye blinks from these images.
Fig. 2. Blinking eye image (interlaced)
Fig. 3. Split-interlaced image generated from Fig. 2
4 Evaluation Experiment for Proposed Method Either 4 or 5 subjects participated in experiments to evaluate our proposed method, as described in Subsections 4.1 and 4.2, respectively. The experimental setup includes an NTSC DV camera (for home use), a high-speed digital video camera, and a personal computer (PC). The PC analyzes sequenced eye images captured by the video cameras. The DV camera captures interlaced images at 30 fps, and the high-speed digital video camera captures non-interlaced images at 300 fps. In the experiments performed using these video cameras, the wave pattern of eye blinks is measured from sequenced eye images. The experimental setup is shown in Fig. 4.
Automatic Method for Measuring Eye Blinks Using Split-Interlaced Images PC
7
Display
User NTSC or High-speed digital video camera
Fig. 4. Hardware configuration of experimental system
4.1 Experiment for Eye Blink Measurement Using NTSC Camera In this experiment, sequenced eye images were captured using the DV camera at 30 fps in NTSC format. In addition, split-interlaced images are generated from these interlaced NTSC images. These split-interlaced images have a time resolution of 60 fields/s. The wave pattern of eye blinks is measured by the interlaced NTSC images and split-interlaced images. The binarization threshold for open-eye area extraction is determined automatically from the first field image of the experimental moving images. This threshold is estimated by the method described in Section 2. A typical result from this experiment is shown in Fig. 5.
Pixels of open-eye area
1.1 1 0.9 0.8
60fps
0.7
30fps
0.6 0.5 0.4 1
6
11 16 21 Sampling point (1/60 sec.)
26
Fig. 5. Wave patterns of eye blinks measured by DV (30 fps and 60 fps)
In Fig. 5, the longitudinal axis and the abscissa axis indicate pixels of open-eye area and sampling point (interval: 1/60 sec), respectively. To compare the 2 wave patterns of eye blinks, these plots are normalized using the pixels of open-eye area at the first field image. The bottoms of the plots indicate the eye-closed condition. Our proposed algorithm classifies the area of eyelid outline and cilia into the open-eye area; therefore, the pixels at the bottom of the plots are not reduced to zero. From Fig. 5, it is evident that sequenced images at 60 fields/s can be used to estimate the detailed wave pattern of an eye blink. During the eye blink, there is a great difference in the 2 plots of pixels of the open-eye area; however, this difference is not dependent on individual subjects.
8
K. Abe, S. Ohi, and M. Ohyama
Results of the wave pattern of eye blink measurements for 5 subjects are shown in Fig. 6, where the longitudinal axis and the abscissa axis show pixels of open-eye area and sampling point, respectively. These plots also are normalized in a manner similar to those in Fig. 5. From Fig. 6, it is evident that there are great differences in the results for each subject. 1.1
Pixels of open-eye area
1 0.9 0.8 0.7 0.6 0.5 0.4
Subject A Subject D
0.3
Subject B Subject E
Subject C
0.2 1
11 21 31 41 Sampling point (1/60 sec.)
51
Fig. 6. Wave patterns of eye blinks of 5 subjects measured by DV (60 fps)
4.2 Experiment for Eye Blink Measurement Using High-Speed Video Camera To verify the accuracy of the proposed method that utilizes split-interlaced images, experiments were conducted with 4 subjects; this experiment and the one described in Subsection 4.1 were conducted separately. Subjects A and E (listed in Fig. 6) were enrolled in this experiment continuously, in which sequenced images at 3 different frame rates (30, 60, and 150 fps) were generated from moving images captured by the high-speed digital video camera. These sequenced images were then analyzed to measure the wave pattern of eye blinks. The results of eye blink measurements performed using the sequenced images at 3 different frame rates and those taken at 300 fps are compared. Typical examples of measurement results are shown in Fig. 7, Fig. 8, and Fig. 9, which display results at 30, 60, and 150 fps, respectively. From Fig. 7 and Fig. 8, it is evident that the degree of accuracy of measurement at 60 fps is higher than that at 30 fps. The minimum of the wave pattern (bottom of the curve) is quite characteristic of when an eye blink occurs. Results at 60 fps show that the bottom of the plot is measured with a high degree of accuracy. Therefore, sequenced images at this frame rate are suitable for measurement of eyelid movement velocity. Moreover, our proposed method using split-interlaced images (described in Section 3) utilizes 2 field images generated from one interlaced image; that is, the
Automatic Method for Measuring Eye Blinks Using Split-Interlaced Images
9
spatial information of these field images is decreased by half. We have confirmed that this decrease in spatial information does not affect measurement accuracy via an experiment using sequenced images at 60 fps. The sequenced images at 60 fps were generated from moving images captured by a high-speed digital video camera. In this experiment, we generated half-sized eye images by extracting scanned odd-numbered lines from sequenced images at 60 fps. We estimated wave patterns of eye blinks using these half-sized images. Our results show that the measured open-eye area decreases by half, which is in agreement with the results shown in Fig.8.
Pixels of open-eye area
46000 300 fps
44000
30 fps
42000 40000 38000 36000 34000 32000 1
11
21 31 41 51 61 71 Sampling point (1/300 sec.)
81
Fig. 7. Wave pattern of eye blinks measured by high-speed video camera (30 fps)
Pixels of open-eye area
46000
300 fps
44000
60 fps
42000 40000 38000 36000 34000 32000 1
11
21 31 41 51 61 71 Sampling point (1/300 sec.)
81
Fig. 8. Wave pattern of eye blinks measured by high-speed video camera (60 fps)
10
K. Abe, S. Ohi, and M. Ohyama
Pixels of open-eye area
46000
300 fps
44000
150 fps
42000 40000 38000 36000 34000 32000 1
11
21 31 41 51 61 71 Sampling point (1/300 sec.)
81
Fig. 9. Wave pattern of eye blinks measured by high-speed video camera (150 fps)
4.3 Discussion On the basis of Fig.5, it is evident that by using split-interlaced images, the time resolution of measurement is doubled than that of the results obtained in previous studies. These split images are odd- and even-numbered field images in the NTSC format that are generated from NTSC frames. This method can also be utilized for any subject under common indoor lighting sources, such as fluorescent lights. We have shown the wave patterns of eye blinks for 5 subjects in Fig. 6. From results shown in Fig. 7, Fig. 8, and Fig. 9, it is evident that the degree of accuracy of measurement increases with increasing frame rate. A closer estimate of eye blinking velocity can be achieved if the wave pattern of an eye blink were to be measured with higher accuracy. In other words, the type of eye blink can be classified with a high degree of accuracy. In addition, our proposed method can measure the wave patterns of eye blinks efficiently even by using half-sized eye images. As shown by our experimental results presented earlier, we have verified the reliability of our proposed method described in Section 3. Thus, detailed wave patterns of eye blinks can be measured by using our proposed method.
5 Conclusions We present a new automatic method for measuring eye blinks. Our method utilizes split-interlaced images of the eye captured by an NTSC video camera. These split images are odd- and even-numbered field images in the NTSC format and are generated from NTSC moving images. By using this method, the time resolution for measurement increases to 60 fps, which is double that of conventional methods. Besides the function of automatic measurement of eye blinks, our method can be used under common indoor lighting sources, such as fluorescent lights. In evaluation experiments, we measured eye blinks of all subjects without problems.
Automatic Method for Measuring Eye Blinks Using Split-Interlaced Images
11
To verify the accuracy of our proposed method, we performed experiments using a high-speed digital video camera. On comparison of the results obtained using NTSC cameras with those obtained using a high-speed digital video camera, it is evident that the degree of accuracy of measurement increases with increased resolution time. Additionally, a decrease in area of the split-interlaced image has no adverse effect on the results of eye blink measurements. We confirmed that our proposed method is capable of measuring the wave pattern of eye blinks with high accuracy by using an NTSC video camera. In the future, we plan to develop a new method for classifying types of eye blinks using our proposed measurement method reported above. That new method will be capable of profiling eye blinks according to velocity of open-eye area changes. We also plan to apply this new method to more general ergonomic measurements.
References 1. Grauman, K., Betke, M., Gips, J., Bradski, G.R.: Communication via Eye Blinks - Detection and Duration Analysis in Real Time. In: Proc. of IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1010–1017, Lihue, HI (2001) 2. Morris, T., Blenkhorn, P., Zaidi, F.: Blink Detection for Real-Time Eye Tracking. J. Network and Computer Applications 25(2), 129–143 (2002) 3. Ohzeki, K., Ryo, B.: Video Analysis for Detecting Eye Blinking Using a High-Speed Camera. In: Proc. of Fortieth Asilomar Conf. on Signals, Systems, and Computers, Pacific Grove, CA, pp. 1081–1085 (2006) 4. Abe, K., Ohyama, M., Ohi, S.: Eye-Gaze Input System with Multi-Indicators Based on Image Analysis under Natural Light. J. The Institute of Image Information and Television Engineers 58(11), 1656–1664 (2004) (in Japanese) 5. Abe, K., Ohi, S., Ohyama, M.: An Eye-Gaze Input System Using Information on Eye Movement History. In: Proc. on 12th International Conference on Human-Computer Interaction, HCI International 2007, Beijing, vol. 6, pp. 721–729 (2007) 6. Garcia, C., Tziritas, G.: Face Detection Using Quantized Skin Color Regions Merging and Wavelet Packet Analysis. IEEE Trans. on Multimedia 1(3), 264–277 (1999)
A Usability Study of WebMaps with Eye Tracking Tool: The Effects of Iconic Representation of Information Özge Alaçam and Mustafa Dalcı Human Computer Interaction Research and Application Laboratory, Computer Center, Middle East Technical University, 06531 Ankara/Turkey {ozge,mdalci}@metu.edu.tr
Abstract. In this study, we aim to conduct usability tests on different WebMap sites with eye movement analysis. Overall task performance, the effects of iconic representation of information, and the efficiency of pop-up usage were evaluated. The eye tracking technology is used for this study in order to follow the position of the users’ eye-gaze. The results show that there are remarkable differences in task performance between WebMaps. Addition, they also differ in the use of iconic representations according to results of users’ evaluation. It is also found that efficiency of pop-up windows’ usage has an effect on task performance. Keywords: Web mapping, usability, eye tracking, cognitive processes, iconic representations, and the efficiency of pop-ups.
1 Introduction Web mapping sites became widespread in many professional areas since they provide opportunities such as designing and sharing maps on the World Wide Web. Beside to their role on professional area, it also became very considerable part of our daily life since it makes the navigation easier [13]. Addition to large number of web mapping sites’ users which access these sites with their desktop and laptop PCs, improvements in technology make the internet available in nearly everywhere providing a chance to connect with mobile devices (mobile phones, smart phones, PDAs) and multiply the number of web mapping sites’ users. By the increasing number of web mapping sites’ users, researchers started to conduct usability studies of these sites and to investigate the effects of usability [2, 8, 13, 15]. The term usability is defined by ISO 9241 [9] as “the effectiveness, efficiency and satisfaction with which specified users achieve specified goals in particular environments” [9, 13]. Another definition from Nielson, one of the pioneers in the usability field, states usability as a quality evaluation that assesses how easy user interfaces are to use. According to his definition, usability is composed of five quality components [16]; these components are learnability, efficiency (task completion time), memorability, errors, satisfaction. Addition to these parameters obtained from usability study, usage of the eye tracking tools adds a different aspect to the usability field for the reason that it provides objective and quantitative evidence to investigate user’s cognitive processes such as visual and attentional J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 12–21, 2009. © Springer-Verlag Berlin Heidelberg 2009
A Usability Study of WebMaps with Eye Tracking Tool
13
processes [5]. Usage of eye tracking on usability field started at 1950’s [6]. However due to the difficulties in the analysis of huge data obtained from the eye tracking tools, it lost its popularity in 1970’s. With the improvements of the eye tracking technologies, eye tracking tools gain their impacts on the usability field again [10] and nowadays they are accepted as a tool to improve computer interface. In one of the studies about WebMap usability conducted by Nivala [13], severity of the usability problems were investigated. In our study, we aim to make additional analysis to find the reason of these usability problems and make them more clear by analyzing eye movements of the users. The focus of this study is to analyze the effects of the iconic representation of the information and to investigate whether the pop-ups are used efficiently by the user. The eye tracking tool is used for this study in order to follow the position of the users’ eye movements, which helps to measure the attended location on the map. It is known that eye movements provide information about cognitive processes such as perception, thinking, decision making and memory [1, 3, 4, 12, 14]. Evaluation of eye movement provided us the opportunity to focus on the iconic representations, efficiency of pop-up windows and their effects in map comprehension in different WebMaps.
2 Method and Materials 26 subjects (12 female, 14 male) either university students or graduate in a range of ages between 18 and 32 participated to this study. In order to get information about their prior knowledge about WebMap usage and to get the user’s idea about the comprehensibility of the icons and preferences about the WebMaps, a questionnaire was carried out. Each subject evaluated two different WebMaps for different places in US in random order. Six tasks shown in Table 1 were used in the experiment. Users were told that they could give up the task or the experiment whenever they wanted to. Tasks given to the users include; to find given address, to find definite places which are represented with icons (s.t airport, metro station, hospital) and to show the route to specific locations. The experiments are conducted at the Human-Computer Interaction Research and Application Laboratory at Middle East Technical University. Eye movements of users were collected by Tobii 1750 Eye Tracker and analyzed with Tobii Studio. Table 1. Task Description Task No Task Description Instruction Welcome to X City/State. You are planning to look at the city map to specify the locations that you want to visit before starting your trip 1 Point the nearest highway intersection to X International Airport You want to go from X International Airport to X University. Could you describe 2 how to arrive to that location? Find the address of the hospital nearest to X Park 3 Now, you are in X Location. Show the nearest metro/railway station to this place 4 You are searching for the library nearest to X place. Find and point it on the map. 5 Show the intersection point of the given address with X street. 6
14
Ö. Alaçam and M. Dalcı
In the Nivela et al.’s study [13], there is an evaluation of four different web mapping sites. These are Google Maps, MSN Maps & Directions, MapQuest and Multimap. However, since the MSN Maps and Directions, Multimap are based on Microsoft Virtual Earth, we replaced these sites with Live Search Maps that is also based on MS Virtual Earth. Since these are well-known and all have zooming and panning options on their 2D map applications, they are very good candidates for usability testing. Although their common properties mentioned above, they differed in terms of usage of icon representation and pop-up window properties. We conducted the usability testing of Google Maps, Live Search Map, MapQuest and Yahoo Map and investigated the effect of iconic representation of information and pop-up windows by analyzing eye movements. We use the term “The iconic representation of information” as to state the relationship between their semantics and appearance. Addition to evaluation of task completion performance (s.t. task completion score and time), eye tracking data such as fixation length, fixation count, observation length was collected.
3 Results Results are presented under three categories; task performance, analysis of the iconic representations and analysis of pop-up windows. 3.1 Task Performance Users are grouped into two categories according to their WebMap usage experience; experienced users (14 users) for high-level usage frequency and inexperienced users (12 users) for low level usage frequency. One way ANOVA test was conducted to compare mean fixation length on task completion time for experience level. Result shows that user’s experience level has a significant effect on task completion time, F(1,52)=5,30, p>.05. One of the evaluation criteria of comparing the usability of WebMaps is users’ task completion scores. Task completion score was evaluated under three categories; accomplished tasks, unaccomplished tasks and partially accomplished tasks that the users thought that they accomplished a task when they actually did not. Table 2 provides the percentage of users, who accomplished, partially accomplished and did not accomplish each task and also overall score was calculated for each WebMap site. Fig. 1 shows the overall completion score for each map. Results of one way ANOVA shows that task completion score of Google Map is significantly different than MapQuest and Yahoo Map, F(3,48)= 8.629 p in the HTML source code of the webpage that targets the usability evaluation using Webjig. Fig.2 is an example of Webjig installed in an HTML source code. Webjig works even if the developer may insert the script tag at the any place in the HTML source code. However, a mainstream Web browser interprets the HTML source code from the top and displays the contents. Therefore, we recommend inserting the script tag at the bottom of the HTML source code so that Webjig does not disturb the original contents. Sample Page
Sample Content
<script src=”http://example.com/webjig.js” > Fig. 2. An example of HTML source code
3.2 Webjig::Analysis Webjig::Analysis has various features for supporting website usability evaluation. For instance, Webjig::Analysis can replay users’ interactions such as mouse motions, mouse click, and keyboard input related to the displayed contents in a movie format by using the collected data. In Fig.3, we show a screenshot of Webjig::Analysis when it replays the users’ interactions. The system consists of displayed contents in a Web browser and some floating windows that control the system and show various kinds of information. Developers can replay users’ interactions such as play, stop, forward, and rewind anytime by using various control buttons, seek bar, or slider available on the control window. In addition, the system can also generate a heat map, which shows where the users often click, and presume the portions where the users read and do not read on a webpage. By using these features, developers can examine the following questions.
• Are there any confusing graphics in links?
• Do users pay attention to the content that developers want them to read? • Where do users look or not look? • How do users access the website? • What do user wrong operation on the way to the goal? • How do users use a dynamic interface? • Where do users pause when they input into forms? • Where did the user view before exiting the website? • and so forth.
Webjig: An Automated User Data Collection System for Website Usability Evaluation
281
Fig. 3. Screenshot of Webjig::Analysis
4 System Evaluation 4.1 Overview We performed an experiment to evaluate the usefulness of Webjig. 54 graduate students (39 males and 15 females, average age 20) participated in the experiment as subjects. 54 subjects were divided into three groups. Each group worked on different tasks described in the next subsection. 4.2 Experiment Procedure and Task We executed the experiment according to the following procedures. Step 1. We provided 24 uses (subject of Group A) five tasks. Each task required the subjects to find a specified product from a dynamic menu implemented using JavaScript. Webjig recorded users’ interactions during task execution. Step 2. Based on the collected data in Step 1, three subjects who had a role of developers (Group B) analyzed the users’ interactions during task execution using
282
M. Kiura, M. Ohira, and K. Matsumoto
Webjig::Analysis. The developers planned for an improved structure of menu. Step 3. We provided 27 different users (subjects of Group C) tasks similar to Step 1. The difference between Step 1 and Step 2 is that the subject of Group C used the improved menu. Webjig recorded users’ interactions during task execution. Step 4. Finally, comparing the task execution time of Step1 and Step3, we checked the validity of the change in the structure of the menu. Fig.4 is the dummy website for the experiment. Table 1 shows target products and categories where the products exist.
Fig. 4. Screenshot of the dummy website for the experiment Table 2. Target products and category for each task Task Name Task 1 Task 2 Task 3 Task 4 Task 5
Product Dry cell SD memory card A massage chair Electronic dictionary Fax
Category Audio & visual Cameras Health Office House & appliance
4.3 Experiment Results Developers can know where users look in the webpage by using Webjig. Table 3 shows what percentage of the subject of Group A firstly clicked on which categories. The grayed rectangle in Table 3 means the correct category where a specified product
Webjig: An Automated User Data Collection System for Website Usability Evaluation
283
Table 3. Results of first category sellection Category Task 1 Task 2 Task 3 Task 4 task5 Camera 29% 13% 0% 0% 0% Computer 0% 46% 0% 13% 4% 4% 33% 0% 0% 21% Audio-video equipment House & appliance 54% 4% 71% 29% 29% Game 4% 4% 0% 0% 0% 58% 46% Office equipment 8% 0% 4% 25% 0% 0% Health 0% 0%
exists for each task. For example, 54% of the subjects first clicked on the category of house & appliance, thought dry cell belonged to the category of audio & visual. When using existing systems, developers cannot know such the information. Table 4 shows the changed structure of the menu which was planned by the developers based on the result of Table 3. The plan is made from an idea that if there was the category more clicked by users than the current category, a target product should be moved to a proper category. In case of task 1 where subjects searched a dry cell, a dry cell belonged to the audio & visual category, but many subjects first pay attention to the house & appliance category. Therefore, the developers moved the dry cell to the category of house & appliance. Further, in case of task 4 where subjects searched an electronic dictionary, an electronic dictionary belonged to the category of office equipment, and the majority of the subjects first paid attention to the office equipment category. Therefore, the developers did not move it to any other category. Table 4. Change plan for the menu of the categories Task Name Task 1 Task 2 Task 3 Task 4 Task 5
Product Dry cell SD Memory Card A massage chair Electronic dictionary Fax
Original category Audio & video Cameras Health Office House & appliance
Destination category House & appliance Computers House & appliance Office Office
We perform the experiment after changing the website, as shown in table 7. We show the experiment result in Fig.5. From Fig.5, the task execution time has been reduced in tasks 1, 2, and 3 by applying the changed plan. Fig. 5 shows the results of the execution time for each task in Step1 and Step 31. We can confirm that the execution time in Step 3 is shorter than that in Step 1, that is, the improved menu structure based on the developers’ analysis using Wegjig was effective. 1
Since the structure of the menu was changed in Task 4, we could not confirm the significant difference between the results in Step1 and Step3.
284
M. Kiura, M. Ohira, and K. Matsumoto
Fig. 5. Result of the task execution time in Step1 and Step3
5 Discussion By using Webjig, developers can obtain information which they would not have got with the existing systems. For this reason, developers can detect problems in website usability and create a plan for improving website usability by collecting data of users’ interactions, as performed in this experiment. In the experiment where users choose the items from the menu, the developers can determine the execution time for each task by using existing systems. Thus, they can detect the problems of usability by comparing the execution time of each task and pinpoint the task where the execution time is longer than that taken by another task. In Fig.5, the execution time of tasks 1, 2, 3, and 5 is longer than that of task 4. For this reason, a developer can hypothesize that there remains problems of website usability. However, it is difficult to eliminate the problem if they cannot understand the cause of the problem. By using Webjig, a developer can efficiently detect the problem of website usability. In case of task 1 (subjects find a dry cell), we show the experiment result in table 3; dry cell belongs to audio-visual equipment, but many subjects pay attention to household appliance. The developer hypothesized that “Many users think that a dry cell belongs to a household appliance” and moved the dry cell from audio-visual equipment to household appliance. As a result, the execution time is reduced before changing the category. According to Fig.5, the task execution time of the changed website is less than that of the original website. In tasks 1, 2, and 3, we can observe significant improvement in the execution time. However, in task 5, we did not observe any significant improvement in the execution time.
Webjig: An Automated User Data Collection System for Website Usability Evaluation
285
Table 5. Priority for the improvement Task Name Task 1 Task 2 Task 3 Task 5
Correct category (A) 4% 13% 25% 29%
Current Category (B) 54% 46% 71% 46%
B/A 13.5 3.5 2.8 1.6
We explain the reason for this. In table 5, we compare the rate of users who pay attention to the correct category with the rate of users who pay attention to the changed category. In case of task 1, 4% of users pay attention to the correct category (a category of audio & visual) when searching for dry cell and 54% of users pay attention to the wrong category (a category of house & appliance) when searching for dry cell. This has a difference of 13.5 times. Similarly, task 2 has a difference of 3.5 times, task 3 has a difference of 2.8 times, and task 5 has a difference of 1.6 times. As a result, we can say that if there is not a big difference in the rate of users who pay attention to an original category and the rate of users who pay attention to a changed category, we cannot confirm an effect in the change. Therefore, developers have to examine whether the usability is improved by understanding users’ interactions and not by the reason that the task execution time was longer than others. By using Webjig, a developer can exactly understand users’ interactions and examine whether the usability is improved. However, it is difficult to examine the improvement of website usability by using existing systems because exact users’ interactions cannot be obtained. However, developers cannot use the Webjig instead of user testing because they can know the gaze point by using the eye tracking system and they can know the intention of the user by interviewing him/her during user testing. But we saw that there was the point that could be improved website usability by using Webjig. Therefore, developers may efficiently improve website usability by combining user testing and Webjig.
6 Conclusion and Future Work In this paper, we proposed a Webjig support system for static and dynamic websites. As a result of the experiment, we show that developers can improve website usability effectively by using Webjig. In the future, we are going to think about the cost of website usability evaluation between existing systems and Webjig and compare usability testing with Webjig to determine the efficiency of website usability evaluation.
Acknowledgements This study is supported by Information-technology Promotion Agency, Japan (IPA), Exploratory IT Human Resources Project (MITOU Program) in the fiscal year 2008.
286
M. Kiura, M. Ohira, and K. Matsumoto
References 1. Nielsen, J., Landauer, T.K.: A mathematical model of the finding of usability problems. In: The INTERACT 1993 and CHI 1993 conference on Human factors in computing systems, pp. 206–213 (1993) 2. Dumas, J.S., Redish, J.C.: A Practical Guide to Usability Testing. Ablex Publishing, Norwood, New Jersey (1993) 3. Barnum, C.M.: Usability Testing and Research. Longman, London (2001) 4. Hong, J.I., Landay, J.A.: WebQuilt: a framework for capturing and visualizing the web experience. In: The 10th international conference on World Wide Web (WWW 2001), pp. 717–724 (2001) 5. Etgan, M., Cantoe, J.: What does getting WET (Web Event-logging Tool) mean for web usability? In: 5th Conference on Human Factors and the Web, HFWEB 1999 (1999), http://zing.ncsl.nist.gov/hfweb/proceedings/ etgen-cantor/index.html (accessed February 27, 2009) 6. Arroyo, E., Selker, T., Wei, W.: Usability tool for analysis of web designs using mouse tracks. In: CHI 2006 extended abstracts on Human factors in computing systems, pp. 484– 489 (2006) 7. Atterer, R., Schmidt, A.: Tracking the interaction of users with AJAX applications for usability testing. In: The SIGCHI conference on Human factors in computing systems (CHI 2007), pp. 1347–1350 (2007) 8. Chen, M.C., Anderson, J.R., Sohn, M.H.: What can a mouse cursor tell us more?: correlation of eye/mouse movements on web browsing. In: CHI 2001 extended abstracts on Human factors in computing systems, pp. 281–282 (2001) 9. Mueller, F., Lockerd, A.: Cheese: tracking mouse movement activity on websites, a tool for user modeling. In: CHI 2001 extended abstracts on Human factors in computing systems, pp. 279–280 (2001)
ADiEU: Toward Domain-Based Evaluation of Spoken Dialog Systems Jan Kleindienst, Jan Cuřín, and Martin Labský IBM Research, Prague, Czech Republic {jankle,jan_curin,martin.labsky}@cz.ibm.com
Abstract. We propose a new approach toward evaluation of spoken dialog systems. The novelty of our method is based on utilization of domain-specific knowledge combined with the deterministic measurement of dialog system performance on a set of individual tasks within the domain. The proposed methodology thus attempts to answer questions such as: “How well is my dialog system performing on a specific domain?”, “How much has my dialog system improved since the previous version?”, “How much is my dialog system better/worse than other dialog systems performing on that domain?” Keywords: Dialog, evaluation, scoring, multimodal, speech recognition.
1 Introduction Research in the field of conversational and dialog systems has a long tradition starting in 1966 with Weizenbaum's Eliza [1]. More recently, research in spoken dialog systems has tackled more ambitious domains, such as problem solving [2], navigation [3], or tutoring systems [4]. This paper is organized as follows. In introduction we outline our motivation and the principle of the proposed method. In Section 2 we introduce the concept of a domain task ontology that can serve as a benchmarking tool for well-known application domains. Section 3 describes in detail the proposed ADiEU metric and its computation. Section 4 presents a case study in the music management domain and demonstrates the application of ADiEU to a real-world task. We discuss practical considerations regarding the proposed metric in Section 5, human evaluation in Section 6, and conclude in Section 7. 1.1 Rationale Current methods and techniques for measuring performance of spoken dialog systems are still very immature. They are either based on subjective evaluation (Wizard of Oz or other usability studies) or they are borrowing automatic measures used in speech recognition, machine translation or action classification, which provide only incomplete picture of the performance of the system. Nowadays, dialog systems are evaluated by measures used in speech recognition, such as word error rate (WER) or action classification error rate [5], by techniques that measure primarily dialog coherence [6], and by systems supporting human judgment-based evaluation, such as PARADISE [7, 8]. What J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 287–294, 2009. © Springer-Verlag Berlin Heidelberg 2009
288
J. Kleindienst, J. Cuřín, and M. Labský
is particularly missing in this area is (1) a measurement of performance for a particular domain, (2) possibility to compare one dialog system with others, and (3) evaluation of a progress during the development of a dialog system. By the ADiEU1 scoring presented herein we attempt to address these three cases. 1.2 The Elements of ADiEU Metric The ADiEU score consists of two ingredients both of which range from 0 to 1: A) Domain Coverage (DC) score, B) Dialog Efficiency (DE) score. We describe both scores in the following chapters. Note that the results of domain coverage and dialog efficiency may be combined into a single compound score to attain a single overall characteristic (the eigen value) of the assessed dialog system. The ADiEU score relies on a good understanding of the dialog domain that is described in the form of a domain task ontology. The more expert knowledge is projected into the domain ontology, the more reliable results we expect from the ADiEU score.
2 Capturing Domain Ontology The cornerstone of our approach is to evaluate spoken and multi-modal dialog systems within a predefined, well-known (and typically narrow) domain. In our labs we have developed many speech and multimodal applications for various domains, such as music selection, TV remote control, in-car navigation and phone control; using grammars, language models and natural language understanding techniques. In order to compare two spoken dialog systems that deal with the same domain, we first describe the domain diligently using the task ontology. This restricted ontology represents the human expert knowledge of the domain and is encoded as a set of tasks with two kinds of relations between the tasks: task generalization and aggregation. Individual tasks are defined as sequences of parameterized actions. Actions are separable units of domain functionality, such as volume control, song browsing or playback. Parameters are categories of named entities, such as album or track title, artist name or genre. Tasks are labeled by weights, which express the relative importance of a particular task with respect to other tasks. The ontology may also define task aggregations which explicitly state that a complex task can be realized by sequencing several simpler tasks. Table 1 shows a sample task ontology for the music control domain. For example, the task volume control/relative with weight of 2 (e.g. “louder, please”) is considered more important in evaluation than its absolute sibling (e.g. “set volume to 5”). This may be highly subjective if scored by a single human judge and thus a consensus of domain experts may be required to converge to a generally acceptable ontology for the domain. Once acknowledged by the community, this ontology could be used as the common etalon for scoring third-party dialog systems. 1
We call our measurement the Automatic Dialog Evaluation Understudy, ADiEU.
ADiEU: Toward Domain-Based Evaluation of Spoken Dialog Systems
289
Table 1. Speech-enabled reference tasks for the jukebox domain. Tasks are divided into groups. Both group as well as tasks within the group are assigned relative importance points by an expert. These points are normalized to obtain per-task contribution to the domain’s functionality. ITC shows ideal turn count range for each task. Group Points Share Volume 2 15.50% Playback 4 31.01%
Play mode 0.5 3.88% Media library 6 46.51%
Menu 0.4
3.10% 100%
Task Description relative absolute mute play stop pause resume next, previous track next, previous album media selection shuffle repeat browse by criteria play by criteria search by genre search by artist name up to 100 artists more then 100 artists search by album name up to 200 albums more than 200 albums search by song title up to 250 songs more than 2000 songs search by partial names words spelled letters ambiguous entries query item counts favorites browse and play add items media management refresh from media add or remove media access online content quit switch among other apps
Points 2 1 2 3 3 1.5 1.5 1 1 1
Contrib. % 6.20 3.10 6.20 7.75 7.75 3.88 3.88 2.58 2.58 2.58
ITC 1 1 1 1 1 1 1 1 1 1
1 1 2 4 2
1.94 1.94 3.93 7.85 3.93
1 1 1..2 1..2 1
1 2
1.96 3.93
1..2 1..2
1 2
1.96 3.93
1..2 1..2
1 2
1.96 3.93
1..2 1..2
1 1 2
1.96 1.96 3.93
2 2 2
0.5
0.98
1
0.5 0.3
0.98 0.59
1..2 1
0.2 0.2 1 0.5 1
0.39 0.39 1.96 1.03 2.07 100
1 1..2 2..3 1..2 1..2
3 The Proposed Method of ADiEU Evaluation The actual dialog system evaluation metric that is in the heart of our method consists of two indicators: Domain Coverage (DC) - computed over the task ontology and Dialog Efficiency (DE) that quantifies the outcome of user test sessions. The DC expresses how the evaluated system covers the set of tasks in the ontology for a particular domain; while the DE indicates the performance of the evaluated system on those tasks supported by the system.
290
J. Kleindienst, J. Cuřín, and M. Labský
3.1 Scoring of Domain Coverage The domain coverage (DC) is a sum of weights of tasks supported by the system (S) over the sum of weights of all tasks from the ontology (O).
DC ( S , O) =
∑ ∑
t ∈su pported tasks ( O ) t *∈all tasks ( O )
wt
wt *
(1)
Table 1 shows a sample domain task ontology for the music management domain that shows the raw points assigned by a domain expert and their normalized versions that are used to assess the relative importance of individual tasks. The expert may control the weights of whole task groups (such as Playback control) as well as the weights of individual tasks that comprise these groups. Generally, the ontology can have more than two levels of sub-categorization that are shown in the example. 3.2 Scoring of Dialog Efficiency The actual efficiency of dialog is measured using the number of dialogue turns [9, 10] needed to accomplish a chosen task. In spoken dialog systems, a dialog turn corresponds to a pattern of user speech input followed by the system’s response. We introduce a generalized penalty turn count (PTC) that measures overall dialog efficiency by incorporating other considered factors: number of help requests, number of rejections, and user and system reaction times.
PTC (t ) = TC (t ) + λNHR NHR(t ) + λNRP NRP(t ) + λURTURT (t ) + λSRT SRT (t )
(2)
Where TC is the actual dialog turn count, NHR is the number of help requests, URT is user response time and SRT is system response time and the lambdas represent weights of each contributor to the final penalty turn count (PTC)2. The obtained penalty turn count in then compared to an ideal number of turns for a particular task. We define a key property, the ideal number of turns (INT), as being determined by at least the following factors. The INT is (F1) directly proportional to a number of information slots to be filled and (F2) indirectly proportional to a size of the block of information slots commonly accepted as coherent. INT (t ) =
number of in formation slots to be filled size of a block of in formation slots commonly accepted as coherent
(3)
For example, the concept of “date” consists of three information slots (day, month, and year) that need to be filled. Here, the number of information slots (F1) is three, which is in this case the same as the size of a coherent block expected by the users. The INT for the “date” concept is thus 1 (=3/3). In the contemporary art the INT property is determined manually by a human judgment. 2
In our experiments, we set λNHR=0.5, λNRP=1, and λURT=λSRT=0 since for the music domain the user reaction time was not indicative of dialog quality and both applications responded instantly.
ADiEU: Toward Domain-Based Evaluation of Spoken Dialog Systems
291
The actual score of the dialog efficiency (DE score) for an individual task is then counted as a fraction of difference between INT and PTC against current PTC, i.e.: ⎛ PTC (t ) − INT (t ) ⎞ DE (t ) = 1 − max ⎜⎜ , 0 ⎟⎟ PTC (t ) ⎝ ⎠
(4)
To avoid subjective scoring we typically use several human testers as well as several trials per one task. For example for task “play by artist” the following set of trials can be used: “Play something by Patsy Cline”, “Play some song from your favorite interpreter”, or “Play some rock album, make the final selection by the artist name”. Each of these trials has assigned its ideal number of turns (this is why INT for tasks in the ontology are given by range in the Table 1.) The task dialog efficiency score is then computed as an average over all human testers and dialog efficiency for each trial. Samples of trials used in the evaluation of music management domain are given in Table 2. 3.3 The ADiEU Score The ADiEU score is then counted as a sum of products of domain coverage and dialog efficiency for each task in the domain ontology, i.e.:
ADiEU ( S , O) =
∑
t ∈su pported tasks ( O )
wt ⋅ DE (t )
∑t∈su pported tasks (O ) wt
(5)
4 Case Study: ADiEU Scores for Music Management Domain We applied the ADiEU scoring to our two dialog systems developed at different times and both partially covering the music management dialog domain. Both allow their users to play music by dynamically generating grammars based on meta tags found in users’ mp3 files. The first one, named A-player, is simpler and covers a limited part of the music management domain. The second, named Jukebox, covers a larger part of the domain and also allows free-form input using a combination of statistical language models and maximum entropy based action classifiers. For both applications, we collected input from a group of 10 speakers who were asked to accomplish tasks listed in Table 2. Each of these user tasks corresponded to a task in the domain task ontology and there was at least one user task per each ontology task that was supported by either A-player or Jukebox. The subjects were given general guidance but no sample English phrases were suggested to them that could be used to control the system. In order not to guide users even by the wording of the user tasks, the tasks were described to them in their native language. All ten subjects were non-native but fluent English speakers.
292
J. Kleindienst, J. Cuřín, and M. Labský Table 2. Specific tasks to be accomplished by speakers using A-player and Jukebox Task Start playback of arbitrary music Increase the volume Set volume to level 10 Mute on Mute off Pause Resume Next track Previous track Shuffle Play some jazz song Play a song from Patsy Cline Play Iron Man from Black Sabbath Play the album The Best of Beethoven Play a song Where the Streets Have No Name Play a song Sonata no. 11 (ambiguous) Play a rock song by your favorite artist Reload songs from media
A-player Jukebox ITC x x 1 x 1 x 1 x 1 x 1 x 1 x 1 x x 1 x x 1 x x 1 x 1 x x 1 x x 1 x x 1 x x 1 x x 2 x x 3 x 1
Table 3. Computation of coverage, task completion score and ADiEU for A-player and Jukebox Task sup volume relative volume absolute mute play stop pause resume next, prev. track next, prev. album shuffle browse by criteria play by criteria search by genre search by artist 100 artists search by album 200 albums search by song 2000 songs word part. search ambiguous entries media refresh
0 0 0 1 1 0 0 1 0 0 0 1 0
A-player DC DE weight score 0 0 0 7.75 0.57 7.75 1.00 0 0 2.58 0.80 0 0.50 0 0 7.85 0.82 0 0.67
1 1
1.96 3.93
0.83 0.83
1 1
1.96 3.93
1.00 1.00
1 1 0 0 1 0.34
1.96 3.93 0 0 0.39 43.99
0.79 0.79 0.67 82.6
ADiEU 0.000 0.000 0.000 0.044 0.078 0.000 0.000 0.021 0.000 0.000 0.000 0.064 0.000 0.000 0.016 0.033 0.000 0.020 0.039 0.000 0.015 0.031 0.000 0.000 0.003 36.3
sup 1 1 1 1 1 1 1 1 1 1 0.5 1 1
Jukebox DC DE weight score 6.2 0.82 3.1 0.82 6.2 0.82 7.75 0.32 7.75 0.82 3.88 0.57 3.88 0.50 2.58 1.00 2.58 0.80 1.94 0.67 1.97 0.52 7.85 0.67 3.93 0.78
1 1
1.96 3.93
0.40 0.60
1 1
1.96 3.93
0.29 0.75
1 1 1 1 0 0.55
1.96 3.93 1.96 3.93 0 83.17
0.61 0.93 0.55 0.49 66.7
ADiEU 0.051 0.025 0.051 0.025 0.064 0.022 0.019 0.026 0.021 0.013 0.010 0.052 0.030 0.000 0.008 0.024 0.000 0.006 0.029 0.000 0.012 0.036 0.011 0.019 0.000 0.554
Table 3 shows the computation of the ADiEU score and its components: domain coverage (DC) and domain efficiency (DE). For A-player, which is limited in functionality, the weighted domain coverage only reached 43.99%, whereas for Jukebox
ADiEU: Toward Domain-Based Evaluation of Spoken Dialog Systems
293
this was 83.17%. On the other hand, A-player allowed its users to accomplish the tasks it supported more quickly than Jukebox; this is documented by the weighted dialog efficiency score reaching 82.6% for A-player and 66.7% for Jukebox. This was mainly due to Jukebox being more interactive (e.g. asking questions, presenting choices) and due to a slightly higher error rate of a dictation-based system as opposed to a grammar-based one. The overall ADiEU score was higher for Jukebox (55.4%) than it was for A-player (36.3%). This was in accord with the feedback we received from users from ongoing evaluations who claimed they had better experience with the Jukebox application. The two major reasons were the support of free-form commands by the Jukebox and its broader functionality.
5 Human Evaluation in Progress The HCI methodology [10] advocates several factors that human judges collect in the process of dialog system evaluation. These key indicators include accuracy, intuitiveness, reaction time, and efficiency. When designing the evaluation method we attempted to incorporate the core of these indicators into the scoring method to ensure good correlation of the ADiEU metric with the human judgment. We are currently collecting data form the evaluation test where the human judges act as personas [11]. The results of the evaluation either confirm or reject the assumption of the ADiEU scoring correlation with human judgment.
6 Practical Considerations of the ADiEU Scoring The application of the ADiEU scoring to an arbitrary dialog system has several practical considerations. Generally, there are two possibilities how to evaluate a thirdparty dialog system by our metric: 1) agreed API contract supported by the external system or 2) rich enough tracing and logging information. Both approaches will typically require cooperation with the supplier of the measured system. The API approach asserts there exists a runtime API that supports e.g.: simulating input to the system, changing the dialog state, obtaining notification about dialog state changes with sufficient introspection, possibility to read output of the system. The logging approach demands the application to write all the required information to a log file, ideally in a format compliant with the ADiEU score measuring tool. This usually means tight cooperation with the dialog system engineers, but it is easier and more straight forward than changing the application API in the case it does not provide access to all information needed by the ADiEU metric. Having the test run in the form of log has the advantage of the possibility to send the logs to the scoring tool hosted as a web service and the possibility to evaluate the system against multiple domain ontologies or ontology versions of the same domain. We have experimented with both approaches while evaluation our systems.
7 Conclusion We introduce a method for quantitative evaluation of spoken dialog system that utilizes the domain knowledge encoded by a human expert. The evaluation results are
294
J. Kleindienst, J. Cuřín, and M. Labský
described in the form of a comparison metric consisting of domain coverage and dialog efficiency scores allowing to compare relative as well as absolute performance of a system within a given domain. This approach has an advantage of comparing incremental improvements on an individual dialog system that the dialog designer may want to verify along the way. In addition, the method allows to cross-check the performance of third-party dialog systems operating on the same domain and immediately understand the strong and weak points in the dialog design. Human evaluations are currently conducted to estimate the correlation between the ADiEU score and human judgment. The subjectivity of human scoring and consensus on the ontology coverage are subject of further investigation.
References 1. Weizenbaum, J.: ELIZA - A Computer Program for the Study of Natural Language Communication between Man and Machine. Communications of the Association for Computing Machinery 9, 36–45 (1966) 2. Allen, J., Chambers, N., Ferguson, G., Galescu, L., Jung, H., Swift, M., Taysom, W.: PLOW: A Collaborative Task Learning Agent. In: Twenty-Second Conference on Artificial Intelligence, AAAI-2007 (2007) 3. Cassell, J., Stocky, T., Bickmore, T., Gao, Y., Nakano, Y., Ryokai, K.: Mack: Media lab autonomous conversational kiosk. In: Imagina 2002 (2002) 4. Graesser, A.C., VanLehn, K., Rosfie, C.P., Jordan, P.W., Harter, D.: Intelligent tutoring systems with conversational dialogue. AI Mag. 22(4), 39–51 (2001) 5. Jurafsky, D., Martin, J.H.: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition (International edn.). Prentice-Hall, Englewood Cliffs (February 2000) 6. Gandhe, S., Traum, D.: Evaluation understudy for dialogue coherence models. In: Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue, Columbus, Ohio, June 2008, pp. 172–181. Association for Computational Linguistics (2008) 7. Walker, M., Kamm, C., Litman, D.: Towards developing general models of usability with paradise. Nat. Lang. Eng. 6(3-4), 363–377 (2000) 8. Hajdinjak, M., Mihelific, F.: The paradise evaluation framework: Issues and findings. Comput. Linguist. 32(2), 263–272 (2006) 9. Le Bigot, L., Bretier, P., Terrier, P.: Detecting and exploiting user familiarity in natural language human-computer dialogue. In: Asai, K. (ed.) Human Computer Interaction: New Developments, pp. 269–382. InTech Education and Publishing (2008); ISBN: 978-9537619-14-5 10. Nielsen, J.: Heuristic evaluation. In: Nielsen, J., Mack, R.L. (eds.) Usability Inspection Methods, pp. 25–64. John Wiley & Sons, New York (1994); ISBN: 0-471-01877-5 11. Carroll, J.: Human Computer Interaction in the New Millennium. ACM Press, New York (2001)
Interpretation of User Evaluation for Emotional Speech Synthesis System Ho-Joon Lee and Jong C. Park Computer Science Department, KAIST 335 Gwahangno, Yuseong-gu, Daejeon 305-701 Republic of Korea
[email protected],
[email protected] Abstract. Whether it is for human-robot interaction or for human-computer interaction, there is a growing need for an emotional speech synthesis system that can provide the required information in a more natural and effective manner. In order to identify and understand the characteristics of basic emotions and their effects, we propose a series of user evaluation experiments on an emotional prosody modification system that can express either perceivable or slightly exaggerated emotions classified into anger, joy, and sadness as an independent module for a general purpose speech synthesis system. In this paper, we propose two experiments to evaluate the emotional prosody modification module according to different types of the initial input speech. And we also provide a supplementary experiment to understand the apparently prosody-independent emotion, or joy, by replacing the resynthesized joy speech information with original human voice recorded in the emotional state of joy. Keywords: Emotional Speech Synthesis, User Evaluation, Emotional Prosody Modification, Affective Interaction.
1
Introduction
Speech is understood as the most basic and widely used communication method for the expression of one’s thoughts during human-human interactions, and studied also for a user-friendly interface between humans and machines. The recent progress in speech synthesis has produced artificial voice results with very high intelligibility, but the quality of sound and the naturalness of inflection still remain a major issue. Recently, in addition to the need for improvement in sound quality and naturalness, there is a growing need for a method to generate spoken language expressions with appropriate emotions to provide the required information in a more natural and effective manner, as well as for the enhancement of an emotional speech synthesis system for effective human-robot interaction. The related work in the field confirms the common belief that prosody plays a key role for the task [1, 2]. However, during the development of our emotional speech synthesis system [3], we realized that, while there are emotions that can be easily perceived with simplified prosody structures, there are those that are very hard to express with prosody structures alone, even when we provide the most accurate prosody structure. J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 295–303, 2009. © Springer-Verlag Berlin Heidelberg 2009
296
H.-J. Lee and J.C. Park
In order to identify and understand the characteristics of these emotions and their effects, we propose in this paper a series of user evaluation experiments on an emotional prosody modification system that can express either perceivable or slightly exaggerated emotions as an independent module for general purpose speech synthesis systems.
2
Emotional Speech Synthesis System
For the analysis of prosody structure through a more precise level of units, we annotated the Korean emotional speech corpus, distributed by the Speech Information Technology & Industry Promotion Center [4], with the K-ToBI labeling system. This speech corpus was recorded by six professional actors and actresses in a sound-proof room, and is composed of emotionally neutral ten sentences with six different emotions (joy, anger, sadness, fear, boredom, and neutral). An AKG C414-B ULS microphone was used with a 16KHz sample rate, and each speech was stored as a 16bit Windows wave format. We used eight sentences spoken by six speakers, as described in Table 1, considering four emotions (joy, anger, sadness, and neutral). The number of Ejeols (words separated by a space) was evenly distributed from 1 to 6. Table 1. Eight sentences used for prosody structure analysis
Ejeol 1 1 2 3 3 4 5 6
Sentence . (Yes.) . (No.) . (I don’t know either.) , . (See, let’s end it now.) . (It really is.) ? (Where are you going now?) . (This is not what I wanted.) . (I shut the door closed asking her not to leave.)
예 아니요 나도 몰라 야 이제 그만하자 정말 그렇단 말이야 지금 어디 가는 거야 이건 내가 원하던 게 아니야 난 가지 말라고 하면서 문을 닫았어
The Korean emotional speech corpus had passed manufacturer’s perception test performed by twenty subjects (eighteen males, two females), and Table 2 below shows the results. Among the emotions, anger turned out to be the most perceivable emotion (94.3%), and fear, the most confusing one (80.3%). However, the overall acceptance rate is more than 80%. For the analysis of dominant emotional prosody patterns, we annotated eight sentences spoken by six speakers with four emotions, or 192 pieces of speech in total with the K-ToBI labeling system [5]. And for the statistical verification of the K-ToBI labeled data, we performed Pearson’s Chi-square tests. As shown in Fig. 1, the results support the null hypothesis that each emotion has distinct Intonational Phrase (IP) boundary patterns that can distinguish one emotional state from the rest. Then we calculated adjusted residuals to find the distinct pitch contour pattern or patterns. If the calculated value of the adjusted residual is bigger than 2, that feature can be statistically
Interpretation of User Evaluation for Emotional Speech Synthesis System
297
interpreted as the dominant pattern of a certain emotion. Pearson’s Chi-square tests and adjusted residual were performed by SPSS software. From the statistical analyses of pitch contour patterns, we were able to find very strong tendencies between anger and HL%, joy and LH%, sadness and H%, and neutral and L%.
Fig. 1. Chi-square test and adjusted residual calculation results Table 2. Perception test result done by twenty subjects Speaker CWJ KKS LHJ MYS PYH YSW Average
Neutral 89.5 62.5 83.5 84.5 85.0 95.4 83.3
Joy 93.5 90.5 67.5 91.5 95.0 89.5 87.9
Anger 88.5 92.0 98.0 90.0 99.0 98.5 94.3
Sadness 85.5 80.5 84.5 89.5 94.0 89.5 87.3
Fear 59.0 85.5 88.5 93.5 61.5 93.5 80.3
Boredom 93.0 82.0 84.0 81.0 94.5 81.0 85.9
To incorporate these analyzed and distinct Intonational Phrase boundary patterns for different emotional states, we propose a prosody-unit-level emotional prosody modifier that produces distinct pitch contour, intensity contour, and speech duration according to the three different emotional states: anger, joy, and sadness. The emotional prosody modifier is a simple, coarse-grained prosody re-synthesis module that consists of a pitch contour mapping function, a pitch exaggeration function, an intensity variation
298
H.-J. Lee and J.C. Park
function, and a duration variation function. We set the empirical value of each prosodic parameter based on the previous findings in the literature [1, 2], also taking into account language specific phenomena for Korean including the speaker’s gender information, short and long vowel sound disambiguation [6, 7], and prosodic structure of discourse markers [8], captured from various Korean speech corpora. Equation 1 below shows the algorithm of our pitch contour modification function. This pitch contour modification function generates the base emotional pitch contour of speech including the synthesized results of Text-to-Speech (TTS) systems and recorded human voice for each emotion. (1) where t
∈[t1,t 2] ;
y y′
original pitch value as a function of time t ; modified pitch value;
a
maximum / minimum pitch range ; initial position of pitch contour; final position of pitch contour (rising tone: 0.5, rising-falling: 1); and declination / ascent level.
b
c d
After the modification of the base emotional pitch contour, we apply a pitch exaggeration function to characterize the difference in pitch variation according to the difference in emotion types. First, this module detects eight pitch points per unit. Then we exaggerate the difference in each pitch pair by adding 6Hz for joy and anger, and 40Hz for fear and sadness. Next, we adjust the intensity with the intensity contour modification function which is similar to the pitch contour modification function in Equation 1, but much simpler. Then we control the duration of each unit preserving the intrinsic value of f0. All these four modules are implemented in a PRAAT [9] script supporting not only commercial TTS systems, but recorded human voice also. We used the Python language for the interface of PRAAT software and TTS output or human voice, and therefore this module supports both Linux and Windows environments.
Fig. 2. Pitch and intensity traces of original speech, spoken in a neutral emotional state
Interpretation of User Evaluation for Emotional Speech Synthesis System
299
Fig. 3. Pitch and intensity traces of prosody modified speech to a sad emotional state
이건 내가 원하던
Fig. 2 shows the prosody trace of a recorded Korean utterance “ .” which means in English “This is not what I wanted.” spoken neutrally by a professional actress, and Fig. 3 shows its modified prosody trace as a sad emotional state produced by our emotional prosody modifier. The blue line (upper line) indicates the pitch contour, and the green line (lower line) the intensity. In Fig. 3, the entire duration is lengthened from 1.753 seconds to 2.805 seconds without any side effect such as f0 contour lowering. Pitch contour is spread more widely, and intensity is weakened.
게 아니야
3
Evaluation of Emotional Speech Synthesis System
For the identification and understanding of the characteristics of three basic emotions and their effects, we prepared three stages of experiments. The first and second experiments are designed to evaluate the emotional prosody modifier according to different types of the initial input speech, such as monotonous-prosody speech and excited-prosody speech. The supplementary experiment is performed to identify apparently prosody-independent speech. The subjects of these three experiments are fourteen kindergarten teachers, twelve of them females and two males. They are 29.6 years old on average. We did not carry out any prior training for the fourteen subjects, and answers were not notified to the subjects after the experiments. At the beginning of the experiments, subjects were asked to choose one most likely emotion among anger, joy, sadness, and neutral. We used five semantically neutral sentences as show in Table 3. For the first experiment, five neutrally recorded speech files were used as a monotonous input speech, and the emotional prosody modifier produced fifteen results with three emotional states. The test sequences of first and second experiments were randomly organized. Table 3. Input sentences for the evaluation of emotional prosody modifier
Sentence
야, 이제 그만하자. (See, let’s end it now.) 정말 그렇단 말이야. (It really is.) 지금 어디 가는 거야? (Where are you going now?) 이건 내가 원하던 게 아니야. (This is not what I wanted.) 난 가지 말라고 하면서 문을 닫았어. (I shut the door closed asking her not to leave.)
300
H.-J. Lee and J.C. Park
Table 4 shows the evaluation results of the emotional prosody modification with monotonous input speech. From the analysis of the results of the first experiment, we find that anger is very sensitive to emotional prosody structure (80% of perception rate). And sadness also shows a strong relationship with prosody structure. It is rather surprising to note that none of the subjects perceived joy from the monotonous input speech, even though we modified the prosody structure of joy based on the analyses of real speech, exactly as we did for anger and sadness. Table 4. Evaluation result for monotonous input speech
Anger Joy Sadness
Anger 56 (80.0%) 12 (17.1%) 4 (5.7%)
Joy 3 (4.3%) 0 (0%) 2 (2.9%)
Neutral 6 (8.6%) 16 (22.9%) 23 (32.9%)
Sadness 5 (7.1%) 42 (60.0%) 41 (58.6%)
Total 70 70 70
For the second experiment, we used five pieces of excited voice as the input for the emotional prosody modifier, and generated fifteen randomly organized test sets. Table 5 indicates the results of the second perception experiment. Table 5. Evaluation result for excited input speech
Anger Joy Sadness
Anger 56 (80.0%) 18 (25.7%) 3 (4.3%)
Joy 7 (10.0%) 15 (21.4%) 38 (54.3%)
Neutral 6 (8.6%) 15 (21.4%) 18 (25.7%)
Sadness 1 (1.4%) 22 (31.4%) 11 (15.7%)
Total 70 70 70
Interestingly, anger preserved prosody sensitivity when the type of input was changed from monotonous-prosody speech to excited-prosody speech. From the second experiment, two major changes were observed: an increase in the perception rate of joy, and a decrease in the perception rate of sadness. The decrease in the perception rate of sadness can be caused by the sudden change of the test environment. In order to indentify the cause of this sudden change, we proposed the third experiment. However, the expected response of the perception rate of joy was still very weak. To identify the characteristics of the emotional prosody structure of joy, and to validate the hypothesis above on a sudden change of sadness, we performed the third experiment with the same subjects and in the same sequence as the second experiment. The only difference between the second and third experiments was just the replacement of the modified joy speech with the original human voice recordings in the emotional state of joy, which had passed the manufacturer’s perception test at the rate of 91.5%.
Interpretation of User U Evaluation for Emotional Speech Synthesis System
301
Table 6. Evaluatio on result for repeated test with human voice recordings
Anger Joy Sadness
Anger 58 (82.9% %) 32 (45.7% %) 10 (14.3% %)
Joy 7 (10.0%) 12 (17.1%) 18 (25.7%)
Neutral 4 (5.7%) 15 (21.4%) 19 (27.1%)
Sadness 1 (1.4%) 11 (15.7%) 23 (32.9%)
Total 70 70 70
After the third perceptio on test, we made three interesting interpretations from the results shown in Table 6. First, F the same sequence in the repeated experiment did not seem to influence the percception rate of anger. There was only a slight movem ment from neutral to anger. This allows us to define anger as a primarily prosody-sensittive emotion. Second, we found that some s part of the decreased perception rate was due to the sudden change of the test en nvironment. So it is a possible interpretation that there w was a confusion of sadness in the second experiment. Despite the result of the secoond experiment, it appears that sadness s is also a prosody-sensitive emotion. Third and most importan nt, we could not find any meaningful relationship betw ween the prosody structure and the t emotion of joy, even though we used real voice whhich had passed the manufacturrer’s perception test at the rate of 91.5%. This leads uss to conclude that joy is not a prosody sensitive emotion, which forces us to find othher, effective approaches to ex xpress the emotion of joy through an emotional spoken language generation system m.
4
Discussion
For the accurate understan nding of each evaluation result, a quantitative compariison method that can also desccribe the influence of wrong answers is called for. For example, the perception rate of the first experiment related to anger is just equaal to that of the second experimeent. But for the same category, it is very hard to figure out the influence of errors su uch as joy and sadness. For this kind of interpretattion including error analysis, we suggest a Euclidean distance based quantitattive comparison method. Fig. 4 describes a Euclidean distance model of tetraheddron designed for the analysis off four types of category.
Fig. 4. Euclidean distance model for tetrahedron
302
H.-J. Lee and J.C. Park
From this point of view, we can calculate and compare each distance described in Table 4, Table 5, and Table 6. When the size of n is 70, the maximum distance of each category is approximately 98.99, and the minimum distance is 0. Table 7. Euclidean distance of Table 4
Anger Joy Sadness
Anger 16.31 73.38 81.06
Joy 87.67 84.05 82.76
Neutral 85.24 69.46 62.53
Sadness 86.06 34.41 37.28
Table 8. Euclidean distance of Table 5
Anger Joy Sadness
Anger 16.79 60.32 79.86
Joy 84.51 63.70 38.44
Neutral 85.33 63.70 65.41
Sadness 89.34 55.48 72.51
Considering both correct answers and errors, we conclude that synthesized anger based on the monotonous input speech is slightly closer to the position of anger than that based on the excited speech, even though they have the same perception rate. And for the synthesis of anger, the change of initial input speech from monotonous to excited one increases the distance of joy by 3.16, but decreases the distance of neutral by 0.09 and sadness by 3.28.
5
Conclusion
In this paper, we proposed an emotional prosody modification system, and evaluated the performance of the system, in order to find a relationship between prosody structures and emotions. First, we proposed a prosody-unit-level emotional prosody modification system that produces distinct pitch contour, intensity contour, and speech duration according to three different emotional states: anger, joy, and sadness. And during the evaluation process, anger and sadness were identified as prosody sensitive emotions, whereas joy was not. Consequently, this difference led us to discover the possibilities and limitations of prosody modification for the generation of emotional spoken language expression systematically. Further analyses of emotional speech data are necessary, taking into account various speakers, speaking environment, and speaking styles. And more organized evaluation and interpretation strategies are essentially needed for further work. Acknowledgments. This research was performed for the Intelligent Robotics Development Program, one of the 21st Century Frontier R&D Programs, funded by the Korea Ministry of Knowledge Economy.
Interpretation of User Evaluation for Emotional Speech Synthesis System
303
References 1. Schröder, M.: Emotional Speech Synthesis: A Review. In: Eurospeech 2001, vol. 1, pp. 561–564 (2001) 2. Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W., Taylor, J.G.: Emotion recognition in human-computer interaction. IEEE Signal Processing Magazine 18(1), 32–80 (2001) 3. Lee, H.-J., Park, J.C.: Customized Message Generation and Speech Synthesis in Response to Characteristic Behavioral Patterns of Children. In: Jacko, J.A. (ed.) HCI 2007. LNCS, vol. 4552, pp. 114–123. Springer, Heidelberg (2007) 4. SiTEC Emotional Speech Corpus, http://www.sitec.or.kr/English/index.asp 5. Jun, S.-A.: K-ToBI (Korean ToBI) Labeling Convention. Korean Journal of Speech Science 7 (2000) 6. Lee, H.-J., Park, J.C.: Lexical Disambiguation for Intonation Synthesis: A CCG Approach. In: Korean Society for Language and Information, pp. 103–118 (2005) 7. Lee, H.-J., Park, J.C.: Vowel Sound Disambiguation for Proper Intonation Synthesis. In: 19th Pacific Asia Conference on Language, Information and Computation, pp. 131–142 (2005) 8. Lee, H.-J., Park, J.C.: Characteristics of Spoken Discourse Markers and their Application to Speech Synthesis Systems. In: 19th Annual Conference on Human and Cognitive Language Technology, pp. 254–260 (2007) 9. PRAAT, http://www.praat.org
Multi-level Validation of the ISOmetrics Questionnaire Based on Qualitative and Quantitative Data Obtained from a Conventional Usability Test

Jan-Paul Leuteritz¹, Harald Widlroither¹, and Michael Klüh²

¹ Fraunhofer IAO / Universität Stuttgart IAT, Nobelstr. 12, 70569 Stuttgart, Germany
² Hansgrohe AG, Auestr. 5, 77761 Schiltach, Germany
{Jan-paul.leuteritz,Harald.widlroither}@iao.fraunhofer.de, [email protected]

Abstract. Qualitative and quantitative data, collected during a usability evaluation of two innovative prototypes of a small-display touch-screen device, were used to perform a multi-level assessment of the questionnaires used within the trial. The different validation methods are described and discussed with respect to their advantages and disadvantages. The conclusions from the validation study reveal that using the ISOmetrics to test uncommon prototypes may result in insufficient validity of the instrument.

Keywords: Validity, questionnaire, ISOmetrics, AttrakDiff, small display devices, shower control.
1 Introduction

1.1 The Goal of the Study

Questionnaires are, at first glance, a highly attractive instrument for the evaluation of new, innovative prototypes. First, they are easy to use. Moreover, they can be given to a high number of participants without creating much extra work for the experimenter. Their use is easy to justify: because they are standardised, they are rather objective in comparison to other usability evaluation methods. Last but not least, questionnaires yield a numeric measure of clearly defined dimensions of the users' cognitions or emotions, which makes their results easy to interpret and to explain to clients or colleagues.

However, depending on the exact evaluation task, some usability questionnaires are more adequate than others. Some of them might even become invalid or practically useless if they are used in a certain context. In order to decide which questionnaire to use within their specific projects, usability professionals need empirically based information on the strengths and weaknesses of certain groups of questionnaires or even of specific instruments. As this kind of information is usually not provided in the testing manuals, other ways must be found to retrieve it.

This article proposes a method to tackle this problem. The idea behind this method is based on the assumption that many usability professionals frequently do evaluations
under repeating conditions; they work, for example, with the same user group or a similar test pattern, or they usually evaluate prototypes from a certain line of products. Hence, they could use data from their own tests to cross-validate their survey instruments and see what kind of information these yield. This solution is fine as long as the cross-validation procedure does not consume too much effort. In order to find out whether such an approach is recommendable, the Fraunhofer Institute for Industrial Engineering (Fraunhofer IAO) conducted the study described in this article. An evaluation project commissioned by the German shower technology manufacturer Hansgrohe AG served as the basis of the multi-level validation approach. A usability test design was developed that would not just answer the respective evaluation questions but would also provide data for multi-level validation of the questionnaires used. Care was taken to keep the additional effort that served only the validation task as low as possible. This article presents the outline of the evaluation study and the detailed results of the multi-level validation approach. It aims at inviting other usability professionals to use and/or refine this method.
Fig. 1. Prototype A (© Hansgrohe AG, Design by Phoenix Design, Stuttgart)
Fig. 2. Prototype B (© Hansgrohe AG, Design by Phoenix Design, Stuttgart)
1.2 The Evaluation Project

The devices to be tested were two prototypes of a wall-mounted device for controlling the different functions of a modern comfort shower: hand showers, overhead-mounted shower plates offering various combinations of water rays, wall-mounted shower heads, steam-bath functions, coloured lighting, and a music player. The designs, including the interaction concept, had been created by Phoenix Design GmbH & Co. KG, Stuttgart.

Prototype A (Fig. 1) was a touch-screen device that featured two additional buttons and a pusher-and-rotator switch. Prototype B (Fig. 2) had a smaller screen that did not respond to touch input. It was instead controlled by a number of buttons, including a
set of four arrow buttons, an "OK" button, a "menu" button, and a back button in the form of a U-turn arrow. Prototype B also featured the pusher-and-rotator switch.

The usability test was meant to identify the prototype with the better usability, which would then be finalised, while the other prototype would be discarded. Furthermore, the test had to provide information on how to improve the better prototype in the next design development phase.
2 Theoretical Background

2.1 Definition of Usability

The definition of usability on which this validation study is based was taken from ISO 9241-11. The main advantage of ISO 9241 is that it is an international standard and therefore widely accepted. Furthermore, other definitions of usability (such as Nielsen's, see Nielsen 1993) seemed less adequate for a validation study, as it was suspected that their subordinate constructs might not be independent factors and would hence increase the preparatory effort to be undertaken. ISO 9241-11 defines usability as "the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use."

2.2 Measuring Usability

According to ISO 9241, effectiveness and efficiency are best measured by so-called objective data, meaning behavioural data such as error rates or the time needed to complete a task. These data can be collected during a standardised experiment. The measurement of satisfaction is more difficult, because satisfaction is the user's subjective reaction to the interaction with the product (ISO 9241). Hassenzahl (2004) states that user satisfaction is an emotion which results from the user comparing his expectations of the system to his actual experiences with it. Satisfaction can therefore only be measured by asking the user about his feelings towards the system. With regard to the argumentation given above, it was assumed that:

• The most valid measure or criterion for the effectiveness of use would be the number of tasks people were not able to finish by themselves.
• The most valid measure or criterion for the efficiency of use would be either the number of mistakes people made during the trial or the time they needed to complete all tasks.
• The most valid measure or criterion for the users' satisfaction with the interface would be either the result of a questionnaire, most probably a semantic differential, or a quantified item on their preference or choice of prototype after the test.

2.3 Selection and Purpose of the Questionnaires

After collecting information about the available psychometric instruments, it was decided to use two questionnaires within the study:
1. The ISOmetrics (Gediga & Hamborg, 1999), which is intended to measure usability using the set of seven dimensions for the design of dialogue systems defined in ISO 9241-10. It is a five-point Likert-scale questionnaire. As the experiment focused on the dialogues of the shower system, the ISOmetrics seemed adequate, and as it is based on the ISO standard, it was expected to fit well into the theoretical approach chosen. According to the definitions of criteria given above, the ISOmetrics was in this study not the main source of usability measures but rather an additional instrument whose validity was to be examined. It was planned to compare the questionnaire results with the criteria for effectiveness and efficiency and with the qualitative data collected during the test.

2. The second questionnaire, the AttrakDiff (Hassenzahl et al., 2003), is a seven-point semantic differential questionnaire intended to measure the attractiveness of a system to a user. Although Hassenzahl et al. (2003) do not directly state that the AttrakDiff questionnaire measures satisfaction, the construct of attractiveness seems to reflect quite well the whole range of expectations a user can have. Hence, this was the instrument selected for the measurement of satisfaction. Validating the AttrakDiff in this context was more difficult because there is hardly a better criterion for the users' emotions towards a technical system than their responses to an emotion-focused questionnaire. The only other criterion is the subsequent behaviour towards the system after the test: the motivation to carry on interacting with the system. This is reflected in a quantitative preference judgement, which was therefore selected as the criterion for the AttrakDiff.
3 Method

3.1 Sample

22 users (12 women, 10 men) participated in the study, each providing both quantitative and qualitative data for the validation project. Their mean age was 39.1 years (SD = 14.5 years). The sample consisted of 10 potential customers, 4 elderly users (60+, selected for their lack of experience with information technology), and 8 additional users from the Fraunhofer Institute.

3.2 Experimental Setting

The prototypes were simulated on a touch-screen monitor mounted in the wall of a trade-fair mock-up of a shower cabin. The test was done without water pouring from the showers, and the users wore normal clothing. Therefore, a video of the shower's functions was shown at the beginning of the test. Each participant tested both prototypes; the sequence was matched according to person characteristics. Each prototype test consisted of a set of tasks the participants had to complete and a questionnaire given after the completion of the task set. The experiment ended with final questions asking for a comparison between the tested devices.

It was ensured that every participant completed all tasks. Whenever a participant was unable to complete a task by himself, the experimenter provided the information
for the next step and placed a marker in the log file, indicating that help had been given. If the participant was able to continue by himself after receiving a hint, no further advice was given. Otherwise, all assistance needed to complete the task was rendered. Participants were instructed to complete each task as fast as possible, without thinking aloud or giving comments; this was intended to guarantee the reliability of the time measures. The test, including instructions and questionnaires, was conducted in German. Each test lasted between 90 and 120 minutes. All tests were conducted by the same instructor, using a written instruction. The first trials were supervised.

3.3 Variables Collected

For each participant's interaction with each prototype, the number of tasks was counted that he/she could not complete without the help of the test instructor (number of hints). For every task of each participant, the number of errors¹ committed was counted and the time to complete the task was measured, using an automatic logging technique. The questionnaire given to the participants after each of their two trials contained:

1. The ISOmetrics in a shortened version. Items that did not apply to shower controls had been deleted. The subscale "suitability for individualisation" had been removed entirely, as none of its items fit. This shortened version is referred to as ISOmetricsSDD (ISOmetrics for small display devices) in the text below.
2. The AttrakDiff in its full version.
3. Additional items, including
• one item to determine which of the two prototypes the user would prefer in the end and
• one item that asked the user to quantify the superiority of the preferred prototype on a five-point scale.

Qualitative data were taken from the participants' statements and comments during and after each task. All test sessions were videotaped in order to allow a thorough analysis of all the statements the users gave and all their actions, including errors that did not appear in the log files (e.g., touching the screen of prototype B).
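As an illustration of the kind of automatic logging described above, the following Python sketch shows one way such per-task measures could be derived from a log file. The log format, field names, and values are our own assumptions for illustration, not the format actually used in the study:

```python
from collections import Counter

# Hypothetical log lines: one entry per button push or instructor hint.
# "useful=0" marks a push that did not contribute to solving the task.
log_lines = [
    "task=3 t=12.4 event=push button=menu useful=0",
    "task=3 t=15.1 event=push button=ok useful=1",
    "task=3 t=20.9 event=HINT",
]

errors, hints = Counter(), Counter()
for line in log_lines:
    fields = dict(f.split("=", 1) for f in line.split())
    if fields["event"] == "HINT":
        hints[fields["task"]] += 1        # task completed only with help
    elif fields["event"] == "push" and fields["useful"] == "0":
        errors[fields["task"]] += 1       # counts towards the error measure

print(errors["3"], hints["3"])  # -> 1 1
```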
4 The Validation Procedure and Its Results

4.1 Reliability

As the instruments were not new but commonly used, no attention was paid to the factorial structure of the answers. The reliability of the results was calculated mainly to exclude a reliability problem that would render all validation attempts useless. Cronbach's α was chosen, as a correct factorial structure of the instruments had been assumed.

¹ "Errors" were all intended button pushes that did not contribute to the solution of the task. Due to the specifications of the log file, special exceptions were phrased to exclude, for example, unnecessary rotating of the pusher-and-rotator switch from the error count.
Table 1. Reliability estimation of the ISOmetricsSDD subscales, using Cronbach's α

Scale                               No. of items   Prototype A   Prototype B
Suitability for the task                 7             .70           .90
Self-descriptiveness                     4             .75           .87
Controllability                          4             .68           .81
Conformity with user expectations        5             .76           .78
Error tolerance                          3             .48           .74
Suitability for learning                 4             .79           .90
4.2 Content Validity

A survey with three usability experts from Fraunhofer IAO, conducted before the usability evaluation of the shower prototypes, did not yield any majority vote calling for the deletion or addition of a specific item or aspect to/from the ISOmetricsSDD. The lowest mean estimation of a subscale's validity was 82% (see Table 2). Additionally, it has to be noted that the interviewed specialists did not know the shower control prototypes and hence demanded the inclusion of items that would generally be useful but had no application in this study.

Table 2. Consolidated ratings of the content validity of the ISOmetricsSDD
                                    No. of evaluators     Mean estimation   Number of items   Number of
                                    requesting a change   of validity       to eliminate      aspects missing
Suitability for the task                     1                 83 %                0                 2
Self-descriptiveness                         2                 82 %                0                 1
Controllability                              1                 90 %                2                 2
Conformity with user expectations            0                 88 %                0                 1
Error tolerance                              0                 95 %                0                 0
Suitability for learning                     2                 85 %                1                 2
4.3 Criterion-Based Validity

Extreme-group validation. The ISOmetricsSDD questionnaire was clearly able to identify the "better" prototype, preferred by 20 of 22 participants. Prototype A yielded a significantly higher sum score (4.12, SD = 0.50) than prototype B (3.29, SD = 0.32); t(21) = 5.90, p < .001. Hence, using the ISOmetricsSDD would have led to the correct decision about which prototype to discard.
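Since both prototypes were rated by the same 22 participants, the comparison above is a paired t test (hence t(21)). A minimal SciPy sketch, using synthetic stand-ins for the study's actual per-participant sum scores:

```python
import numpy as np
from scipy import stats

# Synthetic stand-ins for the ISOmetricsSDD sum scores; the study reports
# M = 4.12 (SD = 0.50) for prototype A and M = 3.29 (SD = 0.32) for B.
rng = np.random.default_rng(0)
scores_a = rng.normal(4.12, 0.50, size=22)
scores_b = rng.normal(3.29, 0.32, size=22)

t, p = stats.ttest_rel(scores_a, scores_b)   # paired: same 22 participants
print(f"t({len(scores_a) - 1}) = {t:.2f}, p = {p:.4f}")
```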
Correlation of problem counts and subscale means. Another validation method applied here repeated a procedure that had already been used in a study which reported satisfying validity of the ISOmetrics questionnaire (Ollermann, 2004). A category system was created for all the usability problems encountered. Sources were the statements of the participants, the notes of the test instructor, and the log files. For each problem category, the number of occurrences was counted. Then each problem category (40 for prototype A and 31 for prototype B) was assigned to one ISOmetricsSDD subscale. Afterwards, for each subscale, the numbers of appearances of each assigned problem category were summed. This way, the whole sum of all usability problems encountered was split between the questionnaire subscales. Finally, the Pearson correlation between the number of problems and the mean score of the subscale was calculated for both prototypes.

The application of Ollermann's method yielded less promising results: for prototype A, the correlation between the usability problems encountered and the arithmetic mean scores of the ISOmetricsSDD subscales was r = -0.259 (N = 6, p = .310). For prototype B, this correlation was r = 0.020 (N = 6, p = .485).²

Correlation of ISOmetricsSDD and metric criteria. The Pearson correlation between the score differences of the ISOmetricsSDD and the differences in errors committed was statistically not significant, with r = -.11 (N = 22, p = .66). The Pearson correlation between the score differences of the ISOmetricsSDD and the differences in the time needed to complete all tasks was statistically not significant, with r = -.29 (N = 22, p = .19). The Pearson correlation between the (A-B) difference in the number of hints (the number of tasks that could only be completed with the instructor's help) and the score differences of the ISOmetricsSDD was r = .386 (N = 22, p = .076).

Correlation of AttrakDiff and the preference item. A single item was intended to serve as a criterion for the validity of the AttrakDiff questionnaire. The item asked the participant to describe the degree of superiority of the better prototype over the weaker one on a five-point Likert scale. This score was Pearson-correlated with the difference of the AttrakDiff sum scores (not-preferred prototype minus preferred prototype). The result was r = -.44, statistically significant with p = .04 (N = 22), which, due to the value coding, shows that those participants who perceived their favourite to be superior to a great extent also yielded a higher difference in the AttrakDiff sum scores, pointing in the same direction.

² N in this case is not the number of participants but the number of subscales used.
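A sketch of the subscale-level correlation reported in this section, with assumed problem counts and subscale means standing in for the study's data. Note that pearsonr returns a two-tailed p value, whereas the p values reported above appear to be one-tailed:

```python
from scipy import stats

# Assumed values for one prototype: usability problems assigned to each of the
# six ISOmetricsSDD subscales, and the corresponding subscale mean scores.
problem_counts = [12, 7, 9, 5, 3, 4]
subscale_means = [3.8, 4.1, 3.9, 4.3, 4.4, 4.2]

r, p_two_tailed = stats.pearsonr(problem_counts, subscale_means)
print(f"r = {r:.3f}, one-tailed p = {p_two_tailed / 2:.3f}")
```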
5 Discussion of the Results

The results of the survey among usability experts show that there are no severe problems concerning the content of the ISOmetricsSDD items. They apparently represent quite well the ISO definition of the different constructs describing the usability of dialogue systems. However, the correlation between the numbers of problems
assigned to each subscale and the mean scores of the subscales does not support these validity assumptions. Correlations with N = 6 should not be over-interpreted, and significance is not to be expected in every case. However, looking at the whole correlation matrix, one finds that the Pearson correlation between the ISOmetricsSDD scores of prototype A and prototype B was r = 0.416 (N = 6, p = .206) and that the problem counts of A and B correlated with r = 0.627 (N = 6, p = .091). This indicates that the data are not totally random: there are coherences between the ISOmetrics scores and between the problems found for both prototypes. So the question is: why do neither the sum scores of the subscales correlate with the problem counts, nor the entire sum scores correlate with the most objective measures of usability, user mistakes and time to complete? There is simply no match between the usability problems and the questionnaire results.

In Ollermann's study, the first correlation coefficient found was r = 0.277. As one subscale seemed to be responsible for this low result, that subscale was eliminated, raising the correlation to r = 0.756 (p = .019) (Ollermann, 2004). This procedure did not seem acceptable in the present study because, for the two prototypes, different subscales distorted the correlation. Even more disappointing were the correlations of the ISOmetricsSDD scores with the number of errors and with the time to complete.

The AttrakDiff questionnaire yielded promising results. Given that it was validated using just one item, resulting in a possibly low reliability of the criterion, a correlation of r = -.44 can be considered sufficiently high to indicate that the results of the questionnaire do more or less reflect the constructs named in the respective theory (see Hassenzahl et al., 2003).

As a consequence of the described findings, it was assumed that the ISOmetricsSDD instrument had in this case not been measuring the system's usability. What did it measure instead? It was presumed that the ISOmetricsSDD had failed because it tried to make usability experts out of the users. Even for the authors of the study, categorising the encountered usability problems into the questionnaire's subscales was a difficult task. Expecting a user to remember all the problems he encountered and to correctly map them to questionnaire items seems impossible, especially if the user is asked to do so after testing an unknown system for 90 minutes. Most probably, test participants rather rely on their general perception of the system, on the emotional substrate of their recent experiences. Two findings support this presumption:

1. The mean scores of the different subscales were quite similar. For prototype A, the standard deviation of those subscale means is SD = 0.15; for prototype B, it is SD = 0.32, which seems small for a five-point Likert scale. Ives, Olson and Baroudi (1983, as cited in Hartson et al., 2000) report that participants tend to fill in satisfaction questionnaires quite homogeneously. This might also apply to questionnaires like the ISOmetricsSDD.

2. The correlation between the differences of the AttrakDiff scores (A-B) and the differences of the ISOmetricsSDD scores (A-B) was r = 0.81 (N = 22, p < .001). This means that the ISOmetricsSDD closely tracked the emotional value that the participants attached to the system, closely linked to what is called "satisfaction".
6 Conclusions

6.1 Concerning the Findings of the Study

When confronted with a system for the first time, users are probably unable to remember the usability problems they encountered and to cluster them correctly, producing a valid score in all the subscales of a questionnaire like the ISOmetrics. Participants rather seem to use the instrument to convey their overall satisfaction with the system to the test instructor. Therefore, the use of questionnaires focusing on different categories of usability problems is not recommendable in certain test designs. According to the findings of this study, questionnaires like SUMI, QUIS, and ISOmetrics need to be used carefully.

6.2 Concerning Multi-level Validations

The aim of this article and the work described here is to encourage usability experts to evaluate their measurement instruments with a method similar to this multi-level approach. This approach, of course, has a downside, which is the small number of participants. In the case described above, only three usability experts were interviewed, only two prototypes were used, only 22 participants went through the evaluation process, and only six subscales of the ISOmetrics were taken into account. Furthermore, aspects like the assignment of the encountered usability problems to certain scales could always be questioned. Finally, it could be argued that the changes made to the questionnaire (e.g., the deletion of items) had a negative effect on the validity of the whole instrument. The results of such a study may hence seem less apt for publication than the results of large validation studies carried out with hundreds of participants.

The advantage of this method is that, without incurring unreasonable costs in money or time, it combines different forms of validation and collects information that is usually simply lost. Eventually, the question is whether a usability practitioner's primary interest is to win a scientific argument and publish results, or whether s/he just wants a hint on whether a certain tool is recommendable for the planned task. In the second case, the common perception of usability evaluation itself would also apply to the evaluation of the assessment tools: little and possibly unreliable information is better than none (see Nielsen, 1993). So if the assumption is true that not just the validity of a questionnaire in the literal sense, but more generally the value gained from its results, possibly depends on the product tested, the users, and other context parameters, then the method promoted here becomes recommendable.
References

1. ISO 9241, Ergonomics of human-system interaction. International Organization for Standardization (1998)
2. Gediga, G., Hamborg, K.-C.: IsoMetrics: Ein Verfahren zur Evaluation von Software nach ISO 9241/10. In: Holling, H., Gediga, G. (eds.) Evaluationsforschung, pp. 195–234. Hogrefe, Göttingen (1999)
3. Hamborg, K.-C.: Gestaltungsunterstützende Evaluation von Software: Zur Effektivität und Effizienz des IsoMetricsL Verfahrens. In: Herczeg, M., Prinz, W., Oberquelle, H. (eds.) Mensch & Computer 2002, pp. 303–312. B.G. Teubner, Stuttgart (2002)
4. Hartson, H.R., Andre, T.S., Williges, R.C.: Criteria for Evaluating Usability Evaluation Methods. International Journal of Human-Computer Interaction 13(4), 343–349 (2001)
5. Hassenzahl, M., Burmester, M., Koller, F.: AttrakDiff: Ein Fragebogen zur Messung wahrgenommener hedonischer und pragmatischer Qualität. In: Ziegler, J., Szwillus, G. (eds.) Mensch & Computer 2003, pp. 187–196. B.G. Teubner, Stuttgart (2003)
6. Hassenzahl, M.: Interaktive Produkte wahrnehmen, erleben, bewerten und gestalten. In: Thissen, F., Stephan, P.F. (eds.) Knowledge Media Design – Grundlagen und Perspektiven einer neuen Gestaltungsdisziplin. Oldenbourg Verlag, München (2004)
7. Nielsen, J.: Usability Engineering. Morgan Kaufmann, San Francisco (1993)
8. Ollermann, F.: Verhaltensbasierte Validierung von Usability-Fragebögen. In: Keil-Slawik, R., Selke, H., Szwillus, G. (eds.) Mensch & Computer 2004: Allgemeine Interaktion, pp. 55–64. Oldenbourg Verlag, München (2004)
What Do Users Really Do? Experience Sampling in the 21st Century

Gavin S. Lew

User Centric, Inc., 2 Trans Am Plaza Dr, Ste 100, Oakbrook Terrace, IL 60181, USA
[email protected]

Abstract. As practitioners we spend a great deal of effort designing and testing products within the confines of usability testing labs, when we know that a rich user experience lies outside. What is needed is more research in "the wild", where people use the very interfaces we take so much time to design, test, iterate, and develop. Through innovative advancements in mobile technology, we can expand upon tried-and-true "experience sampling" research techniques, such as diary or pager studies, to effectively solicit, monitor, and receive data on users' interactions at given points in time. This paper describes various research methodologies and recent advancements in mobile technology that can provide practitioners with improved research techniques to better assess the user experience of a product. The conference presentation will also include results from a pilot experience sampling method study focused on collecting data on the usage and satisfaction of a product.

Keywords: Experience sampling, in-situ research, mobile device research, pager study, diary study, mobile research, SMS studies.
1 Introduction

All too often, as designers and researchers, we spend a great deal of effort creating and testing products within the confines of our corporate offices or labs. Yet we know that there is a rich user experience that lies outside, where people use the very interfaces we take so much time to design, test, iterate, and develop. In order to create better user experiences, we need to better understand how users actually use the products we build. There are research techniques that can capture experiences that occur "in the wild". The user experience field needs to incorporate more of these techniques into the research and discovery phases to produce more insights that can foster more thought and design discussions. This paper addresses some of the reasons why traditional research techniques fall short and describes how experience sampling methods (ESMs) can be applied, using recent advances in mobile technologies, to capture how users actually use products. The conference presentation will also include results from a pilot ESM study focused on collecting data on the usage and satisfaction of a product.
2 Common Research Techniques

There are a number of common research techniques employed to understand the user experience of a product. These methods range in difficulty from easy to challenging, but each provides insight into different aspects of the user experience.

2.1 Usability Testing

Usability testing with users is a critical component of any user-centered design process. Traditional usability testing involves task-based research in the lab, where designs can be tested, iterated, and validated. Within the confines of this controlled environment, the methodology is ideally suited to assess usability in a highly tactical and specific manner. Outcomes include answers to specific design questions. Usability testing is critical to product success because we must ensure that the core features are usable.

However, the focus of usability testing on tasks is also a limitation, because the lens tends to target the "walk-up-and-use" user experience of the product. Session time is often limited, and the user experience typically does not involve a user interacting with a device that he/she actually owns. As practitioners and designers, we accept the lack of external validity because of the benefits of usability testing to formative and iterative design. We apply the insights uncovered in the lab to the design and hope that they generalize to how the product is actually used in the real world. However, we understand that the usability, usage, and usefulness of a product are determined over time and not necessarily in the first hour of use in the lab setting.

2.2 Surveys and Focus Groups

Often the data provided to describe the "real world" user experience is obtained through survey or focus group methodologies. While these research methods are quite useful in the early-stage development of feature importance, pricing, or intent to purchase, using this information for design is challenging. Results tend to be at a high level, and we often need more tactical direction to meaningfully influence design decisions. Even when these methods are directed toward answering design questions, the obtained data is largely retrospective in nature. We know that asking users to reflect on tasks done in the past is not as robust or credible as asking the same question during or immediately following the completion of the task. Satisfaction metrics can be obtained in surveys, but they would be much more useful when captured as close to the actual usage instance as possible (e.g., gathering satisfaction data after completing a task rather than asking in a focus group or survey months after the experience occurred). The benefit of a short latency between the action and the satisfaction request is more than simply measurement integrity: specific feature and functionality questions can be asked immediately after use to acquire more insightful and relevant feedback with direct impact on design.
2.3 Ethnographic Research

One method that avoids retrospection, and any associated confabulation due to the long latency between action and question, is ethnography. It involves observing user behaviors in a natural environment. However, there are obvious challenges that prevent its widespread use as a research technique. The setup and logistics necessary to observe natural behaviors are difficult (consider, for example, trying to observe mobile devices, where screens are small and interactions are very rapid). Fieldwork and analysis can be time-consuming. Sample sizes are often small. And most importantly, the likelihood that the output of the study will be actionable is low relative to more direct and tactical techniques such as usability testing. Because ethnography is best suited to uncovering insight that drives ideation rather than answering direct design questions, securing authorization and budget to conduct ethnographic research can be difficult. However, what cannot be refuted is that ethnographic research collects data in the environment where interactions occur and with products actually used by the users.

2.4 Longitudinal Research

Longitudinal research captures data from users over time. With its foundations in developmental psychology, this methodology has been largely observational in nature, using correlational analysis to assess phenomena. However, the longitudinal approach has applicability to user experience research. While usability testing can be seen as tapping the user experience just once, a study can be extended to make multiple, repeated assessments of the same set of users over time. The study could have users perform tasks and provide feedback; thus, learning can be an area of interest. Moreover, the methodology can assess how the user adapts to and uses the product during critical periods of its lifecycle.

Longitudinal research is compelling, as it often involves fieldwork in a naturalistic environment with the benefit of a more structured data collection technique. Questions, tasks, and observations can also be very design-focused and tactical. Moreover, it fills in the post-walk-up-and-use gap left open by a usability testing methodology. In short, longitudinal research offers access to the daily user experience of a product.

Consider a mobile phone. Usability testing can assess the usability of core functions, such as the ability to add a contact, or determine whether there is sufficient affordance to use a specific keypad button to complete tasks. The problem is that when usability issues are uncovered, it is impossible to know whether a feature that was difficult in usability testing can be learned and become second nature over time, or will be left unused because users could not learn it. Information about how users interact with products over time is thus extremely valuable. Longitudinal methods can provide information about a product in the hands of users. Because assessments can be made over time, the technique can capture how the user learns to use the product.

Given the potential of longitudinal research, why is it NOT widely used? At the 2007 CHI (ACM-SIGCHI) conference in San Jose, a new special interest group (SIG) on longitudinal research was formed. What was most interesting is that only 25% of the attendees of this SIG had actually conducted a longitudinal research study in the preceding couple of years. Possible reasons why longitudinal research is rare include:
A. Long timelines: The business challenge of a research project where data collection is stretched over time makes longitudinal research compete with "just in time" or "we need the data last week" research alternatives.
B. Cost: Building a user panel where users are tapped for an extended period has a high cost and high panel attrition. Since timelines can extend across multiple product releases, with benefits to different business groups, it is unclear which group should be charged for the study. Securing funding is inherently more difficult.
C. Complex logistics: Study design and execution have a high initial setup cost because every aspect of the study must be coordinated. Any repeated-measures technique will require allocating resources to manage the study activities for an extended period of time.
D. High effort: Data collection requires high effort from both researchers and users, who must participate across multiple data sessions. Alternatively, data come in the form of written diaries, where the coding process is non-trivial.
E. Difficult analysis: Analyzing the large amount of data collected can be time-consuming, as data are essentially multiplied by the number of repeated measures.

2.5 Need for an Alternative Method

If usability testing captures walk-up-and-use usability, ethnographic research gets us into the field, and longitudinal research can reveal how users learn, what still seems to be lacking is insight into usage and motivation. Consider the mobile phone example again. Manufacturers and mobile service providers know that a call was made and how long it lasted. What is unknown, however, is whether the user called "John" from their contacts or dialed the number directly. In terms of designing features, researchers and designers are blind as to whether the user ever entered John into their contact list or what motivates the user to even use the feature. All too often, once launched, the product becomes a mysterious "black box", and we do not know how users use the product or feature that took so much effort to design.
3 Experience Sampling Method

Experience sampling method (ESM) refers to in-situ (Latin for "in place") research, where a phenomenon is examined in the place where it occurs. The methodology was developed in 1977 at the University of Chicago by Csikszentmihalyi, Larson and Prescott [1] to understand the experience of adolescent activities, but its applicability to other areas of user experience is clear. ESM is more commonly referred to as a "pager study", where users are asked to provide information via a diary. Users are prompted to enter information by a "page" sent to a device (e.g., "What are you doing now?"), and participants enter data into a paper diary. Prompting can either be controlled by a researcher or scheduled at specific intervals. The data can be analyzed to understand user activity, motivation, and other cognitive and social dimensions. This methodology can be used to assess how users use products.

3.1 ESM Coupled with Advanced Mobile Technologies

It would be great if the product could tell us how it is being used, but that is not necessarily practical, nor does it provide the rich user experience as interpreted and
provided by users. Imagine if a technology could retain the tactical and rigorous elements of "in-lab" research while capturing the richness and environmental cues associated with more natural settings. What if the satisfaction data were not retrospective, but closely tied to user behavior and actions? Through innovative advancements in mobile technology, researchers can now expand upon longitudinal and experience sampling research techniques to effectively solicit, monitor, and receive data on users' interactions at given points in time. These advancements tap directly into both the application and the operating system to provide the building blocks to take user experience research to new levels.

3.2 Using Mobile Technology to Capture Data

Mobile device technology has advanced to a level where research can be more complex than simply paging users to ask them to write passages in a diary. The mobile device itself can be the conduit between the user and the researcher. Imagine what research areas would open up if practitioners could conduct studies on a robust platform that prompts the user, collects data both from the user and from the device itself, and handles logistics (e.g., compensation). Moreover, what if the device is the participant's own personal mobile device?

With full QWERTY keyboards on mobile phones, one can readily imagine feedback in the form of free-form text responses. Considering the abilities of the youth of today, who can type 40 words per minute using a 12-key numeric keypad, the tremendous data collection benefit of a phone over diary input is easy to envision. In addition, the device can be leveraged as a powerful remote data collection tool, where the areas under investigation can be anywhere a user could go with their mobile device at their side. This opens up novel forms of research never before possible without specialized equipment designed specifically for the study. Using mobile devices, user input and feedback extend beyond making a simple selection or answering a series of questions. Users could speak their response and have it recorded. They could also respond by taking a picture or recording a video of their experience. The remote capabilities of a mobile device as a research tool create a wealth of research opportunities.

LEOtrace Mobile™ is a mobile technology that uses ESM to obtain data [2]. It runs on Windows Mobile 6, Symbian Series 60, and RIM BlackBerry devices. The user input and device information that can be collected are shown in Table 1.

Table 1. Types of data that can be collected from ESM using LEOtrace Mobile™

User-provided data:
A. Open-ended feedback
B. Scaled feedback (binary, Likert-scale, slider ratings)
C. Image selection
D. Voice recording
E. Camera image
F. Video clip

Device-provided data:
A. Task completion (success/fail)
B. Event (app start/end, SMS sent, picture taken, etc.)
3.3 Event or Behavior Triggers

Research using new mobile technologies can be further enhanced by analyzing user behaviors and feature usage to trigger prompts for user feedback. In this case, the user's own actions are of interest, and the behavior itself prompts the device to ask specific questions about the behavior captured. This differs from contrived tasks set up by a researcher for the user to complete. Algorithms can be designed to watch for specific situations to occur that trigger research questions, so feedback can be obtained very close to when the behavior happened.

3.4 Other Mobile Technologies

This paper describes various research methodologies and recent advancements in mobile technology that can provide practitioners with improved research techniques to better assess the user experience of a product. Besides LEOtrace Mobile™, there are several other technologies available, ranging from those that sit on old Palm Pilots to those that run on the latest mobile devices, and from techniques involving simple SMS text messages asking for feedback to web surveys solicited via phone-based email or messaging; there are many mobile technologies that can be used to solicit data from users. As practitioners, the potential of remotely capturing user interactions in an ecologically valid manner, while extending beyond walk-up-and-use usability, is compelling. Experience sampling techniques can further our design practice by yielding more insight into user motivation, usage, and learning. The implications for future research are vast, given the capability to more efficiently and remotely monitor user behavior and perception "as it happens."
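To make the two prompting regimes concrete, here is a minimal Python sketch of a signal-contingent scheduler and an event-contingent trigger. The event names, question wording, and prompt API are hypothetical illustrations, not part of LEOtrace Mobile™ or any other product mentioned here:

```python
import random
import time

def prompt(question: str) -> None:
    """Stand-in for delivering a question to the participant's device."""
    print(f"[{time.strftime('%H:%M:%S')}] PROMPT: {question}")

def random_prompt_delays(window_s: int = 3600, prompts: int = 2) -> list:
    """Signal-contingent sampling: random prompt times within a fixed window."""
    return sorted(random.uniform(0, window_s) for _ in range(prompts))

def on_device_event(event: dict) -> None:
    """Event-contingent trigger: ask about a watched behavior right after it occurs."""
    if event.get("type") == "sms_sent":
        prompt("You just sent a text message. How satisfied were you? (1-5)")

# Example usage: schedule two prompts for the next hour, then simulate an event.
print(random_prompt_delays())
on_device_event({"type": "sms_sent"})
```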
4 ESM Study Findings

The conference presentation will include findings from an ESM study. Device usage and satisfaction data will be presented from a four-week study with a participant sample size of 100. Participants will use mobile devices they presently own. Software will be loaded on the devices to passively monitor usage. Users will also be asked to perform specific tasks. Success and failure will be reported, together with user feedback on their experience and satisfaction using device features.

Acknowledgments. Many thanks to the development team at Nurago (www.nurago.com) for developing the LEOtrace Mobile™ software used in this research. The user experience research teams at both User Centric, Inc. (www.usercentric.com) and SirValUse Consulting GmbH (www.sirvaluse.com) deserve credit, as their insight was essential to the research approach and execution of this study.
References

1. Csikszentmihalyi, M., Larson, R., Prescott, S.: The ecology of adolescent activity and experience. Journal of Youth and Adolescence 6, 281–294 (1977)
2. Lew, G.S.: The truth is out there: Using mobile technology for experience sampling. User Experience 7(3), 8–10 (2008)
Evaluating Usability-Supporting Architecture Patterns: Reactions from Usability Professionals

Edgardo Luzcando, Davide Bolchini, and Anthony Faiola

Indiana University Purdue University Indianapolis, School of Informatics - HCI
{eluzcand,dbolchin,afaiola}@iupui.edu
Abstract. Usability professionals and software engineers approach software design differently, which creates a communication gap that hinders effective usability design discussions. An online survey was conducted to evaluate how usability professionals react to Usability-Supporting Architecture Patterns (USAPs) as a potential way to bridge this gap. Members of the Usability Professionals Association (UPA) participated in a pretest-posttest control group design experiment in which they answered questions about USAPs and software design. Results suggest that participants perceived USAPs as useful for accounting for usability in software architectures, recognizing the importance of the USAPs' stated usability benefits. Additionally, results showed a difference in the perception of the USAPs' stated usability benefits between US and European participants. A better understanding of what the usability community thinks about USAPs can lead to their improvement as well as to increased adoption by software engineers, which can lead to better integration of usability and HCI principles into software design.

Keywords: Architecture Patterns, HCI, Usability, Usability Professionals, Software Design, USAP.
1 Introduction and Motivation

The development of modern interactive applications entails the necessity of smooth coordination and cooperation between software engineers (who conceive, design, and develop the technological infrastructure for the system-to-be) and usability professionals (who conceive, design, and develop the elements of the user experience). Due to cultural and historical factors, the tools mastered by software engineers and usability professionals are different and represent their own fields in isolation, thus undermining mutual understanding and ultimately posing obstacles to the efficient accomplishment of the goals of a project [1-3].

In particular, one of the common communication breakdowns between software engineers and usability professionals is the lack of strategies to inform the early design of software architectures with usability principles, which would help avoid late (and expensive) architectural changes to accommodate user experience requirements [4]. As a consequence, user requirements are typically added on top of already developed software architectures, and are thus constrained and locked in early by system-centered design decisions [5]. Edwards recently warned that we have been successful at
"covering up ill-suited infrastructure features with interface veneer, but there are limits to how far this can take us" [6]. He argues that infrastructure and interaction features need to be jointly designed, not combined ad hoc. To address this challenge, Usability-Supporting Architecture Patterns (USAPs) have recently been proposed as a strategy to systematically embed usability requirements in the early design of software architectures [5]. USAPs are a blend of Human-Computer Interaction (HCI) and Software Engineering (SE) principles that provide a framework to design for recurrent software and user requirements (e.g., providing the user a way to undo operations). USAPs are enriched with indications of how these requirements may impact the components of the system architecture, and with examples of how to deal with them at this level.

The foreseen benefits of leveraging USAPs in software design are many, including: (a) the opportunity for software engineers to consider and take into account the needs of the user experience when making strategic architectural decisions; (b) a shared language between usability professionals and software engineers to discuss design decisions in the light of both system and user requirements; and (c) reusable solutions (patterns) which capitalize on previous design expertise. Initial studies suggest that USAPs are effective when applied by software engineers [7]. However, little is known about the understanding and acceptance of USAPs by usability professionals [8, 9].

Acknowledging the proposal of USAPs as an important step towards bridging the communication gap between software engineers and usability professionals, we conducted a study aimed at assessing the perceived value of USAPs among the community of usability professionals. There is the risk, in fact, that the original value of USAPs (improving mutual understanding) may be weakened amongst usability professionals by the way USAPs are proposed and described: still using concepts, terminology, and notation familiar only to software engineers. The study consisted of a focused online survey administered to usability professionals and was based on the following multi-part hypothesis:

H.1 - Usability professionals can perceive Usability-Supporting Architecture Patterns as relevant to their everyday work.
H.2 - Usability professionals consider the usability benefits of Usability-Supporting Architecture Patterns important for their everyday work.
H.3 - If Usability-Supporting Architecture Patterns are communicated in more natural HCI terminology to usability professionals, they can better appreciate the value of Usability-Supporting Architecture Patterns in their everyday work.

The remainder of the paper is organized as follows. Section 2 describes the methods and instrument used to conduct the experiment. Section 3 presents the qualitative and quantitative results. Section 4 covers the discussion of the findings, and Section 5 summarizes the paper with concluding statements.
2 Methods and Instruments

2.1 Participants

This study surveyed a convenience sample of usability professionals from the Indianapolis Usability Professionals Association (UPA) and the Swiss UPA. The sample included approximately 80 participants who have academic training in HCI, HCI
professional experience, or both. The study did not differentiate between professionals and students, but it was expected that most participants would have some degree of professional experience in HCI or related fields, given their involvement with the UPA.

2.2 Survey Design

The study is based on a mixed-methods research design to analyze an area where little research has been conducted, following a Concurrent Triangulation Strategy [10], with the quantitative data given higher priority during the analysis. The quantitative portion of the experiment used a Pretest-Posttest Control Group Design [11] with a classic between-subjects design, where participants were randomly assigned to either of two groups during the data collection phase. Participants in the experiment group received a treatment in the form of specific USAP materials, consisting of a software design scenario and a USAP example. Participants in the control group did not receive the treatment. A questionnaire format was used for the pretest as well as the posttest, including both quantitative and qualitative questions. Demographic information was solicited after the questionnaire, in addition to the opportunity to provide additional comments. The survey questions were created leveraging survey design techniques from Dillman [12], and several questions were constructed based on previous questions from Schuman and Presser [13] used to survey attitudes. The online survey was constructed from scratch with a combination of PHP and MySQL technologies available at IUPUI. All data were collected and stored in university infrastructure.

2.3 Procedure

The survey introduction provided a brief history of the desire to improve usability in software products. Participants were then given pretest questions to record their existing knowledge and experience. Following the pretest, the treatment introduced USAPs (to the experiment group) and explained how leveraging USAPs could facilitate the communication between usability professionals and software engineers. The treatment provided a software design scenario (canceling a command) describing the communication challenges regarding usability in software design, and presented a USAP example. During the posttest, participants were asked to rate the importance of USAP usability benefits from an HCI perspective using a Likert scale. This was done with the nine original USAP usability benefits as well as with a newly worded set of usability benefits, meant to find out whether different terminology would improve acceptance.

Table 1. USAP Usability Benefit Comparison

Design   Usability Benefit Original Wording   Usability Benefit New Wording
  1      Accelerates error-free portion       Increases Efficiency
  2      Reduces impact of slips              Reduce the impact of errors
Although all nine USAP usability benefits were rated, the study focused on two USAP usability benefits, Accelerates error-free portion and Reduces the impact of slips, as shown in Table 1. An initial pilot study and conversations with HCI peers suggested that these two used terminology that was confusing to usability professionals. Additional posttest questions explored further perceptions about USAPs and software design, asking participants to state their opinions about USAPs and their potential applications in practice. The survey was designed to flow as one continuous questionnaire, in which participants were unaware of the difference between the pretest and posttest questions.
3 Results

From the convenience sample of 80 usability professionals, 67 participants began the survey, 49 completed the pretest, and 45 completed the posttest. Of the 45 participants that completed both the pretest and the posttest, only the results of 35 participants were complete; these are summarized in this section. There were 17 participants in the experiment group and 18 in the control group; 15 were from the Swiss UPA (Region 1) and 20 from the Indianapolis UPA (Region 2). Of the 34 participants that provided demographic information, 20 had a master's, doctorate, or post-graduate degree, 12 had a bachelor's degree, and 2 did not have any degree. Of these, 25 reported six or more years of experience.

When asked to what extent they agreed that usability is an important aspect of software design, all 35 participants agreed, and when asked if they had worked in close contact with software engineers, 28 of 35 participants agreed. When asked to what extent they agreed that USAPs would assist usability professionals in identifying usability concerns that impact the architecture of a software system, 23 of 35 participants agreed (66%). When asked if they found it challenging to apply usability principles in software design projects, 30 of 35 participants answered yes, and when asked if there is a communication gap between usability professionals and software engineers, 33 of 35 participants answered yes. Additionally, participants volunteered comments about the existence of a communication gap between usability professionals and software engineers, as summarized in Table 2.

When participants were asked if they were familiar with any methodologies that would improve communication between usability professionals and software engineers, 21 of 35 answered yes (60%). Those participants who answered yes were asked to list the known methodologies to substantiate their quantitative answer; their responses are summarized in Table 3.

Participants were asked to rate the importance of the original USAP usability benefits as well as the newly worded versions using the following scale: Very Important = 1, Important = 2, Somewhat Important = 3, Not Important = 4, and Don't Know = 5. The Don't Know answers were filtered out. The results are summarized in Fig. 1 using the following weighted average: Very Important = 16, Important = 12, Somewhat Important = 8, and Not Important = 4.
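A minimal sketch of how such a weighted average can be computed for one benefit. The weights follow the mapping above (equivalently, 20 - 4 x rating); the sample responses and the per-benefit averaging are our own assumptions for illustration:

```python
# 1 = Very Important ... 4 = Not Important; 5 = Don't Know (dropped).
ratings = [1, 1, 2, 3, 2, 1, 4, 2, 5, 2]   # hypothetical answers for one benefit

kept = [r for r in ratings if r != 5]      # filter out the Don't Know answers
weights = [20 - 4 * r for r in kept]       # 1 -> 16, 2 -> 12, 3 -> 8, 4 -> 4
print(sum(weights) / len(weights))         # weighted average as plotted in Fig. 1
```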
Table 2. Identified Reasons for the Communication Gap between Groups¹

Answer   Identified Issue                                                           Participants
Yes      Knowledge: software engineers only know software development and
         usability professionals only know usability; they don't know each
         other's disciplines.                                                            5
Yes      Core focus in project: software engineers focus on getting all system
         parts to work, and usability professionals only focus on system parts
         that impact the user interface.                                                 7
Yes      Mutual understanding: both groups struggle to understand each other's
         needs.                                                                          4
Yes      Awareness: software engineers have not been exposed to usability and
         usability professionals have not been exposed to software engineering.          2
Yes      Process: the software design process may or may not include usability.          1
Yes      Availability of usability people: not all projects benefit from the
         participation of usability professionals.                                       2
Yes      Stated there is a gap, but did not elaborate on the reason.                     2
No       No gap.                                                                         1
Table 3. Reported Methods to Improve Communication2 Listed Methods MILE+ Open communications (e.g. meetings, workshops) AWARE HCI-driven methodologies Using prototypes and mockups Software development methodologies Conceptual Comics
Participants 2 10 1 1 3 6 1
An independent-groups t test was used to test the difference in the mean rated importance of the target USAP usability benefits Accelerates error-free portion and Reduces impact of slips. Respondents from Region 2 (M = 1.76) showed a lower mean response (i.e., rated the benefits as more important) than those from Region 1 (M = 2.29), t(30) = 2.09, p < .05, r = .36. The rating of USAP usability benefits also collected qualitative data by asking participants to comment on any USAP usability benefits that were not clear to them. The targeted USAP usability benefits Accelerates error-free portion and Reduces impact of slips received the most comments, mostly about ambiguous meaning and unfamiliar language. The other (non-targeted) USAP usability benefits did not receive similar comments.
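For readers who want to reproduce this kind of analysis, the following minimal sketch computes an independent-groups t test and derives the effect size r from t and the degrees of freedom. The two rating vectors are hypothetical placeholders, not the study's data.

```python
# Sketch of the independent-groups t test and effect size reported above.
# The rating lists are hypothetical placeholders; the study reports
# t(30) = 2.09, p < .05, r = .36 for Region 1 vs. Region 2.
import math
from scipy import stats

region1 = [2, 3, 2, 2, 3, 2, 1, 3, 2, 3, 2, 2, 2, 3, 2, 2]  # placeholder
region2 = [2, 1, 2, 2, 1, 2, 2, 1, 2, 2, 1, 2, 2, 1, 2, 3]  # placeholder

t, p = stats.ttest_ind(region1, region2)
df = len(region1) + len(region2) - 2
r = math.sqrt(t ** 2 / (t ** 2 + df))  # effect size derived from t and df
print(f"t({df}) = {t:.2f}, p = {p:.3f}, r = {r:.2f}")
```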
Fig. 1. USAP Usability Benefits Ratings
When asked whether leveraging USAPs would be useful for their software design activities, 24 of 35 participants agreed. However, there was a directional difference between the control group and the experiment group: of the 24 who agreed, 15 were from the control group and 9 were from the experiment group. The number of participants selecting the no-opinion choice rose from 0 in the control group to 6 in the experiment group. When asked if there is a communication gap between usability professionals and software engineers, 29 of 35 participants agreed. When participants were asked how likely it would be for them to learn more about USAPs after completing the survey, 25 of 35 agreed.
4 Discussion

H.1 predicted that usability professionals expect to benefit from Usability-Supporting Architecture Patterns in their everyday work. During the pretest, 66% of the participants agreed that USAPs could help usability professionals identify usability concerns that impact the architecture of a software system. However, it is unclear why 66% agreed, because no participants reported a priori knowledge of USAPs, and of the 60% who reported knowing methodologies to improve this gap, none reported USAPs. One possible explanation is that the terms "usability-supporting" and "architecture patterns" could lead to an implicit belief that USAPs are beneficial. In the posttest, 68% of the participants reported USAPs as
useful for software design activities based on what they had learned in the survey. However, agreement was directionally different between the control group (62%) and the experiment group (38%). This difference could stem from the participants' comfort in selecting the no-opinion choice, and the selection of that choice could itself be an effect of receiving the treatment. It is possible that after participants received the treatment and were exposed to the USAP scenario, they did not understand its purpose or were confused by the presentation of the materials. For example, the USAP scenario of canceling a command may not have applied easily to their experience, and therefore did not add clarity about the usefulness of USAPs. Conversely, it is possible that participants who did not receive the treatment and did not see the USAP materials were able to imagine (or construct) their own idea of what USAPs are, which in their view might be more effective than the actual USAPs. However, no treatment effect was found on the difference between pretest and posttest responses (p > .10).

H.2 predicted that usability professionals can perceive the importance of using Usability-Supporting Architecture Patterns in their everyday work. During the pretest, 100% of the participants acknowledged that usability is an important aspect of software design, and 86% acknowledged that they had previously found it challenging to apply usability principles in a software design project. This suggests that participants understood the importance of usability in software design and the challenges of applying usability principles therein. Hence, the fact that 71% of participants responded that they would likely investigate USAPs further and learn more about them is a potential indication of their usefulness. However, it is possible that the perceived importance of USAPs results from recognizing that any technique to improve usability is innately important to usability professionals. This study did not analyze this further.

H.3 predicted that if Usability-Supporting Architecture Patterns are communicated to usability professionals in more natural HCI terminology, they can better appreciate the value of Usability-Supporting Architecture Patterns in their everyday work. We predicted that participants who received the treatment would rate USAP usability benefits as more important, since they had (in the treatment) been exposed to a positive introduction of USAP usability benefits and their potential use in software design. The effect of the treatment on the ratings was non-significant (p > .10). When contrasting the control group with the experiment group, the targeted USAP usability benefits Accelerates error-free portion and Reduces impact of slips exhibited an 18% lower importance rating than their newly worded counterparts Increases efficiency and Reduces the impact of errors. However, no significant effect was found for the treatment (p = 0.63).

An unexpected yet interesting result of the experiment was that participants in Region 1 (Europe) responded differently than those in Region 2 (US) when rating the importance of the target USAP usability benefits Accelerates error-free portion and Reduces impact of slips.
US usability professionals rated the target USAP usability benefits as more important than European usability professionals did, a potential indication that USAPs are more difficult for European usability professionals to understand than for US usability professionals.
5 Conclusion

This study suggests that usability professionals' initial perception of USAPs is positive. Participants agreed that USAPs are relevant for considering usability concerns in software design, and usability professionals recognize that there is a communication gap with software engineers. However, exposure to USAP materials did not conclusively affect participants' perception of USAPs. The study suggests that usability professionals generally accept the notion of USAPs without understanding USAP details. This effect was more prominent for US participants in the study than for their European counterparts. More studies would be needed to evaluate additional characteristics of USAPs and their potential acceptance by usability professionals.
Acknowledgments

Thanks to Dr. Mark Pfaff for his guidance in conducting the statistical analysis for several parts of this study.
Heuristic Evaluations of Bioinformatics Tools: A Development Case

Barbara Mirel and Zach Wright
University of Michigan
{bmirel,zwright}@umich.edu
Abstract. Heuristic evaluations are an efficient, low-cost method for identifying usability problems in a biomedical research tool. Combining the results of these evaluations with findings from user models based on biomedical scientists' research methods guided and prioritized the design and development of these tools and resulted in improved usability. Incorporating heuristic evaluations and user models into the larger organizational practice led to increased awareness of usability across disciplines. Keywords: Usability, heuristic evaluation, biomedical research, organizational learning, user models.
1 Introduction

Assuring usefulness and usability, a perennial challenge in any software project, is particularly tricky in bioinformatics research and development contexts. Our NIH-funded bioinformatics center produces tools for systems biology analysis. The databases and tools enable biomedical researchers to interactively analyze genomic-level data for the purpose of uncovering systemic functional roles that candidate genes/gene products may play in susceptibility to a disease. Ensuring the usability of these tools is a challenge because we are not a software shop and must optimize the combination of academic and implementation specialties that we have available. The discount usability inspection method of heuristic evaluation is therefore highly attractive. We recognize that heuristic evaluations (HE) alone, i.e., the process of scoring tools for their concordance with usability standards, are insufficient for detecting and generating improvements for significant usability and usefulness advances [8]. Therefore, we integrate heuristic evaluations with three processes known to enhance their effectiveness: (1) evaluators are familiar with the tools and with users' query and analysis tasks; (2) heuristics (i.e., the usability principles against which tools are judged) are adapted to the domain and tasks specific to the tools; and (3) heuristics and interpretations of findings are informed by user models of researchers' analytical performances and goal-driven cognition [6]. Additionally, we recognize that assessments of our web-based bioinformatics tools must account for support of more complex explorations than the user interfaces (UI)/web pages originally targeted by usability inspection methods. Toward this end, we combine heuristic evaluations with research, development, and other organizational processes. This integration facilitates our ability to distinguish real
problems in the results, set priorities for fixes, and raise developers’ awareness of user needs beyond surface fixes to better build for users’ cognition in scientific analysis. Our outcomes have been positive. We argue that for our bioinformatics tools, positive results hinge on combining domain-based, user-informed heuristic evaluations with organizational processes that break down boundaries isolating usability from development, modification request decisions, and UI design.
2 Relevant Research

Heuristic evaluations involve "evaluators inspect[ing] a user interface against a guideline, be it composed of usability heuristics or cognitive engineering principles, in order to identify usability problems that violate any items in the guideline" [8]. This method is known to produce many false positives and to likely omit problems related to users' cognitive tasks. It is nonetheless one of the most popular usability assessment methods due to its low cost and efficiency [2]. Thus it is important to improve the effectiveness of HEs without diminishing their benefits.

Researchers have found several ways to achieve such improvements. These include conducting heuristic evaluations with many evaluators and combining them with evaluator training and reliability testing [10,12]. Heuristic evaluation results also improve when evaluators have prior knowledge of usability and the tool; when heuristics are adapted to domain tasks and knowledge; and when HE findings are compared with results from user performance studies [3]. Finally, improvements come from using sets of heuristics that are "minimal" (not overlapping) yet inclusive [10]. For example, some researchers have evaluators jointly consider heuristics and problem areas, thereby assessing against a "usability problem profile" [2]. Establishing an optimal set of heuristics, however, is still a black box.

To compensate for elusive "ideal heuristics," many usability researchers advocate integrating findings from user performance studies with HE. Demonstrably, heuristic and user performance evaluations combined uncover more problems than either method does alone. Yet quality, not just quantity, of problems is critical. For better quality, some researchers claim that what is missing in Nielsen's standard set of heuristics is that they are not "related to cognitive models of users when they interact with system interfaces" [8]. Cognitively oriented heuristics are especially important when tools support complex tasks. Recent attempts to construct heuristics that address cognition include Gerhardt-Powals' [5] cognitive engineering principles and Frokjaer and Hornbaek's [4] principles based on metaphors of thinking. So far, findings about the superiority of such heuristics have been mixed [4,8].

Running in parallel with these academic efforts, some studies by specialists in production contexts aim to improve the effectiveness of HEs by advantageously combining them with organizational processes. Hollinger [7], for example, reports on positive efforts at Oracle (against great organizational resistance at first) to combine bug reporting processes with HE findings, thereby "mainstreaming" reviews of outcomes. This mainstreaming increased usability awareness across different teams and functional specialties, incited interactive team discussions about usability, initiated tracking of the costs and benefits of usability improvements, and resulted in fixing more usability defects. Moreover, results included "significant improvements in the quality of the user interface" [7].
Exploiting organizational processes is promising, but to the best of our knowledge few production-context studies report on combining HE with even more organizational processes than Hollinger [7] describes, or on combining organizational processes with the established methods of improving HE outcomes: comparing them with usability performance findings, assuring evaluator familiarity with the tools, and adapting heuristics to the task domain.
3 Methods

Our methods are tied to achieving the same effectiveness with HE that other researchers seek by combining them with other factors. Due to resource constraints, we could not conduct extensive training of evaluators or involve numerous evaluators. We could, however, familiarize several evaluators with the tools, adapt and pilot-test heuristics for our domain and tools, and introduce several new organizational processes. We also introduced the novel process of reframing surface problems found by HEs into more substantial problems based on user models.

3.1 Tools

We report on heuristic evaluations of one open source, web-based query and analysis tool. The tool is the front end for querying our center's innovatively integrated protein interaction database. The query and analysis tool lets users query by gene(s), keyword(s), or gene list and provides tabular query results of relevant genes, attributes, and interactions. The tool is non-visual but links to visualization tools.

3.2 User Task Models

User models were derived from longitudinal field studies of 15 biomedical researchers using our tools and others to conduct their systems biology analysis [9]. These models directed both our adaptations and our interpretations of heuristic evaluations. The user models are unique in bioinformatics because they capture scientists' higher-order cognitive and analytical flow for research aimed at hypothesizing, not only the lower-level tasks typically studied in usability tests, cognitive walkthroughs, or cognitive task analysis. Specifically, the user models capture moves and strategies for verifying accuracy, relevance, and completeness and for uncovering previously unknown relationships of interest. These tasks involve manipulating data to turn it into knowledge through task-specific combinations of sorting, selecting, filtering, drilling down to detail, and navigating through links to external knowledge bases and literature. Additionally, to judge whether genes and interactions are interesting and credible, scientists analyze high-dimensional relationships and seek contextual cues from which to draw explanatory inferences. Ultimately, they examine conditions and causes in interactive visualizations, tools outside the scope of this article. This empirically derived model of higher-order cognition was critical to adapting standard Nielsen heuristics to our domain and tool.
3.3 Adapted Heuristics

We adapted Nielsen's standard set of 10 usability heuristics to our domain and to the uses of our tools, accounting for: the presence of external links to multiple data sources and internal links to details; the large amounts of heterogeneous data in result sets; the core need for statistics and surrogates for confidence; and the variety of interactions needed for validating, sensemaking, and judging results.

3.4 Heuristic Evaluations and Evaluators

Three evaluators pilot-tested the adapted heuristics with other query and analysis tools developed by our center to refine their applicability to the domain and users' tasks. One evaluator is trained in usability and visualizations; the other two evaluators specialize, respectively, in portal architecture and systems engineering, and in web administration and marketing communications. All were knowledgeable about the tools and moderately aware of users' tasks and actual practices through discussions with the usability director about field study findings. No reliability testing was done due to time constraints. Instead, inter-evaluator differences were analyzed by examining comments entered in the comments field of the instrument. After the heuristic evaluations were conducted, outcomes and comments were summarized and grouped by agreement and severity, and relevant design changes were suggested.

3.5 Integration of Additional Processes

Concurrent with the heuristic evaluations, the following organizational and software development life cycle processes were instituted with enhanced usability in mind:

• Usability categories and severity levels were built into the modification request (MR) system. The levels were: minor, serious, major, critical, and failure, and they were coordinated with a newly instituted Technical Difficulty ranking (a minimal sketch of such an MR record follows this list).
• Operational processes were put into place for turning MRs into long-term development priorities and for raising awareness of user models and their requirements. Our processes included forming a new priority-setting committee composed of the directors of computer science, life sciences, and usability along with the lead developer and project manager.
• Informal and highly collaborative processes between developers, web designers, usability evaluators, and scientists were implemented to assure rapid prototyping and feedback.
• A research project was initiated into design requirements based on heuristic evaluation findings and user models.
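To illustrate the first item, the following minimal sketch shows what an MR record carrying the usability severity levels and the Technical Difficulty ranking might look like. All names, fields, and the priority heuristic are illustrative assumptions, not the center's actual MR schema.

```python
# Illustrative sketch of an MR record with the usability severity levels
# and Technical Difficulty ranking described above. Field names, values,
# and the priority heuristic are assumptions, not the actual MR system.
from dataclasses import dataclass
from enum import Enum

class Severity(Enum):
    MINOR = 1
    SERIOUS = 2
    MAJOR = 3
    CRITICAL = 4
    FAILURE = 5

@dataclass
class ModificationRequest:
    mr_id: int
    description: str
    usability_category: str    # e.g. "search/results", "interaction"
    severity: Severity
    technical_difficulty: int  # hypothetical 1 (easy) .. 5 (hard) ranking

    def priority_hint(self) -> float:
        # Severe but easy-to-fix problems float to the top of the queue.
        return self.severity.value / self.technical_difficulty

mr = ModificationRequest(101, "Broken 'Top of page' link", "navigation",
                         Severity.SERIOUS, 1)
print(mr.priority_hint())  # 2.0
```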
4 Results

4.1 Evaluation Outcomes

Conducting the heuristic evaluations took on average two hours per evaluator. Summarizing added another few hours to the effort. Sample summary outcomes are shown in Table 1. Those with agreed-upon high severity appear in the first five rows.
Table 1. Sample of results summarized from heuristic evaluations

Heuristic | Problem severity / agreement | Problem(s) | Design change
1. Currency of the tool web pages | High / agreement | No date present | Indicate last update to web pages
2. Readable text | High / agreement | Small font | 12 point font
3. Hints for formulating a query for better results | High / agreement | No hints available | Need query hints when the query fails
4. Able to undo, redo, go back | High / agreement | No history tracking | Provide history tracking
5. Broken links | High / agreement | "Top of page" is broken | Fix [list of broken links]
6. Examples included and prominent | Range / no agreement (high to low) | Could use more examples and better emphasis | Add 1-2 (bolded) examples under the search box
7. Currency of the data; data sources cited | Range / no agreement (high to low) | Versions of dbs are listed but no dates of latest updates | Add a date of last update to our database
8. Clearly shows if no results occur | Range / no agreement (high to low) | Shows, but the message isn't clear | Change message to: [Suggestion]
9. Able to change result organization | Range / no agreement (high to low) | Sort is available but not apparent | Need note that columns are sortable
10. Vital statistics are available | Range / no agreement (high to low) | What would those stats be? | No agreement
11. Information density is reasonable | Range / no agreement (high to low) | A lot of whitespace; too many sections | Get rid of the 5 nested boxes; no agreement
12. Clear what produced the query results | Range / no agreement (high to low) | Should redisplay search term so user ties it to results | Not clear where the search term is "hitting"
13. Clear why results seem high or low | Middle / agreement | No explanations; I assume informed user knows why | No agreement
15. Can access necessary data for validating | Low / agreement | Not sure what the data would be |
As Table 1 shows, highly ranked problems involved broken or missing features and web page omissions that could be remedied without programming. Middle-ranked problems were tied more to user task needs and to subjective issues, such as what constitutes "enough" support or the criteria scientists use for judging reliability/validity. Problems with little agreement about severity level were tied even more to evaluators having to project and evaluate the importance of scientists' task needs in this domain. For example, evaluators varied widely in judging the importance of validation in scientists' ways of reasoning and knowing. Some actual problems were not caught by the heuristic evaluations, especially those involving novel and unexpected ways users might interact with the interface; these findings were provided by the field studies. Additionally, evaluators' comments and the summarized design changes ranged from precise to vague. Typically, design changes for familiar problems in
generic UI design were precise; those tied to user task models for systems biology and complex exploratory analysis were not.

4.2 Integrating Organizational Processes

Interpretations of, and the actions taken on, the outcomes of the heuristic evaluations took the following organizational course. As noted in Methods, design changes were entered into the MR system and ranked for severity and degree of development effort. Low-cost problems at the levels of failure, critical, major, and serious (e.g., broken links) were delegated and fixed immediately. Concurrently, areas where the heuristic evaluation outcomes combined with problems pertinent to scientists' demonstrated practices in the field (as captured in the user models) were examined. From these analyses, important combinations of problems found by the HE surfaced, combinations that implied problems related to higher-order cognitive task needs. For example, problems 3, 6, 8, 9, 12, and 13 in Table 1 were observed as a recurrent cluster in the field observations as part of scientists' higher-order task of locating interesting genes and relationships expediently. For this task, scientists progressively narrow down result sets based on several meaningful attributes and on validity standards, such as genes/gene products enriched for neurodegenerative processes. Once combined, this set of HE problems revealed scientists' difficulty manipulating queries and output sufficiently to uncover potentially interesting relationships. Thus, beyond easy fixes (e.g., column cues for sorting), deeper implications for a tool's actual usefulness were uncovered by the combined HE problems and user model.

Shaped by the user models developed at our center and by our center's ongoing research into design requirements, issues like the example above were presented to the usability and development teams and then brought to the priority-setting committee. For example, problems related to users being able to narrow down to interesting results led to realizations that the tool needed to provide a more powerful search mechanism, extensive indexing, and interfaces that allowed users to construct and revise queries using multi-dimensional attributes. Another priority-setting issue suggested by the HE outcomes and better understood through the user models was the need for specific types of additional content for users' validation purposes. Both needs received high priority. Additionally, as the software developers became more aware of the value of these usability techniques, we started to receive requests for the heuristic evaluation instrument itself, so that programmers could keep the criteria in mind while developing their software.
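The clustering step described above can be pictured with a minimal sketch: individual HE problem numbers are grouped under the higher-order user-model task they affect, and a recurring cluster is flagged as a task-level issue. The mapping of problems 3, 6, 8, 9, 12, and 13 to the narrowing-down task follows the text; the data structures and the threshold are illustrative assumptions.

```python
# Sketch of reframing individual HE problems as a higher-order task issue
# by grouping them with a user-model task map. The problem-number cluster
# follows the text; structures and threshold are illustrative assumptions.
he_problems = {
    3: "No hints for formulating a query",
    6: "Examples not included prominently",
    8: "No-results message unclear",
    9: "Sort available but not apparent",
    12: "Unclear what produced the query results",
    13: "Unclear why results seem high or low",
}

user_model_tasks = {
    "narrow down to interesting genes/relationships": [3, 6, 8, 9, 12, 13],
}

for task, ids in user_model_tasks.items():
    cluster = [he_problems[i] for i in ids if i in he_problems]
    if len(cluster) >= 3:  # a recurring cluster hints at a task-level issue
        print(f"Higher-order issue affecting task: {task}")
        for problem in cluster:
            print("  -", problem)
```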
5 Discussion

Developing the heuristic evaluation instrument was an iterative process, as the evaluators discovered its weaknesses and strengths during the course of the evaluations. Many of the heuristics turned out to be redundant and were either combined or discarded. Close inspection of the tools also engendered new heuristics as evaluators noticed additional usability problems. Accompanying comments proved to be crucial and were made mandatory for any problems found in later evaluations. The severity
numbering system also proved to be too abstract and will be replaced by ratings that mirror the ones used in the MR system. Finally, some heuristics in the instrument proved too theoretical or complex to be useful (e.g., "salient patterns stand out") and had to be removed or refined. Some of these difficult heuristics were less concrete and were often better suited to incorporation and analysis within the user model.

In tool assessments, heuristics alone identified isolated problems and a few inaccuracies. Combined with the user model, the heuristic evaluations enabled us to uncover problems related to the integrated tasks associated with scientists' higher-order analysis and reasoning. Evaluators' written comments, omissions, imprecision in some proposed design changes, and lack of agreement about certain items were vital in cuing us to further examine particular problems, or combinations of problems, in light of the user models. Had time and resources permitted, reliability testing would have diminished disagreements. A positive unintended consequence of these disagreements, however, was that they revealed where developers' awareness of user tasks was incomplete. For example, in the heuristic evaluations, comments about "the ability to change the organization of results" indicated that the tool did not make it obvious that columns could be sorted. The user model revealed, however, that the non-obvious sorting was only one shortcoming related to this specific heuristic. In actual practice, scientists' analysis and judgments required tools to provide a combined set of sorting-and-filtering interactions to rearrange results into multidimensional groupings, i.e., interesting relationships. Reframed to account for this need, this problem led to high-priority, enhanced functionality.

Unlike in Hollinger's study, many usability problems, framed in ways that join heuristic evaluation outcomes and user models, were given high-priority status. For such achievements, collaborations across specialties were critical, both formally and informally. Developers, web specialists, project managers, scientifically expert bioinformatics specialists, and the usability, scientific, and computer science directors all played distinct roles in shaping the perspectives needed for strategically determining and then implementing a better match between tools and systems biology tasks. In the process, people across specialties grew increasingly aware of each other's perspectives and began slowly evolving a shared language for articulating them. This process is often termed "double-loop learning" and is essential for innovation [1]. One example of this cross-organizational learning is the software developers' requests for the heuristics to help guide software development. Vital to this learning, and to the common grounding on which it rests, is the perennial challenge of expressing heuristics at the right grain size and in the right language. As with other research focused on this goal, our center's efforts have highlighted places to make heuristics more concrete and ways to join outcomes with user models.
6 Conclusions

In our center's case, collaborative communication, shared language, and greater awareness (i.e., double-loop organizational learning) were integrated into and developed from heuristic evaluations. We found a way to use this discount usability inspection method, combined with user models and newly implemented organizational processes,
to reframe problems and to gain buy-in for short- and long-term usability improvements aimed at scientists' cognitive task behaviors. Heuristic evaluations coupled with user modeling revealed problems related to the higher-order cognitive flow of analysis. Combined with organizational and software development processes that encouraged attention to usability, heuristic evaluations produced results and recommended changes that received high priority. Moreover, developers and directors who previously had not considered usability in the choices they made about knowledge representations or functionality grew increasingly sensitive to the implications of their choices from a user perspective. Our center continues to refine the instrument and apply it to other tools, and is simultaneously creating a complementary instrument for heuristic evaluation of interactive visualizations in bioinformatics tools.
References

1. Argyris, C., Schön, D.: Organizational Learning II: Theory, Method and Practice. Addison Wesley, Reading (1996)
2. Chattratichart, J., Lindgaard, G.: A comparative evaluation of heuristic-based usability inspection methods. In: Proceedings of ACM CHI 2008 Conference, pp. 2213–2220. ACM Press, New York (2008)
3. Cockton, G., Woolrych, A.: Understanding inspection methods: lessons from an assessment of heuristic evaluation. In: Blandford, A., Vanderdonckt, J. (eds.) People & Computers XV, pp. 171–192. Springer, Berlin (2001)
4. Frokjaer, E., Hornbaek, K.: Metaphors of human thinking for usability inspection and design. ACM Transactions on Computer-Human Interaction 14, 1–33 (2008)
5. Gerhardt-Powals, J.: Cognitive engineering principles for enhancing human-computer performance. International Journal of Human-Computer Interaction 8, 189–211 (1996)
6. Hartson, H., Andre, T.S., Williges, R.: Criteria for evaluating usability evaluation methods. International Journal of Human-Computer Interaction 13, 373–410 (2001)
7. Hollinger, M.: A process for incorporating heuristic evaluation into a software release. In: Proceedings of AIGA 2005 Conference, pp. 2–17. ACM Press, New York (2005)
8. Law, E.L.-C., Hvannberg, E.T.: Analysis of strategies for improving and estimating the effectiveness of heuristic evaluation. In: Proceedings of ACM NordiCHI 2004, pp. 241–250. ACM Press, New York (2004)
9. Mirel, B.: Supporting cognition in systems biology analysis: Findings on users' processes and design implications. Journal of Biomedical Discovery and Collaboration (forthcoming)
10. Nielsen, J.: Heuristic evaluation. In: Nielsen, J., Mack, R.L. (eds.) Usability Inspection Methods. John Wiley, Chichester (1994)
11. Nielsen, J.: Enhancing the explanatory power of usability heuristics. In: Proceedings of ACM CHI 1994 Conference, pp. 152–158. ACM Press, New York (1994)
12. Schmettow, M., Vietze, W.: Introducing item response theory for measuring usability inspection processes. In: Proceedings of CHI 2008, pp. 893–902. ACM Press, New York (2008)
Supplemental Material: Adapted Heuristics

Each heuristic below is rated on a severity scale from 0 (no problem) to 5 (major problem), with an N/A option and a free-text comments field.

First Impression
• Does the tool fit the overall NCIBI look and feel? Does it look professional?
• Is the tool appropriately branded with funding source and NCIBI, CCMB, and UM logos? Does the tool link back to UM, CCMB, and NCIBI?
• Is it clear what to do and what to enter? (Limitations are clear, how to format a query is clear, and the user's options are clear; if a user needs to enter terms from some taxonomy/ontology, access to those terms is available for the user to choose from.)
• Are there examples shown, and are they prominent?
• Is the display consistent with user conventions for web pages/apps?
• Is it clear why to use the tool and to what purpose?
• Does it require minimal steps to get started quickly?
• Is the cursor positioned in the first field that requires entry?
• Is help readily available?
• Is it clear how current the data are? Is it clear how current the website is?
• Are data sources cited and identified? Are appropriate publications cited?
• Are there any broken links?
• Are the page titles (displayed at the top of the browser) meaningful, and do they change for different pages?
• Are page elements aligned (e.g., in a grid) for readability?
• Is the site readable at 1024x768 resolution?
• Is the text readable (e.g., size, font, contrast)?
• Does the page have appropriate metadata tags for search engines?

Search / Results
• Is the length of processing time acceptable? Do adequate indicators show system status and how long it may take?
• Does it clearly show if there are no query results?
• Does it clearly show how many results a query produces?
• Is it clear what produced the query results?
• Is it easy to reformulate the query if necessary? Are there hints/tips for reformulating a query for better results?
• If the query results seem high or low, is it clear why?
• Are the results transparent as to what is being shown and how to interpret it?
• Are the results displayed clearly and not confusingly?
• Is there an ability to detect and resolve errors?

Interaction with Results
• Is there an ability to filter or group large quantities of data?
• Is there an ability to change the organization of results?
• Is there an ability to undo, redo, or go back to previous results?
• Are the mechanisms for interactivity clear?
• Is the logic of the organization clear?
• Are different data items (e.g., rows) kept clearly separate or delineated?
• If there are links, is it clear where they go? If there are icons, is it clear what they do?
• Do the link-outs provide reliable return?
• Are the vital statistics/counts of information available?
• Do the names/labels adequately convey the meaning of items/features?
• Are data items kept short? Is there too much/little information? Is the density of information reasonable?
• Can you access the necessary data to assure validity (e.g., sources)?
• Can results be saved? Are the results available for download in other formats?
• Can the pages be easily printed?
• Is vertical scrolling kept to a minimum? Is there horizontal scrolling?

Comments
• Additional comments go here.
A Prototype to Validate ErgoCoIn: A Web Site Ergonomic Inspection Technique

Marcelo Morandini (School of Arts, Science and Humanities, University of Sao Paulo, Sao Paulo, Brazil; [email protected]), Walter de Abreu Cybis (Ecole Polytechnique Montreal, Canada; [email protected]), and Dominique L. Scapin (Institut National de Recherche en Informatique et Automatique, Rocquencourt, France; [email protected])

Abstract. This paper presents current actions, results and perspectives concerning the development of the ErgoCoIn approach, which allows non-expert inspectors to conduct ergonomic inspections of e-commerce web sites. An environment supporting inspections based on this approach was designed, and a tool is being developed in order to accomplish its validation plan. Beyond this validation, the actions to be undertaken will allow us to analyze the task of applying checklists and to specify an inspection support environment especially fitted for it. This is of great importance, as the environment is intended to be an open web service supporting ergonomic inspections of web sites from different domains. A wiki environment for the tool's development is also proposed.

Keywords: Usability, Evaluation, Web Sites, Inspection, Web 2.0.
1 Introduction

An important attribute of most interactive systems is the level of usability they offer to users while accomplishing their tasks. According to ISO 9241:11, usability is characterized by the effectiveness, efficiency, and satisfaction with which users achieve specified goals in a particular environment [9]. In this way, usability is a blend of objective and subjective task-oriented measures. Effectiveness can be objectively measured by the rate at which users achieve specific goals (with accuracy and completeness). Efficiency can also be objectively measured, by the amount of resources expended on the task by actual users. User satisfaction concerns subjective data indicating how well users rate the system's comfort and acceptability. Usability can be measured during user interactions with the system, and it can be evaluated by evaluators and/or inspectors who judge how well the user interface aspects are, a priori, fitted to users, tasks, and environments. In doing so, they judge the ergonomics of that user interface. Usability and ergonomics are linked in a cause-effect relationship: the more ergonomic (or fitted) the interface is, the higher the level of
usability it can afford to its users [6]. Considering the software product quality model proposed by ISO 9126¹, ergonomics may be understood as an external quality of the software, while usability is the quality of its use [8].

Methods aimed at measuring usability (usability tests) are known to be expensive and complex [13]. Alternatively, the ergonomics of user interfaces can be evaluated or inspected faster and at lower cost. A simple differentiation between evaluations and inspections can be established based on the type of knowledge applied to the judgments involved in each technique. Evaluators apply mainly the implicit knowledge they have accumulated from study and experience, while inspectors apply primarily explicit knowledge supported by documents, such as checklists. Inspectors cannot produce fully elaborated or conclusive diagnoses, but their diagnoses are comparatively coherent and generally obtained at low cost.

ErgoCoIn [5] is an approach designed to support inspectors in performing objective ergonomic inspections of web sites. With the goal of improving the quality of the diagnoses, this approach postulates several considerations about the web site's context of use, including attributes of users, tasks, and environments, among them those concerning the interface of the web site under evaluation [9]. The content of interviews/questionnaires, as well as of the other contextual data gathering activities, is based on the information demand presupposed by the approach's knowledge base. Such a strategy allows performing specific, objective ergonomic inspections: only pertinent information gathering is proposed to the inspectors in the context-of-use analysis, and only applicable questions are presented to them while inspecting the web site.

The ErgoCoIn checklists can support inspectors in producing more homogeneous results than those produced by ergonomic experts. This is an obvious consequence of having inspectors apply the same checklist set of questions and share decisions about their relative importance. This approach is interesting to web site designers and evaluators because the questionnaires and checklists can be applied by the design staff, who are not necessarily experts in usability evaluation. Thus, inspections can usually be performed quickly and at low cost. It can also be considered a way to introduce ergonomic concepts to designers and to stimulate them, in their daily work, to question human factors specialists when facing potentially serious ergonomics problems.

In this paper we present details about the ErgoCoIn logical architecture and the tool built to validate the approach's intended qualities: (i) low cost, (ii) objectivity, and (iii) homogeneity of inspection diagnoses. The other requirements that were identified include the variety and novelty of the knowledge base. In order to fulfill these requirements, we propose a collaborative effort aimed at ensuring that the ErgoCoIn knowledge base can be enriched continuously. We believe that inspections supported by an environment that incorporates these features can be more efficient and reliable.
¹ In fact, ISO 9241:11 and ISO 9126:1 do not agree completely about the terminology concerning the "a priori" and the "a posteriori" perspectives of usability. While the first standard employs "ergonomics" and "usability", the second one employs "usability" and "in use quality" to denote these perspectives.
This paper contains five sections: Section 2 presents an overview of the ErgoCoIn approach. Section 3 presents the logical architecture of an environment aimed at supporting the software application and introduces the tool being developed to validate the ErgoCoIn approach. Section 4 presents the motivation and proposal for developing a cooperative Wiki-ErgoCoIn. Finally, Section 5 presents conclusions regarding the environment's future development and use.
2 The ErgoCoIn Approach

The development of the ErgoCoIn approach has been motivated by four considerations: (1) web site development has become achievable for a large spectrum of designers (through easily available design tools) not necessarily skilled in computer science or ergonomics; (2) web sites are often designed along a fast, low-cost design process supported by inexpensive tools, which may lead designers to include numerous and sometimes obvious ergonomic flaws; (3) usability evaluations using "traditional" methods can be expensive; and (4) their results may lack homogeneity [5].

The approach is divided into two main phases: web site Contextual Analysis and Ergonomic Inspection of the components and their attributes (see Figure 1).

Fig. 1. The ErgoCoIn Approach Framework

The Co-Description Phase is based mainly on surveys. Before conducting questionnaires and interviews, inspectors must identify the components of the user interface that will be inspected. The reason is to guarantee that, during the surveys, the inspectors will collect only the contextual data appropriate to inspections of the actual user interface components. Surveys are conducted with both users and designers. From users, inspectors gather data concerning their profile, their work environment, and the strategies they apply to accomplish tasks using the web site. Task strategies are described simply as a sequence of pages that users may access when accomplishing their goals. Satisfaction issues should also be gathered in the surveys of users. From designers, inspectors gather information about the expected context of use, including data concerning the user profile and task strategies. Results from the surveys are examined in order to compare context-of-use elements and particular task strategies as prescribed by both users and designers.

The second phase of the approach is characterized by ergonomic inspections based on checklists. These checklists are distinguished by their organization and content; specifically, they are defined as a set of checklist items organized according to the Ergonomic Criteria [13] and related to the ergonomics of web sites supporting e-commerce initiatives. This question-based approach was built from the examination of a large collection of ergonomic recommendations compiled by INRIA researchers [1,14]. Each selected recommendation was reformulated as a question and associated with one ergonomic criterion.

Like any other inspection dynamics, the application of each ErgoCoIn inspection question follows three decision phases: applicability, weighting, and adherence. For objectiveness, the checklists should propose only questions that are applicable to the actual web site's context of use and interface components. This is ensured by having all questions in the ErgoCoIn knowledge base properly indexed to the context
of use aspects (user, task, environment and interface) as gathered from both users and designers. Further, each applicable question has to be weighted in order to allow the production of properly ranked results. Particular decisions about what is most important to consider when inspecting e-commerce web sites were taken by the ErgoCoIn designers, but they can be modified by inspectors when inspecting web sites from different application domains. For simplicity, the level of importance of an ergonomic criterion may define the level of importance of each individual question associated with it. Finally, user interface adherence to a question (or requirement) must be judged by the inspectors. They do so based on the information concerning the ergonomic requirements or questions (explanations, examples, and counter-examples) and on the data describing the web site's context of use (concerning users, tasks, and environment).

The ErgoCoIn application presupposes that information about the context of use is collected directly from users and designers with the support of questionnaires and/or interviews. As a consequence, the approach can only be applied to web sites that are being used regularly. Furthermore, some designers and users must be available for interviews or at least able to answer questionnaires.

The ErgoCoIn approach was designed to allow extensions and instantiations. The question base can be extended to consider other perspectives beyond e-commerce, such as e-learning. Ergonomic Criteria and their associated questions can be ranked differently in order to define question weights in accordance with the context of use of the web site under inspection. Another kind of extension being considered concerns the integration of results from the analysis of usage log data produced with this approach. Such data can be collected using specific software tools. In fact, a usability-oriented
web analyzer called UseMonitor is being developed and associated with the ErgoCoIn approach [4]. This tool can present warnings about the "a posteriori" perspective on usability problems, i.e., interaction perturbations occurring while users are interacting with the web site to accomplish their goals. Basically, UseMonitor can indicate when the observed efficiency rate is particularly low. Detailed efficiency indications concern the rates of, and time spent on, unproductive user behaviors such as solving errors, asking for help, hesitating, deviating, repeating, and so on. Further, UseMonitor can indicate the web pages related to these kinds of perturbations. A logical architecture based on the integration of (i) a typology of usability problems, (ii) the ergonomic criteria/recommendations, and (iii) a model of interface components is also being defined. This will allow UseMonitor to warn the inspectors about the detailed interface aspect causing an actual usability perturbation (an a posteriori result), while ErgoCoIn helps inspectors identify the user interface component responsible for that perturbation and indicates how to fix it (an a priori result). The integration of ErgoCoIn and UseMonitor defines the ErgoManager environment [4]. As a tool for usability evaluation, this environment will automate both processes: failure identification (by log analysis) and failure analysis (by guidelines processing) [1]. Details of this architecture are being defined and will be presented in future publications.
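The three decision phases described in this section can be pictured in a few lines of code. In the minimal sketch below, the context tags, criterion weights, and questions are illustrative assumptions, not entries from the actual ErgoCoIn knowledge base.

```python
# Sketch of the checklist decision phases described above: applicability
# (questions indexed to context-of-use tags), weighting (inherited from
# the ergonomic criterion), and adherence (left to inspector judgment).
# All tags, weights, and questions are illustrative assumptions.
criterion_weight = {"Guidance": 3, "Error Management": 2, "Consistency": 1}

knowledge_base = [
    {"question": "Are query examples shown prominently?",
     "criterion": "Guidance", "context_tags": {"search-form"}},
    {"question": "Is it clear why an order failed?",
     "criterion": "Error Management", "context_tags": {"checkout"}},
    {"question": "Are labels used consistently across pages?",
     "criterion": "Consistency", "context_tags": {"search-form", "checkout"}},
]

def build_checklist(site_context_tags):
    # Phase 1 (applicability): keep only questions whose index matches
    # the context-of-use tags gathered from users and designers.
    applicable = [q for q in knowledge_base
                  if q["context_tags"] & site_context_tags]
    # Phase 2 (weighting): rank by the importance of the criterion.
    applicable.sort(key=lambda q: criterion_weight[q["criterion"]],
                    reverse=True)
    return applicable

# Phase 3 (adherence) is the inspector's judgment on each question.
for item in build_checklist({"search-form", "results-table"}):
    print(item["question"])
```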
3 The ErgoCoIn Environment and Validation Tool

A computerized environment was designed to support mainly the data capture involved in the inspection and inquiry techniques proposed by the current configuration of the ErgoCoIn approach [10]. Contextual analysis will be supported by two collectors, each consisting basically of a series of forms. The Contextual information collector is aimed at guiding inspectors while gathering information from designers and users. The Web site description collector will collect data describing the web site's functions and interface components. The description questions used by these collectors are extracted from the environment's Knowledge base. The data gathered in this phase (contextual data and site description) is stored in a Context of use data base.

The support for Ergonomic Inspections starts with the Analytic evaluator, a system component that compares users' and designers' information concerning the intended and actual context-of-use features. This component verifies the existence of designer misconceptions about users' features and, if necessary, sends warnings to the Checklist builder. The main function of this builder is to create checklists concerning the overall web site and its pages according to the task strategies described by users and designers. It can highlight questions that could reveal ergonomic flaws due to a lack of correspondence between users' and designers' views of the context of use. These checklists will propose only applicable questions, arranged according to their level of importance. A default order of importance is suggested, but it can be modified by the inspectors in view of the characteristics of the current web
site's context of use. The inspectors' judgments will also be supported by the Ergonomic judgment support tool, which will supply them with data about the context of use as well as information about the questions.

In order to validate the ErgoCoIn approach, we are developing a tool that follows the general architecture presented in Figure 2. The validation strategy consists of employing this tool to support different inspectors while they accomplish inspections of different web sites, and of analyzing measures concerning the effectiveness and efficiency of their actions as well as the homogeneity of their results.
Fig. 2. Overview of the Logical Architecture of the ErgoCoIn Environment
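Under assumed profile fields, the Analytic evaluator's comparison step might look like the following minimal sketch: it contrasts the designer's intended context of use with what users reported and emits warnings for the Checklist builder. The field names and values are assumptions, not the actual component's data model.

```python
# Sketch of the Analytic evaluator described above: it compares the
# designer's intended context of use with the users' reported one and
# emits warnings for the Checklist builder. Fields are assumptions.
designer_view = {"user_expertise": "expert",
                 "task_strategy": ("home", "search", "buy")}
user_view = {"user_expertise": "novice",
             "task_strategy": ("home", "browse", "search", "buy")}

def compare_contexts(designer, users):
    warnings = []
    for field, intended in designer.items():
        reported = users.get(field)
        if reported != intended:
            warnings.append(f"Mismatch on '{field}': designer assumed "
                            f"{intended!r}, users reported {reported!r}")
    return warnings

# Warnings are passed to the Checklist builder, which highlights the
# questions that could reveal flaws caused by each mismatch.
for warning in compare_contexts(designer_view, user_view):
    print(warning)
```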
Based on the ErgoCoIn logical architecture, we modeled the data entities and created Entity-Relationship models. We also designed a use-case map as well as sequence diagrams for the main tasks. Figure 3 presents the use case diagram for several registering tasks. Interactions for registering almost all kinds of data defined in the Entity-Relationship model were designed according to the CDU (Create, Update & Delete) model. They include the registering of inspectors, users, designers, web sites, tasks, web pages, interface components, ergonomic criteria, and questions, among other entities (see Figure 4). In doing so, we ensure that interactions are quite homogeneous across the tool's interface. An exception is the interaction aimed at changing the relative importance of the ergonomic criteria (see Figure 5). The first cycle of ErgoCoIn's implementation took place immediately after the conclusion of the design activities mentioned above. The first prototype is mainly concerned with ergonomic inspections, and this version features a total of 182 registered questions linked to the 18 Ergonomic Criteria, properly ranked.
Fig. 3. Use Case Diagram for the ErgoCoIn Validation Tool
Fig. 4. ErgoCoIn’s Users Storing Screen
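Registering screens like the one in Fig. 4 all follow the CDU pattern. The sketch below illustrates the kind of entities involved and a generic CDU-style store; the field names are assumptions and do not reproduce the actual Entity-Relationship model.

```python
# Illustrative sketch of the kind of entities the tool registers through
# CDU (Create, Update & Delete) interactions. Field names are assumptions,
# not the actual Entity-Relationship model.
from dataclasses import dataclass, field

@dataclass
class Question:
    qid: int
    text: str
    criterion: str  # one of the 18 Ergonomic Criteria

@dataclass
class WebSite:
    name: str
    domain: str  # e.g. "e-commerce"
    pages: list = field(default_factory=list)

class Registry:
    """Generic CDU store shared by all entity-registering screens."""
    def __init__(self):
        self._items = {}

    def create(self, key, item):
        self._items[key] = item

    def update(self, key, item):
        self._items[key] = item

    def delete(self, key):
        self._items.pop(key, None)

questions = Registry()
questions.create(1, Question(1, "Are examples shown prominently?", "Guidance"))
```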
The next step of development will focus on the functions supporting the activities of the other phases: Co-Description (screens concerned with the user and designer questionnaires) and Inspection Reports (see Figure 1). The Ergonomic judgment support tool will be developed in the future as well. Once the tool is completed, we will carry out cycles of validation studies focusing not only on the tool but also on the underlying approach. These cycles will consist of phases of (i) planning, (ii) carrying out inspections, (iii) analyzing results, and (iv) proposing revisions. In each cycle, a number of inspectors will be invited to use the tool to perform inspections of a given e-commerce web site. The results from all inspectors, as well as the logs of their actions, will be gathered and analyzed from the homogeneity and objectiveness points of view [3]. The goal behind the revision proposals is to make inspections more objective and reports more coherent. Validation cycles will be repeated until the expected objectiveness and homogeneity criteria have been reached.
Fig. 5. Screen Aimed at Receiving Definitions Concerning Relative Ergonomic Criteria Importance
The inspections cycles will allow us to have a better understanding of the way tasks concerning ergonomic inspections of web sites are accomplished, and specify a tool specially fitted to those tasks. Indeed, we intent to specify an ErgoCoIn user interface able to support inspectors spread all over the world performing ergonomic inspections of web sites from different domains, not only the ones concerning e-commerce. The idea is to offer the tool to those who wants to make inspections, and wants to contribute to the enrichment of the ErgoCoIn knowledge base and programming code.
4 The Wiki-ErgoCoIn We propose to change the scope of the ErgoCoIn development in order to support a collaborative initiative. In fact, this kind of initiative is among the most interesting phenomena observed in the recent history of the web. Collaboration is authorized by special functions offered by web sites allowing users to create, share and organize the content by themselves. Best examples of socially constructed web sites are Facebook, Youtube, Flickr, Digg, del.ici.ous and Wikipedia. Particularly, the Wikipedia is the most successful example of collaboration concerning scientific content on the web. This socially constructed encyclopedia features remarkable internet traffic numbers as it is the 9th most visited web site in the whole Web. From 2001 to now, 7.2 million of articles were posted in Wikepdia. Those were produced by 7.04 million of editors following some style and ethic rules [16]. Wilkison and Huberman [17] performed a study concerning 52.2 million edits in 1.5 million articles in the English language Wikipedia posted by 4.79 million contributors between 2001 and 2006. They split out a group of 1,211 "featured articles", which accuracy, neutrality, completeness and style are assured by Wikipedia editors. Comparisons between featured and normal articles showed a strong correlation among the article quality, the number of edits and the number of distinct editors. In the same study, the authors could associate attractiveness of the articles (number of visits) to the edits novelty.
A Prototype to Validate ErgoCoIn
347
The goal of having ErgoCoIn as a collaborative web initiative is to increase the generality and attractiveness of its contents as well as the quality of the results this approach could afford. Indeed, the Wiki-ErgoCoIn is being designed in order to allow ergonomic inspectors all over the world to share efforts and responsibilities concerning the ErgoCoIn knowledge base extension and generalization. In doing so, we can expect that the Wiki-ErgoCoIn will always feature newly proposed questions concerning ergonomics of web sites from different application domains, interface styles and components. Contributions should fulfill a basic requirement: follow free-content collaboration rules like those developed by Wikipedia. We believe that the results obtained by such cooperative approach can be much more efficient and reliable than the ones that would be obtained solely by individual initiatives.
5 Conclusions ErgoCoIn is an inspection approach strongly based on knowledge about ergonomics of web site’s user interfaces. This knowledge is intended to guide inspectors while undertaking contextual data gathering and analysis, checklists based inspections and report actions. In this paper we described details of this approach and the environment designed to support it. We have also introduced the tool that is under development to validate its structure and contents. We will perform the validation activities following cycles of application-analysis-revisions until the approach reaches expected objectiveness and homogeneity goals. But the success of the ErgoCoIn initiative depends basically on the variety and the novelty of its knowledge. Nowadays, this approach is linked to the ergonomics of the current e-commerce web applications and interfaces technologies, styles and components. Indeed, all these aspects may evolve continuously using just e-commerce may be a very limited scope. Consequently, there is the need to undertake actions in order to face the challenge of continuously getting ErgoCoIn contents up to date and varied to support the production of inspection reports in different web sites domains. An open initiative is being proposed by which anybody knowledgable will be authorized to contribute to the enrichment of the Wiki-ErgoCoIn knowledge base. Consultative and executive boards will be created to define strategies and policies concerning implementation of this ergonomics inspection wiki. Participation demands are planned to be directly addressed to the authors.
References 1. Brajnik, G.: Automatic Web Usability Evaluation: What Needs to be Done? In: 6th Conference on Human Factors and the Web, Austin, Texas, USA (2000) 2. Cybis, W.A., Scapin, D., Andres, D.P.: Especificação de Método de Avaliação Ergonômica de Usabilidade para Sites/Web de Comércio Eletrônico. In: Workshop on Human Factors in Computer Systems, 2000, Gramado. Proccedings of 3rd Workshop on Human Factors in Computer Systems, vol. I, pp. 54–63. Ed. Sociedade Brasileira de Computação, Porto Alegre (2000)
348
M. Morandini, W. de Abreu Cybis, and D.L. Scapin
3. Cybis, W.A., Tambascia, C.A., Dyck, A.F., Villas Boas, A.L.C., Pagliuso, P.B.B., Freitas, M., Oliveira, R.: Abordagem para o desenvolvimento de listas de verificação de usabilidade sistemáticas e produtivas. In: Latin American Congress on Human-Computer Interaction, 2003, Rio de Janeiro. Proceedings of Latin American Congress on Human-Computer Interaction. Rio de Janeiro, vol. I, pp. 29–40 (2003) 4. Cybis, W.A.: UseMonitor: suivre l’évolution de l’utilisabilité des sites web à partir de l’analyse des fichiers de journalisation. In: 18eme Conférence Francophone sur l’Interaction Humain-Machine, 2006, Montréal. Actes de la 18eme Conférence Francophone sur l’Interaction Humain-Machine, vol. 1, pp. 295–296. ACM - The Association for Computing Machinery, New York (2006) 5. Cybis, W.A.: ErgoManager: a UIMS for monitoring and revising user interfaces for Web sites. Rocquencourt: Institut National de Recherche en Informatique et en Automatique, Research report (2005) 6. Cybis, W.A., Betiol, A., Faust, R.: Ergonomia e usabilidade: conhecimentos, métodos e aplicações, Novatec Editora, São Paulo (2007) 7. Farenc, P., Bastilde, C.R.: Towards Automated Testing of Web Usability Guidelines. In: Tools for Working with Guidelines, pp. 293–304. Springer, London (2001) 8. ISO/DIS 9126; Software engineering – Product quality – Part 1: Quality model. International Organisation for Standardization (1997) 9. ISO/DIS 9241; Dialogue Principles in Guiding the Evaluation of User Interfaces – part 11Guidance on Usability. International Organisation for Standardization (1997) 10. Ivory, M.Y., Heasrstam, M.A.: The State of the Art in Automating Usability Evaluation of User Interfaces. ACM Computing Surveys 33(4) (December 2001) 11. Leulier, C., Bastien, J.M.C., Scapin, D.L.: Compilation of Ergonomic Guidelines for the Design and Evaluation of Web Sites. Commerce & Interaction (EP 22287), INRIA Report (1998) 12. Molich, R., Bevan, N., Curson, I., Butler, S., Kindlund, E., Miller, D., Kirakowski, J.: Comparative Evaluation of Usability Tests. In: Proceedings of the Proceedings of the Usability Professional’s Association Conference (1998) 13. Scapin, D.L., Bastien, J.M.C.: Ergonomic Criteria for Evaluating the Ergonomic Quality of Interactive Systems. Behaviour and Information Technology 16(4/5) (1997) 14. Scapin, D.L., Leulier, C., Vanderbonckt, J., Mariage, C., Bastien, C., Palanque, P., Farenc, C., Bastilde, R.: Towards Automated Testing of Web Usability Guidelines. In: Tools for Working with Guidelines, pp. 293–304. Springer, London (2001) 15. Wammi: Website Analysis and MeasureMent Inventory (Web Usability Questionnaire) (n.d.) (2005), http://www.ucc.ie/hfrg/questionnaires/wammi (accessed, 2009) 16. Wikipedia, http://www.wikipedia.org (accessed February 2009) 17. Wilkinson, D., Huberman, B.: Assessing the value of cooperation in Wikipedia. First Monday 12(4) (2007), http://firstmonday.org/htbin/cgiwrap/bin/ojs/ index.php/fm/article/view/1763/1643
Mobile Phone Usability Questionnaire (MPUQ) and Automated Usability Evaluation Young Sam Ryu Ingram School of Engineering, Texas State University-San Marcos, 601 University Drive, San Marcos, TX 78666, USA
[email protected] Abstract. The mobile phone has become one of the most popular products amongst today’s consumers. The Mobile Phone Usability Questionnaire (MPUQ) was developed to provide an effective subjective usability measurement tool, tailored specifically to the mobile phone. Progress is being made in the HCI research community towards automating some aspects of the usability evaluation process. Given that this effort is gaining traction, a tool for measurement of subjective usability, such as MPUQ, may serve as a complement to automated evaluation methods by providing user-centered values and emotional aspects of the product. Furthermore, experimental comparison of MPUQ assessments and automated usability analysis may enable researchers to determine whether automated usability tools generate metrics that correlate with user impressions of usability. Keywords: Usability, mobile user interface, subjective measurement, questionnaire, automating usability.
1 Development of MPUQ The mobile phone has become one of the most popular consumer products of today because it is suffused with personal meanings and individual experiences. It is carried from home to work and leisure activities, and it not only provides communication whenever needed, but can also act as a primary tool for life management [1-3]. Mobile phones have also been recognized as important indicators of consumers’ tastes for buying other groups of products [4]. The mobile phone’s design encompasses two major components (i.e., hardware and software), and aesthetic appeal and image may play an important aspect in evaluation of its usability. For these reasons, the mobile phone was selected as a worthwhile target-product for the development of a new usability questionnaire. The Mobile Phone Usability Questionnaire (MPUQ) was developed through two different phases. The goal and approach of each phase are described in Table 1. The definition of usability in ISO 9241-11 was used to conceptualize the target construct, and the initial questionnaire items pool was comprised of material derived from various existing questionnaires, comprehensive usability studies, and other sources related to mobile devices. Through redundancy and relevancy analyses completed by representative mobile user groups, a total of 124 items (119 applicable to mobile phones J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 349–351, 2009. © Springer-Verlag Berlin Heidelberg 2009
350
Y.S. Ryu
and 115 applicable to PDA/Handheld PCs) were retained from an original 512 items in the initial pool. To increase reliability and validity of this draft questionnaire, follow-up studies employing psychometric theory and scaling procedures were performed. To evaluate the items, the draft questionnaire was administered to a representative sample involving approximately 300 participants. The findings revealed a six-factor structure including (1) Ease of learning and use, (2) Assistance with operation and problem solving, (3) Emotional aspect and multimedia capabilities, (4) Commands and minimal memory load, (5) Efficiency and control, and (6) Typical tasks for mobile phones. The 72 items with the greatest discriminative power relating to these factors were chosen to include in the Mobile Phone Usability Questionnaire (MPUQ), which evaluates mobile phones for the purpose of making decisions among competing variations in the end-user market, alternatives of prototypes during the development process, or evolving versions during an iterative design process. Table 1. Development procedure of MPUQ
Phase
Goal
I
Generate and judge measurement items for the usability questionnaire for electronic mobile products
Approach Consider construct definition and content domain to develop the questionnaire for the evaluation of electronic mobile products based on an extensive literature review: • •
II
Design and conduct studies to develop and refine the questionnaire
Generate potential questionnaire items based on essential usability attributes and dimensions for mobile phone Judge items by consulting a group of experts and users focusing on the content and face validity of the items
Administer the questionnaire to collect data in order to refine the items by • • •
Conducting item analysis via factor analysis Testing reliability using alpha coefficient Testing construct validity using known-group validity
2 Automated Usability Evaluation and MPUQ Subjective usability measurements focus on an individual’s personal experience with a product or system. According to Ivory and Hearst [5], automation of usability evaluation does not capture important qualitative and subjective information. However, it is not yet known whether subjective impressions of usability are in fact correlated with metrics that automated usability approaches can capture. By conducting subjective usability evaluation using a questionnaire of the same interface as has been modeled with an automated usability prediction tool such as CogTool [6], we can perhaps determine whether it may be the case that a metric such as time taken to complete tasks can be correlated with subjective impressions of usability. One of the single greatest advantages of using questionnaires in usability research is that questionnaires can quickly and economically provide evaluators with feedback from the users’ point of view [7-9]. Since user-centered and participatory design is one
Mobile Phone Usability Questionnaire and Automated Usability Evaluation
351
of the most important aspects in the usability engineering process [10], questionnaires, applied with or without any other more ambitious method, can be a valuable tool, assuming that the respondents are validated as representative of the whole user population. There are many usability aspects or dimensions for which no established objective measurements exist, and those may only be measured by subjective assessment. New usability concepts suggested for the evaluation of consumer electronic products such as attractiveness [11], emotional usability [12], sensuality [13], pleasure and displeasure in product use [14] seem to be quantified effectively only by subjective assessment and those usability concepts are proving to be important these days. The MPUQ incorporated those dimensions; most of them are under the group of (3) Emotional aspect and multimedia capabilities. While other factor group items can be covered by other usability evaluation methods, the emotional aspects cannot presently be captured by any other practical approach than subjective measurement.
References 1. Vnnen-Vainio-Mattila, K., Ruuska, S.: Designing Mobile Phones and Communicators for Consumers’ Needs at Nokia. In: Bergman, E. (ed.) Information Appliances and Beyond: Interaction Design for Consumer Products, pp. 169–204. Morgan-Kaufmann, San Francisco (2000) 2. Sacher, H., Loudon, G.: Uncovering the new wireless interaction paradigm. ACM Interactions Magazine 9(1), 17–23 (2002) 3. Ketola, P.: Integrating Usability with Concurrent Engineering in Mobile Phone Development. Tampereen yliopisto (2002) 4. PrintOnDemand. Popularity of Mobile Devices Growing (2003), http://www. printondemand.com/MT/archives/002021.html (cited February 5, 2003) 5. Ivory, M.Y., Hearst, M.A.: The state of the art in automating usability evaluation of user interfaces. ACM Comput. Surv. 33(4), 470–516 (2001) 6. John, B.E., et al.: Predictive human performance modeling made easy. In: The Proceedings of SIGCHI Conference on Human Factors in Computing Systems, CHI 2004, ACM, New York (2004) 7. Kirakowski, J.: Questionnaires in Usability Engineering: A List of Frequently Asked Questions [HTML] (2003) (cited November 26, 2003) 8. Annett, J.: Target Paper. Subjective rating scales: science or art? Ergonomics 45(14), 966– 987 (2002) 9. Baber, C.: Subjective evaluation of usability. Ergonomics 45(14), 1021–1025 (2002) 10. Keinonen, T.: One-dimensional usability - Influence of usability on consumers’ product preference, University of Art and Design Helsinki, UIAH A21 (1998) 11. Caplan, S.H.: Making Usability a Kodak Product Differentiator. In: Wiklund, M. (ed.) Usability in Practice: How Companies Develop User-Friendly Products, pp. 21–58. Academic Press, Boston (1994) 12. Logan, R.J.: Behavioral and emotional usability; Thomson Consumer Electronics. In: Wiklund, M. (ed.) Usability in practice: How companies develop user friendly products, pp. 59–82. Academic press, Boston (1994) 13. Hofmeester, G.H., Kemp, J.A.M., Blankendaal, A.C.M.: Sensuality in product design: a structured approach. In: CHI 1996 Conference (1996) 14. Jordan, P.W.: Human factors for pleasure in product use. Applied Ergonomics 29(1), 25– 33 (1998)
Estimating Productivity: Composite Operators for Keystroke Level Modeling Jeff Sauro Oracle, 1 Technology Way, Denver, CO 80237
[email protected] Abstract. Task time is a measure of productivity in an interface. Keystroke Level Modeling (KLM) can predict experienced user task time to within 10 to 30% of actual times. One of the biggest constraints to implementing KLM is the tedious aspect of estimating the low-level motor and cognitive actions of the users. The method proposed here combines common actions in applications into high-level operators (composite operators) that represent the average error-free time (e.g. to click on a button, select from a drop-down, type into a text-box). The combined operators dramatically reduce the amount of time and error in building an estimate of productivity. An empirical test of 26 users across two enterprise web-applications found this method to estimate the mean observed time to within 10%. The composite operators lend themselves to use by designers and product developers early in development without the need for different prototyping environments or tedious calculations.
1 Introduction 1.1 Measuring User Productivity Measuring productivity with an interface is a key aspect of understanding how changes impact its ease of use. One measure of productivity is the time saved by a more efficient design, that is, a design with a task flow requiring fewer steps. Time saved over repetitions of a task, as a measure of productivity, is a key aspect to calculating return on investment (ROI). Productivity metrics are often needed well before there is a working product or any existing users (esp. when the product is new). Such constraints make gathering empirical measures of productivity from a summative usability test difficult and untimely. The usual process for obtaining time on task data involves recruiting then testing actual users in a lab or remote test setup. This procedure while providing a wealth of informative data can be expensive, time-consuming and requires a working version of the tested product. As a large software organization, we have dozens of products with hundreds of distinct application areas. There is a large demand for benchmarking and improving the time to complete tasks for mostly productivity-based software such as expense reports, call center applications, etc. Conducting summative usability tests with the main goal to record benchmark task-time data is a herculean undertaking that takes resources away from formative designs. Our challenge was to derive a more reliable J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 352–361, 2009. © Springer-Verlag Berlin Heidelberg 2009
Estimating Productivity: Composite Operators for Keystroke Level Modeling
353
way to estimate time-on-task benchmarks and to inform designers about the productivity of their designs as early as possible during product development. 1.2 Cognitive Modeling Rather than observing and measuring actual users completing tasks, another approach for estimating productivity is cognitive modeling. Cognitive modeling is an analytic technique (as opposed to the empirical technique of usability testing). It estimates the task completion time from generalized estimates of the low-level motor operations. Breaking up the task that a user performs into millisecond level operations permits the estimation of task completion times for experienced users completing error-free trials. The most familiar of these cognitive modeling techniques is GOMS (Goals, Operators, Methods and Selection Rules), first described in the 1970s in research conducted at Xerox Parc and Carnegie-Mellon and documented in the still highly referenced text The Psychology of Human Computer Interaction, by Card, Moran and Newell (1983) [1]. GOMS itself represents a family of techniques, the most familiar of which is Keystroke Level Modeling (KLM). In its simplest form, a usability analyst can estimate user actions using KLM with only a few operators (pointing, clicking, typing and thinking)—see [2] p.72 for a simple introduction. KLM, probably because of its simplicity, has enjoyed the most usage by practitioners. It has been shown to estimate error free time task completion time to within 10 to 30% of actual times. These estimates can be made from either live working products or prototypes. It has been tested on many applications and domains such as maps, PDAs, and database applications [3][4][5][6][7][8][9]. One major disadvantage of KLM is the tedious nature of estimating time at the millisecond level. Even tasks which take a user only two to three minutes to complete are composed of several hundred operators. One must remain vigilant in making these estimates. Changes are inevitable and errors arise from forgetting operations (Bonnie John, personal communication, October 12th, 2008). In our experience, two to three minute tasks took around an hour to two hours to create the initial model in Excel, then an additional hour in making changes. 1.3 Software to Model KLM Operators: Cog-Tool A better way of building the estimates comes from a software-tool called Cog-Tool, built and maintained at Carnegie Mellon [10] Cog-Tool itself is the results of dissatisfaction with manual GOMS estimating [7]. Cog-Tool is free to download and after some familiarity can be a powerful and certainly more accurate cognitive modeling tool than hand-tracked estimates. Cog-Tool builds the task time estimates by having the analyst provide screen-shots or graphics from the application and then define each object the users interact with (e.g., a button, a drop-down list, etc.). There is a bit of overhead in defining all the objects and defining the sequence of steps the users take during a task. Once completed, however, Cog-Tool provides an easy way to get updated estimates on the productivity of a task. User-interface designers can actually do the prototyping within Cog-Tool and this in-fact exploits the functionality since changes made within the prototyping environment will immediately lead to a new task-time estimate. If prototyping is done in another environment (which it is in our
354
J. Sauro
organization) then the analyst will need to import, define and update the objects and task-flows for each change made. 1.4 Consolidating the Operators Our organization has a rather complicated infrastructure of prototyping tools for designers so shifting our prototyping efforts into CogTool, while possible, would be a large undertaking surely met with resistance. We wanted a method to create estimates using KLM like Cog-Tool, that automated the tedious estimation process. We also wanted to allow designers to generate prototypes in whatever environment they preferred. Many requests for productivity come from the Marketing and Strategy teams who can use this information to support sales. We also wanted a method by which we could allow product managers and product strategists to generate their own estimates with little involvement from the usability team. 1.5 Looking to Industrial Engineering Some of the inspiration for GOMS (see [1], p. 274) came from work-measurement systems in Industrial Engineering which began in the early 1900s (e.g., Fredrick Taylor) and evolved into systems like MTM (Methods Time Management see [11]). Just like GOMS, these systems decompose work into smaller units and use standardized times based on detailed studies. These estimating systems evolved (MTM-2, MTM-C, MTM-V, etc.) to reflect the different domains of work and more sophisticated estimates. Generating task-times with these systems, while accurate, are often time consuming. A modification was proposed by Zandin [12] called the Maynard Operation Sequence Technique (MOST). MOST, also based on the MTM system, uses larger blocks of fundamental motions. Using MOST, analysts can create estimates five times faster than MTM without loss of accuracy [13]. Similar to the MOST technique, we wanted to describe user-actions at a higher level of work. Instead of building estimates at the level of hand-motions and mouse clicks, we wanted to estimate at the level of drop-down selections and button clicks. Each of these operations is still composed of the granular Card, Moran, and Newell operators, but the low-level details which caused the errors and were time consuming could be concealed from analysts.
2 Method To refine the KLM technique to a higher level of abstraction we first wanted to see if these higher-level composite operators could predict task times as well as the lowlevel operators. We used the following approach: 1. KLM Estimation: Estimate task times using the KLM technique with low level operators for a sequence of tasks. 2. Generate Composite Operators: Generate an estimate of the task times for the same tasks using the composite operators by identifying larger operational functions. 3. Empirically Validate: Validate the new composite operators by testing users completing the same tasks repeatedly.
Estimating Productivity: Composite Operators for Keystroke Level Modeling
355
4. Refine Estimates: Use empirical data to refine composite estimates (such as updating the system response time) and modify the mental operators to account for concurrent processing. 2.1 KLM Estimation Using the method defined in [1] and [5], we estimated the times. For example, the operators for the initial operations of the task “Create an Expense Report” are: 1. 2. 3. 4. 5.
M: Mental Operation: User Decides where to click (1.350s) H: Home: User moves hand to Mouse (.350s) P: Point: User locates the create expense report link target (1.1s) K: Key: User clicks on the link (.25s) R: System Response time as New Page Loads (.75s)
The system response time was updated based on taking some samples from the applications. 2.2 Generate Composite Operators Using the granular steps from above, the logical composite operator is clicking on a link, so the five steps above are replaced with: Click on Link/Button. The time to complete this operation is modeled as 1.350 + .350 + 1.1 +.250 +.75 = approximately 3.8 seconds. This process was repeated for all steps in the 10 tasks. While not a complete list, we found that a small number of composite operators was able to account for almost all user actions in the 10 tasks across the two web applications. The most commonly used actions are listed below: 1. 2. 3. 4. 5. 6. 7. 8. 9.
Click a Link/ Button Typing Text in a Text Field Pull-Down List (No Page Load) Pull-Down List (Page Load) Date-Picker Cut & Paste (Keyboard) Scrolling Select a Radio/Button Select a Check-Box
2.3 Empirical Validation We tested 26 users on two enterprise web-based applications (hereafter Product O and Product P). The products were two released versions of a similar travel and expense reporting application allowing users to perform the same five tasks. The participants regularly submit reports for travel and expenses and were experienced computer users. Ten of the participants had never used either of the applications, while 16 of them had used both. To reduce the learning time and to provide a more stable estimate of each operator, each participant was shown a slide slow demonstration of how to perform each task. This also dictated the path the user should take through the software. They then attempted the task.
356
J. Sauro
The participants were not asked to think out loud. They were told that we would be recording their task times, but that they should not hurry – rather to work at a steady pace as they would creating reports at work. If they made an error on a task, we asked them to repeat the task immediately. To minimize carry-over effects we counterbalanced the application and task order. We had each participant attempt the five tasks three times on both systems. The training was only showed to them prior to their first attempt. From the 30 task attempts (5*2*3=30) we had hundreds of opportunities to measure the time users took to complete the dozens of buttons, links, dropdowns and typing in text-boxes. These applications were selected because they appeared to provide a range of usable and unusable tasks and exposed the user to most of the interface objects they’d likely encounter in a web-application. The goal of this test setup was to mimic the verification methods Card, Moran, and Newell did in generating their granular estimates. They, however, had users perform actions hundreds of times. Comparatively, our estimates were more crudely defined. We intended to test the feasibility of this concept and were most interested in the final estimate of the task-time as a metric for the accuracy of the model. 2.4 Concurrent Validation When estimating with KLM one typically does not have access to user data on the tasks being estimated. It is necessary to make assumptions about the system response time and the amount of parallel processing a user does while executing a sequence of actions. System response time understandably will vary by system and is affected by many factors. Substituting a reasonable estimate is usually sufficient for estimating productivity. In estimating parallel processing, there are some general heuristics ([2], p. 77) but these will also vary with the system. For example, as a user becomes more proficient with a task they are able to decide where to click and move the mouse simultaneously. The result is the time spent on mental operators are reduced or removed entirely from estimate. In the absence of data, one uses the best estimate or the heuristics. Because our goal was to match the time of users and we had access to the system, we needed to refine the operators with better estimates of actual system response time and of the parallel processing. To do so, we measured to the hundred of a second the time it took users to complete the composite operations (e.g., clicking a button, selecting from a pull-down list) as well as waiting for the system to respond. We adjusted the composite operators’ total time by reducing the time spent on mental operation; in some cases eliminating them entirely (see also [14], for a discussion of this approach). The final empirically refined estimates appear in Table 1 below. Table 1. Composite Operators and the refined time from user times
Composite Operator Click a Link/ Button Pull-Down List (No Page Load) Pull-Down List (Page Load) Date-Picker Cut & Paste (Keyboard) Typing Text in a Text Field Scrolling
Refined Time (seconds) 3.73 3.04 3.96 6.81 4.51 2.32 3.96
Estimating Productivity: Composite Operators for Keystroke Level Modeling
357
Some of the operators need explanation. The Date-Picker operator will vary depending on the way the dates are presented. The Cut & Paste Keyboard option includes the time for a user to highlight the text, select CTRL-C, home-in on the new location and paste (CTRL-V). The estimate would be different if using context menus or the web-browser menu. Typing Text in a Text Field only represents the overhead of homing in on a text-field, placing the curser in the text-field and moving the hands to the key-board. The total time is based on the length and type of characters entered (230msec each). Finally, the refined times above contain a system response time which will vary with each system. That is, it is unlikely that clicking of a button and waiting for the next page to display will always take 3.73 seconds. Future research will address the universality of these estimates across more applications.
3 Results and Discussion Table 2 below shows the results of the KLM estimates using the “classic” Card Moran and Newell operators and the new composite operators for all 10 tasks. Both the number of operators used and the total task times are shown. Table 2. Comparison between Classic KLM Composite KLM Time & Operators Classic KLM Product
O O O O O P P P P P
# of Operators
Task Create Meeting Rprt
Composite KLM
Time (sec)
# of Operators
Time (sec)
81
62
23
98
51
52
21
46
43
26
15
35
32
18
6
26
149
88
32
55
169
134
36
156
93
74
21
82
65
46
13
60
48
31
11
43
131
118
23
111
Mean
86.2
64.9
20.1
71.2
SD
48.1
38.9
9.3
40.5
Update a Saved Rprt Edit User Preference Find an Approved Rprt. Create Customer Visit Rprt Create Meeting Rprt Update a Saved Rprt Edit User Preference Find an Approved Rprt Create Customer Visit Rprt
The data in Table 2 show there to be a difference of six seconds between the composite and classic KLM estimates of the mean task completion time but this difference is not significant [ t (17) = .727 p >.7]. The correlation in task time estimates between the two systems is strong and significant (r =.891 p to hide all windows and make
Providing an Efficient Way to Make Desktop Icons Visible
575
desktop icons visible. With this function, users do not have to perform any window operation to make icons visible. This function makes all desktop icons visible, which is not always the case with the ISS. Two problems can be pointed out on this function. The first problem which was discovered by Hutchings [1] is that it is hard to remember to use the operation. Probably, this is because it is not intuitive to type the keyboard to make the desktop visible. The second problem is that the hidden windows are not restored automatically. This is problematic when the user wants a newly created window to be visible together with the existing windows. The ISS are free from these problems. It is intuitive to move the mouse cursor to the IS to access a desktop icon. The ISS also can restore the window layout automatically. Quick Launch Tray. Windows XP offers a special interaction place called quick launch tray to put very small icons for program launchers. Among those icons is a special icon to make the desktop visible. Located on the taskbar, the quick launch tray is free from being covered by windows. Thus, the icon for making the desktop visible has the advantage of being always accessible. Like <Windows + D>, the icon also has the advantage of making all desktop icons visible. These two advantages cannot be found in the ISS. However, the required operation for this function lacks intuitiveness just as <Windows + D> does. It also requires the user to move the mouse cursor all the way to the bottom part of the screen. The very small size of the icon is also a problem; the well known Fitts’s law [2] says that the smaller the target is, the longer it takes to set the mouse cursor to it. The ISS, on the other hand, requires the least additional mouse movement and no need to place the mouse cursor at a particular position. Machine Learning Scheme. As machine learning algorithms have proven to be practical, a new approach to develop/improve the computer user interface has emerged which incorporates the machine learning scheme to adapt the user interface to each individual user. For example, there have been some efforts to adapt the UNIX command line shell to predict the user’s next command based on the record of commands issued in the past [3]. A similar approach could be potentially used to assist users with accessing desktop icons. That is, it may be possible to predict the exact file that the user wants to access next and bring the icon for that file to the top near the current place of the mouse cursor. However, we avoided this approach. One reason is that user’s access patterns to desktop icons are unlikely to exist. As for commands in the UNIX shell where programming is one of the most common tasks, some patterns are likely to exist. On the other hand, the context of our approach assumes that users are using desktop icons for general purposes, which makes pattern extraction difficult. Our approach, the ISS, may require users to search for the icon they want to access among many icons, but instead it is able to make important icons visible without any mistake. Having being compared with other approaches, the ISS is found to have some advantages and possible problems. Comparison with the machine learning scheme especially clarified important characteristics of the ISS, namely, putting priority on simplicity and stability rather than using sophisticated technology.
576
T. Akasaka and Y. Okada
4 Experiment The aim of the ISS is to provide users with an easy way to make desktop icons visible and by doing so reduce the burden of window management, increasing the overall productivity. In this chapter, we describe the experiments that we conducted to test whether the ISS really achieved its aim. 14 persons took part in the experiments as subjects. These persons were those who usually used computers in similar contexts to that of our approach. The display of the computer used in the experiments was a 15inch single monitor with the resolution of 1024 x 768; this configuration was also chosen in line with the context of our approach. 4.1 Efficiency to Make Desktop Icons Visible The ISS allows users to make desktop icons visible without any window operations and is obviously efficient in terms of workload. However, it needs to be confirmed that the ISS can also make the process of making desktop icons visible faster than window operations. Therefore, we first conducted experiments to examine how much the ISS could reduce the time it takes to make desktop icons visible.
Task Completion Time (sec)
6 w/o ISS
w/ ISS
5 4 3 2 1 0
**
**
1 2 3 The number of windows initially covering the desktop icons Fig. 4. Without using the ISS the task completion times gets longer as the number of windows increases, while the times are stable over the three conditions when using the ISS
The experiment task was to click the desktop icon specified by voice instruction. A task began with a voice instruction and lasted until the subject clicked the specified icon. 17 desktop icons all representing text files were lined up in the right edge of the screen. The area around the desktop icons was initially covered by windows. Consequently, subjects first needed to move windows away from the area before clicking the specified icon. To move windows without using the ISS, they had to either iconify or grab/drag windows. It was up to each subject which of the two operations to use. When using the ISS, on the other hand, subjects were requested to use function of the ISS without performing any window operation. Subjects performed the task under the
Providing an Efficient Way to Make Desktop Icons Visible
577
three conditions. The difference between the conditions was the number of windows that initially covered the area around the desktop icons; the number varied from 1 to 3. For each condition, a subject performed the task five times. Each time the task completion time was measured. Figure 4 shows the completion times averaged over all the subjects (14 subjects x 5 trials = 70 trials) with the error bars indicating the 95% confidence intervals. When subjects performed the task without using ISS (w/o ISS), the task completion time gets longer as more windows cover the area around the desktop icons. This rise in time is inevitable as subjects needed to perform window operations on each window. In contrast, when subjects used the ISS (w/ ISS), the task completion times were stable over the three conditions with relatively little deviations. This resulted in performance gain for the conditions of 2 and 3 windows, and the differences were statistically significant for 0.01 level with paired T-test. However, the ISS did not show any improvement for the condition of one window. In short, with only one window covering desktop icons the ISS gave the subjects the same level of speediness as usual window operations, but the ISS could maintain that level even when several windows cover the desktop icons, effectively bringing performance gain. From the experimental results above, we can conclude that the ISS can at least maintain the same level of speediness as window operations to make desktop icons visible, and that it can raise the level when several windows cover desktop icons. 4.2 Productivity of Primary Task Having confirmed that the ISS could make the process of making desktop icons visible faster, we then conducted experiments to examine whether the ISS could improve the overall productivity of a typical task for office workers. The experiment task was to compile a spreadsheet document. Figure 5 shows a typical screenshot of the desktop during the task. In the spreadsheet (lower right in Figure 5) there was a matrix with rows representing persons and columns representing items of information. The task was to fill the matrix by gathering information spread over many files placed on the desktop. The persons in the rows of the matrix were divided into four groups. Pieces of information in other files than the spreadsheet were described by group (upper). Mappings between persons and groups were not shown in the spreadsheet, but in a different text file (lower left). This situation caused subjects to have the text file always visible, usually at the right bottom of the screen as that place saved the mapping information from being occluded. Consequently, the desktop icons which subjects needed to access to gather information were frequently covered by the text file’s window. In addition, subjects needed to manage three windows as shown in Figure 5 or at least two windows (the spreadsheet and the text file). This formed the situation where subjects were in a dilemma of maintaining the visibility of desktop icons while managing windows. The 14 subjects were divided into two groups, each with 7 subjects. One group performed the task on a set of files with the ISS, and then did the task on a similar set of files without the ISS. The other group also performed the task twice, but this time with the reversed order; first they did not use the ISS and next they did. This design was to counterbalance the learning effect. For each trial, the task completion time was measured.
578
T. Akasaka and Y. Okada
Task Completion Time (sec)
1400 1200 1000
791
800
751
600 400 200 0 w/o ISS
Fig. 5. In order to fill the matrix of the spreadsheet, subjects needed to access several desktop icons, while keeping the spreadsheet and text file visible
w/ ISS
Fig. 6. The ISS did not make a statistically significant difference, although the sampled data showed improvement of about 5 % on average as well as reduction in deviation
The results are shown n Figure 6. It was not confirmed that the ISS made a statistically significant difference, although the sampled data showed the performance gain of about 5%. In addition, the subjects showed the smaller deviation when using the ISS. As mentioned in chapter 3, the ISS can make desktop icons visible without any window operation. This might have contributed to stable window management, which in turn led to the smaller deviation.
5 Conclusion In this study, we developed the Icon Space Saver (ISS) which aimed to provide users with an efficient way to make desktop icons visible, thereby reducing the workload of window management. The experimental results showed that the ISS did make the process of making desktop icons visible faster. However, it was not confirmed that the ISS could reduce the workload of window management and raise the overall productivity of primary tasks. To confirm this needs further investigations, which will be the focus of a future study.
References 1. Hutchings, D.R., Stasko, J.: Revisiting Display Space Management: Understanding Current Practice to Inform Next-generation Design. In: Proceedings of Graphics International 2004, pp. 127–134 (2004) 2. Fitts, P.M.: The Information Capacity of the Human Motor System in Controlling the Amplitude of Movement. Journal of Experimental Psychology 47(6), 381–391 (1954) 3. Davison, B.D., Hirsh, H.: Experiments in UNIX Command Prediction, Technical Report ML-TR-41, Department of Computer Science, Rutgers University (1997)
An Integration of Task and Use-Case Meta-models Rémi Bastide IRIT – Université de Toulouse, ISIS – CUFR J.F. Champollion, Castres, France
[email protected] Abstract. Although task modeling is a recommended practice in the HumanComputer Interaction community, its acceptance in the Software Engineering community is slow. One likely reason for this is the weak integration between task models and other models commonly used in Software Engineering, notably the set of models promoted by the mainstream UML method. To overcome this problem, we propose to integrate the CTT model of user tasks into the UML, at the meta-model level. CTT task models are used to provide an unambiguous model of the behavior of UML use-cases. By so doing, we also bring the benefit of hierarchical decomposition of use-cases (“extend” and “include” relationships) to CTT. In our approach, CTT tasks also explicitly operate on a UML domain model, by using OCL expressions over a UML object model to express the pre- and post-conditions of tasks.
1 Introduction In the current Software Engineering practice, use-cases are routinely used during the early phases of software development, namely requirements gathering. Use-cases are arguably the less formalized of all UML notations. Rather than a hindrance, this is to be considered as an advantage: the main point of use-case modeling is to reach a common understanding of the problem between the various stakeholders of the system under development, and especially between the customer (who holds the knowledge of the business domain) and the software development team (who has the know-how of the software development process). Noted methodologists [2] argue that writing good use-cases is essentially a literary piece of work, and that a natural language description of use-cases is a good way to form consensus and mutual understanding between the stakeholders regarding what has to be done, regardless of how it has to be done. A delicate point comes with the need to relate an informal, natural language description of use cases to the increasingly formal notations used in the UML, for instance class diagrams, behavioral models such as StateCharts, etc, eventually leading to a satisfactory implementation. UML is notoriously vague and non-prescriptive with regards to the precise way to describe the behavior of use-cases. Some authors stick to a detailed natural language scenario, others prefer a partitioned narration, and others use UML sequence diagrams to describe the information exchanges between the usecase actor and the system under design. We contend that task models (in our case J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 579–586, 2009. © Springer-Verlag Berlin Heidelberg 2009
580
R. Bastide
CTT models [8]) offer several advantages over the latter two, notably due to the richness of the temporal operators available. This is increasingly important, since modern user interfaces (direct-manipulation, multi-modal...) depart from the old-fashioned conversational, question-answer style, and are almost impossible to model with sequence diagrams. It is also a routine practice to develop an analysis model of the business objects of the system under design (the so-called “domain model”) early on in the development process, in order to precisely identify the business objects, their structure and their mutual relationships. This domain modeling is performed using UML class diagrams, leaving out premature implementation-related considerations. The main point of this paper is to promote CTT task models as the behavioral language for use cases. To this end, we first introduce our view of the design process which is expected. We then show how the metamodel of CTT can be tightly integrated into the UML metamodel of use-case diagrams, so that the notion of extend and include relationships become meaningful for CTT task models.
2 Design Process For the sake of efficiency, formal modeling work has to be guided by strong methodological, process-oriented guidelines. A design process defines in which order the various artifacts have to be produced during the software lifecycle, defines the expects contents of these artifacts, and what information is needed as an input and produced as an output of the various modeling and design activities. The work presented here deals mainly with the initial phases of the process, namely requirements engineering and preliminary design. • The goal of the requirements engineering phase if to form a consensus between the stakeholders (mainly the customer and the analysis team) regarding what the problem actually is, and what has to be developed in order to solve the problem. The main outcome of this phase is a common understanding between the customer and the development team of the problem domain: no work on the solution domain should be performed at this phase. • Work on the solution domain begins at the preliminary design phase: this is where the first decisions on software architecture are made, and where the best practices of interaction design (in particular iterative prototyping with increasing fidelity level) should be used. Of course, we do not recommend a strict separation between these two phases: it is quite common that work performed at the preliminary design phase uncovers new unforeseen insights on the requirements, and that some iteration has to be performed between these two phases. Although iteration is frequent between these two phases, it should always remain clear to the various actors whether they are working on the problem domain (i.e. the requirements) or on the solution domain (i.e. the design). Our claim is that task modeling is especially useful during the requirements engineering phase, and that it complements nicely the domain models and use-case
An Integration of Task and Use-Case Meta-models
581
models that are developed during this phase. At this stage, class models are used to provide an analysis-level model of the domain (they formalize the vocabulary of the business domain), while use-cases and use-case diagrams are used to provide a useroriented view of the system functionality. The natural language scenarios that are associated with use-cases are essential in easing the construction of a common understanding of the problem between the stakeholders, since they are written in the vocabulary of the business and can be understood and validated by the customer. Our view that task models are essentially a requirement analysis tool contradicts several authors, who recommend using task models at the design phase, for instance to drive the generation of dialogue [6] or abstract interface models. In our approach, requirement task models necessarily remain at a rather abstract level, since at this stage the user interface is not (and should not be) yet under design. It follows that requirement task models should not mention any user-interface specifics: rather, the task models will drive the user-centered design of the UI that will follow in the subsequent phases, where the user interface specialist will strive to design an interface that is best suited to the user task, while taking into account the limitations inherent to the target platform for the interactive system. We do not believe that (except maybe in very stereotyped situations, such as business form-filling applications) a satisfactory user interface can be automatically generated from a task model. Rather, in our view, the task model can be used as a test case for the user interface that will be designed using user-centered techniques such as incremental low-fidelity prototyping. To allow for the smooth integration of task models in the software design lifecycle, we propose to integrate task models and use-cases at the meta-model level [5, 14], thus opening the way for efficient use of Model-Driven Engineering (MDE) techniques such as model weaving and model transformation. The process we advocate is inspired by the “essential use-cases” work proposed by Constantine and Lockwood [4] and the work in [13]. In particular, since use-cases are meant the be an input to interaction design, they should be devoid of any specific reference to user interface, otherwise it would be a premature commitment to a user interface design, before this design has been presented and validated by users through low-fidelity prototyping. We propose that CTT task models should serve as the behavioral language for usecases. In this usage of task modeling, task models are meant to provide an abstract view of the user’s activity, exploring their goals as well as the temporal and causal structure of their interactions with the system. Task models are thus the formal counterpart of the natural language, narrative descriptions of scenarios that is routinely associated with use-cases, and that are still quite useful: natural languages scenarios are ideal to communicate and form consensus with the customer, and can be developed and validated with the customer during brainstorming sessions. Task models, on the other hand, are useful to communicate with the design team, since they convey a precise semantics of the dynamics of human-computer interaction that has to be supported by the software to be produced. Fig. 
1 illustrates our view of the early stages of the design process, highlighting the strong bonds between use-cases, domain model and task models that are the main outcomes of the requirements analysis phase.
582
R. Bastide
Requirements analysis Use cases
Domain model
Task model
Preliminary design Dialogue models
Interaction design
Prototyping
Fig. 1. First stages of the design process
3 Related Work The need to bridge the gap between the current practices of Software Engineering (centered on UML diagrams) and user-centered design (including task analysis and modeling) has been stressed by numerous authors. A remarkable variety of solutions to this problem has been proposed. The very father of the CTT notation [12] has identified the main trends of work in this field: •
• •
Representing CTT by an existing notation: Nobrega et al. [10], for instance, provide semantics of the temporal operators of CTT in terms of UML activity diagrams. Nunes et al. [9] use the extensions mechanisms provided by the UML (profiles, stereotypes) to represent the concepts of CTT in a UML framework. Developing automatic converters from UML to task models [6] (and back, we should add). It can be contended that, in the HCI literature, one can find proposals for generators from any kind of model to any other kind. Building a new UML for interactive systems “which can be obtained by explicitly inserting CTT in the set of available notations” [10]. This is the trend we follow in this paper, by integrating a metamodel of CTT inside the metamodel of UML itself.
An Integration of Task and Use-Case Meta-models
583
Although we share the goals expressed in [10], our technical proposal is quite different with the one presented there. − In the first place, we work formally at the metamodel level, whereas only a rough sketch of a solution was provided in [10]. We believe that explicit use of metamodels brings several fundamental advantages, including the opportunity to use existing MDE tools such as model transformation languages or model weavers to extend the potential use of models. We have demonstrated this advantage in previous work [1], by showing how the notions of human errors can be integrated in task diagrams through the use of error patterns and automatic model transformations. − Furthermore, it appears that our proposal is almost an “inside-out” reversal of the approach in [10] : the authors proposed a path to transform a use-case diagram (also called a use-case map) into a CTT task model, that could be further refined. In the contrary, we propose to use CTT as a language to specify the behavior of use-cases.
4 Alignment with the UML Use-Case Metamodel

The metamodel of UML use-cases is given in Fig. 2. This is actually the metamodel of use-case maps (diagrams that show the relationships between the use cases for a system), since UML is non-prescriptive as to what a use case actually is, i.e., as to what the behavioral description of a use-case should be.
Fig. 2. The UML use-case metamodel (from [11])
There has been some picky debate amongst specialists over this very metamodel [15]: several of its flaws have been pointed out, and better alternative metamodels have been proposed. Although we mostly agree with these criticisms, we have chosen to stick with the “official” metamodel, since our goal is to be as close as possible to the standard. It should also be noted that the ill-defined notion of the use-case specialization relationship, formerly available in the UML, has been removed in the current version of the standard. Starting from this “official” metamodel of UML use-cases, we want to cleanly integrate a metamodel of CTT, in order to express that a CTT task model is used to describe the behavior of a use-case, and to show that “include” and “extend” relationships can be expressed over a CTT model. The metamodel of CTT illustrated in Fig. 3 improves on the one we previously published in [1] in several ways:
− Our earlier metamodel used Ecore [14] as the metamodeling language. The one presented here uses UML class diagrams for the same purpose, which allows us to cleanly express its relationships with other elements in the UML metamodel. For instance, it expresses that the notion of Actor in CTT is identical to the same notion in UML use-cases. In turn, this enriches CTT with the features available for UML actors (for instance, one can design a specialization hierarchy of actors with increasing responsibilities).
− It explicitly aligns CTT with UML use-cases, bringing their structuring features (“include” and “extend” relationships) to CTT.
Fig. 3. A metamodel of CTT integrated in the metamodel of UML. The diagram shows the classes CttTask, CttNode and CttTransition; the enumerations Operator (Choice, ConcurrencyInfoExchange, OrderIndependence, Deactivation, Enabling, EnablingWithInfoPassing, SuspendResume) and TaskAllocation (User, Application, Interaction, Abstract); and the Include, Extend, ExtensionPoint, Actor and UseCase elements.
In Fig. 3, the classes with a white background are imported from the UML metamodel, and should be related to the identical ones in Fig. 2. The classes with a filled background are specific to CTT. Basically, a CTT task model (CttTask) is a tree of nodes (CttNode) which can be related by transitions (CttTransition) that feature one of the CTT temporal operators (Operator). The use-case metamodel of Fig. 2 states that a use-case can have several “extend” and “include” relationships (* cardinality). The cardinalities chosen in our metamodel of CTT in Fig. 3 should be carefully considered:

• Include relationship: a CttNode has 0..1 include relationships, meaning that any CTT node can optionally include another CttTask (which in turn is a tree of CttNodes). This models a classical hierarchical decomposition, which makes it easy to reuse a task model in another one, by simply including it at the proper node. It is natural to allow for a maximum of one inclusion, since otherwise the temporal combination of the included CttTasks would be left undefined.
• Extend relationship: an extend relationship is ternary, relating a base to an extension through one extensionPoint. In our metamodel, a CttNode has an optional extensionPoint, meaning that it can be optionally extended. However, a CttNode can have several extensions, discriminated by the condition: BooleanExpression attribute of the metaclass Extend (cf. Fig. 2).

It is noteworthy that the metamodel in Fig. 3 conveys the same information as the initial use-case metamodel, and more. For instance, the set of Include relationships for a given use-case (which are actually the relationships appearing on use-case maps) can be computed by recursively exploiting the Include and Extend relationships of Fig. 3, using the hierarchical composition relationship between CttNodes. The metamodel in Fig. 3 also relates to the domain model, albeit implicitly: the precondition and postCondition elements in CttNode are meant to be Boolean expressions written in OCL (Object Constraint Language) operating on a domain model defined by a UML class diagram. As OCL itself is not part of the UML metamodel, but defined in a separate, language-oriented specification, the relationship between task and domain model is not apparent, but is nonetheless fundamental.
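To make the cardinality discussion concrete, the following is a minimal, hypothetical encoding of the CttNode structure and its include/extend relationships. The class and attribute names follow Fig. 3, but the code itself is our illustration, not part of the paper's formal metamodel:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class CttTask:
    """A CTT task model: a named tree of CttNodes (cf. Fig. 3)."""
    name: str
    root: "CttNode"

@dataclass
class Extend:
    """Ternary extend relationship: a base node is extended by a task,
    discriminated by a condition (an OCL Boolean expression in the paper)."""
    condition: str
    extension: CttTask

@dataclass
class CttNode:
    name: str
    allocation: str = "Interaction"   # TaskAllocation: User/Application/Interaction/Abstract
    subtasks: List["CttNode"] = field(default_factory=list)
    include: Optional[CttTask] = None  # 0..1: at most one included task per node
    extensions: List[Extend] = field(default_factory=list)  # several, one per condition

def included_tasks(node: CttNode) -> List[CttTask]:
    """Recursively collect the Include relationships of a task model,
    walking the hierarchical composition of CttNodes as described in the text."""
    result = [node.include] if node.include else []
    for sub in node.subtasks:
        result.extend(included_tasks(sub))
    return result
```

The Optional type on include directly encodes the 0..1 cardinality argued for above, while extensions being a list reflects the fact that a node may be extended several times, discriminated by condition.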
5 Conclusion

We have presented our view of a design process where task and use-case modeling are tightly integrated during the requirements engineering phase. CTT task models are used to provide an unambiguous description of use-case behavior, complementing natural language scenarios. An integration of CTT into the UML metamodel has also been presented, which opens the way to automatic processing of requirement models, to be used in subsequent phases of the design and implementation, for instance for test sequence generation.
References
1. Bastide, R., Basnyat, S.: Error Patterns: Systematic Investigation of Deviations in Task Models. In: Coninx, K., Luyten, K., Schneider, K.A. (eds.) TAMODIA 2006. LNCS, vol. 4385, pp. 109–121. Springer, Heidelberg (2007)
2. Cockburn, A.: Writing Effective Use Cases. Addison-Wesley Professional, Reading (2001)
3. Constantine, L., Campos, P.: CanonSketch and TaskSketch: innovative modeling tools for usage-centered design. In: OOPSLA 2005: Companion to the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, pp. 162–163. ACM, New York (2005)
4. Constantine, L.L., Lockwood, L.A.D.: Structure and Style in Use Cases for User Interface Design. Constantine & Lockwood, Ltd.
5. Limbourg, Q., Pribeanu, C., Vanderdonckt, J.: Towards Uniformed Task Models in a Model-Based Approach. In: Johnson, C. (ed.) DSV-IS 2001. LNCS, vol. 2220, pp. 164–182. Springer, Heidelberg (2001)
6. Luyten, K., Clerckx, T., Coninx, K., Vanderdonckt, J.: Derivation of a dialog model from a task model by activity chain extraction (2003)
7. Montero, F., López-Jaquero, V., Vanderdonckt, J., González, P., Lozano, M.D., Limbourg, Q.: Solving the mapping problem in user interface design by seamless integration in IdealXML. In: Gilroy, S.W., Harrison, M.D. (eds.) DSV-IS 2005. LNCS, vol. 3941, pp. 13–15. Springer, Heidelberg (2006)
8. Mori, G., Paternò, F., Santoro, C.: CTTE: Support for developing and analyzing task models for interactive system design. IEEE Trans. Software Eng. 28(8), 797–813 (2002)
9. Jardim Nunes, N., Falcão e Cunha, J.: Towards a UML profile for interaction design: The Wisdom approach. In: Evans, A., Kent, S., Selic, B. (eds.) UML 2000. LNCS, vol. 1939, pp. 101–116. Springer, Heidelberg (2000)
10. Nóbrega, L., Jardim Nunes, N., Coelho, H.: Mapping ConcurTaskTrees into UML 2.0. In: Gilroy, S.W., Harrison, M.D. (eds.) DSV-IS 2005. LNCS, vol. 3941, pp. 237–248. Springer, Heidelberg (2006)
11. Object Management Group: Unified Modeling Language (UML), version 2.0. Technical report, OMG (2005), http://www.omg.org/technology/documents/formal/uml.htm
12. Paternò, F.: Towards a UML for interactive systems. In: Nigay, L., Little, M.R. (eds.) EHCI 2001. LNCS, vol. 2254, pp. 7–18. Springer, Heidelberg (2001)
13. Rosson, M.B.: Integrating development of task and object models. Commun. ACM 42(1), 49–56 (1999)
14. Stahl, T., Völter, M.: Model-Driven Software Development. Wiley, Chichester (2006)
15. Williams, C., Kaplan, M., Klinger, T., Paradkar, A.M.: Toward engineered, useful use cases. Journal of Object Technology 4(6), 45–57 (2005)
Model-Based Specification and Validation of User Interface Requirements

Birgit Bomsdorf1 and Daniel Sinnig2

1 Department of Applied Computer Science, Fulda University of Applied Sciences, Germany
2 Department of Computer Science and Software Engineering, Concordia University, Montreal, Quebec, Canada
[email protected], [email protected]

Abstract. Core functional requirements as captured in use case models are too high-level to be meaningful to user interface developers. In this paper we present how use case models can be systematically refined into detailed user interface requirements specifications, captured as task models. We argue that the transition from functional to UI-specific requirements is a semi-formal step which necessitates experience, skills and domain knowledge on the part of the requirements engineer. In order to facilitate the transition we sketch out an integrated development methodology for use case and task models. Since the engineer is also responsible for establishing conformity between use cases and task models, we also show how this validation can be supported by means of the WTM task model simulator.

Keywords: Requirements specification, use case model, task model, model simulation.
1 Introduction

A common challenge in Software Engineering (SE) as well as in Human-Computer Interaction (HCI) is the transition from functional requirements to user interface (UI) specific requirements. UI development and the engineering of functional requirements are still often carried out by different teams using different processes and lifecycles [1]. This is likely to result in duplication of effort, inconsistencies, and even contradicting requirements. The apparent gap between software engineering and UI development has been noted by several authors [2, 3, 4] and has been (partially) addressed in our work [5] and the work of others [6, 7, 8, 9]. It has been noted that one possibility to close this gap is to conceptually join use case models and task models in one common development process. The functional requirements of the application are captured by the use case model, which are then stepwise refined into UI-specific requirements captured by task models. A combination of both models was first investigated by Paternò [10]. In his work, however, the transition from use case to task specifications is performed informally, and task modeling is part of the design process and is not considered at the requirements level. Kujala [9] defines a systematic process for transforming user needs into use case
specifications, but does not take into account task model specifications. Sinnig et al. [5] have defined a common semantic model for use case and task models, and propose a formal, but static, refinement relation between the two artifacts. We firmly believe that the requirements engineer should not be exempted from deciding whether or not a task model faithfully refines the use case it is developed from. On the contrary, finding the answer often depends on domain knowledge and properties specific to a project. Often refinement validation cannot be automated but has to be carried out manually by the requirements engineers themselves. In such a case, simulation and animation have proven to be powerful tools, assisting the requirements engineer in assessing the validity and accuracy of development artifacts [11, 12, 13].

Based on the discussion above, the contributions of this paper are twofold: (1) We propose a systematic and integrated development process according to which UI requirements are derived as a logical progression from a functional requirements specification. (2) We demonstrate how our tool WTM Simulator [12] assists the requirements engineer in verifying whether a task model is a valid refinement of a given use case model.

The remainder of this paper is organized as follows: In Section 2, we sketch out, from a generic point of view, the key characteristics of the development process we propose. Sections 3 and 4 define use case models and task models as means for capturing functional and UI requirements, respectively. In Section 5 we introduce the WebTaskModel (WTM) approach and present its application to verifying conformity between use case and task models. Finally, in Section 6 we conclude and provide an outlook on future avenues. Related work is discussed throughout the paper.
2 Systematic and Integrated Development Process

The basic idea of our current work on a systematic and integrated development process is depicted in Fig. 1. Use cases are used to capture the bare functional requirements of the system, which are afterwards refined into UI-specific requirements by means of a set of task models. Both use cases and task models belong to the family of scenario-based notations, and as such capture sets of usage scenarios of the system. In theory, both notations can be used to describe the same information. In practice, and in our approach, however, use case models capture requirements at a higher level of abstraction whereas task models are more detailed. Ideally, the functional requirements captured in use cases are independent of a particular user interface [7, 14], whereas the refined requirements captured in the task models take into account the specificities of a particular type of user interface and the characteristics of a detailed user role. For example, if the application supports multiple UIs (e.g., Web UI, GUI, Mobile, etc.) and multiple user types (e.g., novice user and expert user), then the use case model is instantiated into several task models, one for each “type” of user interface and user.

In modern software engineering, the development lifecycle is divided into a series of iterations. Within each iteration, a set of disciplines and associated activities are performed while the resulting artifacts are incrementally refined and perfected. The development of use case and task models is no exception to this rule. On the one hand, ongoing prioritization and filtering activities during the early stages of development will
gradually refine the requirements captured in the use case model. On the other hand, a task model is best developed in a top-down manner, where a coarse-grained task model is gradually refined into a more detailed or more restricted task model. In both cases, it is important to ensure that the refining model is a proper refinement of its respective base model (and all its predecessor models). Validation is an important step of a model-based approach, so as to avoid ill-defined or misbehaving models impacting the final design.
Fig. 1. From Functional Requirements to UI Requirements
To illustrate the introduced development process, we use an example that is based on a scenario in which a new web-based Invoice Management System (IMS) is to be developed. It should feature (among others) the following functionalities: “Order Product”, “Cancel Order”, “View Orders”, and “Ship Order”. All the functionalities shall be accessible through a Web UI and should support two user types: New Customer and Registered Customer. As a first step, a functional requirements specification in the form of a use case model is developed, which is shown next.
3 Functional Requirements Specification: Use Cases

Use cases were introduced in the early 1990s by Jacobson [15]. He defined a use case as a “specific way of using the system by using some part of the functionality.” Modern popularizations of use case models are often attributed to Cockburn [14]. Use case modeling is making its way into mainstream practice as a key activity in the software development process (e.g., the Rational Unified Process [16]). There is accumulating evidence of significant benefits to customers and developers [17].

A use case model captures the “complete” set of use cases for an application, where each use case specifies possible usage scenarios for a particular functionality offered by the system. Every use case starts with a header section containing various properties (e.g., primary actor, goal, goal level, etc.). The core part of a use case is its main success scenario. It indicates the most common way in which the primary actor can reach his/her goal by using the system. A use case is completed by specifying the use case extensions. These extensions define alternative scenarios which may or may not lead to the fulfillment of the use case goal.

An example use case is given in Fig. 2. The use case captures the interactions for the “Order Product” functionality of the previously mentioned Invoice Management System (IMS). The main success scenario of the use case describes the situation in which the primary actor directly accomplishes his/her goal of ordering a product. The
extensions specify alternative scenarios which may (3a, 6a) or may not (7a) lead to the abandonment of the use case goal. In the next section we show how the “Order Product” use case is refined by UI-specific task models.

Use Case: Order Product
Primary Actor: Customer
Goal: Customer places an order for a specific product.
Level: User-goal
Main Success Scenario:
1. Primary actor browses the product inventory and selects a specific product for purchase.
2. Primary actor specifies the desired quantity.
3. System validates the availability of the product quantity and displays a purchase summary.
4. Primary actor provides/validates payment and shipping information.
5. System prompts primary actor to accept the terms and conditions and to confirm the order.
6. Primary actor accepts and confirms.
7. System has the payment authorization unit carry out payment and finalizes the order.
8. System confirms and invoices the order.
9. Use case ends successfully.
Extensions:
3a. The desired product is not available:
3a1. System notifies primary actor that product in desired quantity is not available.
3a2. Use case ends unsuccessfully.
6a. The primary actor cancels the use case:
6a1. Use case ends unsuccessfully.
7a. The payment information is invalid:
7a1. System notifies customer that payment information provided is invalid.
7a2. Use case resumes at step 4.
Fig. 2. “Order Product” Use Case
4 Refined UI Requirements Specification: Task Models

Task modeling is by now a well-understood technique supporting user-centered UI design. The resulting specification is the primary input to the UI design stage in most HCI development approaches. Since we use task models to refine the raw requirements specification given by use cases, several task specifications may be defined for a single use case, one for each type of user interface and/or user type.

A task model describes how users will be able to achieve their goals by means of the future application. Furthermore, it also indicates how the system will support the involved (sub)tasks. Several approaches to defining such models exist (e.g., CTT [13], TaO Spec [18], MAD [19] and VTMB [11]). The WebTaskModel (WTM) used here is a further development of our previous work [11] to account more appropriately for characteristics of interactive web applications. The enhancements, however, are applicable to conventional interactive systems as well. In the following we are not going to point out web-specific details but introduce only those extensions relevant to this paper. A more comprehensive overview of WTM can be found in [12, 20].

Fig. 3 shows a subset of a task model refining the “Order Product” use case described above. The task model was specifically developed for a Web UI and the user type New Customer. As usual, the task hierarchy shows the decomposition of a task into its subtasks, which can be of different task types. In the specification of refined UI
requirements we distinguish between cooperation tasks, denoting pieces of work that are performed by the user in conjunction with the application; user tasks, denoting the user parts of the cooperation performed without system intervention; and system tasks, defining pure system parts (each task type is represented by its own icon in the diagrams). Abstract tasks, similarly to CTT [13] and MAD [19], are compound tasks whose subtasks belong to different task categories.
Fig. 3. “Order Product” Task Model for the role New Customer
The order of task execution is given by temporal relations. In the notation used in the figure, temporal relations are denoted by symbols: a choice symbol defines a selection of subtasks, while >> denotes tasks that are to be performed strictly one after the other in the specified order. The partial task model shown in Fig. 3 specifies the task order product, which is decomposed into the subtasks search for product (corresponding to step S1 of the use case), specify quantity (step S2), feedback (S3 and S3a1) and payment (steps S4–S8). The task feedback is decomposed into the subtasks display summary, for which we define the precondition C1: product quantity available, and display prod. unavailable, for which we define the precondition NOT C1. Both conditions are derived from use case extension 3a. Please note that the conditions are not shown in the diagram but were assigned by means of the task property window of the WTM editor (see [20]). The task display prod. unavailable is a so-called stop task. It denotes the premature termination of the scenario and is the task model counterpart to use case step S3a1.

In addition to the task model for the role New Customer, a task model for a Registered Customer is compiled. It differs from the presented task model in terms of how the payment task is broken down. Instead of having to provide the shipping and payment information in each case, a registered customer has the option to alter shipping or payment data or to entirely skip the involved subtasks. As seen, different sub-roles lead to slightly different UI requirements. If different UI types were to be supported, the use case model would also be refined into device-specific task models.
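Purely as an illustration (the paper specifies this model graphically in Fig. 3, and the WTM conditions are set in the editor's property window rather than in text), the decomposition and preconditions just described might be written down along these lines:

```python
# Hypothetical textual encoding of the "order product" decomposition of Fig. 3.
# The enabling operator ">>" is represented here by the order of the subtask
# list; preconditions are attached as strings naming the conditions C1 / NOT C1.
task_model = {
    "name": "order product",
    "operator": ">>",                    # subtasks performed strictly in order
    "subtasks": [
        {"name": "search for product"},                       # use case step S1
        {"name": "specify quantity"},                         # step S2
        {"name": "feedback", "subtasks": [                    # guarded alternatives
            {"name": "display summary", "pre": "C1"},         # S3: product available
            {"name": "display prod. unavailable",             # S3a1: stop task
             "pre": "NOT C1", "stop": True},
        ]},
        {"name": "payment"},                                  # steps S4-S8
    ],
}
```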
5 Tool-Supported Validation

As mentioned above, use case models capture requirements at a higher level of abstraction whereas task models are more detailed, taking into account the specificities of a particular type of user interface and the characteristics of a detailed user role. The question arises whether or not a task model faithfully refines the use case it is based on. The requirements engineer is not exempted from deciding this question, as finding the answer often depends on domain knowledge and project details. In the following we demonstrate how the tool WTM Simulator [12] can be used to check whether a task model is behaviorally equivalent to a given use case.

Firstly, use cases are transformed into a formal (machine-readable) representation based on finite state machines. In the WTM approach, task models are represented by a set of task state machines, which are used within the final application as part of the UI controller [21]. Task state machines are also used to simulate task models within the development steps. In the work reported here, a formal correspondence between use case and task models is established to simulate their execution in conjunction. This will be presented by means of a concrete simulation example.

5.1 Mapping Use Cases to UC-FSM

At first, use cases are transformed into a finite state machine representation called UC-FSM. A UC-FSM is a labeled, directed, connected graph, where nodes denote states and edges represent state transitions. In a UC-FSM the execution of a step is denoted by a transition. The transition labels serve as references to the corresponding steps in the original use case description. We believe that UC-FSMs capture the nature of use cases easily and intuitively. As use cases are typically captured in purely narrative form, the derivation of the use case graph will be a manual activity. The composition of the use case graph from a given use case depends on the flow constructs which are implicitly or explicitly entailed in the use case. Examples of such flow constructs are: jumps (e.g., use case resumes at step X), sequencing information (e.g., the numbering of use case steps), or branches to use case extensions. Concrete details on the mapping process, as well as a slightly more elaborate formal model, can be found in [22].
Fig. 4. Use Case FSM for “Order Product” Use Case
Fig. 4 depicts the corresponding UC-FSM for the “Order Product” use case. As shown, all the steps of the use case are also present in the UC-FSM. Note that starting from the states {quant.selected}, {awaiting confirmation} and {confirmed}, two transitions are defined, denoting the execution of steps in the main success scenario and, alternatively, the execution of steps defined in the corresponding extensions.
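For illustration, a UC-FSM such as the one in Fig. 4 can be represented as a plain transition table. The sketch below is our own; only the states {quant.selected}, {awaiting confirmation}, {confirmed}, {prod. available} and {prod. unavailable} are named in the paper, so the remaining state names are assumptions:

```python
# Hypothetical transition table for the "Order Product" UC-FSM (cf. Fig. 4).
# Keys are states; values map a use case step (the transition label)
# to the successor state.
UC_FSM = {
    "start":                 {"S1": "product selected"},
    "product selected":      {"S2": "quant.selected"},
    "quant.selected":        {"S3": "prod. available",       # main success scenario
                              "S3a1": "prod. unavailable"},  # extension 3a
    "prod. available":       {"S4": "payment info provided"},
    "payment info provided": {"S5": "awaiting confirmation"},
    "awaiting confirmation": {"S6": "confirmed",
                              "S6a1": "cancelled"},          # extension 6a
    "confirmed":             {"S7": "payment carried out",
                              "S7a1": "payment invalid"},    # extension 7a
    "payment invalid":       {"S7a2": "prod. available"},    # "resumes at step 4"
    "payment carried out":   {"S8": "invoiced"},
    "invoiced":              {"S9": "success"},
}

def step(state: str, event: str) -> str:
    """Fire one transition; raises KeyError if the step is not allowed here."""
    return UC_FSM[state][event]
```

For example, step("quant.selected", "S3") yields "prod. available", while step("quant.selected", "S3a1") yields "prod. unavailable".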
5.2 Task State Machine and UC-FSM Assignment

In WTM each task formally possesses a state machine describing a generic task life cycle (see Fig. 5). For each task the state machine can be extended to specify application-specific task behavior. The rules that are used for this purpose are of the form task.task-state.task-event → action, where task denotes the task whose behavior is extended, and task-state and task-event denote the state and corresponding trigger event upon which the action is to be performed. In the work presented in this paper, this “extension” technique is used to combine task state machines with the UC-FSM. The objective is to specify dependencies between task executions and use case steps.
State | Meaning
initiated | if all preconditions are fulfilled the task can be started
skipped | the task is omitted
running | denotes the actual performance of the task and of its subtasks, if applicable
completed | marks a successful task execution
suspended | the task is interrupted

Fig. 5. Generic Task State Machine (states initiated, skipped, running, suspended, completed and terminated, connected by the trigger events Start, Skip, Restart, Suspend, Resume, End and Abort)
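One compact way to read Fig. 5 is as a transition table. The following sketch is our illustration, not WTM code; the states and trigger events are those of the figure, while the exact source and target of the Restart and Abort transitions are an assumption on our part:

```python
# Plausible encoding of the generic task life cycle of Fig. 5:
# (state, trigger event) -> next state.
TASK_FSM = {
    ("initiated", "Start"):   "running",
    ("initiated", "Skip"):    "skipped",
    ("skipped",   "Restart"): "initiated",   # assumed source of Restart
    ("completed", "Restart"): "initiated",   # assumed source of Restart
    ("running",   "Suspend"): "suspended",
    ("suspended", "Resume"):  "running",
    ("running",   "End"):     "completed",
    ("running",   "Abort"):   "terminated",  # assumed source of Abort
    ("suspended", "Abort"):   "terminated",  # assumed source of Abort
}
```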
In order to run a conformance simulation, we extend the various task state machines such that they generate the trigger events needed to run the UC-FSM. The specification of the extension rules depends on which tasks are meant to be a refinement of which use case step. Hereby, due to the aforementioned different levels of abstraction, one use case step is often refined by several tasks. Table 1 (columns 1 and 2) depicts the refinement mapping between use case steps and tasks. Note that abort order product is added since S3a1 is a stop task. The mappings defined by the row of step S4 result from the task model differentiation of the role Customer. Column 3 of Table 1 depicts the state of the task state machine responsible for sending the corresponding use case event to the UC-FSM. Examples of rules resulting from Table 1 are:

display summary.completed.on_entry → send S3 to Use Case product order
display prod. unavailable.completed.on_entry → send S3a1 to Use Case product order → send abort to task order product
Finally, we note that the table is created manually by the requirements engineer. In our experience, if the task model was specifically developed based on a given use case specification (as suggested in this paper), the corresponding refinement mappings are clearly defined, and hence the construction of the table is a straightforward activity.
Table 1. Refinement Mapping between Use Case Steps and Tasks

Step | Task | Task State
S1 | search for a product | completed
S2 | specify quantity | completed
S3 | display summary | completed
S3a1 | display prod. unavailable | completed / abort order product
S4 | New Customer: provide payment information | completed
   | Registered Customer: alter data | completed or skipped
S5 | prompt confirmation | completed
… | … | …
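The extension rules of Section 5.2 together with the rows of Table 1 could, purely as an illustration, be encoded and dispatched as follows. Names mirror the paper; the step function refers to the UC-FSM sketch given earlier, and the on_entry event handling is an assumption of ours:

```python
# Hypothetical rule table derived from Table 1: when a task state machine
# enters the given state, send the corresponding step event to the UC-FSM.
RULES = {
    # (task, task-state, task-event) -> list of actions
    ("search for a product", "completed", "on_entry"): [("send", "S1")],
    ("specify quantity", "completed", "on_entry"):     [("send", "S2")],
    ("display summary", "completed", "on_entry"):      [("send", "S3")],
    ("display prod. unavailable", "completed", "on_entry"):
        [("send", "S3a1"), ("abort", "order product")],
    # further rows of Table 1 (S4, S5, ...) would be added analogously
}

def on_task_event(task: str, state: str, event: str, uc_state: str) -> str:
    """Fire all rules matching a task event; returns the updated UC-FSM state."""
    for action, arg in RULES.get((task, state, event), []):
        if action == "send":
            uc_state = step(uc_state, arg)    # drive the UC-FSM (earlier sketch)
        elif action == "abort":
            print(f"abort task {arg!r}")      # placeholder for aborting a task FSM
    return uc_state
```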
5.3 WTM Simulation Tool and Example
In [12] we presented a tool that supports the developer in validating task, role and task-object models and their behavioral interrelations by means of model simulation. In the tool, each task is represented by an icon showing static and dynamic information about the task (such as the task type, temporal relations, and the current state). A context menu attached to each task allows triggering one of the events that are defined by the generic task state machine and are currently valid. The WTM Simulator provides the software engineer with different areas implementing several views on the models, e.g., showing the hierarchical task structure, listing all tasks that can be started or ended at the current point in time, respectively, and presenting task objects. Some examples are shown by the screenshots in Fig. 6. Here, the object area shows only USE CASE product order and its state changes resulting from task execution. Please note that modeling use cases as objects is only a workaround, since the use case extensions are not yet implemented in the WTM Simulator.

In the upper part of Fig. 6 the UC-FSM is in the state quant.selected. Since the condition C1 is fulfilled (see condition area), the task display summary can be performed at this point in time. After its completion the UC-FSM state changes to prod. available and provide shipping information is enabled (indicated by the arrow in Fig. 6). The second scenario shows the unsuccessful run in the case of NOT C1 (defined by C2): once the display-task is executed, order product is terminated (thus the startable leaf task area is empty) and the UC-FSM switches to the state prod. unavailable.

During simulation the requirements engineer can check whether or not each task sequence allowed by the task model is a valid scenario according to the use case specification, and vice versa. Furthermore, the simulator also allows one to observe how the steps of a scenario under investigation affect task objects and domain objects, respectively. As in the case of the USE CASE object, the simulator tool represents them in the object area showing their names, classes, and their manipulations in terms of state changes. Similarly, but not depicted in Fig. 6, a role area shows all defined roles, allowing the investigation of role changes resulting from task execution as well as the disabling and enabling of tasks caused by role changes. For example, the requirements engineer can check the validity of a user registration scenario (by which the role has to change from New Customer to Registered Customer) and its interactions with the use cases and task models, respectively, defined for each role.
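Reading the two simulated scenarios in terms of the illustrative sketches above (again our own construction, not the WTM Simulator's API), they amount to:

```python
# Scenario 1: C1 holds, so "display summary" completes and the UC-FSM
# moves from quant.selected to prod. available.
state = "quant.selected"
state = on_task_event("display summary", "completed", "on_entry", state)
assert state == "prod. available"

# Scenario 2: NOT C1, so the display-task for the unavailable product runs,
# "order product" is aborted and the UC-FSM ends in prod. unavailable.
state = "quant.selected"
state = on_task_event("display prod. unavailable", "completed", "on_entry", state)
assert state == "prod. unavailable"
```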
Fig. 6. Simulating Task and Use Case Executions (upper part: scenario 1, the successful run; lower part: scenario 2, the unsuccessful run)
6 Conclusion

In this paper we presented our current work towards an integrated development methodology for the derivation of UI requirements from high-level functional requirements. The development approach reported here consists of two basic steps. First, a use case model is iteratively created to capture core application requirements. Next, the use case model is successively refined into a set of task models. While use cases capture “raw” functional requirements which are independent of a particular user interface, task models capture refined UI-specific requirements which not only take into account the specificities of a particular type of user interface but also the characteristics of a detailed user role. As a result, one use case is typically refined by several task models, one for each UI type or user role. The focus of this paper was on the systematic development of use case and task models. Our approach, however, also takes user roles and involved objects into account, the description of which has been omitted for the sake of conciseness.

The tool WTM Simulator was used to check conformity between a task model and a given use case model. In particular, we demonstrated how use cases can be translated into a state machine representation and formally combined with the task state machine approach of WTM, which in turn is used as input to the simulator. The results of the simulation guide and assist the developer in deciding whether the task model is a valid refinement of the underlying use case.

The research reported in this paper is the first offspring of a larger project, the goal of which is the establishment of a model-driven UI engineering framework encompassing all phases of the software lifecycle and the involved models. In our next working step we will elaborate the refinement of the functional requirements, e.g., by means of UML activity diagrams. We also aim to further extend the WTM Simulator such that it allows for direct input of structured textual use cases and (semi-)automatically generates refinement mappings between use case steps and tasks.
References
1. Kazman, R., Gunaratne, J., Jerome, B.: Why Can’t Software Engineers and HCI Practitioners Work Together? In: Proc. of HCI Intern., Crete, Greece, pp. 504–508 (2003)
2. Ferre, X., Juristo, H., Windl, H., Constantine, L.: Usability basics for software developers. IEEE Software 18(1), 22–29 (2001)
3. Kazman, R., Bass, L., John, B.: Bridging the gaps between software engineering and human-computer interaction. In: Workshop at ICSE 2004, Scotland, UK (2004)
4. Sutcliffe, A.: Convergence or Competition between Software Engineering and Human Computer Interaction. In: Seffah, A., Desmarais, M.C., Metzger, M. (eds.) Human-Centered Software Engineering - Integrating Usability in the Software Development Lifecycle, pp. 71–83. Springer, Heidelberg (2005)
5. Sinnig, D., Chalin, P., Khendek, F.: Common Semantics for Use Cases and Task Models. In: Proc. of Integrated Formal Methods, Oxford, England, pp. 579–598 (2007)
6. Clemmensen, T., Norbjerg, J.: Separation in Theory – Coordination in Practice. In: Workshop Bridging the Gap between Software Engineering and HCI, Portland (2003)
7. Constantine, L.L., Lockwood, L.A.D.: Software for Use: A Practical Guide to the Models and Methods of User Centered Design. Addison-Wesley, Reading (1999)
8. Constantine, L., Biddle, R., Noble, J.: Usage-Centered Design and Software Engineering: Models for Integration. In: Workshop Bridging the Gaps Between SE and HCI, Portland (2003)
9. Kujala, S.: Linking User Needs and Use Case-Driven Requirements Engineering. In: Human-Centered Software Engineering - Integrating Usability in the Development Process, pp. 113–125 (2005)
10. Paternò, F.: Towards a UML for interactive systems. In: Nigay, L., Little, M.R. (eds.) EHCI 2001. LNCS, vol. 2254, pp. 7–18. Springer, Heidelberg (2001)
11. Biere, M., Bomsdorf, B., Szwillus, G.: Specification and Simulation of Task Models with VTMB. In: Proc. of Computer-Human Interaction Conference, pp. 1–2 (1999)
12. Bomsdorf, B.: The WebTaskModel Approach to Web Process Modelling. In: Proc. of Task Models and Diagrams for User Interface Design, Toulouse, France, pp. 240–253 (2007)
13. Paternò, F.: Model-Based Design and Evaluation of Interactive Applications. Springer, Heidelberg (2000)
14. Cockburn, A.: Writing Effective Use Cases. Addison-Wesley, Boston (2001)
15. Jacobson, I.: Object-Oriented Software Engineering: A Use Case Driven Approach. ACM Press (Addison-Wesley Pub), New York (1992)
16. Larman, C.: Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development, 3rd edn. Prentice Hall PTR, Englewood Cliffs (2004)
17. Merrick, P., Barrow, P.: The Rationale for OO Associations in Use Case Modelling. Journal of Object Technology 4(9), 123–142 (2005)
18. Dittmar, A., Forbrig, F., Stoiber, S., Stary, C.: Tool Support for Task Modelling - A Constructive Exploration. In: Proc. of DSV-IS, Hamburg, Germany, pp. 59–76 (2004)
19. Sebillotte, S., Scapin, D.L.: From users’ task knowledge to high level interface specification. International Journal of Human-Computer Interaction 6, 1–15 (1994)
20. Bomsdorf, B.: Modelling Interactive Web Applications: From Usage Modelling towards Navigation Models. In: Proceedings of 6th International Workshop on Web-Oriented Software Technologies – IWWOST 2007, Como, Italy, pp. 194–208 (2007)
21. Betermieux, S., Bomsdorf, B.: Finalizing dialog models at runtime. In: Baresi, L., Fraternali, P., Houben, G.-J. (eds.) ICWE 2007. LNCS, vol. 4607, pp. 137–151. Springer, Heidelberg (2007)
22. Sinnig, D., Chalin, P., Khendek, F.: LTS Semantics for Use Case Models. In: Proceedings of ACM - SAC 2009, Honolulu, HI (to appear, 2009)
A Position Paper on 'Living Laboratories': Rethinking Ecological Designs and Experimentation in Human-Computer Interaction

Ed H. Chi

Palo Alto Research Center, Augmented Social Cognition Group, 3333 Coyote Hill Road, Palo Alto, CA 94304 USA
[email protected]

Abstract. HCI has long moved beyond the evaluation setting of a single user sitting in front of a single desktop computer, yet many of our fundamentally held viewpoints about evaluation continue to be ruled by outdated biases derived from this legacy. We need to engage with real users in 'Living Laboratories', in which researchers either adopt or create functioning systems that are used in real settings. These new experimental platforms will greatly enable researchers to conduct evaluations that span many users, places, times and social factors in ways that were unimaginable before.

Keywords: HCI, Evaluation, Ecological Design, Living Laboratories, Methodology, Web Services.
1 Introduction

Looking back on the history of Human-Computer Interaction as a field, we see fundamental contributions mainly from two groups of researchers: (1) computing scientists interested in how technology would change the way we all interact with information, and (2) psychologists (especially cognitive psychologists) interested in the implications of those changes. This created a combustible environment for great research, because the computing scientists wanted to create great and interesting tools but did not have a great way to measure their impact, while many classically trained psychologists were looking beyond classic research on the brain and the understanding of human cognition. This resulted in an area called “Human Information-Processing”, which was closely coupled with the growth of cognitive psychology, human factors, and human engineering [1, 11].

One enduring core value in Human-Computer Interaction (HCI) research has been the development of technologies that augment human intelligence. This mission originates with V. Bush, Licklider, and Engelbart, who inspired many researchers, such as Alan Kay at PARC, in the development of the personal computer and the graphical user interface. Together, both groups of researchers were excited by the possibilities of computing machinery for producing systems that augmented human intellect [5], a possibility that was deeply intriguing to researchers who may have been
slightly disillusioned with artificial intelligence research but who still believed computers were great tools for modeling and understanding human cognition. The aim of augmenting human cognition has remained a core value for Human-Computer Interaction research.

With this aim, during the formation of the field, the need to establish HCI as a science pushed us to adopt methods from psychology, both because it was convenient and because the methods fit the needs. The HCI field's rise paralleled the rise of the notion of personal computing: the idea that each person would have one computer at her command. Systems were evolving from many users using a single system to a single user multi-tasking with her own desktop computer. The costs of these systems forced researchers to think about how users would most productively accomplish knowledge work. The metaphor of the desktop, files, windows, and graphical icons on bitmapped displays arrived naturally. The study of how users would respond to icons flashing on the screen, or how users would move a pointing device like the mouse [2] to move a file from one location to the next, paralleled the psychological experiments on stimulus and human response that psychologists were already routinely conducting. Fitts' law [1, 2], models of human memory [7], and cognitive and behavioral modeling methods like GOMS [1] enabled HCI researchers and practitioners to model a single user interacting with a single computer.
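For reference (our addition; the exact formulation varies across sources, including [1, 2]), Fitts' law predicts the movement time MT to a target of width W at distance D, with device-specific constants a and b fitted empirically:

\[ MT = a + b \log_2\!\left(\frac{2D}{W}\right) \]

Closed-form performance models of this kind are exactly what made the single-user, single-display paradigm so amenable to controlled laboratory study.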
2 Outdated Evaluative Assumptions

Of course, the world has changed. Trends in social computing as well as ubiquitous computing have pushed us to consider research methodologies that are very different from those of the past. In many cases, we can no longer assume:

Only a single display. Users will pay attention to only one display and one computer. Much of fundamental HCI research methodology assumes that the singular occupation of the user is the display in front of them. Of course, this is no longer true. Not only do many users already use multiple displays, they also use tiny displays on cell phones and iPods and peripheral displays. Matthews et al. studied the use of peripheral displays, focusing particularly on glance-ability, for example. Traditional HCI and psychological experiments typically force users to attend to only one display at a time, often neglecting the purpose of peripheral display designs.

Only knowledge work. Users are performing the task as part of some knowledge work. The problem with this assumption is that non-information-oriented work, such as entertainment applications and social networking systems, is often done without explicit goals in mind. With the rise of Web 2.0 applications and systems, users are often on social systems to kill time, learn the current status of friends, and serendipitously discover what might capture their interests.

Isolated worker. Users are performing some task by themselves. Much of knowledge work turns out to be quite collaborative, perhaps more so than first imagined. The traditional view of HCI assumed the construction of a single report by a single individual, as needed by a hierarchically organized firm. Generally speaking, we have come to view such assumptions with contempt. Information work, especially work done by highly paid analysts, is highly collaborative. Only the highly automated tasks that are
routine and mundane are done in relative isolation. Information workers excel at exception handling, which often requires the collaboration of many departments in different parts of the organizational chart.

Stationary worker. The user's location is stationary, and the computing device is stationary. A mega-trend in information work is the speed and mobility with which work is done. Workers are geographically dispersed, making collaboration across geographical boundaries and time zones critical. As part of this trend, work is often done on the move, or in the air while disconnected. Moreover, situation awareness is often maintained via email clients such as Blackberries and iPhones. Many estimates now suggest that more people already access the internet on their mobile phones than on desktop computers. This certainly has been the trend in Japan, a bellwether of mobile information needs.

Task duration is short. Users are engaged with applications on time scales measured in seconds and minutes. While information work can be divided into many slices of smaller subgoals that can be analyzed separately, we now realize that many user needs and work goals stretch over long periods of time. User interests in topics as diverse as news on the latest technological gadgets and snow reports for snowboarding need to be supported over periods of days, weeks, months and even years. User engagement with web applications is often measured over much longer periods of time than in the more traditional psychological experiments geared toward understanding hand-eye coordination in single desktop application performance. For example, Rowan and Mynatt studied peripheral family portraits in the digital home over a year-long period and discovered that behavior changed with the seasons [14].

The above discussion points to how, as a field, HCI researchers have slowly broken out of the mold in which we were constrained. Increasingly, evaluations must be done in situations in which there are just too many uncontrolled conditions and variables. Artificially created environments such as in-lab studies are only capable of telling us about behaviors in constrained situations. In order to understand how users behave across varied times, places, contexts and other situations, we need to systematically re-evaluate our research methodologies.
3 Re-thinking Evaluations

Fundamentally, traditional HCI research is bursting at the seams in two different ways: (1) ubiquitous computing research is challenging the notion of personal computing in front of a desktop, looking at computation that is embedded in the environment as well as computation done with ever more powerful devices that can be taken along while mobile [3, 4]; (2) social computing research is simultaneously challenging the notion of computing systems designed for the individual rather than for a group or community [6, 12]. Both trends have required re-thinking our evaluation methodologies. Traditional CSCW research has already drawn on qualitative methodologies from the social sciences, including field observations and interviews, diary studies, survey methods, as well as focus groups and direct participation. Ubicomp, on the other hand, has used
a mixture of methods, but has more readily examined actual deployments with real users in the field. In either case, it may be time for us to fundamentally re-think how HCI researchers ought to perform evaluations, as well as the goals of those evaluations.

Since, increasingly, HCI systems are designed not for a single person but for a whole group, we need research that augments not just human intelligence, but also group intelligence and social intelligence. Indeed, a natural extension of research in augmenting human intellect is the development of technologies that augment social intelligence, led by research in the Social Web and Web 2.0 movements. Traditional CSCW research has already studied the needs of coordination for a group and, to some extent, a community of practice. Many researchers are now conducting research in a social context, in which factors are less easy to isolate and control in the lab. Some research in the past might have treated variations in social contexts as part of the noise of the overall experiment, but this is clearly unsatisfactory, since larger subject pools are then necessary to overcome the loss in the power of the experiment. Moreover, we now know that many social factors follow distributions that are not normally distributed, making the prediction of individual factors in greatly varying social situations difficult, if not impossible.

Since users now interact with computing systems in varied ubiquitous contexts, ecological validity is often much more important than studying factors in isolation. In ubicomp applications, for example, productivity measurements are often not the only metrics that matter. For example, adoption of mobile applications is now often cited as evidence of the usefulness of an application. One might argue that if using an application results in no productivity increase, then the fact that the application is adopted is irrelevant. However, this view is short-sighted, because the opposite is also true: if there is a productivity increase from using the application, but there is no adoption (perhaps due to ease-of-use issues, for example), then it is also unclear what benefit the application will ultimately bring. Obviously, the best situation is to have both productivity improvements and real adoption. However, research resource constraints often conspire against achieving both. Interestingly, academic research often tends to focus on the former rather than the latter, increasing the perceived gulf between the academic ivory tower and the trenches of the practitioners.

An example that illustrates this gulf is the studies around color copiers and printers. It has been circulated here at PARC that researchers had studied the need for color output from copiers and printers, and had concluded that there was either a negligible or no productivity increase from using color. Cost and benefit analysis showed that black-and-white copiers were often just as good and more economical than color copiers in the majority of cases. While it is unclear whether the studies took into account that increased use of color in various media might drive future demand for and utility of color systems, what is clear now is that the adoption of color copiers and printers occurred independently of productivity studies.
If what matters in industry is the adoption of technology, while academic research remains focused on measurements of productivity, we will never bring the two communities together, and technology transfer will forever remain challenging.
4 Evaluations Using 'Living Laboratories'

The Augmented Social Cognition group has been a proponent of the idea of the 'Living Laboratory' within PARC1. The idea is that in order to bridge the gulf between academic models of science and practical research, we need to conduct research within living laboratories. Many of these living laboratories are real platforms and services that researchers build and maintain, and, just like Google Labs or beta software, they may remain somewhat unreliable and experimental, yet useful and real. The idea is to engage real users in ecologically valid situations, while gathering data and building models of social behavior.

Looking at the ways in which HCI researchers could conduct evaluations, one dimension is whether the system is under the control of the researcher or not. Typically, computing scientists build systems and want them evaluated for effectiveness. The other dimension is whether the study is conducted in the laboratory or in the wild. These two dimensions interact to form four different ways of conducting evaluations:

(1) Building a system, and studying it in the laboratory. This is the most traditional approach in HCI research and the one that is typically favored by CHI conference paper reviewers. The problem with this approach is that (a) it is extremely time-consuming, and (b) experiments are not always ecologically valid. As mentioned before, it is extremely difficult, if not impossible, to design laboratory experiments for many social and mobile applications that are ecologically valid.

(2) Not building a system (but adopting one), and still studying it in the laboratory. For example, this is possible by taking existing systems, such as Microsoft Word and iWork Pages, and comparing the features of these two systems.

(3) Adopting an existing system, and studying it in the wild. The advantage here is to study real applications that are being used in ecologically valid situations. The disadvantage is that findings are often not comparable, since factors are harder to isolate. On the other hand, real findings can be immediately applied to the live system, and the impact of the research is real, since adoption issues are already removed. As illustrated below, we have studied Wikipedia usage in detail using this method.

(4) Building a system, releasing it, and studying it in the wild. A well-publicized use of this approach is Google's A/B testing approach2. According to Google, A/B testing allowed them to finely tune the Search Engine Result Pages (SERPs). Some details about this kind of A/B online experiment have been documented [8]. For example, how many search results a page should contain was studied carefully by varying the number across a great number of users. Because the subject pool is large, Google can say with some certainty which design is better on their running system. A major disadvantage of this approach is the effort and resources it takes to study such systems. However, for economically interesting applications
1 http://asc-parc.blogspot.com/2008/11/living-laboratories-rethinking.html
2 http://news.cnet.com/8301-10784_3-9954972-7.html
such as Web search engines, the tight integration between system and usage actually shortens the time to innovate between product versions.

Of these variations, (3) and (4) are what we consider to be 'Living Laboratory' studies.
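As a minimal, hypothetical sketch of the A/B mechanics described under (4) (not Google's actual implementation), a stable identifier can be hashed to assign each user deterministically to a variant:

```python
import hashlib

def ab_bucket(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically assign a user to a variant: the same user always
    sees the same variant of a given experiment, and assignment is
    approximately uniform across the user population."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# e.g. vary the number of search results shown per page (illustrative names):
results_per_page = {"A": 10, "B": 20}[ab_bucket("user-42", "serp-length")]
```

Hashing rather than randomizing on every visit keeps each user's experience consistent across sessions, which matters when treatments run for days or weeks on a live system.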
5 Examples of Living Laboratory Style Research

Here we will illustrate how to conduct Living Laboratory studies with some examples.

GroupLens and MovieLens. A first example of building a real system, releasing it, and studying it in the wild is the seminal work of the GroupLens [9] research group at the University of Minnesota. GroupLens was first created to deal with information overload, particularly the high volume of traffic on Usenet news. In this way, GroupLens hoped to adopt an existing community and system, augment it with some technology, and study how the technology performed in the wild. The technology in question was collaborative filtering. The idea at the time was related to user profiling: users expressing interest in the same items must be somewhat similar and can form a virtual neighborhood; therefore, we can recommend to them items that their neighbors are interested in. The research group was somewhat successful in doing this, as enough users on Usenet news adopted the technology and provided feedback on the system.

Later, the research group built a movie recommendation site on the Web, called MovieLens [10], that used similar collaborative filtering algorithms. The website retained a community of about 6000 users that became an ecosystem in itself. Someone volunteered to keep the movie database up to date, and some users participated in discussions about features the recommendation system should have. Later research on specific recommendation algorithms often split users into groups temporarily, where one group might receive one treatment while the other would receive another. The results are then compared to see how the two groups differed, including whether they evolved different group behaviors.
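A toy version of the user-user collaborative filtering idea sketched above (our illustration; the published GroupLens algorithms differ in detail) might look like this:

```python
from math import sqrt

def similarity(a: dict, b: dict) -> float:
    """Cosine similarity between two users' rating vectors over shared items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    return dot / (sqrt(sum(v * v for v in a.values())) *
                  sqrt(sum(v * v for v in b.values())))

def recommend(user: str, ratings: dict, k: int = 2) -> list:
    """Score unseen items by similarity-weighted ratings of the k nearest neighbors."""
    me = ratings[user]
    neighbors = sorted((similarity(me, r), u)
                       for u, r in ratings.items() if u != user)[-k:]
    scores = {}
    for sim, u in neighbors:
        for item, rating in ratings[u].items():
            if item not in me:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical rating data:
ratings = {
    "ann": {"Alien": 5, "Brazil": 4, "Casablanca": 1},
    "bob": {"Alien": 4, "Brazil": 5, "Dune": 4},
    "eve": {"Casablanca": 5, "Dune": 2},
}
print(recommend("ann", ratings))  # -> ['Dune']
```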
Fig. 1. The MovieLens system is an academic project with a live community
Games with a Purpose (gwap.com). Luis von Ahn's work on ESP games has evolved into a highly intriguing site called Games with a Purpose (gwap.com). On this site, users can engage in mini-games that are fun in themselves, but that also end up collecting data that is useful in some other way. One well-known example is the image labeler, in which two users (without other means of communication) must agree on the same keyword to receive points. The objective is to agree on labels for as many images as possible in a given timeframe. Here the aim is to engage real users in realistic contexts, in which the goal is to entertain the user while gathering behavioral data that tells us something about the images. One can then analyze word choices over many data points, collective action (including any attempts at cheating), as well as longitudinal issues like the number of repeat visits, the diversity of users, or the virality of the game. Engagement measures, such as stickiness, can be directly measured.
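The matching mechanic of the image labeler can be captured in a few lines; the function and variable names below are our own, hypothetical ones, not the gwap.com implementation:

```python
def first_agreement(labels_a, labels_b):
    """Return the first keyword both players have entered, or None.
    Since players cannot communicate, an agreed label is taken as
    evidence that it genuinely describes the image."""
    seen_a, seen_b = set(), set()
    for a, b in zip(labels_a, labels_b):
        seen_a.add(a)
        seen_b.add(b)
        if a in seen_b:
            return a
        if b in seen_a:
            return b
    return None

print(first_agreement(["dog", "grass", "frisbee"],
                      ["park", "frisbee", "dog"]))  # -> "frisbee"
```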
Fig. 2. The Games With A Purpose (gwap.com) website engages real users with games, while having them accomplish some task that is useful for research
WikiScanner / WikiDashboard over Wikipedia. One realistic approach is to adopt an existing community and system, and to create mashup applications that augment the original system with some new capability, studying its effects. For example, wikis are collaborative systems in which virtually anyone can edit anything. Although wikis have become highly popular in many domains, their mutable nature often leads them to be distrusted as a reliable source of information. For example, Virgil Griffith took open source data from Wikipedia and enabled people to discover the possible identities of Wikipedia editors by cross-referencing IP addresses with institution names3. Our own research on social transparency also took this approach. We downloaded a copy of all of the edits on Wikipedia and tabulated the editing statistics for all articles and all users4. This enabled us to create a visualization of the editing patterns for
each article and each user [13]. WikiDashboard has received tens of thousands of visits from Wikipedia users. We also know that both systems were discussed extensively in the Wikipedia community.
Fig. 3. An example page from the WikiDashboard [13] project, which inserts a visualization of the social dynamics and edit patterns for every Wikipedia page
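The core of this kind of dashboard is a simple aggregation over the revision history. Below is a minimal sketch, assuming the Wikipedia dump has already been parsed into (article, editor) revision records; the field names are hypothetical, not WikiDashboard's actual schema:

```python
from collections import Counter, defaultdict

# Hypothetical revision records parsed from a Wikipedia edit dump.
revisions = [
    {"article": "HCI", "editor": "alice"},
    {"article": "HCI", "editor": "bob"},
    {"article": "HCI", "editor": "alice"},
    {"article": "Usability", "editor": "bob"},
]

by_article = defaultdict(Counter)  # per-article view: who edits this page?
by_editor = defaultdict(Counter)   # per-user view: which pages does this editor touch?
for rev in revisions:
    by_article[rev["article"]][rev["editor"]] += 1
    by_editor[rev["editor"]][rev["article"]] += 1

print(by_article["HCI"].most_common())  # [('alice', 2), ('bob', 1)]
print(by_editor["bob"].most_common())   # [('HCI', 1), ('Usability', 1)]
```

Tabulations like these, rendered as per-page visualizations, are what make the otherwise invisible social dynamics of editing transparent to readers.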
6 Conclusion

HCI research has greatly benefited from borrowing evaluation methods that were fine-tuned in other fields, especially the behavioral sciences. Evaluation methods are inseparable from the kinds of science and models that can be built in a field. HCI has long moved beyond the evaluation of a single user sitting in front of a single desktop computer, yet many of our fundamentally held viewpoints about evaluation continue to be ruled by outdated biases derived from this legacy. In this position paper, we have argued that traditional views of human performance in systems have long focused only on productivity. It is time for us to break out of these long-held views and look at evaluations in more holistic ways.
Fig. 4. A way to think about the role of Living Laboratory prototypes in scientific research
One way to do this is to engage with real users in 'Living Laboratories', in which researchers either adopt or create real, useful systems that are used in real, ecologically valid settings. This enables a tight loop between characterization of behavior, models of the users and system, prototyping, and experimentation. The new Social Web platform is enabling researchers to build systems with amazing speed, allowing the whole loop to be completed within much shorter amounts of time than in the past. Similar experimentation platforms for mobile computing are just becoming reachable, with the iPhone and Google's Android leading the charge. These platforms will greatly enable Living Laboratory researchers to conduct evaluations that span many users, places, times, and social factors in ways that were unimaginable before.

Acknowledgments. We thank PARC's Augmented Social Cognition team and the HCIC workshop for many helpful discussions on this position paper.
References

1. Card, S., Moran, T.P., Newell, A.: The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, Mahwah (1983)
2. Card, S.K., English, W.K., Burr, B.J.: Evaluation of mouse, rate-controlled isometric joystick, step keys, and text keys for text selection on a CRT. Ergonomics 21(8), 601–613 (1978)
3. Carter, S., Mankoff, J., Klemmer, S., Matthews, T.: Exiting the cleanroom: On ecological validity and ubiquitous computing. HCI Journal (2008)
4. Chi, E.H.: Introducing Wearable Force Sensors in Martial Arts. IEEE Pervasive Computing 4(3), 47–53 (2005)
5. Engelbart, D.C.: Augmenting Human Intellect: A Conceptual Framework. Summary Report AFOSR-3223 under Contract AF 49(638)–1024, SRI Project 3578 for Air Force Office of Scientific Research, Stanford Research Institute, Menlo Park, CA (1962)
6. Grudin, J.: Groupware and social dynamics: Eight challenges for developers. Communications of the ACM 37(1), 92–105 (1994)
7. Jones, W.P.: On the Applied Use of Human Memory Models: The Memory Extender Personal Filing System. International Journal of Man-Machine Studies 25(2), 191–228 (1986)
8. Kohavi, R., Longbotham, R.: Online Experiments: Lessons Learned. Computer 40(9), 103–105 (2007), doi:10.1109/MC.2007.328
9. Konstan, J.A., Miller, B.N., Maltz, D., Herlocker, J.L., Gordon, L.R., Riedl, J.: GroupLens: Applying Collaborative Filtering to Usenet News. Communications of the ACM 40(3), 77–87 (1997)
10. Riedl, J., Konstan, J.: Word of Mouse: The Marketing Power of Collaborative Filtering. Warner Books, New York (2002)
11. Sears, A., Jacko, J.A.: The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies, and Emerging Applications. CRC Press, Boca Raton (2008)
12. Shneiderman, B.: Science 2.0. Science 319(5868), 1349–1350 (2008)
13. Suh, B., Chi, E.H., Kittur, A., Pendleton, B.A.: Lifting the Veil: Improving Accountability and Social Transparency in Wikipedia with WikiDashboard. In: Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2008), Florence, Italy, pp. 1037–1040. ACM Press, New York (2008)
14. Rowan, J., Mynatt, E.D.: Digital family portrait field trial: Support for aging in place. In: Proc. of CHI 2005 Conference on Human Factors in Computing Systems, pp. 521–530. ACM, New York (2005)
Embodied Interaction or Context-Aware Computing? An Integrated Approach to Design

Johan Eliasson, Teresa Cerratto Pargman, and Robert Ramberg

Department of Computer and Systems Sciences, Stockholm University/Royal Institute of Technology, SE-164 40 Stockholm, Sweden
{je,tessy,robban}@dsv.su.se
Abstract. This paper revisits the notion of context from an interaction design perspective. Since the emergence of the research fields of Computer-supported cooperative work and Ubiquitous computing, the notion of context has been discussed from different theoretical approaches and in different research traditions. One of these approaches is Embodied Interaction. This theoretical approach has in particular contributed to (i) challenging the view that user context can be meaningfully represented by a computer system, and (ii) discussing the notion of context as interaction, through the idea that users are always embodied in their interaction with computer systems. We believe that the particular view of users' context that the Embodied Interaction approach suggests needs to be further elaborated in terms of design. As a contribution we suggest an integrated approach in which the interactional view of Embodied Interaction is interrelated with the representational view of Context-aware computing.

Keywords: Embodied Interaction, Context-aware computing, Design, Representation, Context.
1 Introduction

In his book "Where the Action Is: The Foundations of Embodied Interaction" [1], P. Dourish introduces the idea that tangible and social computing have a common denominator. They both exploit people's familiarity and facility with the everyday world, be it a world of physical artifacts or of social interaction. "This role of the everyday world here is more than simply the metaphorical approach used in traditional graphical interface design. […] Instead of drawing on artifacts in the everyday world, it draws on the way the everyday world works or, perhaps more accurately, the ways we experience the everyday world." (p. 17, orig. italics). Dourish explains that both tangible and social interactions "draw on the fact that the ways in which we experience the world are through directly interacting with it, and that we act in the world by exploring the opportunities for action that it provides to us – whether through its physical configuration, or through socially constructed meanings" (p. 17). The Embodied Interaction approach [1] can be regarded as a picture of a user actively engaged in human-computer interaction. The focus lies on how interaction plays out when the activity develops smoothly and problem-free; when our daily interaction is handled effortlessly, without reflection. The Embodied Interaction
approach thereby portrays human agents as engaged in an interaction characterized by skilled and continuous coping. It thus describes an understanding of human-computer interaction that can be exemplified by a user finding herself able to handle all difficulties without losing focus in her activity even once. How she gets there, and how she manages to remain in this focused activity, is out of scope. But how well does this picture of skilled and engaged human-computer interaction guide design? And more specifically, how well does it guide the design of context-aware systems? To understand these questions better we need to return to one of the problems targeted by Embodied Interaction, namely the gap between the social conception of context and the technical one [2, 3]. Embodied Interaction is but the latest contribution to improving the understanding of this gap. It uses the philosophical tradition of phenomenology as a theoretical departure point for understanding interaction. This tradition has previously been presented in the HCI research community, since it seems to offer a way to take both social and technical views into account. Based on the "present-at-hand" mode of use, Winograd and Flores [4] discuss user activity in terms of "breakdown". Weiser [5] introduces the concept of transparency, in calm computing and ubiquitous computing research, based on the "ready-to-hand" mode of use. The Embodied Interaction approach relies heavily on the idea of well-practiced and smooth interaction, with and through computers, and it deemphasizes the developmental aspects of user activity. Without addressing these developmental aspects, it is difficult for designers to operationalize the approach of Embodied Interaction in their work. Without looking at how one becomes a skilled user in the interaction with a system, one opportunity for design is passed over. Advocating history of use, Chalmers [6] has argued that the ideal of transparency, which can also be found in Embodied Interaction, is an unachievable goal. Räsänen and Nyce [7] have pointed out that the approach of Embodied Interaction is reductionist in that it does not go beyond interaction, and focuses too much on the here and now. We will go one step further and claim that Embodied Interaction does not take all modes of interaction present in the activity into account, and thereby misses out on how skill is acquired. In this respect we will claim that the Embodied Interaction approach overlooks the interplay between learning and practice, between reflection and action, that characterizes any kind of human-computer interaction. This observation is particularly interesting for the design of context-aware computing systems because this field has a strong connection to Embodied Interaction. This paper revisits the notion of context in the field of context-aware computing from an integrated design perspective on Embodied Interaction. We believe that the rich conceptualization of Embodied Interaction deserves to be further developed in terms of the design of context-aware computing systems. This leads to the following question: How do we design context-aware computing systems in the light of Embodied Interaction? In this paper we will not try to answer questions about proactivity in context-aware computing. Instead we follow Rogers [8] in that what we aim for is not proactive computing but proactive people.
2 The Notion of Context from an Embodied Interaction Perspective

Grounded in Merleau-Ponty's Phenomenology of perception [9], Schutz's Social phenomenology [10] and Heidegger's Hermeneutic phenomenology [11], Dourish [1] suggests a theoretical approach to human-computer interaction which he coins Embodied Interaction. The Embodied Interaction approach views context not as information but as a relation; as human actors participate in the world, action does not occur in a particular context, but context is rather created and recreated in concert with interaction [12]. Because of this, context is not stable but a dynamic, constantly changing feature. What is to be regarded as context is thereby determined by the setting, the actors and the interaction. According to the Embodied Interaction perspective, context is not some delineable aspect of a setting that can be encoded and represented [12]. Rather, context is something people do. In this way the context model in Embodied Interaction is an interactional model, not a representational model [12]. The view that context is what people do comes from the primacy of action in Embodied Interaction. An emphasis on action is shared with Situated Action [13], which is also one departure point for Embodied Interaction. Both approaches regard context and meaning as continually changing and only possible to recognize in how interaction unfolds. According to Embodied Interaction, the way we interact with a computer system is a sign of how we relate to the system. Meaning is also embodied, both in a physical and a wider sense. In this way our interaction is dependent on our physical, social and cultural body. The theoretical approach of Embodied Interaction argues against disembodied, objective and reflective use. What Embodied Interaction instead focuses on, inherited from embodiment [9] and being-in-the-world [11], is a moment of mindless interaction, a moment of skilled coping.

2.1 Challenges for Context Design from an Embodied Interaction Perspective

Dourish [1] suggests the following six principles as a backdrop for design (p. 162):

1. Computation is a medium
2. Meaning arises on multiple levels
3. Users, not designers, create and communicate meaning
4. Users, not designers, manage coupling
5. Embodied technologies participate in the world they represent
6. Embodied interaction turns action into meaning
When trying to design for context from the Embodied Interaction perspective, we are left with these broad design principles. That the principles are broad makes them difficult to operationalize, while the alternative of designing for context using objective representations is merely seen as positivist thinking, incompatible with the philosophy put forward by Embodied Interaction [12]. Take, for instance, the third and fourth design principles above: they directly address the role of designers, although they do so in a rather negative, excluding sense. Principles three and four state
what designers of these systems should not do. The role of the interaction designer thereby seems to be marginalized to a merely enabling one. It is probably not meant that the ideal we should strive for is the ultimate and final system, allowing for every kind of appropriation and every kind of interaction. Dourish [12] notes that one and the same system should support evolution: "[...] our concern is not simply to support particular forms of practice, but to support the evolution of practice—the 'conversation with materials' [Dourish quoting Schön [14]] out of which emerges new forms of action and meaning." (p. 25). This seems like a contradictory claim, as the evolution of practice is only known in retrospect and in analysis. So how can this be used for claims about design? In a passage about Place and Space, Dourish [1] writes: "…place can't be designed, only designed for." (p. 91). If Embodied Interaction is about meta-design, then what are the remaining implications for design, and especially design for context? Our interpretation of Embodied Interaction is that interaction designers should leave context and meaning as open to appropriation as possible. What designers ideally should strive for, then, is completely open systems. In these computer systems each user can interact with the most suitable content and structure. From this particular understanding of interacting with computers, the computer system has to be able to show every possible structure and the current state and configuration of the system [12]. From an Embodied Interaction perspective on human-computer interaction, we can design user interfaces, but not how they should work, as the creation of meaning should be left to users in their appropriation of the interfaces. Because we are not allowed to design how an interface should work, we also cannot explicitly support skill acquisition. In Embodied Interaction skill acquisition is not an issue because it does not belong in the picture of skilled and engaged coping, and thereby it falls outside the scope of Embodied Interaction. As a result, acquiring skill becomes something magical, something designers need not attend to. The Embodied Interaction approach has an interactional model of context. But if the notion of representation is absent from the description of interaction with a system, how can designers design for this interaction? The concept of representations is key to the design of computer systems, and especially context-aware computing.
3 The Notion of Context in the Field of Context-Aware Computing

In context-aware computing the notion of representations of context is seen as a prerequisite for designing context-aware systems. The assumption is that it is possible to divide the context of a device (or a user) into smaller parts and that some of them are more or less objective and stable. Thereby it is possible to meaningfully represent them in a computer system hosting the device. For example, Dey et al. [15] reason in terms of identifying and analyzing the constituent elements of context. In identifying and analyzing these constituent elements, ubiquitous computing research is bottom-up, starting with sensor data representing aspects of the physical environment [15]. One example is when sensor values such as GPS coordinates are used in navigational applications. Starting from sensor values,
context and meaning are then inferred up to the level of human interaction with the device. As described by Dey et al. [15]: "One hypothesis that a number of ubiquitous computing researchers share is that enabling devices and applications to automatically adapt to changes in their surrounding physical and electronic environments will lead to an enhancement of the user experience." (p. 100). One last step, then, is to use the model not only to adapt, but also to try to foresee what is going to take place next and let the application act proactively, guessing what users might soon need to have at hand. In this case the questions for system designers are how to adapt to context and how to act proactively in context. Obviously, it is a very hard problem to get all these abstractions, models and inferences right. It can certainly be questioned whether these systems will ever succeed outside very specific domains with very limited scope [8, 16].

3.1 Challenges for Design of Context-Aware Computing Systems

Context-aware computing has been blamed for making only small advances and for relying too much on systems engineering to solve problems originating in human interaction [3, 17]. It is also questionable whether we will see a major breakthrough in context-aware computing any time soon, as the problems of strong AI and proactive computing are still far from solved [8]. The problem for context-aware computing lies in the representational models that it builds. In a representational model there are inherent questions about what is represented and how it is represented. The next question is how different representations are related. Computational representations use specific values, structures and interrelations. There is no vagueness involved; every possible value, structure and interrelation has to be decided in advance by the designer. The effect of these decisions is that the behavior of each model of context is also, at a basic level, determined in advance. Because of this, the user model and the system model will diverge as soon as the context-aware system is put to use. The context-aware computing solution to this divergence is either to add an exception to the model every time it diverges or to trust in future AI advancements to solve all discrepancies. In the field of context-aware computing, physical and digital representations of context are building blocks of design. As opposed to human and social representations, designed representations are bounded in terms of structure and contents. In computer science, representations are the internal software components that together make up a computer program. These digital software components rely on physical hardware components, which in turn bound the representational power. A computer system is thus itself built on representations and therefore cannot be non-representational. But this still allows for non-representational use, with embodied physical or digital representations. This duality between non-representational use and designed, bounded representation is present in every interaction with something that is designed. The representations of context in context-aware computing are seen as objective because of their origin in sensor values. But this concept of objective context should not be interpreted as absolute. Even GPS coordinates, for instance, are only valid within their social frame, which in this case is a very wide frame.
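To make the representational view concrete, consider a minimal sketch of the bottom-up pipeline just described. The regions, categories and rule below are hypothetical, chosen only for illustration: sensor values are mapped onto a fixed set of context categories that the designer enumerated in advance.

```python
from dataclasses import dataclass
from enum import Enum

class Place(Enum):
    # The designer must decide every recognizable context in advance.
    HOME = "home"
    OFFICE = "office"
    UNKNOWN = "unknown"

@dataclass
class GpsReading:
    lat: float
    lon: float

# Hypothetical bounding boxes fixed at design time.
REGIONS = {
    Place.HOME: (59.30, 59.31, 18.00, 18.02),
    Place.OFFICE: (59.40, 59.41, 17.94, 17.95),
}

def infer_place(reading: GpsReading) -> Place:
    """Bottom-up inference: raw sensor values -> a fixed context category."""
    for place, (lat0, lat1, lon0, lon1) in REGIONS.items():
        if lat0 <= reading.lat <= lat1 and lon0 <= reading.lon <= lon1:
            return place
    return Place.UNKNOWN  # anything outside the enumerated model

print(infer_place(GpsReading(59.305, 18.01)))  # Place.HOME
```

The divergence discussed above shows up as soon as the user's actual situation falls outside the enumerated regions: the model can only answer UNKNOWN, or be patched with yet another exception.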
Chalmers [18], in accordance with Ricoeur and Gadamer, writes: “‘Objectivity’ comes from distanciation: representation is fixed, dissociated from intention and only displays universally
shared references. […] objectivity is not absolute. Instead, we see degrees and forms of distanciation." (p. 213). Also on objectivity, Dourish [12] writes: "In contrast to the objective and quantitative nature of positivist theories, phenomenological theories are subjective and qualitative in orientation. By 'subjective' I mean that they regard social facts as having no objective reality beyond the ability of individuals and groups […]" (p. 21). The interpretation of this is not that everything is subjective in the sense that everyone has their own interpretation, different from everyone else's. If that were the case, we would not be able to relate to what others do; we would simply not be able to engage in any interaction without questioning every step of it. Instead we socially create meaning, which we use in interaction. That is, 'objective' and 'subjective' may not be so far apart. As the extreme of objective representations is never the case, and as it is impossible to design for the completely subjective, we need to find a point where we can agree. If "groups" in the previous quote is taken to be the people we design for, then we are essentially in agreement, and can meet halfway between objective and subjective.
4 Towards an Integrated Approach: Reintroducing the Concept of Representations to Embodied Interaction

At some level computer technology is always designed. In fact we both design human-computer interaction and design for human-computer interaction. One extreme is the socio-cultural approach. Relying on ethnographic methods, we start out by describing specific users as a basis for design and then design for context. In this view human action is in focus. Action is performed within context at the same time as context is interpreted and recreated. With this focus, context is never stable and therefore cannot be deliberately designed. The remaining option for a designer is to support user context formation by relying solely on user appropriation. Human action is subjective and situated, rendering each interaction different from the previous one. In Table 1 this corresponds to "action" as the mode of use; because of this subjective nature, system designers can only support interaction and design representations for a context determined by users. The other extreme is the technology perspective, where we design representations of context and let users adapt to these representations. Context is modeled using objective and stable representations of sensor values. Users can then interact with this computer model, where use is objectifying and reflective. The mode of use, as seen in Table 1, is characterized by reflection on representations of context.

Table 1. Mode of use related to artefacts of design

Mode of use    Design artefacts
Action         Representations for context
Reflection     Representations of context

Combining results stemming from these two approaches is challenging, e.g. [12]: "Translating ideas between different intellectual domains can be both exceptionally valuable and unexpectedly difficult. One reason is that the ideas need to be understood within the intellectual frames that give them meaning, and we need to be sensitive to the problems of translation between these frames." (p. 20). Given fundamental ontological disagreements, it is questionable whether it can be done at all. Concerning context in computer-supported cooperative work and ubiquitous computing, despite the seemingly contradictory approaches, there have been many attempts to bridge, or at least narrow, the gap between these two intellectual frames [3, 19]. An alternative to bridging the gap would be to acknowledge that both sides, computer representations stemming from sensor data and analytical representations of context, are necessary. Instead of searching for one common ground for these views of context, we note that they are two sides of the same coin. When learning a new system, much time and effort goes into figuring out how the system works instead of engaging in the activity itself. Once the system has been learned, it can be handled without reflection, with skilled and embodied interaction. But still there are instants when "an event 'leaps to the eye' because it is expected or is a deviation from that which one would expect" [20] (p. 294). Heidegger also noted this (here in the words of Dreyfus [21]): "…mental content arises whenever the situation requires deliberate attention." (p. 70). These points direct us towards an answer in revisiting Heidegger's original view of hermeneutic phenomenology. His famous example of the hammer does not only serve to show how the hammer is transparent in ready-to-hand use, but also how "breakdown" (when the head falls off and the hammer becomes present-at-hand) leads to acquiring skill (in avoiding this malfunction in the future). As Dreyfus [21] says when clarifying Heidegger, "…the occurent is necessary for explaining the functioning of the available…" (p. 121). Here Dreyfus uses the terminology "occurent" instead of present-at-hand and "available" instead of ready-to-hand. Figure 1 shows how ready-to-hand action and present-at-hand reflection are interrelated. With this integrated view there is no necessity to choose between action and reflection, no necessity to choose between designing representations for context and designing representations of context. Instead the mode of use repeatedly shifts between action and reflection. (Since both modes of use can be found in the hermeneutic phenomenology of Heidegger, there might be no ontological disagreement in the end.) Take GPS positioning, for example. Most of the time the coordinates are correct and a user can interact with the navigational program without paying too much attention. The mode of use here is "action". But there are certainly occasions when the mode of use shifts to reflection; for example, when a breakdown in interpretation occurs because of a mismatch between the map position and the position in the real world. Another breakdown could occur when a user moves indoors and gets a message about lack of coverage. In both these examples, interaction is interrupted and the user may need to reflect upon what the problem is, to be able to find a solution (e.g., update GPS data or move outdoors) before interaction can be re-engaged. Objective representations for context are not only to be seen as harmful, constraining user context; they also form a structure to relate to in a hermeneutic interpretation. Instead of trying to give guidelines for one ultimate design, we need to acknowledge that a design, and thereby also the designer, is part of this hermeneutic development, and that continuous redesigns, done by both designer and user, are necessary for the system to stay relevant to a user.
Fig. 1. The two modes of use as interrelated
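The GPS example above can also be expressed as a small sketch (a toy illustration with an assumed error threshold, not a proposal for a real navigation system): interaction stays in the "action" mode while readings are plausible, and shifts to "reflection" when a breakdown is detected.

```python
from enum import Enum

class Mode(Enum):
    ACTION = "ready-to-hand"        # smooth, unreflective use
    REFLECTION = "present-at-hand"  # breakdown: the tool itself becomes the focus

def next_mode(has_fix: bool, map_error_m: float, threshold_m: float = 50.0) -> Mode:
    """Shift to reflection on a breakdown; otherwise stay in embodied action."""
    if not has_fix:
        return Mode.REFLECTION  # e.g., the user moved indoors and lost coverage
    if map_error_m > threshold_m:
        return Mode.REFLECTION  # the map position contradicts the real world
    return Mode.ACTION

print(next_mode(has_fix=True, map_error_m=12.0))  # Mode.ACTION
print(next_mode(has_fix=False, map_error_m=0.0))  # Mode.REFLECTION
```

Once the user resolves the breakdown (moves outdoors, corrects the GPS data), the mode returns to action, mirroring the back-and-forth of Fig. 1.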
5 Discussion

The Embodied Interaction perspective has both turned away from, and argued against, objective representations for context. Although the Embodied Interaction view of context contributes to a better understanding of human interaction with and through computers, at the same time it marginalizes objective representations for context without offering an alternative basis for design. Maybe it even marginalizes design as a whole. It is time to turn the perspective back again, to enable both design of context and design for context. The alternative to designing systems completely open to appropriation is to use current descriptions of context as a basis for design. If we cannot use current descriptions, but instead need to leave more open for appropriation, then the role of the designer is marginalized accordingly. Computer systems always have room for interpretation and appropriation, but through careful design, appropriation and skill acquisition can be guided. Leaving more open to appropriation means constraining the choices that the designer has. A similar trend in design occurred when the concept of affordance became the one guideline overshadowing all others in HCI. Given the Hermeneutic phenomenology perspective, it poses no problem to reintroduce objective representations of context into the philosophy put forward by the Embodied Interaction approach. Action and reflection are just different modes of use, where present-at-hand reflection is an important complement to embodied ready-to-hand action; it is not one or the other. Users act in context by (hermeneutically) going back and forth from ready-to-hand embodied interaction to present-at-hand reflection and back again. Our integrated approach undoubtedly has much in common with Winograd and Flores [4] in focusing on "breakdown" as important, but there are differences. They came
to the conclusion of modeling computer use through a state machine representation of speech act theory, with labeled states and directed arcs. Our approach is to use present-at-hand categories, but not to build a general model that enforces some elaborate structure. Instead we only point to the interrelation between present-at-hand and ready-to-hand. This approach can be used either to build general systems with small descriptive power or specific systems with large descriptive power. But our main contribution is that the present-at-hand categories give us a way of talking about design, while still relating to ready-to-hand Embodied Interaction. It is interesting to note what Dourish [1] writes about the states of ready-to-hand and present-at-hand. Dourish explicitly refers to these "states" when discussing coupling, using a computer system as an example: "If there were simply these two states […] However the truth is more complex. As we have seen, the tools through which we operate when interacting with a computer system are not simply physical objects, but software abstractions, too. There are very many of these abstract entities in operation at any given moment, and programs link them together in a variety of ways." (p. 139). This surely gives the impression of great complexity. Dourish ends this passage as follows: "The consequence, then, is that there are very many different levels of description that could be used to describe my activity at any given moment. Some, perhaps, are ready-to-hand and some present-at-hand at the same time […]" (p. 140). But that some entities are ready-to-hand while others are present-at-hand is nothing new. On a conceptual level, even when Heidegger's hammer was ready-to-hand, some other part of the activity was present-at-hand. Computer systems do not change this. If we deliberately design these systems using present-at-hand categories, we might even bring Embodied Interaction one step forward.
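For readers unfamiliar with the Winograd and Flores approach mentioned above, their "labeled states and directed arcs" can be sketched as a tiny transition table. This is a rough, simplified rendering of their conversation-for-action idea; the state numbers and speech-act labels are illustrative, not their exact network.

```python
# Simplified conversation-for-action network: numbered states,
# arcs labeled with speech acts.
TRANSITIONS = {
    (1, "request"): 2,
    (2, "promise"): 3,
    (2, "decline"): 8,            # conversation ends in withdrawal
    (3, "assert_done"): 4,
    (4, "declare_satisfied"): 5,  # conversation ends in completion
}

def run(moves, state=1):
    for move in moves:
        state = TRANSITIONS[(state, move)]  # raises KeyError on an illegal move
    return state

print(run(["request", "promise", "assert_done", "declare_satisfied"]))  # 5
```

The contrast should be clear: such a model fixes an elaborate general structure in advance, whereas the integrated approach uses present-at-hand categories without committing to any one network.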
References

1. Dourish, P.: Where the Action Is: The Foundations of Embodied Interaction. MIT Press, Cambridge (2001)
2. Dourish, P.: Seeking a foundation for context-aware computing. Hum. Comput. Interact. 16, 229–241 (2001)
3. Barkhuus, L.: The Context Gap: An Essential Challenge to Context-Aware Computing. Ph.D. dissertation, IT University of Copenhagen, Copenhagen (2005)
4. Winograd, T., Flores, F.: Understanding Computers and Cognition: A New Foundation for Design. Ablex, Norwood (1986)
5. Weiser, M.: The Computer for the 21st Century. Scientific American 265, 95, 98–102, 104 (1991)
6. Chalmers, M.: A historical view of context. Computer Supported Cooperative Work (CSCW) 13, 223–247 (2004)
7. Räsänen, M., Nyce, J.M.: A new role for anthropology? Rewriting "context" and "analysis" in HCI research. In: ACM International Conference Proceeding Series, vol. 189, pp. 175–184 (2006)
8. Rogers, Y.: Moving on from Weiser's Vision of Calm Computing: Engaging UbiComp Experiences. In: Dourish, P., Friday, A. (eds.) UbiComp 2006. LNCS, vol. 4206, pp. 404–421. Springer, Heidelberg (2006)
9. Merleau-Ponty, M.: Phenomenology of Perception. Routledge, London (2002/1962)
10. Schutz, A., Luckmann, T.: The Structures of the Life-World. Northwestern U.P., Evanston (1973)
11. Heidegger, M.: Being and Time. Harper, New York (1962)
12. Dourish, P.: What we talk about when we talk about context. Personal Ubiquitous Comput. 8, 19–30 (2004)
13. Suchman, L.A.: Plans and Situated Actions: The Problem of Human-Machine Communication. Cambridge Univ. Press, Cambridge (1987)
14. Schön, D.: The Reflective Practitioner: How Professionals Think in Action. Basic Books (1983)
15. Dey, A.K., Abowd, G.D., Salber, D.: A conceptual framework and a toolkit for supporting the rapid prototyping of context-aware applications. Human-Computer Interaction 16, 97–166 (2001)
16. Dreyfus, H.L.: What Computers Still Can't Do: A Critique of Artificial Reason. MIT Press, Cambridge (1992)
17. Håkansson, M.: Playing with Context: Explicit and Implicit Interaction in Mobile Media Applications. Ph.D. dissertation, Department of Computer and Systems Sciences (together with KTH), Stockholm University, Kista (2009)
18. Chalmers, M.: Hermeneutics, information and representation. European Journal of Information Systems 13, 210–220 (2004)
19. Chen, Y., Atwood, M.E.: Context-centered design: Bridging the gap between understanding and designing. In: Jacko, J.A. (ed.) HCI 2007. LNCS, vol. 4550, pp. 40–48. Springer, Heidelberg (2007)
20. Schmidt, K.: The problem with 'awareness': Introductory remarks on 'awareness in CSCW'. Computer Supported Cooperative Work (CSCW) 11, 285–298 (2002)
21. Dreyfus, H.L.: Being-in-the-World: A Commentary on Heidegger's Being and Time, Division I. MIT Press, Cambridge (1991)
Supporting Multidisciplinary Teams and Early Design Stages Using Storyboards

Mieke Haesen, Jan Meskens, Kris Luyten, and Karin Coninx

Hasselt University – tUL – IBBT, Expertise Centre for Digital Media, Wetenschapspark 2, B-3590 Diepenbeek, Belgium
{mieke.haesen,jan.meskens,kris.luyten,karin.coninx}@uhasselt.be
Abstract. Current tools for multidisciplinary teams in user-centered software engineering (UCSE) provide little support for the different approaches of the various disciplines in the project team. Although multidisciplinary teams are getting more and more involved in UCSE projects, an efficient approach to communicate clearly and to pass results of a user needs analysis to other team members without loss of information is still missing. Based on previous experiences, we propose storyboards as a key component in such tools. Storyboards contain sketched information of users, activities, devices and the context of a future application. The comprehensible and intuitive notation and accompanying tool support presented in this paper will enhance communication and efficiency within the multidisciplinary team during UCSE projects.
1 Introduction

When combining HCI techniques and software engineering principles in user-centered software engineering (UCSE), the biggest challenge is communication within a multidisciplinary team that includes the end users. MuiCSer, a framework for Multidisciplinary user-centered Software Engineering processes, focuses on the benefits of both disciplines and was introduced to investigate the features and shortcomings of current UCSE models and tools [1]. One missing link in most user-centered processes is a tool to progress from informal design artefacts (e.g., scenarios) toward more structured design artefacts (e.g., task models). Most tools and techniques require specific knowledge about specialized notations or models, thus excluding most team members from being involved. Furthermore, functional information may be missing in informal design artefacts, while structured design artefacts may not always contain all non-functional information. To overcome these shortcomings, we propose the use of storyboards as a comprehensible artefact connected to the features of graphical user interface design tools. In summary, the main contributions of this paper are:

• a novel user-centered design approach that uses storyboards as a common language in a multidisciplinary team;
• tool support for creating and editing storyboards in order to bridge the gap between the early stages of the UCSE process and the user interface design. This tool supports the connection between storyboards and artefacts created later in the process.
2 Related Work

User-centered processes recommend combining non-functional as well as functional requirements by involving a multidisciplinary team [2]. The early design stages of user-centered design (UCD) include a user needs analysis and generally result in several artefacts, such as usability requirements [3], scenarios [4] and personas [5], describing the user needs. These artefacts are written in a narrative style and are usually created by interaction designers. Similar artefacts are used in software engineering and agile development [6] (e.g., essential use cases, scenarios, story cards, user stories). Although several disciplines provide notations to describe user needs, the notations are not always comprehensible for all members of a multidisciplinary team. Lindgaard et al. [7] address the difficulties in presenting user needs for requirements engineering. Earlier studies describe the needs of interaction designers in a multidisciplinary team. Brown et al. [8] conducted an ethnographic study to investigate the collaboration between user interaction designers and developers. The study describes the benefits of stories and sketches in the early stages of user-centered approaches and emphasizes the power of combining both. Assembling stories and sketches is a powerful technique to reveal errors and to consider temporal and contextual information. A recent study by Myers et al. [9] reports that designers experience difficulties when designing the behavior of user interfaces. While prototyping the appearance of user interfaces is straightforward, designing and communicating the behavior is an ongoing process. Furthermore, the survey revealed that designers frequently use sketches and storyboards. Currently little tool support is available for storyboarding in multidisciplinary teams. Demais [10] and IBM Rational Requirements Composer (http://www.ibm.com/developerworks/rational/library/08/1118_zhuo, last visited 6 January 2009) focus on storyboards in the design process of multimedia applications, while Denim [11] and Highlight [12] feature storyboards for web applications. All these tools were developed to describe the behavior of software or web applications and support a first walkthrough of the future system or website. The storyboards created using these tools contain mock-ups of UI designs and their relationships, and thus are designed after the requirements gathering of a future application. The ActivityDesigner tool [13] allows storyboarding at the early stages of design. In this tool, designers can extract activities from concrete scenarios, making it possible to include rich contextual information about everyday lives as scenes. Based on the scenes, higher-level structures and prototypes can be created. The tool we present in this paper also provides the possibility to build storyboards during the gathering of requirements, in order to facilitate the creation of artefacts at later stages.
3 Storyboards

In a UCSE process, a report of a user needs analysis, scenarios and personas are presented to the entire team after the first user studies have been conducted. Structuring artefacts that are written in a narrative style is a complex though important process. All artefacts created at later stages need to be consistent with these first results. Unfortunately, little tool support is available for this first transition in UCSE processes. This implies that the entire team needs to verify consistency between the informal results of a user needs analysis and artefacts created later in the process. A good understanding within the multidisciplinary team at this point can be crucial for the resulting user experience. We investigate how storyboards can be used in UCSE processes by a multidisciplinary team.

3.1 Users, Activities, Devices, Context

The professional use of storyboards originates from the film industry and is being introduced into several disciplines such as advertisement and product design [14]. In UCSE a storyboard can have several meanings. Storyboards can depict manual steps, users interacting with a product, screen mockups of a new work practice or the link with the system behind the scenes [6]. The focus on visual information renders storyboards highly comprehensible for any member of the team, independent of their background or role in the team [14, 15]. In the context of our research, we define storyboards as sketches of real-life situations, depicting users carrying out several activities by using devices in a certain context. An example of a simple storyboard is presented in the center of Fig. 1. Since storyboards contain a lot of information about the future use of an application, they can be used to provide a link between a user needs analysis and requirements gathering, containing functional as well as non-functional requirements. Furthermore, the natural style of presenting the use of a future system makes this artefact very comprehensible for all team members, including end users. Since the scenes of a storyboard contain contextual information, they are suitable for the specification of context-aware applications. This contextual information has to be taken into account during the entire development process; thus storyboards can contribute to the evaluation, verification and validation of several stages.

3.2 Bridging the Early Stages of UCSE Processes

The creation of storyboards happens at the early stages of a UCSE process, after the creation of scenarios and personas. An example storyboard and the interrelationship between a storyboard and other artefacts are presented in Fig. 1. A storyboard is built by splitting the scenario up into scenes and presenting the scenes as sketches depicting users interacting with the future system. Connecting the scenes of a storyboard structures the narrative information of the scenario. The understandability of storyboards increases the number of team members that can collaborate during this phase. Even end users can be involved to create or evaluate storyboards.
Fig. 1. A storyboard and its interrelationship with other artefacts in the development process. Situations and devices in the scenes are extracted from scenarios, while the user information is extracted from the scenarios as well as the personas. The storyboard is used as input for the creation of task flow diagrams and the UI designs.
Once all scenes are added to the storyboard, personas and devices can be highlighted in each individual scene. This enriches the information contained in the storyboard and can be used to make the transition to other artefacts. Task flow diagrams, presenting user actions and processes to complete a task, can be produced based on the information in, and the connections between, the scenes of the storyboard. At a later stage of the development process, the storyboard can guide the UI design and development. By carefully considering the situation of each scene, designers and developers build an application corresponding to the context, requirements and constraints contained in the storyboard. Interaction designers can use a storyboard to verify that the UI designs take all requirements into account. A storyboard also contributes to the preparation of usability tests. Using storyboards in UCSE processes increases the visibility of the project. New team members, for instance, can explore the requirements of the project at a glance by looking at the storyboard.
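One way to read this interrelationship is as a simple data model: a storyboard is a directed graph of scenes, each carrying a sketch plus references to personas and devices, from which a task-flow ordering can be derived by traversal. The sketch below is our illustrative reading of this description, not the authors' implementation; all field names and the example scenes are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Scene:
    title: str
    sketch_file: str   # scanned sketch or photo
    description: str   # the scenario sequence this scene depicts
    personas: List[str] = field(default_factory=list)
    devices: List[str] = field(default_factory=list)
    next_scenes: List["Scene"] = field(default_factory=list)

def task_flow(scene: Scene, seen=None):
    """Derive a task-flow ordering from the connections between scenes."""
    seen = seen if seen is not None else set()
    if id(scene) in seen:
        return []
    seen.add(id(scene))
    flow = [scene.title]
    for s in scene.next_scenes:
        flow += task_flow(s, seen)
    return flow

pay = Scene("Pay at kiosk", "scene2.png", "Anna pays for her ticket.",
            personas=["Anna"], devices=["kiosk"])
arrive = Scene("Arrive at station", "scene1.png", "Anna arrives at the station.",
               personas=["Anna"], devices=["smartphone"], next_scenes=[pay])
print(task_flow(arrive))  # ['Arrive at station', 'Pay at kiosk']
```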
4 Tool Support for Storyboards

As stated above, storyboards contribute to the development process of software applications in multidisciplinary teams. When suitable tool support is available for all team
members, storyboards become more powerful and the visibility and traceability of a project increase. A literature survey [1] showed there is a need for tools that support UCSE processes in the early stages of design. Since storyboards are created during the requirements gathering, storyboarding tools can partly cover transitions between the early stages of UCSE. Furthermore, storyboards are a very suitable artefact to specify the use of context-aware applications; thus we decided to integrate tool support for storyboards into the Gummy GUI builder [16]. Gummy supports the graphical design of multi-device and context-aware user interfaces. To enable this, Gummy automatically adapts its workspace to the considered target platform and thus allows designers to create user interfaces for a wide range of devices without having to change their work practices. The inclusion of storyboards during this design stage better describes the context of a user interface and provides a more convenient way to describe the intended context-of-use. In this way, the storyboards provide guidelines for the design of the UI. In the storyboarding extension of Gummy, a team member, e.g., an interaction designer, starts the creation of the storyboard by loading a scenario into the workspace. Next, a sequence in the scenario can be selected and a new scene created from it. The sequence of the scenario is automatically added to the scene as a description, while the interaction designer can load an image and add a title. The image of the scene can be a photo from the user observations or a scanned sketch, which encourages designers to sketch in a creative and informal way [11]. A screenshot of the storyboarding tool is shown in Fig. 2.
Fig. 2. A screenshot of the storyboarding extension in Gummy. Scenarios can be loaded on the left panel. For a selected sequence in the scenario a new scene can be created. A scene can contain sketches of users interacting with the future system, a title and a description.
For each scene in a storyboard, team members can add annotations and point out the personas and devices. When the specifications of a device (e.g. screen size) are included in the scenes, this information can be considered when the workspace for the UI design is loaded. The contextual information of the scenes (e.g. sketches presenting the environment or courses of communication) can be used as guidelines for the UI design without obstructing the creativity of UI designers. By extending existing tool support, the visibility and traceability of UCSE processes can be enhanced. The storyboard extension makes it possible to include the results of the first UCSE stages (user needs analysis) and helps to process and structure these narrative artefacts. Furthermore, a visualization of a scenario by scenes makes it possible to see the usability requirements at a glance, which improves the communication and efficiency in the project team.
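The mechanism of carrying device specifications from a scene into the design workspace can be illustrated briefly. This is a hypothetical sketch of the idea, not Gummy's actual API: the screen size attached to a scene's device configures the canvas the designer then works on.

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    screen_w: int  # pixels
    screen_h: int

@dataclass
class Workspace:
    canvas_w: int
    canvas_h: int

def workspace_for(device: Device) -> Workspace:
    """Adapt the UI-design workspace to the device pointed out in a scene."""
    return Workspace(canvas_w=device.screen_w, canvas_h=device.screen_h)

pda = Device("PDA", screen_w=320, screen_h=240)
print(workspace_for(pda))  # Workspace(canvas_w=320, canvas_h=240)
```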
5 Ongoing and Future Work

Storyboards are implicitly used in different ways by multidisciplinary teams. This partly explains the many interpretations of storyboards and reveals the challenges in developing a storyboarding tool for multidisciplinary teams. In ongoing work we are carrying out a survey considering the roles in a multidisciplinary team and the tools used by team members. Observations and interviews are being organized to investigate the current practices of multidisciplinary teams in industry. Furthermore, storyboards as defined in this paper have been introduced in a multidisciplinary project team, and the storyboarding tool will be evaluated during several iterations. Based on the findings of these studies, we will fine-tune the features of storyboards and the relationships between storyboards and other artefacts. As the current version of the storyboarding tool is intended for individual use, the user studies may also reveal expectations of teams regarding a distributed and collaborative version of the tool. Furthermore, contextual information and platform specifications can be extracted from the scenes in a storyboard to guide the design of UIs in the Gummy GUI builder.
6 Conclusion

In this paper we described how storyboards can contribute to UCSE. By sketching users interacting with a future application, pointing out devices and adding annotations in the early stages of a UCSE project, storyboards come to contain functional and non-functional requirements. Storyboards can contain rich contextual information and are based on an intuitive notation providing more structure than narrative scenarios of use. We integrated tool support for the creation and use of storyboards in the Gummy multi-device GUI builder. Ongoing and future studies are being carried out to examine the approaches of multidisciplinary teams in industry and to adapt the storyboarding tool according to current practices.
This new level of tool support can simplify the creation of artefacts at later stages of a development process and improve communication within a multidisciplinary team. The comprehensibility of storyboards allows non-technical team members to be involved in the first activities of model-based UI development. Consequently, the loss of information after a user needs analysis will decrease, while the visibility and traceability of a project increase. Storyboards are a common language in multidisciplinary teams, which contributes to the user experience of the final user interface.

Acknowledgements. Part of the research at EDM is funded by EFRO (European Fund for Regional Development) and the Flemish Government. The MuiCSer process framework and the Gummy tool, including the storyboarding extension, are based on our experiences in the IWT project AMASS++ (IWT 060051).
References

1. Haesen, M., Coninx, K., Van den Bergh, J., Luyten, K.: MuiCSer: A Process Framework for Multi-Disciplinary User-Centered Software Engineering Processes. In: Proceedings of Human-Centred Software Engineering, September 2008, pp. 150–165 (2008)
2. International Standards Organization: ISO 13407. Human-Centred Design Process for Interactive Systems. Geneva, Switzerland (1999)
3. Redmond-Pyle, D., Moore, A.: Graphical User Interface Design and Evaluation. Prentice Hall, London (1995)
4. Carroll, J.M.: Making Use: Scenario-Based Design of Human-Computer Interactions. MIT Press, Cambridge (2000)
5. Pruitt, J., Adlin, T.: The Persona Lifecycle: Keeping People in Mind Throughout Product Design. Morgan Kaufmann, San Francisco (2006)
6. Holtzblatt, K., Wendell, J.B., Wood, S.: Rapid Contextual Design: A How-to Guide to Key Techniques for User-Centered Design. Morgan Kaufmann, San Francisco (2004)
7. Lindgaard, G., Dillon, R., Trbovich, P., White, R., Fernandes, G., Lundahl, S., Pinnamaneni, A.: User needs analysis and requirements engineering: Theory and practice. Interact. Comput. 18(1), 47–70 (2006)
8. Brown, J., Lindgaard, G., Biddle, R.: Stories, Sketches, and Lists: Developers and Interaction Designers Interacting Through Artefacts. In: Proceedings of Agile 2008, pp. 39–50 (2008)
9. Myers, B.A., Park, S.Y., Nakano, Y., Mueller, G., Ko, A.: How designers design and program interactive behaviors. In: VL/HCC 2008, pp. 177–184 (2008)
10. Bailey, B.P., Konstan, J.A., Carlis, J.V.: Demais: Designing multimedia applications with interactive storyboards. In: ACM Multimedia, pp. 241–250 (2001)
11. Newman, M.W., James, A.L.: Sitemaps, storyboards, and specifications: A sketch of web site design practice. In: DIS 2000 Designing Interactive Systems, pp. 263–274. ACM Press, New York (2000)
12. Nichols, J., Lau, T.: Mobilization by demonstration: Using traces to re-author existing web sites. In: IUI 2008: Proceedings of the 13th International Conference on Intelligent User Interfaces, pp. 149–158. ACM Press, New York (2008)
13. Li, Y., Landay, J.A.: Activity-based prototyping of ubicomp applications for long-lived, everyday human activities. In: Proceedings of the Conference on Human Factors in Computing Systems, CHI 2008, pp. 1303–1312 (2008)
14. van der Lelie, C.: The value of storyboards in the product design process. Personal Ubiquitous Computing 10(2-3), 159–162 (2006)
15. Sova, R., Sova, D.H.: Storyboards: A dynamic storytelling tool. Technical report, Sova Consulting Group, Tec-Ed Inc. (2006)
16. Meskens, J., Vermeulen, J., Luyten, K., Coninx, K.: Gummy for multi-platform user interface designs: Shape me, multiply me, fix me, use me. In: Proceedings of the Working Conference on Advanced Visual Interfaces, AVI 2008, pp. 233–240. ACM Press, New York (2008)
Agent-Based Architecture for Interactive System Design: Current Approaches, Perspectives and Evaluation

Christophe Kolski1, Peter Forbrig2, Bertrand David3, Patrick Girard4, Chi Dung Tran1, and Houcine Ezzedine1

1 LAMIH – UMR8530, University of Valenciennes and Hainaut-Cambrésis, Le Mont-Houy, F-59313 Valenciennes Cedex 9, France
[email protected]
2 University of Rostock, Computer Science Department, Albert-Einstein-Str. 21, D-18051 Rostock, Germany
[email protected]
3 LIESP, Ecole Centrale de Lyon, 36 avenue Guy de Collongue, F-69134 Ecully Cedex, France
[email protected]
4 ENSMA / LISI, Teleport 2, 1 Avenue Clément Ader, B.P. 40109, F-86961 Futuroscope Chasseneuil Cedex, France
[email protected]

Abstract. This paper proposes a survey concerning agent-based architectures of interactive systems. This survey is focused on certain models and perspectives. General agent-based architectures are first presented. Then agent-based approaches dedicated to CSCW systems are reviewed. The appearance of web services requires new agent-based approaches; basic ideas are introduced. Agent-based interactive systems necessitate new tools for their evaluation; an example of a representative evaluation tool is presented.

Keywords: Human-computer interaction, Architecture model, agent-based systems, CSCW, design, evaluation.
1 Introduction

Since 1983 and the Seeheim workshop, architecture has been an important research topic in the Human-Computer Interaction domain. It started with defining recommendations for developers, and today it allows the definition of tools that help in designing, developing and validating interactive systems. Different types of interactive system architectures have been proposed in the literature. This paper proposes a survey of agent-based architectures. A first global overview of the models available in the literature is shown in Fig. 1. The paper is composed of four main parts. In the first, the basic principles of architecture models are introduced and general agent-based approaches are listed. The second part concerns agent-based approaches dedicated to Computer Supported Cooperative Work (CSCW). The third part links the agent-based architecture and web services domains. Finally, the fourth part concerns the evaluation of interactive systems based on an agent-based architecture; the first version of a dedicated evaluation tool is briefly presented.
Fig. 1. Global overview of available architecture models
2 From the Seeheim Model to Agent-Based Architectures Two main families of architecture models were first elaborated: global models and agent-based models. Global models define a precise structure based on a fixed number of components whose role and nature are precisely defined. The well-known Seeheim model is the first of them [1]. It recommends developing the user interface as a separate module connected to the functional core on which it leans. The interface itself is organized in three parts: the Presentation (devoted to the management of inputs and outputs), the Controller (a component that manages the sequencing of interaction elements) and the Application Interface (which translates between the interactive "world" and the functional core). The main interest of the Seeheim model is that its definitions establish sound foundations for all subsequent work on architecture and tools in HCI. For example, the Arch model [2] proposes some modifications of the Seeheim model (including the functional core in the model, defining an additional component, introducing the notion of a "slinky" model), but keeps the main definitions, notably for dialogue control. Nevertheless, global models exhibit drawbacks, mainly when an object-oriented approach is applied. While a current object-oriented interactive application may involve hundreds of interaction cases, the global structure gives no help in defining elementary interaction classes. MVC (Model-View-Controller) [3], and then agent-based architecture models such as PAC (Presentation-Abstraction-Control) [4], AMF (Multi-Agent Multi-Facets) [5] and AoMVC (Agent-oriented MVC) [6], were designed to solve this problem. They define elementary software bricks composed of several parts (in fixed number or not) and define the relations that must exist between bricks and parts. Some of them have been formulated as design patterns. In this way, global functions such as dialogue control or presentation are split across the elementary agents, which helps support iterative design. Tools that help define applications with these models have been designed; see for instance [7]. However, like global models, agent-based architecture models suffer from problems. Choosing the right level of decomposition is hard for inexperienced developers. Moreover, strictly enforcing the rules of the model (for example, a PAC object only knows its father and its sons) may be difficult when implementation considerations come into play. Hybrid models, intended to combine the best of the two approaches, then emerged. Mainly, these models rely on a global definition of the architecture based on
the Arch model, and use an object-oriented approach to refine some of the main components, such as the Presentation or the Controller. For example, PAC-Amodeus [8] facilitates the design of multimodal applications. Another example is H4, a model first defined for the Computer-Aided Design area; tools were created for various applications, to help design applications [9, 10], to help validate them [11], or both [12]. Other related research proposes architecture models for distributed and plastic UIs [13, 14].
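To make the agent decomposition concrete, the following is a minimal sketch, in Python, of the PAC idea described above: each agent bundles a Presentation, an Abstraction and a Control facet, and Controls communicate only with their father and sons. All class and method names are illustrative assumptions, not taken from any cited PAC toolkit.

class Presentation:
    """Handles inputs and outputs for one agent."""
    def render(self, state):
        print(f"view: {state}")

class Abstraction:
    """Holds the agent's share of the functional core."""
    def __init__(self):
        self.state = {}
    def update(self, key, value):
        self.state[key] = value

class Control:
    """Mediates between its own facets and, per the PAC rule,
    talks only to its father and its sons."""
    def __init__(self, father=None):
        self.father = father
        self.sons = []
        self.presentation = Presentation()
        self.abstraction = Abstraction()
        if father:
            father.sons.append(self)

    def handle(self, event, value):
        # Dialogue control is split across agents rather than centralized.
        self.abstraction.update(event, value)
        self.presentation.render(self.abstraction.state)
        if self.father:                      # propagate upward only
            self.father.handle(event, value)

root = Control()
field = Control(father=root)
field.handle("text", "hello")   # updates the field agent, then its father

Note how the dialogue logic lives in each agent's Control rather than in one central Controller, which is the point of contrast with the global Seeheim organization.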
3 Agent-Based Architectures: Approaches Dedicated to CSCW Systems CSCW systems are not only interactive systems but also, and mainly, multi-user distributed systems. For these reasons their architecture must meet new requirements. Three important characteristics are: (1) the taxonomy of collaborations, which can be related either to the crossing, in a matrix, of location (local or distant) and time (synchronous or asynchronous), as suggested by [15], or to the nature of cooperation (asynchronous cooperation, in-session cooperation, in-meeting cooperation and close cooperation [16]); (2) awareness, the information about activities done by other actors that is needed in synchronous cooperation; it can be actor oriented (their effective participation) or production oriented, expressed by the WYSIWIS (What You See Is What I See) acronym, with a strict or relaxed view of the working data; (3) the nature of cooperation activities, initially examined by [17] in terms of the support of three main kinds of activities: production, conversation/communication and coordination between participants. From an architectural point of view, CSCW systems are clearly inspired by interactive system architectures: layered, agent-based and hybrid architectures are also used for CSCW systems. We can mention the Zipper [18] and Dewan [19] models for layered collaborative systems, based mainly on adapting the Arch model to multi-user distributed situations. ALV and AMF-C [20] are representative of agent-based systems; they generalize the PAC agent model to collaborative distributed situations. CoPAC, PAC* and Clover (all described in [21]) are typical examples of hybrid systems; they reuse the Arch model and adapt it to multi-user and distributed situations. All these architecture models take synchronous collaboration into account, allowing real-time interaction between cooperating actors. Distant and local interactions are treated in the same way, as only mediated interactions are taken into account, i.e. direct local non-mediated interaction is not supported. Asynchronous collaboration is not addressed, mainly because in this case multi-user exchange, awareness and cooperative operations are not carried out through interaction. Awareness of shared artifacts (data) and of participating actors is more or less supported, as are strict and relaxed WYSIWIS. Concerning cooperation activities (production, conversation and coordination), these are either fundamental elements (for PAC* and Clover) or naturally integrated (AMF-C). Hybrid architectures are either agent-based only in the Control part of the model (CoPAC and PAC*), or agent orientation can also be used in other parts of the model. The recent evolution of cooperative systems is related to the mobility of actors, evolving in augmented real environments with pervasive behavior of the environment and related context-aware computing. The concept of nomadism (networking, handheld
devices, mobile communicating objects, localization, and permanent or intermittent connectivity) extends CSCW and allows us to introduce the concept of "capillary" CSCW [16]. We use this term by analogy with the network of blood vessels. As its name implies, the purpose of capillary CSCW is "to extend the capacities provided by co-operative working tools in increasingly finer ramifications, from their use on fixed proprietary workstations to small handheld devices". Its main characteristics are: management of the collaboration and coordination of mobile actors; coherence and validity of the information exchanged between handheld devices, which are connected only intermittently to the network and the "group", with the aim of keeping information as synchronized as possible; heterogeneity of the communication protocols of the handheld devices; and constraints on the interface and overall capacity of the handheld devices in terms of screen size, transmission speed, memory, autonomy, and interaction devices. In the recent evolution of the AMF-C model, its transformation from a fully agent-based system to a hybrid system and the integration of the IRVO view of a new interaction paradigm (interaction with real and virtual objects) allow it to fully address the problems of capillary cooperative systems. In this new mobility context, adaptation to different interaction devices, environmental situations, software and hardware platforms and user preferences becomes the core problem. Adaptation techniques can be classified into four categories, ranging from the easiest to implement to the most powerful: translation techniques; markup language-based approaches; reverse and re-engineering techniques; and model-based approaches. Designing and implementing interactive collaborative applications that are adaptable (manually) or adaptive (automatically) to the context of use requires consideration of the characteristics of the user and of the interactive platform, as well as the constraints and capabilities of each environment. A state-of-the-art survey shows that, among the large majority of existing approaches to adaptation, the model-based approach seems to be the most powerful. Such an approach uses high-level, abstract representations that can be instantiated later in the development lifecycle to meet specific usability requirements. However, these approaches need to combine apparently independent models such as concepts (e.g. UML), tasks (e.g. CTT), platforms (e.g. CC/PP) or user profiles. The relationships between these models need to be defined at the design step and refined at run time in order to achieve overall usability. Our belief is that what we refer to as an interaction model is the right place to glue together all the models and usability attributes. This model must support both the design stage, linking the other models, and run time. In addition, because Software Engineering and HCI have shown the importance of clearly separating the functional core from the presentation components, our interaction model is supported by a well-structured architecture. In this new version of the AMF-C architectural model [22], we maintain the basic characteristics of the model:
the Multi-faceted approach, which allows the creation of new facets to clarify behavior and to allow automation of the implementation process; a graphical formalism that expresses the control structure of multi-user interactions and the real-time adaptation of awareness characteristics; and a run-time model that allows dynamic control of interactions. We add the IRVO interaction formalism, allowing the expression of new augmented-reality interactions, and we structure the system with a hybrid approach, allowing us to mix XML specifications, engine-based interpretation, and connections to real components of the functional core or to the management of new interaction devices (Fig. 2).
Fig. 2. Relations between Arch model (dashed lines), AMF-C and IRVO models
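The following is a minimal sketch of the model-based adaptation idea discussed above: an abstract interactor is concretized against a platform profile at design or run time. The profile fields, interactor names and mapping rules are invented for illustration; real task and platform models (e.g. CTT, CC/PP) are far richer than this.

# Sketch of model-based UI adaptation: abstract interactors are
# instantiated against a platform profile. All names are invented.

ABSTRACT_UI = [
    {"task": "choose_destination", "interactor": "single_choice", "items": 40},
    {"task": "confirm", "interactor": "command"},
]

def concretize(widget, platform):
    small = platform["screen_width"] < 400
    if widget["interactor"] == "single_choice":
        # Long lists collapse to a search field on small screens.
        if small and widget.get("items", 0) > 10:
            return "autocomplete_field"
        return "dropdown" if small else "radio_group"
    if widget["interactor"] == "command":
        return "soft_key" if small else "toolbar_button"
    return "label"

pda = {"screen_width": 320}
desktop = {"screen_width": 1280}
for w in ABSTRACT_UI:
    print(w["task"], "->", concretize(w, pda), "/", concretize(w, desktop))

The same abstract model thus yields different concrete interfaces per platform, which is what makes the approach attractive for adaptable and adaptive collaborative applications.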
4 Web Services and Agent-Based Architectures Web services lead to new possibilities and problems concerning distributed system design. Fig. 3 illustrates a complex industrial organization exploiting web services.
Fig. 3. Example of different actors communicating directly or not via web services [23]
Traditional web services provide functionalities based on a classical client/server architecture, but agent-based architectures offer new perspectives in this field: they exploit the autonomous and proactive behavior of agents. Interesting new approaches are appearing in the literature. For instance, a technical framework for AWS (agent-based web services) is described in [24]; it supports the idea of capturing, modeling and implementing service functionalities with autonomous and dynamic interactions. Technically, agent-oriented software construction, knowledge representation and interaction mechanisms are integrated. Fig. 4 gives an impression of the framework. DAML-S (DARPA Agent Markup Language for Services) is a semantic markup language for describing web services and related ontologies; it has been superseded by OWL-S [25]. A discussion of dynamic web-service invocation by agents can be found in [26]. Their infrastructure is a hybrid peer-to-peer model, in which agents represent service providers and service customers. For this purpose JADE (Java Agent DEvelopment Framework) [27] is used, a framework developed as an open-source project. A
web service can be published as a JADE agent service, and agent services can be published as web service endpoints (see also [24]). Such propositions deserve close attention with regard to agent-based architecture perspectives for service-oriented interactive systems.
Fig. 4. Integrated technical framework for agent-based web services [24]
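As a schematic illustration of the agent-based web service idea, and deliberately not the actual JADE, WSDL or DAML-S APIs, the following sketch shows agents publishing services in a shared registry and a customer agent proactively selecting among providers. That dynamic selection is the behavior distinguishing this style from a statically bound client/server call; all names are invented.

# Toy registry of agent-provided services; not a real middleware API.

class Agent:
    def __init__(self, name, registry):
        self.name = name
        self.registry = registry

class ProviderAgent(Agent):
    def __init__(self, name, registry, latency_ms):
        super().__init__(name, registry)
        self.latency_ms = latency_ms
        registry[("quote", name)] = self   # publish as a service endpoint

    def invoke(self, payload):
        return {"provider": self.name, "price": 10 + len(payload)}

class CustomerAgent(Agent):
    def request_quote(self, payload):
        # Proactive behavior: pick the fastest provider dynamically.
        providers = [a for (svc, _), a in self.registry.items() if svc == "quote"]
        best = min(providers, key=lambda a: a.latency_ms)
        return best.invoke(payload)

registry = {}
ProviderAgent("p1", registry, latency_ms=40)
ProviderAgent("p2", registry, latency_ms=15)
print(CustomerAgent("c", registry).request_quote("steel order"))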
5 Agent-Based Architectures: The Evaluation Problem The evaluation of interactive systems aims at ensuring that users are capable of carrying out their tasks. Evaluation methods and tools are numerous and of different types; they are generally based on two global criteria: utility and/or usability [28]. When an interactive system uses an agent-based architecture, new methodological and conceptual questions appear. For instance: how should such systems be evaluated? Is it necessary to combine several evaluation methods? Is it possible to be assisted by automated or semi-automated evaluation tools? How can such tools be connected to agent-based systems? How can the agents' behaviors be linked to the analyzed situations? There are several further questions. We are particularly interested in automated or semi-automated tools. An electronic informer (EI) is a software tool that automatically captures interactions between the user and the UI in real situations, in a discreet and transparent way, so that the user does not feel hampered by the tool. The captured data are objective and can be scientifically analyzed by the evaluators. For a review of EIs, we refer to [29]. Several tools are available, but very few of them take the specificities of agent-based interactive systems into account in their evaluation approaches [30, 31, 32, 33]. The architecture of a tool dedicated to such systems is shown in Fig. 5. This kind of EI aims at capturing not only interactions between the user and interface agents, in terms of UI events, as other EIs do, but also interactions between the agents themselves, in terms of interactions between services. It also aims to go further than other EIs in assisting evaluators in interpreting the analysis results of the captured data, in order to evaluate three aspects of an agent-based interactive system: the user interface (UI), some non-functional properties (such as response time, reliability, complexity, etc.), and
Fig. 5. Example of a tool for evaluating agent-based interactive systems [33]
properties of the users operating the system (ability, habits, preferences, the progress of a given user, etc.). The tool is composed of seven independent modules. Module 1 is responsible for capturing the events that occur in all agents of the system; it saves them into a database that is then analyzed by the other modules. The connection between this EI and the evaluated agent-based system is based on the association of each type of agent (interface agents, controller agents, application agents) with a corresponding informer. The evaluation can be carried out remotely: module 1 and the evaluated system can run on the same machine or on two different machines on the network. After capturing data, the EI enables the evaluator to determine the tasks the user has performed (module 2). Synthetic calculations and statistics can be computed on the captured data, such as the number and frequency of events, the average response time of service interactions, the time taken to perform a task, the number of successful or failed tasks, etc., for any chosen agent or for all agents in any chosen period of time. These analysis results are shown to the evaluator using tables or graphs (module 3). The tool also enables the generation of Petri nets (PNs), and the evaluator can
compare PNs (modules 4 and 5). A generated PN describes the user's actions, in terms of UI events that occurred on interface agents, and the system's actions, in terms of the services executed by agents to perform a given task. Generated PNs are called observed or real PNs. The evaluator can compare the real PNs of a given task for a given user with the theoretical PNs predicted by the designers for the same task, or compare the real PNs of different users for the same task. Exploiting the formal nature of PNs, such comparisons are very useful for detecting problems of the interface, the system or the users, such as: wrong or useless user actions, non-optimal ways chosen by users to perform tasks, failed service interactions, and properties of users (habits, evaluation and comparison of the abilities of different users, supervision of the progress of a given user's abilities, etc.). The analysis results of module 3 and the generated and compared PNs can all be interpreted with the indications of module 6 (which enables association with an open list of determined criteria) to help evaluators critique the system and propose useful suggestions for improvement to the designers. This tool is representative of a new generation of tools dedicated to agent-based interactive systems. A lot of research is still necessary in this domain (adaptation to different application fields and architecture models, help in real time, etc.).
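To illustrate the kind of synthesis module 3 performs, the following sketch derives event counts per agent and an average service response time from a captured event log. The event schema (timestamp, agent, kind) is an assumption made for the example, not the tool's actual database format.

# Sketch of informer statistics over captured events; schema invented.
from collections import defaultdict

events = [  # (timestamp_s, agent, kind)
    (0.0, "interface_agent_1", "request"),
    (0.4, "application_agent", "response"),
    (2.0, "interface_agent_1", "request"),
    (2.9, "application_agent", "response"),
]

counts = defaultdict(int)
latencies = []
pending = None
for t, agent, kind in events:
    counts[agent] += 1
    if kind == "request":
        pending = t
    elif kind == "response" and pending is not None:
        latencies.append(t - pending)   # service interaction response time
        pending = None

print(dict(counts))
print("avg response time:", sum(latencies) / len(latencies), "s")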
6 Conclusion and Perspectives Since the eighties, many models and approaches concerning so-called distributed or agent-based architectures of interactive systems have been proposed in the literature. For lack of space, only a brief overview of this domain could be given here, covering (1) general agent-based architecture models, (2) models dedicated to CSCW systems, (3) interactive systems based on web services, and (4) the evaluation of interactive systems using an agent-based architecture. Many research and development perspectives can now be envisaged. Currently, general agent-based architecture models are mainly used at the conceptual level. They allow good application design, minimizing dependencies and improving the maintainability of applications. They now need to be used more widely at the implementation level. Their inclusion in integrated development environments, such as Eclipse, might be the next step in allowing tools to be developed. Help for software design, simulation, and evaluation are the main topics to be addressed. Capillary cooperative systems need substantial context adaptation. These mechanisms are more easily elaborated in hybrid architectures using agents in several layers. The autonomous behavior and independence of agent-based systems constitute an important advantage. Much current research concerns context-aware interactive systems; different types or generations of adapted agent-based architecture models will have to be progressively proposed. Agent-based systems might help to compose web services dynamically; in this way they can support the dynamic adaptation of workflow systems. Many research problems also remain to be studied and solved regarding the evaluation of agent-based interactive systems.
Acknowledgements. The present research work has been supported by CISIT, the Nord-Pas-de-Calais Region, and the European Community (FEDER). The authors gratefully acknowledge the support of these institutions.
References 1. Pfaff, G.E.: User interface management system. Springer, Heidelberg (1985) 2. Bass, L., Little, R., Pellegrino, R., Reed, S.: The Arch Model: Seeheim revisited. In: Proceedings of User Interface Developers Workshop, Seeheim (1991) 3. Goldberg, A.: Smalltalk-80, the interactive programming environment. Addison-Wesley, Reading (1983) 4. Coutaz, J.: PAC, an Object-Oriented Model for Dialog Design. In: Bullinger, H.-J., Shackel, B. (eds.) Proc. Interact 1987, 2nd IFIP International Conference on HumanComputer Interaction, Stuttgart, Germany, September 1-4, 1987, pp. 431–436 (1987) 5. Ouadou, K.: AMF: Un modèle d’architecture multi-agents multi-facettes pour Interfaces Homme-Machine et les outils associés (in French), PhD Thesis, ECL, Lyon (1994) 6. Goschnick, S., Sterling, L.: Shadowboard: an Agent-oriented Model-View-Controller (AoMVC) architecture for a digital self. In: Proc. Int. Workshop on Agent Technologies over Internet Applications (ATIA 2001), Tamkang University, Taipei, Taiwan (2001) 7. Jambon, F.: From Formal Specifications to Secure Implementations. In: Kolski, C., Vanderdonckt, J. (eds.) Computer-Aided Design of User Interfaces (CADUI 2002), pp. 43–54. Kluwer Academics, Dordrecht (2002) 8. Nigay, L.: Conception et modélisation logicielles des systèmes interactifs: application aux interfaces multimodales (in French), PhD Thesis, Joseph Fourier Univ., Grenoble (1994) 9. Texier, G., Guittet, L., Girard, P.: The Dialog Toolset: a new way to create the dialog component. In: Stephanidis, C. (ed.) Universal Access in HCI, pp. 200–204. Lawrence Erlbaum Associates, Mahwah (2001) 10. Depaulis, F., Maiano, S., Texier, G.: DTS-Edit: an Interactive Development Environment for Structured Dialog Applications. In: Kolski, C., Vanderdonckt, J. (eds.) ComputerAided Design of User Interfaces (CADUI 2002), pp. 75–82. Kluwer Academics, Dordrecht (2002) 11. Francis, J., Girard, P., Boisdron, Y.: Dialogue Validation from Task Analysis. In: Duke, D.J., Puerta, A. (eds.) Eurographics Workshop on Design, Specification, and Verification of Interactive Systems (DSV-IS 1999), Braga, Portugal, pp. 205–224. Springer, Heidelberg (1999) 12. Baron, M., Girard, P.: SUIDT: Safe User Interface Design Tool. In: International Conference on Intelligent User Interfaces Computer-Aided Design of User Interfaces (IUICADUI 2004), Madeira, Portugal, pp. 350–351. ACM Press, New York (2004) 13. Balme, L., Demeure, A., Barralon, N., Coutaz, J., Calvary, G.: CAMELEON-RT: A Software Architecture Reference Model for Distributed, Migratable, and Plastic User Interfaces. In: Markopoulos, P., Eggen, B., Aarts, E., Crowley, J.L. (eds.) EUSAI 2004. LNCS, vol. 3295, pp. 291–302. Springer, Heidelberg (2004) 14. Calvary, G., Daassi, O., Coutaz, J., Demeure, A.: Des widgets aux comets pour la plasticité des systèmes interactifs. Revue d’interaction Homme-Machine 6(1), 33–53 (2005) 15. Ellis, C.A., Gibbs, S.J., Rein, G.L.: Groupware: some issues and experiences. Communication of ACM 34(1), 39–58 (1991)
16. David, B., Chalon, R., Vaisman, G., Delotte, O.: Capillary CSCW. In: Stephanidis, C., Jacko, J. (eds.) Human-Computer Interaction Theory and Practice, LEA, pp. 879–883 (2003) 17. Ellis, C.A., Wainer, J.: A Conceptual Model of Groupware. In: Proceedings of CSCW 1994 Conference, pp. 79–88. ACM Press, New York (1994) 18. Patterson, J.F.: A taxonomy of architectures for synchronous groupware applications. In: Workshop on Software architectures for cooperative systems CSCW 1994. ACM SIGOIS Bulletin Special Issue Papers of the CSCW 1994 workshops, vol. 15(3) (April 1995) 19. Dewan, P., Choudhary, R.: Coupling the User Interfaces of a Multiuser Program. ACM Transactions on Computer-Human Interaction 2(1), 1–39 (1995) 20. Tarpin-Bernard, F.: Architectures logicielles pour le travail cooperatif (in French), PhD Thesis, Ecole Centrale de Lyon, France (1997) 21. Laurillau, Y.: Conception et réalisation logicielles pour les collecticiels centrées sur l’activité de groupe: le modèle et la plate-forme Clover (in French), PhD Thesis, Joseph Fourier University, Grenoble (2002) 22. Tarpin-Bernard, F., Samaan, K., David, B.: Achieving usability of adaptable software: the AMF-based approach. In: Seffah, A., Vanderdonckt, J., Desmarais, M.C. (eds.) HumanCentered Software Engineering, Software Engineering Models, Patterns and Architectures for Human-Computer Interaction, Springer, Heidelberg (2009) 23. Idoughi, D.: Contribution à un cadre de spécification et conception d’IHM de supervision à base de services web dans les systèmes industriels complexes, application à une raffinerie de sucre (in French), Ph.D. Thesis, University of Valenciennes, France (2008) 24. Li, Y., Shen, W.-m., Ghenniwa, H., Lu, X.: Model-Driven Agent-Based Web Services IDE. In: Wang, S., Tanaka, K., Zhou, S., Ling, T.-W., Guan, J., Yang, D.-q., Grandi, F., Mangina, E.E., Song, I.-Y., Mayr, H.C. (eds.) ER Workshops 2004. LNCS, vol. 3289, pp. 518–528. Springer, Heidelberg (2004) 25. Paolucci, M., Sycara, K.: Autonomous Semantic Web Services. IEEE Internet Computing 7, 34–41 (2003) 26. Yang, H., Chen, J., Meng, X., Zhang, Y.: A Dynamic Agent-based Web Service Invocation Infrastructure. In: Proceedings of the First Int. Conf. on Advances in ComputerHuman Interaction, Sainte Luce, Martinique, pp. 206–211 (2008) 27. Bellifemine, F., Caire, G., Poggi, A., Rimassa, G.: JADE: A software framework for developing multi-agent applications. Lessons learned, Information & Software Technology 50(1-2), 10–21 (2008) 28. Nielsen, J.: Usability Engineering. Academic Press, Boston, MA (1993) 29. Hilbert, D.M., Redmiles, D.F.: Extracting usability information from user interface events. ACM Computing Surveys 32(4), 384–421 (2000) 30. Trabelsi, A., Ezzedine, H., Kolski, C.: Architecture modelling and evaluation of agentbased interactive systems. In: Proc. IEEE SMC 2004, The Hague, pp. 5159–5164 (2004) 31. Tarby, J.-C., Ezzedine, H., Rouillard, J., Tran, C.D., Laporte, P., Kolski, C.: Traces using aspect oriented programming and interactive agent-based architecture for early usability evaluation: Basic principles and comparison. In: Jacko, J.A. (ed.) HCI 2007. LNCS, vol. 4550, pp. 632–641. Springer, Heidelberg (2007) 32. Ezzedine, H., Bonte, T., Kolski, C., Tahon, C.: Integration of traffic management and traveller information systems: basic principles and case study in intermodal transport system management. Int. J. of Comp., Com. & Control (IJCCC) 3, 281–294 (2008) 33. 
Tran, C.-D., Ezzedine, H., Kolski, C.: A generic and configurable electronic informer to assist the evaluation of agent-based interactive systems. In: 7th international conference on Computer-Aided Design of User Interfaces, CADUI 2008, Albacete (June 2008)
BunBunMovie: Scenario Visualizing System Based on 3-D Character Tomoya Matsuo and Takashi Yoshino Wakayama University, 930 Sakaedani, Wakayama, Japan
[email protected] Abstract. There are many text-based contents, such as novels and scripts. These contents consist only of a scenario and lack visual information. The purpose of this research is to provide a visualization environment that can visualize text-based contents easily. Moreover, such an environment also provides the opportunity to get pleasure out of a scenario. Visualizing a scenario requires making various motions for characters and depicting various situations. We therefore propose a motion assortment function to create various character motions; the function uses a Japanese dictionary and a thesaurus search. We also propose an associated image display function that uses an image search to depict various situations. Experiments on the motion assortment function show that the proposed method can assort some motions. Experiments based on subjective assessment showed that some subjects were inclined to use such an easy visualization environment. Keywords: Scenario visualizing, 3-D character, motion synthesis.
1 Introduction There exist many forms of written communication, such as diaries, novels, and scripts. The contents of these communication media are text-based and can be used to provide descriptions of various scenarios. It is extremely easy to use one's imagination to visualize a scenario. However, to actually recreate a scene, it is necessary to use images and animation, which requires considerable preparation. Recently, animation has gained popularity and the demand for it has increased considerably. However, it is difficult to create animated objects using 3D software, and to date there has been no single environment that can be easily used for creating them. In this study, we propose a scenario visualization system that only requires users to input sentences describing the required situation. In order to effectively create 3D movies, it is necessary to provide detailed descriptions of the scene, the characters, and their movements. We simulate movements for verbs that are not registered in the system by using a national language dictionary and a thesaurus. In addition, our system retrieves images corresponding to the scenario and displays them.
2 Related Works Previous studies have proposed several systems that can recreate scenarios on the basis of input information alone. Zeng developed a system named "3D Story Visualiser" [6, 7]. This system utilizes the visual information in input sentences to create 3D scenes. The input is in the form of natural language and is processed with a tool known as NLS, which extracts nouns and prepositions from the input sentences, converts them into VRML form, and outputs them as 3D images. As in our system, the only task performed by the user is the input of sentences. However, the 3D Story Visualiser can only reproduce information as a 3D image; the movements of the characters are not reproduced. The aim of our study is to create a system that produces animated characters. Aoki developed a system for creating animated objects entitled the "Digital Movie Director" [1]. His system is based on the TVML technology [2, 3] developed by the NHK Science & Technical Research Laboratories. In this system, the subject, predicate, and object used in the scene are set by the user, who can also set camera angles and sound effects. In our system, the only task of the user is to input sentences. In the Digital Movie Director, by contrast, the user has to first develop both the characters and the scene, so a large amount of time is required to develop 3D images; the system cannot create a 3D movie unless both the scene and the movements of the characters are prepared in advance.
3 System for Visualizing 3D Character Scenarios 3.1 Objective The objectives of our system are as follows: 1. The only task to be performed by the user is the input of sentences describing the scenario. We aim to develop a visualization system in which no special operations need to be performed by the user. A system that only requires sentences as input can be used easily by anyone. Our system first analyzes the input sentences and then determines the subject and predicate. 2. The system responds to verbs other than the registered verbs. It is difficult to prepare simulated movements for all verbs in advance, so the system must respond to verbs other than those that are registered. We therefore developed a "dictionary retrieval function" and a "movement synthesis function," which allow us to simulate movements corresponding to verbs that are not registered in the system. 3.2 BunBunMovie The system developed in this study is called "BunBunMovie." It analyzes input sentences and uses the information in them to develop a 3D movie. Figure 1 shows the execution screen of our system. The upper part of Figure 1 shows
a screen that displays the input sentences; the "3D character," "Related image," and "Sentences under analysis" are displayed on it. The lower part of Figure 1 shows the screen used to input sentences. The user inputs sentences and pushes the reproduction button, following which the sentences are visualized on the screen. The BunBunMovie system is programmed in C#. We use TVML to create 3D animated objects, and MeCab [4] is used for morphological analysis.
Fig. 1. Execution screen of the system
Flow of the sentence analysis process. We now explain the process for analyzing the input. Figure 2 shows the flow of the sentence analysis procedure; each step is explained as follows. 1. First, the system analyzes the input sentences using MeCab (Figure 2 (1)). The nouns that denote the subject and the place are determined from the relation between each noun and its case-marking particle. 2. The system examines whether the noun assumed to be a subject exists in a "subject list" (Figure 2 (2)). If the noun exists in the list, a related image is retrieved (Figure 2 (3)). 3. To analyze the verb, the system examines a "verb list" (Figure 2 (4)). If the verb under analysis does not exist in the list, the system refers to a dictionary (Figure 2 (5), (6), (7)). 4. The system then examines whether the noun assumed to denote a place exists in a "place list" (Figure 2 (8)). If the noun exists in the list, its corresponding image is retrieved (Figure 2 (9)). 5. The subject, the predicate, and the noun indicating the place are converted into TVML format (Figure 2 (10)). A minimal sketch of this flow is given below.
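The following Python sketch mirrors the flow just described. A toy whitespace tokenizer stands in for MeCab (which performs real Japanese morphological analysis), and the word lists and TVML-like output are placeholders, not the system's actual data or format.

# Sketch of the analysis flow: find subject/verb/place, check word
# lists, emit a TVML-like command. All data here are toy stand-ins.

SUBJECTS = {"I", "uncle", "Yamada"}
VERBS = {"run", "walk", "dance", "say"}
PLACES = {"meadow", "mountain", "disco"}

def analyze(sentence):
    words = sentence.strip(".").split()
    subject = next((w for w in words if w in SUBJECTS), None)
    verb = next((w for w in words if w in VERBS), None)
    place = next((w for w in words if w in PLACES), None)
    return subject, verb, place

def to_tvml(subject, verb, place):
    lines = []
    if place:
        lines.append(f"set: place={place}")        # triggers image retrieval
    if subject and verb:
        lines.append(f"character: {subject} act={verb}")
    return lines

print(to_tvml(*analyze("I dance in the disco")))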
3.3 Functions of the BunBunMovie System The BunBunMovie system uses three functions, "word list," "dictionary retrieval," and "image retrieval," to recreate the desired scenarios from the information input by the users.
Fig. 2. Flow of the sentence analysis process
Use of “Word list”. The word list consists of a list of words that already have corresponding images present in the system. The word list consists of three lists, a “subject list,” “verb list,” and a “place list.” The system checks these lists for words extracted from the input sentences and uses the corresponding images to recreate the desired scenarios. The detailed description of each list is as follows: 1. Subject List A noun that is recognized by the system as a subject is registered in the subject list. For example, nouns such as “I” and “he” are present in the subject list. All words registered in the subject list are nouns that fall in the “living thing” category of the Japanese dictionary. The number of registered words in our list is approximately 1700. 2. Verb List Verbs such as “run” and “walk” are registered in the verb list. The verbs registered in the verb list can be converted into TVML. There are 103 verbs present in our list. 3. Place List This list contains nouns that indicate a place or location. For instance, this list contains words such as “meadow” and “mountain.” The total number of words in the place list is approximately 6500. The registered words are nouns that exist in the “place” category of the Japanese vocabulary dictionary. The place list is required to accurately recreate the location of a scene from its description. Dictionary retrieval function. In order to recreate a scene properly, we need to recreate the movements of the characters in the scene. However, it is not possible to simulate all types of movements since the number of verbs registered in the list is limited. This results in a 3D character that is inanimate even though sentences describing its behavior have been inputted.
The dictionary retrieval function works as follows. 1. First, the system obtains synonyms of the unregistered verb from the thesaurus (Figure 2 (5)). 2. Then, the system retrieves the definition of the unregistered verb from the Yahoo online national language dictionary (Figure 2 (6)). 3. The system analyzes this definition using MeCab. 4. It compares the information obtained from the national language dictionary with the synonyms and examines whether the definition matches any of the synonyms (Figure 2 (7)). 5. If a match is found, the system takes the synonym as a paraphrase of the unregistered verb. 6. The system examines whether this verb exists in the verb list (Figure 2 (4)). 7. On the basis of the information in the verb list, the appropriate movements are assigned to the 3D characters. For instance, if the dictionary retrieval function is applied to an unregistered verb such as "whisper," the system paraphrases "whisper" as the verb "say." The system can then re-examine the verb list on the basis of this new information and recreate the desired scene accurately.
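The matching step can be sketched as follows: a paraphrase for an unregistered verb is a thesaurus synonym that also appears in the verb's dictionary definition and is itself registered. The toy THESAURUS and DEFINITIONS tables stand in for the thesaurus and the Yahoo dictionary; their contents are invented for the example.

# Sketch of the dictionary retrieval function's paraphrase step.

THESAURUS = {"whisper": ["say", "murmur", "mutter"]}
DEFINITIONS = {"whisper": "to say something very softly"}
VERB_LIST = {"say", "run", "walk"}

def paraphrase(verb):
    synonyms = THESAURUS.get(verb, [])
    definition_words = set(DEFINITIONS.get(verb, "").split())
    for s in synonyms:
        if s in definition_words and s in VERB_LIST:
            return s            # registered verb usable as a paraphrase
    return None

print(paraphrase("whisper"))    # -> "say"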
The image retrieval function. When sentences are visualized, the movement of the 3D character alone provides insufficient visual information; the appearance of the characters and information about the scene are also necessary. The system uses the image retrieval function, based on Google image retrieval, to add this visual information. With this function, users can obtain and display images related to the subject and the place described in the input. As an example, consider the following sentence input by a user: "My uncle was in a meadow." The system takes "uncle" as the subject and "meadow" as the location; the images corresponding to this subject and place are then retrieved and displayed. Using this function, further information, such as the role of a character and other information about the scene, can also be added. Our system also uses Google Image Search's face filter to retrieve only facial images related to a subject. This option displays images related to a particular face by priority, so the probability of obtaining inaccurate images of a subject is very low. 3.4 Development of the Movement Synthesis Function Even though the dictionary retrieval function allows us to handle several unregistered verbs, it cannot process all of them. We therefore developed the movement synthesis function, which combines the movements of verbs already registered in the verb list to develop movements for unregistered verbs. In other words, the movement synthesis function first selects registered verbs and then combines their movements.
A verb is selected on the basis of the synonyms of the unregistered verb. The flow of the movement synthesis function, illustrated in Figure 3, is as follows. 1. We first retrieve synonyms for the verbs already registered in the verb list; as a result, synonym tags are added to these verbs. 2. The system then retrieves synonyms of the unregistered verb from a thesaurus (Figure 3(1)). 3. The system compares the synonym tags of the unregistered verb with those of the registered verbs and counts the agreements between them (Figure 3(2)). 4. It then combines the movements of two or more registered verbs to develop the movement corresponding to the unregistered verb (Figure 3(3)). A sketch of this selection step is given after Fig. 3.
Fig. 3. Movement synthesis function. UNKNOWN, A, B and C denote verbs; the words in brackets denote their synonym tags.
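A minimal sketch of the selection step follows: registered verbs are ranked by the number of synonym tags they share with the unregistered verb, and the top two are chosen for combination. The tag sets are invented for illustration.

# Sketch of verb selection for movement synthesis by tag overlap.

REGISTERED = {
    "smile": {"enjoy", "laugh"},
    "cry":   {"sad", "weep"},
    "jump":  {"jump", "leap", "enjoy"},
}

def synthesize(unknown_tags, k=2):
    ranked = sorted(REGISTERED,
                    key=lambda v: len(REGISTERED[v] & unknown_tags),
                    reverse=True)
    return ranked[:k]           # motions of these verbs get combined

print(synthesize({"jump", "enjoy"}))   # -> ['jump', 'smile']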
4 Experiments for Evaluating the Movement Synthesis Function We carried out an experiment to evaluate the performance of the movement synthesis function. The purpose of the experiment was to verify whether a user can correctly recognize the movements simulated by the system. In the experiment, ten students of Wakayama University observed the characters and movements developed by the system and then provided feedback by filling out a questionnaire. 4.1 Experimental Procedure The movement synthesis function produced five character movements: "set up the traps," "hang up," "see off," "amend," and "burst." The test users evaluated these movements. The pool of verbs available for combination contained approximately 70 verbs, from which the movement synthesis function chose two verbs to develop each movement. The test users answered a questionnaire with a five-point rating scale and another, description-based questionnaire. 4.2 Results of the Experiment Table 1 shows the results of the five-point ratings in the questionnaire.
Movements with a high average rating were those correctly recognized by the users, while other movements received low ratings; the average ratings of the verbs "burst" and "see off" were high. Table 2 shows the feedback provided by the test users in the description-based questionnaire. The feedback was both positive and negative: one positive comment stated that the movement of the characters was interesting, while a negative comment stated that the movements of the characters were inaccurate.

Table 1. Results of the questionnaire in the experiment evaluating the movement synthesis function

Movement developed by the system | Movements used for synthesis | Average
set up the traps                 | go out + poke                | 3.0
hang up                          | dig + apologize              | 1.8
see off                          | go to + say no               | 3.8
amend                            | be mad + be worried          | 2.4
burst                            | be mad + jump                | 3.4

The values in the "Average" column denote the mean ratings given in response to the question "Is the movement of the character appropriate?", on the following five-point Likert scale: 1: Strongly disagree, 2: Disagree, 3: Neutral, 4: Agree, 5: Strongly agree.
Table 2. Impressions of the movement synthesis function

Positive feedback
・The fact that the system itself develops the movements of the character is interesting.
・Even though a movement may not be recognizable, it is important that the characters react to more words.
・Both appropriate and inappropriate movements were present. The appropriate movements were useful.

Negative feedback
・The character expresses two separate movements for a single verb.
・Several movements of the character were not recognizable.
・Synthesized movements are difficult to understand when the character's motion is monotonous.

4.3 Discussion The results of the five-point ratings show that the average ratings for the verbs "burst" and "see off" were high. This is an example of the effectiveness of the movement synthesis function. It should be noted that when one of the two verbs used in the synthesis of a movement is related to the target verb, the average rating is high: for instance, "jump" was used for the synthesis of "burst," and "go to" was used for the synthesis of "see off." This shows that if verbs with similar meanings are used for synthesis, the required movements can be developed accurately. However, the overall accuracy of the movement synthesis function based on synonym information is not high. One negative piece of feedback stated
that the character expressed two separate movements for a single verb; in other words, the test user was unable to recognize that the movement of the character corresponded to a single verb. To solve this problem, we plan to improve the system so that it displays more natural movements; to this end, we intend to utilize the composite motions developed by Oshita [5].
5 Trial Evaluation of the System by Users We performed trial experiments in which users evaluated the system. The test users input sentences describing a scene and then viewed the movie developed by the system. The purpose of the experiment was to let users evaluate the accuracy of the sentence analysis and the quality of the movie. Ten students from Wakayama University tested our system. 5.1 Experimental Process The test users first input the sentences and then observed the movies developed by the system on the basis of the information provided. The duration of the experiment was 10 minutes. After the experiment, the test users answered a questionnaire based on a five-point rating scale and a description-based questionnaire. 5.2 Results of the Experiment When sentences in which the test user clearly described the subject and predicate, such as "I went to college" and "I climb the mountain," were input, the system accurately developed the movie. However, when only predicates such as "Went dancing" and "It kicked and knocked it down" were input, the system was not able to develop a movie. Table 3 shows the questionnaire results; overall, the system was highly rated by the test users.

Table 3. Questionnaire results for the system

Question | Average
1. The movements of the characters are interesting. | 4.4
2. A related image improves interest. | 3.9
3. I want to visually recreate my diary using this system. | 4.1

The values in the "Average" column indicate the mean ratings given in response to the questions, on a five-point Likert scale: 1: Strongly disagree, 2: Disagree, 3: Neutral, 4: Agree, 5: Strongly agree.

Table 4 shows the test users' requests and impressions. Some users expressed a desire to develop their own images and characters to be recreated by the system. Other requests involved an improvement in
the response of the system to different words and in the analytical accuracy of the system. Most of the test users felt that the system was interesting, although they stated that it was necessary to increase the number of words that the system can translate into images. Overall, most of the feedback we received was positive.

Table 4. Requests and impressions for the system

Requests for the system
・I would like to use sentences that allow the system to recreate images that I have already prepared.
・I would like to develop the character and then use it in the system.
・The system should be able to respond to more words.
・When the system cannot analyze sentences, it should display "We cannot analyze these sentences."
・The speed of image reproduction by the system should be improved.

Impressions of the system
・The system is interesting.
・I enjoyed using the system, although the movie was slightly unclear.
・Although the system is interesting, it is necessary to improve it.
・The system can only be used to visualize diaries if it responds to more words.
・If the input sentences are not grammatically correct, the characters do not respond well.
5.3 Discussion From the results of the five-point questionnaire, we found that the rating for the item "I want to visually recreate my diary using this system" was high. However, there were several requests concerning the system: better analysis of the sentences, an increase in the number of words the system responds to, and improved analytical accuracy. In the experiment, our system was unable to analyze many sentences, and the test users therefore requested an improvement in the accuracy of the sentence analysis.
6 Conclusion Conventional visualization systems are not able to recreate scenarios when the words extracted from the input are not registered with the system. Moreover, these systems are not able to develop movements for each character. To resolve these issues, we proposed a system that uses a dictionary retrieval function and a movement synthesis function to recreate the required scenarios. The performance of our system was evaluated through trial tests by users. The results of our tests were as follows. 1. Developing appropriate movements for a character was a problem in conventional systems. We developed a movement synthesis function that utilizes synonyms to develop the movements corresponding to a particular verb.
To verify the accuracy of the movement synthesis function, we performed evaluation experiments. Many of the movements developed by the synthesis function were not appropriate; however, the movement synthesis function did allow us to develop animated characters, which shows that the proposed technique has good potential. 2. We also performed test experiments in which users evaluated our system. The test users stated that they were satisfied with the system, and we received feedback stating that the users found it interesting, which again demonstrates its potential. However, our system only responds to sentences in which the subject and the predicate are input appropriately, and there were several requests concerning the accuracy of the analysis of natural sentences. We plan to further improve the analytical accuracy of the system and the movement synthesis function.
References 1. Aoki, T.: Digital Movie Director, http://www.rcast.u-tokyo.ac.jp/ja/research/pioneers/007/index.html 2. Douke, M., Hayashi, M., Makino, E.: A Study of Automatic Program Production Using TVML, Short Papers and Demos. In: Eurographics 1999, pp. 42–45 (1999) 3. Hayashi, M.: TVML (TV program Making Language): Make Your Own TV Programs on a PC! In: International Conferences, Virtual Studios and Virtual Production (2000) 4. Kudo, T., Yamamoto, K., Matsumoto, Y.: Applying Conditional Random Fields to Japanese Morphological Analysis. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), pp. 230–237 (2004) 5. Oshita, M.: Smart Motion Synthesis. In: SIGGRAPH 2007 Posters (2007) 6. Zeng, X., Mehdi, Q.H., Gough, N.E.: Shape of the Story: Story Visualization Techniques. In: Seventh International Conference on Information Visualization, pp. 144–150 (2003) 7. Zeng, X., Mehdi, Q.H., Gough, N.E.: From Visual Semantic Parameterization to Graphic Visualization. In: Ninth International Conference on Information Visualization, pp. 488–493 (2005)
Augmented Collaborative Card-Based Creative Activity with Digital Pens Motoki Miura, Taro Sugihara, and Susumu Kunifuji School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa, 923-1292, Japan
[email protected],
[email protected],
[email protected] Abstract. Typically, practitioners of the KJ method use paper labels and four-colored ball-point pens to externalize their thoughts and ideas during the process. A similar approach and method is used in group KJ lessons. However, due to the large paper size required, this approach limits the effective capture and sharing of outcomes. Considering the merits of the conventional paper-and-pen approach and the demand for quick sharing of outcomes after a session, we designed and implemented a system to digitize the group KJ session: not just the outcomes but also the details of the creative work process. We use digital pens to capture the position and orientation of labels, as well as their contents, during the session. We confirmed the efficiency of our system in several KJ sessions. Keywords: CSCW, Creative meeting, Label work, KJ method.
1 Introduction We often use small paper cards or post-it notes to organize our thoughts and ideas. Organizing paper cards has two advantages: (1) moving the cards by hand is intuitive, and (2) groups of users can access the cards simultaneously. Such card-based activity is therefore commonly used for summarizing and organizing ideas. However, digitizing the process of card-based activities raises special concerns. There are several tools that handle virtual cards and post-it notes on a PC, such as the KJ Editor [6], GUNGEN [7], and D-ABDUCTOR [8]. These tools are effective for organizing personal tasks but are not suitable for group work, because they require mouse and keyboard input. To overcome these input limitations with a natural, augmented-reality approach, Klemmer et al. [1] proposed the Designer's Outpost system, which captures the flow and transition of post-it notes on a wall-size screen. We also developed a system that relieves the occlusion problem of the Outpost system by using glass with controllable transparency [2]. However, the Outpost system's large size makes it difficult to move, rendering it inapplicable for digitizing activities in the casual meetings usually held in knowledge-creating companies and research laboratories. In this paper, we propose a system that captures the locations of paper cards with a small and simple setup. The captured locations are then used to reproduce the card organization process after the session ends.
2 Capturing Card Locations by Digital Pens We used Anoto-based pens to store drawings on paper cards and a base sheet. An Anoto-based pen recognizes the position of drawings by scanning special dot patterns printed on the paper. Using the unique features of these patterns, the system can distinguish drawings made on the cards from those made on the base sheet. Thanks to these characteristics, the drawings can be used not only for handwritten notes but also for describing the relationships between the paper cards and the base sheet. When the user draws a line that covers both the sheet and a card (Fig. 1, left), the pen recognizes the line as three drawings (Fig. 1, right). If these drawings are generated at almost the same time, we can infer that, at that time, the paper card was placed so as to connect the three drawings. We call this operation scanning and the connecting points joints. A sketch of this inference is given after Fig. 1.
Fig. 1. A line over the card border is separated into three lines
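The following sketch shows how this inference might be implemented: three near-simultaneous stroke fragments, alternating base sheet / card / base sheet, are read as two joints binding the card to the sheet. The fragment record layout and the time threshold are assumptions made for the example.

# Sketch of "scanning": recover joints from three stroke fragments.

fragments = [  # (t_start, surface, start_xy, end_xy)
    (10.0, "sheet", (0, 0),  (20, 0)),
    (10.1, "card7", (0, 0),  (30, 0)),
    (10.2, "sheet", (55, 0), (70, 0)),
]

def detect_joints(frags, max_gap=0.5):
    a, b, c = sorted(frags)
    simultaneous = c[0] - a[0] < max_gap
    if simultaneous and a[1] == c[1] != b[1]:
        # Joint 1: sheet fragment a ends where card fragment b starts;
        # Joint 2: card fragment b ends where sheet fragment c starts.
        return [(a[3], b[2]), (b[3], c[2])]
    return []

print(detect_joints(fragments))

With two joints, each pairing a sheet coordinate with a card coordinate, the card's position and orientation on the sheet can then be solved geometrically.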
By extending this technique, we can recognize the orientation and overlapping state of paper cards when two joints are extracted by scanning (Fig. 2).
Fig. 2. Recognition of orientation and overlapping states
To eliminate unnecessary pen drawings on the paper during scanning, we can simply place a semi-transparent plastic sheet over it while scanning (Fig. 3).
Fig. 3. Using a transparent plastic sheet to eliminate drawings
Fig. 4. Grouping (left) and Ungrouping (right) gestures
A similar method was proposed in [3] in research on digitized experiment notebooks; however, to edit the card structure, we introduce extra pen gestures called grouping and ungrouping. Grouping is performed by a continuous round stroke from the top-level card over the child cards (Fig. 4, left). Ungrouping is performed by a continuous single stroke from the top-level card over the child cards (Fig. 4, right). These grouping and ungrouping operations can be used in common card-based creative activities, especially for making figures in the KJ method [4, 5]. A sketch of how such gestures can be classified is given below. Of course, a digital camera can store the state of the paper sheet in detail, but reusability of the card content is crucial for creative tasks, and the data should be provided in atomic form to enhance the process. In particular, the authentic KJ method involves repetitive refinement and deepening procedures, which benefit from such data.
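The following is a minimal sketch of such gesture classification: a stroke is summarized by the cards it crosses, and a stroke that returns near its starting point is taken as the closed "grouping" stroke, while an open stroke over several cards is taken as "ungrouping". This closed-loop criterion is a simplification assumed for the example, not the system's actual recognizer.

# Sketch of pen-gesture classification for grouping/ungrouping.
import math

def classify(points, crossed_cards, close_dist=15):
    dx = points[-1][0] - points[0][0]
    dy = points[-1][1] - points[0][1]
    closed = math.hypot(dx, dy) < close_dist   # stroke returns to start?
    if len(crossed_cards) >= 2:
        parent, children = crossed_cards[0], crossed_cards[1:]
        kind = "group" if closed else "ungroup"
        return kind, parent, children
    return "ink", None, None                    # ordinary handwriting

loop = [(0, 0), (60, 10), (50, 60), (3, 4)]     # ends near its start
print(classify(loop, ["card1", "card2", "card3"]))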
3 GKJ System We developed a system named GKJ (Group KJ) that handles scanned handwritten drawings and gestures captured by multiple Anoto pens. The GKJ system consists of (1) Anoto pens, (2) an L-Box Digital Pen Gateway System (DPGW), and (3) a GKJ editor. A system overview is shown in Fig. 5. The L-Box DPGW collects pen data from multiple pens simultaneously, via a Bluetooth connection, and sends it to a MySQL table in a PC. The GKJ editor checks the updated data and handles the data to construct a digital representation of the current paperwork status. For further editing
Fig. 5. GKJ system overview and data flow
Augmented Collaborative Card-Based Creative Activity with Digital Pens
647
tasks, the GKJ editor provides functions for organizing the virtual cards, using a mouse and a keyboard as alternative input devices.
4 Usage Scenario Typically the group KJ method session consists of two stages—card gathering and card unfolding. In the gathering stage, the participants discuss and collect cards with similar meanings or arguments. After that, they add extra cards to those gathered, and write an abstract of the cards on the added cards. Then they clip these cards together with paper clips, considering them as a single card. This is repeated until there are less than 6~9 cards in the stack. In the GKJ editor, the folding operation can be performed by the grouping gesture, and the folded cards are shown in the left of Fig. 6. In the authentic KJ method, the participants are basically prohibited from referring to the child cards during this stage. Then they proceed to the unfolding stage. Usually the participants extract the cards on the base sheet, but this requires special care to not destroy the constructed structure of piled cards. Also, in the real world, it is difficult to re-organize the unfolded cards because the amount of area necessary for the cards depends on the number of cards and the layout. A high number of cards prevent a trial and error approach. Therefore we recommend that the participants use virtual cards to estimate a preliminary layout. The GKJ editor provides a function for unfolding virtual cards and pre-organizes the extracted virtual cards by dragging the top-level cards. Figure 6 right shows the unfolded virtual card view. Using this function, the participants can effectively layout the cards by considering the relationship between the cards. After the two stages, the participants obtain a figure (Fig. 7) which represents their issues and viewpoint as an outcome. The participants can review the process by replaying the operations with the GKJ editor. Also they can export the process data or print figures in a PDF format. Incidentally, the curved line in Fig. 7 was constructed by a Bezier curve, whose control points were generated by the convex hull algorithm. The curved lines are automatically recalculated by moving the virtual cards. As described above, the proposed GKJ system allows freely choosing a proper environment (real or digitized cards) for their task such as review by the digitized log
Fig. 6–7. Digitized views of grouping — (left) pile and (right) unfolded view
648
M. Miura, T. Sugihara, and S. Kunifuji
rollback, and distribution of data. The high portability of the GKJ system makes it useful in a variety of environments. Group sessions with the GKJ system can be held with (1) a base paper sheet, (2) paper cards, (3) digital pens, (4) L-Box, a small Linux box, and (5) a PC.
Fig. 8. Final KJ method figure in the GKJ system
4.1 Practice We operated the GKJ system for small courses of collaborative card-base activity (Fig. 9 shows a class at our institute and Fig 10 shows a lecture course with city hall staff). We used pre-printed cards as material, and the participants positioned the cards spatially to represent their thoughts and considerations. In this case, the system may not enhance the ongoing work, but the participants enjoyed scanning and checking the digitized data. The instructor could conduct the course in the same manner as the conventional courses that use the cards (Fig. 9 left). Since the scanning is intuitive, the instructor could easily capture the card locations. The captured data was utilized to generate PDF files, which represent their layout after the course. The precise transition log of the cards was helpful for retrospection. In the lecture course at a city hall (Fig. 10), the participants first wrote their thoughts and work related problems on plain paper cards with pens. Then they classified their cards by hand and discussed the issues. After organizing the cards, by putting them on the base sheet, we scanned the card positions.
Augmented Collaborative Card-Based Creative Activity with Digital Pens
649
Fig. 9. Lecture courses with collaborative card-base activity at the university
Fig. 10. Lecture course with city hall staffs
We obtained the following findings from observations of the sessions and comments from participants. (Advantages) 1. Card writing with a pen was straightforward and intuitive for participants. 2. The participants could naturally organize their cards because they could see other people’s behavior. 3. The quick distribution of digitized logs and figures (PDF) is effective for reviewing the session and discussion. Incidentally, the instructor had been providing digitized logs and session videos for participants even for the conventional lectures, but it took a few days to digitize the outcomes. 4. The instructor and some participants could master the scanning and enjoyed the operation. 5. The position of the scanned data accurately represented the real figures. (Drawbacks) 1. Sometimes the scanning failed due to errors. The most frequent mistake was a scanning section error. The GKJ system utilizes four A2 size sheets to compose an A0 sized base sheet. Since the printed dot pattern of the A2 sheet was same, the user needed to specify the section of the base sheet to the system before scanning by tapping a checkbox. Occasionally the user missed the presetting or scanned with
650
M. Miura, T. Sugihara, and S. Kunifuji
a wrong pen. To reduce this error, we now prepare four preset pens for each A2 sheet section. Even if the error occurs, the user can easily fix the misrecognition by rescanning. 2. Sometimes unnecessary scanning lines appeared on cards and base sheets. The reason was misrecognition during scanning. The misrecognition was caused by weak pen pressure, high scanning speed, and the lack of a gap between cards (less than 1 cm). To solve the issue, we added a “transparent” pen mode, which does not draw unnecessary lines while scanning. 3. Some participants wrote upside down on the cards, because it is difficult to recognize the top and bottom. This caused the card content to be shown wrong side up, and the position was scanned incorrectly. The issue could be solved by implementing a function to automatically detect when the wrong side was up by considering the handwritten note, and handling the card carefully. 4. As we described in points 1 to 3, scanning required skill and know-how. The user needed to understand the characteristics of the pen and the GKJ to operate the system adequately. However, the skill could be easily acquired with a few minutes of training, and a failed scanning could be easily recovered by rescanning. Even with its drawbacks, the GKJ system has the potential to augment conventional paper based discussion. We also found that most of the drawbacks could be solved by further system refinements.
5 Conclusion We proposed a method for capturing the location and hierarchical structure of paper cards written with Anoto-based pens. The method enables the participants to record a precise atomic transition log of the cards. We also developed a system for digitizing paper-based card organization tasks instantly, based on the proposed method. Due to simplicity of the pen-based input, the GKJ system is universal; it can be used by office workers and also the elderly and primary school children. We confirmed the effectiveness of the system with several sessions. We applied it to small group learning sessions of up to 10 persons, but this system is applicable for many participants and groups, since the system can handle up to 40+ pens at the same time. We will refine the GKJ system to improve its usability, and it should contribute to the effectiveness of group discussions that include various types of participants, such as town meetings.
Acknowledgement Digital Pen Gateway System and related technologies are from NTT Comware Tokai Corporation. Our research is partly supported by a grant-in-aid for Scientific Research (20680036, 20300046).
References 1. Klemmer, S.R., Newman, M.W., Farrell, R., Bilezikjian, M., Landay, J.A.: The Designers’ Outpost: A Tangible Interface for Collaborative Web Site Design. In: Proceedings of UIST 2001, pp. 1–10 (2001)
Augmented Collaborative Card-Based Creative Activity with Digital Pens
651
2. Miura, M., Kunifuji, S.: A Tabletop Interface Using Controllable Transparency Glass for Collaborative Card-based Creative Activity. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds.) KES 2008, Part II. LNCS, vol. 5178, pp. 855–862. Springer, Heidelberg (2008) 3. Ikeda, H., Furakawa, N., Konoishi, K.: iJITinLab: Information Handling Environment Enabling Integration of Paper and Electronic Documents. In: CSCW 2006 Workshop (Collaborating over Paper and Digital Documents) (2006) 4. Kawakita, J.: An Idea Development Method, Chuuko Shinsho, Chuuo Kouron-sha (1968) 5. http://www.mycoted.com/KJ-Method 6. Ohiwa, H., Takeda, N., Kawai, K., Shiomi, A.: KJ editor: a card-handling tool for creative work support. Knowledge-Based Systems 10(1), 43–50 (1997) 7. Munemori, J.: GUNGEN: Groupware for a new idea generation support system. Inf. Soft. Technology 38(3), 213–220 (1996) 8. Misue, K., Nitta, K., Sugiyama, K., Koshiba, T., Inder, R.: Enhancing D-ABDUCTOR Towards a Diagrammatic User Interface Platform. In: Proceedings of KES 1998, pp. 359–368 (1998)
Usability-Engineering-Requirements as a Basis for the Integration with Software Engineering Karsten Nebe1 and Volker Paelke2 1
University of Paderborn, C-LAB, Fürstenallee 11, 33098 Paderborn, Germany 2 Leibniz University Hannover, Appelstrasse 9a, 30167 Hannover, Germany
[email protected],
[email protected] Abstract. Usability is growing to become an integral quality aspect of software development, but it is not an exclusive attribute of the generated product; it is also a fundamental attribute for the development process itself. The question is how to adapt software engineering processes (or models) in such a way that they can ensure the development of usable solutions. In this paper, the authors present an integration approach pursuing this goal. It draws on so called ‘Compliancy and Key Requirements’ that can be used for the definition of software processes (or process models) and thereby support the integration of both disciplines. The requirements are based upon representative standards (DIN ISO 13407 and ISO/PAS 18152) but were enhanced by the results of an expert based survey using interviews and questionnaires. Additionally the requirements have been verified by experts and represent an evaluated knowledge base for the development of usable products. Keywords: Integration, Software Engineering, Usability Engineering, Standards DIN EN ISO 13407 and ISO/PAS 18152, Process Models, Process Definition, Process Improvement, Assessment.
1 Introduction The integration of Software Engineering (SE) and Usability Engineering (UE) is a widespread topic in theory and practice as the need for usability grew to prominence in software development within the last years. There are far-reaching benefits both for the users, which include a high productivity, increased quality of work and usersatisfaction, as well as for the organizations, e.g. in monetary form, such as the reduction of support- and training-costs [10]. Thus, the usability of the solution became an integral quality aspect in software development, especially in comparison with competing products (respectively competing manufacturers) and can result in a unique selling proposition. But quality is not an exclusive attribute of the generated product; it is also a fundamental attribute of the manufacturing process itself. An optimal process would be designed in such a way as to assure the desired quality of the produced solution. At this point UE methods are used to ensure utilizable solutions. However, in industrial practice UE can only exist in addition to or in combination with SE. Hence, there is a need to integrate the two disciplines of SE and UE. The goal is to combine the procedures and J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 652–659, 2009. © Springer-Verlag Berlin Heidelberg 2009
Usability-Engineering-Requirements as a Basis for the Integration with SE
653
the goals of SE and UE in a way that allows systematic and predictable implementations to be generated while considering the factors of costs, time and quality adequately for both SE and UE proposes. In this paper, the authors present an integration approach pursuing this goal.
2 Integration Approaches In theory and praxis, a considerable number of integration approaches with distinct focuses exist [18]. Some of these approaches tend to define common activities and artifacts for both SE and UE and to integrate these specific activities into the process of development. They aim at a ’soft integration‘ of UE aspects on a mutual basis, e.g. at interlinking relative results (e.g. [17, 5, 2]). Most approaches focus on a minimal organizational and structural transformation and/or change. Quite similar are approaches that aim at a common specification of activities and artifacts. They are grounded on communication and information exchange by using shared definitions (e.g. [1, 21, 20]). These two kinds of approaches could be summarized as a group of approaches that aim directly at the operational development processes in organizations. Other integration approaches relate to the level of process definitions and process models (e.g. [6, 11,3]). These aim to define pre-settings for the development and contain both a more concrete approach (focusing on the integration of UE activities in an already existing SE-Models), and more fundamental aspects of process models (independently of any concrete SE-Model). In general these approaches concentrate on the combination of phases, activities and results (within existing structures) on the level of process models to build up the base for integration. In addition there is another third group of integration approaches focusing on a higher level ob abstraction. Those approaches are independent from any specific process model or activities but rather describe organizational measures, principles, paradigms or meta-models (e.g. [16, 7, 5, 19]). Those approaches aim at the definition of general procedures for the development, which is comparable to standards in SE and UE on this level of abstraction. Accordingly, strategies for the implementation are abstract and need to be adapted to particular situations. Altogether these groups of approaches aim to provide systematic procedures for developing usable software. At a closer look, they address three different levels of abstraction: 1. The abstract overarching level of standards in software engineering and usability engineering, serving as a framework to ensure consistency, compatibility, exchangeability, and quality within and beyond the organizational borders and to cover the improvement of quality and communication. 2. The level of process models for software engineering and usability engineering, to provide a procedural model that can serve as a framework for an organization, providing specific features, e.g. predictability, risk management, coverage of complexity, generation of fast deliverables and outcomes, etc. 3. The operational process level which reflects the execution of activities and the processing of information within the organization. It is an instance of the underlying model and the implementation of activities and information processing within the organization.
654
K. Nebe and V. Paelke
These are related in a hierarchy: standards define the overarching framework, process models describe systematic and traceable approaches within such a framework, and at the operational level the models are tailored to fit the specifics of an organization. 2.1 Integration on the Level of Standards, Process Models and Operational Process It can be observed that this hierarchy of standards, process models and processes exists in both disciplines, but there have been few attempts to exploit these similarities for integration. With this goal in mind, the authors analyzed these three levels and presented a holistic approach for integration of SE and UE [12, 13, 14]. By doing this, the authors identified similarities between SE and UE on the level of standards. The standards’ detailed descriptions of processes, activities and tasks, output artifacts, etc. were analyzed and compared. For this, the SE standard ISO/IEC 12207 [8] was chosen for comparison with the UE standard DIN EN ISO 13407 [4]. On a high level, when examining the descriptions of each activity, by relating tasks and outputs with each other, similarities could be identified in terms of the characteristics, objectives and proceedings of activities. Based on these similarities single activities were consolidated as groups of activities (so called, ‘common activities’). These ‘common activities’ are part of both disciplines SE and UE on the highest level of standards. The result is a compilation of five ‘common activities’: Requirement Analysis, Software Specification, Software Design and Implementation, Software Validation, Evaluation that represent the process of development from both, a SE and a UE point of view [12, 13]. These activities define the overarching framework for the next level, the ‘level of process models’. In a following analysis the authors identified the maturity of software engineering process models’ ability to create usable products [12, 14]. For that purpose, the authors used a two-step approach to synthesize the demands of usability engineering and performed an assessment of selected software engineering models. To obtain detailed knowledge about usability engineering activities, methods, deliverables and their regarding quality aspects, the authors analyzed the two usability engineering standards DIN EN ISO 13407 and the ISO/PAS 18152 [9]. The ISO/PAS 18152 defines detailed base practices that specify the tasks for creating usable products. These base practices were used as a foundation to derive requirements that represent the ‘common activities’ from a usability engineering perspective. The quantity of fulfilled requirements for each activity of the framework informs about the level of compliance of the software engineering model. It provides an estimate of how well the UE base practices are covered in a given SE model. The results of the assessment provide an overview about the degree of compliance of the selected models with usability engineering demands. It turned out that there is a relatively small compliance to the usability engineering activities across all selected software engineering models. This is an indicator that only little integration between usability engineering and software engineering currently exists on the level of process models. The analysis did not only highlight weaknesses of SE Models, it also pinpointed the potential for integration between software engineering and usability engineering:
Usability-Engineering-Requirements as a Basis for the Integration with SE
655
Where base practices are not considered as fulfilled, recommendations could be derived, which would contribute to an accomplishment. The underlying base practices provide indices what needs to be considered on the level of process models. This can be used a foundation for implementing the operational process level. However, during the analysis it became apparent that there is a clear need for more detailed/adequate criteria for the assessment by which more objective and reliable statements about process models and their ability to create usable software could be made. Such detailed criteria would also be useful to formalize process-requirements that can influence the definition of SE-Models and development processes that are usercentered and by improve the interplay of SE and UE in practice. Having this in mind, the authors performed semi-structured interviews with experts from the domain of UE to identify requirements from the UE perspective. The results have been analyzed and evaluated as described in the following paragraph.
3 UE-Process-Requirements In order to make software development processes user-centered there is a need for explicit knowledge about relevant activities, their dependencies, regarding results, roles, and quality aspects, etc. One goal is to develop such a knowledge-base using existing findings and to enrich them by expert’s knowledge. Therefore the authors created an interview-guideline and questionnaires that correspond to the overall process framework of common activities particularly with regards to the usability engineering perspective. The analysis is based on the four humancentered design activities of the DIN EN ISO 13407 (‘context of use’, ‘user requirements’, ‘produce design solutions’ and ‘evaluation of use’) and their respective base practices and specifics as defined in the ISO/PAS 18152 (i.e. fundamental activities, basic conditions and constraints, relevance of activities, resulting outcomes, type of documentation, and respective roles and responsibilities). The goal was not to evaluate these standards but to add details for further use. A substantial part of the analysis referred explicitly to quality characteristics of the four human-centered design activities. The goal was to identify what constitutes the quality of a certain activity from the experts’ point of view and what kind of (potentially measurable) success and quality criteria exist that are relevant on a process level and subsequently for the implementation in practice. Examples of the questionnaire are: How to identify good activities? How to identify good results or deliverables? How to identify appropriate roles? What are properties/characteristics for the relevance and frequency? How could the progress of an activity or deliverable be measured and controlled? Based on the results the authors identified activities, deliverables and roles that are necessary to ensure the development of usable products from the experts’ point of view. Relevant factors of influence could be for instance: „When will an activity A not be performed, and why?” or “Under which circumstances will an activity A be performed completely, when just partly?” Additionally, criteria that allow measuring the progress of the development process.
656
K. Nebe and V. Paelke
It was expected that the results could be used not just as more detailed criteria for an assessment but would also provide an indication of the level of completeness of the ISO/PAS 18152 and identify potential areas of improvement. To achieve this, the authors performed semi-structured interviews and questionnaires with six experts in the field of UE [15]. The experts were well grounded in theoretical terms, i.e. standards and process models, as well as in usability practice. 3.1 Derivation of Requirements As a result, about 470 statements from the experts have been gathered which then have been consolidated and classified by adding references to its source (i.e. the interview partner and the question out of the interview-guideline); to one of the four activities (‘context of use’, ‘user requirements’, ‘produce design solutions’ or ‘evaluation of use’); whether it addresses quality aspects regarding the process, an activity, or deliverable; whether it complies to the activities’ and base practices’ goals (as defined in the two ISO standards), etc. Thus, overarching process- and quality characteristics could be identified that led to findings about the relevance, the applicability and need of usability activities, methods and artifacts to be implemented in SE. By performing several iterations of analysis similar statements were merged and formalized in terms of 107 ‘requirements for development processes or process models. There are two distinct types of requirements: ‘Compliancy and Key Requirements’. Compliancy requirements represent the goals and base practices defined in the standards DIN EN ISO 13407 and ISO/PAS 18152 but refine them with the output of the analysis. The key requirements define core characteristics of the overall frameworks usability activities focusing on the activities’ and results’ quality. Together, the requirements define the demands of UE and lead to the systematically creation of usable products. Examples of the resulting requirements are: • Context-analysis is an integral part of the process. • Analysis takes part early in the process before conceptual work is carried out. • Analysis activities are preformed iteratively until all incompletions and inconsistencies are eliminated. • Resources and time for the elicitation and evaluation of user requirements is sufficiently provided. • User requirements are addressed in the system design. • User requirements are the input for the next process step and accordingly positioned in the development process. • The requirements of the users of the system are defined. 3.2 Evaluation of Requirements In a subsequent analysis, both the compliancy and key requirements have been evaluated by 13 usability experts using questionnaires (three of these experts were also involved in the previous analysis). The questionnaire included a list of all 107 requirements grouped by the four activities (‘context of use’, ‘user requirements’, ‘produce design solutions’ or ‘evaluation of use’) and scales to rate the correctness and the relevance for the appliance in practice of each requirement. Some examples of Requirements are shown in Table 1.
Usability-Engineering-Requirements as a Basis for the Integration with SE
657
Table 1. Examples of the requirements for the UE-activites ‘context of use’ (CoU), ‘user requirements’ (UR), ‘produce design solutions’ (PDS) and ‘evaluation of use’ (EoU) and the experts’ rating in terms of correctness and relevance (in practice) Nr 2 17
Activity CoU CoU
27
CoU
24
CoU
33
CoU
46
UR
71
PDS
105
EoU
Requirement Context-analysis is an integral part of the process. The outcomes of the context analysis serve as the input for the next process step and the activity itself is anchored within the process model accordingly. The characteristics of the intended users and their tasks, including user interaction with other users and other systems, are documented. The analysis is focused on the original context of the users (their goals, tasks, characteristics of the tasks and the environment, etc.). The analysis is independent of any existing solution/implementation. The context-information is based on facts and not an interpretation of any situation. A sufficient amount of user requirements are the basis for the next process step (PDS). The development of solutions is carried out in collaboration with the development team. It is checked that the system is ready for evaluation.
Correctness Correct Correct
Relevance Very high High
Correct
Very High
Correct
High
Sufficient
Medium
Correct
Very High
Correct
Very High
Sufficient
Medium
By looking at the overall results it turned out that most requirements are rated correct by the majority of experts: 31 requirements by all 13 participants; 29 requirements by 12 experts; 27 requirements by 11; and 6 requirements by at least 10 Experts. No requirement has been rated incorrect. All together there is a high compliance of the experts opinions to the requirements. The sum of requirements that has been rated correct by at least 10 experts is 93 – which represent 87% of all 107 requirements. The rating of the relevance was used to derive recommendations about the priority for the appliance in practice (i.e. for the definition of processes). 1. Those requirements that have been rated as ‘correct’ and range from a ‘very high’ to a ‘high’ scale of relevance. (in general: the higher the relevance the higher the priority). 2. Those requirements that have been rated as ‘correct’ and show ‘medium’ scale of correctness. 3. Those requirements that depict a ‘sufficient’ scale of correctness. 4. Those requirements that show an ‘acceptable’ scale of correctness. 5. All remaining requirements. But, by applying the requirements in practice, it is important to consider requirements of all four activities in equal measure. A partially implementation of selective requirements will not lead to usable products. Only using them in a holistic way will support the systematic development of usable solutions. As a result of the analysis and evaluation the compliancy and key requirements represent an evaluated knowledge basis for the development of usable products. The analysis based on representative standards of UE and the requirements add here to more specific criteria based on experts’ knowledge. The requirements account for the
658
K. Nebe and V. Paelke
integration of SE and UE as they can be used for the definition and adaption of SE process models as well as operational development processes.
4 Conclusions and Outlook In summary, there exist many integration approaches that aim to provide systematic procedures for developing usable software. At a closer look, they address three different levels of abstraction: standards, process models and operational processes. However, there have been few attempts to exploit the integration in a holistic way including all three levels. The authors report about such an approach and present a systematic way of integrating usability engineering demands into the software engineering methodology. The results of an expert based analysis (and subsequent evaluation) have been used to derive two distinct types of requirements: ‘Compliancy and Key Requirements’. Compliancy requirements represent the goals and base practices defined in the standards DIN EN ISO 13407 and ISO/PAS 18152 but those are refined by the output of the analysis. The key requirements define core characteristics of the overall frameworks usability activities focusing on the activities’ and results’ quality and are also based of the analysis’ results. The requirements represent an evaluated knowledge basis for the development of usable products. They add to an integration of software engineering and usability engineering as they can be used for the definition and adaption of software development processes and process models, too. In future we aim to evaluate these requirements in practical projects to observe process changes and their resulting effects to the usability of the products.
References 1. Constantine, L.L., Lockwood, L.A.D.: Software for Use: A Practical Guide to the Models and Methods of Usage-Centered Design. Addison-Wesley (ACM Press), New York (1999) 2. Constantine, L.L., Biddle, R., Noble, J.: Usage-centered design and software engineering. Models for integration. In: IFIP Working Group 2.7/13.4, ICSE 2003 Workshop on Bridging the Gap Between Software Engineering and Human-Computer Interaction, Portland (2003) 3. Düchting, M., Zimmermann, D., Nebe, K.: Incorporating User Centered Requirement Engineering into Agile Software Development. In: Jacko, J.A. (ed.) HCI 2007. LNCS, vol. 4550, pp. 58–67. Springer, Heidelberg (2007) 4. DIN EN ISO 13407: Human-centered design processes for interactive systems, CEN European Committee for Standardization, Brussels (1999) 5. Ferre, X.: Integration of Usability Techniques into Software Development Process. Bridging The Gaps Between Software Engineering and Human-Computer Interaction. In: Proceedings of ICSE 2003 International Conference on Software Engineering, pp. 28–35. ACM Press, Portland (2003) 6. Göransson, B., Lif, M., Gulliksen, J.: Usability Design-Extending Rational Unified Process with a New Discipline. In: Jorge, J.A., Jardim Nunes, N., Falcão e Cunha, J. (eds.) DSV-IS 2003. LNCS, vol. 2844, pp. 316–330. Springer, Heidelberg (2003)
Usability-Engineering-Requirements as a Basis for the Integration with SE
659
7. Granollers, T., Lorès, J., Perdrix, F.: Usability Engineering Process Model. Integration with Software Engineering. In: Proceedings of the Tenth International Conference on Human-Computer Interaction, pp. 965–969. Lawrence Erlbaum Associates, New Jersey (2002) 8. ISO/IEC 12207: Information technology - Software life cycle processes, 2nd edn., 200802-01. ISO/IEC, Genf (2008) 9. ISO/PAS 18152: Ergonomics of human-system interaction - Specification for the process assessment of human-system issues. ISO, Genf (2003) 10. Jokela, T.: An Assessment Approach for User-Centred Design Processes. In: Proceedings of EuroSPI 2001, Limerick Institute of Technology Press, Limerick (2001) 11. Kolski, C.: A call for answers around the proposition of an HCI-enriched model. ACM SIGSOFT Software Engineering Notes 23(3), 93–96 (1998) 12. Nebe, K., Zimmermann, D.: Suitability of Software Engineering Models for the Production of Usable Software. In: Proceedings of the Engineering Interactive Systems 2007. LNCS, Springer, Heidelberg (2007a) 13. Nebe, K., Zimmermann, D.: Aspects of Integrating User Centered Design into Software Engineering Processes. In: Jacko, J.A. (ed.) HCI 2007. LNCS, vol. 4550, pp. 194–203. Springer, Heidelberg (2007b) 14. Nebe, K., Zimmermann, D., Paelke, V.: Integrating Software Engineering and Usability Engineering. In: Pinder, S. (ed.) Advances in Human-Computer Interaction, ch. 20, pp. 331–350. I-Tech Education and Publishing, Wien (2008b) 15. Nebe, K.: Integration von Usability Engineering und Software Engineering: Konformitäts und Rahmenanforderungen zur Bewertung und Definition von Softwareentwicklungsprozessen, Doctoral Thesis, Shaker Verlag, Aachen (in print, 2009) 16. Pawar, S.A.: A Common Software Development Framework For Coordinating Usability Engineering and Software Engineering Activities. Master Thesis, Blacksburg, Virginia (2004) 17. Schaffer, E.: Institutionalization of usability: a step-by-step guide. Addison-Wesley, Pearson Education, Inc., Boston (2004) 18. Seffah, A., Desmarias, M.C., Metzker, E.: Human Centered Software Engineering, HCI, Usability and Software Engineering Integration: Present and Future. In: Seffah, A., Gulliksen, J., Desmarais, M.C. (eds.) Human-Centred Software Engineering – Integrating Usability in the Development Lifecycle, vol. 8, Springer, Heidelberg (2005) 19. Sousa, K., Furtado, E., Mendoca, H.: UPi: a software development process aiming at usability, productivity and integration. In: Proceedings of the 2005 Latin American conference on Human-computer interaction CLIHC 2005, ACM Press, New York (2005) 20. Tidwell, J.: Designing Interfaces: Patterns for Effective Interface Design. O’Reilly, Sebastopol (2005) 21. Van Harmelen, M.: Object Modeling and User Interface Design: Designing Interactive Systems. Addison-Wesley, Boston (2001)
Design Creation Based on KANSEI in Toshiba Yosoko Nishizawa and Kanya Hiroi Toshiba corporation, Design Center, 1-1-1 shibaura, minato-ku, Tokyo, Japan {yosoko takano,kanya kiroi}@toshiba.co.jp
Abstract. In endeavoring to increase the quality of design, Toshiba has outlined a concept of “perceived quality,” and evaluates designs on the basis of achieving a higher level of perceived quality. We defined six indices from the result of the image research into the design by the user. These six indicators of perceived quality were used in the creation and evaluation of designs, and a number of products were put on the market and evaluated. Keywords: KANSEI, design, product, quality of design, Evaluation of design.
1 Introduction Toshiba’s Design Division was established in 1953, and now boasts a 55-year history. Based on a concept of “Eyes that see into the future, hands that know the joy of creation, and hearts that care for humanity and the environment,” and with “Smart & Unique” as its development slogan, the Design Division currently supports product development for Toshiba as a whole. During the past half-century, the genres of products in development have increased dramatically, and the scope of design activities has also greatly expanded. In addition, with the growth of new markets and the globalization of production bases, it has become essential to develop products of powerful originality that present new value. Company design departments (the in-house design process) must work to respond to these changes. In a market that is saturated with products and services and with minimal differences between them, new value is required in order to enable differentiation of products on a level beyond function, color, or form. The values required by users are changing more and more based on experience of using products and the exchange of impressions and feelings in interpersonal communication. In particular, the more that functions are improved, the more that users rely on an appeal that sways their emotions as a factor in deciding on the purchase of a product. This means is that it is essential to incorporate new value that transcends function. J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 660–666, 2009. © Springer-Verlag Berlin Heidelberg 2009
Design Creation Based on KANSEI in Toshiba
661
As the source of added value, having shifted from quantity to quality, shifts from quality to more nebulous factors, we are driven by the necessity to create new value for customers. As one initiative in this direction, Toshiba is engaging in product development with “perceived quality” positioned as an added value to its products. As of the present, we have attempted to define the nature of perceived value, and we have evolved the terms “perceived quality” and “appealing quality.” Unfortunately, the question of how to incorporate these concepts in product development is an ongoing process of trial and error on the ground. In this presentation, we will offer examples to illustrate Toshiba’s concept of “perceived quality,” and will also discuss the methods by which the concept was derived.
2 The Concept of Perceived Quality 2.1 Derivation of Perceived Quality “Perceived quality” is quality that can be expressed in terms of an individual’s feelings and the images to which they respond. It is a quality that can be expressed in terms of the subjective requirements of the individual. For an automobile, examples of these subjective requirements would be “Does it feel good to drive?,” “Is it stylish?,” and the like. Contrasting with this, there are other aspects of quality which can be expressed as objective, physical characteristics[1]. For a car again, examples of these would be high horsepower, good fuel efficiency, and the like. We can say that “perceived quality” resides in design features that appeal to the emotions, and is something that the customer judges subjectively. What, then, is a design that appeals to the emotions, and what is a product in which this quality resides to a high degree? First, we studied what types of products appealed to the emotions and how customers evaluated these products. The results of these studies are shown in Figure 1. The numbers in the figure represent products. The evaluated product is a product that shines to the design prize or it is a commodity that designed evaluation is high for the user. These results, in addition to a series of interviews, showed that a design with a high level of perceived quality is one that is beautiful, easy to use, and that offers feelings of security and pleasure. Further, the systemic structural relationships shown in Figure 2 also exist. This shows that design expression can be used to increase quality, enabling the creation of a product that creates a strong impression on customers. We defined two types of perceived quality: Basic perceived quality and a perceived quality that goes beyond the basic to affect the emotions (appealing quality) (Figure 2). As a prerequisite for the creation of a design that affects the emotions (i.e., that possesses appealing quality), we established that first the design must produce feelings of pleasure (i.e., must possess basic perceived quality). Then, we searched for the factor that became an index from the result of the above-mentioned user survey.
662
Y. Nishizawa and K. Hiroi
Fig. 1. T This is a positioning map. We mapped the results of a correspondence analysis of data from a Web-based questionnaire on design images.
Fig. 2. This figure shows the elements making up perceived quality as defined by Toshiba. A design expression that offers simplicity and ease of use produces feelings of pleasure in users. This can be understood as basic perceived quality, but by itself this is not enough. We must also consider appealing quality, which transcends feelings of pleasure to affect the emotions.
The results of the user survey showed that this basic perceived quality was made up of six elements. Indicators we defined for these six elements are `Aesthetic quality`, `Quality with feeling of warmth`, `Quality in use`, `Universal quality`, `Quality that transmits the “message” of the product`, and `Original quality` (Figure 2).
Design Creation Based on KANSEI in Toshiba
663
Fig. 3. These are the six elements of perceived quality. These were derived by the factorial analysis.
2.2 Evaluating Perceived Quality Using the six indicators to evaluate a group of popular products with excellent design features that were available on the market showed us that it was indeed possible to evaluate these products to some extent. The results of this survey showed two patterns for products of high perceived quality: Either the product was evaluated for all six indicators to some degree (Group B), or it was evaluated extremely highly for only some of the indicators (Group A). Figure 4 shows products that were selected as displaying a high level of perceived quality. Evaluated on the basis of the six indicators of perceived quality, the products A in Group A received relatively high evaluations for `Quality with feeling of warmth` and `Aesthetic quality`, but low evaluations for `Quality in use` and `Universal quality`. The products in Group B, by contrast, received balanced evaluations across the entire spectrum of indicators, despite receiving fairly low evaluations for `Quality with feeling of warmth`. Product A was liked for users more than products B. As this shows, rating highly in all six indicators of perceived quality does not mean that a product will be evaluated as displaying a high level of perceived quality. Rather, we can say that a product that receives an extremely high evaluation on one axis is more likely to be selected as a product of high perceived quality. Given this, perceived quality can be considered as something which strongly displays a specific tendency rather than something that is balanced overall.
664
Y. Nishizawa and K. Hiroi
Fig. 4. This is an example of the evaluation of award-winning designs using the six indicators of perceived quality that we defined. These are the results of a questionnaire given to average users. Both A and B was evaluated highly for only some indicators.
3 Creating Products of High Perceived Quality at Toshiba Based on the results discussed above, we set the six indicators of perceived quality as shared guidelines for designers in creating perceived quality. And we attempted to create products at Toshiba displaying a high level of perceived quality. This enabled us to develop a variety of products of high perceived quality. Examples include a range of IH cookers, a cellular phone (KOTO), and high-quality home electronic products (washing machines, ovens, and vacuum cleaners). The IH cookers and the KOTO cellular phone (Figure 5) incorporate Japanese-style design, and both received extremely high evaluations for some indicators of perceived quality. The form of the Japanese musical instrument, the koto, was used as a design element in the KOTO cellular phones (see Figure 5B), and the projected design image saw them as being finished in vermillion. This products was made for `Aesthetic quality`, `Quality that transmits the “message” of the product` and `Original quality`.
A
B
Fig. 5. A is an IH cooker, and B is a “Koto” model cellular phone. Both products were evaluated highly for design.
Design Creation Based on KANSEI in Toshiba
665
On the other hand, the IH cookers (see Figure 5A) was evaluated extremely highly for `Quality with feeling of warmth` and `Original quality`. This IH cookers feature an unusual combination of the forms of conventional IH cookers and metal pots, and present them as an integrated whole. Design efforts have also enabled the cookers to be presented as tableware. The original designs of the cookers have incorporated materials traditionally used in different parts of Japan – stainless steel in Tsubame city, Nambu ironware in Mizusawa city, earthenware in Yokkaichi, etc. – and they have been marketed as products in which the feelings of users can find a resonance. In addition to gold of the Japanese G Mark, this design has received Germany’s IF Award and Red Dot Award, indicating how highly-regarded it is in Europe. In order to demonstrate the appeal of Toshiba design, we developed advertisements that focused on their perceived quality (Figure 6).
Fig. 6. These are examples of advertising Toshiba that is the front side appeal for the sensibility quality
In addition, Toshiba is globally selling a lot of products and systems now. Therefore, design must be conducted on a global level. Issues for the future will include how to blend design elements having a global appeal with those whose appeal is unique to Japan, how to judge the amount of the ingredients, and how to incorporate essences in the design. To respond to these issues, we are at present engaging in further study of perceived quality and revising our six indicators, in order to enable them to function as a yardstick of perceived quality that will be globally valid.
4 Conclusion In the field of products for B to C, which now represents a mature market, it will be increasingly important in future to use design to create products with originality and
666
Y. Nishizawa and K. Hiroi
high perceived quality. Toshiba has introduced a “yardstick” of perceived quality as a guide to answering the question of how this originality is to be created. In 2006, the Ministry of Economy, Trade and Industry also launched a program for the development of products that were both original and incorporated a new Japanese style, as an initiative for the creation of perceived value [2]. And this year is positioned as a year for the creation of perceived value, and Japanese products falling within this category will be presented in exhibitions in Paris and elsewhere. On such movement, training its focus on the perceptions involved in perceived quality, Toshiba makes efforts to the development of the commodity with a sensibility that runs globally and an original sensibility of Japan.
References 1. The Japanesque Modern Committee: Towards a Japanesque Modern Style – Representing Japanese Tradition to the World, http://www.rieti.go.jp/jp/events/bbl/06041801.html 2. Policy Office for Design and Human Life System Manufacturing Industries Bureau METI 3. Kansei Initiative – Proposal of a fourth value axis, IIST WORLD FORUM (June 16, 2008), http://www.iist.or.jp/wf/magazine/0618/0618_E.html 4. Opinions presented at the symposium Kansei initiatives (Initiatives for the Creation of Perceived Value), held by the Japan Industrial Designers Association (JIDA) (June 18, 2007) 5. Hiroi, K.: About the sensibility value creation of the design: Research leader, vol. 10, pp. 43–51. Technical Information Institute Co., Ltd., Japan (2007)
High-Fidelity Prototyping of Interactive Systems Can Be Formal Too Philippe Palanque, Jean-François Ladry, David Navarre, and Eric Barboni IHCS-IRIT, Université Paul Sabatier – Toulouse 3, France {ladry,palanque,navarre,barboni}@irit.fr
Abstract. The design of safety critical systems calls for advanced software engineering models, methods and tools in order to meet the safety requirements that will avoid putting human life at stake. When the safety critical system encompasses a substantial interactive component, the same level of confidence is required towards the human-computer interface. Conventional empirical or semi-formal techniques, although very fruitful, do not provide sufficient insight on the reliability of the human-system cooperation, and offer no easy way to, for example, quantitatively compare two design options. The aim of this paper is to present a method, with supporting tools and techniques, for engineering the design and development of usable user interfaces for safety-critical applications. More precisely we present the Petshop environment which is a Petri net based tool for the design specification, prototyping and validation of interactive software. In this environment models of the interactive application can be interactively modified and executed. This is used to support prototyping phases (when the models and the interactive application evolve significantly to meet late user requirements for instance) as well as in the operation phase (after the system is deployed). The use of the description technique (the ICO formalism) supported by PetShop is presented on a multimodal ground segment application for satellite control and more precisely how prototyping can be performed at the various levels of the architecture of interactive systems. Keywords: Model-based approaches, formal description techniques, interactive prototyping, reliability, evolvability.
1 Introduction and Related Work Current research in Interactive systems promotes the development of new interaction and visualization techniques in order to increase the bandwidth between the users and the systems. Such an increase in bandwidth can have a significant impact on efficiency (for instance the number of commands triggered by the users within a given amount of time) and also on error-rate [23] and complexity. To address design issues raised by such systems, new design and development processes have to be defined and assessed. Current development processes both in the field of Human-Computer Interaction (HCI) [11] and Software Engineering (SE) [9, 4] promote iteration-centered processes but with a different perspective. In the field of HCI, the product of each iteration is J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 667–676, 2009. © Springer-Verlag Berlin Heidelberg 2009
668
P. Palanque et al.
tested with potential users of the system under development while, in SE, the product is evaluated by different stakeholders including client or customer (the one who pays for or buys the product) and more unlikely users (but user-centered approaches (such as task analysis and modelling). At design stage, HCI approaches promote iteration through the production of prototypes to be presented to and used by “real” users. While such design process is widely agreed upon, the debate is still vivid whether one should use low-fidelity [24] or high-fidelity prototyping [26, 14]. When it comes to complex applications at the interaction level [19], or at the application level [25], low fidelity approaches only address a small part of that complexity. The outcome is too informal for making it exploitable further on in the development process without losing a significant part of it. This limits the use of low-fidelity prototyping approaches to the earlier phases of the development process, where main design questions are addressed and low level ones left to later phases. The main drawback of high-fidelity prototyping lays in the fact that the iterations are more time consuming and thus prevent exploration of new ideas without jeopardizing the entire project by overrun on schedule. Another inconvenient of high-fidelity prototyping is related to the product of that phase which most of the time corresponds to program code, making its integration in the rest of the application very difficult due to lack of abstraction. In this paper, we promote the use of an executable formal approach called Interactive Cooperative Objects (ICOs) within the high-fidelity prototyping phase of interactive systems development. This formal approach solves some of the limitations of Rapid Application Development (RAD) techniques currently used for high-fidelity prototyping. Indeed, it provides abstraction through models, rapid execution through simulation and testing through generation of test cases and scenarios. In addition, when the prototyping phase is terminated, the outcome is not only a partially running prototype, but also a partial formal description of its behaviour that can then be passed on to the development team in charge of the development of the final system to be deployed. Previous work we have done in that domain was focusing on the rapid prototyping of the interactive application [17]; our current work addresses the 3 levels of interactive systems prototyping: interaction technique level (including multimodal interactions with non standard input devices as tactile screens), interactive component (including sophisticated widgets such as range sliders of semi-transparent pop-up menus) [16] and the interactive application in complex environment as cockpits (both military and civil [1]), grounds segment for satellite control rooms [20] and Air Traffic Management interactive applications. This paper focuses on the use of the ICOs formal description technique to support rapid prototyping of interaction techniques. More precisely, it presents how an interaction technique can be defined and then how it can “rapidly” evolve according to users’ feedback and users’ performance. Indeed, the tool support environment for ICOs (called PetShop) has been now extended to provide additional facilities such as model-based logging of events and state-changes to support usability evaluation activities classically imbricated with rapid prototyping. 
This paper also addresses how logging support can be used to carry out performance analysis of the interaction technique thus limiting user testing to interaction techniques that have been previously formally analysed.
High-Fidelity Prototyping of Interactive Systems Can Be Formal Too
669
This paper is organized as follows. Next section presents some related work and research questions in the field of model-based approaches for interactive systems. The ICO notation is described in section 3. Section 4 presents the CASE tool Petshop which allows editing and execution of ICO models. Section 5 presents, on two small examples, how prototyping can be managed with PetShop and ICOs. Section 6 concludes the paper.
2 Model-Based Approaches for Interactive Systems When formal methods were initially used for interactive systems [16], models were limited to the dialog part, making them less prominent for runtime use as only one part of the interactive system was taken into account. In order to address issues raised by real life application, current trend in interactive systems engineering is to develop models for all the parts of the systems. Another parallel track of research work has been targeting at modelling new interaction techniques in order to be able to deal with current practice in the field of HCI. To deal with WIMP and post-WIMP interaction techniques, several notations have been proposed from Data-flow-based notations such as Wizz’ed [7], Icon [6], Nimmit [23] or InTml [8] to event-based notations such as Marigold [23], Hynets [23] or ICO [8]. Hybrid models integrating both event-based and data-flow-based notations have also been presented in [8] and in [15]. With respect to that later work, the work presented here extends the work presented in [15] by removing the data-flow model dealing with input devices configuration and proposing a single event-based notation described in the next section. The work presented in this paper is about providing a modelling technique capable of representing the behaviour of an entire interactive application (from physical to functional interaction) using a dedicated Petri net dialect. It also targets at new interaction techniques (e.g. multimodal, direct manipulation ...) such as the ones used in the field of HCI. This paper shows how the CASE tool Petshop [1] embeds the system models (which represent an interactive system from the interaction technique through to the system functional core) using the ICO notation at runtime for: • Prototyping of models • Execution of application to check • Analysis as a way of supporting models construction by providing additional information about the properties of the models under construction.
3 The ICO Formalism

The ICO formalism is a formal description technique dedicated to the specification of interactive systems [19]. It uses concepts borrowed from the object-oriented approach (dynamic instantiation, classification, encapsulation, inheritance, client/server relationships) to describe the structural or static aspects of systems, and uses high-level Petri nets [10] to describe their dynamic, behavioural aspects.
3.1 Cooperative Objects

The ICO notation relies on Cooperative Objects (COs). A Cooperative Object states how the object reacts to external stimuli according to its inner state. The CO's behaviour, called the Object Control Structure (ObCS), is expressed in a language based on Object Petri Nets (OPN) (see Fig. 1). An ObCS can have multiple places and transitions that are linked by arcs, as in standard Petri nets. As an extension to these standard arcs, ICO provides additional input arcs: test arcs and inhibitor arcs. Each place has an initial marking (represented by one or several tokens in the place) describing the initial state of the system.
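To make the arc semantics concrete, the following minimal Java sketch (with hypothetical names, not PetShop's actual API) shows how a transition's enabling test could account for the three input arc types: regular arcs consume tokens, test arcs require tokens without consuming them, and inhibitor arcs require their absence.

```java
// Hypothetical sketch of ObCS structure: places, transitions and the
// three kinds of input arcs used by ICO models.
import java.util.ArrayList;
import java.util.List;

enum ArcKind { REGULAR, TEST, INHIBITOR }

class Place {
    final String name;
    int tokens;                      // current marking of this place
    Place(String name, int tokens) { this.name = name; this.tokens = tokens; }
}

record Arc(Place place, ArcKind kind, int weight) {}

class Transition {
    final String name;
    final List<Arc> inputs = new ArrayList<>();
    final List<Place> outputs = new ArrayList<>();
    Transition(String name) { this.name = name; }

    boolean isEnabled() {
        for (Arc a : inputs) {
            switch (a.kind()) {
                case REGULAR, TEST -> { if (a.place().tokens < a.weight()) return false; }
                case INHIBITOR     -> { if (a.place().tokens >= a.weight()) return false; }
            }
        }
        return true;
    }

    void fire() {
        if (!isEnabled()) throw new IllegalStateException(name + " is not enabled");
        // Only regular arcs consume tokens; test and inhibitor arcs do not.
        for (Arc a : inputs)
            if (a.kind() == ArcKind.REGULAR) a.place().tokens -= a.weight();
        for (Place p : outputs) p.tokens++;
    }
}
```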
Fig. 1. Metamodel of the COs exhibiting runtime features
With respect to "standard" Petri nets, the object-oriented nature of Cooperative Objects supports instantiation. Indeed, every ObCS can be instantiated, allowing multiple executions of the same class as in object-oriented programming languages. These instances can be parameterised by constructor arguments. This parameterisation is used to associate markings with the Petri net describing the behaviour of the instantiated Cooperative Object. For example, in the case of multiple-mouse interaction (e.g., in interactive cockpits such as the Airbus A380), each mouse driver is a distinct instance of an ObCS class with different class parameters (e.g., the number of the mouse), so that the behaviour model of each driver handles its own coordinates, represented in the marking of the instance. For more details about that type of modelling see [1]. Fig. 1 presents a subset of the class diagram of ICOs. As stated above, the main element used for prototyping is the fact that each class can have several instances (as shown on the right-hand side of the figure) and that instances can be played, paused or stopped.

3.2 Interactive Cooperative Objects

To deal with the specificities of interactive systems, the Cooperative Objects formalism has been extended. The resulting notation is called Interactive Cooperative Objects.
An ICO is a 6-tuple ⟨CO, Su, Wid, Event, Act, Rend⟩ where:
• CO is a Cooperative Object as described in Section 3.1,
• Su is a set of user services (a user service is a set of synchronized transitions),
• Wid is a set of interactive widgets (e.g., buttons, listboxes, ...) linked to the ICO class,
• Event is a set of user events coming from items of Wid,
• Act and Rend are the activation and rendering functions described below.
Act: an activation function defines the relationship between the events triggered by users while interacting with user interface objects (by manipulating input devices such as mouse, keyboard, voice recognition systems, ...) and the transitions of the ObCS. When an event is triggered, the related transition is fired provided it is fireable (according to the current marking of the Petri net).

Rend: a rendering function defines how state changes in the ObCS drive changes in the presentation (what the user perceives of the application). State changes are linked to tokens entering or leaving a place.
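Purely as an illustration (the identifiers below are our own assumptions, not PetShop's API), the two functions can be thought of as two mappings wired around the Place and Transition classes of the previous sketch: Act routes widget events to transitions, and Rend routes place-level token changes to rendering calls.

```java
// Hypothetical sketch of the Act and Rend functions of an ICO,
// reusing the Place and Transition classes from the sketch above.
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

class IcoBinder {
    // Act: user event name -> transition to fire (if fireable).
    private final Map<String, Transition> activation = new HashMap<>();
    // Rend: place -> rendering method invoked when its marking changes.
    private final Map<Place, Consumer<Integer>> rendering = new HashMap<>();

    void bindEvent(String event, Transition t)            { activation.put(event, t); }
    void bindRendering(Place p, Consumer<Integer> render) { rendering.put(p, render); }

    // Called by the window manager when a user event occurs.
    void onUserEvent(String event) {
        Transition t = activation.get(event);
        if (t != null && t.isEnabled()) {
            t.fire();
            // Simplification: notify every renderer; a real implementation
            // would only notify the places whose marking actually changed.
            rendering.forEach((place, render) -> render.accept(place.tokens));
        }
    }
}
```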
4 Prototyping of ICO Models Using the PetShop Tool

To support the manipulation of the ICO notation, a CASE tool called PetShop [1] has been developed. It includes a Java implementation of an object-oriented Petri net interpreter and some analysis tools for verifying properties of the models. The tool is publicly available at http://ihcs.irit.fr/petshop.

4.1 Structure

Fig. 2 represents the high-level structure of PetShop. In PetShop, it is possible to edit, execute and analyze instances of an ObCS. When the user edits an instance, PetShop first updates the ObCS (the class) and then updates all the instances of this class. On the first execution of an instance, the instantiation engine takes the ObCS and creates an instance. This instance is then executed and can be directly managed by the user of PetShop (started, paused and stopped). While the instance is running, PetShop can also analyze the model (currently limited to the calculation of place invariants and transition invariants [10]). An example of the PetShop user interface is presented in Fig. 2.
Fig. 2. High Level Structure of Petshop
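The edit–instantiate–execute cycle just described can be made concrete with a small sketch. The names below are our own illustration (not PetShop's actual API); it shows one ObCS class instantiated several times with different constructor parameters, each instance carrying its own marking and being independently played, paused or stopped, as in the multiple-mouse-driver example of Section 3.1.

```java
// Hypothetical sketch of ObCS instantiation: one class, several
// parameterised instances, each with its own lifecycle state.
import java.util.HashMap;
import java.util.Map;

class ObcsInstance {
    enum State { PLAYED, PAUSED, STOPPED }

    private final int mouseNumber;               // class (constructor) parameter
    private final Map<String, Integer> marking = new HashMap<>();
    private State state = State.STOPPED;

    ObcsInstance(int mouseNumber) {
        this.mouseNumber = mouseNumber;
        // The constructor argument parameterises the initial marking:
        // each driver instance tracks its own cursor coordinates.
        marking.put("Currentxy", 1);
    }

    void play()  { state = State.PLAYED; }
    void pause() { state = State.PAUSED; }
    void stop()  { state = State.STOPPED; }

    int mouseNumber() { return mouseNumber; }
}

class Cockpit {
    public static void main(String[] args) {
        // Two instances of the same mouse-driver class, as in an A380 cockpit.
        ObcsInstance captainMouse = new ObcsInstance(1);
        ObcsInstance firstOfficerMouse = new ObcsInstance(2);
        captainMouse.play();
        firstOfficerMouse.play();
    }
}
```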
4.2 Editing of Models

The CASE tool PetShop allows the designer:
• to graphically add Petri net items (places, transitions and the different kinds of arcs),
• to modify the initial construction parameters of the class (e.g., editing a set of variables that may have different values for each instantiation),
• to modify the initial marking of each place (corresponding to raw values or to references to the initial parameters of the class),
• to change the executable code in transitions,
• to modify the layout of the Petri net,
• to cut, copy and paste parts of the model,
• to undo/redo any change,
• to navigate through large models via a mini-map, or through a large set of models via a tree.

4.3 Execution of Models

In PetShop, a toolbar allows the user to start/stop/pause an instance of the ObCS. There are two modes of execution of instances:
• a normal execution, in which the user is a spectator and observes the execution of an instance; transitions are fired using random enabling substitutions,
• a step-by-step execution, in which the user can select a substitution to fire the transition.

At runtime, the execution of instances gives the following feedback to the user:
• the marking is shown by the number of tokens present in a place,
• the fireability of transitions is shown by colour changes: purple for fireable, gray for not fireable,
• the firing of a transition and the updating of the marking are shown by the evolution of tokens in the input and output places of the fired transition.

PetShop also provides observability and controllability services via an API for external programs (in our case the window manager of the platform handling input devices). Observability services send events to subscribers when markings change, when substitutions change and when events are raised in the code associated with transitions. Controllability services receive events from external sources and fire the related transition of a user service. All traces of execution can be logged to an external file, allowing further analyses such as the usability evaluation of interactive systems [5].
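As a sketch of how such an observability/controllability API might look (again hypothetical names, building on the IcoBinder sketch above, and not the actual PetShop interface), subscribers receive marking-change callbacks, external sources route events in, and a logger writes the trace to a file for later usability analysis.

```java
// Hypothetical sketch of PetShop-style observability/controllability
// services and trace logging.
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

interface MarkingObserver {
    void markingChanged(String place, int tokens);  // observability callback
}

class ExecutionServices {
    private final List<MarkingObserver> subscribers = new ArrayList<>();

    void subscribe(MarkingObserver o) { subscribers.add(o); }

    // Observability: called by the interpreter after each firing.
    void notifyMarkingChanged(String place, int tokens) {
        for (MarkingObserver o : subscribers) o.markingChanged(place, tokens);
    }

    // Controllability: an external source (e.g., the window manager)
    // asks for the transition bound to a user event to be fired.
    void fireUserService(IcoBinder binder, String event) {
        binder.onUserEvent(event);
    }
}

// A subscriber that logs state changes to an external file.
class TraceLogger implements MarkingObserver {
    private final FileWriter out;
    TraceLogger(String path) throws IOException { out = new FileWriter(path, true); }
    public void markingChanged(String place, int tokens) {
        try {
            out.write(System.currentTimeMillis() + ";" + place + ";" + tokens + "\n");
            out.flush();
        } catch (IOException e) { throw new RuntimeException(e); }
    }
}
```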
5 Prototyping Interactive Systems with ICOs

This section presents the prototyping capabilities of PetShop and the ICO notation. These capabilities are illustrated by two examples extracted from case studies. They
show different aspects of how prototyping can be performed at different levels of the architecture of interactive systems.

5.1 Prototyping Interaction Techniques

The example in this section presents how the ICO notation can be used to prototype low-level interaction techniques. Such prototyping is critical for increasing the usability of interactive applications, as the fine-tuning of interaction can have a huge impact on the overall performance of users [13].
Fig. 3. ICO model of a mouse driver
The model of Fig. 3 describes a transducer for handling low-level events. It models how events from the input device (in this case a pointing device such as a mouse) are received and how they are transformed according to the needs of the interactive application. Dark transitions are those that are available according to the current marking of the model. Their black border means that they are connected to events, i.e., even though they are available according to the current marking, they must additionally receive an event to actually be fired. The model can receive four different events: mouseMove, mousePressed, mouseReleased and mouseClick. The current position of the cursor of the input device is stored in the place Currentxy. When a mouseMove event is received, the transducer has to transform the received dx, dy parameters into x and y positions to reflect that change on the mouse cursor. In order to keep the cursor inside a set of predefined bounds (for instance the size of the screen or the size of a portion of a window), the transformation of the x and y values according to the dx and dy parameters has to be constrained. This is the role of the places named Bounds. From a notational point of view, these places are virtual places, i.e., virtual copies of a single place; this notational device is used to reduce the number of arcs when the same place is connected to many transitions. The code of the transitions mouseClick, mouseReleased and mousePressed contains the Trigger construct. This means that, when one of these transitions is fired, the model raises an event; other models registered with the current model are then notified of each event triggered.
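The transition code itself is not reproduced in the paper; purely as an illustration (the names below are ours, not taken from the models), the transformation performed on a mouseMove event might look as follows in Java, including the acceleration coefficient introduced with the modified model of Fig. 4 discussed below (coef = 1 reproduces the base model).

```java
// Hypothetical sketch of the mouseMove transition code: turning dx/dy
// deltas into a bounded cursor position, with an optional acceleration
// coefficient (as in the Fig. 4 variant, where coef > 1).
class MouseTransducer {
    private int x, y;                          // marking of place Currentxy
    private final int minX, minY, maxX, maxY;  // marking of places Bounds
    private final int coef;                    // marking of places Coef

    MouseTransducer(int minX, int minY, int maxX, int maxY, int coef) {
        this.minX = minX; this.minY = minY;
        this.maxX = maxX; this.maxY = maxY;
        this.coef = coef;
    }

    // Fired on each mouseMove event carrying (dx, dy).
    void mouseMove(int dx, int dy) {
        x = clamp(x + dx * coef, minX, maxX);
        y = clamp(y + dy * coef, minY, maxY);
    }

    private static int clamp(int v, int lo, int hi) {
        return Math.max(lo, Math.min(hi, v));
    }

    int[] position() { return new int[] { x, y }; }
}
```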
The model in Fig. 4 shows how the previous model can be modified following modification requests (after a usability evaluation, for instance).
Fig. 4. Modified ICO model of a mouse driver (acceleration of mouse move events)
The modification introduces a new element into the interaction technique: acceleration. Indeed, movements on the table where the mouse is located are typically much more constrained than the virtual space available to the cursor. For this reason, mouse drivers embed an acceleration mechanism that increases cursor movement according to speed. This is modelled by adding the places Coef to the model and connecting them to the transitions in charge of computing the new position of the cursor. The code of these transitions shows that the dx and dy parameters are multiplied by the coefficient (stored in the token of the place Coef).

5.2 Prototyping Applications

While the prototyping of interaction techniques is critical for the fine-tuning of interaction, prototyping is also needed at a higher level. This section presents how PetShop and ICO support prototyping at the dialogue level of interactive applications. The prototyping approach remains the same as for interaction techniques, i.e., the models describing the behaviour of the application at the dialogue level can be interactively modified and the impact of the modifications can be immediately perceived.

The application under consideration here is called MPIA. The Multi Purpose Interactive Application (MPIA) is an application available in the cockpits of several aircraft that aims at handling several flight parameters. It is made up of three pages (called WXR, GCAS and AIRCOND). The WXR page is responsible for managing weather radar information; GCAS is responsible for the Ground Anti-Collision System parameters, while AIRCOND deals with the settings of the air conditioning. Due to space constraints we do not present the interactive modifications of the models in detail, but the interested reader can find the detailed behaviour of this application (in a reconfiguration process after a hardware failure in a cockpit) in [18].
6 Conclusion

This paper has presented the ICO notation for the description of interactive systems via graphical models which can be edited and executed at runtime. The ICO notation, an extension of object Petri nets, has a dedicated CASE tool called PetShop. This runtime capability increases the possibilities of modelling by supporting prototyping, testing and verification. The paper has shown how the prototyping of interactive applications can be performed at two different levels: the interaction technique and the dialogue model. The latter was extracted from an industrial example dealing with cockpit applications in civil aircraft. We have studied the usability of ICOs and PetShop for prototyping phases in an informal way with software engineers involved in the field of Air Traffic Control applications [2]. Informally, we can report that the modification of models went well, while the creation of models and the connection of models were not performed in a satisfying way. The tool is available for testing at http://ihcs.irit.fr/petshop. Although the specific application examples considered in the paper come from the cockpit domain, the results have been applied, and are applicable, to other application areas with similar requirements, such as ground segment applications for satellite control [20].

Acknowledgements. This work is supported by the EU-funded Network of Excellence ResIST (http://www.resist-noe.eu, contract n° 026764) and the CNES-funded R&T Tortuga project (http://ihcs.irit.fr/tortuga/, contract n° R-S08/BS-0003-029). We would also like to thank the reviewers for their in-depth, thoughtful comments.
References

1. Barboni, E., Navarre, D., Palanque, P., Basnyat, S.: Addressing Issues Raised by the Exploitation of Formal Specification Techniques for Interactive Cockpit Applications. In: HCI Aero 2006, p. t.b.p., Seattle (2006)
2. Bastide, R., Navarre, D., Palanque, P.: A Tool-Supported Design Framework for Safety Critical Interactive Systems. Interacting with Computers 15(3), 309–328 (2003)
3. Bastide, R., Palanque, P., Duc, L.: Integrating Rendering Specifications into a Formalism for the Design of Interactive Systems. In: DSV-IS 1998, pp. 171–190 (1998)
4. Beck, K.: Extreme Programming Explained: Embrace Change. Addison-Wesley, US (1999)
5. Bernhaupt, R., Navarre, D., Palanque, P., Winckler, M.: Model-Based Evaluation: A New Way to Support Usability Evaluation of Multimodal Interactive Applications. In: Maturing Usability: Quality in Software, Interaction and Value. Human-Computer Interaction Series, pp. 96–119. Springer, Heidelberg (2007)
6. Dragicevic, P., Fekete, J.-D.: Input Device Selection and Interaction Configuration with ICON. In: Proceedings of IHM-HCI 2001, People and Computers XV – Interaction without Frontiers, pp. 543–448. Springer, Heidelberg (2001)
7. Esteban, O., Chatty, S., Palanque, P.: Whizz'Ed: A Visual Environment for Building Highly Interactive Interfaces. In: Proceedings of the INTERACT 1995 Conference, pp. 121–126 (1995)
8. Figueroa, P., Green, M., Hoover, J.: InTml: A Description Language for VR Applications. In: Proceedings of Web3D 2002, Arizona, USA, pp. 53–58 (2002)
9. Fowler, M., Highsmith, J.: The Agile Manifesto. Software Development (August 2001)
10. Genrich, H.J.: Predicate/Transition Nets. In: Jensen, K., Rozenberg, G. (eds.) High-Level Petri Nets: Theory and Application, pp. 3–43. Springer, Berlin (1991)
11. Gulliksen, J., Goransson, B., Boivie, I., Blomkvist, S., Persson, J., Cajander, A.: Key Principles for User-Centred Systems Design. Behaviour and Information Technology 22, 397–409 (2003)
12. Jacob, R.: A Software Model and Specification Language for Non-WIMP User Interfaces. ACM Transactions on Computer-Human Interaction 6(1), 1–46 (1999)
13. Kabbash, P., Buxton, W.A.: The "Prince" Technique: Fitts' Law and Selection Using Area Cursors. In: Proceedings of the ACM CHI Conference, pp. 273–279. ACM Press, New York (1995)
14. Lim, Y., Pangam, A., Periyasami, S., Aneja, S.: Comparative Analysis of High- and Low-Fidelity Prototypes for More Valid Usability Evaluations of Mobile Devices. In: Proc. of NordiCHI 2006, vol. 189, pp. 291–300. ACM, New York (2006)
15. Navarre, D., Palanque, P., Dragicevic, P., Bastide, R.: An Approach Integrating Two Complementary Model-Based Environments for the Construction of Multimodal Interactive Applications. Interacting with Computers 18(5), 910–941 (2006)
16. Navarre, D., Palanque, P., Bastide, R., Sy, O.: Structuring Interactive Systems Specifications for Executability and Prototypability. In: Palanque, P., Paternò, F. (eds.) DSV-IS 2000. LNCS, vol. 1946, pp. 97–120. Springer, Heidelberg (2001)
17. Navarre, D., Palanque, P., Bastide, R., Sy, O.: A Model-Based Tool for Interactive Prototyping of Highly Interactive Applications. In: 12th IEEE International Workshop on Rapid System Prototyping, Monterey, USA. IEEE, Los Alamitos (2001)
18. Navarre, D., Palanque, P., Basnyat, S.: Usability Service Continuation through Reconfiguration of Input and Output Devices in Safety Critical Interactive Systems. In: Harrison, M.D., Sujan, M.-A. (eds.) SAFECOMP 2008. LNCS, vol. 5219, pp. 373–386. Springer, Heidelberg (2008)
19. Navarre, D., Palanque, P., Bastide, R., Schyn, A., Winckler, M., Nedel, L.P., Freitas, C.M.D.S.: A Model-Based Approach for Engineering Multimodal Interactive Systems. In: Costabile, M.F., Paternò, F. (eds.) INTERACT 2005. LNCS, vol. 3585, pp. 170–183. Springer, Heidelberg (2005)
20. Palanque, P., Bernhaupt, R., Navarre, D., Ould, M., Winckler, M.: Supporting Usability Evaluation of Multimodal Man-Machine Interfaces for Space Ground Segment Applications Using Petri Net Based Formal Specification. In: Ninth International Conference on Space Operations, CD-ROM Proceedings, Rome, Italy, June 18-22 (2006)
21. Parnas, D.L.: On the Use of Transition Diagrams in the Design of a User Interface for an Interactive Computer System. In: Proceedings of the 24th ACM Conference, pp. 379–385 (1969)
22. Peterson, J.L.: Petri Net Theory and the Modeling of Systems. Prentice-Hall, Englewood Cliffs (1981)
23. Reason, J.: Human Error. Cambridge University Press, Cambridge (1990)
24. Rettig, M.: Prototyping for Tiny Fingers. Communications of the ACM 37(4), 21–27 (1994)
25. Risoldi, M., Amaral, V.: Towards a Formal, Model-Based Framework for Control Systems Interaction Prototyping. In: Rapid Integration of Software Engineering Techniques, pp. 144–159 (2007)
26. Rudd, J., Stern, K., Isensee, S.: Low vs. High-Fidelity Prototyping Debate. Interactions 3(1), 76–85 (1996)
27. Vanacken, D., De Boeck, J., Raymaekers, C., Coninx, K.: NiMMiT: A Notation for Modelling Multimodal Interaction Techniques. In: International Conference on Computer Graphics Theory and Applications, Portugal (2006)
28. Wieting, R.: Hybrid High-Level Nets. In: Proc. of the 1996 Winter Simulation Conference, pp. 848–855. ACM Press, New York (1996)
29. Willans, J.S., Harrison, M.D.: Prototyping Pre-Implementation Designs of Virtual Environment Behaviour. In: Nigay, L., Little, M.R. (eds.) EHCI 2001. LNCS, vol. 2254, pp. 91–108. Springer, Heidelberg (2001)
RUCID: Rapid Usable Consistent Interaction Design
Patterns-Based Mobile Phone UI Design Library, Process and Tool

Avinash Raj 1 and Vihari Komaragiri 2
1 Toronto, Canada, [email protected]
2 Bangalore, India, [email protected]

Abstract. This paper is based on a research effort at Kyocera Wireless, India, that aimed to overcome limitations in the mobile phone design process by giving designers an improved design and specification tool and helping them deal routinely with some of the more deeply rooted constraints of phone design. The tool extends the idea of templates from simple visual elements to more abstract design components. It adds further value to this modularization of design by taking the approach of an extensive and ever-growing library of patterns to define and refine these components. The components cover most of the low- to medium-level building blocks of design. They are specified in the library as pattern tuples of ⟨design problem, design solution, context, constraints⟩, each at a different level of the hierarchy. The components are visually represented using standardized shapes with placeholder and help text, and are made available as part of the design work surface of a visual prototyping tool such as MS Visio or Adobe Fireworks.

Keywords: Mobile phone UI design, patterns, architecture, design process, library.
1 Introduction

Modern mobile phones give their Users many features: voice and data calls, text messaging, personal information management (phonebook and calendar), WAP browsing, games, etc. All these features are packaged into a handset with a small screen and a 12/16-key special-purpose keypad. Each new release of a phone builds additional features on top of the existing ones, and there is a constant race among phone vendors to put attractive phones on the market faster and cheaper.

Most of the newer mobile phones deluging the markets have their own separate interaction paradigms. It is important to find a uniform set of interaction paradigms in order to avoid the time-consuming and expensive approach of re-writing each application for each mobile phone. Users should not be forced to undergo the grueling process of learning and unlearning these interaction styles when using the
same set of applications over multiple mobile phones of either the same vendor or, in fact, even of different vendors.

1.1 Current Mobile Phone Interaction Design Process

Though design guideline documents deliver coherence among User Interfaces, they cannot be used effectively to communicate how the different components of a design will work together and how Users will interact with them. In addition, guidelines can become obsolete or ignored very quickly in the fast-developing world of mobile phones. Furthermore, the interaction designer is limited to the task of specifying the design, which a software engineer then implements in an embedded software development environment that is notorious for its lack of sophisticated APIs for UI creation. This indirection, and the limitations inherent in the development environment, also mean that a lot of design intent and time can be lost in translation. There needs to be a way to put design implementation in the hands of interaction designers, and this needs to be done in a "backward"-compatible manner. Even when new technologies like FlashLite and uiOne become the platforms of UI development, there will still be some phones that require UI development in native code, so the solution has to support both styles of interaction design and implementation.

While phone vendors have started adopting a platform approach to software development, adding incremental features to existing code bases and builds, this has the bug/feature of perpetuating design from older phones, whether good or bad. There is also no easy way of upgrading a design element for greater usability, because it is difficult to trace a design element across the various features where it is used. A solution to this problem could be to ensure that interaction design is modular to the extent possible and utilizes design elements in a consistent, traceable manner.

1.2 Proposal to Solve Usability, Consistency and Time-to-Market Constraints

The aim of this research is to provide a tool where interaction designers can choose from a pre-packaged design element library and use the appropriate element by mapping the usability constraints and context of the pattern to the needs and context of the feature being designed. Presented in the form of a Microsoft Visio template-based prototype tool, these patterns can easily be used by designers.

After the usual steps of analyzing the design problem, identifying the User goals and then breaking them down into tasks, the paper proposes a change to the design process. Instead of trying to sketch the design from scratch from that point onwards, the designer simply uses the design tool and its template library to look for and use design modules that already exist. For the part of the design that does not yet exist, the designer then builds newer tasks and flows from existing building-block objects. The designer then adds these newer creations as potential candidates to the pattern template library, to be verified for usability and then incorporated and used by other designers in creating other features. The designer achieves speed and design consistency with this approach. The modular pattern library allows for reuse across designers and design teams, for usability refinement and design evolution, and for backward compatibility as the product evolves.
2 RUCID Basics

In this section, we present a novel formulation of a mobile phone interaction design architecture and build on this framework using the patterns-based approach inspired by Christopher Alexander [2]. Some samples of mobile phone interaction design patterns are presented here to illustrate the concept.

We draw from the work of Alan Cooper [1] to derive an interaction design architecture for mobile phones. The mobile phone has many input Triggers (typically the 12-key keypad, plus five-way navigation keys and so on). The context of use of the mobile phone is much different from that of the mouse, and hence its interaction design structure is quite different from that of a computer mouse. There is a need for a Primitive action, for example a key press, to achieve not just generic input/output or application-specific commands but to directly address User goals. We address this in our model as follows.

The User's goals in using an application can be broken down into some generic tasks common across applications. These generic tasks precede and succeed specific tasks called into existence by the needs catered to by the feature being designed. For example, "starting an application or closing it" are typically generic tasks; "playing a music track" or "composing an SMS message" are feature-specific tasks. Generic tasks are made up of Flows of input interaction in conjunction with the output – actions, symbols, graphics, and other feedback information – expressed on the screen (and speakers, vibrations, etc.) of a mobile phone. The Flows are represented as Idioms on the left-hand side of Alan Cooper's inverted pyramid, while the output is represented on its right-hand side. The output shown on screen can be further divided into information, widgets and graphics.

A Flow can be thought of as a sub-task, or a sequence of Primitive and/or Compound actions, that results in an application-specific function being executed. A Compound action in turn constitutes a sequence of Primitive User actions and phone reactions that achieve a User's sub-objective. In a typical mobile phone design, a Flow that achieves a User objective, or a Compound action that achieves a sub-objective, may simply consist of one Primitive action: for example, press-and-hold of the hash key can actually achieve the User goal of locking the phone. This User goal can also be achieved by accessing settings from the menu, choosing the keypad lock menu option and then enabling the keypad lock option.

The architecture (Fig. 1) of interaction design in mobile phones is at the heart of our pattern exploration. It anchors our search for interaction design patterns in mobile phones and also provides the means to organize, link and document them. Since this model also articulates the typical top-down process of design, it lends itself to very practical application, as evidenced by the prototype tool that we created. In the following pages we look at one sample pattern each generated at the Widget, Primitive, Compound and Flow levels. The Compound, Primitive and Flow patterns that follow are tailored to the tool rather than to the Alexandrian pattern form. Altogether, 23 Primitive patterns, 3 Compound patterns, 15 Flow patterns and 29 Widget patterns were captured during the course of this research; a possible representation of a library entry is sketched below.
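To make the library structure concrete, here is a minimal Java sketch (our own illustration; the actual library lives in annotated Visio shapes, not code) of a pattern entry as the ⟨design problem, design solution, context, constraints⟩ tuple described in the abstract, with a level tag matching the architecture of Fig. 1.

```java
// Hypothetical in-code rendering of a RUCID library entry.
import java.util.List;

enum PatternLevel { WIDGET, PRIMITIVE, COMPOUND, FLOW }

record Pattern(
        String id,                 // e.g., "p1" for "Type a Digit"
        PatternLevel level,
        String problem,            // design problem
        List<String> solutions,    // design solutions
        String context,
        List<String> constraints,
        List<String> childPatterns // lower-level patterns this one composes
) {}

class LibraryDemo {
    public static void main(String[] args) {
        Pattern typeADigit = new Pattern(
                "p1", PatternLevel.PRIMITIVE,
                "User wants to type a digit",
                List.of("Press a Number key",
                        "Press & Hold a Number key until the digit appears"),
                "Text entry on a 12-key keypad",
                List.of("Numbers-only mode", "Normal Alpha or Rapid Entry mode"),
                List.of());
        System.out.println(typeADigit.id() + ": " + typeADigit.problem());
    }
}
```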
Fig. 1. "A New Mobile UI Design Architecture Model" that details the hierarchy of design levels, starting from "User goals" all the way down to "Action-Triggers" (the figure relates User Goals, Tasks, Flows, Compound actions, Primitive actions and Triggers on the input side with Screens and their Info, Widgets and Graphics on the output side)

Table 1. Sample Widget pattern (Softkey Window)

Problem: User needs to access additional functions that can be performed on a screen.

Context: For a given Screen, a User has more possible actions than the maximum number of Softkeys.

Solution: One of the options that can be accessed through a Softkey can provide a gateway to multiple options. The User can move to these options and select the desired one.

Rationale: Only a limited number of Softkeys can be displayed at any one time. A dedicated key cannot be assigned to each option because: 1) the options and their number keep changing depending on the screen, and 2) the surface area of a mobile phone is small and limited. Using a single key to access a variable-sized list allows for any number of items to be accommodated.

Examples: (screenshots of a Slideshow screen where pressing the right Softkey, "Options", opens a window offering "View" and "Close")

Related Patterns: Softkey, List, Scroll
Table 2. Sample Primitive pattern (Type a Digit)

Problem: User wants to type a digit.

Solutions and Constraints:
[1] Press a Number key — User is in Numbers-only mode. There is a clear mapping between what the User presses on the keypad and what appears on the screen.
[2] Press & Hold a Number key until the digit appears on the screen — User is in Normal Alpha or Rapid Entry mode. Typing a digit in this mode may not be intuitive, but it is easy to learn.
[3] Speak the digit — in current phones this is only used as feedback.
[4] Press a Navigation key (up/down) to change digits — this can be used when the input allows a small numeric range (date, year, month, etc.); otherwise it is time consuming.

Examples:
• Press a Number key to type a digit in Numbers-only mode
• Press & Hold a Number key until the digit appears on the screen in Normal Alpha or Rapid Entry mode
• In Voice Dial, the digits are typed as one speaks
• In Set Alarm, press a Navigation key (up/down) to change hours and minutes

Related Patterns: c1. Type a String
Table 3. Sample Compound pattern (Type a String)

[1] Problem: User wants to type a numeric string.
Solution: Press Number keys; Navigate; Delete.
Constraint: User is in Numbers-only mode.
Child Patterns: p1. Type a Digit; p3. Delete Character to the left of cursor; p10. Navigate left/right.
Example: p1.1, p3.1, p10.1

[2], [3] Problem: User wants to type an alphabetic string.
Solution: [2] Press a key (2–9) once for the first alphabet, twice for the second alphabet, and so on; Navigate; Delete. [3] Press 0/Next key.
Constraint: [2] User is in Normal Alpha mode. [3] User is in Rapid Entry mode; Rapid Entry mode checks its dictionary of common words and guesses at the word the User is trying to spell.
Child Patterns: [2] p2. Type an Alphabet; p3. Delete Character to the left of cursor; p10. Navigate left/right. [3] p2. Type an Alphabet.
Example: [2] p2.1, p3.1, p10.1; [3] p2.2

[4] Problem: User wants to type an alphanumeric string.
Solution: Press & Release keys for alphabets and Press & Hold keys for numbers; Navigate; Delete.
Constraint: User is in Normal Alpha mode.
Child Patterns: p1. Type a Digit; p2. Type an Alphabet; p3. Delete Character to the left of cursor; p10. Navigate left/right.
Example: p1.2, p2.1, p3.1, p10.1
Table 4. Sample Flow pattern (Initiate)
Problem: User needs to start an Application.

[1] Solution: Go through the Menu hierarchy.
Constraint: This is the most simple and generic, though most time-consuming, way.
Child Patterns: p24. Access additional screen functionality/Options; c2. Scroll & Select / p7.2 Select menu items by pressing digits on Number keys.
Example: p24.2, c2/p7.2

[2] Solution: Use Shortcuts.
Constraint: User is in Standby mode. Using keypad shortcuts is faster than using menus, but it is not possible to have a shortcut for every Application.
Child Patterns: p18. Quick access (in Standby).
Example: p18.1

[3] Solution: Accept an Alert.
Constraint: When the User accepts an alert, it takes her to the respective Application. This is applicable to very few Applications and is not a direct User action.
Child Patterns: p24. Access additional screen functionality/Options.
Example: p24.2

[4] Solution: Initiate an Application through another.
Constraint: User is already in an Application. Through the Softkey Window the User initiates another Application. Such Applications generally complement each other.
Child Patterns: p24. Access additional screen functionality/Options; c2. Scroll & Select.
Example: p24.1, c2.1

[5] Solution: Type a Contact number, then choose to initiate a Call or Send a message.
Constraint: User is in Standby. Only applicable to Messaging and Calling.
Child Patterns: c1; (p12. Create outgoing call) / (p24. Access additional screen functionality/Options, c2. Scroll & Select).
Example: c1, (p12.1) / (p24.1, c2.1)
3 The Prototype

This section addresses the part of the problem statement of this research that concerns the design and development of a tool that interaction designers can use to perform rapid, usable, consistent interaction design. The design, details and use of the tool for mobile phone feature design are outlined. The tool is based on the pattern-based UI Design Library sampled in the previous section.

In order to make efficient use of the patterns library, however, a platform is needed. Visio is a diagramming program that helps to create business and technical diagrams that document and organize complex ideas, processes and systems. Visio seemed a suitable platform for using the captured patterns for various reasons:
1. Shape libraries can easily be created with Visio tools.
2. A new template can be created from a drawing file or from an existing template, and diagrams can be assembled by dragging predefined shapes.
3. Reviewers' comments and changes to shapes can be tracked using the review mode.
4. Visio can be integrated with Microsoft Office Excel, Microsoft Office Word and other formats.

Once we had identified patterns at each level (Primitive, Compound, Flow, Widget), Visio templates were created for each level of the mobile phone architecture. We built a design vocabulary. Higher-level templates, for example Screens, could be constructed from lower-level ones like Widgets. The application facilitates the effective and efficient design of Screens and task flows of newer applications by simply dragging and dropping existing UIs and Interactions. Revisions and iterations are also easier whenever a shape or a whole template is added to the tool. The tool follows the hierarchical method of design on which our research is based: an application has goals; these goals can be broken down into tasks; tasks can be broken down into UIs (Widgets, Screens) and Interactions (Flows, Compounds, Primitives). Thus a whole application can be constructed using this tool.

3.1 Sample Application (Slideshow) to Explain the Prototype

User Goal: View Camera Pictures in a Slideshow
Tasks:
1. Launching the Slideshow from the image browser
2. Pausing/Continuing the Slideshow
3. Skipping to the Next/Previous Image
4. Setting the Transition duration
5. Enabling/Disabling Looping
6. Exiting the Slideshow

We take the first task, i.e. "Launching the Slideshow from the image browser", and construct it with our prototype. This is an instance of the Initiate Flow. If it were available, we would just need to drag the shape from the Initiate template and put customized labels on the Screens and Flows. However, let us assume that the Flow is not available, and therefore construct it from scratch from the available Widgets and Actions.
Fig. 2. Starting screen (the Templates folder offers Basic Flowchart Shapes, Primitive action, Compound action, Widgets, Screens and Flows; shapes appear under each template)
Table 5. Task 1 (Launching the Slideshow from the image browser)
(Table 5 consists of seven screenshots, numbered 1–7, illustrating the steps listed below.)
Steps:
1. Drag the Image Browser shape from the Widget template.
2. Put appropriate labels on the Image Browser.
3. Drag the Access Additional Screen Functionality/Options shape from the Primitive action template.
4. Choose the most appropriate action given the Constraint.
5. Drag the Softkey Window from the Widget template; the new component gets attached to an appropriate place through the connectors.
6. Put appropriate labels on the Softkey Window.
7. Drag the Image shape from the Screen template and customize the labels and the layout.

In the same way, the rest of the tasks can be constructed. Later, this Flow can be added as a shape to the Initiate template, after it has been reviewed by the design team.
4 RUCID Testing, Analysis and Future Improvement

Peer reviews were conducted from time to time throughout the research, and feedback was fed into its next phase. Finding people to test with was not difficult, because the people who would use the tool were fellow interaction designers; it was, however, difficult to find time in their busy schedules. Some basic usability testing was conducted. The Visio prototype was very close to the actual envisioned tool. The scenario given to the Users was the same as in the prototype shown earlier in the paper, i.e. to create the Slideshow application using the tool. Since they were already acquainted with Visio, they could easily operate it. They felt that the tool gave a lot of control and freedom to the designer. They recommended adding some more Microsoft Office Visio templates, such as Basic Flowchart Shapes, which would be helpful in making flow diagrams and adding comments.

They felt that only the generic design should be laid out through the Visio tool. When the task flows with the Screens are made, they will have references to particular chapters in the generic UI specification where the guideline is laid out. This way the patterns will complement the design guidelines. However, a pattern-based approach to UI design can go beyond the Visio templates, or even a more customized specification tool. If the specification is actually done in something like FlashLite, and the patterns are codified in FlashLite, the designer can leverage many of the common elements of design; the specification output then actually becomes the UI for the phone, and the FlashLite version of the design can simply be loaded onto a phone. There is then no need for additional software development, at least on the UI side.

Individual patterns and the language as a whole can be refined by tapping into the broad expertise of the Kyocera design staff, by discussing and gathering their collective wisdom. The library of patterns can be converted into a workable set of standards by agreeing on an appropriate rating scale and by assembling a representative group of reviewers who rate the content according to the same criteria.
5 Conclusion

Ultimately, we expect that the patterns-based UI Design Library, and the process and tool it facilitates, will result in a strengthened Kyocera User Interface design and brand and a more efficient design staff. A patterns-based approach has a better chance because of its inherent modularity, which the tool is able to take advantage of. "Each pattern describes a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice" (Alexander et al., 1977, p. x). The patterns created as part of this project still have some distance to cover in terms of rigor and validation to justify Alexander's description above, but we believe they form a promising and important first step in that direction.

Future technologies like uiOne [11] and FlashLite are a fertile field for the extension of the patterns-based approach. The UI design library makes it convenient to constantly evolve the various elements of design and to seamlessly integrate innovative elements into future phone features as a normal part of the design process. In fact, it should be possible to write parsers that would read a Visio file and produce FlashLite scripts or uiOne triglets that could potentially be loaded directly onto a phone as its User Interface (a sketch of this idea follows below). This removes the indirection involved in implementing a design: the interaction designer is able to directly control the actual look and usability of a feature on a phone. That would be a very good goal to achieve.
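As a first step toward such a parser, the following Java sketch extracts pattern shape names from a Visio drawing saved in Visio's XML format (.vdx). Both the file layout and the assumption that pattern shapes carry their pattern id in the NameU attribute are our own illustrative assumptions; the mapping from pattern ids to FlashLite/uiOne output is left as a comment.

```java
// Hypothetical sketch: reading pattern shapes out of a .vdx drawing.
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import java.io.File;

public class VdxPatternReader {
    public static void main(String[] args) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new File(args[0]));   // e.g., slideshow.vdx

        NodeList shapes = doc.getElementsByTagName("Shape");
        for (int i = 0; i < shapes.getLength(); i++) {
            Element shape = (Element) shapes.item(i);
            String patternId = shape.getAttribute("NameU");
            if (!patternId.isEmpty()) {
                // A real generator would map each pattern id (e.g., "p24")
                // to a FlashLite script fragment or uiOne triglet here.
                System.out.println("found pattern shape: " + patternId);
            }
        }
    }
}
```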
References

1. Cooper, A., Reimann, R.: About Face 2.0: The Essentials of Interaction Design
2. Alexander, C.: A Pattern Language
3. Implementing a Pattern Library in the Real World: A Yahoo! Case Study
4. van Welie, M., van der Veer, G.C.: Pattern Languages in Interaction Design: Structure and Organization
5. Vora, P., Castillo, J.: Using Patterns to Design Usable Interfaces for Web Applications. Alpha Cube, Inc. (June 29, 2005)
6. Todd, E., Kemp, E., Phillips, C.: What Makes a Good User Interface Pattern Language?
7. Tidwell, J.: COMMON GROUND: A Pattern Language for Human-Computer Interface Design, http://www.mit.edu/~jtidwell/Interaction_patterns.html
8. van Welie, M.: Patterns in Interaction Design, http://www.welie.com/
9. Yahoo Design Pattern Library, http://developer.yahoo.com/ypatterns/
10. Forum NOKIA, http://forum.nokia.com/
11. uiOne, http://brew.qualcomm.com/brew/en/about/uione.html
The Appropriation of Information and Communication Technology: A Cross-Cultural Perspective

Jose Rojas and Matthew Chalmers

Department of Computing Science, University of Glasgow, Glasgow, G12 8QQ, UK
{jose,matthew}@dcs.gla.ac.uk
Abstract. In this paper we explore the process of appropriation, attempting to broaden the set of topics considered significant to it. We present a model of appropriation derived from two studies conducted in the UK, Japan, South Korea and China. We describe our model based on a characterisation of the elements supportive of appropriation in the context of use (discussed in terms of space/place, social practices and activity) and in the ICT itself (described in terms of meaning, relevance and triviality). We emphasise the pre-eminence of context in achieving the appropriation of ICT.

Keywords: Appropriation, ICT, context, infrastructure, layout, marketing, business, domestication, socialisation, peer support, media, triviality, commoditisation, meaning, relevance, space, place, social practices.
1 Introduction

One long-standing concern of the HCI field is facilitating the introduction of new information and communication technology (ICT) into society by lowering the barriers users might experience, through an ongoing cycle of design, development and refinement of features. Interest in this problem arises from a preoccupation with the fast-paced change of technological development and the seemingly limited ability of society to cope with this deluge of change. The study of such phenomena broadly falls within what has been termed the appropriation of ICT. Appropriation is here understood as the "processes by which individuals and communities consciously take both conceptual and operational control of an idea, a tool, a technology, etc. within the context of their real and perceived culture" [15].

It could be argued that in the HCI field there is an assumption that, by achieving the right combination of features in a new technology, it is possible to produce a technology that 'naturally' fits practice. This view of appropriation as a single trait that can be captured and endowed upon new ICT seems to prevent us from considering a wider range of influences that might affect this process. The crusade to find the technology that can be seamlessly appropriated says nothing about the fact that many people, every day, continue using new technologies like mobile phones, mp3 players, IM clients, etc. regardless of their proficiency in operating them. There are reasons beyond the technology itself that influence people in appropriating it. Elucidating what those other influences might be
is precisely the purpose of this work. In this paper we first review a popular approach to the study of appropriation in HCI, through the concept of cultural dimensions, highlighting some of its shortcomings in this regard. We then present alternative approaches to the study of the appropriation of ICT. Drawing from these, we introduce our own work, arguing for its relevance in describing the appropriation of ICT.
2 Cultural Dimensions and HCI

One approach in HCI to the appropriation of ICT is through a discourse centred on the cultural differences observed among people living in different milieux, and on creating technologies that, arguably, capture and embody these differences. Geert Hofstede's concept of cultural dimensions is a popular theory used to justify this approach [3, 10, 11]. Several reasons may account for this circumstance. First, it provides a simple, tested and ready-made model of culture that can be used as a theoretical platform for a research enquiry; second, it is supposedly based on "quantifiable" traits of culture (power distance, collectivism, uncertainty avoidance, etc.); and third, it was cited in one of the earliest books dealing with culture and HCI, by del Galdo and Nielsen [8]. The cultural dimensions concept has drawn a considerable amount of criticism, primarily due to its reductionism. For instance, it claims cultural homogeneity among the members of a culture; disregards behaviour and values that are not necessarily determined by a particular cultural setting; ignores variations in the expression of cultural traits under changing situations; and does not account for differences in the importance assigned by different cultures to specific cultural traits [9, 12, 14].

2.1 What in the World Is Culture?

From the perspective of cultural dimensions, culture is conceived as a monolithic and fixed entity. The idea of culture as a monolithic entity is nowhere clearer than in Hofstede's definition of culture as "the collective programming of the mind which distinguishes the members of one group of people from another" [6]. Interestingly, but perhaps quite frequently ignored, even Hofstede denied the homogenising implications of his definition by conceding that cultures cannot be attached to specific geographical settings, since cultures can be found both as integrated social systems and as smaller parts of those integrated social systems [6].

Hofstede's model also characterizes culture as a fixed entity; the values and norms of a culture therefore remain static (the subjective dimension) even when its symbols may change (the objective dimension). For Hofstede [6], the slow pace of change in values accounts for the emergence, over hundreds of years, of what he termed cultural dimensions. The fixedness of culture advocated by this theory prevents it from providing an intelligible account of, for instance, the rapid embrace of computer-mediated communication. This can be illustrated by comparing the McDonald's websites around the world analysed in a 2003 study by Wurtz [16] with their counterparts in February 2009: clear-cut issues of low- vs. high-contextuality and individualism vs. collectivism, like those found in Wurtz's study, are difficult to distinguish in the current design of those websites. An immovable culture such as that portrayed by Hofstede's cultural dimensions would prevent a society from adapting to a changing environment by learning new ways of performing tasks and solving problems [9].
In our opinion, trying to define users in the deterministic manner of cultural dimensions and then embodying those definitions in a technology is problematic. We believe that a more sensible approach to the integration of ICT into a person's real or perceived setting [15] is found by identifying common elements and forces shaping this circumstance. An analysis of successful ICT across countries might provide us with a number of elements that influence people to embrace certain technologies regardless of their cultural differences. An investigation of those elements, and of their role in encounters between people and technology, might be more helpful in providing the conditions necessary to introduce and sustain the use of novel digital technology over the long term. With this idea in mind, we now consider other approaches to the study of the appropriation of ICT, and how they explore the influence of different elements beyond culture on this phenomenon.

Dourish defines appropriation as "the process by which people adopt and adapt technologies, fitting them into their working practices" (emphasis added) [4]. This highlights two sides of the process: (1) the social element, whereby people modify their activities to integrate a technology into their practices and thereby create new practices, and (2) the technical features of a technology, embedded in its design, that can be modified according to a predefined set of options. For Dourish, the problem of designing for appropriation seems to be both a technical issue that arises from dated software structures and a social issue. This observation is interesting when we observe that technologies that continue to rely on "old" technological structures continue to be appropriated all around us. Dourish's view does not diminish the role of humans in the process of appropriation, but does focus on the "shortcomings" of current technology and on how technological alternatives might address them.

Carroll has introduced a model to describe the appropriation of ICT [2] and to aid in the design of technology [1]. Carroll's model is the result of her exploration of mobile technologies and CRM systems. Carroll suggests that the cycle of appropriation is completed only when, in the design process, a new set of requirements is gathered from the different ways in which a technology is appropriated ("design from appropriation"). New requirements and new uses are then incorporated into the design process for future iterations of a technology ("design for appropriation") in order to incorporate the features designers did not foresee in the early iterations [1]. Carroll's cycle highlights the essential role of users in completing the appropriation cycle. However, given the myriad people that every day embrace ICT, satisfying the needs of new users would appear to demand a never-ending cycle of requirements gathering and redesign. Clearly, this infinite cycle is not happening, and yet many people continue embracing ICT.

Ito suggests that Japan's adoption of keitai—as mobile phones are known there—is due to a self-feeding loop rooted in the particularities of Japanese popular culture (e.g., animation, video games, comics, food and other cultural elements of this nation) [7]. Yet popular culture alone does not account for the adoption of mobile phones and the mobile Internet in Japan. At least two more elements also play an important role in this phenomenon: business practice and mobile technology design.
The complex relationships between use, the design of mobile technology and business practice provide a better understanding of the adoption of keitai and the mobile Internet in Japan [7]. For instance, in his account of mobile phone adoption in Japan, Okada [13] traces the development of the technologies that made the mobile phone possible, describing how
the addition of a simple LCD to pagers made possible the transition of this technology from a business tool to a personal one. However, even these technological achievements would have had little impact among people if the corporations developing them had not battled each other to penetrate the Japanese market through constant price reductions [13].

It would appear, then, that the appropriation of ICT is influenced by a constellation of elements beyond the people that use a technology or the features of the technology itself. Analysis beyond these latter two issues seems to be absent from standard research approaches to this phenomenon. To compound this issue, those other elements influencing the appropriation of ICT seem to fall outside traditional areas of concern in HCI. Several questions naturally arise from this observation: What are those other elements influencing the appropriation of ICT, and what are their relationships? And how is a deeper understanding of the appropriation of ICT useful to the HCI field as we create novel technologies?
3 A Model of Appropriation of ICT

To address these questions we explored the appropriation of common ICT (e.g., computers, the Internet, mobile phones, IM, online social networks, blogs) in settings other than work, where these technologies are used on a voluntary basis over the long term. First, we conducted an ethnographic study over a three-month period with a sample of fifteen international students from China, Greece and India enrolled in Masters degree courses during the 2007-2008 school year at the University of Glasgow. We did this assuming that the process of appropriation of ICT, and the dynamics associated with these changes, would be magnified among people relocating to a new milieu. Our study centred on weekly 30-minute interviews with each member of our sample from October 2007 to January 2008. We followed this approach for two reasons: (1) to explore, at a high level, the events that remain imprinted in our sample's memories regarding their reasons for adopting an ICT, and (2) to track behavioural changes in their use of ICT, and the justifications for them, over an extended period of time. We analysed transcripts and other collected materials using the Grounded Theory approach [5], identifying the forces and elements that shape this process within this community. In this manner we identified elements in the context of use and in the ICT itself that play a decisive role in the appropriation of ICT.

In order to increase the relevance of the findings of our first study, we decided to undertake fieldwork in other countries to account for different elements in diverse contexts. We believe this experience improved the ecological validity of our findings and of the resulting socio-culturally informed model of appropriation of ICT that we built from our data. One of the authors visited Japan, South Korea and China to carry out studies similar to the first, but this time interviewing people in their native countries. In Sapporo, Japan, we interviewed two Japanese students and four Japanese workers, as well as four foreign students from Brazil and India. In South Korea we interviewed eleven local students at Ajou University in Suwon. In China, twenty local students at Nankai University in Tianjin were recruited. Interviews lasted forty minutes on average and, with the exception of Japan, where each participant took part in more than one individual interview, all participants took part in a single individual interview.
Participants were recruited through advertisement on the BBS of each university and through word of mouth. The only requirement to take part in the study was the ability to conduct a simple conversation in English. The second study was complemented with observations of people’s conduct and ICT use in public transportation and private and semi-private spaces (e.g., restaurants, classrooms, airports, streets, shops, etc.) across the countries surveyed. Based on the above two studies, we drew out six elements whose interactions seem to provide the necessary conditions for the appropriation of ICT in everyday life. Outlined in Figure 1, and explained later in this section, our model covers the key issues of this phenomenon in all the contexts (and countries) we have analysed. The following example illustrates the use of our model of appropriation of ICT:
Fig. 1. Our model of appropriation is constituted of three micro-level features—relevance, meaning and triviality—and three macro-level features: space/place, activity and social practices
“When I was in High School the Internet became very popular in China. Everyone would have a computer at home and in their offices. Everyone would be talking about chatting and QQ. I started using QQ in High School. It was very popular in my environment among my friends to communicate with each other, so I applied, downloaded and installed the software. However, I would seldom use QQ. At that time our work schedule was very tight. My best friend, my classmates, all were very keen on their studies, so we seldom used it. It was common to have a computer, but students would seldom use it.” (Yan, 23, China)
On a micro-level (inner ring in Figure 1), like any other high school student, this participant needs to communicate with her peers. To do that, she needs to use a technology that satisfies the task she is supposed to perform (i.e., it is relevant), one she can afford (i.e., trivial), and one that is in accordance with the practices enacted by her peers (i.e., meaningful). At a macro-level (outer ring in Figure 1), the use of an ICT does not take place in a vacuum; it takes place within the structure of a particular context or environment (Meizhou, Guangdong, China in this case). Thus, the use of QQ as a communication technology is set within the following context: 1) it appears that around the time of this event (2001-04) a general adoption of computers and the Internet took place, at least in this town; the absence of computers and the Internet would have prevented this event from happening at all (space/place); 2) that QQ is even considered as an alternative for communication is only possible because the company that developed this IM client invested considerable resources in positioning itself as a communication channel for the younger generation (social practices); and yet 3) the use of QQ is opposed by the spectre of the Chinese National College Entrance Examination during 'Black June', which represents a life-defining moment for Chinese youth (activity). This example illustrates how the appropriation of ICT is achieved through a delicate balance between the characteristics of the ICT itself and the context of use. We will now describe the six elements mentioned more fully, but in doing so we note the
692
J. Rojas and M. Chalmers
importance of understanding how these elements influence each other. This exercise can serve to remind ourselves of the limits of our sphere of influence as technologists. 3.1 Relevance The practical issue addressed here is based in the fact that in everyday life people continually face challenges they need to solve. Individuals are continually making choices as to what they want, need or should act on. A large part of the success of an ICT resides in its ability to continue opening new possibilities to disseminate information and facilitate communication, and in its ability to sustain existing practices around these issues. The relevance of an application to support these activities might be practical or perceived. For instance, participants across the countries surveyed usually relied on Skype for voice communication abroad because it was the only service that, among the existing alternatives, offered a free service and/or a competitive price. Skype, within this context, is a case of practical relevance as it effectively makes voice communication affordable. Conversely, in China we found participants who would embrace applications such as MSN assuming this move would be to their advantage upon securing a job since the use of QQ is discouraged in a work setting. The case of MSN here is one of perceived relevance. 3.2 Triviality As people address their challenges in everyday life, they may consider whether digital technologies may help them in meeting these challenges. Accordingly, people may frame their problems in terms of the possibilities of ICT within the horizon of expectations engendered by the media, their perception regarding the ease of use of any given ICT, and the costs, if any, associated with operating (or acquiring) an ICT. The practical issue for individuals here is the choice, from among the ICTs that they have at hand and that they can afford, of the most ‘trivial’ one to use. Different factors contribute to this including commoditisation, usability issues and the media. Moore’s Law maintains that the number of transistors (microprocessor performance) doubles every two years. One of the main consequences of this law is the commoditisation of digital goods. Perhaps the best example of the commoditisation of technology is mobile phones. Some participants received their first mobile phone as early as 14 years of age. Some of them are now using their fifth mobile phone. Although not a reason enough to secure appropriation, a high level of usability in ICT facilitates this process. ICTs need to reach a level in which even first-time users can operate them with some degree of proficiency in a short period of time. For instance, before the creation of YouTube it was possible to share video online, but this operation would require skills and resources reserved to a few. The creation of YouTube facilitated the task of sharing video online to such a degree that none of the skills previously required were needed any longer. In our view, the media contributes to the trivialisation of ICT by ingraining it into the public consciousness. It could be argued that the media is one of the most important factors in promoting ICTs, in shaping what ICTs are widely known, and even when and how ICTs are to be used. Needs finally solved by ICTs, images acquired by the use of the latest technologies, and new horizons arguably opened by novel gadgetry seem to be the standard discourse about ICTs in the media.
The Appropriation of Information and Communication Technology
693
3.3 Meaning Users assign subjective and often intangible meanings to ICT, making it meaningful in individual ways, as well as taking up more clear–cut objective functional patterns of use. In the case of information gathering, most cases observed were an individual activity, and meanings assigned to information technology were given on a discrete basis. Tools used to develop an activity become interlocked with the feelings experienced during the task (e.g., success or relief) influencing any future use. The appropriation of an information tool continues over the long term until meanings associated with a technology are lost because the tool is proved to lack the very same faculties and appeal that led to its adoption, or because a superior or more attractive technology comes on stage (e.g., Google’s appropriation in the face of AltaVista’s demise). The case of communication technologies seems to be more complex. Meanings bestowed upon communication technology do not seem to depend only on the activity they are supposed to satisfy but on a larger constellation of issues. A common behaviour observed in the sample was that of mixing and matching various communications technologies (e.g. IM, social networks, etc.) according to different circumstances such as: 1) place where they were used (e.g., bedroom vs. lecture hall vs. library); 2) message that was transmitted (e.g., chit-chat vs. more substantial issues); 3) ‘listener’ or the person at the other end of the communication link (e.g., friend vs. family vs. lecturer; 4) level of involvement with listener (e.g., friends vs. boyfriend/girlfriend); and even 5) time of the day and season of the year when communication takes place (e.g., day vs. night, holidays vs. term time). Users appropriate communication technologies according to these various scenarios of daily use imbuing them with multiple meanings and making them coexist within a constellation of other technologies. Besides addressing an activity in terms of the most trivial ICT at hand, previously established patterns of use also have a significant effect on people. People tend to appropriate those technologies already appropriated by their peers. The practical issue here is which ICT is the most appropriate to communicate a given message to a given person at a given moment, with regard to both sender and receiver. 3.4 Activity ICTs are constantly being assessed by their users according to their abilities to continue satisfying evolving and emerging needs (cf. relevance above), and their expression in everyday activity such as play, communication, study, etc. according to the trends of our time. Not surprisingly, then, the ability of ICTs to continue addressing real and perceived needs (and the activities these needs engender) is another element influencing their appropriation. Broadly speaking, among our participants ICTs were used to satisfy three types of activities: 1) study and work activities that are paramount in their list of priorities and to which all other needs/activities are secondary; 2) leisure and entertainment activities that play an important role in balancing everyday life; and to a lesser degree, 3) economic-related activities that might produce a financial benefit (e.g., purchasing clothes online at a lower price than in a bricks-and-mortar store).
694
J. Rojas and M. Chalmers
3.5 Space/Place The manner in which an activity or need is satisfied does not depend entirely on the tool used to this end, but also on the place where the tool is used. Thus, even when mobile technologies might extend the spatial range where social interactions mediated by ICT occur, their use is still restricted by two elements of the space/place where action takes place: the infrastructure and layout, and the marketing and business practices of a given milieu. In a sense, these elements determine what ICT is trivial in any given setting. Some of our participants across the countries surveyed were housed in university accommodation. Having to live under these conditions forced participants to readjust and modify previous habits to cope with the restrictions and possibilities of a new infrastructure and layout that is shared with non-family members. Thus, while mobile technologies might open new horizons for use, they might not be appropriated because they are effectively constrained by service availability (e.g., a mobile provider without a good signal in certain areas) and by other physical barriers (e.g., a bedroom for six students) that must be negotiated with regard to other people’s activities. Strong influence on the appropriation of ICT comes from the marketing and business practices of a given milieu. This influence is not only shaped by the technical possibilities of a technology, but by business competition. ICT thus exists only within the possibilities made available by different corporations for whom the commercialisation of ICTs and their features are a source of revenue. Within this structure, it would seem obvious to say that users can only appropriate ICTs that are commercially available, as well as affordable or free. We believe the influence of marketing and business practices cannot be ignored in any study of the appropriation of ICT. 3.6 Social Practices Organisational regulation in places such as schools and offices exerts considerable influence in making certain ICT the de facto standard to conduct an activity. For instance, at Nankai University in China students would conduct most of their internal communication through the school’s BBS. Whoever enters those spaces/places is tacitly forced to comply and adapt to the ongoing practices of these environments. Government regulation is another considerable influence on the appropriation (or not) of ICT. For instance, for Korean males, military service is a duty characterised by deprivation of any means to communicate with the exterior except through regular public phones inside the base. As indicated above, the media’s influence in the use of ICT was observed as having a significant effect. Existing and evolving practices of a co-located or dispersed social group make certain technologies the chosen option to conduct various social practices such as the expression of friendship and communication. Thus, the appropriation of ICT is not necessarily a question of the technical features of a technology, but a question of how the environment leads a group of people to embrace an ICT and incorporate it into its existing and ongoing social practices making it meaningful. Two groups are especially relevant in this regard: the family, and friends and acquaintances. The domestication of technologies takes place when these become part of a household dynamics. ICT are also adopted and integrated into the ongoing activities of a domestic environment. For instance, across the countries surveyed, parents were
The Appropriation of Information and Communication Technology
695
always highlighted as the original facilitators of ICT such as computers and mobile phones. The provision of a mobile phone was invariably a form of monitoring enacted by anxious parents. The integration of ICT into the dynamics of a group of friends, acquaintances and classmates—the socialisation of ICT—also seems to be fundamental in its appropriation. For instance, in South Korea the most popular IM client among our sample was NateOn; in China it was QQ. Use of a different client was seen as an oddity and discouraged among our participants and their friends.
4 Conclusion In this paper we have avoided focusing on a structural or functional analysis of ICT, and whether or how structure and function directly determine appropriation. Instead, we took a step back to consider the forces acting on the context of use in terms of space/place, social practices and activity. From this viewpoint, we observed how the presence of certain characteristics of ICT (collectively summarised as triviality, meaning and relevance) with a given context of use seems to encourage, and in some cases impose, the use and appropriation of ICT. We suggest that this model might be applied to other countries and contexts of use beyond university students, as it focuses on elements and characteristics that can be consistently found across diverse settings. By laying out the interplay of the six aspects of our model of appropriation, as they appear in a particular case, we believe it is possible to provide a fair picture of the appropriation of an ICT in everyday life—or the lack of it. However, we invite readers to apply the conceptual framework presented here to other settings, communities and countries so as to assess and improve it. The breadth, openness and interdependence of these aspects makes them difficult to experiment with holistically in controlled lab settings, or to encapsulate in fixed and formal software engineering terms. Most difficult for such approaches, we believe, are uncontrollable and unpredictable aspects of large–scale social activity, such as economics, marketing and the mass media. Our findings suggest that such areas are often essential parts of the adoption and appropriation of ICT, even if they are not conveniently or traditionally part of such approaches. This confirms the need for more situated practices of user study well-established in HCI, such as those based on ethnography, but it also reminds us of the breadth of perspective needed for—and inherent difficulty of—design for appropriation that reflects and respects its dynamism, detail and variety.
References 1. Carroll, J.: Completing Design in Use: Closing the Appropriation Cycle. In: Proceedings of the 12th European Conference on Information Systems (ECIS 2004), Turku, Finland, p. 11 (2004) 2. Carroll, J., Howard, S., Vetere, F., Peck, J., Murphy, J.: Just What Do the Youth of Today Want? Technology Appropriation by Young People. In: Proceedings of the 35th Annual Hawaii international Conference on System Sciences (HICSS 2002), IEEE Computer Society, Washington, DC, USA (2002)
696
J. Rojas and M. Chalmers
3. Choi, B., Lee, I., Kim, J., Jeon, Y.: A qualitative cross-national study of cultural influences on mobile data service design. In: Proceedings of the SIGCHI conference on Human factors in computing systems, Portland, Oregon, USA, pp. 661–670. ACM Press, New York (2005) 4. Dourish, P.: The Appropriation of Interactive Technologies: Some Lessons from Placeless Documents. Computer Supported Cooperative Work 12, 465–490 (2003) 5. Glaser, B.G., Strauss, A.L.: The Discovery of grounded theory. Strategies for qualitative research, pp. x. 271. Weidenfeld & Nicolson, London, printed in USA (1968) 6. Hofstede, G.H.: Culture’s Consequences: Comparing Values, Behaviors, Institutions, and Organizations Across Nations. Sage Publications, Thousand Oaks (2001) 7. Ito, M.: Introduction. In: Ito, M., Okabe, D., Matsuda, M. (eds.) Personal, Portable, Pedestrian: Mobile Phones in Japanese Life, pp. 1–16. MIT Press, Cambridge (2005) 8. Kamppuri, M., Bednarik, R., Tukiainen, M.: The Expanding Focus of HCI: Case Culture. In: Proceedings of the 4th Nordic conference on Human-computer interaction: changing roles, Oslo, Norway, pp. 405–408. ACM Press, New York (2006) 9. Kamppuri, M., Tukiainen, M.: Culture in Human-Computer Interaction Studies. In: Proceedings Cultural Attitudes Towards Communication and Technology 2004, Murdoch University, Australia, pp. 43–57 (2004) 10. Kayan, S., Fussell, S.R., Setlock, L.D.: Cultural Differences in the Use of Instant Messaging in Asia and North America. In: Proceedings of the 2006 20th Anniversary Conference on Computer Supported Cooperative Work, Banff, Alberta, Canada, pp. 525–528. ACM Press, New York (2006) 11. Marcus, A., Gould, E.W.: Crosscurrents: Cultural Dimensions and Global Web Userinterface Design interactions 7(4), 32–46 (2004) 12. McSweeney, B.: Hofstede’s Model of National Cultural Differences and Their Consequences: A Triumph of Faith - A Failure of Analysis. Human Relations 55(1), 89–118 (2002) 13. Okada, T.: Youth Culture and the Shaping of Japanese Mobile Media: Personalization and the Keitai Internet as Multimedia. In: Ito, M., Okabe, D., Matsuda, M. (eds.) Personal, portable, pedestrian: mobile phones in Japanese life, pp. 41–60. MIT Press, Cambridge (2005) 14. Ratner, C., Hui, L.: Theoretical and Methodological Problems in Cross-Cultural Psychology. Journal for the Theory of Social Behavior 33, 67–94 (2003) 15. UNESCO Definitions, http://www.unesco.org/education/educprog/lwf/ doc/portfolio/definitions.htm (accessed December 27, 2008) 16. Würtz, E.: Intercultural Communication on Websites: An Analysis of Visual Communication in High- and Low-context Cultures. In: Proceedings Cultural Attitudes Towards Communication and Technology 2004, Murdoch University, Australia, pp. 109–122 (2004)
UISK: Supporting Model-Driven and Sketch-Driven Paperless Prototyping Vinícius Costa Villas Bôas Segura and Simone Diniz Junqueira Barbosa Departamento de Informatica Rua Marques de Sao Vicente, 225 Gavea, Rio de Janeiro, RJ, 22451-900, Brasil {vsegura,simone}@inf.puc-rio.br
Abstract. Sketches are often used during user interface design and evaluation as both a design support tool and a communication tool. Despite recent efforts, computational support to user interface sketching has not yet reached its full potential. This paper reports a study comparing two evaluation techniques: paper prototyping and a simulation-based evaluation supported by the UISKEI tool. Keywords: User interface sketching, prototyping, user interface evaluation.
1 Introduction Most designers create, as part of the design activity, sketches of their alternative design ideas to better communicate them among the design team and to the users, as well as to evaluate them early in the design process [2, 5, 7]. Users, in their turn, prefer to evaluate user interfaces that more closely resemble the final product, but very often find difficulties in going beyond static representations to grasp how the user-system interaction will take place. Through sketches alone, the consequences of certain design choices may go unnoticed, such as restrictions on the configuration of or sequence of actions in order to perform a certain task. There are even users who consider paper sketches a hurried and amateurish representation, despite the success of early evaluation techniques such as paper prototyping [11]. To support designers in the creation of sketch-based prototypes, we have developed a tool called UISKEI (User Interface Sketching and Evaluation Interface), which allows designers to draw and have the tool recognize user interface elements, as well as to associate behavior to these elements in a pseudo-functional prototype. The user can then interact with the prototype as in a simulation of how the application will behave, thus gaining a better understanding of how the interaction will happen, and allowing designers to conduct usability evaluation with users [9] early in the design process. This paper is organized as follows. The next section presents the tool and the goals it intends to fulfill. The following section describes an early evaluation of how UISKEI compares to paper prototyping, explaining how the test was conducted and the results obtained. Lastly, we present the conclusions of the study. J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 697–705, 2009. © Springer-Verlag Berlin Heidelberg 2009
698
V.C.V.B. Segura and S.D.J. Barbosa
2 UISKEI: Instrumenting Sketch-Driven Evaluation 2.1 Designing in UISKEI UISKEI has been developed to support two design strategies: based on sketches alone, and based on interaction models and sketches. In the first strategy, designers will manually define the behavior of each user interface element to prepare for the simulation. In the second strategy, designers will import the definition of the application behavior from an XML file representing an interaction diagram in MoLIC, a modeling language for interaction as conversation [1]. MoLIC allows designers to represent interaction scenarios [3, 4] in a more structured representation, in which intersections and relations between scenarios are made explicit. To create the sketch of a presentation unit (window, web page and the like) with the user interface elements contained therein, the user must draw them on the screen, taking advantage of the pen-based interaction supported by the system. There is a predefined language of gestures that allows UISKEI to recognize the corresponding widget (WIMP element), using the algorithms and ideas defined in [6,8,10,12]. Currently, the set of user interface elements recognized by UISK includes the following elements: buttons, labels, radio buttons, toggles, drop-down lists, lists, and textboxes. If an element is not recognized, it remains on the screen as a stroke, allowing the designer to draw any kind of meaningful symbols (for example, simplified images or logos). This possibility of unconstrained drawing grants flexibility to the tool, making it easier for the prototype to better resemble the final interface. In addition, it allows the creation of innovative user interface elements, since every single drawing can be treated as a widget.
Fig. 1. UISKEI elements
Having drawn the user interface elements, it is possible for the user to define values and behaviors for each element. Values are entered as strings and they can only be added to a specific set of elements, having different interpretations depending of the element. For example, the values of a textbox correspond to the texts that could be “typed” in the prototype while the values of a drop-down list correspond to the items that could be selected in the prototype. Some elements have pre-defined values (e.g. radio buttons have two: checked and unchecked) and others have none (e.g. labels).
UISK: Supporting Model-Driven and Sketch-Driven Paperless Prototyping
699
The behavior can be conditionally triggered (e.g. depending on the value of another user interface element or several of them) and can result in a series of actions. For example, the click of a command button can result in navigating to different screens according to the state of a toggle button and the click over a toggle button can enable several radio buttons. Behavior can also be added to strokes, widening the possibilities of the tool. 2.2 User Testing through Simulation in UISKEI After the screens and the corresponding behaviors have been defined, it is possible to launch a simulation to be presented to the final users in order to evaluate the system. The final user interacts with the simulation, and the system performs the behavior assigned to each element, taking into account the context of the interaction. So, without any coding or implementation from the designer’s part, it is possible to have a semi-functional sketch-based prototype evaluated with the final user.
3 Paperless Prototyping Evaluation in UISKEI: A Preliminary Study We have conducted preliminary user testing sessions to evaluate the simulation facility provided by UISKEI and comparing it to the paper prototyping technique. 3.1 Planning the Evaluation To evaluate UISKEI, we chose a photo web application as the target system, and developed two alternative interactive solutions to it (A and B). Each test participant, acting as a final user, went through both solutions in either paper or UISKEI, giving us insight for further developing the tool and allowing us to make a preliminary comparison between the two prototyping techniques. In order to reduce bias due to the order in which each technique was used, the test participants were divided into four groups, as depicted in Table 1: Table 1. Test group division
First A, then B First B, then A
First paper, then UISKEI G1 G3
First UISKEI, then paper G2 G4
The tests proceeded according to the following procedure, instantiated here for group G1: 1. Testing using paper prototyping, beginning with solution A and then testing with solution B 2. Interview about the paper prototyping technique 3. Testing using UISKEI simulation, beginning with the solution A and then testing with solution B 4. Interview about the UISKEI simulation, comparing it to paper prototyping
700
V.C.V.B. Segura and S.D.J. Barbosa
Each user agreed to take part in the evaluation by signing a consent form and all test sessions were recorded (in both audio and screen capture). For this preliminary study, there were two users in each group. Throughout this paper, we call the group G1 participants P1a and P1b, and likewise for the other three groups. All eight users were either undergraduate or graduate students, in their early 20’s: five men (four of them with a background in Engineering, one in Communication Studies), and three women (one in Biological Sciences, one in Engineering and one in Communication Studies). 3.2 The Evaluation Scenario The proposed system was a photo buying web site and all users were presented with the following scenario: You are a recent graduate who wishes to choose certain graduation ceremony photos to buy. In order to do that, you visit the website of the company responsible for the pictures and choose, from among the available photos, the four best shots (¬¬, =P, T_T, ^_^). Next, after looking at the chosen pictures, you decide not to buy one of them (T_T) and then you confirm your request.
Fig. 2. User interface for solution A
UISK: Supporting Model-Driven and Sketch-Driven Paperless Prototyping
701
Alternative solution A was inspired by an actual website and had a more textual approach: users must type the reference code in the appropriate textbox and then click “add photo>>” to add it to the list below. To remove, the user must only click on the reference in the list. When the user clicks the OK button, they finish the purchase. All of the screenshots can be seen below: the screenshots drawn in UISKEI on the left, and the corresponding paper version on the right: Alternative solution B had a more pictorial approach: the user goes through the pages and select the pictures by marking a checkbox below each picture. Also, this solution presents an additional step ⎯a “selected screen”⎯ displaying only the selected pictures side-by-side and allowing the user to compare them and remove those they do not wish to have printed. Again, the screenshots for solution B can be seen below (UISKEI on the left, paper on the right).
Fig. 3. Solution B “screenshots”
702
V.C.V.B. Segura and S.D.J. Barbosa
As can be seen, both test materials have a sketchy look, in order to allow us to make a fair comparison and so that the appearance is not a factor of interference with the test results. 3.3 Results In the interview, besides open questions there were three grading questions, in which the users rated the technique in three aspects: • Adequacy: how adequate the technique was for testing a user interface prototype. (1 = poor, 5 = adequate) • Enjoyability: how much the user would like to use the prototyping technique again in the future. (1 = would dislike, 5 = would like very much) • Comprehensiveness: how much it is possible to understand how the interface works and criticize it. (1 = little comprehensive, 5 = very comprehensive) Although the test was not significant due to the low number of participants, we already have some interesting indications. The overall results of UISKEI are better: UISKEI ended up with an average of 4.6, while paper prototyping ended up with an average of 3.0. Each UISKEI grade has at least 1.2 points of difference to the corresponding one in paper prototyping. Moreover, UISKEI’s strongest point (enjoyability, with a 4.8 average) was the paper’s weakest one (with a 2.6 average). Every user acknowledged and enjoyed the immediate response of UISKEI, saying that the interaction was more visible and dynamic: “With the computer it is easier, since you click and it appears” (P4a) and “it is possible to perceive the basic functionality” (P2b). Also, another common opinion is that UISKEI prototyping was better because it is already in the same environment as the final solution, being “close to the real interface” (P2b). The results obtained in the rating questions are shown below (Table 2): Table 2. Results for the rating questions at the interviews
P1a G1 P1b P2a G2 P2b P3a G3 P3b P4a G4 P4b Average Std. Dev.
Adeq. 4 4 2 2 4 5 2 2 3.1 1.2
Paper Enjoy. 4 4 1 1 3 5 1 2 2.6 1.6
Comp. 3 5 1 3 4 5 3 2 3.3 1.4
Adeq. 4 5 5 4 4 5 5 4 4.5 0.5
UISKEI Enjoy. 5 5 5 4 5 5 5 4 4.8 0.5
Comp. 4 5 5 5 4 5 5 3 4.5 0.8
The influence of the test order can be seen by comparing the results of Table 1 grouping certain rows and columns. Comparing the solution order (tables 3 and 4), it is possible to notice that users who were presented with the solution B first (groups
UISK: Supporting Model-Driven and Sketch-Driven Paperless Prototyping
703
G3 and G4) gave higher grades overall. This can be explained by the users’ preference to the more pictorial approach, confirmed in the interview. Table 3. Results for the rating questions at the interviews of the users presented first with the solution A, than with the solution B
Average Std. Dev.
Adeq. 3.0 1.2
G1 and G2 (First Row – First A, then B) Paper UISKEI Enjoy. Comp. Adeq. Enjoy. 2.5 3.0 4.5 4.8 1.7 1.6 0.6 0.5
Comp. 4.8 0.5
Table 4. Results for the rating questions at the interviews of the users presented first with the solution B, than with the solution A
Average Std. Dev.
G3 and G4 (Second Row – First B, then A) Paper UISKEI Adeq. Enjoy. Comp. Adeq. Enjoy. Comp. 3.3 2.8 3.5 4.5 4.8 4.3 1.5 1.7 1.3 0.6 0.5 1.0
Comparing the technique presentation order (tables 5 and 6), it is possible to notice a clear difference between UISKEI and paper prototyping. The paper prototyping technique received higher grades (4.3, 4.0 and 4.3) from users who experimented it first than from users who tested UISKEI (2.0, 1.3 and 2.3). This can also be seen in the standard deviation data: while the tables comparing by technique order (by columns) have standard deviation for paper protoyping below 1.0, the ones comparing by solution order (by row) have the same parameter equal to or above 1.2, showing the influence of the order in which the techniques were presented to participants. Table 5. Results for the rating questions at the interviews of the users presented first with paper prototyping, than with UISKEI prototyping
Average Std. Dev.
G1 and G3 (First Column – First paper, then UISKEI) Paper UISKEI Adeq. Enjoy. Comp. Adeq. Enjoy. Comp. 4.3 4.0 4.3 4.5 5.0 4.5 0.5 0.8 1.0 0.6 0.0 0.6
During the interview, we discovered that paper prototyping was considered unnatural by many of the users. One of them even missed the clicking noise of the mouse: “[paper] is not much natural, it does not have that feeling, does not have sound, the clicking sound. The experience is different. Cannot say what it has that bugs me, but it is different.” (P3a). Later, when testing UISKEI, the same user said: “same thing as in the paper, but presented in a form that was more comfortable to me” (P3a).
704
V.C.V.B. Segura and S.D.J. Barbosa
Table 6. Results for the rating questions at the interviews of the users presented first with UISKEI prototyping, than with paper prototyping
Average Std. Dev.
G2 and G4 (Second Column – First paper, then UISKEI) Paper UISKEI Adeq. Enjoy. Comp. Adeq. Enjoy. Comp. 2.0 1.3 2.3 4.5 4.5 4.5 0.0 0.5 1.0 0.6 0.6 1.0
One of them (P2b), even called the paper prototyping “boring and senseless”, remarking that he felt that it was “more difficult to understand the dynamics”, because he felt it was not very efficient nor interactive. By contrast, another user (P3b) said that UISKEI was “simple and objective”. It was also a common idea that the paper has a limitation by its own nature: “(…) in the paper it is unreal (…) I kind of ignore the idea of having buttons, list and everything.” (P2a). Many participants felt annoyed by the constant “paper switching”, one of them (P4b) even commented that she “preferred that the computer does the maual job”. The same user, after testing both techniques, said that “there are things that you see in the computer and others in paper”, remarking that she herself had a different behavior while experimenting with the two techniques (in particular, while in the computer, she paid more attention to the screen, noticing the “view results” link in the top right corner, while in the paper she lost her focus more easily). When asked to compare the techniques, most participants (6 out of 8) preferred UISKEI over paper prototyping. P1a, however, stated that paper is a “more established thing”, while UISKEI still have its own bugs and some limitations (as, for example, not being able to type in a textbox, being restricted to a set of predefined values). However, his argumentation was: “UISKEI gave me expectations that were not fulfilled [regarding the aforementioned limitations], opposing to paper, that I would know that I could not type so I would be already resigned”. So, the paper’s natural limitations ended up by being a positive aspect to it. The other participant who preferred paper, P1b, said that she felt more comfortable with it, since “nothing will happen, nothing can be broken”.
4 Conclusions This paper presented a study comparing the paper prototyping evaluation technique to an interaction simulation supported by the UISKEI tool. The purpose of the study was to investigate whether the envisioned computational support is promising, and in which directions the development should evolve to take better advantage of the tool for supporting early evaluation of human-computer interaction. The preliminary results showed that UISKEI was generally well accepted by the study participants. In order to promote UISKEI’s adoption by design and development teams, however, some important facilities must be incorporated in the tool. From the designers’ point of view, it is important to have UISKEI import existing dialogue or interaction models to help define the application’s behavior and ensure consistency between modeled HCI design decisions and the simulated prototype.
UISK: Supporting Model-Driven and Sketch-Driven Paperless Prototyping
705
Besides allowing the evaluation of the user interface defined by the widgets and the user-system interaction defined by the behavior (manually or by importing a MoLIC diagram), UISK aims to advance the software specification efforts. Therefore, the UISK team is currently developing a generator for writing the interface and interaction specification in a user interface markup language, together with some UML diagrams. This specification will improve the traceability between the software specification and implementation activities and the earlier design activity. In the future, we intend to make this markup available for graphics designers to improve the user interface using their own special purpose tool. Acknowledgements. We would like to thank all study participants who selflessly devoted their time to our research. We would also like to thank the Brazilian Council for Scientific Research, CNPq, for grants 479167/2008-7 and 311794/2006-8, which made this work possible.
References 1. Barbosa, S.D.J., de Paula, M.G.: Designing and evaluating interaction as conversation: A modeling language based on semiotic engineering. In: Jorge, J.A., Jardim Nunes, N., Falcão e Cunha, J. (eds.) DSV-IS 2003. LNCS, vol. 2844, pp. 16–33. Springer, Heidelberg (2003) 2. Buxton, B.: Sketching the User Experience: getting the design right and the right design. Morgan Kaufmann, San Francisco (2007) 3. Carroll, J. (ed.): Scenario-based Design: Envisioning Work and Technology in System Development. John Wiley and Sons, New York (1995) 4. Carroll, J.: Making Use: Scenario-Based Design of Human-Computer Interactions. MIT Press, Cambridge, MA (2000) 5. Coyette, A., Faulkner, S., Kolp, M., Limbourg, Q., Vanderdonckt, J.: SketchiXML: towards a multi-agent design tool for sketching the user interfaces based on USIXML. In: Proceedings of TAMODIA 2004, pp. 75–82 (2004) 6. Forbus, K.D., Ferguson, R.W., Usher, J.M.: Towards a computational model of sketching. In: Proceedings of the 6th international Conference on intelligent User interfaces, IUI 2001, pp. 77–83. ACM, New York (2001), doi:http://doi.acm.org/10.1145/359784.360278 7. Landay, J.A., Myers, B.A.: Sketching interfaces: Toward more human interface design. Computer 34(3), 56–64 (2001) 8. Li, J., Zhang, X., Ao, X., Dai, G.: Sketch recognition with continuous feedback based on incremental intention extraction. In: Proceedings of the 10th international Conference on Intelligent User Interfaces, IUI 2005, pp. 145–150. ACM Press, New York (2005) 9. Nielsen, J.: Usability Engineering. Academic Press, London (1993) 10. Sezgin, T.M., Stahovich, T., Davis, R.: Sketch based interfaces: Early processing for sketch understanding. In: Proceedings of 2001 Perceptive User Interfaces Workshop, PUI 2001 (2001) 11. Snyder, C.: Paper Prototyping: the fast and easy way to design and refine user interfaces. Morgan Kaufmann, San Francisco (2003) 12. Xiangyu, J., Wenyin, L., Jianyong, S., Sun, Z.: On-Line Graphics Recognition. In: Proceedings of the 10th Pacific Conference on Computer Graphics and Applications, p. 256. IEEE Computer Society, Washington, DC (2002)
Beyond the User Interface: Towards User-Centred Design of Online Services Marcin Sikorski1, 2 1
Polish-Japanese Institute of Information Technology ul. Koszykowa 86, 02-008 Warszawa, Poland
[email protected] 2 Gdansk University of Technology Faculty of Management and Economics ul. Narutowicza 11/12, 809-952 Gdansk, Poland
[email protected] Abstract. This paper presents an attempt to identify those economic factors relevant to design of online services, which shape long-term customer satisfaction, as well as customer loyalty and business relationship with the service vendor. Using user-based studies and expert-based evaluations and major economic factors were identified as consumer needs. Also typical technical components of online services have been identified and prioritized as design elements, also relevant to satisfying economic needs of consumers of online services. As a result of this study will be delivered: a catalogue of design elements, design guidelines and economics-oriented design methodology for online services. Keywords: HCI, usability of online services, e-commerce, online customer behaviour, customer value.
1 Introduction and Background A growing number of online services available in everyday life (in both business and in private contexts, like financial or travel services) make them interesting object of research and design. Successful design of an online service is not only the issue of providing good usability and smooth user-system interaction, but it is also the problem of delivering appropriate business value to the customer [7]. This issue has been raised by Benyon et al. [1], who discussed interesting interplay between two design streams: HCI and usability engineering on one side, and services marketing on the other; a role of users shifting from website users to conscious consumers-decisionmakers has been also discussed. Demands of contemporary e-commerce and business require to include consumer behaviors and decisions as design drivers and – at the same time – as design constraints. Consumer needs should be converted into system attributes which delivers online service to the user, and its goal is to satisfy customer needs better than competition. J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 706–714, 2009. © Springer-Verlag Berlin Heidelberg 2009
Beyond the User Interface: Towards User-Centred Design of Online Services
707
The design questions in this context are: • how to identify customer needs and expected benefits? • how to identify system attributes (service components) delivering previously identified economic benefits? • how to deliver these benefits as system attributes (functions or components) to the user using the service website? • how to use the service website to to stimulate customer loyalty, to build a profitable relationship with customers, and to deliver long-term economic value, matching customer needs and lifestyles? Because usability is only a precondition to use the service, but perceived economic value of business relationship is nowadays presumed [3, 6, 8] to be the main attractor for regular using any specific online service, there is a need to expand existing methodology for online services design and to provide economics-relevant guidelines for designers of online services.
2 Current Approach to Design of Online Services Current approach to design of online services is nowadays a mixture of e-commerce models, website design guidelines and relationship marketing principles [5, 6, 8]. Guidelines produced by HCI (Human-Computer Interaction) research are helpful in designing user interface and in improving website usability; however they are not covering other aspects affecting user behaviour – those aspects reaching “beyond the user interface” like information content, costs, attractiveness, prestige, lifestyles, which rather refer to economic criteria of consumer decision-making and long-term business relationships. By now HCI has been seldom treating the user as a consumer, and the interactive system not only as a technical artifact, bus primarily as a business platform designed for satisfying customer needs (of mostly economic nature). Few HCI studies so far have adopted such a business perspective, apart from focusing on perceived value and worth [2] they were mainly relevant to designing e-commerce solutions and improving their usability with a focus on business performance. As a result, nowadays HCI methodology is challenged to enhance its scope as to include also those elements of user-system interaction – a business interaction, in a sense – which may be useful in developing a better user experience not only when directly using the website, but also in a broader sense and in a longer term: as developing valuable business relationship between the customer and the online service vendor.
3 Methodology 3.1 Research Framework, Foundations and Assumptions Foundations and assumptions for this research are originating from HCI domain as well as from service marketing literature, especially related to e-commerce, e-services and customer relationship management [3, 4, 6]. The importance of economic determinants of customer satisfaction, namely those related to factors “beyond the user interface” and plain usability, has been already discussed early studies on e-commerce [7].
708
M. Sikorski
Also former attempts to combine user needs with technical components that satisfy these needs provided foundations for quality management approach in improving the website as a whole (including quality of contents), not only quality of interaction. Therefore a more economic approach, oriented on improving a service website towards delivering economic benefits for customer and enhancing value of relationship, was used to address the stimuli received by customer from both on-line channels (Internet, mobile interaction) and off-line channels (printed ads, press reviews, opinions of other customers). As a research framework a model describing the gradual development of online services [10] was used to identify layers shaping different aspects of customer behaviour in on-line services (Table 1). Table 1. Levels of service quality and levels of customer focus in development of online services [10] Service quality focus
Value Relationship Experience Usability Functionality
Customer focus
Lifestyle Loyalty Delight Satisfaction Performance
Because customer needs cover various levels and various dimensions of expected service quality, they must have some specific technical realization on the specific website (for instance available as functions, buttons, catalogs, calculators, images, calendars, opinions, baskets, wish lists and other elements often available in ecommerce websites). In general, technical realization of an online service covers two dimensions: • analytical: system functionality and usability, to be used by a customer, • emotional an affective: user experience, to be perceived by a customer. As a result, a model describing interaction between the user and the service has been adopted, covering following technical components: • • • •
business model, leading to customer attitudes and behaviors, transaction process, leading to repeated transactions, interaction layer, facilitating ordering and payment, visual design layer, enabling perception and manipulation on user interface elements.
Categories of user requirements in online service will be treated as consumer needs and they have to be identified using expert evaluation, case studies of real systems, and primarily – user-based testing of online services. 3.2 Research Procedure Research procedure applied in this study was a sequence of several methods: • expert evaluations: several examples of actual online services (financial, ecommerce and tourist information services) were analysed by team of e-commerce and usability experts in order to extract:
Beyond the User Interface: Towards User-Centred Design of Online Services
709
main requirements of users – with focus on economic needs, main technical components of a service website – with focus on visual, transactional, business process and economic value elements; • user-based testing: consisted of three parts, aimed on gathering user requirements and expectations as well as on validating experts’ point of view: o pre-test questionnaires and requirements workshop: experienced users of online services (in three groups) were asked to fill in a paper questionnaire asking them to declare main categories of requirements and expectations when using on-line services, as well as difficulties experienced on familiar service websites; next these requirements were discussed with users, and grouped into several categories; in another questionnaire task users were asked to compare main categories pairwise as to priorities for main requirements could be calculated; o usability test of actual online service: after requirements questionnaire and workshop users had to perform a specific task using a real online service (financial, ecommerce or tourist information); the tasks were designed in a way stimulating economic thinking – users had to decide which option from a given set is the most attractive; o focus groups and post-test questionnaires : after completing the task with a specific online service, users were asked to fill-in a post-test questionnaire, in which they were asked to identify perceived benefits and experienced difficulties falling into four categories: • economic value and relationship • business transaction process • interaction support • visual elements • conceptual design based on the QFD methodology: o selection and analysis of data gathered from users and experts using affinity diagrams, relationship matrices and prioritization techniques commonly used in the QFD (Quality Function Deployment) methodology, adapted to analyze and improve usability of websites [9]. o o
The ultimate goal of this study was to gather enough data to form design guidelines aimed to address economic factors shaping user/customer behaviour not only in an operational dimension when using a specific service, but primarily in a long-term horizon, where total value perceived by customer as well as perceived worth of relationship determine customer loyalty and attitude to using a specific online service or an online brand.
4 Results 4.1 Expected Customer Benefits Main groups of customer needs were identified, in more detail recently described in [11], with assigned following priorities:
710
M. Sikorski
Critical:
Attractive price (F1)
Important:
Security of access to service (F5) Sufficient information provided (F4)
Medium:
Convenient access to service (F3) Short time of website operation (F6) Low additional expenses (F2)
Less important: Low physical workload (F7) Low mental workload (F8) Marginal:
Benefits from marketing incentives (F9)
Priorities computed from user data (gathered from interactions with three types of online services) showed also that marketing incentives (like freebies, gadgets etc.) seem to have practically no influence on attracting users to use on-line services. Physical and mental workload also received surprisingly low priorities, however their meaning should be related not only to fatigue when operating the website, but also to more general convenience-related issues. 4.2 Technical Attributes of an Online Service Post-test questionnaire delivered many qualitative data, which allowed to identify about 60 technical factors (service attributes, functions and components) which users declared as having impact on perceived quality in relation to website, service and to business relationship with a specific vendor. These elements were classified into four layers with selected elements listed below: V – Visual user interface layer easily readable screen elements, websites screen layout consistency, ongoing visual support (visual tips, flags, icons, maps etc). I – Interaction layer easy navigation, automated operations, multiple views of the product. P – Business process layer search support, comparing support, logical sequence of steps, confirmations in operations (instant feedback, feel of control), payment transaction flow, user profile, login etc., tracking service progress (insight into the process). R – Business relationship and economic value layer always attractive prices (attractive prices every day), preview of basket, preview of payment, product configuration customization,
Beyond the User Interface: Towards User-Centred Design of Online Services
711
recommendations and customer opinions, choice of methods for placing orders, choice of methods for payment, choice of methods for delivery, fair terms of use, assurance about security, credibility and trust, related offers, special offers for special customers, customer programs, freebies and marketing incentives.
The section “R”, covering business relationship and economic value layer was especially interesting due to the purpose of this study, and it contained more than 20 items declared by users as having impact not only on buying decisions but also on shaping long-term relationship with the service vendor. 4.3 Identifying Relationships Data gathered from users were aggregated by experts and collected in a QFD matrix in order to search for possible relationships between user requirements and technical factors. Due to a relatively big amount of user requirements and technical components a preliminary selection was made, and the analysis based on a frequency of users declarations. Priorities were calculated using data from pairwise comparisons from user studies, as well as expert estimations relevant to criticality of given technical factors to a business success of a hypothetic online service of a specific sort. Taking into consideration relationships with identified consumer needs local priorities were computed within each category (Visual, Interaction, Process, Relationship). Fig. 1. shows priorities for selected technical factors within the most interesting section, namely Relationship, covering components which shape economic value for customer. These priorities for technical factors have been computed using typical QFD sorting formula (see [9] for details), which combines importance of each identified customer requirement and relationship of this requirement with technical factors (like elements of service website), assigned to be “responsible” to satisfy each of customer requirements. Results shown in Fig. 1 show that presence of authentic customer recommendations and external evidences of credibility gained top priorities. Next, customers appreciate ability to check availability of service (like amount of seats still left) as an important insight into the service process, with clearly written conditions for guarantee and terms of use for service and website. Remaining technical factors relevant to economic value are shown in Fig. 1., but will be not further discussed here. Finally, all (exactly 64) technical factors have been classified according to QFD sorting formula, presenting local priorities within remaining three other categories (Visual, Interaction and Process). Because this study was intended as qualitative only, and pragmatically aimed to deliver design guidelines for developing online services (basing on user-based testing and expert aggregations of user data), no attempt was yet made to use statistical tools to prove generality of obtained results.
712
M. Sikorski
Customer opinions and recommendations Visible prizes/logos of trust and credibility Instant checking of service availability Written conditions for guarantee and terms of use Available privacy protection policy Available multiple methods of payment Off-line contact available (toll-free phone, fax etc) Available multiple methods of delivery Online service available 0-24 Always competitive price Available multiple forms of order Access to associated offers Price reductions for regular customers User account, user profile, shopping history Newsletter for regular customers Freebies: leaflets, comms, multimedia, games, etc. 0
0,5
1
1,5
2
2,5
3
3,5
4
Fig. 1. Priorities for selected technical factors relevant to economic value and customer relationship in online services
Instead, a graphical analysis of relationships among user requirements (43 items finally grouped into 6 categories of customers needs) and technical factors (64 items, grouped into 4 categories V, I, P, R) was made. Because the data gathered from users were collected using three different types of online services, as usual in user-based testing user preferences and biases are bound to affect the raw results. For this reason expert analysis was also applied to aggregate the data – this brought surely some subjectivity to the assessments, but helped in extracting the data relevant to the context of online services of a specific type. Table 2 shows that multiple relationships have been found among (1) identified categories of customer needs and (2) identified categories of technical components. Table 2. Categories of consumer needs and related categories of technical components (1) Consumer needs related to:
Price Convenience Information demand Security Time and speed Workload
(2) Technical components
Visual elements Interaction elements Transaction process elements Economic value/relationship
Beyond the User Interface: Towards User-Centred Design of Online Services
713
Classification of technical components is not yet finished, but intended result will be available as a catalogue of design elements (“online service primitives”) which can be used to compose online services and service websites in a variety of business contexts. Validation of this approach is planned as an user-based testing of online service prototypes, and observing users behaviour (in short-term as a willingness to buy and in long-term as a willingness to maintain a specific business relationship) will offer suggestions for developing a more general design methodology for online services. 4.4 Prospective Design Methodology Prospective design methodology resulting from this project is aimed to address also business factors (beyond the user interface) by: • relating specific user requirements (consumer needs), • relating service attributes (technical components) with economic values, perceived benefits and profitable business relationships. It is also aimed to convert usability perspective in design an online service form website usability perspective to a wider perspective, clearly stating that a service website or a mobile device is only a delivery platforms for a more generic business solution – a specific online service. 4.5 Discussion Presented approach although mostly qualitative, allowed to identify: • 43 user requirements declared by a sample of online service users, • 64 technical factors to be used when planning technical realization of a service website. The items were grouped in several categories, helpful in developing design guidelines for online services. Limitations and constrains of this study are at least the following: • different types of online services were analysed together, basing on imperfect assumption that there is a common pattern of consumer behaviour underlying a general use of online services; • users’ behaviour and data were far from homogeneous, though bringing difficulties in averaging obtained data and extracting the most representative parts; • currently, a lack of convenient visual notation for mapping 43 user requirements against 64 technical components makes presenting results difficult (large spreadsheets full of formulas), what reduces usability of eventual results.
5 Conclusions and Further Work This approach, despite of advantages resulting from involving real users, has major limitations resulting from subjectivity of obtained data; users’ declarations are always subjective as raw data, and even if aggregated by experts, they are still subject to experts’ subjectivity in interpreting recorded observations.
714
M. Sikorski
Within this project further experiments with prototypes of online services are intended to prove if design guidelines stemming from this study are representative and valid enough for practical design contexts. Because the shift in HCI design focus from the user interface, via service website to economic value and business relationship seems to be inevitable, this study – however full of limitations – is believed to present a novel attempt towards broadening HCI design methodology for design challenges of future online services. Acknowledgments. This paper has been prepared with support of the research grant 34/N-COST/2007/0 from the Polish Ministry of Research and Higher Education.
References 1. Benyon, D., O’Keefe, B., Mival, O.: Human-Centred Design of Interactive Services. In: HCI and the Analysis, Design and Evaluation of Services, http://www.eng. cam.ac.uk/~pw308/workshops/HCI&Services/Papers/service%20des ign%20w_shop.pdf 2. Cockton, G.: Designing Worth is Worth Designing. In: Proceedings of NordiCHI 2006, pp. 165–174 (2006) 3. Cheung, C.M.K., Zhu, L., Kwong, T., Chan, G.W.W., Limayem, M.: Online Customer Behaviour: A Review and Agenda for Future Research. In: Proceedings of the 16th Bled eCommerce Conference eTranformation, Bled Slovenia, June 9-11, 2003, pp. 194–218 (2003) 4. Dahlberg, T., Mallat, N.: Mobile Payment Service Development – Managerial Implications of Consumer Value Perceptions. In: Proceedings of ECIS 2002, Gdańsk, Poland, June 6-8, 2002, pp. 649–657 (2002) 5. van Dijk, G.: HCI Informing Service Design, and Visa Versa. In: HCI and the Analysis, Design and Evaluation of Services, http://www.eng.cam.ac.uk/~pw308/ workshops/HCI&Services/Papers/Servicedesign_paper_GvD_ 190708.pdf 6. Fui-Hoon Nah, F., Davis, S.: HCI Research Issues in e-Commerce Design. Journal of Electronic Commerce Research 3(3) (2002), http://www.csulb.edu/journals/ jecr/issues/20023/paper1.pdf 7. Lee, J., Kim, J., Moon, J.Y.: What Makes Internet Users Visit Cyber Stores Again? Key Design Factors for Customer Loyalty. Proceedings of CHI 2000 Letters 2(1), 305–312 (2000) 8. Rust, R.T., Kannan, P.K.: E-service: A New Paradigm for Business in the Electronic Environment. Communications of the ACM 46(6), 37–42 (2003) 9. Sikorski, M.: Zastosowanie metody QFD do doskonalenia jakosci uzytkowej serwisow WWW (Application of QFD Method for Improving Usability of WWW Sites, in Polish). Zeszyty Naukowe Politechniki Poznanskiej, seria: Organizacja i Zarzadzanie, vol. 35, pp. 13–24 (2002) 10. Sikorski, M.: HCI and the Economics of User Experience. In: Law, E., Hvannberg, E., Cockton, G. (eds.) Maturing Usability, pp. 318–343. Springer, London (2008) 11. Sikorski, M.: From user satisfaction to customer loyalty: addressing economic values in user-centered design of on-line services. In: Proceedings of the COST 298 Conference The Good, The Bad And The Challenging: The user and the future of information and communication technologies, Copenhagen, May 13-15, 2009 (in print)
Designing for Change: Engineering Adaptable and Adaptive User Interaction by Focusing on User Goals Bruno S. da Silva, Ariane M. Bueno, and Simone D.J. Barbosa Departamento de Informática, PUC-Rio R. Marquês de São Vicente, 225 Gávea, Rio de Janeiro, RJ, Brasil, 22451-900 {brunosantana,abueno,simone}@inf.puc-rio.br
Abstract. In the human-computer interaction area, research work in end-user programming, end-user development, and user or system-driven adaptation of interactive systems has attempted to cope with variations in users’ intents, context changes and evolutions. In the field of requirements engineering, research that addresses similar issues has been called variability analysis. Most work in variability analysis, however, focuses on prioritizing one or few possible solutions to be implemented in the final product, whereas in humancomputer interaction many researchers advocate that we should strive to enable users to adjust and adapt the product as needed. This paper presents an approach to bring the results obtained in requirements engineering to inform the choice of interaction design solutions to cope with variability. Keywords: Variability analysis, interactive systems adaptation, bridging requirements engineering and interaction design.
1 Introduction Despite the research effort to deal with differences and variations among users and devices in human-computer interaction (HCI), even today we lack a systematic approach to deal with variations that reflect differences in the context of use, in user goals, needs, preferences, and strategies to achieve them. Research focusing on requirements analysis calls our attention to the importance of variations among users, devices and contexts and which propose techniques to analyze them [10, 11, 12, 16, 20]. In HCI, on the other hand, many researchers have proposed strategies to deal with variations at interaction time, typically with flexible, adaptive and adaptable solutions [15, 17, 18, 19, 26]. Nevertheless, there is still a gap between the identification of variability in the problem space and the definition of adequate solutions for coping with variability in the solution space. In requirements engineering, HCI-related concerns are typically treated as softgoals, which can be achieved by selecting, at design time, certain task decomposition, based on the user and business priorities identified by the requirements engineer [3, 13]. This means that, if users’ priorities and softgoals change during system usage, the system will not be able to accommodate them, and may therefore become inefficient or even obsolete. The research area of intelligent J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 715–724, 2009. © Springer-Verlag Berlin Heidelberg 2009
716
B.S. da Silva, A.M. Bueno, and S.D.J. Barbosa
user interfaces, in particular on adaptable and adaptive systems [15], aims to accommodate changes in needs and priorities during interaction, thus rendering the system more efficient and useful for a longer period of time. Most work in adaptable and adaptive user interfaces, however, focuses on the specification and implementation of the adaptation mechanisms. There is little work on how to decide which adaptation strategy to adopt to deal with certain identified variations, based on the analysis of requirements and user needs. More specifically, there is little work on designing adaptation based on contextualized user goals (identified during analysis), as opposed to specific ways of achieving them (which already result from design decisions that, if made too early, may unnecessarily restrict the possible adaptation strategies). We have been working on a variability analysis approach based on the users’ discourse that characterizes their goals [24, 25]. In those previous works, the focus was to separately explore the problem and the solution spaces of the HCI design activity related to differences and variations among HCI concerns. In this work, we relate the dimensions analyzed in the problem space and the corresponding strategies deemed appropriate to accommodate change in the solution space during interaction time. This paper is organized as follows: the next section describes our account of the problem and solution spaces. The third section describes how variability is considered in each space. Next, the paper presents the proposed relations between the two spaces, and the fifth section presents some concluding remarks.
2 The Problem and Solution Spaces The problem space may at first be characterized by the people who will use or benefit from the system, and, as in semiotic engineering, what they “want and need to do, in which preferred ways, and why” [4]. Moreover, the contexts [6] in which user activity occurs is also part of the problem space. Variations can be found in the heterogeneity of the user population, their goals or the context for their activities, as well as their evolution in time. Regarding users, it is important to investigate the psychological characteristics (e.g. attitude, motivation, preferences), knowledge and experience (e.g. typing skill, task experience) and physical characteristics (e.g. color blindness, hearing difficulty). Regarding their goals, it is important to examine the user goals when they perform their activities, the frequency and importance of the activities, and the main artifacts and objects used in performing them. Concerning the context of the activity, it is important to study when and where the goals need to be/are achieved: the time divisions (e.g. hours and days) or intervals (e.g. seasonal intervals) which are relevant to user goals and related activities; the set of places (e.g. home and work) or a hierarchy of places (e.g. a room at a university, a campus, a city, and so on); the physical environment (e.g. open/close work areas, lighting, heat, noise level, distractions and interruptions); the social and cultural environment (e.g. in broad terms, morale, motivation, values and policy, or, in specific terms, the possibility of learning to use the system with colleagues, cases when users are pressured to go fast, and this culture works better with uncertainty than others) [11, 23].
Designing for Change: Engineering Adaptable and Adaptive User Interaction
717
To understand the design solution space, from an HCI perspective, we need to understand what goes on during the use of a system. The semiotic engineering theory of human-computer interaction [4] brings to our attention that, as intellectual artifacts, every software [5]: • linguistically encodes both a particular understanding or interpretation of a problem situation and a particular set of corresponding solutions; and • is designed with the goal that users will be able to formulate and express their intents within the linguistic system encoded in the artifact. By linguistic, de Souza means that the artifact encoding is “based on a system of symbols—verbal, visual, aural, or other—that can be interpreted by consistent semantic rules” [5]. In semiotic engineering, Jakobson’s communication model ([13], Fig. 1.) is used to illustrate human-computer interaction phenomena and as thus it provides a basis for defining the solution space. context contact (channel) message addresser (sender)
addressee (receiver) code
Fig. 1. Jakobson's communication model
As with other semiotic approaches, semiotic engineering views HCI as a particular kind of computer-mediated human interaction [4]. It views software as a metacommunication artifact, i.e., a (meta)message produced by the designer about the communication that may take place when users communicate with the message itself at interaction time. The user interface is said to be the “designer’s deputy”, in the sense that it encodes a range of meanings, meaning manipulations, and design principles that the designer chose to synthesize in the product. The MoLIC language, Modeling Language for Interaction as Conversation, was devised to help designers elaborate the metamessage [1] at a dialogue level, allowing them to represent and reflect on: the user’s goals or intents supported by the system; the conversations (i.e. sequences of illocutions and turn taking performed by the user and the designer’s deputy) through which the users may achieve their goals; restrictions on the utterance of certain illocutions according to the context of the conversation (or the world); the perlocutions or effects of each (segment of) conversation; illocutions aimed at repairing communicative breakdowns that the designer is able to anticipate; and the signs1 contained in the illocutions. Thus, we may say that MoLIC further details the solution space with respect to human-computer interaction. MoLIC does not, however, represent the concrete user interface, which is also part of the solution space. In this paper, we do not deal 1
Peirce defined sign as “anything that stands for something else, to somebody, in some respect or capacity” [22].
718
B.S. da Silva, A.M. Bueno, and S.D.J. Barbosa
extensively with the user interface itself. When necessary, we only point to some forms of user interface adaptation, without detailing how it can or should be represented. In this paper, we propose to classify the signs in the solution space in three groups: • object signs, which represent concepts, entities or things; • task signs, which represent actions that manipulate the object signs; and • user interface signs, which represent the user interface elements that refer to objects and tasks. As we will see in the fourth section, variations in each group of signs (as a result of the requirements elicitation and analysis) will point to different interactive solutions to cope with these variations.
3 Exploring Variability Changes in the problem space may require changes in the solution space. In order to explore variability in the problem space, we follow the variability analysis process defined in [24] and refined in [25], which comprises the following steps: 1. Elicit information about domain, user goals, users, context of use and system (possible hardware and infrastructure). 2. Identify goal-directed user requests. 3. Identify and describe the signs present in the user discourse about their domain, goals, and tasks. 4. Rewrite user requests using cases. 5. Organize signs in an ontology. 6. Explore possible variations by expanding user requests. 7. Explore possible variations by substituting signs in user requests. The data collected in (1) are typically answers to the general 5W2H questions: who, what, when, where, why, how, and how much. In addition, issues of time could combine when and how much to generate questions about how often and for how long. Who participates in the interaction process? Examining the interaction process, one realizes that both user and system participate in it. Regarding the users, information about their skills and preferences [11, 12, 23] need to be elicited, as well as any constraints and special needs they may have. Regarding the system, information about the available hardware platforms (desktop, laptop, PDA, cell phone, etc.), input and output devices (mouse, keyboard, pen, etc.), and infrastructure (network access, disk space, etc.) are necessary. It is important to note that, at this stage, no design or implementation decisions are made. Instead, possibilities are being elicited that will help anticipate variations. Only later should these aspects drive design decisions. What are the participants’ goals? The users’ goals (i.e. the expected results of their interaction with the system) are traditionally investigated during requirements engineering [3, 11, 1420, 20]. The “system’s” goals, on the other hand, are a product of the designers’ work to support the users’ goals, and thus are designed in later stages of development process.
Designing for Change: Engineering Adaptable and Adaptive User Interaction
719
When, where, and in which contexts will goals be achieved? Besides common known time divisions, such as minutes, hours, days, months, and so on, the requirements engineer should investigate other relevant time divisions or intervals, such as seasonal intervals. The interaction can occur in a set or hierarchy of places, such as home and work, or university. Compared to the DOM structure of a TABLE element, our customized structure can be traversed with much lower cost. 3.3 Controller Based on the re-constructed structure MPi, the controller creates a customized user interface supporting interactive access to the corresponding genuine web table GTPi. We implement the controller as a floating toolbar with adjustable transparency, which Table 2. Functions provided by iWebTable No.
Name
Description
F1
Ascent Sorting
Sorting data of the entire table ascendingly ordered by the data of current column or row.
F2 Descent Sorting
Sorting data of the entire table descendingly ordered by the data of current column or row.
F3
Initial Order
Restoring data of the entire table to initial order.
F4
Hiding Info
Hiding current column or row.
F5
Showing Info
Showing current column or row next to current one.
F6
Restoring All
Restoring the table to the initial state.
Enabling Interactive Access to Web Tables
765
automatically appears when the cursor is hovered on a header cell. The toolbar is consisted of six buttons, as illustrated in Figure 1. Table 2 presents the function of each button. By pressing these buttons, users can control the appearance of the table in different ways without changing the layout of the web page.
4 Experiment The experiment is designed to evaluate the effectiveness of iWebTable on enabling interactive access to genuine web tables, by two steps: − Step 1. Setting up the pre-processor of iWebTable. − Step 2. Conducting a user study on iWebTable. 4.1 Pre-processor Set Up We set up the pre-processor by: 1) constructing the training data set for learning genuine web tables; 2) selecting a machine learning method used by the classifier. The training data set is constructed from 3807 tables in 200 common web pages. By manual classification, we found that genuine tables accounts for only about 13.13% (500/3807) of all web tables, which is similar to conclusions in [12]. We tested on three different machine learning methods by conducting 10-fold validations, including: Naïve Bayes, SVM with linear kernel and SVM with RBF kernel. Experimental results show that SVM with RBF kernel performs the best, with precision, recall and F-measure achieving as high as 95.81%, 95.98% and 95.89% respectively. Therefore, we decide to use it to learn genuine web tables. 4.2 User Study The user study is aim to testify whether iWebTable is helpful for browsing genuine web tables and to evaluate its usability. We assume that iWebTable can greatly reduce users’ effort in typical cases as: 1) search for certain information in a large web table; 2) compare data in distant columns or rows. Based on this assumption, we designed two scenarios and assigned a specific task in each scenario, as showed in Table 3. Table 3. Scenarios and tasks designed for user study No.
Scenario
Task
1
You are browsing a 5-column genuine web table which displays the top 50 popular songs ordered by their rankings, with titles showed in col. 2 and the hyperlinks for trial audition showed in col. 5.
Given the titles of three songs (two are listed in the table), to find out whether they are top 50 and click on the hyperlinks for trial audition.
3
You are browsing a 10-column genuine web table which displays the train schedule between two specified stations, with train number showed in col. 1 and detailed information showed in others.
Given the numbers of two trains (in two distant rows), to compare their detailed information and make a selection between them.
766
X. Yang, W. Xu, and Y. Shi
We recruited 5 participants, all of whom were Chinese graduate students familiar with web browsing on desktops using Microsoft Internet Explorer. They were showed a demo on how to use iWebTable in advance and then were required to complete each task in two ways: 1) firstly, with iWebTable disabled; 2) secondly, with iWebTable enabled. Table 4 shows the quantified results calculated from automatically recorded timestamps of relevant operations, with F1-F6 denoting the six functions presented in Table 2. In the first task, which requires searching for certain information in a 5*51 web table, the functions for sorting data (F1 and F2) are frequently used. While in the second task, which requires comparing data in two distant rows, the function for hiding information (F4) is most frequently used. In both tasks, participants spent much less time when iWebTable is enabled. Finally, each participant was orally interviewed about: 1) whether satisfied with each of the six functions; 2) which function is most useful; 3) the usability of the user interface. All participants reported high satisfaction with each function and agreed that F1 and F2 are the most two useful functions. However, they thought that the usability of the user interface should be improved. For an example, the icons of F4 and F5 cannot express the corresponding functions well. Results above demonstrates the effectiveness of iWebTable on enabling interactive access to genuine web tables, especially in such cases as sorting data in large tables and comparing data in distant columns or rows. On the whole, users are satisfied with iWebTable and really enjoy the interactive access mode to web tables. Table 4. Quantified results of user study
Task 1 Task 2
Average times of using F1-F6 / per person F1
F2
F3
F4
F5
F6
Average Time Spent / per person
Disabled
-
-
-
-
-
-
50.406s
Enabled
1.0
1.2
0.2
0.8
0.0
0.2
24.628s
Disabled
-
-
-
-
-
-
39.003s
Enabled
0.8
0.4
0.4
1.8
0.4
0.2
26.869s
Status of iWebTable
5 Discussions and Future Work We propose to enable interactive access to web tables based on automatic table detection and a good understanding of table contents. We designed and implemented a plug-in for the Microsoft Internet Explorer, called the iWebTable, which enables interactive access to genuine web tables by providing: 1) presentation re-rendering functions like hiding unnecessary rows or columns; 2) advanced functions like sorting data of the entire table ordered by the data of current row or column. In spite of the promising experimental results, iWebTable has two limitations. First, it cannot handle web tables not presented by HTML TABLE elements, such as
Enabling Interactive Access to Web Tables
767
formatted by CSS. This is the instinct drawback of detecting web tables by using HTML code. Second, as reported in the oral interview, its user interface needs to be improved. For an example, the icons for toolbar buttons should be more expressive. In the future, we will improve the user interface of iWebTable based on suggestions collected from oral interviews in user study. We also would like to extend the functions of iWebTable and migrate it to small screens. Acknowledgements. This paper was supported by Specialized Research Fund for the Doctorial Program of Higher Education (No. 20050003048).
References 1. Asakawa, C., Itoh, T.: User Interface of a Nonvisual Table Navigation Method. In: ACM SIGCHI Conference on Human Factors in Computing Systems (CHI 1999), pp. 214–215. ACM Press, New York (1999) 2. Cohen, W.W., Hurst, M., Jensen, L.S.: A Flexible Learning System for Wrapping Tables and Lists in HTML Documents. In: 11th International Conference on World Wide Web (WWW 2002), pp. 232–241. ACM Press, New York (2002) 3. Gaterbauer, W., Bohunsky, P., Herzog, M., Krüpl, B., Pollak, B.: Towards Domain-Independent Information Extraction from Web Tables. In: 16th International Conference on World Wide Web (WWW 2007), pp. 71–80. ACM Press, New York (2007) 4. Hassan, T., Baumgartner, R.: Table Recognition and Understanding from PDF Files. In: 9th International Conference on Document Analysis and Recognition (ICDAR 2007), Washington, DC, USA, pp. 1143–1147. IEEE Computer Society Press, Los Alamitos (2007) 5. Hurst, M.: Classifying TABLE Elements in HTML. In: 11th International Conference on World Wide Web (WWW 2002), Poster Paper (2002) 6. Krüpl, B., Herzog, M.: Visually Guided Bottom-Up Table Detection and Segmentation in Web Documents. In: 15th International Conference on World Wide Web (WWW 2006), pp. 933–934. ACM Press, New York (2006) 7. Lim, S.J., Ng, Y.K.: An Automated Approach for Retrieving Hierarchical Data from HTML Tables. In: 8th ACM International Conference on Information and Knowledge Management (CIKM 1999), pp. 466–474. ACM Press, New York (1999) 8. Liu, Y., Bai, K., Mitra, P., Giles, C.L.: Automatic Searching of Tables in Digital Libraries. In: 16th International Conference on World Wide Web (WWW 2007), pp. 1135–1136. ACM Press, New York (2007) 9. Pinto, D., McCallum, A., Wei, X., Croft, W.B.: Table Extraction Using Conditional Random Fields. In: 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2003), pp. 235–242. ACM Press, New York (2003) 10. Ramel, J.Y., Crucianu, M., Vincent, N., Faure, C.: Detection, Extraction and Representation of Tables. In: 7th International Conference on Document Analysis and Recognition (ICDAR 2003), Washington, DC, USA, pp. 374–378. IEEE Computer Society Press, Los Alamitos (2003) 11. Tajima, K., Ohnishi, K.: Browsing Large HTML Tables on Small Screens. In: 21st Annual ACM Symposium on User Interface Software and Technology (UIST 2008), pp. 259–268. ACM Press, New York (2008)
768
X. Yang, W. Xu, and Y. Shi
12. Wang, Y.L., Hu, J.Y.: A Machine Learning Based Approach for Table Detection on The Web. In: 11th International Conference on World Wide Web (WWW 2002), pp. 242–250. ACM Press, New York (2002) 13. Yang, Y.C., Luk, W.S.: A Framework for Web Table Mining. In: 4th ACM CIKM International Workshop on Web Information and Data Management (WIDM 2002), pp. 36–42. ACM Press, New York (2002) 14. Yoshida, M., Torisawa, K., Tsujii, J.: A Method to Integrate Tables of the World Wide Web. In: 1st International Workshop on Web Document Analysis (WDA 2001), pp. 31–34 (2001)
Integration of Creativity into Website Design Liang Zeng1, Robert W. Proctor2, and Gavriel Salvendy1,3 1
School of Industrial Engineering, Purdue University, 315 N Grant St, West Lafayette, IN 47907, USA 2 Department of Psychological Sciences, Purdue University, 703 Third Street, West Lafayette, IN 47907, USA 3 Department of Industrial Engineering, Tsinghua University, Beijing 100084, P.R. China
[email protected],
[email protected],
[email protected] Abstract. The desire to achieve a sound design of a product and its interaction with humans runs the gamut from the simplest hardware products to the most complex information technology systems. This paper proposes a conceptual framework highlighting the central role of creativity in ergonomic design of websites. The integration of creativity helps to achieve synergy of the three dimensions in ergonomic design: functionality, usability, and affectivity. A factor structure of website creativity is further discussed in terms of its relation to the ergonomic design framework. Suggestions for the realization of website creativity are provided, and future research directions are discussed. Keywords: Creativity, e-commerce, information technology, website design.
1 Introduction The first two technological revolutions (the steam engine revolution of the early 19th century and the electricity/internal combustion revolution of the early 20th century) boosted the variety of human products, brought prosperity to the commercial market, and fueled development of the economy. Though the impacts of those revolutions were substantial, they seem minor compared to that of the third technological revolution, with its hallmark of Information Technology (IT). The inventions of computers, the internet, and various IT products have enhanced the quality of life in many respects. The advent of the information age makes it possible for people to enjoy extensive interactions with IT products and services. The most salient characteristics of IT products are that they are generally more complex than ordinary consumer goods and provide intangible and dynamic services. It is those traits that distinguish IT products from traditional hardware goods [1]. The advent of IT also greatly intensifies market competition and calls for sustained generation of creativity. Companies need to launch innovative products and services in order to beat their rivals. In today’s commercial market, with increasingly heightened competition, a corporation merely providing products and/or services like those developed by others finds it progressively more difficult to increase its market share. The J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 769–776, 2009. © Springer-Verlag Berlin Heidelberg 2009
770
L. Zeng, R.W. Proctor, and G. Salvendy
continuous power of creativity calls for the recognition that it is one of the most important determinants of a company’s competitive advantage. Without creativity, long-term failure is a near certainty [2-4]. As a representative of IT products and hypermedia computer-mediated environments, web-based applications and services now play a pivotal role in everyday social life. The World Wide Web (WWW) has risen to be the most versatile mass medium and also a global platform used by individuals, organizations, and governments throughout the world [5]. Over the past decade, the number of businesses that have established their presence on the WWW has been skyrocketing. The websites of companies, institutions, and universities serve as their online shop front. People interact with those websites via the Internet and form their first impressions. All these entities intend to attract prospective customers, partners, or students by making their websites appealing to the target groups [6]. Commercial applications have been developed to explore the capabilities provided by websites. In 1996 over half of the 200 most heavily advertised brands in the United States were represented with websites [5]. Netcraft, an Internet monitoring company that has tracked website growth since 1995, reports that in January 2007 there were 106,875,138 websites with domain names and content on them, compared to merely 18,000 websites in August 1995. Websites, acting as hubs of communication, entertainment, and commerce, have played a major role in daily life. The main focus of this article is on the design of creative and successful websites, which have the power to attract customers and influence their purchase intentions. A conceptual framework regarding the role of creativity in website design is proposed. Then, factors for website creativity are discussed in terms of its relationship with the conceptual framework.
2 The Role of Creativity in Website Design The desire to achieve a sound website design with which humans can easily interact has gained much attention in the area of Human-Computer Interaction (HCI). Consumers are becoming more and more demanding, and thus how to win those increasingly picky customers has become a key to market success [7]. Traditional ergonomic design of websites typically embraces two fundamental dimensions, functionality and usability (see Fig. 1). First and foremost, the design of a website should satisfy its functional needs. Functionality is mainly concerned with what a website provides in practical terms, the level at which it performs its functions, and the different options it offers the customer [7]. Sound functional design can thus be regarded as a hygiene factor, according to Herzberg’s [8] motivation theory. The absence of functionality causes user dissatisfaction, yet the presence of good functionality may be taken for granted by customers and not necessarily lead to user satisfaction. A website with sound functionality should also be easy to use. The importance of usability in website design and more broadly in HCI design has long been emphasized in the human factors and ergonomics literature [9]. Diversified instruments have been developed to evaluate usability and make a website easy for human use. Usability can be considered to be a combination of both hygiene and motivational factors, since it may foster user satisfaction.
Integration of Creativity into Website Design
771
Fig. 1. Integration of Creativity in Website Design
Nonetheless, mere design for performance and usability is no longer sufficient. In other words, a website satisfying these two fundamental aspects is still far from being competitive in the market’s intense competition. Not only must the website perform its functions well and be easy to use, but it should also support an overall experience attractive and pleasant to customers, which calls for considering the user’s affect and pleasure. The significance of affective design (a motivational factor) has been highlighted in the literature, and it is suggested that the three dimensions (functionality, usability, and affectivity) of ergonomic design should go hand in hand to create a sound product or service [7, 10, 11]. However, there is still large room for improvement beyond meeting the aforementioned design requirements, and creativity is the catalyst that can make such difference. Searching for ways of achieving a sound website design can be characterized as a problem-solving activity [9]. The literature in creative cognition suggests that creativity involves viewing a problem from various perspectives at different abstract levels, each leading to a specific problem formulation [12-14]. The three dimensions of website design requirements imply general objectives that can be clearly defined and formulated as problems in accordance with a specific design context. Creative problem solving involves exploration and transformation of conceptual spaces, and creativity is germinated when previously unrelated patterns of thought get connected to produce a creative product [1, 15]. There is usually more than one method meeting design goals. Effectively applying creative problem-solving techniques (brainstorming, morphological analysis, etc.) can facilitate individual or team cognition in generating varieties of design alternatives, which ultimately leads to creative products that better satisfy those specified design goals by way of iterative ideation-evaluation phases [16-18]. Therefore, creativity is an integral part of the design process. By integrating creativity into website development can a designer realize synergy of those three dimensions of ergonomic design and come up with a creative website that functions better, is easier to
772
L. Zeng, R.W. Proctor, and G. Salvendy
use, triggers more of the customer’s pleasure, and thus becomes more commercially competitive (see Fig. 1). Heed was once only paid to issues regarding functionality and usability, but further requirements regarding creativity and desirability have now emerged. There is a need to appreciate the sum total of the customer’s experience with the product and service, rather than just focusing on utility and usability [10]. Previous research indicates that creativity matters – creativity has a halo effect that renders the product more appealing to customers. Research in web design has found that a creative website interface is preferred by users even if its usability is degraded [19]. Therefore, creativity is an important source adding both supplemental and substitutional value to a website [20, 21]. Yet, it is the supplemental value creativity affords that should be emphasized. When a website is creatively designed in terms of functionality, usability, and affectivity, it can achieve larger success. The ever-changing market demand structure galvanizes a shift from the product-based to value-based competition, and thereby creativity (being an important value-adding source) should be addressed in the development of the full spectrum of web-based applications and services. Developing novel and useful web-based applications and services to achieve market success is the key objective for most corporations in web business. The nature of this pursuit is a subjective endeavor. Never can there be a website that is considered to be the best and well-liked by everyone. That is the nature of website design, where every individual or team could come up with a different idea, and the success of a design cannot be evaluated fully until it is released to the target market. It is creativity that serves as the engine propelling the never-ceasing evolution of websites. The designer’s goal should be to develop websites that are novel and appropriate, and which fit the application context well [1, 22]. Horn and Salvendy [11, 20, 23] found that creativity of traditional hardware products could considerably increase the consumer’s purchase intention. Subsequent research of IT product creativity further reveals that website creativity can also shape the consumer’s web use behavior as well as purchase intention when visiting an e-commerce website [21]. It is proposed that website creativity will increase the site’s perceived attractiveness, ease of use, and usefulness, which ultimately predict the user’s behavioral intention to visit, remain on, and revisit the website, as well as the user’s satisfaction and the site’s profitability [1, 21, 24]. So, to sum up, creativity plays a pivotal role in website design. Ergonomic design of websites integrating creativity helps to realize synergy of functional, usable, and affective design, with the aim of developing creative websites that are more creative, appealing, and commercially viable.
3 Factors Influencing Website Creativity Creativity can be defined in terms of creative personality, process, product, and press (the environment or context where creative productions are produced), depending on specific interest of the study and the associating context [1, 15, 20]. This paper takes an outcome-based perspective and adopts a definition of website creativity as “the subjective judgment of a website to exhibit novelty and appropriateness that elicits arousal and pleasure and is compatible with the user’s preferences” (21, p. 568).
Integration of Creativity into Website Design
773
Aesthetic Appeal
Personalization
Interactivity
Website Creativity Commonality and Simplicity
Novelty and Flexibility
Importance
Affect
Fig. 2. Factor Structure of Website Creativity
Horn and Salvendy [11, 20, 23] developed a factor structure of traditional hardware product creativity. Due to the fact that IT products (and especially websites) differ appreciably from traditional hardware products in various aspects, a factor model that better captures such differences is indispensable for thoroughly studying this concept and providing design guidelines boosting website creativity [1]. A factor structure for website creativity has been developed, with seven key factors/dimensions: Aesthetic Appeal, Interactivity, Novelty and Flexibility, Affectivity, Importance, Commonality and Simplicity, and Personalization (see Fig. 2), explaining 63% of the total variance associated with website creativity items. This factor model has both construct validity and predictive validity, and 62% of the total variance regarding the prediction of the user’s overall preference towards creative websites was explained by significant factors [21]. Aesthetic and visual virtues add to the appeal of websites [1, 25, 26]. Interactivity is one of the most important determinants for excellent website design, and the increase of interactivity positively affects the user’s “perceived satisfaction, effectiveness, efficiency, value, and overall attitude” towards the website ([27], pp. 281, [28-30]). A third dimension, novelty and flexibility, reflects the dynamism of web services. Novelty is considered as a crucial determinant of creativity [20]. Flexibility implies continuous updates and management of the website’s configuration, content preparation, and
774
L. Zeng, R.W. Proctor, and G. Salvendy
interaction mode so as to achieve continuous improvement and foster the generation of creativity [31-35]. It is such continuous improvement that guarantees the novelty or originality of the website. A fourth dimension, affectivity, is concerned about emotional impacts of website creativity, which generally consists of two sub-dimensions: arousal and pleasure. Another major dimension, importance, deals with how important and useful the product is to the customer. It embraces two subscales: relevance and significance. This dimension reemphasizes that only those websites that are both novel and appropriate are of real creativity, echoing the definition of creativity from a pragmatic point of view which fits well with the business reality [13, 20, 21]. A sixth dimension, commonality and simplicity, suggests that there should be large quantities of and various types of components (developed by web designers) that can add creativity to web sites. Yet a trade-off between complexity, level of sophistication, and simplicity should be found. A moderate level of complexity is recommended; beyond that point the website interface would appear busy and confusing. Additionally, personalization refers to automatic adjustment of web service configurations, content, structure, and presentation, tailored to each individual consumer’s preferences. It is apparent that those seven key factors of website creativity link closely with the proposed design framework. Every factor can evoke design objectives that correspond to all three dimensions of design requirements (functionality, usability, and affectivity) and may emphasize a subset of these dimensions. For example, there are a number of technical issues related to making personalized websites a reality. Usability issues need to be taken into account because customized features may not be easy to use. In addition, personalized web interfaces with high aesthetic appeal would attract more users. Customizable features can be determined through data mining, which is a term broadly used for methods used to identify each individual customer’s behavior (by analyzing his/her interactions with the interface, purchases, repeat visits, etc.). These features can then be set to provide personalized purchase suggestions. Some websites (such as Gmail and Facebook) allow each individual customer to personalize the display of the sites so as to increase their aesthetic appeal, catering to each individual’s taste. It is suggested that website developers and managers can enhance the creativity of their websites by effectively addressing functionality, usability, and affectivity issues in light of the factor structure of website creativity.
4 Conclusion The ergonomic design framework proposed in this paper highlights that creativity plays a central role in achieving synergy of functionality, usability, and affectivity. Integration of creativity in website design depends heavily on the specific design goal of the website and its application context. Creativity can add both supplemental and substitutional values to websites, and ultimately influence the web user’s behavior. A creative website, being novel and appropriate, would increase the site’s perceived attractiveness, ease of use, and usefulness, which ultimately predicts the user’s behavioral intention and satisfaction, as well as the site’s profitability. Future research should go further toward considering the impacts of creativity on IT product and service design. Previous research in IT creativity has mainly taken an outcome-based viewpoint, yet the value of unraveling those variables and processes that lead to creative design entails a process-based
Integration of Creativity into Website Design
775
perspective with application of the creative cognition approach. In business, teams have been playing a pivotal role in generating creative productions in organizational contexts. Therefore, research regarding the achievement of team-level creativity in IT product and service development with the pursuit of business success would be of much interest.
References 1. Zeng, L., Salvendy, G.: How creative is your website? In: 2nd International Conference on Applied Human Factors and Ergonomics (AEI 2008), pp. 1–10. USA Publishing, Las Vegas (2008) 2. Satzinger, J.W., Garfield, M.J., Nagasundaram, M.: The creative process: the effects of group memory on individual idea generation. Journal of Management Information Systems 15, 143–160 (1999) 3. Howard, T.J., Culley, S.J., Dekoninck, E.: Describing the creative design process by the integration of engineering design and cognitive psychology literature. Design Studies 29, 160–180 (2008) 4. Couger, J.D., Higgins, L.F., McIntyre, S.C. (Un)structured creativity in information systems organizations. MIS Quarterly 17, 375–397 (1993) 5. Eighmey, J.: Profiling user responses to commercial web sites. Journal of Advertising Research 37, 59–66 (1997) 6. Andreatos, A.: A framework for web site assessment. In: IEEE Mediterranean Electrotechnical Conference, Malaga, Spain, pp. 737–740. IEEE, Los Alamitos (2006) 7. Jordan, P.W.: The four pleasures: understanding users holistically. In: 2nd International Conference on Applied Human Factors and Ergonomics (AEI 2008), pp. 1–10. USA Publishing, Las Vegas (2008) 8. Herzberg, F.: Work and the Nature of Man. World, Cleveland, OH (1966) 9. Proctor, R.W., Van Zandt, T.: Human Factors in Simple and Complex Systems, 2nd edn. CRC Press, Boca Raton (2008) 10. Helander, M.G., Khalid, H.M.: Affective and pleasurable design. In: Salvendy, G. (ed.) Handbook of Human Factors and Ergonomics, Hoboken, NJ, USA, pp. 543–572. Wiley, Chichester (2006) 11. Horn, D., Salvendy, G.: Measuring consumer perceptions of product creativity: impact on satisfaction and purchasability. Human Factors and Ergonomics in Manufacturing (in press, 2009) 12. Geschka, H., Schaude, G.R., Schlicksupp, H.: Modern techniques for solving problems. Chemical Engineering 6, 91–97 (1973) 13. Ward, T.B.: Cognition, creativity, and entrepreneurship. Journal of Business Venturing 19, 173–188 (2004) 14. Clemen, R.T., Reilly, T.: Making Hard Decisions with Decision Tools. Duxbury, Pacific Grove (2004) 15. Warr, A., O’Neill, E.: Understanding design as a social creative process. In: 6th Conference on Creativity and Cognition, pp. 118–127. ACM, London (2005) 16. Paulus, P.B.: Groups, teams, and creativity: The creative potential of idea-generating groups. Applied Psychology 49, 237–262 (2000) 17. Lubart, T.I.: Models of the creative process: past, present and future. Creativity Research Journal 13, 295–308 (2000–2001)
776
L. Zeng, R.W. Proctor, and G. Salvendy
18. Basadur, M.: Managing the Creative Process in Organizations. In: Runco, M.A. (ed.) Problem finding, problem solving, and creativity, pp. 237–268. Ablex Publishing Corporation, Norwood (1994) 19. De Angeli, A., Sutcliffe, A., Hartmann, J.: Interaction, usability and aesthetics: What influences users’ preferences? In: 6th conference on Designing Interactive systems, pp. 271–280. ACM, University Park (2006) 20. Horn, D.B., Salvendy, G.: Consumer-based assessment of product creativity: A review and reappraisal. Human Factors and Ergonomics in Manufacturing 16, 155–175 (2006) 21. Zeng, L., Salvendy, G., Zhang, M.: Factor structure of web site creativity. Computers in Human Behavior 25, 568–577 (2009) 22. Niku, S.B.: Creative Design of Products and Systems. Wiley, Hoboken (2009) 23. Horn, D.B., Salvendy, G.: Product creativity: conceptual model, measurement and characteristics. Theoretical Issues in Ergonomics Science 7, 395–412 (2006) 24. Davis, F.D., Bagozzi, R.P., Warshaw, P.R.: User acceptance of computer technology: a comparison of two theoretical models. Management Science 35, 982–1003 (1989) 25. White, B.: How to “Webby-ize” your web site - factors in award-winning design. In: 3rd Latin American Web Congress, Buenos Aires, Argentina, pp. 25–26. IEEE Computer Society, Los Alamitos (2006) 26. Laviea, T., Tractinsky, N.: Assessing dimensions of perceived visual aesthetics of web sites. International Journal of Human-Computer Studies 60, 269–298 (2004) 27. Teoa, H.H., Oha, L.B., Liua, C., Weib, K.K.: An empirical study of the effects of interactivity on web user attitude. International Journal of Human-Computer Studies 58, 281–305 (2003) 28. Hostetter, M., Kranz, D., Seed, C., Terman, C., Ward, S.: Curl: a gentle slope language for the Web. World Wide Web Journal 2, 121–134 (1997) 29. Cao, M., Zhang, Q.Y., Seydel, J.S.: B2C e-commerce web site quality: An empirical examination. Industrial Management and Data Systems 105, 645–661 (2005) 30. Kuan, H.H., Bock, G.W., Vathanophas, V.S.: Comparing the effects of usability on customer conversion and retention at e-commerce websites. In: 38th Annual Hawaii International Conference on System Sciences, Big Island, HI, USA, pp. 174–182 (2005) 31. Albert, T.C., Goes, P.B., Gupta, A.: GIST: a model for design and management of content and interactivity of customer-centric web sites. MIS Quarterly 28, 161–182 (2004) 32. Lin, H.X., Choong, Y.Y., Salvendy, G.: A proposed index of usability: a method for comparing the relative usability of different software systems. Behaviour and Information Technology 16, 267–278 (1997) 33. Smith, S.L., Mosier, J.N.: Design Guidelines for Designing User Interface Software. Technical Report MTR-10090. The MITRE Corporation, Bedford, MA, USA (1986) 34. Dysart, J.: Custom pages on-the-fly: a new model of interactivity emerges on the web. Online 22, 39–40, 42, 44 (1998) 35. Canali, C., Casolari, S., Lancellotti, R.: Architectures for scalable and flexible Web personalization services. In: 1st International Workshop on Advanced Architectures and Algorithms for Internet Delivery and Applications, Orlando, FL, USA, pp. 50–57. IEEE, Los Alamitos (2005)
YVision: A General Purpose Software Composition Framework Ant˜ ao Almada, Gon¸calo Lopes, Andr´e Almeida, Jo˜ ao Fraz˜ ao, and Nuno Cardoso YDreams, Madan Parque - Sul, P-2829-149 Caparica, Portugal http://www.ydreams.com/
Abstract. Expectations for the industry of Human Computer Interaction are much higher today than they were ten or even five years ago. Innovative solutions to sense and gather information from the real world in real-time must be combined with lightning-fast computer graphics to deliver high-quality designs for the new interaction paradigms. The very combination of all these emerging technologies presents difficult challenges, not only for finding good design and programming methodologies, but to encapsulate those patterns in a collection of frameworks and tools enabling rapid-prototyping and agile development. Application designers should be able to express their creative endeavours by quickly trying out different design combinations with full access to leading edge technology. In the following we present the YVision general purpose software composition framework and show how it achieves the goal of managing the complexity and reducing the development time of parallel, data-driven, multimedia applications.
1
Introduction
Interactive real-time multimedia systems have been at the core of the recent Human-Computer Interaction (HCI) revolution. By breaking free of the traditional decades-old interfaces (i.e. keyboard and mouse), such systems pioneer explorations in new design spaces allowing for more natural exchanges with computational devices. Wall and floor displays [16,1,4], interactive surfaces [7,15,14], tangible interfaces augmenting everyday physical artifacts [10,11] or even collaborative work environments exploiting social settings to enhance productivity in meaningful ways [3] have stretched the limits of how we understand the relationship between man and machine. Expectations for the industry of HCI are much higher today than they were ten or even five years ago. Previous publicly available interactive displays were mostly limited to artistic and cultural exhibitions (e.g. Zack Booth’s interactive installation artwork [12]). However, interaction today begins to permeate marketing and advertising initiatives, fostering the interest of major brands worldwide and opening a new untapped market segment [16,4,1]. There is a much more global awareness and acceptance of technology in everyday life. Rich interactive J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 779–788, 2009. c Springer-Verlag Berlin Heidelberg 2009
780
A. Almada et al.
systems have the ability to arouse curiosity and interest even among individuals who may be unfamiliar with the more technical aspects of computation, by their focus on natural and familiar paradigms. Users empathize more easily with interaction models mimicking the tasks they perform routinely, allowing them to understand the possibilities made available by leading edge technology in a seamless manner. The shift from artistic and cultural mediums to large scale marketing and commercial applications was supported on key investments made by technology and IT corporations hoping to bring together the computing vision ever closer to the common citizen and to explore and launch new profitable markets on computation. The development of these new interfaces requires exploration at both the hardware and software levels. Innovative solutions to sense and gather information from the real world in real-time must be combined with lightning-fast computer graphics and application logic simulation to deliver high-quality designs for the new interaction paradigms. In addition to their own specific development problems, the very combination of all these emerging technologies presents difficult challenges for the development of competitive out of the box HCI software applications. The first challenge is the sheer computational power necessary to both analyze and integrate information collected from all available sensors while keeping the application logic and visualization responsive. For instance, computer vision has become an increasingly powerful tool in the design of natural interaction, being able to extract structural, dynamical and even semantical information from the environment using video capture devices. However, the algorithms necessary for filtering, pre-processing and analyzing the input signal can be extremely expensive, especially considering that they need to run at sufficiently high-frequencies so that interaction can be deemed natural by a human user. While available computational power continues to escalate over time, the limits in processor technology have forced manufacturers to move on to multiprocessor development. This means that harnessing next-generation processing hardware to its fullest will call for a mastery of parallelism in software development. Concurrent application design remains a difficult and problem-specific task, and often compromises composability and reuse of developed code. Another challenge is figuring out how to incorporate all these new inputs into the traditional application development pipeline. While keyboard, mouse and joystick interactions have become quite familiar and stable foundations to application developers, the amount of new information made available by currently existing technology can be daunting. Supporting the development of cuttingedge HCI commercial products means being able to incrementally incorporate new information into existing designs with minimal development effort. Incorporating new sensory information should not imply rebuilding the structure of previously developed applications. Addressing these development challenges means not only to find good design and programming methodologies, but to encapsulate those patterns in a collection of frameworks and tools enabling rapid-prototyping and agile development.
YVision: A General Purpose Software Composition Framework
781
The goal is to provide a set of fast mappings between high-level concepts and code, so that HCI application designers can express their creative endeavours by quickly trying out different design combinations with full access to leading edge technology. YVision is a heterogeneous general purpose software composition framework with the goal of managing the complexity and reducing the development time of parallel, data-driven, multimedia applications. In the following we will attempt to show how it achieves this goal by facilitating code reuse, supplying automatic concurrency management mechanisms, enforcing good programming practices and providing a set of high-level flexible mappings particularly suited for composition of HCI applications.
2
Framework Architecture
YVision is a general purpose programming framework developed to enable rapid prototyping and development of parallel, data-driven, multimedia applications. The whole package integrates several different technologies required to address the many distinct challenges emerging during the development of HCI applications, including 3D rendering, rigid-body physics or image processing and computer vision. Each of these technologies constitutes an available Module implemented on top of the more general YVision Core framework. The Core framework provides the abstract concepts for software composition. The Modules layer contains specific components for HCI such as rendering and image processing functionalities. Data-driven engines can then be built using those modular components with the variant part of each application being detailed in data files. Fig. 1 illustrates the relationship between the different framework components.
Fig. 1. The YVision framework integration stack
782
A. Almada et al.
It is important to understand this layered structure of the YVision framework. The Core framework knows absolutely nothing about image processing, render primitives or sound libraries. It is rather a collection of abstract architectural elements attempting to encapsulate a set of good programming practices, also known as design patterns [5] with a special emphasis on the design of modular and composable systems. Modularity and composability are probably the two primary key concepts behind the entire YVision Core development effort. Alongside these principles, the YVision Core clearly defines a set of high-level concepts for assembling an application. These concepts build on different programming paradigms, from the functional composition of the image processing pipeline to declarative specification of application objects. This heterogeneity stems from the realization that while logically equivalent languages may have achieved the same upper bound on expressive power, in practice we find each of them better suited to describe specific classes of problems. The faster we can map concepts to program, the faster and more efficient the development process becomes. Recognizing this important practical necessity, the YVision Core has been bred from the start to address the very distinct requirements emerging in the development of a multimedia application. Nevertheless, each paradigm is mostly self-contained and capable of being applied in its own right, and developers are free to apply the ones most suited to their specific problem. The abstract concepts and motivations for each Core section are presented below, along with the most common usage scenarios for each part of the framework. Advanced development scenarios are then progressively introduced, building up from previous sections. Methodological concepts are brought up whenever they are deemed relevant. 2.1
2.1 Dataflow Graphs
Drawing workflows is a common way of designing and understanding many information processing scenarios. Diagram boxes abstractly represent information processing tasks. These tasks work on some set of inputs and eventually produce a set of outputs. Arrows or straight lines connect some task's output to some other task's input. A graph (or workflow) is defined by a finite set of tasks (i.e. blocks or activities) and connections representing the data flow between those tasks (see fig. 2). In workflow models, the pattern of connections between tasks not only represents relations between objects in the system, but also implicitly enforces constraints on the execution model. Basically, data flows along the connections and tasks execute as soon as data is available. Execution follows data flow. This allows for the easy orchestration of complex data processing tasks in a succinct and flexible manner. The YVision Graph framework enables the explicit use of the workflow abstraction to build computer programs. Using an object-oriented architecture based on functional composition, the programmer can develop application functionalities by encapsulating them inside atomic black boxes, or blocks.
Fig. 2. Some common workflow representations
More complex functionalities or even entire programs can then be assembled by composing available blocks together using data flow connections. A block is simply a collection of input, output and property pins. Input and output pins receive and deliver flow data and activate information processing tasks using event handlers. Property pins specify customizable parameters controlling how the data is to be handled. The workflow paradigm has several advantages in specifying, at different abstraction levels, the control and data flow of an application. By isolating implemented functionalities inside block boundaries, the programmer is encouraged to develop decoupled, and hence reusable, functionality. As long as some valid data can be fed to the block, the task code can be reused in arbitrary applications. Dataflow graphs are mainly used to compose signal processing pipelines, such as in computer vision or audio processing. They are also commonly used to specify the main application loop. Modern interactive applications often define one or more internal loops composed of different tasks such as rendering, physics, acquiring user input or simulating the application logic. Dataflow graphs are especially suited to compose and manage complexity at the macroscopic application execution level. Adding a new operation to the render loop or replacing a filter in the image processing pipeline is simply a matter of adding a new block or rewiring a connection. Furthermore, the functional nature of the dataflow's execution model easily allows specific subsections of the graph to be assigned to distinct processors. The YVision framework cleanly separates this concern from the dataflow's functional connectivity. Events raised from parallel threads are automatically marshalled to the correct execution context in such a way that internal block execution code is shielded from synchronization concerns. Each execution context is assigned its own system thread. The framework also provides block components for controlling flow execution, allowing for the synchronization of critical path subsections.
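To make the block abstraction concrete, the sketch below shows, in Python, one plausible shape for blocks, pins and connections. All names (OutputPin, ThresholdBlock, connect, push) are illustrative assumptions on our part; the paper does not show YVision's actual API, and the real framework additionally handles threading and marshalling as described above.

```python
# Minimal sketch of the block/pin abstraction (hypothetical names).

class OutputPin:
    def __init__(self):
        self._targets = []

    def connect(self, input_block):
        # Wiring a connection is just registering a downstream listener.
        self._targets.append(input_block)

    def push(self, data):
        # Data flows along connections; downstream tasks fire on arrival.
        for target in self._targets:
            target.receive(data)

class ThresholdBlock:
    """A toy image-processing task: binarize values against a threshold."""
    def __init__(self, threshold=128):
        self.threshold = threshold          # a "property pin"
        self.output = OutputPin()           # an "output pin"

    def receive(self, pixels):              # the "input pin" handler
        # Execution follows data flow: the task runs as soon as data arrives.
        self.output.push([255 if p >= self.threshold else 0 for p in pixels])

class PrinterBlock:
    def receive(self, data):
        print(data)

# Composing a pipeline is a matter of wiring connections.
threshold = ThresholdBlock(threshold=100)
threshold.output.connect(PrinterBlock())
threshold.receive([12, 99, 100, 240])       # prints [0, 0, 255, 255]
```

Replacing a filter in such a pipeline amounts to rewiring one connection, which is exactly the flexibility argued for above.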
2.2 Object Composition
Multimedia interactive applications like games and other virtual (or augmented) reality environments are concerned with the definition of the objects that
compose the game world. These objects often exhibit very different properties and characteristics. Some have distinct visual representations, others have rigid-body physics, yet others display autonomous behavior, emit sounds, are controlled by user input, etc. Despite the countless combinations and possibilities in defining these objects, many functionalities are similar across different types of objects, like rendering or physics. Code reuse of shared functionality is essential for managing complexity in large projects, and for this reason the game industry has contributed extensive research efforts to the development of object management models. The YVision framework has followed the more recently proposed approach of using component-based architectures [8,6,13] to model the world objects. In this paradigm, objects are simply collections of components, and the overall object behavior and characteristics are defined by the properties of and interactions between these components. Traditional designs relied on inheritance-heavy hierarchies to define the objects composing the game world, but experience and growing game complexity have shown this approach to be brittle [8]. A component is responsible for the data and behavior of one specific application aspect. Each component is thus an independently developed black box exposing a certain interface. In its purest form, a component is implemented with no specific knowledge of the object in which it is placed and is completely independent of other components. In practice, components need to be able to communicate with one another, but they do so only through well-defined interfaces, so that implementation independence can be maintained. By adhering to the tenets of object composition, the YVision framework allows for the creation of entirely new objects simply by assembling existing components. These components can also be parameterized to fit specific needs. This means that as long as existing functionality suffices for creating our objects, there is no need to develop new code at all, as some specific parameterized composition will provide some approximation of the result we are aiming for. The upper dataflow graph layer provides not only the execution framework and context for the specified world objects, but can also generate new objects for the world. Image processing pipelines usually terminate with the creation of interactive virtual objects which are then embedded in the application logic, where they can communicate autonomously with the other persistent objects.
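The following minimal Python sketch illustrates the component-composition idea: an object is a bag of components that interact only through well-defined interfaces. The class and method names are hypothetical, not YVision's actual API.

```python
# Sketch of component-based object composition (hypothetical names).

class Component:
    def attach(self, owner):
        self.owner = owner          # components may look up siblings via the owner

    def update(self, dt):
        pass

class Transform(Component):
    def __init__(self, x=0.0, y=0.0):
        self.x, self.y = x, y

class RigidBody(Component):
    def __init__(self, vx=0.0, vy=0.0):
        self.vx, self.vy = vx, vy

    def update(self, dt):
        # Communicate with another component only through its interface.
        t = self.owner.get(Transform)
        t.x += self.vx * dt
        t.y += self.vy * dt

class GameObject:
    """An object is simply a collection of components."""
    def __init__(self, *components):
        self.components = list(components)
        for c in self.components:
            c.attach(self)

    def get(self, kind):
        return next(c for c in self.components if isinstance(c, kind))

    def update(self, dt):
        for c in self.components:
            c.update(dt)

# A new kind of object is assembled, not programmed:
bubble = GameObject(Transform(0, 0), RigidBody(vx=1.0, vy=0.5))
bubble.update(dt=0.1)
print(bubble.get(Transform).x, bubble.get(Transform).y)   # 0.1 0.05
```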
2.3 Behavior Composition
Interactive objects achieve self-containment and autonomy in the world through a set of behaviors which lay out their rules for action. A central task in the definition of world objects for any application is thus the specification of object behavior, application logic or even artificial intelligence for autonomous virtual agents. Behavior logic usually takes up the role of linking together distinct components in a world object. The need for such generalized component access may lead to increased coupling between behavior code and all the other components
in a world object. However, it is important that this coupling does not propagate to the behavior composition itself. The YVision framework enforces behavior modularity by providing a single common interface to all behavior implementations. Behaviors are defined as latent tasks, since they can take several application cycles to be evaluated, and can either succeed or fail. Complexity is then built into the system by hierarchically composing the behaviors in a tree-like manner [2]. Sequences and selectors are simple ways of composing behaviors together, and are logically equivalent to the "and" and "or" nodes in Hierarchical Task Network planners [9]. Hierarchical logic is an excellent way to manage complexity in the specification of object behavior, allowing for flexible yet detailed control over low-level behaviors [2]. Each leaf in the behavior tree is a procedural implementation which, thanks to the common behavior interface, can be reused arbitrarily at different levels of the hierarchy.
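A simplified, non-resumable sketch of sequence and selector nodes is shown below; a production behavior tree would also remember its running child between cycles. Again, the names are illustrative assumptions rather than YVision's actual interface.

```python
# Sketch of hierarchical behavior composition (hypothetical names).
SUCCESS, FAILURE, RUNNING = "success", "failure", "running"

class Action:
    """A leaf: a latent task that may run over several application cycles."""
    def __init__(self, fn):
        self.fn = fn

    def tick(self):
        return self.fn()

class Sequence:
    """Succeeds only if every child succeeds, in order (logical 'and')."""
    def __init__(self, *children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != SUCCESS:
                return status       # FAILURE aborts; RUNNING resumes next cycle
        return SUCCESS

class Selector:
    """Succeeds as soon as one child succeeds (logical 'or')."""
    def __init__(self, *children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != FAILURE:
                return status
        return FAILURE

# A toy bubble logic: pop if touched, otherwise keep floating.
touched = lambda: FAILURE
bubble_logic = Selector(
    Sequence(Action(touched), Action(lambda: print("pop") or SUCCESS)),
    Action(lambda: print("float") or RUNNING),
)
bubble_logic.tick()     # prints "float"
```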
2.4 Settings
Multimedia interactive applications often need calibration steps to be performed in order to function as expected. Several parameters, situated at different levels of the application architecture, are usually available to achieve this. Access is often provided by developing graphical user interfaces that expose the configuration parameters. Since application settings are spread out over the different application layers, configuration code needs to know about related objects and their properties. If care is not taken, the configuration code can easily contaminate an otherwise well-designed framework. While it is true that an object's configuration code is more easily developed in concert with the object definition itself, it is bad practice to add new methods and functionality to the object just so that this specific configuration can be accessed. One way to meet this requirement without the aforementioned interface contamination is to specify configuration hooks in each object by annotating its properties and definition with metadata. Metadata can associate any kind of information with objects and their properties and methods without compromising the functional interface. This methodology has the additional advantage that metadata can even be overridden at run-time in order to change some of the object's characteristics without having to recompile the module containing the object. A general configuration framework can then be developed to analyze and use this metadata to instantiate, at run-time, an adequate graphical user interface for any given object. Such a general configuration framework is implemented by the YVision Settings.
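The sketch below illustrates the metadata idea in Python using a plain metadata dictionary; an actual implementation would more likely rely on the host language's annotation and reflection facilities, so this is only an assumed analogue with invented names.

```python
# Sketch of metadata-annotated settings (hypothetical mechanism and names).

class Camera:
    # Metadata associated with properties, outside the functional interface.
    settings_metadata = {
        "exposure":   {"label": "Exposure",   "type": float, "min": 0.0, "max": 1.0},
        "flip_image": {"label": "Flip image", "type": bool},
    }

    def __init__(self):
        self.exposure = 0.5
        self.flip_image = False

def build_config_ui(obj):
    """A generic configurator: inspects metadata at run-time and could
    instantiate an adequate widget (slider, checkbox, ...) per property."""
    for name, meta in obj.settings_metadata.items():
        widget = "slider" if meta["type"] is float else "checkbox"
        print(f"{meta['label']}: {widget}, current value = {getattr(obj, name)}")

build_config_ui(Camera())
# Exposure: slider, current value = 0.5
# Flip image: checkbox, current value = False
```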
3 Data-Driven Software Composition
From the moment that modularity and composability became well established in the framework, it was possible to think about data-driving the composition itself.
Fig. 3. The dataflow graph for the interactive application Bubbles. The upper dataflow specifies the image processing pipeline including contour extraction and dynamics analysis. The lower dataflow defines the main render loop. Each of these pipelines can be assigned to a single core on a multi-core machine without compromising the functional connectivity.
Fig. 4. The behavior tree specifying the logic of a bubble. The solid nodes represent conditions and the dashed nodes represent actions. The behavior composition relies on a sequence of three basic bubble states: creation, lifetime and destruction.
Working with highly decoupled and modular systems allows for massive code reuse, but it is still necessary to specify how the different components will be pieced together in order to form a coherent whole. Since the entire system is built up around well-defined interfaces, this composition process can be automated to work from scripts which are stored alongside the application data resources themselves.
This means that, at some level, application behavior is driven by its own data. Multimedia applications can have very complex data definitions, including outputs from several tools such as 3D modeling applications or other visual editors. In order to maintain a certain degree of uniformity and accessibility across all these definitions, XML was chosen as the main representation for data scripts, being a human-readable and highly flexible data exchange language. A common YVision application usually contains one or more dataflow graph layers, typically specifying signal processing pipelines and the main render loop. Figures 3 and 4 illustrate the specification of the main application graph and hierarchical logic layer of the Bubbles application. In this application, virtual soap bubbles are projected onto a vertical surface. Participants can then use their shadow to interact with the bubbles, popping them and pushing them according to physically plausible rules. The bubbles are simulated using a suitably parameterized rigid-body physics engine and can interact both with the structure and with the dynamics of users. Changing from physical soap bubbles to volleyballs or other physical objects would simply be a matter of changing the visual component parameters and adjusting the physical properties in the object scripts so that new interaction reactions are generated. Different kinds of objects can even be combined so that bubbles and volleyballs coexist in the same application.
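As an illustration, the following Python fragment composes an object from a hypothetical XML script. The schema (object/component elements and their attributes) and the registry are invented for this sketch and are not YVision's actual file format.

```python
# Sketch of data-driven composition from an XML script (hypothetical schema).
import xml.etree.ElementTree as ET

class Transform:
    def __init__(self, x=0.0, y=0.0):
        self.x, self.y = x, y

class RigidBody:
    def __init__(self, vx=0.0, vy=0.0):
        self.vx, self.vy = vx, vy

# Maps component names in the data file to implementations in code.
REGISTRY = {"Transform": Transform, "RigidBody": RigidBody}

script = """
<object name="bubble">
  <component type="Transform" x="0" y="0"/>
  <component type="RigidBody" vx="1.0" vy="0.5"/>
</object>
"""

def instantiate(xml_text):
    """Compose an object entirely from data: the script, not code, decides
    which components are assembled and how they are parameterized."""
    root = ET.fromstring(xml_text)
    components = []
    for node in root.findall("component"):
        cls = REGISTRY[node.get("type")]
        params = {k: float(v) for k, v in node.attrib.items() if k != "type"}
        components.append(cls(**params))
    return {"name": root.get("name"), "components": components}

bubble = instantiate(script)
print(bubble["name"], len(bubble["components"]))   # bubble 2
```

Turning bubbles into volleyballs then only requires editing the script's parameters, never the code.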
4 Conclusions and Future Work
The YVision framework is currently being employed in a production environment, where effective code reuse has contributed to increasing the code quality and robustness of the developed applications. Although at this stage it is mostly used by programmers, it still contributes greatly to the rapid prototyping of HCI applications by allowing quick experimentation with, and composition of, available and previously used image processing filters and analyzers. In the two years following its first official release, the framework has been applied successfully to nearly a hundred HCI applications and continues to grow with each new challenge it meets. New components and modules developed for specific applications have been successfully generalized and reapplied in applications with similar requirements. Development is actively taking place on new HCI computer vision modules such as object tracking and facial expression recognition. The Core framework is also evolving, hopefully becoming even simpler so that any new programmer can be quickly integrated and start producing more modules and components with it. The general-purpose nature of the framework is also being put to the test as YVision begins to be applied to other application domains such as autonomous robot control. The conclusion of the study of Domain Specific Languages for the specification of the flexible part of the framework will also be a major step forward, as the high-level concepts can be brought forth visually into an integrated development
environment where the various aspects of an HCI application can be defined as models and then immediately instantiated as a working prototype. At that point, software composition and prototyping will be available for direct experimentation by interaction designers, thereby increasing productivity and focusing programmer development on new functionalities and core engine code.
References
1. CatchYoo, http://www.catchyoo.com/
2. Champandard, A.J.: AiGameDev (2003-2008), http://aigamedev.com/ (cited 2008)
3. Dourish, P.: Where the Action Is: The Foundations of Embodied Interaction. MIT Press, Cambridge (2001)
4. EyeClick, http://www.eyeclick.com/
5. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns. Addison-Wesley Professional, Reading (1995)
6. Garces, S.: AI Game Programming Wisdom 3, Charles River Media, pp. 251–263 (2006)
7. Kaltenbrunner, M., Bencina, R.: reacTIVision: A computer-vision framework for table-based tangible interaction. In: Proceedings of the First International Conference on Tangible and Embedded Interaction (TEI 2007), Baton Rouge, Louisiana (2007)
8. Rene, B.: Game Programming Gems 5, Charles River Media, pp. 25–37 (2005)
9. Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach. Pearson Education, London (2003)
10. Baraldi, S., Benini, L., Cafini, O., Del Bimbo, A., Farella, E., Landucci, L., Pieracci, A., Torpei, N.: Introducing TANGerINE: A tangible interactive natural environment. In: Proceedings of ACM MultiMedia 2007, Augsburg (September 2007)
11. Schmieg, S.: Roy block, http://www.kingcosmonaut.de/royblock/
12. Simpson, Z.B.: Mine control, http://www.mine-control.com/
13. Stoy, C.: Game Programming Gems 6, Charles River Media, pp. 393–403 (2006)
14. Wellner, P.: The DigitalDesk calculator: Tangible manipulation on a desk top display. In: Proc. ACM SIGGRAPH Symposium on User Interface Software and Technology, pp. 107–115 (1991)
15. Wilson, A.: PlayAnywhere: A compact tabletop computer vision system. In: Symposium on User Interface Software and Technology, UIST (2005)
16. YDreams (2000-2008), http://www.ydreams.com/
Collaborative Development and New Devices for Human-Computer Interaction
Hans-Jörg Bullinger and Gunnar Brink
Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V., Hansastraße 27c, 80686 München
[email protected] www.fraunhofer.de
Abstract. The article pays tribute to the emergence in 1993 of graphical browsers that allow users to address electronic information with a point-and-click interface, and places this development on a par with other important historical events that shaped society and the lives of individuals. It describes the resistance that some voiced at the time to the prompt economic utilization of the Internet's new possibilities. It goes on to describe current technical developments in the human-computer interface environment that could be very, perhaps even comparably, important. It concludes with an appeal for the courage to develop technical innovations, particularly in difficult economic times. Keywords: Context-aware services, pervasive gaming, collaboration, semantic web, testing methods, interfaces and peripherals, mini-projector, augmented immersive 3D displays, acoustic wave field synthesis.
1 Introduction
There are certain events that most of us remember fairly accurately if we are old enough. You know exactly what you were doing the day the Berlin Wall fell, just as you remember when you found out that the World Trade Center had been destroyed by terrorists with hijacked airliners, or what you were doing when you learned that Kennedy had been assassinated. Armstrong's steps on the moon: such moments in history are engraved in our memories if we have had the fortune or misfortune to live through them. There is another event that I can also remember very well: In 1993, a colleague at the Fraunhofer Institute showed me an innovative program: Mosaic 1.0, one of the first browsers with a graphical interface. Of course, we already had an Internet connection at that time, and we used the e-mail function extensively. But compared to today, using the Internet then was a painstaking business. It took at least a handful of UNIX control commands for the simplest matters. Even back then, however, we had more than you might think, for example e-mail lists, the forerunners of today's Internet forums. In 1993, in an institute (FIT) that today belongs to the Fraunhofer-Gesellschaft, BSCW (Basic Support for Cooperative Work) was developed. In 1995 this was the first fully Web-based groupware system [1] (European Software Innovation Prize 1996).
But Mosaic and all subsequent browsers radically changed the Internet, because they changed the way in which we use the Internet: intuitively, graphically and with a substantially higher quality of experience. "Mosaic" Netscape was already available in 1994, followed by Netscape Navigator. Perhaps some of you still remember how the new Netscape business model unsettled business economists: the product was supplied at no cost, and only later, improved versions required payment of license fees. At that time, this type of market entry seemed revolutionary. Today we use Explorer, Firebird, Safari or Opera as browsers. Each browser is better in some areas than the others. It is hard to imagine that before 1993, there was not a single browser like these in circulation. What began so quietly in 1993, an Internet that could be used graphically and intuitively, started a revolution. JavaScript, frames and tables soon made their appearance. The first Java programs were integrated. The first homepages for private individuals became fashionable around 1996. And from 2000, the Internet became a daily topic of conversation, and so its own safeguard.
2 A Look Back
A number of years ago, Gary Hamel described the impact of the introduction of browsers in the business world in an impressive article in the Harvard Business Review [2]. In 1994, at IBM, which at that time was experiencing great economic difficulties, there was an employee named David Grossman, who had already loaded the abovementioned Mosaic browser onto his UNIX computer in order to familiarize himself with the graphical world of the Internet. Completely without the knowledge of the topmost management level, he formed a small team; the first corporate Intranet was set up, and a team of around 300 activists was put together – a growing movement that developed IBM's first homepage. Only after this homepage was finished was the CEO at that time, Lou Gerstner, included. He was quickly convinced of the new possibilities. It took Gerstner's support to convince the many managers in the hierarchy of the opportunities offered by the Internet. From this event, Grossman developed a few rules, the five most important of which I would like to quote for you here:
1. Start simple; grow fast.
2. Trial by fire.
3. Skip the krill (go to the top of the food chain when you're trying to sell your idea).
4. Take risks, make mistakes quickly, fix them fast.
5. Don't get pinned down (to any one way of thinking).
I could relate any number of similar events at this point. We were ourselves very active in introducing HTML into the factory workshop, into manufacturing [3], [4], [5]. Today most engineers understand how expedient it is to use Web-based systems for designing and manufacturing. This includes such things as design information integration, remote system control, the use of HTML and Java programming, client-server architectures and, above all, the active inclusion of user interface design and human-machine interaction. In those days, when we talked to the responsible people in companies, most of them saw the Internet as just a game, something for children. Or they were worried by the reports in the media about the unsavory aspects of the Net. Serious applications in the producing trades? Most considered
these to be out of the question. We, or other people, or simply reality convinced them in a process that took many years.
3 What Happens Next
The past is important, because we want to learn from it for the future. Which trends do we see? Where can we see developments in interaction technologies today that are as significant as the Internet and the browser were in 1993? How can we use them effectively and rapidly?
Today, many people take it for granted that they can be reached anywhere at any time, that they have constant access to e-mail and data at work and at home, that they can listen to music on an MP3 player, skip through their digital photo collections and always find the right route thanks to GPS and satnav. The rapid progress in information and communication technology has made all this possible at the touch of a button. This technology will continue to shape and facilitate our everyday lives. Tiny sensors and wireless communications are the key to this future, imparting new functions to passive objects such as carpets, clothing and windows. As a result, they become active objects which automatically adapt to the user. And they deliver valuable information to communication networks, which for example help drivers to avoid traffic jams or doctors to check the health of patients. How many of the ever-present, invisible helpers are actually used is a matter of personal choice: for instance, in the intelligent house, which autonomously ensures safety, security and optimal energy use, or in consumer electronics systems wirelessly networked throughout the house. People will also be able to deploy their own software agents to handle appointments and find information on the Internet.
The best interaction with services is ultimately the one that the user does not even experience as an explicit interaction. If services are embedded in the surroundings or if services are context aware, it is possible to reduce the extent of interaction for the user and to increase the accuracy of the support. Context-aware services are an enhancement of user-adaptive services [6]. The Fraunhofer Institute FIT has developed practical solutions in the field of context-aware services, e.g., for guide systems for trade fair and museum visitors [7], for assisting warehouse clerks [8] and for situated learning [9].
We furthermore expect that in the future, too, important impetus for "serious applications" will come from the computer game segment. Games are an important part of our society. In addition to being plenty of fun, they make it possible to learn new skills and to try these skills out in a safe social environment, and also to explore where our own skills and limits lie. New human-computer interfaces allow users to immerse themselves in highly detailed virtual worlds, play with people from other countries and other cultures and experience the games together [10]. Pervasive gaming removes the limits of conventional games in a real (social and physical) environment and expands (virtual) game worlds for players and spectators. Various electronic media are used here, such as electronic music, video, the Internet, computers, smart phones, etc., in order to blend real game environments and virtual elements as seamlessly as possible; instead of being tied to the PC or game console, players can move about freely in their natural environment, which can itself be an element in the game.
"Epidemic Menace" is such a pervasive gaming approach which is being developed in the integrated EU project IPerG by a consortium of leading European research institutes and companies [11]. Internet players are tied to players in the real world via augmented reality. In addition to the Fraunhofer Institute for Applied Information Technology FIT, some of the other participants are Sony NetServices, Blast Theory, Daydream, the Swedish Institute of Computer Science, Nokia and the Mixed Reality Laboratory at the University of Nottingham. Epidemic Menace includes design and evaluation methods, as well as marketing strategies as the basis for attractive and profitable games, the development of an optimized technical infrastructure, efficient design tools and user interfaces suitable for games. Speaking very generally, multimodal, immersive worlds are particularly fascinating when various stimuli give rise to an overall impression that is perceived as the integrated user experience. Such approaches, such as linking visual, acoustic and tactile input and output modalities, have meanwhile come to be used in diverse application scenarios. We are also presenting an article on this at this conference [12]. Very generally, many IT applications will be offered and used as services in the future. The boundaries between a company's internal solutions and its external service offers are becoming blurred. New applications and business models networked via the Internet are establishing themselves as attractive alternatives or supplement to existing solutions. At Fraunhofer ISST, at Fraunhofer FOKUS and at Fraunhofer IAO, we are working not only on new technological formats and solutions, but also on utilization strategies, concrete solutions and future business models [13], [14]. Successful Internet information systems must strike a delicate balance between autonomy, flexibility, and governance [15]. Collaboration also calls for semantic integration: documents that know where they belong. This includes Theseus, a research program for developing a new, Internetbased knowledge infrastructure funded by the BMWi (German federal ministry of economics and technology) with a grant of roughly 90 million euros. In addition to the Bertelsmann subsidiary SAP and Siemens, research partners are a total of ten Fraunhofer Gesellschaft institutes receiving grants, as well as other research organizations and universities [16]. The program is expected to produce its results in the form of innovative products, tools, services and business models for the World Wide Web, as well as the service and knowledge society of tomorrow. The program consists of six so-called Use Cases, each of which is led by a "Use Case Captain" from a company, and of a Core Technology Cluster, led by Fraunhofer, for the basic technologies. Further "transversal activities" are the development of business cases at Fraunhofer IAO and the assessment and evaluation of the technologies by Fraunhofer IDMT [17]. The accompanying research, under the direction of Prof. Weber from ISST, includes the development of new methods for project profiling and service profiling as an innovation management tool. The acceptance of new interaction technologies hinges on the stability, robustness and performance capability of the systems that build on them. The Fraunhofer Institute FOKUS applies systematic and automated testing methods, which were developed in the European ITEA projects TT-Medal and D-MINT, among others [18], [19]. 
These are used, for example, for safeguarding Web service-based systems [20], business processes [21] and medical workflows in accordance with IHE [22]. In one current research project at Fraunhofer FOKUS, questions involving the trustworthiness of
network-based offers such as online purchasing or personal networks, which are central for all Internet users, are examined. In order to offer users better assessment and control possibilities for their Internet use, technical and organizational aspects of trustworthiness are analyzed, modeled and evaluated.
4 Interfaces and Peripherals
When there's a discussion of interfaces, it often centers not on interfaces, but on new peripherals. Portable computer displays are a terrible nuisance. They are scarcely usable on airplanes, because there is not enough room to flip the display back. They are delicate, expensive, heavy and use up the battery. Displays the size of those on portable computers offer reasonable resolution, but they are too large to be used on PDAs and the growing collection of devices in our jacket pockets. One possible solution can be found in a projector that is barely larger than a sugar cube [23]. Fraunhofer researchers have now improved their mini-projector to the point that it can project jitter-free pictures when held in the hand. PowerPoint presentations during a business trip can be shown on the spot, and it is no longer a problem for people to find their way in unknown cities when they use a city map projected on the wall of the nearest building. No matter whether the projector is put into our cell phone, PDA or digital camera: soon we will always have a mini laser projector in our pocket.
The various display techniques developed at different Fraunhofer institutes, e.g. at Fraunhofer HHI, enable any content to be presented in augmented, immersive 3D format [24]. Applications range from the immersive 3D dome cinema, to mixed-reality applications where reality and the virtual world merge, through to presentation on mobile terminal units. An autostereoscopic display system provides users with stereo visualization without the uncomfortable and inconvenient drawbacks of wearing stereo glasses or head-mounted displays. The users perceive a different image with each eye, giving a stereo image. Our solution from the Fraunhofer Heinrich Hertz Institute uses an eye-tracking system to automatically adjust the two displayed images to follow the viewers' eyes as they move their heads.
This brings us to the subject of acoustics, which is also a part of the human-computer interface of the future. By means of Wave Field Synthesis technology, every room can be filled with natural and realistic sound [25]. The optimum sound in a room, e.g. a movie theatre, can usually be experienced by only a few people in the audience. Outside this limited area, optimal sound quality cannot be provided. Another deficiency of these surround sound systems is the lack of realistic reproduction of natural and virtual sound sources. The Wave Field Synthesis based IOSONO sound system, developed at Fraunhofer IDMT, creates natural sound fields for every room and seat. The basic idea comes from Huygens' principle of wave propagation. Using this principle in conjunction with audio allows, not only in cinemas, the realistic emulation of virtual sound sources and waves by using a ring of loudspeakers around the listening space. Every listener is able to enjoy his own sonic sphere where he perceives dialogues and effects from the right acoustic perspective. This reproduction of 3D wave fields in real time also offers new possibilities for editing and mixing of sound scenes.
5 Outlook
IBM was in an existential crisis in 1994. Today, everyone is asking:
• How can we avoid the impact of the incipient economic crisis?
• How can we stabilize already plunging figures and reach our budgeted targets?
• Where can we save additional costs?
The message of the events from that time, not only at IBM, a company that was really doing poorly, is more important than ever. Success or failure starts in our heads. The brain construes the future. Feed it the right input, so that the right output emerges. We must have a climate free of fear and threats in order to develop new ideas and forge new paths. We must have friendly conditions so that employees can and want to provide their intellectual potential fully and completely to the company. Focus on what can work. Instead of questions directed inwardly, such as those above, which are usually the type asked in a crisis, questions focused on the customers, or an examination of new technologies and their opportunities, bring us much further. Then the attention is focused outwards, and not inwards.
References
1. De Michelis, G., et al.: Cooperative information systems: A manifesto. Cooperative Information Systems: Trends & Directions, 315–165 (1997)
2. Hamel, G.: Waking up IBM: how a gang of unlikely rebels transformed Big Blue. Harvard Business Review 78(4), 137–146 (2000)
3. Bullinger, H.J., Lentes, H.P., Scholtz, O.H.: Invited paper: Challenges and chances for innovative companies in a global information society. International Journal of Production Research 38(7), 1469–1500 (2000)
4. Balve, P., et al.: Prozeßorientierte Fertigungssegmentierung. In: Montageplanung – effizient und marktgerecht, pp. 75–94. Springer, Heidelberg (2001)
5. Bullinger, H.J., Gudszend, T.: IT-Supported Knowledge Generation and Storage in Competence Networks for Plant Maintenance. In: Proceedings of the 8th International Conference on Concurrent Enterprising, Rome, Italy, University of Nottingham (2002)
6. Oppermann, R.: From user-adaptive to context-adaptive information systems. i-com. Zeitschrift für interaktive und kooperative Medien 3(2005), 4–14 (2005)
7. Oppermann, R., Specht, M., Stephanidis, C.: Contextualized Information Systems for an Information Society for All. In: Universal Access in HCI: Towards an Information Society for All, vol. 3, pp. 850–853
8. Kaufmann, O., et al.: Implicit interaction for pro-active assistance in a context-adaptive warehouse application. ACM, New York (2007)
9. Oppermann, R.: Situated learning in the work process (2005)
10. Prinz, W., Aschersleben, G., Koch, I.: Cognition and Action. In: Oxford Handbook of Human Action: Mechanisms of Human Action, p. 35 (2008)
11. Lindt, I., et al.: Combining multiple gaming interfaces in Epidemic Menace. ACM Press, New York (2006)
12. Beinhauer, W., C.H.: Using Acoustic Landscapes for the Evaluation of Multimodal Mobile Applications. HCII 2009 (2009)
13. Renner, T., et al.: Open Source Software: Einsatzpotenziale und Wirtschaftlichkeit. Eine Studie der Fraunhofer-Gesellschaft (2005)
14. Hunt, G.C., et al.: System and method for logical modeling of distributed computer systems. Google Patents (2006)
15. Jarke, M.: On Technology Convergence and Platforms: Requirements Challenges from New Technologies and System Architectures. Springer, Heidelberg (2009)
16. Reuse, B., Vollmar, R., Broy, M.: Informatikforschung in Deutschland. Springer, Heidelberg (2008)
17. Dunker, P., et al.: Content-based Mood Classification for Photos and Music
18. Deiß, T.R.A., Schieferdecker, I., Vassiliou-Gioles, T.: Advanced Test Processes using TTCN-3. ITEA Publications (2006)
19. Din, G., Schieferdecker, I., Petre, R.: Performance Test Design Process and Its Implementation Patterns for Multi-services Systems. In: Suzuki, K., Higashino, T., Ulrich, A., Hasegawa, T. (eds.) TestCom/FATES 2008. LNCS, vol. 5047, pp. 135–152. Springer, Heidelberg (2008)
20. Vouffo-Feudjio, A., Schieferdecker, I.: Availability testing for Web services. In: Telektronikk 2009 (to appear, 2009)
21. Din, G., Eckert, K.P., Schieferdecker, I.: A Workload Model for Benchmarking BPEL Engines (2008)
22. Vega, D., Schieferdecker, I., Din, G.: A TTCN-3-based Test Automation Framework for HL7-based Applications and Components. In: 11th Intern. Conf. on Quality Engineering in Software Technology (2008)
23. Scholles, M., et al.: Illumination & Displays: Building ultra-compact laser projectors
24. Pastoor, S.: 3D Displays. In: 3D Video Communication. Wiley, Chichester (2005)
25. Brandenburg, K., Brix, S., Sporer, T.: Wave field synthesis: From research to applications
Orchestration Modeling of Interactive Systems
Bertrand David and René Chalon
LIESP Laboratory, Ecole Centrale de Lyon, 36, avenue Guy de Collongue, 69134 Ecully cedex, France
{Bertrand.David,Rene.Chalon}@ec-lyon.fr
Abstract. In this paper we study the role of orchestration and its modeling for interactive systems. After a common-sense explanation of orchestration and its meaning in information technologies, mainly SOA, we explain its use specifically in the design and use of interactive systems. We propose a taxonomy and, in relation to it, we point out both adaptation and plasticity of HCI systems as a partial answer to orchestration. We then suggest complementary aspects needed for orchestration and their modeling in an MDA approach. We also present a case study and we conclude by considering perspectives. Keywords: Human-computer interaction, Orchestration, SOA, MDA, CSCW, static and dynamic evolution.
1 Introduction
Orchestration is a word which is increasingly used in the information technology (IT) world. It seems interesting to study its meaning and use in different domains of IT, mainly its use in SOA (Service-Oriented Architecture). Our objective is, after this general view, to focus our attention on its use in HCI, in order to answer, at least partially, the question: is orchestration something new (fully or partially), or do we already have answers in HCI which cover the orchestration problematic? For everybody, orchestration commonly denotes a musical arrangement to be played by several instruments. Its generalization is given in Webster's New Collegiate Dictionary, which gives the following definitions for the words "orchestrate" and "orchestration": Orchestrate: to arrange or combine so as to achieve a maximum effect. Orchestration: harmonious organization – develop a world community through orchestration of cultural diversities (L.K. Frank). In the information technology field, orchestration was first used by OMG [1], referring to orchestration as "the modeling of directed, internal business processes". OMG also introduced choreography as "the specification of interactions between autonomous processes". Orchestration in business processes is a series of activities in a controlled workflow – typically involving one single swim lane. Choreography, in contrast, relates to the observable public exchange of messages, rules of interaction and agreements between two or more business process end points, and can address the collaboration among multiple swim lanes. There are several design factors which need to be addressed to achieve both orchestration and choreography.
The key design attributes for Orchestration include participant and role definition, variables, properties which enable conversation, fault handlers for exception processing, compensation handlers for error recovery, and event handlers to respond to concurrent events within the process itself and its set of activities. For Choreography, message structure, asynchronous communication, message rules, invocation, events and event handling are some of the important design factors.
2 Orchestration and SOA
We can define orchestration as a standards-based mechanism that defines how web services work together, including business logic, sequencing, exception handling and process decomposition, including service and process reuse. Orchestrations may span a few internal systems, systems between organizations, or both. Moreover, orchestrations are long-running, multi-step transactions, almost always controlled by one business party, and are loosely coupled and asynchronous in nature. In the SOA context, we can consider orchestration as really another layer in itself, over and above more traditional application integration approaches, including information- and service-oriented integration. Orchestration encapsulates these integration points, binding them together to form higher-level processes and composite services; indeed, orchestrations themselves should become services. The view presented above is interesting in that it focuses on an architectural point of view (need of layers) and on evolution (need of changes). SOA is presented as mainly back-office oriented, to express and organize exchanges between services and to support process organization and evolution. The term interaction here refers to exchanges between software components. As pointed out by D. Davies [2], the missing link of SOA is UI orchestration, i.e. extending SOA principles to the UI. Our feeling is that front-office, user-oriented applications are also concerned by orchestration. This orchestration is human-computer interaction oriented and is in charge of organizing the front end (user-oriented view) of process satisfaction and evolution. In HCI literature, orchestration is not a usual term. We only found one reference [3], which proposes to use the orchestration model to express interaction dependencies among widgets and the grouping of widgets into Ajax pages in Rich Internet Applications (RIA). Unfortunately this view is too strongly related to implementation aspects. We are far from the SOA definition: the objective is to define different rules of distribution for different tasks to be carried out by different Web Services. A more comprehensive view of orchestration is needed, in line with the following statement [4]: "Orchestration is a necessity if you are building a SOA, intra- or inter-organization. It's the layer that creates business solutions from the vast array of services and information flows found in new and existing systems. Orchestration is a godlike control mechanism that's able to put the SOA to work, as well as provide a point of control. Orchestration layers allow you to change the way your business functions, as needed, to define or redefine any business process on-the-fly. This provides the business with the flexibility and agility needed to compete today." An orchestration engine is a server that is responsible for acting as an intelligent intermediary between services. An architectural point of view is shown in figure 1, inspired by [5].
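As a toy illustration of this idea, the following Python sketch chains existing services into a higher-level process that is itself callable as a service. The service names and data are invented for this sketch, and a real engine (for instance one interpreting a BPEL process description) would add fault and compensation handling.

```python
# Minimal sketch of an orchestration engine as an intermediary between
# services (hypothetical names and logic).

def check_stock(order):
    return {**order, "in_stock": True}

def bill_customer(order):
    return {**order, "billed": order["in_stock"]}

def ship(order):
    return {**order, "shipped": order["billed"]}

class Orchestration:
    """A higher-level process composed from existing services; being
    callable itself, the orchestration is again a service."""
    def __init__(self, *steps):
        self.steps = steps

    def __call__(self, message):
        for step in self.steps:
            # A real engine would wrap each step with fault handlers
            # and compensation handlers for error recovery.
            message = step(message)
        return message

process_order = Orchestration(check_stock, bill_customer, ship)
print(process_order({"item": "book", "qty": 1}))
```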
(Figure content, top to bottom: Services; SOA pattern; Business solutions management – process automation pattern; Context-aware profile management – workflow management; Orchestration engine; Internet access; Users.)
Fig. 1. Main layers of SOA architecture
3 From SOA Orchestration to Interactive System Orchestration
Interaction orchestration is not a common term. As mentioned above, orchestration describes the automated arrangement, coordination, and management of complex computer systems and services. It is often discussed as having an inherent logic or intelligence, or even implicitly autonomic control. In reality, orchestration is largely the effect of automation, or of systems deploying elements of control theory. As seen in figure 1, this architecture is mainly back-office oriented; no user considerations are pointed out, i.e. environment, working conditions, … The HCI point of view appears as complementary, i.e. mainly concerned with front-end user considerations: working conditions, interaction devices and environment. From the HCI point of view, the most related concept is adaptation, primarily in terms of portability [6] (adaptation to interaction devices – hardware), and now in terms of plasticity [7] (adaptation to the context – platform & user & environment). A very interesting in-depth taxonomy of adaptation is presented in [8] and summarized in figure 2. The adaptation is, on the one hand, defined by this taxonomy and, on the other hand, by the way it is obtained. Four approaches, forming two pairs of opposites, can be used: explicit versus implicit adaptation (adaptability versus adaptivity) and static versus dynamic adaptation.
Fig. 2. Adaptation taxonomy [8]
With respect to SOA orchestration, we can either adapt the SOA back-office orchestration view to HCI front-office orchestration and rename HCI adaptation to HCI orchestration, or we can maintain the distinction between the "classical" HCI
adaptation – plasticity term for the well-defined HCI field approach and only use the orchestration term for new aspects, related to specific aspects of the SOA view of orchestration. As mentioned above, there are several design factors which need to be addressed to achieve both orchestration and choreography. The key design attributes for orchestration are participant and role definition. For choreography, message structure, asynchronous communication, message rules and invocation are some of the important design factors. We propose to study the HCI adaptation and SOA orchestration views of interactive systems in order to take into account three user situations (single user, multi-user and cooperating user), related context characteristics (platform, user, environment), which can be more or less sophisticated, and collaboration – cooperation requirements.
Table 1. Taxonomy of user situations and associated adaptation and orchestration approaches
                              | Single user | Multiple users             | Cooperative users
Platform                      | Adaptation  | Adaptation & Orchestration | Adaptation & Orchestration
User preferences              | Adaptation  | Adaptation & Orchestration | Adaptation & Orchestration
Environment                   | Adaptation  | Adaptation & Orchestration | Adaptation & Orchestration
Mixed reality interaction     | Adaptation  | Adaptation & Orchestration | Adaptation & Orchestration
Role and process requirements | X           | Orchestration              | Orchestration
Cooperation requirements      | X           | X                          | Orchestration
As indicated in table 1, across the three user contexts (single, multiple and cooperative users) the overall adaptation process is either HCI adaptation oriented (for all aspects of single users), or a mix of HCI adaptation and SOA orchestration when it comes to multiple and cooperative users (single-user facets based on HCI adaptation, and multiple and cooperative aspects based on SOA orchestration). It is purely SOA orchestration-oriented for role and process requirements and cooperation requirement aspects.
4 HCI Model-Based Approach and Orchestration
To integrate these so-called SOA orchestration aspects into the HCI view of orchestration, it is important to introduce new models into the model-based view of the HCI systems design process. As generally admitted [9], five complementary models are used for the requirements, design, development and use life cycle of interactive systems. They are, respectively:
• Task model, defining user and computer actions and activities and their temporal organization.
• Concepts or data model, expressing the application world to be represented in the computer.
• Dialog model, expressing dialog organization, structure and temporal organization without look-and-feel aspects.
• Abstract interaction model, expressing abstract interaction style and interaction and presentation aspects without concrete considerations.
• Concrete interaction models, expressing concrete instantiations of all aspects of the user interface.
Fig. 3. Model-based process for Interaction systems elaboration and orchestration
Different transformation processes can be proposed based on these models [9]. We prefer the process (Fig. 3) that starts with the task model, continues with the data model, and elaborates a dialog model based on these two models. An abstract interface model and several concrete interaction models can then be elaborated. The context adaptation model is used to adapt the user interface to different identified situations. For multi-user and collaborative interactive systems we need two complementary models (the static and dynamic multi-user functional models) [10], which are able to express in a static and dynamic way the situation (role and context) of each user. This information is used to provide orchestration for each user. Manipulation of these different models can be done in different ways: either they are statically accessible to developers before execution (activation) of the system, or they are dynamically accessible during execution for dynamic adaptation. These interfaces are called meta-interface [11] by J. Coutaz and extra-interface [12] by G. Calvary. We offer a further approach [13], which provides this adaptation interface not only to the developer, but also to the user himself. In the AMF-C approach, based on multi-faceted agents, the control of each agent is expressed in a graphical way. This clean definition of the relation between facets, i.e. presentation (interaction) and abstraction, can be dynamically modified by the user in order to modify the associated behavior. This graphical formalism is not only available at
design stage, for the designer, but can also be given to the final user. This way, he can adjust the awareness of actions on demand during execution, i.e. his actions (or the results of his actions) can be, or not be, propagated to other users. As shown in fig. 4, a particular configuration of the propagation administrator allows the result of an action of user 1 to be propagated, or not, on demand to user 2.
Fig. 4. Graphic manipulation of control in the AMF-C architecture to allow, or not, awareness of user 1's actions by user 2. The target port is activated when a source port is activated and the administrator's lock is open.
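The lock mechanism of Fig. 4 can be paraphrased in code as follows; this Python sketch uses invented names and omits the facets and ports of the real AMF-C formalism.

```python
# Sketch of the "activation lock" idea from Fig. 4 (hypothetical names).

class Port:
    def __init__(self, handler):
        self.handler = handler

class Administrator:
    """Propagates a source port's activation to a target port only while
    its lock is open, so awareness can be toggled at run-time."""
    def __init__(self, target):
        self.target = target
        self.unlocked = True

    def activate(self, payload):
        if self.unlocked:
            self.target.handler(payload)

user2_view = Port(lambda msg: print("user 2 sees:", msg))
admin = Administrator(user2_view)

admin.activate("user 1 started an action")   # propagated
admin.unlocked = False                       # the user closes the lock
admin.activate("user 1 did an action")       # suppressed
```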
In the case of multi-user interactive systems, orchestration mainly means the adaptation of the application behavior to the role of the user and the adaptation of the system to all adaptation characteristics expressed in the user model. This adaptation is either static, if the characteristics of the process and the distribution of users' roles are defined before execution, or dynamic, if the process can evolve dynamically and the user can change his role in the process as well as the context characteristics (mobility, i.e. geographical or logical location, platform changes, …).
5 Orchestration of Collaboration
In the case of cooperative systems, orchestration means organization of the work of each actor and, mainly, inter-actor exchanges. The users working in CSCW situations are concerned by both the front-end and back-end views of orchestration. Specification and dynamic evolution of collaborative situations and processes must be expressed to
allow the orchestration engine to manage collaboration situations and processes and their changes. To express this behavior and the associated orchestration we use the ORCHESTRA formalism [14], whose objective is to describe the cooperative behavior of the different actors of the process. The formalism is based on the metaphor of musical scores expressed separately for each actor (musical instrument), as is done in an orchestra score for the conductor. This way, a global view of collaboration is expressed and can be used for orchestration, i.e. allocation of tasks to the actors with respect to their roles, establishing cooperation, coordination and conversation in relation to the expressed strategies and styles. Based on this formalism, orchestration can be performed according to this description. Cooperation patterns have been proposed to exemplify the most common cooperation situations.
Fig. 5. ORCHESTRA description of heating maintenance activities example
In the ORCHESTRA formalism (Fig. 5), for each actor (a score is devoted to each) it is possible to describe his instantaneous behavior by means of a chord composed of several keys describing the step or transition of the collaborative workflow concerned, the tasks, actions or procedures authorized, as well as the artifact-tools and artifact-objects used or produced, and in which environment(s). Cooperation between actors is expressed by means of a vertical presentation of the scores of each actor. Left-to-right time progress is the main principle. However, each cooperation episode (between two vertical separators) can be "played" individually, with the possibility to jump from episode to episode, to skip an episode, or to repeat it in order to clearly express a
working process. To each key a more precise, in-depth textual description is associated, specifying the behavior. In fact, this specification of actor behavior is not based on the actor himself, but on a more basic role behavior description. An actor description is the composition of different roles. This accumulation is either simultaneous or sequential, i.e. an actor can cumulate several (compatible) roles at the same time or take different roles in different cooperation episodes. In this way ORCHESTRA is able to provide appropriate information corresponding to the static or dynamic multi-user model mentioned above (Fig. 6), which can be used in static or dynamic orchestration of the interaction system.
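To summarize these concepts, the following Python sketch models roles, scores, chords and keys as plain data structures; the names and fields are our own simplification of the formalism defined in [14].

```python
# Sketch of the main ORCHESTRA concepts as data structures (hypothetical
# names; a simplification of the formalism in [14]).
from dataclasses import dataclass, field
from typing import List

@dataclass
class Key:
    kind: str          # e.g. "task", "action", "artifact-tool", "artifact-object"
    value: str
    description: str = ""          # the in-depth textual description

@dataclass
class Chord:                       # an actor's instantaneous behavior
    keys: List[Key]

@dataclass
class Role:                        # behavior is specified per role, not per actor
    name: str
    score: List[Chord] = field(default_factory=list)   # one chord per episode

@dataclass
class Actor:                       # an actor is a composition of roles
    name: str
    roles: List[Role]

technician = Role("technician", score=[
    Chord([Key("task", "diagnose boiler"), Key("artifact-tool", "multimeter")]),
    Chord([Key("task", "replace part"), Key("artifact-object", "spare valve")]),
])
alice = Actor("Alice", roles=[technician])   # roles can be re-assigned dynamically
```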
Fig. 6. Main concepts manipulated by ORCHESTRA and associated patterns [14]
6 Case Study
In a sophisticated collaborative maintenance system, two categories of actors are concerned. Fixed, "seated" actors: a secretary collecting intervention calls, a clerk and administrative person in charge of maintenance contract management (billing, renewal, …), a warehouseman (supplying and stock management) and a supervisor (managing maintenance interventions); and moving, mobile actors: a technician, a technician in chief, an expert. In reality, mainly for mobile actors, their activity is closely related to the precise maintenance contract specific to each case. For efficiency and to meet contractual repair deadlines, it is important to be able to call up an appropriate number of technicians and to allocate precise tasks and deadlines among them. For this reason, the overall ORCHESTRA description must be complemented by in-depth textual descriptions, mainly for the tasks to be done, the artifact-tools used and the artifact-objects used, created or updated. In the same way, the description is not based on actors, but on roles; actors are instantiated by the supervisor through the composition of several roles, either simultaneous or sequential, and assigned to persons (technicians). This way, a precise orchestration of each actor is done either statically, before the beginning of the intervention, or dynamically, during the maintenance process. The orchestration engine is in charge of putting these requirements, elaborated by the supervisor, into practice.
7 Conclusion and Perspectives
In this paper we discussed orchestration from different points of view. We first recalled the common-sense definition, then we discussed orchestration in IT and, more deeply, in the SOA approach, where it is extensively used. We tried to study its use in the HCI field and we observed two main differences: in the SOA field, orchestration is mainly used for multi-user back-office task organization based on basic services. In this field, interaction refers to the relation between two software components. The front-office use in SOA is not yet well covered. In the HCI field, orchestration is not a common term and principle. In this front-office view, interactive systems are mainly user interface oriented and therefore concerned with concrete user interfaces taking into account plasticity principles (platform, user preferences and environment) in order to provide the user with an appropriate interface. The majority of interactive systems are single-user oriented and, for this reason, orchestration, used to manage multi-user activities, is not taken into account. A dimension which is part of SOA orchestration and which is deeply treated in HCI systems is adaptation, related to the plasticity principles (platform, user preferences and environment) and more [8] (Fig. 2). Multi-user and cooperative-user interactive systems are not yet in the mainstream of interactive systems. They are for this reason less studied. Multi-user systems can, from an HCI point of view, be considered as single-user systems, mainly for two reasons: 1/ no collaboration between users, and 2/ static definition, before execution, of user rights and duties. For these reasons, applying orchestration in the SOA sense to them seems legitimate. For true cooperative systems, the static or dynamic management of user rights and duties is mandatory. In this way, SOA orchestration can be transposed to this kind of interactive system. In order to take into account the orchestration of multi-user and collaborative systems, we suggested adding two new models, concerned with the static and dynamic description of the functionalities of the system, to the list of models used for the design – implementation – use process and for adaptation. The static multi-user functional model works on roles and can be compared to the UML class model. The dynamic multi-user functional model works on users, as concrete compositions of roles, and can be compared to the UML object model. These models can be used, in the same way as in the "orchestration engine" of SOA, to statically (before execution) or dynamically (during execution) adapt multi-user or collaborative systems through the accurate assignment of actors (and their roles). We had no space to describe the content and modeling of these new models accurately, but they are naturally XML-based in order to be treatable by different transformation engines, like the AMF engine [14]. For the future, two ways are open to HCI: either to rename all its adaptation activities and approaches and call them orchestration, in order to be understood by a large community of information system designers and users, or to preach the distinction between adaptation and orchestration and clearly advocate the differences.
However, just as SOA orchestration increasingly studies the front-office aspects of orchestration, the HCI field could take into account what happens in the back-office, in order to assess its impact on the front-office in general and on the User Interface in particular. It is no longer possible to keep the classic macro-module called the Functional Core (FC) as a black box; it seems important to inspect it and extract the information needed for UI design.
References
1. Siegel, J.: OMG Specifications for Business Modeling, The OMG standard (fall 2007)
2. Davies, D.: The Missing Link of SOA, Opinion Piece, International Developer, p. 71 (September 2006)
3. Perez, S., Diaz, O., Melia, S., Gomez, J.: Facing Interaction-Rich RIAs: The Orchestration Model. In: Eighth International Conference on Web Engineering, pp. 24–37. IEEE Xplore (2008)
4. Linthicum, D.: Why Orchestration Defines Your SOA, Toolbox for IT, Knowledge Sharing Communities (posted March 31, 2005), http://it.toolbox.com/
5. http://www.serviceoriented.org/orchestration_engine.html
6. Bono, P.R., Herman, P. (eds.): GKS Theory and Practice. Springer, Heidelberg (1987)
7. Thevenin, D., Coutaz, J.: Plasticity of User Interfaces: Framework and Research Agenda. In: Sasse, M.A., Johnson, C. (eds.) Interact 1999, pp. 110–117. IOS Press, Amsterdam (1999)
8. Rouillard, J.: Adaptation en contexte: contributions aux interfaces multimodales et multicanales. HDR dissertation, Université de Lille (2008)
9. Vanderdonckt, J.: A MDA-Compliant Environment for Developing User Interfaces of Information Systems. In: Pastor, Ó., Falcão e Cunha, J. (eds.) CAiSE 2005. LNCS, vol. 3520, pp. 16–31. Springer, Heidelberg (2005)
10. David, B., Ouadou, K., Sadou, S., Vial, C.: A Framework for Intelligent User Interfaces. In: Proceedings of OZCHI 1991, Sydney, Australia (1991)
11. Coutaz, J.: Meta-User Interfaces for Ambient Spaces. In: Coninx, K., Luyten, K., Schneider, K.A. (eds.) TAMODIA 2006. LNCS, vol. 4385, pp. 1–15. Springer, Heidelberg (2007)
12. Calvary, G.: Plasticité des Interfaces Homme-Machine. HDR dissertation, Université Joseph Fourier, Grenoble (2007)
13. Tarpin-Bernard, F., Samaan, K., David, B.: Achieving Usability of Adaptable Software: The AMF-Based Approach. In: Seffah, A., Vanderdonckt, J., Desmarais, M.C. (eds.) Human-Centered Software Engineering: Software Engineering Models, Patterns and Architectures for Human-Computer Interaction. Springer, Heidelberg (2009)
14. David, B., Chalon, R., Delotte, O., Masserey, G.: ORCHESTRA: Formalism to Express Static and Dynamic Model of Mobile Collaborative Activities and Associated Patterns. In: Jacko, J.A. (ed.) HCI 2007. LNCS, vol. 4550, pp. 1082–1091. Springer, Heidelberg (2007)
15. Chalon, R., David, B.T.: IRVO: An Interaction Model for Designing Collaborative Mixed Reality Systems. In: Proceedings of the 11th International Conference on Human-Computer Interaction (HCII 2005), Las Vegas, Nevada, USA (2005)
An Exploration of Perspective Changes within MBD
Anke Dittmar and Peter Forbrig
University of Rostock
{anke.dittmar,peter.forbrig}@uni-rostock.de
Abstract. Most current model-based design approaches tend to be specification-driven. Using task models solely at the specification level contradicts the very idea inherent to task-based design. This paper suggests looking for improvements of the situation at different levels of artifact use. First, HOPS is introduced as a general specification formalism for the interaction paradigm which also allows advanced task modeling. Second, we propose to apply formal modeling in different modes during the different stages of a design process; "task sketching" is elaborated in more depth. Third, a combination with complementary techniques is recommended to embed the development of formal system specifications in a reflective conversation between all stakeholders. An example is used throughout the paper to illustrate our ideas.
1 Introduction

Interaction design is typically a multidisciplinary process with different stakeholders involved. One basic idea of model- or task-based design approaches (MBD) is that design decisions are supported by the creation of (semi-)formal models which represent different perspectives on the domain of interest. Consulting different perspectives may help to broaden the view on the design problem and prevent designers from applying first-best solution strategies. Typical models in use are models of tasks, task domains, and users, as well as system descriptions such as dialogue models, functional-core models, or specific device models. Although the intention behind MBD is to develop useful and usable interactive systems, such approaches are often criticized for their narrow view on tasks and human activity in general. However, a more nuanced view on (semi-)formal design approaches suggests that they should be neither condemned nor glamorized. It is a truism that artifacts allow people to deal with the world in new ways but also impose limitations by focusing on certain aspects only and marginalizing others. Formal or semi-formal models are abstract descriptions; hence they can never reflect the full richness of an actual situation. However, by using abstractions people are able to recognize recurrent patterns in different situations. Every modeling formalism has underlying assumptions and supplies more or less elaborated means (modeling constructs) to describe the domain of interest appropriately. As a consequence, abstract models may guide the design of envisioned interaction situations, while at the same time they may prevent designers from seeing a problem "with different lenses". When exploring an artifact, in this case a design approach, it is useful to distinguish three facets.
− Which properties are inherent to the artifact itself?
− Which modes of thinking/acting are particularly supported by the artifact?
− What is the common or prevalent culture of using the artifact?

Of course, these facets are related. By thinking about the properties of an artifact one may develop new modes of interactive use. By using an artifact one may discover new properties. Certain artifacts may be "pre-assigned" to certain communities of practice, which may prevent others from applying them, and so on. This paper investigates all three facets with respect to task-based design. We introduce HOPS as a general specification formalism for the interaction paradigm. In particular, we concentrate on task modeling in the illustrative examples to show the advantages of HOPS over existing modeling approaches (Sect. 4). Furthermore, we suggest applying (semi-)formal modeling in different modes during the different stages of a design process. We illustrate the "sketchy mode" in more depth (Sect. 5). In this mode, incomplete and inconsistent descriptions and model variants are even desired. They reflect different perspectives and help to provoke discussion and a more thorough elaboration of the design space. Although model-based design techniques reflect user perspectives on tasks and work processes, they neither require nor support true stakeholder participation. A combination with complementary approaches is needed (Sect. 6). This is also in line with our previous work in [1,2].
2 Related Work

2.1 Prevailing Practices in MBD

Model-based approaches assume that task knowledge can be partially described and that task representations support more sensitive design. Table 1 shows that early publications already recommend modeling both existing and envisioned task situations. Which practices should be changed, and what is worth maintaining? How can functional specifications be derived from existing knowledge of task domains? A dialogue about possible effects of task reallocation is necessary.

Table 1. Suggested names for representations of existing and envisioned task situations

(Extant)                    (Target)
Generalised Task Model      Composite Task Model [3]
Existing Task Model         Envisioned Task Model [4]
Task Model 1                Task Model 2 [5]
However, a critical review of existing work in MBD rather reveals modeling practices as indicated in Fig. 1. One rarely finds existing task models. Most papers focus on the exploitation of task models for deriving systems specifications, in particular UI models, in a more or less direct way. The design process is mainly specification driven. One consequence is that the models in use do not necessarily reflect different, even conflicting design perspectives anymore. For example, task models and dialog models often become indistinguishable (see e.g. [6,2] for a discussion). Stakeholders are not encouraged to ask what-if questions and to explore the design space.
Fig. 1. Most common models in current MBD approaches
2.2 Common Modes of Use in MBD

Carroll points out that traditional task analysis assumes correct and complete structural task descriptions [7]. Terms like those mentioned in Table 1 (left-hand side) and common modeling practices in MBD seem to support this statement. Generally, formal approaches are often focused on achieving complete and correct specifications. This makes sense in many contexts. For specifications of interactive systems it is additionally desirable that they be executable on a machine. However, it does not make sense in every context. For example, Carroll notes that the objective of task modeling is often optimal performance. Supposedly complete or correct models may hinder the consideration of other important aspects of work [7]. Fowler distinguishes in [8] between three modes in which people use UML.

− UML as sketch: "developers use the UML to help to communicate some aspects of a system… The essence of sketching is selectivity."
− UML as blueprint: is about completeness, e.g. a detailed design for a programmer to code up.
− UML as programming language: "UML diagrams are compiled directly to executable code" [8].

In MBD and related approaches, such as model-driven development of software systems, the last two modes prevail. Our suggestion is to use models as sketches more intensively in MBD. We elaborate this mode of use in more depth with respect to task modeling in Sect. 4.

2.3 Review of Task Modeling Formalisms

Most existing modeling formalisms allow a hierarchical decomposition of tasks and the specification of temporal dependencies between sub-tasks (e.g. HTA, TKS, MAD, CTT, GOMS; for a detailed overview, see [9]). A task model describes a set of sequences of basic tasks; the task-related goal can be accomplished by performing one of these sequences. Sometimes the use of, and the effect on, task domain objects can be specified (e.g. TaOSpec [2]). However, objects are always seen as "second class" [10]. Dix et al. propose more sophisticated artifact life cycles alongside task descriptions: "Users recruit their everyday knowledge and properties of the world to coordinate their activity."
Approaches like GTA [5] and CCTT [11] aim to describe cooperative work by introducing concepts such as roles and actors. Fig. 2 shows two roles, Lecturer and Registrar, in a CCTT model. A lecturer is responsible for grading students on the basis of homework and quizzes; the registrar files the results students achieve. The cooperative tree on the left side specifies the coordination between the roles.
Fig. 2. CCTT-model with two roles and a task scenario with the corresponding enabled task set
The authors of [12] criticize task modeling for considering tasks as "discrete, isolated chunks of behavior". Indeed, a scenario of a CCTT model describes cooperation as a sequence of coordinated actions assigned to roles (see Fig. 2); the essence of interaction may be lost. Does this mean that MBD rather supports a "mechanization" of work, even though it is well-intentioned? We suggest that improvements are necessary and possible at all three levels of artifact use mentioned in the introduction. Modeling formalisms have to be developed which reflect a richer task understanding. However, even improved techniques will continue to supply idealized and normative descriptions. On the one hand, this is necessary for developing software systems; on the other hand, we need to learn how to merge (semi-)formal and complementary design approaches in a better way.
3 Modeling with HOPS

HOPS is a prototypical tool for specifying and animating interactive systems. It is based on the concept of higher-order processes and allows a unified description of the structural and behavioral aspects of a system; for more details see e.g. [13]. The essential elements of the approach are operations and processes. Operations refer to the smallest units of behavior in a certain context of analysis or design. They are represented by names. They happen without interruption but can be characterized by pre- and postconditions. Processes are abstractions over operations. They have an inner structure, and interruption is inherent to them. Each process defines a set of
operations. Furthermore, processes can contain components, which are processes themselves. An enclosing process describes how its components interact with each other, while at the same time these components constitute its environment. An essential characteristic of higher-order processes is that they do not fully control their environment. Only those operations of components are considered (in the focus of the process) which are relevant for describing the interaction of components. Operations defined in a higher-level process can "absorb" operations of components and describe some new "atomic" behavior. HOPS additionally allows the assignment of "foreign code" to operations; this feature is applied in Sect. 4. However, a higher-level process controls only those behavioral aspects which are in its focus. It does not impose unnecessary restrictions on the rest of the components' behavior. Generally, behavior is described by sets of sequences of operations. Constraints reduce the number of valid sequences. In HOPS, valid sequences of operations – valid in the context of a process – are specified by partial equations and pre-defined temporal and structural operators. In addition to operations and components, a process definition can contain sub-processes specified by such partial equations. A sub-process describes a partial behavior of its process. A sub-process can furthermore consider additional components and hence introduce operations which were not in the initial focus of the process. Sub-processes are used mainly for two purposes: first, for describing states in pre- and postconditions; second, for putting different processes into a common context or for describing specialized sub-structures.

An illustrative example

In the following example, sub-processes are used for the second purpose only. We take up the task domain of Fig. 2. Some basic ideas of HOPS and of the underlying modeling philosophy are shown: sub-systems (represented by processes) exist independently, their interaction can produce new, uninterruptible behavior (represented by operations of a higher-level process), and processes have no full control over their environment (represented by components). We continue to use the example in the following sections to sketch the suggested modeling approach. Five processes are modeled. Teacher, CourseRequ, and Registrar are basic processes which contain no components but only simple operations (represented by names). The temporal operators in the partial equations are: [] for alternatives, * for iterations, [..] for options, ; for sequences and ||| for concurrency. Furthermore, the structural operators AND, XOR and NOT are used. For operator definitions see e.g. [13].

− Process CourseRequ knows three operations: homework, exam, and project. Two sub-processes, SE and HCI, describe the requirements for specific courses. For instance, in order to pass the HCI course one has to do one or two exercises and then pass an examination.
− Process Teacher reflects a task understanding of lecturers and tutors. A Lecturer has to deal with lists of students and has to evaluate homework, project work, and exams. Lecturers also decide about the grades students get for courses and publish these grades. There are no temporal restrictions on performing these operations (actions). A Tutor is a teacher who is not allowed to evaluate exams, to grade, or to publish grades.
− Process Registrar describes aspects of the work of a registrar in a student's office.
They get information about grades and file them in any order.
PROCESS Teacher
  OPS create_participant_list, eval_homework, eval_exam,
      eval_project, grade, publish_grades
  SUB PROCESSES
    Teacher  = (create_participant_list [] eval_homework [] eval_project []
                eval_exam [] grade [] publish_grades)*,
    Lecturer = Teacher,
    Tutor    = (NOT eval_exam) AND (NOT grade) AND (NOT publish_grades)
END PROCESS

PROCESS CourseRequ
  OPS homework, exam, project
  SUB PROCESSES
    CourseRequ = SE XOR HCI,
    BasicRequ  = exam,
    SE         = BasicRequ ||| project,
    HCI        = homework ; [homework] ; BasicRequ
END PROCESS

PROCESS Registrar
  OPS receive_grades, file_grades
  SUB PROCESSES
    Registrar = receive_grades* ||| file_grades*
END PROCESS
The two higher-level processes contain components, and their operations absorb operations of the components. An "op = << ... >>" construct denotes a sequence of lower-level operations which is now uninterruptible and is referred to by operation op. If an operation of a component occurs within << ... >>, it is in the focus of, and so controlled by, the enclosing process.

− Process Semester08 states that two lecturers and one tutor are involved in organizing the HCI course and the SE course, which run in parallel. The SE course has two groups, se1 and se2. The definitions of the operations reflect the responsibilities of the teaching staff. For example, tutor barton is responsible for the project work of group se1 as well as for homework in HCI, and he has agreed to this. The partial equations (following the keyword SUB PROCESSES) describe temporal constraints on performing the individual operations. For instance, only one HCI homework is required in this semester, though two would be possible according to the specification of sub-process HCI of CourseRequ.
− Process FileSemester focuses on how information about grades is passed on to the students' office.

PROCESS Semester08
  BASIC COMPS barton: Tutor(Teacher), smith: Lecturer(Teacher),
              potts: Lecturer(Teacher), se1: SE(CourseRequ),
              se2: SE(CourseRequ), hci: HCI(CourseRequ)
  OPS se1_project  = << ... >>,
      se2_project  = << ... >>,
      se_exam      = << ... >>,
      se_grading   = << ... >>,
      hci_homework = << ... >>,
      hci_exam     = << ... >>,
      hci_grading  = << ... >>
  SUB PROCESSES
    Semester08 = SE_Course ||| HCI_Course,
    SE_Course  = (se_exam ||| se1_project ||| se2_project) ; se_grading,
    HCI_Course = (hci_homework ||| hci_exam) ; hci_grading
END PROCESS
PROCESS FileSemester
  BASIC COMPS sem: Semester08, thompson: Registrar
  OPS transmitHCI = << ... >>,
      transmitSE  = << ... >>
  SUB PROCESSES
    FileSemester = transmitHCI ||| transmitSE
END PROCESS
HOPS is implemented in Prolog. The GUI is implemented in Java using JPL. A screenshot of an animation run of process FileSemester is given in Fig. 3. It shows that, after executing se1_project, se_exam, and se2_project in this order (see left side of the window), the operation transmitSE of the highest-level process is enabled (see right side of the window). However, some operations, which are enabled by lower-level processes and are not in the focus of process FileSemester, could also be performed in the next step. That is, higher-level processes only partially influence the behavior of lower-level ones.
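The notion of an enabled-operation set can be made tangible with a small sketch. The Python below is our own simplification (a process reduced to an explicit set of valid traces), not the Prolog engine of HOPS; it computes which operations are enabled after a given history, for bodies combined with the interleaving operator |||.

def enabled(traces, history):
    # An operation is enabled iff some valid trace extends the history so far.
    ops = set()
    for trace in traces:
        if trace[:len(history)] == tuple(history) and len(trace) > len(history):
            ops.add(trace[len(history)])
    return ops

def interleave(t1, t2):
    # All interleavings of two traces: the meaning of t1 ||| t2.
    if not t1: return {t2}
    if not t2: return {t1}
    return {(t1[0],) + r for r in interleave(t1[1:], t2)} | \
           {(t2[0],) + r for r in interleave(t1, t2[1:])}

# SE_Course = (se_exam ||| se1_project) ; se_grading   (se2_project omitted)
traces = {t + ("se_grading",) for t in interleave(("se_exam",), ("se1_project",))}
print(enabled(traces, []))                            # {'se_exam', 'se1_project'}
print(enabled(traces, ["se_exam", "se1_project"]))    # {'se_grading'}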
Fig. 3. An animation run of process FileSemester in the HOPS-GUI
What is offered to task modeling? The suggested modeling approach may be more appropriate for understanding task situations as interactive situations. HOPS specifications make it possible to reflect different perspectives on tasks at different levels of abstraction. A process can describe a role, but also artifacts or work processes, as illustrated in the example. Task objects are no longer "second class". Components can be used, e.g., for describing actual actors, artifacts and their interaction. In the example, three teachers give courses with certain requirements in a specific semester; the description concentrates on the cooperative task of grading students. Traditional task models can be expressed by simple process definitions without components: sub-processes represent sub-tasks, operations represent basic sub-tasks, and temporal constraints can be specified by partial equations. However, the expressiveness of HOPS far exceeds these concepts.
4 Sketching Task Situations with HOPS

As indicated in Sect. 2, most task-based approaches assume generalized models representing task knowledge of a specific domain, independently of the detail of the individual tasks analyzed or imagined. However, Diaper argues that "only a small number of tasks can be selected for analysis" [14]. Tasks can be fulfilled by the same or by different actors in a rich variety of ways, which we can only observe in parts. He also refers to the problem of task combination and states that few methods are actually able to combine different tasks into a single task representation, though they claim to do so. In [2], we suggested understanding task models as fragmentary descriptions which may be enriched by stories (e.g. [1]) or illustrations, such as the one in Fig. 4 for our example. In [15], Fallman points out that sketching supports the idea of design as a dialogue or reflective conversation. He emphasizes that sketching is "not simply an externalization of ideas already in the designer's mind, but on the contrary a way of shaping new ideas." It is the actual process of creating "task sketches" which may help to embed the development of formal system specifications in an argumentation process. Stakeholders are encouraged to look at the current task situation from different perspectives. "Inconsistencies" between descriptions may support a dialogue about (and sketching of) more coherent envisioned situations. In HOPS, "foreign calls" can be assigned to operations. Such a call can invoke, e.g., the execution of Java code. We used this feature to couple illustrations and operations. During the animation of a task model, the illustrations appear in Java frames along with the operations. The frames are titled with the names of the components which perform the corresponding operations. An impression is given in Fig. 5. The actual task scenario is shown on the left side. On the right side, visualizations for the executed operations are shown. They give hints to artifacts like the different lists and forms which the actors use for recording students' results. For example, the frames linked to operation transmitSE (6) depict what is needed by Prof. Smith in order to do the grading for the SE course and how the results are sent to the registrar. The figure associated with (7) sketches problems Mrs. Thompson sometimes encounters while filing grades.
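The coupling of illustrations to operations can be understood as a callback table consulted by the animation loop. The sketch below uses invented names; in the actual tool the dispatch goes from Prolog to Java via JPL, which is not shown here.

foreign_calls = {}                        # operation name -> list of callbacks

def on_operation(op_name, callback):
    foreign_calls.setdefault(op_name, []).append(callback)

def execute(op_name, component):
    print("executing", op_name)
    for cb in foreign_calls.get(op_name, []):
        cb(component)                     # e.g. open a frame with an illustration

def show_sketch(component):
    print("[" + component + "] displaying illustration for this step")

on_operation("transmitSE", show_sketch)
execute("transmitSE", "smith")            # frame titled by the performing component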
Fig. 4. Overview of the current task situation in the illustrative example
Fig. 5. An animation run of process FileSemester with illustrations assigned to operations
5 Revisiting MBD: A Brief Summary

Task-based design is a means of embedding interaction design into the analysis of current task situations and the design of envisioned ones. However, most current MBD approaches tend to be specification-driven. Often, task models even lose their character as descriptions of task practices and can rather be understood as dialog models of the envisioned technical system. This undercuts the very idea of MBD. This paper suggests looking for improvements of the situation at three different levels of artifact use. HOPS was introduced as a specification formalism for the interaction paradigm, and an example was given to illustrate how it can be used for advanced task modeling. We emphasize the importance of "task sketching" for the emergence of design ideas out of a shared understanding among stakeholders about current practices. Task sketches combine formal and informal representations and may support a smooth transition to the formal specifications needed for developing interactive systems.
References
1. Dittmar, A., Forbrig, P.: Bridging the Gap between Scenarios and Formal Models. In: Proc. of HCII 2003 (2003)
2. Dittmar, A., Gellendin, A., Forbrig, P.: Requirements Elicitation and Elaboration in Task-Based Design Needs More Than Task Modelling: A Case Study. In: Coninx, K., Luyten, K., Schneider, K.A. (eds.) TAMODIA 2006. LNCS, vol. 4385, pp. 275–291. Springer, Heidelberg (2007)
3. Lim, K.Y., Long, J.: The MUSE Method for Usability Engineering. Cambridge University Press, Cambridge (1994)
4. Johnson, P., Wilson, S.: Bridging the Generation Gap: From Work Tasks to User Interface Designs. In: Proc. of CADUI 1996 (1996)
5. van der Veer, G.C., Lenting, B.F., Bergevoet, B.A.J.: GTA: Groupware Task Analysis - Modeling Complexity. Acta Psychologica 91, 297–322 (1996)
6. Cass, A.G., Fernandes, C.S.T.: Using Task Models for Cascading Selective Undo. In: Coninx, K., Luyten, K., Schneider, K.A. (eds.) TAMODIA 2006. LNCS, vol. 4385, pp. 186–201. Springer, Heidelberg (2007)
7. Carroll, J.M.: Scenarios and Task Analysis as Design Methods. In: Workshop on Exploring Design as a Research Activity, CHI 2007 (2007), http://www.chi2007.org/attend/workshops.php
8. Fowler, M.: UML Distilled: A Brief Guide to the Standard Object Modeling Language, 3rd edn. Addison-Wesley, Reading (2003)
9. Limbourg, Q., Vanderdonckt, J.: Comparing Task Models for User Interface Design. In: [16] (2004)
10. Dix, A., Ramduny-Ellis, D., Wilkinson, J.: Trigger Analysis: Understanding Broken Tasks. In: [16] (2004)
11. Paternò, F., Santoro, C., Tahmassebi, S.: Formal Models for Cooperative Tasks: Concepts and an Application for En-Route Air-Traffic Control. In: DSV-IS 1998. Springer, Heidelberg (1998)
12. Randall, D., Hughes, J., Shapiro, D.: Steps towards a partnership: Ethnography and system design. In: Jirotka, M., Goguen, J. (eds.) Requirements Engineering: Social and Technical Issues. Academic Press, San Diego (1994)
13. Dittmar, A., Hübner, T., Forbrig, P.: HOPS: A prototypical specification tool for interactive systems. In: Graham, T.C.N., Palanque, P. (eds.) DSV-IS 2008. LNCS, vol. 5136. Springer, Heidelberg (2008)
14. Diaper, D.: Understanding Task Analysis for Human-Computer Interaction. In: [16] (2004)
15. Fallman, D.: Design-oriented Human-Computer Interaction. In: Proc. of CHI 2003 (2003)
16. Diaper, D., Stanton, N.A. (eds.): The Handbook of Task Analysis for Human-Computer Interaction. Lawrence Erlbaum Associates, Mahwah (2004)
Rapid Development of Scoped User Interfaces
Denis Dubé, Jacob Beard, and Hans Vangheluwe
School of Computer Science, McGill University, Montreal, Québec, Canada
Abstract. As the demand for domain- and formalism-specific visual modelling environments grows, the need to rapidly construct complex User Interfaces (UIs) increases. In this paper, we propose a Multi-Paradigm Modelling (MPM) approach whereby structure, visual appearance and, above all, reactive behaviour of a UI are explicitly modelled. These models are constructed at the most appropriate level of abstraction, using the most appropriate modelling formalisms. This allows for rapid application synthesis, easy adaptation to changing requirements, and simplified maintenance. In this paper, we introduce Scoped User Interfaces, and illustrate how one may model them using Hierarchically-linked Statecharts (HlS). The use of HlS is demonstrated through the rapid development of a DChart formalism-specific modelling environment.
1 Introduction

There are many challenges developers face during the development of a complex User Interface. Desired behaviour may be autonomous or reactive, and possibly real-time. Each UI component may be required to exhibit radically different behaviour from that of any other component, and the behaviours of components may be inter-related. These complex behavioural relationships between components are often difficult to express, and are even more difficult to encode and maintain. There are also difficulties related to the development process: the developer must be able to rapidly adapt the structure and behaviour of the UI to changing system requirements. Unfortunately, conventional code-centric approaches fall short. Hence, a developer needs to capture the structure and behaviour of a UI such that "accidental complexity" [3] is minimized. We claim that an elegant solution to these problems may be found in Multi-Paradigm Modelling [15]. By modelling every aspect of the system-to-be-built, at the most appropriate level of abstraction, using the most appropriate formalisms, it becomes possible to completely capture the structure, behaviour and visual appearance of a UI, to rapidly generate prototype implementations, to easily adapt the UI as project requirements change, and, finally, to synthesize a UI and maintain it. The modelling of UIs is an active field of research. Navarre et al., for example, developed an architecture capable of handling failures of input and output devices [16]. The goal of their research is to facilitate specification, validation, implementation, and testing of User Interfaces, and to achieve plasticity, or dynamic reconfiguration of user interfaces, not just for visual appearance, but also
for behaviour. In this approach, the behaviour of each UI component is explicitly modelled in the ICO formalism, a variant of Petri Nets [2]. The Presentation Framework of VMTS, the Visual Modeling and Transformation System [13], on the other hand, provides a flexible environment for model visualization and a declarative solution for appearance description. The project leverages XAML, an XML-based user interface descriptor language, to describe not only the static appearance of the UI but also the dynamic behaviour of an element. There is much similarity between our approach and the cited projects: we believe in facilitating rapid, domain-specific modelling of the UI, and that this may best be achieved by explicitly modelling the behaviour of each individual UI component. Our approach distinguishes itself in several important ways from related research. First, we attempt to solve the problems of UI development by casting it as a pure "language engineering" problem. Second, we are primarily concerned with modelling the reactive behaviour of the class of user interfaces that are made up of hierarchically-nested entities. The following section introduces the notion of Scoped User Interfaces and their uses. Section 3 presents Hierarchically-linked Statecharts (HlS) and Section 4 demonstrates the use of HlS to model a visual modelling environment for DCharts, an extension of the Statecharts formalism [11]. Finally, we conclude and give some directions for future work.
2 Scoped User Interfaces
A Scoped User Interface is one in which reactive visual components (widgets) such as buttons and windows, but also domain-specific entities, are hierarchically nested. At the highest level of the hierarchy, widgets exhibit general behaviour; deeper in the hierarchy, widgets have more specific behaviour. The notion of scope as it pertains to a UI is analogous to that in high-level programming languages, which provide a syntactic means of specifying the hierarchical scope of a variable. The latter is used by the compiler to bind a variable use to its declaration by searching successively higher levels in the scope hierarchy until reaching the global variable space. In a UI, the variable becomes an event and the bounding boxes of graphical entities become scope delimiters. A Scoped UI, then, is one which has a notion of hierarchical scope and can bind an event to the most tightly-binding component in a hierarchy, based upon event coordinates (a short code sketch of this dispatch is given below). There are many real-world examples in which Scoped UIs are implicitly used. Hereafter, we focus on domain/formalism-specific modelling environments. Domain- and formalism-specific modelling has the potential to greatly improve productivity because such environments [12]:

– match the user's mental model of the problem domain;
– maximally constrain the user (to the problem at hand, through the checking of domain constraints), making the language easier to learn and avoiding modelling errors "by construction";
– separate the domain expert's work from the analysis and transformation experts' work;
– are able to exploit features inherent to a specific domain or formalism; this enables, for example, specific analysis techniques or the synthesis of efficient code exploiting features of the specific domain.

While editing in multiple formalisms within a single environment is highly desirable, it is important to be able to delimit each formalism's scope. This is achieved by assigning formalism-specific behaviour based on graphically-delimited regions. The primary challenge faced when developing Scoped UIs is to describe the interaction between the user on the one hand and the various entities in the UI on the other. As those entities may exhibit reactive as well as autonomous, timed behaviour, it makes sense to consider them as "actors" [1,8]. The second challenge is to avoid creating an entirely new specification of UI behaviour for each formalism, and rather to modify an existing one to suit special requirements. Hence, it makes sense to have at the root level a single, generic specification. Scope-specific modifications to this generic specification can then be made: for each entity of the scope that has specific user-interface requirements, a specific UI specification is created, specializing or complementing the more generic one.
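The scope-based binding sketched in prose above can be written down in a few lines. The Python below is our own reading of the analogy, not AToM3 code: dispatch descends to the most tightly-binding component whose bounding box contains the event coordinates, then climbs back toward the root until some scope handles the event.

class Component:
    def __init__(self, name, box, handlers=None, children=()):
        self.name, self.box = name, box        # box = (x1, y1, x2, y2)
        self.handlers = handlers or {}         # event name -> action
        self.children = list(children)

    def contains(self, x, y):
        x1, y1, x2, y2 = self.box
        return x1 <= x <= x2 and y1 <= y <= y2

def dispatch(root, event, x, y):
    path, node = [], root                      # descend, remembering the path
    while node is not None and node.contains(x, y):
        path.append(node)
        node = next((c for c in node.children if c.contains(x, y)), None)
    for scope in reversed(path):               # climb until some scope handles it
        if event in scope.handlers:
            return scope.handlers[event](scope)
    return None

canvas = Component("DChart", (0, 0, 100, 100),
                   {"click": lambda s: "generic click in " + s.name},
                   [Component("Composite", (10, 10, 50, 50),
                              {"click": lambda s: "formalism-specific click in " + s.name})])
print(dispatch(canvas, "click", 20, 20))   # handled by the inner Composite scope
print(dispatch(canvas, "click", 80, 80))   # falls back to the DChart root scope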
3 Hierarchically-Linked Statecharts
Hierarchically-linked Statecharts (HlS) is a formalism for visually describing the structure and behaviour of Scoped UIs, based on a combination of UML Class Diagrams and Statecharts [11]. UML Class Diagrams are used to describe permissible relationships, such as containment and connectivity, between UI components. Statecharts are used to encode the reactive behaviour of individual visual entities and their interactions. As will be demonstrated in Section 4, HlS makes it easier to develop applications with complex UI behaviour faster and more reliably. This is possible because HlS allows the developer to see UI development as a language engineering problem. Specifically, HlS entails the following work-flow:

1. One uses an appropriate formalism, such as UML Class Diagrams, to specify the Abstract Syntax of the visual language. This entails specifying all elements in the domain one wishes to model and qualifying their relationships with other elements. This Class Diagram, together with constraints over its elements, is commonly known as a meta-model.
2. Subsequently, one models the Concrete Visual Syntax by associating a visual entity (such as an iconic shape [4]) with each element of the application being developed.
3. One finally specifies UI behaviour using Statecharts, such that each Statechart is associated with a class and specifies the reactive behaviour of each instance of that class. The Statechart "glues" together
– reaction to user events such as mouse clicks and key-presses;
– interaction with the non-visual part of the language, in particular checking the well-formedness of constructs against the Abstract Syntax specification, as well as reflecting the Semantics of the language, which is often encoded as transformation rules;
– layout operations which act exclusively on the Concrete Visual Syntax.

The Abstract Syntax, Concrete Syntax, and Behaviour models are sufficient to specify the structure, behaviour and appearance of a visual language, each sentence of which is a valid application instance. This visual language specification is sufficient to allow the automatic synthesis of a language-specific modelling environment.
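A minimal sketch of how the three artefacts could fit together for two DCharts-like elements follows; the dictionary layout and names are our own illustration, not the AToM3 model format. The meta-model constrains containment, the concrete syntax supplies an icon, and each instance receives its class's Statechart.

abstract_syntax = {                       # meta-model: containment constraints
    "Composite": {"may_contain": ["Basic", "Composite", "History"]},
    "Basic":     {"may_contain": []},
}
concrete_syntax = {                       # class -> iconic shape description
    "Composite": {"shape": "rounded_rect"},
    "Basic":     {"shape": "rect"},
}
behaviour = {                             # class -> associated Statechart (stub)
    "Composite": "DC_Composite.statechart",
    "Basic":     "DC_Basic.statechart",
}

def instantiate(cls):
    # Creating an instance pulls all three models together.
    return {"class": cls, "icon": concrete_syntax[cls],
            "statechart": behaviour[cls], "children": []}

def add_child(parent, child):
    # Well-formedness check against the Abstract Syntax specification.
    if child["class"] not in abstract_syntax[parent["class"]]["may_contain"]:
        raise ValueError(parent["class"] + " may not contain " + child["class"])
    parent["children"].append(child)

root = instantiate("Composite")
add_child(root, instantiate("Basic"))     # allowed by the meta-model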
4 Example
To demonstrate the usefulness and feasibility of explicitly modelling UI behaviour using HlS, a visual modelling environment for the DCharts formalism was created. DCharts, a formalism created by Thomas Feng [9], is an extension of Statecharts. For the implementation, we use our own tool AToM3 [5,6] (A Tool for Multi-Formalism and Meta-Modelling).

4.1 Specifying DCharts Abstract and Concrete Syntax
The Abstract Syntax of the DCharts visual language is shown in Fig. 1, modelled in a variant of the UML Class Diagram formalism. It has classes with attributes, associations with multiplicities, and inheritance. The rectangular boxes in the class diagram describe the nodes/vertices of the visual language. The meaning of the nodes is as follows:

– DC DChart is a representation of the entire model. All other entities are contained by this entity.
– DC Basic corresponds to a simple state that does not hierarchically contain others.
– DC Composite is nearly identical to DC Basic. A major structural difference is that it can contain other states.
– DC History is the history (pseudo-)state.
– DC Orthogonal is an orthogonal block that allows for concurrently active states.

The entities whose icons have a hexagonal shape at the top describe relationships/edges of the visual language. The first type is the hierarchical-containment topological constraint relationship; the entities DC ChartContains, DC Contains, and DC Orthogonality are of this type. The second type of relationship is the visible arrows. DC Hyperedge is a simple directed transition between states. It carries common Statechart attributes such as a trigger, guard (condition), and action code, as well as DCharts-specific attributes: priority, broadcast code and a broadcast-to field.
Fig. 1. DCharts Meta-model in the Class Diagram formalism
4.2 Specifying Formalism-Specific Behaviour Using DCharts
Although the above models describe both the Abstract and the Concrete Syntax of the visual language, we still need to model the behaviour of a language-specific visual modelling environment. This is done in the form of Hierarchically-linked Statecharts. In the following, the labels on the states and transitions of the UI behaviour Statecharts use a custom notation. This notation does not change the expressiveness of the formalism, but it makes the communication between Hierarchically-linked Statecharts, which is encoded explicitly in transition, enter, and exit actions, more intuitive to the modeller. A star, x*, indicates that action code is present. A plus, x+, indicates that a different Statechart handles the action. Angle brackets, <x>, indicate that the trigger event is generated by another Statechart. Parentheses, (x), indicate that the event was generated by the initialization routine for the entity when it was first instantiated. Square brackets, [x], indicate that the event was generated by the Statechart itself, usually within the action code of a state (not to be confused with the UML Statechart notation for transition guards). Note that while the following behaviour Statecharts were designed to accommodate layout behaviour, and include events specifically targeting layout, a detailed presentation of this behaviour is beyond the scope of this paper.

Button Behaviour Model. Code for Buttons is automatically synthesized for each of the classes in the meta-model; Buttons allow the instantiation of DChart entities. The button behaviour model shown in Fig. 2 is simple. When the button to create entity X is pushed, the events "<…>" and "<X Button>" are sent to this Statechart.
Fig. 2. Button behaviour Statechart
Upon receipt of the first event, the Statechart moves to the Idle state if it is not already there. The second event then moves it to a state in which entity X can be instantiated. It then waits for an event requesting the creation of that entity. The "<…>" event is generated by the DC DChart-specific behaviour Statechart when it intercepts and handles the "Model Action" event.

DChart Entity-Specific Behaviour Models. All visual entities of the DCharts formalism require their own behaviour models. The most important are the root entity, which contains all other entities of the DCharts formalism, and the composite state; referring to the class diagram in Fig. 1, these correspond to DC DChart and DC Composite respectively. At the other extreme, the behaviour Statechart for the transition edge, DC Hyperedge, is trivial. All the remaining entities, excluding the non-visual containment relationships, use behaviour Statecharts that are specializations of that of the composite state.

DC DChart behaviour Statechart. The behaviour of the DC DChart entity begins with initialization when the entity is first created. This initialization includes a "(create)*" trigger that sets the active state to "Idle". From then on, the following five events trigger interesting behaviour:

1. The "<…>" event indicates that a new DCharts formalism entity should be added to the canvas. Note that the same event is generated whether one uses the AToM3 menu system or a keyboard/mouse shortcut. The actual creation of an entity is of course handled by the button behaviour Statechart described earlier in Sect. 4.2.
2. The "<…>*" event triggers a "modal" lock, forcing all events to be routed only to this Statechart. The lock is only released when either an arrow is finally created or the process is aborted, via the "<…>*" and "Reset*" events respectively. It is necessary to refine the behaviour found in this generic UI behaviour Statechart for two reasons. The first is merely for the convenience of the user.
Fig. 3. DC DChart behaviour Statechart
Instead of allowing the user to draw arrows to indicate containment relationships, only transitions may be drawn. This saves time, and a drag-and-drop behaviour model exists for creating and destroying containment relationships, as shall be shown later. The second reason is simply to know when transitions are actually created, so that their UI behaviour Statecharts may be initialized.

DC Composite behaviour Statechart. The behaviour of DC Composite, the composite state, is the most complex of all. Fortunately, it is also re-usable by many other entities. The initialization phase is rather involved, with two main possibilities. The first is that an interactive session with the user is in effect, in which case the "(create)" trigger signals the creation of a new DC Composite. Immediately, the user is presented with a dialog asking which of the entities in the region of the newly created DC Composite should contain the new composite state. If the composite state is successfully connected to either a DC DChart or another DC Composite, then the "[didConnect]" trigger is generated, followed by a "<…>" event to the container, and finally a "[Done]" event to set the state to "HasParent". If the composite state is not successfully connected, then a "[didNotConnect]" event is generated and the active state is set to "NoParent". The second of the two possibilities is that the model is being loaded rather than interactively edited. In this case, a "(loadModelCreate)" event is first sent when the DC Composite is instantiated, setting the active state to "NoParent". A "(loadModelConnect)" event is then sent if a containment relationship is instantiated with this DC Composite as its parent, thus setting the active state to "HasParent". The following is a list of the events that occur after the initialization phase.
Fig. 4. DC Composite behaviour Statechart
1. The "<…>*" event is dealt with in the same manner as in the DC DChart UI behaviour Statechart: all hierarchical children are selected.
2. The "<Edit>" event indicates that the user has opened an edit dialog on the DC Composite attributes. This allows the user to modify the visual appearance of the node, which may trigger requests for layout.
3. The "<…>" event indicates that this composite state, among potentially many other entities, has just been dragged and then dropped. The transition with this trigger promptly generates two events: "[Done]", which restores the active state to either "NoParent" or "HasParent", followed by "[drop]", which causes hierarchical connection or hierarchical disconnection, respectively, to be attempted. The latter occurs only if the entity has been dropped outside of its parent container and the user has explicitly agreed to disconnect it. This triggers a "<…>" followed by an attempt to hierarchically connect the disconnected composite state in its new location.
4. The "<…>" event indicates that this composite state is to be deleted.

DC Hyperedge behaviour Statechart. The behaviour of the DC Hyperedge, or transition, is trivially simple, as Fig. 5 shows. As noted earlier, the transition is a hyper-edge only in the meta-model; in the generated DCharts formalism itself it is a simple directed edge with one source and one target. The transition is first initialized with a "(create)" event. Afterwards, it simply awaits "<Edit>*" events in order to apply changes made in its edit dialog. These changes affect the information content of the label associated with the transition. For a full description of all behaviour Statecharts we refer to the M.Sc. thesis of the first author [7]. It is finally possible to synthesize a visual DChart modelling environment from the Class Diagram meta-model and the Hierarchically-linked Statechart models, as shown in Fig. 6. Note how this demonstrates support for multi-formalism modelling, with specific behaviour for Buttons and DChart entities.
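As a toy transcription (ours, not generated from the HlS models), the connect/disconnect core of the DC Composite behaviour can be captured as follows; the NoParent/HasParent states and the drop-triggered transitions mirror the description above.

class DCComposite:
    def __init__(self):
        self.state = "Idle"

    def create(self, container_found):
        # Interactive creation: try to connect to a surrounding DChart/Composite.
        self.state = "HasParent" if container_found else "NoParent"

    def drop(self, inside_parent, user_confirms_disconnect=False):
        if self.state == "HasParent" and not inside_parent \
                and user_confirms_disconnect:
            self.state = "NoParent"        # hierarchical disconnection
        elif self.state == "NoParent" and inside_parent:
            self.state = "HasParent"       # hierarchical connection

c = DCComposite()
c.create(container_found=True)             # [didConnect] -> HasParent
c.drop(inside_parent=False, user_confirms_disconnect=True)
assert c.state == "NoParent"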
Fig. 5. DC Hyperedge behaviour Statechart
Fig. 6. Synthesized DCharts modelling environment with model instance
4.3 Conclusion
We have shown how it is possible to model complex, scoped, formalism-specific UI behaviour using HlS. This was made possible by modelling the abstract and concrete syntax of a visual language explicitly, as well as by attaching models of behaviour to all entities in the language. It was possible to develop an example application quickly, and the result has proven to be both robust and easy to maintain. Our contribution is most closely related to the work by Minas on DiaGen and DiaMeta [14]; our focus, however, is mostly on the explicit modelling of behaviour. We believe HlS can be used as the "assembly language" for UI behaviour modelling. As such, we will explore other notations, such as task models [10], and map them onto HlS. Our current work implements the above ideas in the form of UIs running entirely within a web browser, using only SVG and ECMAScript (JavaScript).
References
1. Agha, G., Hewitt, C.: Actors: A Conceptual Foundation for Concurrent Object-Oriented Programming, pp. 49–74 (1987)
2. Barboni, E., Conversy, S., Navarre, D., Palanque, P.: Model-based engineering of widgets, user applications and servers compliant with ARINC 661 specification. In: Doherty, G., Blandford, A. (eds.) DSVIS 2006. LNCS, vol. 4323, pp. 25–38. Springer, Heidelberg (2007)
3. Brooks, F.P.: No silver bullet: Essence and accidents of software engineering. Computer 20(4), 10–19 (1987)
4. Costagliola, G., Lucia, A.D., Orefice, S., Polese, G.: A classification framework to support the design of visual languages. J. Vis. Lang. Comput. 13(6), 573–600 (2002)
5. de Lara, J., Vangheluwe, H.: AToM3: A tool for multi-formalism and meta-modelling. In: Kutsche, R.-D., Weber, H. (eds.) FASE 2002. LNCS, vol. 2306, pp. 174–188. Springer, Heidelberg (2002)
6. de Lara, J., Vangheluwe, H.L.: Defining visual notations and their manipulation through meta-modelling and graph transformation. Journal of Visual Languages and Computing, Special Issue on Domain-Specific Modeling with Visual Languages 15(3-4), 309–330 (2004)
7. Dubé, D.: Graph layout for domain-specific modeling. M.Sc. dissertation, School of Computer Science, McGill University (June 2006)
8. Lee, E.A., Neuendorffer, S., Wirthlin, M.J.: Actor-oriented design of embedded hardware and software systems. Journal of Circuits, Systems, and Computers 12(3), 231–260 (2003)
9. Feng, T.H.: DCharts, a formalism for modeling and simulation based design of reactive software systems. Master's thesis, School of Computer Science, McGill University, Montréal, Canada (February 2004)
10. Forbrig, P., Paternò, F. (eds.): HCSE/TAMODIA 2008. LNCS, vol. 5247. Springer, Heidelberg (2008)
11. Harel, D.: Statecharts: A visual formalism for complex systems. Sci. Comput. Program. 8(3), 231–274 (1987)
12. Kelly, S., Tolvanen, J.-P.: Domain-Specific Modeling: Enabling Full Code Generation. Wiley, Chichester (2008)
13. Levendovszky, T., Lengyel, L., Mezei, G., Mészáros, T.: Introducing the VMTS mobile toolkit. In: Schürr, A., Nagl, M., Zündorf, A. (eds.) AGTIVE 2007. LNCS, vol. 5088, pp. 587–592. Springer, Heidelberg (2007)
14. Minas, M., Köth, O.: Generating diagram editors with DiaGen. In: Münch, M., Nagl, M. (eds.) AGTIVE 1999. LNCS, vol. 1779, pp. 433–440. Springer, Heidelberg (2000)
15. Mosterman, P.J., Vangheluwe, H.: Computer Automated Multi-Paradigm Modeling: An Introduction. Simulation: Transactions of the Society for Modeling and Simulation International, Special Issue: Grand Challenges for Modeling and Simulation 80(9), 433–450 (2004)
16. Navarre, D., Palanque, P.A., Ladry, J.-F., Basnyat, S.: An architecture and a formal description technique for the design and implementation of reconfigurable user interfaces. In: Graham, T.C.N., Palanque, P. (eds.) DSV-IS 2008. LNCS, vol. 5136, pp. 208–224. Springer, Heidelberg (2008)
PaMGIS: A Framework for Pattern-Based Modeling and Generation of Interactive Systems
Jürgen Engel and Christian Märtin
Augsburg University of Applied Sciences, Faculty of Computer Science,
Postfach 110605, 86031 Augsburg, Germany
{Juergen.Engel,Christian.Maertin}@hs-augsburg.de
Abstract. This paper introduces the PaMGIS framework for pattern-based modeling, generation and usability evaluation of interactive systems. It describes the structural aspects of HCI pattern languages and how such languages and patterns for various modeling stages (e.g. task modeling) and abstraction levels can be exploited to automate part of the software development process for interactive applications. The main components and the general functionality of the framework are discussed. The remaining part of the paper focuses on the low-level automation component of the framework and illustrates how the code for concrete interaction objects is generated from semi-abstract user interface patterns. Keywords: Interactive system, user interface, model-driven development, pattern-based development, HCI pattern languages, task-models, software generation, usability evaluation.
1 Introduction

Over the last decades interactive software has become an increasingly essential ingredient of human life. No matter where people are and what they do, they are used to interacting – knowingly or unwittingly – with products built around interactive software components, such as web applications, telecommunication devices, car navigation systems, household appliances, and other electronic equipment. In equal measure, software engineering and software development have become increasingly important disciplines. There is a steadily growing demand for targeted, cost-effective, and efficient production of high-quality software applications. To meet these requirements, several very worthwhile techniques for supporting the overall software process have been introduced, such as model-driven and pattern-based development, automated software generation, and usability evaluation. In order to maximize the overall benefit for developers and user interface designers of interactive systems, we combined these techniques within an integrated software development framework for pattern-based modeling and generation of interactive systems (PaMGIS), which is described in the following chapters (Fig. 1). The system is currently under development, with prototypical implementations of major components already available.
Fig. 1. Combined and integrated software development approach of PaMGIS
2 Related Work

2.1 Structured Pattern Languages

Patterns and pattern languages have a growing impact on the disciplines of HCI and web engineering. Originally, pattern languages were constructed for solving problems in architecture and urban planning [1]. In the early nineties they were adopted by software engineers and software architects for promoting the reuse of high-quality design solutions [5]. For over a decade now, patterns and pattern languages have also entered the fields of HCI, usability engineering [7], user experience [14] and organizational workflow [6]. Numerous comprehensive pattern catalogues and pattern languages have been developed for improving usability aspects and the quality of interface and interaction design, e.g. [13],[15]. However, most of these pattern collections lack an appropriate organizational structure to facilitate pattern selection and ensure overall coverage of domain-dependent and domain-independent modeling and design problems. In [3], manageability aspects of various existing UI pattern catalogues are discussed and compared. In [8] we introduced a structured approach both for designing hierarchically organized HCI pattern languages and for controlling the selection of the actually needed patterns during the software development process. Meanwhile, we have tested and evaluated our approach on three newly designed HCI pattern languages from different domains (online shops, industrial automation, database administration). We have also added components for the semi-automated generation of real user interfaces from the selected patterns, represented in XML, and for binding them to the application domain code. These results contribute heavily to the PaMGIS framework introduced in this paper.

2.2 Patterns and Software Engineering

Pattern-based software engineering has to cover the overall software development life-cycle, from early planning down to coding, maintenance and usability evaluation
activities. Several pattern-based approaches explore the detailed design aspects for getting from problem definitions to possible solutions, e.g. by using case-based reasoning techniques [16]. Other approaches apply formal specification models for defining flexible pattern languages for domain-specific interactive systems [2]. However, some of the most promising software engineering approaches for highly usable interactive systems combine both patterns and task models for detecting all the dynamic and contextual aspects for modeling and designing high-quality target applications. In [11] a task- and pattern-driven way for designing multiple user interfaces is discussed. In [4] the authors demonstrate that concrete task models may be derived from more abstract task model patterns. And in [12] a comprehensive process for user interface engineering using patterns is introduced.
3 PaMGIS Framework Concept 3.1 General Functionality The PaMGIS framework is designed to support software developers with a combination of model-based and pattern-based approaches (fig. 2). By exploiting information inherent to task-models and pattern languages it is intended to provide a maximum degree of automation for the generation of artifacts from abstract (UML/XML) and semi-abstract (XML) application models down to the resulting application source codes. In any process step the user of the framework has the option to interfere with and manipulate results as desired. Further PaMGIS is capable of letting the resulting software applications automatically generate usage data log files during runtime. These logs are evaluated and the results are fed back to the original pattern definitions. The central component of the framework is a pattern repository containing different types of patterns and pattern languages of different abstraction levels, i.e. architecture patterns, design patterns, and HCI patterns. It not solely incorporates the pure pattern definitions structured according to [8], but also pattern metadata and additional components for code and usage data generation and usability evaluation feed back. PaMGIS provides capabilities for pattern administration and maintenance as well as reporting facilities. In a first process step the various input models – in particular the task, user, device, and context models – are interpreted. Related patterns are selected from the pattern repository and an abstract application model is generated, representing the high-level architecture and structure of the future software application. As task model representation language we prefer CTTs as described in [9]. In a subsequent step, the model generation phase produces a semi-abstract application model, a much more detailed description of the target application. Here, the various components from the additional pattern information are exploited to extend the model in order to prepare the next process phase. Within the now following code generation step the semi-abstract application model is transformed into application source code. The related generator consists of various plug-in components for different target programming languages. Currently we have prototypical implementations for C++, C#, Java and HTML available.
PaMGIS
User Model
Task Model
Device Model
829
Context Model
Pattern Selection and Preparation
Abstract Application Model
Pattern Repository
Model Generation
Barge-in Option
SemiAbstract Application Model
Code Generation Pattern Administration
Application Code
Usage Data Evaluation Compiler
Application
Usage Reporting Usage Data Repository
Fig. 2. PaMGIS Architecture Overview
Having the source code compiled and the resulting software application up and running, usage data records are written to the usage data repository. From here, pattern language-specific and pattern-specific evaluations can be performed, e.g. using predefined metrics. As the related results are stored back to the pattern repository this new information can be used to improve quality in the next software generation cycle. 3.2 From Patterns to Generated Code The framework stores pattern languages and the actual patterns in XML format. Patterns are composed of typical elements, such as the name of the respective pattern language, pattern name, problem, context and solution descriptions, and application examples. For software generation purposes we have enhanced the pattern structure by a special component named . It includes information required for application model and code generation. This element is mission-critical for the generation procedures and must not contain erroneous content. Therefore these parts of the pattern definitions are validated using a specific document type definition (DTD).
830
J. Engel and C. Märtin
The component consists of two mandatory elements, and . While specifies the later appearance of the respective artifacts, the element provides definitions of potential child elements. The section is mandatory and consists of three potential child elements: <element>, and (table 1). Table 1. Child elements of the XML tag Child element <element>
Description Specifies the name of the tag to be generated. Occurs exactly one time. Allows the specification of desired attributes of the tag to be generated. Optional. Allows the specification of a specific value for the tag to be generated. Optional.
Here, for and several attributes can be specified (table 2). Table 2. Attributes of the child elements and Child element
Attribute name mandatory
range
option fixed
mandatory
Description Specifies the name of the attribute. Mandatory. Specifies whether the respective generated field must be filled out or not. Boolean, default false. Specifies a range of possible numerical values. Optional. Specifies a list of possible values. Optional Enforces the system to use the specific value which has been specified within the pattern language. Cannot be modified during runtime of the generated application. Boolean, default false. Specifies whether the respective generated field must be filled out or not. Boolean, default false.
The section within the component is also mandatory and consists of only one potential child element: (table 3). Table 3. Child elements of the XML tag Child element
Description Specifies the potential child elements. Optional.
The following table shows the possible attributes of the child element (table 4).
PaMGIS
831
Table 4. Attributes of the child element Child element
Attribute ref
min max choose
priority
relation
anchor
Description Reference to the child element by its pattern ID. If the pattern is part of a different pattern language the reference has to be specified in the format of PLname#PatternID. Mandatory. Minimum number of occurances. Maximum number of occurances. Gives the user the possibility to choose one out of several alternatives. Specification of priorities in order to influence the order and positions of elements. Specifies the relation between elements and therefore influences their positions. Specifies the general orientation of a child element.
In the following it is demonstrated how an example user dialogue can be modeled and generated by using the previously described functions of the framework. We produce a simple dialogue box providing the user with two buttons: ‘ok’ to confirm the antecedent action and ‘cancel’ in order to abandon.
Fig. 3. Pattern hierarchy used for the exemplary ‘Ok/Cancel’ user dialogue
In the first step we select three patterns: ‘StandardFrame’ representing the standard window for the user dialogue, ‘OkCancelBar’ describing the panel which will later on incorporate the buttons, and ‘Button’ specifying the appearance of the two buttons. The patterns and their relationships follow the rules and structures as described in [12]. They are interrelated as shown in figure 3. The most relevant information required for the generation process is covered within the components of the chosen patterns. In the section of the ‘StandardFrame’ pattern (please refer to the following code listing) we find the specifications of the <element> ‘panel’ and two attributes called ‘name’ and ‘type’. Both attributes are mandatory (mandatory=”true”). The ‘type’ attribute defines that the respective artifact will become a frame. The section defines a child element which may occur not once or at most one time. It is based on the ‘OkCancelBar’ pattern (ref=”B1”, which is the pattern ID of that pattern).
832
J. Engel and C. Märtin Code fragment: component of the ‘StandardFrame’ pattern (ID=A1)
<element>panel Frame gridx=0;gridy=1
On the next deeper level, i.e. the ‘OkCancelBar’ pattern, the section of the component also specifies an element named ‘panel’ and several attributes, such as ‘name’, ‘type’ and further ones which define the position of the frame inside its parent element. The section defines two child elements which will be derived from the ‘Button’ pattern (ID=C1), occur in each case exactly once and will be positioned as specified. One will be labeled ‘Ok’, the other ‘Cancel’. Code fragment: component of the ‘OkCancelBar’ pattern (ID=B1) <element>panel Ok_Cancel internalPanel gridx=0;gridy=0;fill=NONE; anchor=EAST;gridwidth=1.0;value=Ok gridx=1;gridy=0;fill=NONE; anchor=EAST;value=Cancel
Finally the model generator has to interpret the ‘Button’ pattern twice, one time for each of the two buttons. The section defines an element ‘button’ and several attributes for the name and various positioning data. Please note, that the section is particularly specified, but does not contain any child element definitions.
PaMGIS
833
Code fragment: component of the ‘Button’ pattern (ID=C1) <element>button
After having processed the pattern definitions listed above, the model generator produces the following application model definition. Code fragment: respective application model definition as computed by the model generator <windows> <panel name=”OkCancelFrame” type=”Frame”> <panel name=”Ok_Cancel” type=”internalPanel” gridx=”0” gridy=”1”> Ok Cancel
In the last step, the code generator processes the application model and creates the source code which subsequently can be compiled [10]. When using the code generator for Java 1.6 Swing, the resulting ‘Ok/Cancel’ user dialogue looks like as shown in figure 4.
Fig. 4. ‘Ok/Cancel’ user dialogue as produced by the Java 1.6 Swing code generator
834
J. Engel and C. Märtin
4 Conclusion PaMGIS is our approach to enhance productivity and developer experience for interactive system designers by combining two valuable software development techniques, model-based and pattern-based development. The framework supports our research with respect to the potentials and limits of automated software development. The Pattern Repository serves as a structured reservoir of patterns organized in different pattern languages and on different levels of abstraction which can be used for generating interactive software applications for a growing number of problem domains. Moreover the system allows evaluating the patterns’ usability and to reflect these findings back to the pattern repository. This leads to a better understanding of how to use individual patterns. Finally PaMGIS supports to generate, evaluate, and compare alternative designs of interactive applications within the same problem domain.
References 1. Alexander, C., Ishikawa, S., Silverstein, M.: A pattern language. Oxford University Press, Oxford (1977) 2. Borchers, J.O.: A Pattern Approach to Interaction Design. In: Proc. DIS 2000, pp. 369– 378. ACM Press, Brooklyn, New York (2000) 3. Deng, J., Kemp, E., Todd, E.G.: Managing UI Pattern Collections. In: Proc. CHINZ 2005, Auckland, New Zealand, pp. 31–38. ACM Press, New York (2005) 4. Gaffar, A., Sinnig, D., Seffah, A., Forbrig, P.: Modeling Patterns for Task Models. In: Proc. TAMODIA 2004 (2004) 5. Gamma, E., et al.: Design Patterns. Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading (1995) 6. Guerrero Garcia, J., Vanderdonckt, J., Gonzalez Calleros, J.M., Winckler, M.: Towards a Library of Workflow User Interface Patterns. In: Graham, T.C.N., Palanque, P. (eds.) DSV-IS 2008. LNCS, vol. 5136, pp. 96–101. Springer, Heidelberg (2008) 7. Marcus, A.: Patterns within Patterns. Interactions 11(2), 28–34 (2004) 8. Märtin, C., Roski, A.: Structurally Supported Design of HCI Pattern Languages. In: Jacko, J.A. (ed.) HCI 2007. LNCS, vol. 4550, pp. 1159–1167. Springer, Heidelberg (2007) 9. Paternò, F.: Model-based Design and Evaluation of Interactive Applications. Springer, London (2000) 10. Roski, A., Märtin, C.: Pattern-Sprachen und Automatisierung. In: Koschke, R., et al. (eds.) Proc. zur Informatik 2007, September 24-28, 2007. Bremen, Informatik trifft Logistik, Band 1, GI-edn., Lecture Notes in Informatics, pp. 454–458 (2007) 11. Seffah, A., Forbrig, P.: Multiple User Interfaces: Towards a Task-Driven and PatternsOriented Design Model. In: Forbrig, P., Limbourg, Q., Urban, B., Vanderdonckt, J. (eds.) DSV-IS 2002. LNCS, vol. 2545, pp. 118–132. Springer, Heidelberg (2002) 12. Seffah, A., Gaffar, A.: Model-based User Interface Engineering with Design Patterns. J. of Systems and Software 80(8), 1408–1422 (2007) 13. Tidwell, J.: Interaction Design Patterns. In: Proceedings of the Pattern Languages of Programming PLoP 1998 (1998)
PaMGIS
835
14. Tiedtke, T., Krach, T., Märtin, C.: Multi-Level Patterns for the Planes of User Experience. In: Proc. of HCI International, Las Vegas, Nevada, USA, July 22-27, 2005. Theories Models and Processes in HCI, vol. 4, Lawrence Erlbaum Associates, Mahwah (2005) 15. van Welie, M., Traetteberg, H.: Interaction Patterns in User Interfaces. In: 7th Pattern Languages of Programs Conference, Allerton Park Monticello, USA, August 13-16 (2000) 16. Wentzlaff, I., Specker, M.: Pattern-based Development of User-friendly Web Applications. In: Proc. ICWE 2006 Workshops, Palo Alto, July 10-14, 2006, ACM Press, New York (2006)
People-Oriented Programming: From Agent-Oriented Analysis to the Design of Interactive Systems Steve Goschnick Department of Information Systems University of Melbourne, VIC 3010, Australia
[email protected] Abstract. Where the Object-Oriented paradigm set about abstracting objects, Agent-Oriented (AO) theory draws on Psychology to abstract mentalist notions like: beliefs, perceptions, goals, and intentions. As such, the associated AgentOriented analysis can be used quite successfully to design interactive systems for people, delivering applications that are heavily individual-oriented. This reversal of the AO lens focuses analysis back upon people. It puts a multi-faceted agent used in analysis ‘into the shoes’ of the user and turns the design and implementation into one we call People-Oriented Programming (POP). POP calls on users to gather ethnographic data about themselves using Cultural Probes and on end-user innovation via software toolkits. This turn of focus is timely as the analyst/designer of interactive systems is facing new challenges regarding flexibility, user situatedness, dynamic environments, incomplete data, diversity in user needs, sensors in the environment, and users emersed in multiple parallel social worlds. Based on an extensive background analysis this paper distills a set of key aspects that any POP effort should possess. Keywords: Agent-oriented analysis, agent-oriented paradigm, user innovation, HCI, people-oriented programming, agent meta-models, ShadowBoard Agents.
1 Introduction Several Agent-Oriented (AO) architectures draw on models from Psychology (e.g. BDI and ShadowBoard [9]), abstracting mentalistic notions, such as: beliefs, perceptions, goals and intentions. As such, some associated agent-oriented analysis, can be used quite successfully to design interactive systems for individuals with heterogeneous needs. This reversing of the lens of AO back upon people, places a multi-faceted agent analysis ‘into the shoes’ of the user and turns the design and implementation into one we call People Oriented Programming. This reversal of focus for AO analysis is timely, as modern interactive systems are placing new challenges upon the analyst/designer: a heightened degree of flexibility, situatedness of users, uncertain and dynamic environments, incomplete information, diversity in users and their needs, sensors proliferating in the environment, and users emersed in multiple parallel social worlds, instead of in one fixed organisation. It has a particular strength in the domestic setting, where people spend a significant amount of their time, often on non-work tasks, goals and less descript activities. J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 836–845, 2009. © Springer-Verlag Berlin Heidelberg 2009
People-Oriented Programming
837
Agent-Oriented analysis and design can deal with user situatedness via agent adaptability. An agent’s internal world view, coupled with high-level core value goals [9] facilitates autonomous behaviour. An AO system can deal with the non-sequential external events in an agent’s environment (reactive behaviours), while continuing with their current goals (proactive behaviours). In this paper we look in detail at these two aspects of interaction design in mixed-initiative human-agent systems [3]: dealing with the changing user context; and the message-flow model that facilitates reactive and proactive behaviours. It is presented in the context of the ShaMAN multi-agent meta-model [11], as instantiated in the DigitalFriend software [10]. Then, as the main contribution, this paper presents People Oriented Programming (POP) as a new design paradigm for building personal systems. Based on an extensive background analysis it distills a set of key aspects that any POP effort should possess. POP calls upon the user in three capacities: as the focus of customised software, which von Hippel and Katz describe as ‘markets of one’ [24]; as a self-ethnographer using Cultural Probes [7] to gather data; and as an end-user developer via software toolkits [24] designed to make the user central to innovation in new product development, the way that end-users are doing in the games genre [18] and in mashups of Internet services [2]. The technology used here to pursue People Oriented Programming is the DigitalFriend, V1 of which instantiates the ShadowBoard Agent Architecture [9]. Its theoretical base draws on Analytical Psychology – giving POP its fourth layer of meaning. Before looking at POP and the agent analysis, we first look at the pathways for interaction within the ShaMAN agent meta-model, in order to characterise and scope the design of the necessary interaction system.
2 Interaction through Operators, SpeechActs and UI devices The DigitalFriend V2 is multi-agent system (MAS) software – an instantiation of the ShaMAN meta-model – with a central goal of helping an individual user in the fullspectrum of their life (work, leisure, family, community activity, etc.). It is designed to monitor, alert, filter and initiate tasks, messages and resources, all within the context of the user’s goals and activities. The Personal Assistant Agents accumulated within the DigitalFriend make it a mixed-initiative human-agent system [3]. Interactions that can take place, include: inter-agent communication which is facilitated via speech-acts in an agent communication language; agent-to-user, usually via messages accompanied by visual and/or aural alerts; and also via direct user-to-agent interaction usually accomplished through UI interface components. 2.1 Interaction through Speech-Acts SpeechFlow in ShaMAN represents an interaction model for agent intercommunication. The allowable message types between two given agents, is an interaction plan (or a communication protocol), within which all the dynamically produced speech-acts, abide. Note: the right-hand side of figure 1 represents a part of an interaction plan, in the form of sender Æ speechact Æ receiver, for a number of sub-agents within a user’s DigitalFriend.
838
S. Goschnick
Fig. 1. Message mapping and Agent Interaction Plans
On the left is the part of the ShaMAN meta-model that deals with the flow of speech-acts. The sending agent uses speech-acts to communication to either: other agents (including the user); to whole SocialWorlds; or to a specific SocialRole across SocialWorlds. In the DigitalFriend, each agent has a queue of messages received from the SpeechFlow, and the situatedness of the agent (particularly the human user), often determines which ones come off the queue, at what times, as we see in Section 3. 2.2 Human-to-Agent Interaction through UI Devices The level of granularity of interaction between the human user and the agents in their DigitalFriend are varied. The concept of Locales is taken from Fitzpatrick [5], and can represent any place in which interaction takes place between the members of a Social World. This includes abstract places such as the graphical representation of a GUI on the screen. It can be something as commonplace as a File-Chooser GUI component that an agent uses to ask the user for a file. Or it can be a custom UI widget an agent uses to request specific information from the user. Figure 2 is part of the ShaMAN meta-model as follows: an Agent has a number of roles in multiple SocialWorlds represented as AgentRole. Roles have a set of goals that once initiated, form an Agent’s intentions (represented here as AgentRoleGoal). Tasks are set in motion to achieve a goal. Sub-tasks are performed by sub-agents, but some require the user to perform a task, or to ponder a new situation. An agent may call on Human-Agent-Interaction (the HAI entity) to achieve the necessary task. A Locale can be a place for interaction. I.e. the UI components in fig.3 include an interactive map of the world put to screen by an agent, which is waiting on the user to select and confirm a country. Where a traditional Task Analysis may go down to key-stroke interactions, in an agent system that uses high-level UI components, the task-granularity stops at putting the UI component to screen, and receiving the user’s selection. I.e. The nonsequential nature of user interaction with a complex GUI component, is not of specific interest beyond what the user actually selects, facilitated here by the link between the Task and the HAI entities.
People-Oriented Programming
839
Fig. 2. Model of Human-Agent-Interaction within ShaMAN
3 Agent State as a Basis for Analysis In the DigitalFriend the user is represented as another agent, but with several added features: agents can interact with the user via direct actions (in addition to speechacts) via UI interfaces represented as Locales; the user has a set of known behaviours including walking, driving, reading, meeting, sleeping; and, the user is represented computationally at the top of the DigitalFriend’s hierarchy of agents, with their Goals sitting at the top of the Goal hierarchy/network.
Fig. 3. Human-Agent-Interaction within DigitalFriend V2
There is a complex representation of agent-state within the ShaMAN meta-model, giving a comprehensive coverage of the user’s situatedness (figure 4), including what Locale they are currently in, who and what agents currently inhabit that Locale, what resources they have at hand, and more. In analysis it is paramount to discover the ‘what’ rather than the ‘how’ [17]. To do so, we assume all people in a user’s SocialWorlds are equally empowered with a MAS of the sophistication of at least the DigitalFriend. In a given Locale the Inhabitants are assumed to be known, along with their Roles. Of the Resources across the system, those in AgentResource are immediately available to an Agent, in addition the agent may use OnSiteResources.
840
S. Goschnick
Fig. 4. Agent State in ShaMAN [11], with respect to situatedness Note: The ShaMAN meta-model (the greyed-out background in fig. 4, available elsewhere [11]) has 30 classes/entities in UML class diagram notation. It effectively combines several sub-models: Role, Goal, Task, Interaction, SocialWorld, Resource and Locale models. These models are interconnected by a number of associate entities, in ER (entity relation) terminology – i.e. the entities on the many end of the one-to-many relationships, in crows-foot notation. E.g. Inhabitant, Responsibility, Member, AgentResource and OnSiteResource cross-relate several sub-models, to great benefit regarding situatedness, in analysis, design and implementation.
Communication to the user from the DigitalFriend can be filtered for their current situation. These elements of state (as per figure 4 above), together with the user’s current behaviour, allows ShaMAN to select messages from the agent’s current queue of messages (in particular, the user), according to a set of message delivery rules. It becomes easy to stipulate rules such as: No taking a phone call while driving. The rules can be stipulated more clearly once a user’s state has been ascertained: What SocialWorlds do they belong to, and what are their Roles in those worlds? What are the Responsibilities that go with those Roles? Who else are members of those SocialWorlds and what are their respective Roles? What Locales (real and abstract) does the user frequent in the course of fulfilling their Responsibilities? What are the current conditions in the Locale? What Resources does the user have at their disposal, within each Role? What Resources do their agents have at their disposal? What Resources are available in the Locales a user is expected to use? What Agents does the user expect to have available when in each of the Locales? What Agents does the user expect to have available when they are in each of their Roles?
4 People-Oriented Programming as a Design Paradigm 4.1 Background This general method follows on from the Shadowboard Methodology by Goschnick and Graham [12]. It was not a generic AO methodology, but was specifically aimed at gathering and tailoring the AO requirements for a multi-agent Personal Assistant Agent (PAA) system, for an individual user. The authors made the point with:
People-Oriented Programming
841
“…the primary requirement in the work presented here was to have a 24x7 userrepresentation available (even) while the user sleeps or is otherwise offline, within a tool capable of autonomous computation, some decision-making, some information filtering and with the ability to concentrate the presentation of relevant information and to inform the user at the most convenient time” Their central idea then was to marry a top-down 62 role/sub-role model starting template, with certain bottom-up techniques from ethnography including Cultural Probes by Gaver et al [7] in the form of “user-kept diaries” and user scenarios in the tradition of Rosson and Caroll [19], into a methodology that addressed the personal aspirations and desires of an individual together with their social needs. To bring an individual’s social needs into the equation, Goschnick and Graham foreshadowed future work that called upon the theory of Social Worlds by Strauss [21] and the Locales framework [5]. That foreshadowed work there, is the ShaMAN meta-model here, which now facilitates Social Worlds and Locales in a MAS system. A significant difference in the Shadowboard Methodology over other agent methodologies involves the Role entity: a role model is not only used in the requirements gathering process, in addition, it serves functions within the analysis, design and in the implementation of a Personal Assistant Agent. E.g. A role-hierarchy lens is used to filter and organise messages for the user’s benefit, that come from various subagents [10]. These messages are also stored in a log that represents an interaction trajectory arc [5] of the user’s life so far. Cultural Probe [7] data captures the richness of individual users in the domestic and social space, however, many Ethnographers use it to inspire their own design processes [1,8] rather than to enhance the design process. There have been pockets of research trying to bring ethnographic data and/or scenarios into the Software Engineering process as seamlessly as possible [14,22,6,13]. Some provide support for a multi-disciplinary team approach to bring in the richness of social context [13]. While ethnography and software engineering are complementary - ethnographic studies capture the details that are useful in analysis, while software engineering design looks for and uses abstractions as often as possible [22] - the two forms of data and focus, from these two quite different disciplines, each with different notations and concepts, means that there has been a bottleneck in getting from one to the other, difficult to negotiate without loosing much of the detail captured in the ethnographic data. Furthermore, there is neither an efficient or affordable way to capture it on the scale needed for heterogeneous individual user needs. A way forward involves users collecting data about their own lives with Cultural Probes and Software Engineers providing model-based toolkits to enable end-user development of interfaces and mash-ups of the software and Internet-based services they regularly use in their lives. There is a movement of people towards such end-user development of software and computer-based artifacts – whenever they get the appropriate tools to do so. We can see it in the current rise of mash-ups in the Internet world [2] by hobbyist users. And we see it in the user modifications (so-called mods) in computer games that facilitate user additions to their game playing [18, 16]. End-user development is in part researched in the context of user innovaton – i.e. 
innovations created by end-users themselves rather than by corporate software houses [23]. Von Hippel and Katz investigated the use of toolkits [24] distributed to endusers, in order for manufacturers to be able to service the unique needs of individuals in what they called “markets of one”. I.e. Some manufacturers have abandoned their
842
S. Goschnick
“increasingly frustrating efforts to understand users’ needs” (ibid) and instead have outsourced need-related innovation tasks to the users themselves. To do so, the tasks involved in bringing a new product into existence, are divided into two interrelated parts: solution-related tasks, and needs-related tasks. The solutions-related tasks are catered for with flexible user-friendly toolkits, initially provided by the manufacturer. The needs-related tasks are what the end-users then do with those toolkits – i.e. they customise the initial product to suit their specific needs. Von Hippel has been researching in the user innovation space since the early 1990’s and in his book Democratizing Innovation [23] he gives two primary reasons that help to explain the recent exponential growth in the user-innovation area: tools that were previously only available to a niche professional base, have become available to mass end-users (in both cost and ease-of-use); and secondly, online communities of end-users confide their needs and share their solutions through the various communication channels and social networks afforded by the Internet. One of the earliest mass-enlistments of end-users via software toolkits, is in the computer games area [15], where numerous games have tens of millions of users, and several of them have tens of thousands of end-users providing additional innovative content and functionality to those games. Prugl and Schreier [18] studied the use of toolkits for the popular computer game The Sims (28 million units sold within 2 years of release), in which they studied samples from four online user communities, with an average of 15,000 members (ibid). Many other games offer toolkits to end-users to extend game functionality and content. Jeppesen and Molin [15] found that of the 94 games they surveyed, 33 of them included toolkits for end-user development. Where von Hippel mainly describes end-user innovation as the way that ‘markets of one’ can be appropriately and cost effectively serviced with the goal of ‘satisfying each user’s needs’, Prugl and Schreier looked deeper into ‘how’ users deal with the invitation to innovate (including the model of open innovation), and they also investigated the attractiveness of end-user designs, to other users. They single out ‘leadingedge’ users as a potential source of radical product development (ibid), since their designs find large user-bases amongst other users in online communities that centre around the toolkits. This is a useful finding since end-user toolkits are used by a minority of users, whereas some of the innovation produced by those users, can be used by many of the rest. In a study about what motivates users to modify the games they play [16], Kadarusman focused on the World of Warcraft (WoW) game, which has over 11 million registered users (as at October 2008). He reported that WoW has more than 4600 user-modifications available for download, the most popular of which was downloaded on average 110,927 times per day. 4.2 Definition of People-Oriented Programming We are now in a position to describe People Oriented Programming (POP) as a new design paradigm for developing individual-oriented systems, and define the four elements that it includes. 
POP calls upon the individual user in three main capacities: firstly, as the central focus of a customised software system addressing heterogeneous needs, which von Hippel and Katz describe as ‘markets of one’ [24]; secondly as a selfethnographer [12,1] administering and using Cultural Probes [7], personal role models [9,10] and scenarios [19] to gather their own very-specific data (including in the domestic space); and thirdly, as end-user developers, coming up with their own solutions to
People-Oriented Programming
843
match their personal needs, utilising well-engineered software toolkits [24] designed to make the user the centre of innovation in new product development. The fourth element of POP is the cognitive models behind the tools, techniques and frameworks upon which the user toolkits are built. These models are drawn from two disciplines that are not often cross-referenced [11]: the Agent-Oriented paradigm, and Cognitive Task Modelling. E.g. the technology used in this research to pursue People Oriented Programming is the DigitalFriend, which appropriately has its theoretical base anchored upon a century of evolution of models of mind from Analytical Psychology [9,10].
Fig. 5. Home Environment Locale with interface to Java SunSpot sensor (insert)
The following section briefly explores the use of Locales from the ShaMAN metamodel, which has featured in parts of Sections 2 and 3, by way of an example DigitalFriend of a user engaged in POP, with the DigitalFriend V2 toolkit: 4.3 Example Locales in a User’s DigitalFriend In the personalisation of the DigitalFriend for a given individual, numerous private and personally significant Locales are brought into the analysis, and into the technology. For example, a map of the user’s home is represented as a Locale in figure 5. It is cross-related with the Role lens in the DigitalFriend (see insert in figure 5), and is connected with a sensor (a Java SunSpot technology kit, in this prototype) that knows exactly where the user currently is spatially. For example, if the person is in the kitchen sub-Locale, then the DigitalFriend can be set to assume the person is in the default role of cook, and likewise the Resources that become available can be cooking suggestions and recipes. When in the garden, they can be assumed to be in the gardener role by default, and can receive messages about their previous activity against significant Resources (e.g. “You last pruned the Apple tree in late Winter of 2008”) from a trajectory arc (i.e. via log files). With more specialised sensors, they would be able to get information about the state of the garden as they pass by them – e.g. moist content sensors could trigger: “The south-east garden bed has a moisture content well below the recommendation for this time of the year”. The Locales in the model do not have to represent maps or rooms that actually exist, as those in figure 5 do. From the Jungian Psychology that underlines the theory of
844
S. Goschnick
sub-selves behind the many-facetted, many-role model of the individual, also comes the Jungian concept of archetypal symbols [9], which hold common meanings (across people) when they appear in peoples’ dreams. From Jung we are told that to dream often of a given House or Home usually symbolically represents the person’s mind itself in some compartmentised way. The different rooms: kitchen, lounge, dining, bedroom, laundry, etc symbolically representing analogous-aspects of the individual’s life. Therefore an individual user can build ‘the house of their dreams’ in computerbased imagery, either 2D plans or 3D representations, and then link those images/media to the roles in their life within the MAS DigitalFriend, as a personally satisfying and highly intuitive interface.
5 Conclusion People Oriented Programming (POP) as defined above sounds simple i.e.: focus on the heterogeneous needs of individual users; get the users to record their own ethnographic data; and then have them develop their own enhancements to agent-oriented software using user-friendly toolkits. That stated simplicity belies the actual complexity to carry it out. Just as the modern GUI PC is much easier to use than old non-GUI PCs, multiple layers of complexity and indirection were needed to bring about that simplicity of use. Not surprisingly then, ethnography is an inexact way to gather requirements as compared to traditional requirements engineering methods; AO technologies are an order of magnitude more complex that traditional OO languages and frameworks; and useroriented toolkits that are user-friendly enough to build personal systems from disparate services and applications, are complex in terms of designing and building them. However, the POP approach retains the richness from the cultural probe data, through into the technology in a way that reflects peoples social needs, desires and goals, and to the benefit of the collective aspirations of the social worlds they are a part of. The recently reported amount of end-user innovation in the games genre touched on above, is testimony to the approach working, when the mix between user-needs and the functionality of the solution-related technology on offer, is right. The agent-oriented model-based approach to personalising an individual’s interface to the technology in their lives is a natural fit. The mentalistic notions that the AO paradigm abstracts in a computational form, draws upon Psychology, and therefore can be reverse-focused upon programming for individuals, by end-users themselves. The models from cognitive task modeling told us that goals, plans, tasks, actions, roles and objects are represented in the cognitive functioning of the mind. These models confirm those from the agent-oriented paradigm [11], and vice versa, through their strong underlying similarities. And it is the models that will hold POP together as the artifacts shared between the Software Engineers building the toolkits, and the end-users innovating their own creations and customisations with them.
References 1. Arnold, M.: The Connected Home: Probing the Effects and Affects of Domesticated ICTs. In: Proceedings of PDC 2004, ACM, Toronto (2004) 2. Feiler, J.: Web 2.0 Mashups. McGraw Hill, New York (2008)
People-Oriented Programming
845
3. Fleming, M., Cohen, R.A.: User Modeling Approach to Determining System Initiative in Mixed Initiative AI Systems. In: International Conference on User Modeling UM 1997 (1997) 4. Fitzpatrick, G.: The Locales Framework: Understanding and Designing for Wicked Problems. Kluwer Academic Publications, London (2003) 5. Forbrig, P., Dittmar, A.: Bridging the Gap between Scenarios and Formal Models. In: Jacko, J., Stephanidis, C. (eds.) Human Computer Interaction: Theory & Practice (Part I), pp. 98–102. Lawrence Erlbaum Associates, Mahwah (2003) 6. Gaver, B., Dunne, T., Pacenti, E.: Cultural Probes. Interactions 6(1), 21–29 (1999) 7. Gaver, B., Boucher, A., Pennington, S., Walker, B.: Cultural Probes and the Value of Uncertainty. Interactions 11(5), 53–56 (2004) 8. Goschnick, S.B.: ShadowBoard: an Agent Architecture for enabling a sophisticated Digital Self. Thesis, Dept. of Computer Science, University of Melbourne, Australia (2001) 9. Goschnick, S.B.: The DigitalFriend: the First End-User Oriented Multi-Agent System. In: OSDC 2006, Open Source Developers’ Conference, Melbourne, Australia, December 5-8 (2006) 10. Goschnick, S., Balbo, S., Sonenberg, L.: From Task to Agent-Oriented Meta-models, and Back Again. In: Forbrig, P., Paternò, F. (eds.) HCSE/TAMODIA 2008. LNCS, vol. 5247, pp. 41–57. Springer, Heidelberg (2008) 11. Goschnick, S., Graham, C.: Augmenting Interaction and Cognition using Agent Architectures and Technology Inspired by Psychology and Social Worlds. In: Universal Access in the Information Society, vol. 4(19), Springer, Heidelberg (2005) 12. Haesen, M., Coninx, K., Van den Bergh, J., Luyten, K.: MuiCSer: A Process Framework for Multi-disciplinary User-Centred Software Engineering Processes. In: Forbrig, P., Paternò, F. (eds.) HCSE/TAMODIA 2008. LNCS, vol. 5247, pp. 150–165. Springer, Heidelberg (2008) 13. Hughes, J., O’Brien, J., Rouncefield, M., Blythin, S.: Designing with Ethnography: A Presentation Framework for Design. In: Proceedings of Design of Interactive Systems (DIS 1997), pp. 147–158. ACM, Amsterdam (1997) 14. Jeppesen, L.B., Molin, M.J.: Consumers as Co-developers: Learning and Innovation Outside the Firm. Technology Analysis and Strategic Management 15(3), 363–384 (2003) 15. Kadarusman, J.: User-Innovation in the Modding Community of World of Warcraft. Honours Thesis, Department of Information Systems, University of Melbourne (2008) 16. Pressman, R.: Software Engineering: A Practitioner’s Approach. McGraw-Hill, New York (2004) 17. Prugl, R., Schreier, M.: Learning from leading-edge customers at The Sims: Opening up the innovation process using toolkits. R&D Management 36(3), 237–250 (2006) 18. Rosson, M.B., Caroll, J.M.: Usability Engineering: Scenario-based development of humancomputer interaction. Morgan Kaufmann, San Francisco (2001) 19. Stary, C.: Toward the Task-Complete Development of Activity-Oriented User Interfaces. International Journal of Human-Computer Interaction 11(2), 153–182 (1999) 20. Strauss, A.: A Social World Perspective. Studies in Symbolic Interaction 1, 119–128 (1978) 21. Viller, S., Sommerville, I.: Conherence: An Approach to Representing Ethnographic Analyses in Systems Design. Human-Computer Interaction 14, 9–41 (1999) 22. Von Hippel, E.: Democratizing Innovation. MIT Press, Cambridge (2005) 23. Von Hippel, E., Katz, R.: Shifting Innovation to Users via Toolkits. Management Science 48(7), 821–833 (2002), http://userinnovation.mit.edu/papers/10.pdf
Visualization of Software and Systems as Support Mechanism for Integrated Software Project Control Peter Liggesmeyer1,2, Jens Heidrich1, Jürgen Münch1, Robert Kalcklösch2, Henning Barthel1, and Dirk Zeckzer2 1 Fraunhofer IESE, Fraunhofer Platz 1, 67663 Kaiserslautern, Germany {peter.liggesmeyer,jens.heidrich,juergen.muench, henning.barthel}@iese.fraunhofer.de 2 TU Kaiserslautern, Post Office Box 3049, 67653 Kaiserslautern, Germany {kalckloesch,zeckzer}@informatik.uni-kl.de
Abstract. Many software development organizations still lack support for obtaining intellectual control over their software development processes and for determining the performance of their processes and the quality of the produced products. Systematic support for detecting and reacting to critical process and product states in order to achieve planned goals is usually missing. One means to institutionalize measurement on the basis of explicit models is the development and establishment of a so-called Software Project Control Center (SPCC) for systematic quality assurance and management support. An SPCC is comparable to a control room, which is a well known term in the mechanical production domain. One crucial task of an SPCC is the systematic visualization of measurement data in order to provide context-, purpose-, and role-oriented information for all stakeholders (e.g., project managers, quality assurance managers, developers) during the execution of a software development project. The article will present an overview of SPCC concepts, a concrete instantiation that supports goal-oriented data visualization, as well as examples and experiences from practical applications. Keywords: Software Project Control Centers, Visualization Mechanisms, Data Visualization, GQM.
1 Introduction The complexity of software-intensive systems and development projects continues to increase. One major reason is the ever-increasing complexity of functional as well as non-functional software and systems requirements (e.g., reliability or time constraints for safety critical systems). The more complex the requirements, the more people are usually involved in meeting them, which further increases the complexity of controlling and coordinating a project. This, in turn, makes it even harder to develop the system according to plan (i.e., matching time, quality, and budget constraints). Project control issues are very hard to handle. Many software development organizations still lack support for obtaining intellectual control over their software development projects and for determining the performance of their processes and the quality of the J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 846–855, 2009. © Springer-Verlag Berlin Heidelberg 2009
Visualization of Software and Systems as Support Mechanism
847
produced products. Systematic support for detecting and reacting to critical process and product states in order to achieve planned goals and quality is usually missing [15]. One way to support effective control of software development projects is the use of basic engineering principles [7], [19], with particular attention to the monitoring and analysis of actual product and process states, the comparison of actual states with planned states, and the initiation of any necessary corrective actions during project execution. Effectively applying these principles requires the collection, interpretation, and appropriate visualization of measurement data according to a previously measurement goals and plans in order to provide stakeholders with up-to-date information about the project state. One major challenge is to adequately (and partially integrated) visualize process and product properties during project execution so that informed decisions can be made by the relevant stakeholders (such as project managers, quality assurance personnel). This addresses, for instance, early warning mechanisms that recognize insufficient quality characteristics of development products or the ability to generate accurate effort and cost predictions. In the aeronautical domain, air traffic control systems are used to ensure the safe operation of commercial and private aircraft. Air traffic controllers use these systems to coordinate the safe and efficient movement of air traffic (e.g., to make certain that planes stay a safe distance apart or to minimize delays). These systems collect and visualize all critical data (e.g., the distance between two planes, the planned arrival and departure times) in order to support decisions by air traffic controllers. Software project control requires an analogous approach that is tailored to the specifics of the process being used (e.g., its non-deterministic, concurrent, and distributed nature). A Software Project Control Center (SPCC) [15] is a control system for software development that collects all data relevant to project control, interprets and analyzes the data according to the project’s control needs, visualizes the data for different project roles, and suggests corrective actions in the case of plan deviations. An SPCC could also support the packaging of data (e.g., as predictive models) for future use and contribute to an improvement cycle spanning a series of projects. Controlling a project means ensuring the satisfaction of project objectives by monitoring and measuring progress regularly in order to identify variances from the plan during project execution, so that corrective action can be taken when necessary [17]. Planning is the basis for project control and defines expectations, which can be checked during project execution. Project control is driven by different role-oriented needs. We define control needs as a set of role-dependent requirements for obtaining project control. A project manager needs different kinds of data, data of different granularity, or different data visualizations than a quality assurance manager. In this article, we want to illustrate selected existing project control approaches (Section 2), and then focus on a concrete instantiation that supports goal-oriented data visualization, the so called Specula approach (Section 3). Afterwards, we will present advanced visualization mechanisms used for controlling risks and quality of development projects and selected lessons learned from their application (Section 4). 
Finally, we will give a summary and illustrate future research fields (Section 5).
848
P. Liggesmeyer et al.
2 Related Work An overview of the state of the art in Software Project Control Centers can be found in [15]. Most of the existing, rather generic, approaches for control centers offer only partial solutions. Especially purpose- and role-oriented usages based on a flexible set of techniques and methods are not comprehensively supported. In practice, many companies develop their own dashboards (mainly based on Spreadsheet applications) or use dashboard solutions that provide a fixed set of predefined functions for project control (e.g., deal with product quality only or solely focus on project costs) and are very specific to the company for which they were developed. The indicators used to control a development project depend on the project’s goals and the organizational environment. There is no default set of indicators that is always used in all development projects in the same manner. According to [14], a “good” indicator has to (a) support analysis of the intended information need, (b) support the type of analysis needed, (c) provide the appropriate level of detail, (d) indicate a possible management action, and (e) provide timely information for making decisions and taking action. The concrete indicators that are chosen should be derived in a systematic way from the project goals [12], making use of, for instance, the Goal Question Metric (GQM) approach [3]. Some examples from indicators used in practice can be found in [1]. With respect to controlling project cost, the Earned Value approach provides a set of commonly used indicators and interpretation rules. With respect to product quality, there exists even an ISO standard [10]. However, the concrete usage of the proposed measures depends upon the individual organization. The test / diagnosis of complex systems was put on a formal basis in 1967 by [16]. One of the drawbacks was addressed by [13]. A good overview of system diagnosis models can be found in [2]. With respect to the visualization and applicable tools, an overview is presented in [20].
3 Goal-Oriented Software Project Control Specula [8] is a state-of-the-art SPCC. It interprets and visualizes collected measurement data in a goal-oriented way in order to effectively detect plan deviations. The control functionality provided by Specula depends on the underlying goals with respect to project control. If these goals are explicitly defined, the corresponding functionality is composed out of packaged, freely configurable control components. Specula provides four basic components: (1) a logical architecture for implementing software cockpits, (2) a conceptual model formally describing the interfaces between data collection, data interpretation, and data visualization [9], (3) an implementation of the conceptual model, including a construction kit of control components, and (4) a methodology of how to select control components according to explicitly stated goals and customize the SPCC functionality [8]. The methodology is based on the Quality Improvement Paradigm (QIP) and makes use of the GQM approach [3] for specifying measurement goals. QIP is used to implement a project control feedback cycle and make use of experiences and knowledge gathered in order to reuse and customize control components. GQM is used to drive the selection process of finding the right control components according to defined goals. The different phases that have to be
Visualization of Software and Systems as Support Mechanism
849
considered for setting up and applying project control mechanisms can be characterized as follows: I. Characterize Control Environment: First, stakeholders characterize the environment in which project control shall be applied in order to set up a measurement program that is able to provide a basis for satisfying all needs. II. Set Control Goals: Then, measurement goals for project control are defined and metrics are derived determining what kind of data to collect. In general, any goal derivation process can be used for defining control objectives. For practical reasons, we focus on the GQM paradigm for defining concrete measurement goals addressing the measurement object, purpose, quality focus, viewpoint, and context information. III. Goal-oriented Composition: Next, all control mechanisms for the project are composed based on the defined goals in order to provide online feedback on the basis of the data collected during project execution; that is, control techniques and visualization mechanisms are selected from a corresponding repository and instantiated in the context of the project that has to be controlled. This process is driven by interpretation and visualization models that clearly define which indicators contribute to specific control objectives, how to assess and aggregate indicator values, and how to visualize control objectives and intermediate results. IV. Execute Project Control Mechanisms: Once all control mechanisms are specified, a set of role-oriented views is generated for controlling the project. When measurement data are collected, the control mechanisms interpret and visualize them accordingly, so that plan deviations and project risks are detected and a decision-maker can react accordingly. If a deviation is detected, its root cause must be determined and the control mechanisms have to be adapted accordingly. This, does, for example, require data analyses on different levels of abstraction in order to be able to trace causes of plan deviations. V. Analyze Results: After project completion, the resulting visualization catena has to be analyzed with respect to plan deviations and project risks detected in time, too late, or not detected at all. The causes for plan deviations and risks that were detected too late or that were not detected at all have to be determined. VI. Package Results: The analysis results of the control mechanisms that were applied may be used as a basis for defining and improving control mechanisms for future projects (e.g., selecting the right control techniques and data visualizations, choosing the right parameters for controlling the project). Fig. 1 illustrates the basic conceptual modules of the Specula approach. The customization module is responsible for selecting and adapting the control components according to project goals and characteristics and defined measurement (control) goals. It is possible to include past experience (e.g., effort baselines, thresholds) in the selection and adaptation process. This experience is stored in a experience base. A Visualization Catena (VC) is created, which formally describes how to collect, interpret, and visualize measurement data. The set of reusable control components from which the VC is instantiated basically consists of integrated project control techniques (for interpreting the data in the right way) and data visualization mechanisms (for presenting the interpreted data in accordance with the role interested in the data). 
The central processing module collects measurement data during project performance and
850
P. Liggesmeyer et al.
interprets and visualizes them according to the VC specification. Measurement data can be retrieved automatically from project repositories or manually from data collection forms and formal documents. Finally, charts and tables are produced to allow for online project control. A packaging module collects feedback from project stakeholders about the application of the control mechanisms and stores them in an Experience Base (e.g., whether a baseline worked, whether all plan deviations were detected, or whether retaliatory actions had a measurable effect). Using these modules, the Specula framework is able to specify a whole family of project control centers (which is comparable to a software product line for control centers). Project Members GQM GQM Plan Plan
G G Q Q
M M
Project Planners
M M
Q Q M M
GQM GQM VI VI FI FI DE DE WFI WFI
Documents Q Q M M
Customisation Customisation
Goals Goals and and Characteristics Characteristics
M M
Project Project Repositories Repositories
Experience Experience Base Base
Data Data Collection, Collection, Interpretation, Interpretation, and and Visualisation Visualisation
VI VI
Visualisation Visualisation Catena Catena VI VI
Goal Goal Question Question Metric Metric View View Instance Instance Function Function Instance Instance Data Data Entry Entry Web Web Form Form Instance Instance Association Association Data Data Flow Flow
Packaging Packaging
Online Online Control Control
VI VI
VI VI
FI FI
Control Control Components Components
FI FI
FI FI
FI FI
DE DE
DE DE
DE DE
DE DE
WFI WFI
WFI WFI
DB DB
IDE IDE
Project Stakeholders
Fig. 1. Overview of the Specula framework
Fig. 2. Example visualization of a simple hierarchical Gantt chart
The Specula approach was evaluated as part of industrial case studies in the SoftPit project (a public German research project, no. 01ISE07A) in which a prototypical implementation of the concepts was used. Results of the first two iterations can be
Visualization of Software and Systems as Support Mechanism
851
found in [5] and [6]. In general, people perceived the usefulness and ease of use of the Specula control center as positive. However, usefulness and ease of use also varied across the different case study providers depending on the state of the practice before introducing the control center solution and it also largely varied across the different visualization mechanisms used. In the Soft-Pit case, mostly “standard visualizations” for project control were used, such as Gantt charts, line/bar charts, tables/matrixes and simple trees (see, e.g., Fig. 2). One major success factor for the usefulness of visualizations was how intuitive the visualization can be interpreted. Especially, when aggregating data and complex, multi-dimensional relationships needs to be illustrated, this requires more advanced visualization concepts.
4 Advanced Visualization Mechanisms The following sub-sections present examples for advanced mechanisms used for visualizing the risks and quality of development projects and summarize lessons learned from their application. Further visualization mechanisms of quality properties, especially safety and security for embedded systems, is currently investigated in the German research project ViERforES (see http://www.vierfores.de).
Fig. 3. 3D Treemap visualizing different metrics
4.1 Visualizing Code Quality To analyze the quality of a software system, metrics are used that measure certain attributes of the software’s internal structure. Many metric tools exist that are able to define such metrics and collect measurement data automatically. For analysis purposes different visualization techniques such as node-link diagrams and graphs are used in order to help the user in drawing conclusions about the quality of the software system. These techniques use a limited set of graphical elements like text, simple
852
P. Liggesmeyer et al.
geometric shapes or uniform color fills to highlight relevant attributes of the software system being visualized. For combining different metric values within one picture, a 3D-Treemap technique (see Fig. 3. ) was developed and integrated into a code analysis system at Fraunhofer IESE. This visualization mechanism allows us to map data measuring code quality to different graphical properties of each cube (such as position, size, height, and color). To further analyze these values, the user is able to interactively define new views, pan and zoom within the 3D scene and use a pull-down menu to initialize other measurement or visualization actions. 4.2 Visualizing Risk Management Risk management and especially risk avoidance plays an important role in all development and construction activities. For managing risks, a structured process is mandatory. Visualization is used for analyzing risks and supporting managers in deciding upon necessary actions. Siemens developed a methodology named sira that is used for collecting data about possible risks [4]. This includes structured interviews for determining possible risks, their probability and importance as well as the possible damage that may be caused. Based on this analysis, a risk portfolio is created. In order to analyze these risks, the so-called sira bubble charts were created that summarize all necessary information that has to be discussed with the customer (see Fig. 4 and [4]).
Fig. 4. sira.iris, a visualization of a risk portfolio
4.3 Fault Detection in Distributed Systems The functionality of a software system is distributed over many components and the interaction between these components plays a crucial role. In order to analyze the reliability of a system, data about communicating components and their errorproneness is collected. The output is analyzed using an interactive visualization (see Fig. 5 and [20]). In this visualization color coding is used to characterize the “faultiness” of components with respect to communication relations. The ratio between faulty communication and the overall amount of communication is used for coloring
Visualization of Software and Systems as Support Mechanism
853
all nodes and edges of the graph, indicating starting points for bug fixing activities. The overall approach is described in [11]. Interaction plays an important role in this application. For instance, changing colors helps understanding the impact of faults. Changing transparency of clusters helps understanding structural information about the system.
Fig. 5. Faults collected during the execution of a system
5 Conclusions This article presented the basic concept of an SPCC for establishing project control by means of systematic visualization mechanisms. We illustrated existing approaches and presented a goal-oriented way to establish project control by formalizing the way measurement data are interpreted and visualized according to a previously defined measurement goal. Existing approaches offer mostly partial solutions. Especially goal-oriented usages based on a flexible set of techniques and methods are not comprehensively supported [15]. The expected benefits of the goal-oriented visualization approaches include: (1) improvement of quality assurance and project control by providing a set of custom-made views of measurement data, (2) support of project management through early detection of plan deviations and proactive intervention, (3) support of distributed software development by establishing a single point of control, (4) enhanced understanding of software processes, and improvement of these processes, via measurement-based feedback, and (5) preventing information overload through custom-made views with different levels of abstraction. An important research issue in this context is the development of a schema for adaptable control techniques and methods, which effectively allows for purposedriven usage of an SPCC in varying application contexts. Another research issue is the elicitation of information needs for the roles involved and the development of
854
P. Liggesmeyer et al.
mechanisms for generating adequate role-oriented visualizations of the project data. Another important research issue is support of change management. When the goals or characteristics of a project change, the real processes react accordingly. Consequently, the control mechanisms, which should always reflect the real world situation, must be updated. This requires flexible mechanisms that allow for reacting to process variations. One long-term goal of engineering-style software development is to control and forecast the impact of process changes and adjustments on the quality of the software artifacts produced and on other important project goals. Goal-oriented visualization mechanisms can be seen as a valuable contribution towards reaching this goal.
References 1. Agresti, W., Card, D., Church, V.: Manager’s Handbook for Software Development. SEL 84-101, NASA Goddard Space Flight Center. Greenbelt, Maryland (November 1990) 2. Barborak, M., Malek, M., Dahbura, A.T.: The Consensus Problem in Fault-Tolerant Computing. ACM Computing Surveys 25(2), 171–220 (1993) 3. Basili, V.R., Caldiera, G., Rombach, D.: The Experience Factory. Encyclopaedia of Software Engineering 1, 469–476 (1994) 4. Bülte, H., Mäckel, O.: Mehr sehen mit sira: Mit einem Blick IT-Projekte durchleuchten. In: SE 2009, Kaiserslautern, Germany (2009) 5. Ciolkowski, M., Heidrich, J., Münch, J., Simon, F., Radicke, M.: Evaluating Software Project Control Centers in Industrial Environments. In: Proceedings of the First International Symposium on Empirical Software Engineering and Measurement, pp. 314–323. IEEE Computer Society, Los Alamitos (2007) 6. Ciolkowski, M., Heidrich, J., Simon, F., Radicke, M.: Empirical results from using custom-made software project control centers in industrial environments. In: Proceedings of the Second ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, pp. 243–252. ACM, Kaiserslautern, Germany (2008) 7. Gibbs, W.W.: Software’s Chronic Crisis. Scientific American, 86–95 (1994) 8. Heidrich, J., Münch, J.: Goal-Oriented Setup and Usage of Custom-Tailored Software Cockpits. In: Jedlitschka, A., Salo, O. (eds.) PROFES 2008. LNCS, vol. 5089, pp. 4–18. Springer, Heidelberg (2008) 9. Heidrich, J., Münch, J.: Cost-Efficient Customisation of Software Cockpits by Reusing Configurable Control Components. In: Dekkers, T. (ed.) Proceedings of the 4th Software Measurement European Forum, SMEF 2007, Rome, Italy, May 9-11, 2007, pp. 19–32 (2007) 10. ISO 9126: Software Engineering – Product Quality. Technical Report. ISO/IEC TR 9126. Geneva (2003) 11. Kalcklösch, R.: Gossip-Based Diagnosis of Arbitrary Component-Oriented Systems. Technische Universität Kaiserslautern, PhD Thesis (2008) 12. Kitchenham, B.A.: Software Metrics. Blackwell, Oxford (1995) 13. Kuhl, J.G., Reddy, S.M.: Distributed fault-tolerance for large multiprocessor systems. In: ISCA 1980: Proceedings of the 7th Annual Symposium on Computer Architecture, La Baule, United States, pp. 23–30. ACM Press, New York (1980) 14. McGarry, J., Card, D., Jones, C., Layman, B., Clark, E., Dean, J., Hall, F.: Practical Software Measurement – Objective Information for Decision Makers, 1st edn. AddisonWesley Professional, Reading (October 15, 2001)
Visualization of Software and Systems as Support Mechanism
855
15. Münch, J., Heidrich, J.: Software Project Control Centers: Concepts and Approaches. Journal of Systems and Software 70(1), 3–19 (2003) 16. Preparata, F.P., Metze, G., Chien, R.T.: On the Connection Assignment Problem of Diagnosable Systems. IEEE Transactions on Electronic Computers EC-16(6), 848–854 (1967) 17. Project Management Institute: A Guide to the Project Management Body of Knowledge (PMBOK® Guide) 2000 edn. Project Management Institute, Four Campus Boulevard, Newtown Square, PA 19073-3299 USA (2000) 18. Rombach, H.D., Verlage, M.: Directions in Software Process Research. Advances in Computers 41, 1–63 (1995) 19. Shaw, M.: Prospects for an Engineering Discipline of Software. IEEE Software 7(6), 15– 24 (1990) 20. Zeckzer, D., Schröder, L., Kalcklösch, R., Hagen, H., Klein, T.: Analyzing the Reliability of Communication between Software Entities Using 3D Force-Directed Layout of Clustered Graphs. In: ACM Conference on Software Visualization (SoftVis 2008), Herrsching am Ammersee, Germany, September 16-17 (2008)
Collage: A Declarative Programming Model for Compositional Development of Web Applications Bruce Lucas, Rahul Akolkar, and Charlie Wiecha IBM T.J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598 {bdlucas,akolkar,wiecha}@us.ibm.com
Abstract. Collage is a declarative programming model and runtime expressly targeted at building and deploying cross-organizational software as compositions of web components. Collage is based on an RDF data model, data-driven execution model, and flexible support for cross-organizational composition of both application and UI components. In this paper we outline a uniform set of Collage language features addressing end-to-end application design, including business objects, but with particular focus on user interaction, and adaptation to current interaction platforms such as web browsers. Keywords: Declarative languages, Cross-organizational applications, Distributed Computing, Resource Description Framework, Constraint-based programming.
1 Introduction The goal of the Collage project is to design a radically simplified declarative programming model and runtime expressly targeted at building and deploying crossorganizational software as compositions of web components. An additional goal is to support an evolutionary style of software development that allows rapid application prototyping, but also enables progressive refinement of the initial prototype into a hardened asset. The interaction architecture and presentation model in Collage allows for the description of user interfaces at a range of levels of abstraction and mapping onto a range of interaction devices. In the sections below we outline a simplified and uniform set of Collage language features addressing end-to-end application design, including business objects, but with particular focus on user interaction, and adaptation to current interaction platforms such as web browsers.
2 RDF-Based Data Model The Collage data model is built on a core subset of RDF [1]. RDF was developed as part of the vision of a semantic web, that is, a web of knowledge based on fundamental principles of representation of and reasoning about knowledge. However, the core challenge presented by such a web - managing distributed inter-organizational connected graphs of data - is also shared by web application developers. Collage focuses on and builds on J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 856–865, 2009. © Springer-Verlag Berlin Heidelberg 2009
Collage: A Declarative Programming Model
857
the basics of RDF that enable it to meet that challenge, thus applying RDF to a somewhat different domain than that which initially motivated its development. Each resource may have zero or more associated classes. Classes are named by URIs. RDF represents the classifications of a resource R as a set of triples with R as subject, rdf:type as predicate, and the associated class(es) as object(s). RDF supports multiple classification by associating any number of classification triples with a resource. In Collage multiple classification is an important mechanism for supporting composition: • Collage provides declarative (and therefore easily composable) language constructs by which separately authored program artifacts may independently specify different classifications for the same resource. When those program artifacts are composed to form a single program, the resource is multiply classified. • Collage provides declarative (and so composable) language constructs for associating program behavior, such as creation of dependent structure and reaction to resource updates, with resources on the basis of their classification(s). When those program artifacts are composed to form a single program, the multiply classified resource thereby acquires multiple independently specified facets to its associated behavior. The features discussed in this section that support a flexible approach to class definition and use play an important role in Collage, for such diverse purposes as flexible cross-organizational composition of programs and processes, user interface styling, device adaptation, and so on. The Collage execution model centers (as do many execution models) around a concept of mutable entities with composite values that may be read, and whose values evolve over time by being updated. The RDF data model underlying Collage lacks such a notion, so the Collage data model builds on the RDF model by introducing the concept of resources with associated composite values to model mutable entities. The recursively-composed value of a resource R is represented in Collage by a tree of RDF resource nodes and triples rooted at R. Thus RDF triples are used in Collage both to model a tree-structured composite value of a resource, and to connect related resources in an arbitrary graph. The triples comprising the value of a resource are distinguished from triples that connect resources by having a predicate that is a subproperty of a distinguished property, c:value, defined by Collage. A heavy line in the graphical notation shown in Figure 1 indicates value edges.
3 Execution Model Collage is a programming model for building reactive systems. Thus the Collage execution model defines how the runtime state of a program changes in reaction to the occurrence of external events such as user inputs or messages from external systems; or in reaction to the occurrence of asynchronous internal events. The Collage execution model is data centric, meaning that all aspects of the execution model are described in terms of changes of the runtime state of the program. The runtime state of a program comprises a set of RDF triples contained in a triple store.
858
B. Lucas, R. Akolkar, and C. Wiecha
Fig. 1. Collage c:value properties (shown in bold) model composite values in RDF
Execution in Collage is driven by updates. An update is the assignment of a value to a resource (even if that value is the same as the value that the resource has before the update). An update may be either an initiating update or an ensuing update. Each external event, such as a user input, is reflected in Collage as an initiating update. In response to an initiating update, a Collage program performs an execution cycle. An execution cycle performs a cascade of ensuing updates, as declaratively directed by the Collage program constructs. In effect, the initiating updates may be viewed as "clock ticks" that define a succession of values for the resources in the cascade of the initiating update. The Collage language provides constructs that the programmer uses to specify updates to the runtime state. Following the dichotomy in the data model between value and non-value triples, the execution model provides two classes of constructs for updating program state: • The construct declaratively specifies how the value of a resource changes in response to changes in the values of other resources, using XQuery as a functional value computation language. • The and constructs declaratively specify the creation of new resources, relate new and existing resources by creating and destroying non-value triples, and change the classification of new and existing resources, all based on current resource values. During any given execution cycle a set of binds will be triggered by the update of an input resource. A bind that is triggered is called a relevant bind. Each resource that will be updated is updated exactly once by the execution of a construct, after all input resources to that bind have been updated. Thus each resource has a welldefined old value, which is its value prior to the start of the execution cycle; and a new value, which is its value after the completion of the execution cycle. When a construct executes, it has access to both the old and the new values of its input resources. Binds that use new values can be used to model constraintlike updates, for example keeping a data or presentation view in sync with the data being viewed. Binds that use old values can be used to model non-idempotent operations such as adding to a total, appending an item to a list, or inserting a record into a database.
Collage: A Declarative Programming Model
859
After performing the resource value updates directed by the constructs (by creating and destroying value triples associated with a resource), the Collage program creates new resources, resource relationships (represented as non-value triples), and classifications (also represented as non-value triples), based on the new resource values. These actions are declaratively directed by and constructs. The Collage programming model constructs are declarative, meaning that they may be viewed as conditional constraints on the runtime state. These constraints may be constraints either on the new state of the system after the completion of an execution cycle, or constraints relating the new state of the system after the completion of an execution cycle to the old state of the system prior to the initiating update. At the completion of an execution cycle, the state of the system will satisfy the constraints specified by all and constructs, and the constraints specified by the relevant binds, which are the binds whose execution was triggered on that execution cycle. There are other programming models that work with RDF data stores, by providing the necessary data layer for existing scripting languages. ActiveRDF [2] is an objectoriented API for managing RDF data which can be used as a data layer in Ruby-on-Rails applications. In contrast with ActiveRDF and Ruby, Collage provides a declarative datadriven programming model that supports live data connections, or queries.
4 End-to-End Example The example shown above is a simple end-to-end Collage application that provides a form (1) that allows querying and updating a relational database of weather information (2). The construct associates user interface elements such as inputs (3) and triggers (4) with the WEATHERMAN resource class that represents the form. The construct (5) uses the "city" input field to select a row from the database, recording it using the "selected" property. The construct (6), triggered by the "set" trigger (4), updates the database with the quantity in the "temperature" input field, after converting Fahrenheit to Celsius. A similar construct (7) retrieves the temperature from the database, converting Celsius to Fahrenheit.
Fig. 2. An end-to-end example Collage application – query and update temperatures
860
B. Lucas, R. Akolkar, and C. Wiecha
As indicated by the dashed boxes above, the application may be distributed among multiple computing nodes, such as a browser and a server. Distributed data structures are formed by triples that connect resources with URLs residing on different computing nodes, such as (5). Distributed execution occurs when constructs that reference distributed data structures are executed, such as (6) and (7).
5 Presentation and Interaction Model The presentation and user interaction model in Collage allows the description of user interfaces, or application "front-ends", at a range of levels of abstraction. A recursive MVC pattern is supported, allowing the developer to refine an abstract user interface description through successive levels of more concrete specification. The following diagram summarizes the Collage application data-centric recursive MVC model. The model of an instance of the MVC pattern is represented by a Collage resource. The view of MVC is a set of resources associated with the model, whose instantiation is declaratively driven by the construct. The controller of MVC is a set of constructs that update the model resource in response to updates to the view resources, and vice versa. The set of resources that comprise a view of a model resource may themselves serve as models for other views, thus supporting a recursive MVC pattern. The set of resources comprising a view, together with the -based controller that connects the view resources with the model resource, may also be seen as a more concrete refinement of an abstraction represented by the model. Each instance of this MVC pattern in Collage is referred to as an abstract interaction unit. The development of a user interface in Collage centers around abstract interaction units, and in particular, around the definition of the semantics of the class of an abstract interaction unit's model resource. Composite abstract interaction units are defined by declaratively decomposing them into a collection of subordinate interaction units defined elsewhere.
Fig. 3. Recursive MVC design pattern
Collage: A Declarative Programming Model
861
A variety of composition patterns are supported in this way: • In a typical composition pattern, a subordinate interaction unit is created corresponding to each piece of the model of the composite interaction unit that needs to be presented to the user or collected from the user. • Collections of interaction units may be aggregated based on application semantics, constituting an abstract form. • Interaction units may be designed with an eye to reuse, or they may be designed for a single use. The Collage pattern described above supports either use case equally. The MVC recursion is grounded in built-in abstract interaction primitives, such as , , and, each of which represents a primitive unit of presentation and associated reactive behavior. The abstract interaction primitives are discussed in greater detail in subsequent sections. Necessary user interface layout information for each interaction unit is then specified in terms of an associated abstract layout tree. The layout tree comprises a set of layout containment triples whose predicate is the Collage layout containment property, . The subjects in the layout containment triples are resources whose class indicates the kind of layout. Styling information is added similarly, either interspersed with layout information or separately per developer choice, using the Collage styling property c:style. The subjects in styling triples may be either interaction units or layout units. Top level user interaction window resources in Collage have a c:top classification. An abstract layout tree is presented to the user by either connecting the root of the tree to a c:top resource via a containment triple, or by multiply classify the root of a layout tree as a c:top. Page flow is achieved by using classification and containment mechanisms to dynamically change the content of a top-level window. 5.1 Abstract Interaction Primitives Collage provides built-in abstract interaction primitives, which form the basis of the user interface or application “front-end” definition. The built-in primitives each constitute an abstract unit of user interaction, encapsulating an element of an abstract user interface. These built-in primitives are defined as resource classifications. The builtin set of primitives provides a common ground for development of various concrete user interfaces, such as for supporting different types of devices and interaction modalities.
Fig. 4. Structure of an input with value, label, and hint properties. Labels and hints have value properties in turn for their content.
862
B. Lucas, R. Akolkar, and C. Wiecha
For example, the abstract interaction primitive is used to collect a single piece of information, as an RDF literal. The dependent structure for , the application code that supplies its values, and a possible rendering are exemplified in figure below. A primitive represents a simple one-line input field. A resource has an associate RDF string literal value, which at all times represents the current value of the input field as seen by the user. Thus the value associated with the resource initially represents the default value of the input field, and subsequently represents the response from the user for this unit of interaction. When the user enters a new value into the input field, the system responds by changing the value of the resource, which generates an update of the resource that in turn initiates a Collage execution cycle. The execution cycle may result in the update of a number of other resource values, such as the values of the models further up the tree. In addition, the resource has associated resources whose values represent information such as labels and hints to be presented to the user in association with the input field. 5.2 Layout and Styling The abstract interaction primitives and abstract composition units described above are abstract in the sense that they do not provide information about concrete rendering, such as the physical layout of the interaction units on a screen. This section describes the Collage mechanism for overlaying concrete layout and styling information on the abstract interaction mechanism described above.
Fig. 5. Layout of a login dialog using RDF triples describing containment relations
Collage provides a mechanism for describing a layout tree as an overlay on the abstract interaction units. A layout tree consists of • a set of resources connected as a tree by the Collage layout containment property c:contains; • a root node, such as a c:top resource representing a top-level user interaction window; • a set of interior nodes which are resources of a layout container class such as the built-in containers , , and ; and
Collage: A Declarative Programming Model
863
• a set of leaf nodes which are resources of one of the abstract interaction primitives described above (, , etc.). The following example illustrates a simple layout tree for a login form. In the layout tree diagram, the c:contains triples are shown as blue lines. The separation of concerns between the abstract user interface definition and the layout tree and styling information can be factored in, for example, by having separate resource classifications for the concerns, which may be defined in the same or different Collage program units, by the same or different developers. 5.3 Renderkits The preceding sections describe the mechanisms by which a developer defines abstract and concrete elements of a user interface. In order to present the user interface on any particular device, it must be rendered using a technology that may be specific to the device. A rendered user interface is implemented in Collage as a refinement of the user interface described using the presentation primitives and layout mechanism described above. The resource classifications necessary for such refinement are defined in a renderkit. The following diagram illustrates the relationship between • resources that are created by code that a typical application developer writes: model, abstract UI, and layout, on the left; and • resources that are created by a renderkit, on the right, which most developers need not be concerned with.
Fig. 6. Layers of abstraction in a Collage user interface, and RDF mapping among layers
We anticipate most developers will not need or want to author a renderkit, but rather will use a pre-existing renderkit, such as the system-provided renderkit(s). However, it is possible that advanced developers might want to author a renderkit using the Collage programming model.
864
B. Lucas, R. Akolkar, and C. Wiecha
6 Browser Model Collage is designed to support distributed systems: it is built on RDF, a fundamentally distributed data model, and via its execution model distributed computations as well. Collage extends application distribution all the way to the browser, treating the browser as a Collage computing node (albeit with possibly limited capability). In our current implementation, upon first contacting a server with an HTTP GET request, a browser is sent a bootstrap Javascript program whose execution makes the browser a limited Collage node, capable of managing resources related to the presentation of the user interface. After this initial bootstrap exchange further communication between the browser and the server is via HTTP POSTs using the Collage protocol outlined above. This is accomplished with an AJAX-style JavaScript Collage Engine running in the browser. In effect, Collage functions as a high-level programming environment supporting AJAX-style applications. The browser is capable of initiating updates for resources related to user inputs, of receiving notification of updates to resources related to user outputs, and of receiving and displaying Collage value trees, represented as XML documents, that represent screen layouts. It is envisioned that portions of a Collage program may be compiled to Javascript and loaded into the browser for execution to create a more responsive user experience. The Collage protocol infrastructure maintains equality between the server and browser copies of the rendered user interface representation by sending Collage update messages, in both directions. Update messages from the browser to the server represent user actions such as field inputs and button presses, while update messages from the server to the browser represent the server’s response to user actions, or asynchronous "push" updates to the current browser page.
7 Discussion and Related Work The use of Collage to model user interaction can be compared with earlier work in the area of User Interface Management Systems (UIMS)[3], model-based UI design methods [4], and constraint systems for UI execution models [5]. Collage bind, let, and create expressions, along with their associated execution model, clearly avoid the well-known limitations of traditional UIMS – namely the rigidity deriving from a fixed separation between “application” and “user interface. Collage provides a generic relation and constraint management mechanism suitable to an open ended set of design patterns – some of which may correspond to typical UI layerings. We are interested in the more general case in which an application may be decomposed more continuously between back-end and front-end UI logic -- via the same underlying set of abstractions. More recent work on Model-based UI design [4] avoids the rigidity of traditional UIMS but typically still assumes a given set of model types and relations. Collage can be viewed as a lower-level framework suitable for representing an open-ended set of ontologies for model-based UI design. Note that some earlier systems for modelbased UI design, including XIML [6], are extensible in their model, attribute, and relation types as well, and are intended as more open-ended UI data frameworks.
Collage: A Declarative Programming Model
865
Collage, in addition, provides an execution semantics that supports runtime evaluation of structural and value changes across a set of linked models. Collage’s support for implicitly propagating structure and value change events in a distributed application makes possible late design decisions in partitioning applications across multiple nodes in a network. Current web design practice involves one programming model on the server to generate markup and script, and another on the client to execute them. This split forces very early and rigid design decisions in partitioning applications to suit a given network topology. Collage allows a uniform endto-end specification language allowing for flexible retargeting and redeployment of components as desired. Function originally developed for the server might be migrated to the client to aid performance. Conversely, function originally prototyped on the client might be migrated to a hosted server-based environment for larger-scale managed deployment. Finally, we can compare Collage with XForms [5], an existing declarative datadriven language for web applications. Collage builds on and generalizes many of the concepts familiar from XForms to produce a uniform programming model across all application tiers. The XML tree-based MVC design of XForms is made recursive and generalized to RDF graphs in Collage. The concept of view-model and model-model binding is expanded to a general purpose computation model based on resourceresource binding. Data-driven user interface instantiation is generalized to declarative resource instantiation. The event-driven execution model of XForms is simplified and regularized by the data centric update-driven execution model of Collage.
References 1. Klyne, G., Carroll, J. (eds.): Resource Description Framework (RDF): Concepts and Abstract Syntax. World Wide Web Consortium (2004), http://www.w3.org/TR/rdf-concepts 2. Oren, E., Delbru, R., Gerke, S., Haller, A., Decker, S.: ActiveRDF: Object-Oriented Semantic Web Programming. In: 16th International Conference on the World Wide Web, pp. 817– 824 (2007) 3. Olsen, D.: User Interface Management Systems: Models and Algorithms. Morgan Kaufmann, San Mateo (1992) 4. Trætteberg, H., Molina, P., Nunes, N. (eds.): MBUI 2004, First International Workshop on Making Model-based User Interface Design Practical, Funchal, Portugal (2004), http://ftp.informatik.rwth-aachen.de/Publications/ CEUR-WS/Vol-103/ 5. Boyer, J., Dubinko, M., Klotz, L., Landwehr, D., Merrick, R., Raman, T.V. (eds.): XForms 1.0, 3rd edn. World Wide Web Consortium (2007), http://www.w3.org/TR/xforms 6. Puerta, A., Eisenstein, J.: Towards a General Computational Framework for Model-Based Interface Development Systems. Knowledge-Based Systems 12, 433–442 (1999)
Hypernetwork Model to Represent Similarity Details Applied to Musical Instrument Performance Tetsuya Maeshiro1, Midori Maeshiro2, Katsunori Shimohara3, and Shin-ichi Nakayama1 1
School of Library and Information Science, University of Tsukuba, 1-2 Kasuga, Tsukuba, 305-8550 Japan {maeshiro,nakayama}@slis.tsukuba.ac.jp 2 School of Music, Federal University of Rio de Janeiro, Rua do Passeio, 98, Lapa, Rio de Janeiro, Brazil
[email protected] 3 Faculty of Engineering, Doshisha University, 1-3 Tatara Miyakodani, Kyotanabe, 610-0394 Japan
[email protected] Abstract. This paper treats the quantification and description of similarities among entities being represented as a network. The proposed representation model, hypernetwork model, allows more specific description of relationships among represented entities than conventional knowledge representation models. Musical instruments performance is represented with hypernetwork model. Detailed description of similarity relationships provided by the hypernetwork model enables the discrimination of various types and degrees of similarity. A method to compare similar relationships is also discussed, which leads to the analogical reasoning, associative search and retrieval.
1 Introduction This paper treats the detailed description of relationships among entities represented as a network, where a node represents an instance of entity and nodes are linked if represented instances are related. Basic assumption is that target entities can be represented using three elements that constitute a network representation, namely nodes, edges and edge labels. For instance, knowledge is represented as a set of concepts and semantic relationships among them. A central problem that resides is the quantification of "similar" or "similarity" among entities. Detailed description of similarity relationships provided by the proposed model enables the discrimination of various types and degrees of similarity. Furthermore, it is also possible to compare similarity degrees and ranks according to likeness, which leads to the analogical reasoning, associative search and retrieval. Any human computer interaction and communication, whether verbal or nonverbal, requires that computers store information exchanged with humans based on some kind of representation model. Description of similarity is not possible using conventional knowledge representation models such as semantic network [1]. The reason is that these models are based J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2009, LNCS 5610, pp. 866–873, 2009. © Springer-Verlag Berlin Heidelberg 2009
Hypernetwork Model to Represent Similarity Details
867
on graph [2] : (1) Links (edges) have no child nodes attached to describe the links, and only labels are allowed. (2) When one creates a graph (network) to describe association among link labels, this graph should exist independently from the network whose link labels were extracted. However, this “link-label network” should be integrated for more powerful representation. The hypernetwork model, the proposed representation model, allows a specification of relationships. An immediate advantage of a precise description of similarity is a more precise comparison of entities represented in networks, because accurate comparison allows precise detection of subnetworks with desired degree of similarity. Identification of relevant subnetworks, or the subnetworks classified as highly similar, is crucial for pattern matching and analogical reasoning, and the latter relies totally on how similar subnetworks are extracted from the whole network. The precision and quality of reasoning are directly influenced from the quality of matched subnetworks. Exact match is not useful because analogy exploits similarity or approximation among data or information. Conventional knowledge representation models such as semantic network [1], frame and entity-relation model [3] are all equivalent models. These conventional models require the use of predefined keywords (controlled keywords) to label the links to specify relationships. To indicate that two nodes represent entities or objects that are alike or related, a simple "similar" or "related" labels are commonly used without distinguishing how similar or related the connected nodes are. A possible solution is to provide unique label for all relationships, but this is unreal due to the problem related with frame problem or symbolic AI problem. Quantification of similarity among entities implies that two relationships can be related objectively utilizing verifiable method. Evidently, subjectiveness may occur in the description of each similarity relationship, but the comparison is reproducible and gives identical results. Once the specification of similarity is provided, comparison of similarity and quantification of similarity, both absolute and relative, are possible. An efficient network matching mechanism for the proposed method is also discussed, based on a high speed computer system suitable to process network data.
2 Hypernetwork Model The proposed model is extended from hypergraph [4], which has more representation power than conventional knowledge representation models that are based on graph [2]. The proposed model follows basic definitions of semantic networks [1], where a node is connected to other nodes (1) to specify the nodes or (2) when nodes are related by some relationship. Figure 1 shows examples of the graph and the hypergraph. Edges of hypergraph connecting more than two nodes are represented with circle surrounding the connected nodes. A graph G is defined as G=VxV, where V is the set of nodes, and edge connects a pair of nodes. As the formula indicates, a graph is a matrix generated by product of nodes. On the other hand, the hypergraph H is defined as H=VxE, where E is the set of edges. VxE is a matrix that results from product of vertices by nodes. Hence an edge of hypergraph, called hyperedge or hyperlink, connects any number of nodes.
868
T. Maeshiro et al.
This is a notable difference from graph that can connect only a pair of nodes. Note that N-ary relationship is fundamentally different from the combination of binary relationships, as they represent distinct concepts. Another interpretation of the formula VxE is under the framework of set theory, where each hyperlink defines a distinct subset of nodes.
a d
c
a 0 1 1 0
a b c d
b
b 1 0 1 0
c 1 1 0 1
d 0 0 1 0
(A)
e1
a
a b c d
e2 c
b e3
e1 1 1 1 0
e2 1 0 1 0
e3 0 1 0 1
d
(B) Fig. 1. (A) A graph. An edge connects a pair of nodes. The matrix on the right is the adjacency matrix VxV. 1 if the nodes are connected, and 0 otherwise. (B) A hypergraph. The circle surrounding nodes a, b and c is an edge that connects these three nodes. An edge may contain any number of nodes. The matrix on the right is the connection matrix VxE. 1 if the edge connects the nodes, 0 otherwise.
Another representation of hypergraph, the Bipartite graph, is obtained by converting the formula VxE to a formula similar to the graph, VxV. If the set of links E of hypergraph is treated as a set of nodes VE, H=VxE becomes HB=VxVE, a structure called bipartite graph. Two different types of nodes exist in bipartite graph. One is the set of nodes that exists as nodes (vertices) in original hypergraph, denoted vertex nodes. The other one is the set of hyperlinks in original hypergraph, denoted link nodes. Although bipartite graph has expression capability equivalent to the hypergraph, it is a graph because a link always connects a pair of nodes. However, a link never connects nodes of the same type, i.e., two vertex nodes or two link nodes. A connection is allowed only between a vertex node and a link node. The third type of node exists in the proposed model, denoted attribute node, which has no correspondence in hypergraph and subsequently in bipartite graph. Due to the presence of this node, though a hypergraph can be converted to a bipartite network, this process is irreversible.
Hypernetwork Model to Represent Similarity Details
e1
e1 a
e2
869
e3
e2 c
b e3
d
a
c
b
(A)
d
(B)
Fig. 2. Bipartite graph. (A) hypergraph with labels attached to nodes (a, b, c, d) and edges (e1, e2, e3). (B) Bipartite representation of the hypergraph in (A), where the four nodes on bottom denote nodes in hypergraph (a, b, c, d), and the three nodes on top (shadowed circles) denote edges in hypergraph (e1, e2, e3). The node e1 is connected to nodes a, b and c in (B) because the edge e1 connects nodes a, b and c in (A).
In the context of knowledge representation models, a concept or an entity represented by vertex node can be specified in two ways: by describing the attributes of the vertex node, or by relating to other vertex nodes. Combination of the two is also possible. The attribute node exists to specify any of three node type: vertex node, relation node and attribute node. An attribute node can further specify other attribute nodes. Table 1. Connectivity among vertex node, relation node and attribute node
Vertex node
Relation node
Attribute node
−
Connect
Connect
Relation node
Connect
−
Connect
Attribute node
Connect
Connect
Connect
Vertex node
Table 1 shows the connectivity constraints among three node types. Two connections are prohibited: between vertex node and vertex node, and between link node and relation node, constraint imposed from their role in hypergraph. Table 1 is symmetrical on diagonal axis although the directionality of links depends on the context and what the network represents. 2.1 Specification of Similarity Attribute node linked to a vertex node specifies or defines the properties of the entity represented by the vertex node. Attribute node linked to another attribute node is homologous to the previous case, and it defines the properties of the quality or concept expressed by the other attribute node. On the other hand, attribute node connected to a link node is the element absent in conventional knowledge representation
870
T. Maeshiro et al.
models. This connection enables a detailed specification of relationship among vertex nodes. Note that the relationship treated here is N-ary which covers the binary relationship, the only relationship that conventional models can represent. The ability to assign attributes to relationship is essential to qualify and then quantify the similarity relationships that are simply labeled “similar” in semantic networks, for example. Furthermore, attribute nodes connected to link node are specified with more details by further connecting attribute nodes, generating a multi-level hierarchical structure of attributes. Any type of relationship is specified in same manner, but this paper focus on “similarity”, a very broad conception, meaning anything between identical and different. Since vertex nodes and relation nodes can connect to attribute nodes, relation nodes are related to other relation nodes. Consequently, a concept network with higher density than conventional models emerges. 2.2 Qualification and Quantification of Similarity Qualification is sufficient for exact match of concept networks. Assignment of attribute nodes to a relation node is equivalent to qualification of the relationship represented by the relation node. Any detail is possible by attaching necessary number of attributed nodes directly to the relation node or further detailing the attribute nodes with any of vertex, relation or attribute nodes. Opposite to qualification, quantification of similarity is necessary for approximate match of networks, which has broader and more useful applications. Typical operation is the network search and match, where the task is to enumerate all subnetworks present in the target network that are similar to the query network. Another task is to compare two networks, then detect similarities and differences, in order to calculate quantitative value of similarity. Another important application is the enumeration of similar concepts from a given concept, which is a simulation of associative reasoning. The process is like associative game, where each participant suggests a keyword related to the previous keyword, generating a chain of sequentially related keywords. The hypernetwork model can simulate this process by choosing an initial node and then successively “firing” the nodes connected directly with similarity relationship. This activation process selects multiple nodes at each step, and the number of activated nodes increases monotonically.
3 Representation of Musical Instrument Performance We employ the hypernetwork model to represent performance techniques of musical instruments. Music is unique because it contains emotional and logical aspects. Emotional aspect is evident. Logical aspect is found in musical theory, harmony and signal processing, among others. Musical instruments performance or playing techniques described as rules are useful for automatic “humanization” system that processes exact compilation of musical score data in MIDI format to generate MIDI data that sounds similar to human players’ performance [5]. The system, named MorPH, incorporates a set of rules provided for each musical instrument to modify note loudness (velocity), timing, duration, and effects caused by performance techniques such as ghost notes, fill-ins and vibrato.
Hypernetwork Model to Represent Similarity Details
871
Different players of same musical instrument are described by different rule sets. A rule is a combination of one or more conditions and the instrument playing technique to be applied when conditions are fulfilled. Examples of rules are (1) Initial note of a phrase is soft (all instruments); (2) Insert a ghost note on the high beat if note is absent (bass). In some cases, particularly in classical music, different rule set is required for each musical piece even for same player and instrument. It is useful to study the relationship between performance techniques and performance impressions. One can study performance impressions by varying the performance rules and analyzing what people feel or perceive. Parameters of the analysis are the rule sets and similarity among rule sets. The generation of a rule set that completely describes the characteristics of a professional musician is hard because of the difficulty in generating symbolic expression of specific and subtle conditions to apply the rule. Each performance rule set, available for each instrument of a player, is represented by a vertex node. Individual rules are also vertex nodes, connected to the vertex node representing the rule set (player and instrument) through a relation node (Figure 3). Similarity among rule sets is valuable to detect similar players. If the description of similarity is unsupported by representation model, on the fly calculation of similarity among rule sets is necessary while executing network search and match. In some cases, however, calculation of similarity is not possible, particularly when associated with impressions. In this case, similarity has to be specified by humans and input manually. We synthesized a hypernetwork of five musical instruments: base, drums, guitar, sax and piano, with three players for each instrument, totaling fifteen rule sets. Generated similar
Player1 Instrument1 whole
Player2
rule1
part
Instrument1
whole
part-of
part
Player3 Instrument1
part-of
part part
rule2
rule3
rule4
part
rule1
part part
rule2
rule3
Fig. 3. Hypernetwork representation of musical instrument performance rule sets. Circles are vertex nodes, shaded circles are relation nodes, and double circles are attribute nodes. Illustrative example to describe similarity between instrument1 performance of two players. “whole” and “part” are labels assigned to link between relation node and vertex node. Attribute nodes connected to “similar” relation node specifies properties particular to the relation node. Note that relation nodes denote N-ary relationships.
872
T. Maeshiro et al.
rules were classified according to performance techniques, such as fast passage notes, fill-ins, short phrase melodies, and chord successions. In addition to the relation nodes that relate rule sets at the player level (Figure 3), rules were also related on performance technique level. Similarity among rule sets is calculated based on the number of rules and similarity among rules. The similarity among rules is defined by comparing the conditional and execution parts of rules, and by similarity among conditions. A peculiar characteristic of musical performance rule is that both condition part and execution part of rules describe notes of musical score, but is actually referring time duration that the note sounds. The time duration in actual performance is never identical to what is indicated in musical score, and can be shorter or longer. Specificity of a rule is defined as a combination of specificity of condition and execution parts, and shorter the time range (duration) referred by the condition or affected by the execution, more specific the rule is. Furthermore, the priority of the specificity of execution part is higher than the condition part.
4 Discussions Given the hypernetwork of musical instrument performance, one can choose a rule set (player) and then gradually change the performance by traversing the hypernetwork. Further the rule set positioned from the starting rule set, the difference of listening impression should be larger. But this depends on the algorithm to calculate the difference among rule set, and experiment is necessary to analyze the impression. Many graph search and match algorithms have been proposed in the field of graph theory, but usually their objective is to find cliques (completely connected graphs) or exact matched graphs. Furthermore, many are approximate algorithms due to their NP completeness nature. We have developed Starpack, a fast computer system to process data represented as hypergraphs. Thus graphs are included. Starpack has been used to simulate gene regulatory networks [6,7]. A node represents a gene, and genes are connected if they participate in the same reaction or one regulates others. The architecture of Starpack consists basically of simple processors and registers. The simulation consists of calculating temporal variation of quantities of all genes, and computation is ruled by different gene regulatory formula associated with each relationship that uses quantities of genes that participate in the regulatory relationship. Registers store the values of gene quantities. Starpack is a massively parallel system, and mapping between regulatory formula and processors is one to one. So all gene regulatory formula is executed in parallel, and the execution speed is up to one million times faster than software based simulators. The processor represents a relation node of bipartite graph. The hypernetwork model is easily implemented on Starpack. Vertex nodes and attribute nodes are assigned to registers that are used to represent gene quantity, and relation nodes are assigned to processors. Both mappings are one to one. Instead of quantity value, ID number that identifies symbols and labels are stored in registers. To execute network search and match, the whole target network is loaded onto Starpack, and the query network, whose size is smaller than the target network, is input to Starpack. Then the query network propagates over the processors through connection
Hypernetwork Model to Represent Similarity Details
873
determined by the target network. This architecture has two advantages. First, selection of search strategy such as breadth-first, depth-first and beam search is irrelevant because the query data propagates simultaneously to all connected processors, and match is executed in parallel. Second, NP computational complexity problem vanishes due to parallel execution as long as the target network fits into Starpack. Even if the number of relation nodes is greater than the number of processors, or the number of vertex nodes and attribute nodes exceeds the number of registers, computational cost is still polynomial because the target network data can be stored in external storage and swapped. Experiments indicate that some concepts are represented by single neuron in human brain [8]. The implementation of the hypernetwork model on Starpack is a possible approach to build brain-like inference system, where concepts are represented by vertex nodes, and semantic relationships by relation nodes. The study of network search and match is under progress, and results will be presented in future. Some argue that pattern matching and hierarchical structure are two basic factors of intelligent computer systems [9]. Pattern matching can be described as a network matching process if the target data is represented as networks. Our model can be applied to intelligent machines using concept network, where a node represents a concept and nodes are connected if the concepts are semantically related. Specification of "similar" relationships among nodes enables the firing of reasoning process on concept network, and the process continues, spreading the firing signal over the network. This is an associative search that traces paths interconnected by semantic relationships among concepts.
References
1. Quillian, M.R.: Word concepts: a theory and simulation of some basic semantic capabilities. Behavioral Science 12(5), 410–430 (1967)
2. Berge, C.: The Theory of Graphs. Dover (2001)
3. Date, C.J.: An Introduction to Database Systems, 8th edn. Addison-Wesley, Reading (2003)
4. Berge, C.: Hypergraphs: Combinatorics of Finite Sets. North-Holland, Amsterdam (1989)
5. Maeshiro, T., Shimohara, K.: MorPH - Musical Performance Humanizer. In: Proceedings of the International Symposium: KANSEI 2001, pp. 195–198 (2001)
6. Maeshiro, T., Hemmi, H., Shimohara, K.: Ultra-fast Genome-wide Simulation of Biological Signal Transduction Networks - Starpack. In: Frontiers of Computational Science, pp. 177–180. Springer, Heidelberg (2007)
7. Hemmi, H., Maeshiro, T., Shimohara, K.: New Computing System Architecture for Scientific Simulations - Non CPU-oriented Methodology. In: Frontiers of Computational Science. Springer, Heidelberg (2007)
8. Quiroga, R.Q., Reddy, L., Kreiman, G., Koch, C., Fried, I.: Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107 (2005)
9. Hawkins, J., Blakeslee, S.: On Intelligence. Times Books (2004)
Open Collaborative Development: Trends, Tools, and Tactics Kathrin M. Moeslein1,2, Angelika C. Bullinger1, and Jens Soeldner1 1
Chair for Information Systems 1, University of Erlangen-Nuremberg, Lange Gasse 20, 90403 Nuremberg, Germany {kathrin.moeslein,angelika.bullinger, jens.soeldner}@wiso.uni-erlangen.de 2 Center for Leading Innovation & Cooperation (CLIC), HHL – Leipzig Graduate School of Management, Jahnallee 59, 04109 Leipzig, Germany
Abstract. Following the successful example of open source, companies can be observed opening their innovation and development processes to interested and capable partners inside and outside the organization. Previous research has neglected the need to integrate these different innovators. In this paper, we start to explore how this integration can be facilitated by social software, a class of web-based, user-centric applications commonly referred to by the term Web 2.0. We present data on 24 social networking services, which we examined along the characteristics typically used in the field of social software. Keywords: Social networking services, open innovation, collaborative development.
1 Trends: Open Innovation Innovations, generally defined as the development of novel products, services, and processes, have come to be accepted as the main driver of successful companies (e.g. [1, 2]). The question for companies and research is no longer whether innovation is important, but how innovation is conducted and how innovation processes are steered. Companies have accordingly invested considerable resources in the establishment and development of capable R&D departments. Researchers and developers have typically grown into experts in their field who are sometimes ‘locked in’ their R&D knowledge. To overcome this challenge of ‘seeing and not seeing’ [3], organizations have increasingly opened their development processes towards more collaborative approaches. Companies have been opening up formerly closed innovation and development processes, traditionally conducted by their internal R&D departments, and have started to allow external actors to participate in development processes. Accordingly, external sources of innovation have received growing attention in studies investigating innovation. The strategy of organizations opening up their innovation processes to outside innovators has been termed open innovation [4]. Consequently, we speak of open collaborative development when it comes to the collaborative development of products, services, and processes over the web.
From this perspective, innovation or development no longer depends on members of an organizational unit within a single company, but is rather generated by the (social) interaction of members of distributed, organization-spanning networks [5]. For open collaborative development to be successful, the active participation of competent participants is as crucial as it is for open innovation processes. Innovators inside and outside the company borders need to be integrated in the relevant processes – from idea generation and selection to implementation. Motivating and enabling these potential participants to publish their ideas and to comment on and participate in the further development of their own ideas and those of others presents a challenge. In the field of open innovation, first ideas are emerging that collaborative applications from the area of web 2.0 could be promising for this task [6]. In this paper, we go beyond open innovation and focus in particular on open collaborative development, i.e., collaborative development over the web. The paper proceeds as follows: the next section provides information on the characteristics of social software (cf. tools) and on a particular type of application in this field, social networking services (cf. tactics). The following section presents our research approach, followed by the empirical findings on the 24 social networking services in our analysis, focusing in particular on their support of open collaborative development. Finally, a conclusion and outlook close the paper.
2 Tools: Social Software Social software denotes internet-based applications that enable, support, and extend indirect and direct interpersonal interaction in (parts of) public social networks, leveraging network effects and effects of scale [6, 7, 8]. The emergence of applications from the field of social software is related to several current trends that favor software-based collaborative development. First, the continued expansion of the open source movement provides interested users with new and modified technical tools – and, besides, influences the way of thinking and of (collaboratively) working [9]. Second, the development of information technologies is powering the evolution of the internet. The change of the internet as we have known it for years into the so-called web 2.0 and the semantic web brings about important changes both in technologies and in usage patterns. Beginning with technologies, advances in web technologies have improved the ways to search for, decide on, and implement innovation [10]. Attributes of web 2.0 include RSS feeds, tag clouds, mashups, and rich internet applications, as well as new ways and tools for managing content and for delivering services. Thus, technological advancements support openness, foster innovative collaboration, and ease information exchange – i.e., the web 2.0 changes usage patterns [11]. It stands for a new way of interacting with information on the internet [12, 13]. Users are enabled to participate as active users who can easily provide new content and comment on existing content. Similar to the ideas of open innovation, web 2.0 effectively enables a user-generated internet. This kind of user involvement is made possible by a variety of social software applications, e.g., weblogs, wikis, and services for social networking and social bookmarking. Important traits of social software are their ease of use and
high flexibility. Encouraging reports point out that social software can effectively support collaboration and knowledge exchange among employees within the firm [14, 15]. This leads us to conclude that applying the mental model of open innovation to an analysis of social software might indicate potential ways in which development processes can be made more collaborative over the web.
3 Tactics: Social Networking Services From the variety of social software applications, social networking services seem particularly suitable for initiating, supporting, and extending indirect and direct interpersonal interaction, and thus for enabling open collaborative development. A social network is a set of people connected by a set of social relations, such as friendship, co-working, or information exchange [16]. Members of social networks tend to have weak ties – in contrast to communities, where members tend to have strong ties [17, 18]. One of the benefits of social networking services (SNS) is that they reveal these links, which can be useful, e.g., for a job search [17]. Richter & Koch [19] define social networking services as web-based services that “offer users functionalities for identity management (…) and enable furthermore to keep in touch with other users.” In this context, a distinction is made between open SNS that are available for everyone to use and closed SNS that are used by a distinct user group, often within a company [19, 20, 21]. This paper focuses on open SNS, as it examines the potential integration of innovators in a process of open collaborative development, which requires a field of collaboration open to all innovators. Popular examples of open social networking services are facebook.com, myspace.com, classmates.com and myyearbook.com. The importance of open SNS to the digital public is mirrored in the monthly visits. To name but two, in January 2009, facebook.com experienced 1.2 billion visits and myspace.com was visited about 800 million times [22]. This fascination is mostly driven by the desire of users to remain in contact [19, 23]. The basic features of social networking services, however, include more functionalities [19, 24]: identity management within a bounded system (i.e., construction of a public or semi-public personal profile); expert search and contact management (i.e., a list of other users with whom the user shares a connection); (contextual and network) awareness (i.e., information on the status of other users); as well as support of a collaborative exchange (i.e., viewing and navigating the list of connections and those made by others within the system). Among the multitude of social networking services, we have identified the emergence of a new type in which the basic functionalities listed above are combined with further functions that potentially support open collaborative development. These social networking services mostly focus on academic users, and accordingly we propose the term social research networks (SRN) for this specific subgroup of social networking services. As the academic audience might well form a nucleus for activities in the field of internet-based collaborative development, we investigated these in more detail.
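The basic functionalities just listed can be summarized in a small data model. The following Java sketch is purely illustrative – all class and field names are our own and are taken from none of the services discussed – and shows a semi-public profile, mutual connections, a 'following' relation for awareness, and navigation of the connections made by others.

import java.util.*;

// Illustrative data model of the basic SNS functionalities (names invented).
public class SnsSketch {

    static final class Profile {
        final String name;                                // identity management
        final Set<String> keywords = new HashSet<>();     // stated interests
        final Set<Profile> connections = new HashSet<>(); // contact management
        final Set<Profile> following = new HashSet<>();   // network awareness
        String status = "";
        Profile(String name) { this.name = name; }

        void connect(Profile other) {                     // mutual connection
            connections.add(other);
            other.connections.add(this);
        }

        // Collaborative exchange: navigate the connections made by others.
        Set<Profile> friendsOfFriends() {
            Set<Profile> result = new HashSet<>();
            for (Profile c : connections) result.addAll(c.connections);
            result.remove(this);
            result.removeAll(connections);
            return result;
        }

        // Awareness: status updates of followed contacts, as in a news feed.
        List<String> newsFeed() {
            List<String> feed = new ArrayList<>();
            for (Profile f : following) feed.add(f.name + ": " + f.status);
            return feed;
        }
    }

    public static void main(String[] args) {
        Profile a = new Profile("Ada"), b = new Profile("Ben"), c = new Profile("Cem");
        a.connect(b);
        b.connect(c);
        a.following.add(b);
        b.status = "uploaded a new paper";
        System.out.println(a.newsFeed());                           // [Ben: uploaded a new paper]
        System.out.println(a.friendsOfFriends().iterator().next().name); // Cem
    }
}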
4 Research Approach We draw on previously acknowledged characteristics from research on social software in our efforts to elaborate a first integrated systematization of characteristics of
SNS. We also seek to identify additional characteristics that existing literature has not yet considered. To accomplish these goals, we conducted a three-step research study. In total, 24 social networking services were analyzed. Research Sample and Data Collection. To allow for comparison of the different social networking services, we chose our sample based on the following criteria: web-based service addressing a professional or an academic audience; representation of target audiences from different disciplinary fields as well as social networking services independent of a particular discipline; and openness to a broad public. The large number of social networking services analyzed ensures the diversity necessary for generalization of insights (cf. Table 1). The 24 social networking services are basically characterized by two criteria: degree of topical focus (20 classified ‘general’, 4 ‘focused’) and degree of openness (19 classified ‘open’, 5 ‘invitation’). Data Analysis. To gain a deeper understanding of the characteristics of social networking services, we applied the following research approach. Using extant characteristics from publications in the field, we analyzed how the social networking services in our sample exhibit these characteristics. The analysis was carried out by 52 undergraduate students specializing in innovation management at a large public German university. Evaluation was done by two researchers separately, and the evaluation results were subsequently compared. In case of differences, a third researcher was asked to evaluate the social networking service in question. We focused on the identification of relevant characteristics as standalones, i.e., relations and interdependencies have not been explored.
5 Findings Despite the continuous growth in importance that social software in general, and social networking services in particular, have shown throughout the last years, research in the field has so far been limited. Previous publications on social networking services have categorized their functionalities into three main areas – identity and network, interaction and communication, and information and content [19]. We propose that while these categories are sufficient for generic social networking services, they are not adequate to comprehensively describe emerging forms that hold more specific offerings for collaborative work and development. Hence, this study takes a step forward by examining 24 social networking services and proposing the enlargement of the systematization by two novel criteria stemming from research on social software (topical focus and degree of openness). Table 1 below shows our findings. In future work, the five criteria will be further refined and expanded. Identity and Network. Managing identity information is one of the core functionalities of social networking services, and it is present in the services investigated: 20 of the 24 networks analyzed offer identity management functionalities. Academia.edu goes one step further and offers a tree-like hierarchy of universities and their departments in which researchers can easily find their place. Furthermore, a researcher can select from a range of keywords (or create new keywords) denoting his or her research interests.
Table 1. Categorization of collaborative social networking services: each of the 24 networks analyzed – 2collab.com, academia.edu, academici.com, biomedexperts.com, centraldesktop.com, collabrx.com, epws.org, escidoc.org, globaledge.msu.edu, labmeeting.com, laboratree.org, lalisio.com, lumifi.com, mendeley.com, mynetresearch.com, network.nature.com, pingtsta.com, researchgate.net, saba.com, scholarz.net, scilife.net, scispace.net, ssrn.com, and thoughtleaders.within3.co – is marked against five groups of criteria: identity and network (personal profile, directory of profiles, search for profiles); interaction and communication (messages, instant messaging service); information and content (wiki, group editor, social tagging, social rating, data upload, paper upload, commenting); topical focus (generic, specific); and degree of openness (free, commercial, open for everyone, by invitation only).
The keywords bear another advantage: users can opt for automatic email notification when their profile is found by others via google.com, so they see directly when their profile has been viewed. Academia.edu implements network awareness via the feature of “following” contacts; this provides automatic updates on any status changes of direct contacts, displayed in the user’s news feed. Academia.edu also serves as an example of another feature: cross-platform integration of social networking services. In this particular example, keywords entered in the profile can automatically be replicated to the user’s facebook.com profile. Interaction and Communication. 16 of the 24 social networking services analyzed in this study offer built-in functions for communication via direct messages between their members. The social networking service academici.com, for example, which targets knowledge workers and offers features for interaction,
collaboration, and knowledge transfer, allows for direct and indirect communication, although this is limited to paying users. Information and Content. Management of information and content is the third core functionality of social software, yet traditionally not the domain of social networking services. Managing information is traditionally the task of wikis or forums, which about one fifth of the services analyzed include. Content management, i.e., managing publications, documents, and papers, is provided by 19 of the 24 social networking services analyzed. 6 of the 24 also feature tagging and rating possibilities for the content. Some social networking services go well beyond these features; lumifi.com, for instance, offers strong functionality in the area of content search, with an automatic content analysis engine that finds connections and relationships between words, going beyond traditional indexing or search applications.
6 Conclusion Our findings provide a contribution to the fields of open innovation and social software by establishing a link between two previously distinct strands of research. Based on our analysis, we suggest that social networking services, as an application from the field of social software, might be a suitable tool to enable collaborative development over the web. Hence, this research can serve as a first indication for (innovation) managers and other organizers who wish to design social networking services to this end, as well as a first foundation for researchers in the field to further explore the details of social networking services in the light of innovative collaboration. The strengths of our study must be tempered with recognition of its limitations, which could be addressed in future research. Given the qualitative nature of the study, the systematization of social networking services should be seen as a structured analysis of reality, and not as reality itself. First, we do not claim to have identified a comprehensive set of criteria to categorize social networks enabling collaborative work. We see a need to explore a number of criteria further and in more detail; in particular, there is a need to explore how users receive the offerings made by the social networking services. Second, whereas our findings show a first set of social networking services as standalones, further research could increase the knowledge by studying the relations and interdependencies between different social networking services. Acknowledgments. This research has been funded by the Donors’ Association for the Promotion of Sciences and Humanities in Germany under HHL’s ‘Open School Initiative’. The authors would like to thank their students for their assistance in the analysis of the social networking services.
References
1. Drucker, P.F.: The practice of management. Elsevier, Amsterdam (2007)
2. Christensen, C.M.: The innovator’s dilemma. When new technologies cause great firms to fail. Harvard Business School Press, Boston (2008)
3. Day, G.S., Shoemaker, P.J.H.: Leading the vigilant organization. Strategy and Leadership 34(5), 4–10 (2006)
4. Chesbrough, H.W.: Open innovation. The new imperative for creating and profiting from technology. Harvard Business School Press, Boston (2005)
5. Perkmann, M., Walsh, K.: University-industry relationships and open innovation: Towards a research agenda. International Journal of Management Reviews 9(4), 259–280 (2007)
6. Koch, M., Bullinger, A.C., Moeslein, K.: Social Software als Basiswerkzeug der Open Innovation. In: Zerfass, A., Moeslein, K. (eds.) Kommunikation als Erfolgsfaktor im Innovationsmanagement - Strategien im Zeitalter der Open Innovation. Gabler, Wiesbaden (2009)
7. plasticbag.org: An addendum to a definition of Social Software, http://www.plasticbag.org/archives/2005/01/an_addendum_to_a_definition_of_social_software/
8. Koch, M., Richter, A.: Enterprise 2.0. Planung, Einführung und erfolgreicher Einsatz von Social Software in Unternehmen. Oldenbourg, München (2007)
9. Lakhani, K.R., Wolf, R.G.: Why Hackers Do What They Do: Understanding Motivation and Effort in Free/Open Source Software Projects. In: Perspectives on Free and Open Source Software. MIT Press, Cambridge (2005)
10. Tidd, J., Bessant, J.R., Pavitt, K.: Managing innovation. Integrating technological, market and organizational change. Wiley, Chichester (2005)
11. Downes, S.: eLearn: Feature Article, http://www.elearnmag.org/subpage.cfm?section=articles&article=29-1
12. Chone, J.: Web 0.x to Web 2.0 Simplified, http://www.bitsandbuzz.com/article/web-0x-to-web-20-simplified/
13. O’Reilly, T.: What Is Web 2.0, http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
14. Back, A., Gronau, N., Tochtermann, K.: Web 2.0 in der Unternehmenspraxis. Grundlagen, Fallstudien und Trends zum Einsatz von Social Software. Oldenbourg, München (2008)
15. Buhse, W.: Enterprise 2.0 - die Kunst, loszulassen. Rhombos-Verlag, Berlin (2008)
16. Garton, L., Haythornthwaite, C., Wellman, B.: Studying Online Social Networks. Journal of Computer-Mediated Communication 3(1) (1997)
17. Granovetter, M.S.: The strength of weak ties. American Journal of Sociology 78(6), 1360–1380 (1973)
18. Stocker, A., Tochtermann, K.: Investigating Weblogs in Small and Medium Enterprises: An Exploratory Case Study. In: Flejter, D., Grzonkowski, S., Kaczmarek, T., Kowalkiewicz, M., Nagle, T., Parkes, J. (eds.) BIS (Workshops): CEUR-WS.org (CEUR Workshop Proceedings), vol. 333, pp. 95–107 (2008)
19. Richter, A., Koch, M.: Functions of Social Networking Services. In: Hassanaly, P., Ramrajsingh, A., Randall, D., Salembier, P., Tixier, M. (eds.) Proc. Intl. Conf. on the Design of Cooperative Systems 2008, Carry-le-Rouet, France, pp. 87–98 (2008)
20. Bughin, J., Manyika, J.: How businesses are using Web 2.0: A McKinsey Global Survey, http://www.mckinseyquarterly.com/article_page.aspx?ar=1913
21. Young, G.O.: Global Enterprise Web 2.0 Market Forecast: 2007 To 2013. Forrester Research, http://www.forrester.com/Research/Document/Excerpt/0,7211,43850,00.html
22. Kazeniac, A.: Social Networks: Facebook Takes Over Top Spot, Twitter Climbs, http://blog.compete.com/2009/02/09/facebook-myspace-twitter-social-network/
23. Enders, A., Hungenberg, H., Denker, H.-P., Mauch, S.: The long tail of social networking: Revenue models of social networking sites. European Management Journal 26(3), 199–211 (2008)
24. Boyd, D.M., Ellison, N.B.: Social network sites: Definition, history, and scholarship. Journal of Computer-Mediated Communication 13(1) (2007)
Investigating the Run Time Behavior of Distributed Applications by Using Tiny Java Virtual Machines with Wireless Communications Tsuyoshi Miyazaki, Takayuki Suzuki, and Fujio Yamamoto Department of Information and Computer Sciences, Kanagawa Institute of Technology, 1030 Shimo-ogino, Atsugi-shi, Kanagawa, 243-0292 Japan {miyazaki,suzuki,yamamoto}@ic.kanagawa-it.ac.jp
Abstract. From the viewpoint of programming education, we consider distributed application programs executed on a group of small Java machines. These computers are equipped with a radio communication facility, a multi-thread function, LEDs, and various sensors. Parallel genetic algorithms and a distributed search problem were targeted in this study. For the latter, a detailed implementation method and experimental results are shown. In such a computing environment, we found that the internal behavior of and the data communication within a distributed application are easy to grasp, because the physical interface visualizes them. Keywords: Physical Computing, Distributed Computing, Software Education.
1 Introduction In the present age of remarkable Web-related technical development, the importance of software education at universities continues to rise. Above all, it is necessary to teach the basic technology of distributed systems and distributed applications in a practical way. Conventionally, lectures on the basic concepts and technology of distributed applications have been given in the classroom, whereas hands-on practice using computers seems rarely to be carried out. One reason is that there is usually no environment in which each student can access many PCs besides his or her own. A second reason is that grasping the cooperative activities among PCs is not easy, and consequently it is difficult to form an overall image of the processing performed by a distributed application. Recently, the small computer SunSPOT [5,6], equipped with a Java virtual machine, has attracted attention. SunSPOT possesses an acceleration sensor and an illumination sensor and can execute general Java programs. It was therefore originally conceived as a device for building wireless sensor networks. Notably, however, this device possesses a multi-thread function as well as a wireless communication function, and these functions can be used from the Java language just as on a normal PC. In other words, various network applications using multi-thread programming, socket communication, and multicast can easily be built on this machine. Communication among
SunSPOTs is possible without connection to a PC once an application program has been developed. The built-in battery lasts comparatively long, enough to demonstrate the applications anywhere. By using the radio communication facilities among SunSPOTs, various distributed applications can be run. Compared with a general PC, the CPU performance and transmission rate of SunSPOT are considerably low. Fortunately, these characteristics can be used to advantage: they provide an environment in which the behavior of a distributed application can be watched in slow motion. SunSPOT is equipped with eight color LEDs and can also be fitted with an LCD display such as that of a mobile telephone. If these are used, the operation of the CPU and the flow of transmission and reception can be grasped easily. Based on this, we are exploring the use of SunSPOT for teaching distributed application technology. In the following, the distributed applications we developed to run on a set of SunSPOTs are explained, and the possibility of their use in future education is considered.
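To illustrate the kind of communication these functions support, the following sketch combines a receiver thread with UDP multicast using only the standard java.net API. The group address, port, and message are arbitrary examples, and the actual SunSPOT radio API (not shown here) differs in its details.

import java.net.*;
import java.nio.charset.StandardCharsets;

// Minimal multicast sender/receiver pair using standard Java; the group
// address and port are arbitrary. SunSPOT's own radio API differs in detail.
public class MulticastDemo {
    static final String GROUP = "230.0.0.1";  // example multicast group
    static final int PORT = 4446;             // example port

    public static void main(String[] args) throws Exception {
        // The receiver runs in its own thread, as a SunSPOT application would.
        Thread receiver = new Thread(() -> {
            try (MulticastSocket socket = new MulticastSocket(PORT)) {
                socket.joinGroup(InetAddress.getByName(GROUP));
                byte[] buf = new byte[256];
                DatagramPacket packet = new DatagramPacket(buf, buf.length);
                socket.receive(packet);       // blocks until a message arrives
                System.out.println("received: "
                    + new String(packet.getData(), 0, packet.getLength(),
                                 StandardCharsets.UTF_8));
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
        receiver.start();
        Thread.sleep(200);                    // give the receiver time to join

        // The sender broadcasts one message to the whole group.
        byte[] msg = "hello spots".getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(msg, msg.length,
                InetAddress.getByName(GROUP), PORT));
        }
        receiver.join();
    }
}

On SunSPOT hardware, the same pattern – one thread listening while another sends – lets several devices react to a broadcast at once, which is what makes the LED-based visualization of message flow possible.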
2 Distributed Application Programs for Educational Use Two applications are taken up to teach the technology of distributed programs in the computing environment described above. The first is a parallel genetic algorithm, and the second is the two-dimensional searching problem of Bentley [2]. For the former, only the aim and method are described here, and the implementation will be reported in the near future. For the latter, the details of the method and implementation are shown below. 2.1 Parallel Genetic Algorithms The genetic algorithm (GA) has long been used as a convincing technique for solving complicated optimization problems. The potential concurrency of the processing attracted attention early, and there are many studies on it [4,8]. In a parallel genetic algorithm, a population is divided into several partial groups, and each is allocated to a processor. Evolution proceeds independently in each group, so that the diversity of individuals is maintained in the population as a whole. During evolution, however, each sub-population is very likely to converge early to its own local optimum. To prevent this problem, migration of individuals is necessary in the parallel genetic algorithm: information should be exchanged at suitable intervals between sub-populations by moving some individuals to other sub-populations, where they get an opportunity to cross over with the individuals there. The individuals to be migrated are limited to the elite, those that showed high fitness in their original sub-population. Because an elite individual evolved differently from the evolution performed in the sub-population it arrives at, the birth of children with new characteristics is expected from crossover with native individuals there. Such a situation opens up a new search area for the globally optimal value, as the sketch below illustrates.
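As an illustration of this island-model migration, consider the following Java sketch. It is a toy under our own simplifying assumptions – a bit-string genome with a one-max-style fitness rather than the color patterns discussed below, a ring migration topology, and a fixed migration interval – and all names and parameter values are illustrative, not the authors' implementation.

import java.util.*;

// Illustrative island-model GA with elite migration; the genome, fitness,
// and all parameters are simplifying assumptions, not the authors' code.
public class IslandGaSketch {
    static final int ISLANDS = 4, POP = 30, LEN = 24, MIGRATE_EVERY = 10;
    static final Random rnd = new Random(42);

    static int fitness(boolean[] g) {             // toy fitness: count of 1-bits
        int f = 0;
        for (boolean b : g) if (b) f++;
        return f;
    }

    static boolean[] randomGenome() {
        boolean[] g = new boolean[LEN];
        for (int i = 0; i < LEN; i++) g[i] = rnd.nextBoolean();
        return g;
    }

    static boolean[] crossoverAndMutate(boolean[] a, boolean[] b) {
        boolean[] c = new boolean[LEN];
        int cut = rnd.nextInt(LEN);               // one-point crossover
        for (int i = 0; i < LEN; i++) c[i] = (i < cut) ? a[i] : b[i];
        c[rnd.nextInt(LEN)] ^= true;              // point mutation
        return c;
    }

    public static void main(String[] args) {
        List<List<boolean[]>> islands = new ArrayList<>();
        for (int i = 0; i < ISLANDS; i++) {
            List<boolean[]> pop = new ArrayList<>();
            for (int j = 0; j < POP; j++) pop.add(randomGenome());
            islands.add(pop);
        }
        for (int gen = 1; gen <= 100; gen++) {
            for (List<boolean[]> pop : islands) { // independent evolution
                pop.sort(Comparator.comparingInt(IslandGaSketch::fitness).reversed());
                for (int j = POP / 2; j < POP; j++)   // replace the weaker half
                    pop.set(j, crossoverAndMutate(pop.get(rnd.nextInt(POP / 2)),
                                                  pop.get(rnd.nextInt(POP / 2))));
            }
            if (gen % MIGRATE_EVERY == 0)         // elite migrates around a ring
                for (int i = 0; i < ISLANDS; i++)
                    islands.get((i + 1) % ISLANDS).set(POP - 1,
                        islands.get(i).get(0).clone());
        }
        System.out.println("best fitness on island 0: "
            + fitness(islands.get(0).get(0)));
    }
}

The key line is the ring migration: only the best individual of each island is copied to a neighbor, where it may cross over with natives in later generations, exactly the mechanism described above.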
Here, a problem of searching for a color combination pattern was solved by such a parallel genetic algorithm. Assume that eight lamps are placed in a line. Each lamp can display one of 256 colors, determined by its R, G, and B values. The number of color combinations of the eight lamps is then enormous: 256 to the 8th power (about 10 to the 19th power). The problem is to find one specific pattern among them. The difference between the answer pattern and a candidate pattern is assumed to be known, as the value of the fitness function. Each sub-population has 30 individuals, and each individual encodes one color combination pattern. From generation to generation, the individuals cross over, and individuals with high fitness (near the correct color pattern) survive. In this process, appropriate migration is important, as described above: it prevents the population as a whole from getting stuck in a local optimum. 2.2 Bentley’s Searching Problem The second application developed this time solves the searching problem presented by Bentley, J. L. (1985) [2]. Assume that an NxN two-dimensional array contains positive and negative numbers at random positions. The problem is to find the best rectangle among all possible rectangles covering part of the array, where the best rectangle is the one that maximizes the sum of all the elements it contains. That maximum value is the final answer. A naive algorithm. For example, in the 4x4 array A of Fig. 1, the total of the elements in the rectangle with the dotted frame is 3, while the total of the elements in the rectangle with the black bold frame is 8, which is the maximum. The empty rectangle, which includes nothing, is permitted, and in that case the sum of elements is taken to be zero; therefore, the maximum never becomes negative. In general, the computational cost of this problem increases sharply as the size N of the array A grows.
Fig. 1. Bentley’s problem (for 4x4 array)
The most naive algorithm requires computation of the order of N to the 6th power. With some improvement, however, this can be reduced to the order of N to the 5th power. The main part of this algorithm is shown below.
Investigating the Run Time Behavior of Distributed Applications
885
int maxsofar = 0;
int sum;
for (int is = 0; is