BERKSHIRE PUBLISHING GROUP
Berkshire Encyclopedia of Human-Computer Interaction
When science fiction becomes science fact
VOLUME 1
Edited by William Sims Bainbridge, National Science Foundation
Great Barrington, Massachusetts U.S.A. www.berkshirepublishing.com
Copyright © 2004 by Berkshire Publishing Group LLC
All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the publisher.
Cover photo: Thad Starner sporting a wearable computer. Photo courtesy of Georgia Institute of Technology.
Cover background image: Courtesy of Getty Images.
For information: Berkshire Publishing Group LLC, 314 Main Street, Great Barrington, Massachusetts 01230, www.berkshirepublishing.com
Printed in the United States of America
Library of Congress Cataloging-in-Publication Data
Berkshire encyclopedia of human-computer interaction / William Sims Bainbridge, editor. p. cm. “A Berkshire reference work.” Includes bibliographical references and index.
ISBN 0-9743091-2-5 (hardcover : alk. paper)
1. Human-computer interaction--Encyclopedias. I. Bainbridge, William Sims. II. Title.
QA76.9.H85B46 2004
004'.01'9--dc22 2004017920
BERKSHIRE PUBLISHING STAFF
Project Director: Karen Christensen
Project Coordinators: Courtney Linehan and George Woodward
Associate Editor: Marcy Ross
Copyeditors: Francesca Forrest, Mike Nichols, Carol Parikh, and Daniel Spinella
Information Management and Programming: Deborah Dillon and Trevor Young
Editorial Assistance: Emily Colangelo
Designer: Monica Cleveland
Production Coordinator: Janet Lowry
Composition Artists: Steve Tiano, Brad Walrod, and Linda Weidemann
Composition Assistance: Pam Glaven
Proofreaders: Mary Bagg, Sheila Bodell, Eileen Clawson, and Cassie Lynch
Production Consultant: Jeff Potter
Indexer: Peggy Holloway
CONTENTS
List of Entries, ix
Reader’s Guide, xv
List of Sidebars, xix
Contributors, xxiii
Introduction, xxxiii
Publisher’s Note, xli
About the Editor, xliii
Entries Volume I: A–L, 1–440
Entries Volume II: M–W, 441–826
Appendix 1: Glossary, 827
Appendix 2: Master Bibliography of Human-Computer Interaction, 831
HCI in Popular Culture, 893
Index, 931
• Index repeated in this volume, I-1
LIST OF ENTRIES
Adaptive Help Systems Peter Brusilovsky Adaptive Interfaces Alfred Kobsa
Animation Abdennour El Rhalibi Yuanyuan Shen Anthropology and HCI Allen W. Batteau
Affective Computing Ira Cohen Thomas S. Huang Lawrence S. Chen
Anthropometry Victor L. Paquet David Feathers
Altair William Sims Bainbridge
Application Use Strategies Suresh K. Bhavnani
Alto William Sims Bainbridge
Arpanet Amy Kruse Dylan Schmorrow Allen J. Sears
Artificial Intelligence Robert A. St. Amant Asian Script Input William Sims Bainbridge Erika Bainbridge Atanasoff-Berry Computer John Gustafson Attentive User Interface Ted Selker Augmented Cognition Amy Kruse Dylan Schmorrow
Augmented Reality Rajeev Sharma Kuntal Sengupta
Compilers Woojin Paik
Digital Divide Linda A. Jackson
Avatars Jeremy Bailenson James J. Blascovich
Computer-Supported Cooperative Work John M. Carroll Mary Beth Rosson
Digital Government Jane E. Fountain Robin A. McKinnon
Beta Testing Gina Neff
Constraint Satisfaction Berthe Y. Choueiry
Braille Oleg Tretiakoff
Converging Technologies William Sims Bainbridge
Brain-Computer Interfaces Melody M. Moore Adriane D. Davis Brendan Z. Allison
Cybercommunities Lori Kendall
Browsers Andy Cockburn Cathode Ray Tubes Gregory P. Crawford CAVE Thomas DeFanti Dan Sandin Chatrooms Amanda B. Lenhart Children and the Web Dania Bilal Classrooms Chris Quintana Client-Server Architecture Mark Laff Cognitive Walkthrough Marilyn Hughes Blackmon Collaboratories Gary M. Olson
Cybersex David L. Delmonico Elizabeth Griffin Cyborgs William Sims Bainbridge Data Mining Mohammad Zaki Data Visualization Kwan-Liu Ma Deep Blue Murray Campbell
Digital Libraries Jose-Marie Griffiths Drawing and Design Mark D. Gross E-business Norhayati Zakaria Education in HCI Jan Stage Electronic Journals Carol Tenopir Electronic Paper Technology Gregory P. Crawford Eliza William H. Sterner E-mail Nathan Bos Embedded Systems Ronald D. Williams
Denial-of-Service Attack Adrian Perrig Abraham Yaar
ENIAC William Sims Bainbridge
Desktop Metaphor Jee-In Kim
Ergonomics Ann M. Bisantz
Dialog Systems Susan W. McRoy
Errors in Interactive Behavior Wayne D. Gray
Digital Cash J. D. Tygar
Ethics Helen Nissenbaum
Ethnography David Hakken Evolutionary Engineering William Sims Bainbridge Expert Systems Jay E. Aronson
Handwriting Recognition and Retrieval R. Manmatha V. Govindaraju Haptics Ralph L. Hollis
Eye Tracking Andrew T. Duchowski
History of Human-Computer Interaction Jonathan Grudin
Facial Expressions Irfan Essa
Hollerith Card William Sims Bainbridge
Fly-by-Wire C. M. Krishna
Human-Robot Interaction Erika Rogers
Fonts Thomas Detrie Arnold Holland
Hypertext and Hypermedia David K. Farkas
Games Abdennour El Rhalibi
Icons Stephanie Ludi
Information Theory Ronald R. Kline Instruction Manuals David K. Farkas Internet—Worldwide Diffusion Barry Wellman Phuoc Tran Wenhong Chen Internet in Everyday Life Barry Wellman Bernie Hogan Iterative Design Richard Baskerville Jan Stage Keyboard Alan Hedge Language Generation Regina Barzilay
Gender and Computing Linda A. Jackson
Identity Authentication Ashutosh P. Deshpande Parag Sewalkar
Laser Printer Gary Starkweather
Geographic Information Systems Michael F. Goodchild
Impacts Chuck Huff
Law and HCI Sonia E. Miller
Gesture Recognition Francis Quek
Information Filtering Luz M. Quiroga Martha E. Crosby
Law Enforcement Roslin V. Hauck
Graphical User Interface David England Grid Computing Cavinda T. Caldera
Information Organization Dagobert Soergel Information Overload Ruth Guthrie
Groupware Timothy J. Hickey Alexander C. Feinman
Information Retrieval Dagobert Soergel
Hackers Douglas Thomas
Information Spaces Fionn Murtagh
Lexicon Building Charles J. Fillmore Liquid Crystal Displays Gregory P. Crawford Literary Representations William Sims Bainbridge Machine Translation Katrin Kirchhoff
Markup Languages Hong-Gee Kim Mobile Computing Dharma P. Agrawal Mosaic William Sims Bainbridge
Online Education Robert S. Stephenson Glenn Collyer Online Questionnaires James Witte Roy Pargas
Programming Languages David MacQueen Prototyping Richard Baskerville Jan Stage Psychology and HCI Judith S. Olson
Motion Capture and Recognition Jezekiel Ben-Arie
Online Voting R. Michael Alvarez Thad E. Hall
Mouse Shumin Zhai
Ontology Christopher A. Welty
Recommender and Reputation Systems Cliff Lampe Paul Resnick
Movies William Sims Bainbridge
Open Source Software Gregory R. Madey
Repetitive Strain Injury Jack Tigh Dennerlein
MUDs Richard Allan Bartle
Optical Character Recognition V. Govindaraju Swapnil Khedekar
Scenario-Based Design John M. Carroll
Multiagent systems Gal A. Kaminka Multimodal Interfaces Rajeev Sharma Sanshzar Kettebekov Guoray Cai Multiuser Interfaces Prasun Dewan Musical Interaction Christopher S. Raphael Judy A. Franklin Natural-Language Processing James H. Martin Navigation John J. Rieser
Peer-to-Peer Architecture Julita Vassileva Pen and Stylus Input Alan Hedge Personality Capture William Sims Bainbridge Physiology Jennifer Allanson Planning Sven Koenig Michail G. Lagoudakis Pocket Computer William Sims Bainbridge
N-grams James H. Martin
Political Science and HCI James N. Danziger Michael J. Jensen
Olfactory Interaction Ricardo Gutierrez-Osuna
Privacy Jeffrey M. Stanton
Search and Rescue Howie Choset Search Engines Shannon Bradshaw Security Bhavani Thuraisingham Semantic Web Bhavani Thuraisingham Smart Homes Diane J. Cook Michael Youngblood Sociable Media Judith Donath Social Informatics Howard Rosenbaum Social Proxies Thomas Erickson Wendy A. Kellogg
Social Psychology and HCI Susan R. Fussell
Task Analysis Erik Hollnagel
Sociology and HCI William Sims Bainbridge
Telecommuting Ralph David Westfall
Value Sensitive Design Batya Friedman
Socio-Technical System Design Walt Scacchi
Telepresence John V. Draper
Video Immanuel Freedman
Software Cultures Vaclav Rajlich
Text Summarization Judith L. Klavans
Video Summarization A. Murat Tekalp
Software Engineering Richard Kazman
Theory Jon May
Virtual Reality Larry F. Hodges Benjamin C. Lok
Sonification David M. Lane Aniko Sandor S. Camille Peres
Three-Dimensional Graphics Benjamin C. Lok
Spamming J. D. Tygar Speech Recognition Mary P. Harper V. Paul Harper Speech Synthesis Jan P.H. van Santen Speechreading Marcus Hennecke Spell Checker Woojin Paik Sphinx Rita Singh Statistical Analysis Support Robert A. St. Amant Supercomputers Jack Dongarra Tablet Computer William Sims Bainbridge
Three-Dimensional Printing William Sims Bainbridge Touchscreen Andrew L. Sears Rich Goldman Ubiquitous Computing Olufisayo Omojokun Prasun Dewan Unicode Unicode Editorial Committee Universal Access Gregg Vanderheiden
Jenny Preece Diane Maloney-Krichmar
Viruses J. D. Tygar Visual Programming Margaret M. Burnett Joseph R. Ruthruff Wearable Computer Thad Starner Bradley Rhodes Website Design Barbara S. Chaparro Michael L. Bernard Work Christine A. Halverson
Usability Evaluation Jean Scholtz
Workforce Brandon DuPont Joshua L. Rosenbloom
User Modeling Richard C. Simpson
World Wide Web Michael Wilson
User Support Indira R. Guzman
WYSIWYG David M. Lane
User-Centered Design Chadia Abras
READER’S GUIDE
This list is provided to assist readers in locating entries on related topics. It classifies articles into ten general categories: Applications; Approaches; Breakthroughs; Challenges; Components; Disciplines; Historical Development; Interfaces; Methods; and Social Implications. Some entries appear in more than one category. Applications Classrooms Digital Government Digital Libraries E-business Games Geographic Information Systems Grid Computing Law Enforcement Mobile Computing
Navigation Online Education Online Voting Planning Recommender and Reputation Systems Search and Rescue Statistical Analysis Support Supercomputers Telecommuting Ubiquitous Computing Video Approaches Application Use Strategies Beta Testing Cognitive Walkthrough Constraint Satisfaction Ethics
Ethnography Evolutionary Engineering Information Theory Iterative Design Ontology Open Source Software Prototyping Scenario-Based Design Social Informatics Socio-Technical System Design Task Analysis Theory Universal Access Usability Evaluation User Modeling User-Centered Design Value Sensitive Design Website Design Breakthroughs Altair Alto Arpanet Atanasoff-Berry Computer CAVE Converging Technologies Deep Blue Eliza ENIAC Hollerith Card Mosaic Sphinx Challenges Denial-of-Service Attack Digital Divide Errors in Interactive Behavior Hackers Identity Authentication Information Filtering Information Overload Privacy Repetitive Strain Injury Security Spamming Viruses
Components Adaptive Help Systems Animation Braille Cathode Ray Tubes Client-Server Architecture Desktop Metaphor Electronic Paper Technology Fonts Keyboard Laser Printer Liquid Crystal Displays Mouse N-grams Peer-to-Peer Architecture Social Proxies Spell Checker Touchscreen Unicode WYSIWYG Disciplines Anthropology and HCI Artificial Intelligence Ergonomics Law and HCI Political Science and HCI Psychology and HCI Social Psychology and HCI Sociology and HCI Historical Development Altair Alto ENIAC History of HCI Interfaces Adaptive Interfaces Affective Computing Anthropometry Asian Script Input Attentive User Interface Augmented Cognition Augmented Reality Brain-Computer Interfaces
Compilers Data Visualization Dialog Systems Drawing and Design Eye Tracking Facial Expressions Fly-by-Wire Graphical User Interface Haptics Multimodal Interfaces Multiuser Interfaces Musical Interaction Olfactory Interaction Online Questionnaires Pen and Stylus Input Physiology Pocket Computer Smart Homes Tablet Computer Telepresence Three-Dimensional Graphics Three-Dimensional Printing Virtual Reality Wearable Computer Methods Avatars Browsers Data Mining Digital Cash Embedded Systems Expert Systems Gesture Recognition Handwriting Recognition and Retrieval Hypertext and Hypermedia Icons Information Organization Information Retrieval Information Spaces Instruction Manuals Language Generation Lexicon Building Machine Translation
Markup Languages Motion Capture and Recognition Natural-Language Processing Optical Character Recognition Personality Capture Programming Languages Search Engines Semantic Web Software Engineering Sonification Speech Recognition Speech Synthesis Speechreading Text Summarization User Support Video Summarization Visual Programming World Wide Web Social Implications Chatrooms Children and the Web Collaboratories Computer-Supported Cooperative Work Cybercommunities Cybersex Cyborgs Education in HCI Electronic Journals E-mail Gender and Computing Groupware Human-Robot Interaction Impacts Internet—Worldwide Diffusion Internet in Everyday Life Literary Representations Movies MUDs Multiagent systems Sociable Media Software Cultures Work Workforce
LIST OF SIDEBARS
Adaptive Help Systems Farewell “Clippy”
Chatrooms Life Online
Adaptive Interfaces Keeping Disabled People in the Technology Loop
Classrooms History Comes Alive in Cyberspace Learning through Multimedia
Anthropology and HCI Digital Technology Helps Preserve Tribal Language Anthropology and HCI Eastern vs. Western Cultural Values
Computer-Supported Cooperative Work Internet Singing Lessons Social Context in Computer-Supported Cooperative Work
Augmented Cognition Putting Humans First in Systems Design
Cybercommunities Welcome to LamdaMOO
Braille Enhancing Access to Braille Instructional Materials
Cybersex Cybersex Addiction
Digital Divide HomeNetToo Tries to Bridge Digital Divide Digital Libraries Vannevar Bush on the Memex Education in HCI Bringing HCI Into the Real World Eliza Talking with ELIZA E-mail The Generation Gap Errors in Interactive Behavior To Err Is Technological Fonts Our Most Memorable Nightmare Gender and Computing “Computer Girl” Site Offers Support for Young Women Narrowing the Gap Geographic Information Systems Geographic Information Systems Aid Land Conservation Groupware Away Messages The Wide World of Wikis History of HCI Highlights from My Forty Years of HCI History Human-Robot Interaction Carbo-Powered Robots Hypertext and Hypermedia Ted Nelson on Hypertext and the Web Impacts Therac-25 Safety Is a System Property
Internet in Everyday Life Finding Work Online Information Technology and Competitive Academic Debate Law Enforcement Fighting Computer Crime Literary Representations Excerpt from Isaac Asimov’s I, Robot Excerpt from “The Sand-Man” (1817) by E. T. A. Hoffman Machine Translation Warren Weaver on Machine Translation Movies HAL’s Birthday Celebration MUDs The Wide World of a MUD Online Education An Online Dig for Archeology Students Virtual Classes Help Rural Nurses Political Science and HCI Washington Tales of the Internet Psychology and HCI Human Factors Come into the Forefront Virtual Flight for White-Knuckled Travelers Repetitive Strain Injury The Complexities of Repetitive Strain Scenario-Based Design The Value of a Devil’s Advocate Social Psychology and HCI Love and HCI Sociology and HCI “Who’s on First” for the Twenty-First Century
Spell Checker Check the Spell Checker Task Analysis Excerpt from Cheaper by the Dozen Unicode History and Development of Unicode Relationship of the Unicode Standard to ISO_IEC 10646 Usability Evaluation Global Usability Is Usability Still a Problem?
Work Software Prescribes Break Time for Enhanced Productivity Workforce Cultural Differences Employee Resistance to Technology World Wide Web “Inventing” the World Wide Web Tim Berners-Lee on the Web as Metaphor WYSIWYG The Future of HCI
CONTRIBUTORS
Abras, Chadia Goucher College User-Centered Design
Alvarez, R. Michael Caltech-MIT Voting Technology Project Online Voting
Agrawal, Dharma P. University of Cincinnati Mobile Computing
Aronson, Jay E. University of Georgia Expert Systems
Allanson, Jennifer Lancaster University Physiology
Bailenson, Jeremy Stanford University Avatars
Allison, Brendan Z. Georgia State University Brain-Computer Interfaces
Bainbridge, Erika Harvard University, Center for Hellenic Studies Asian Script Input
Bainbridge, William Sims National Science Foundation Altair Alto Asian Script Input Converging Technologies Cyborgs ENIAC Evolutionary Engineering Hollerith Card Literary Representations Mosaic Movies Personality Capture Pocket Computer Sociology and HCI Tablet Computer Three-Dimensional Printing Bartle, Richard Allan Multi-User Entertainment Limited MUDs Barzilay, Regina Massachusetts Institute of Technology Language Generation
Bilal, Dania University of Tennessee Children and the Web Bisantz, Ann M. State University of New York, Buffalo Ergonomics Blackmon, Marilyn Hughes University of Colorado, Boulder Cognitive Walkthrough Blascovich, James J. University of California, Santa Barbara Avatars Bos, Nathan University of Michigan E-mail Bradshaw, Shannon University of Iowa Search Engines Brusilovsky, Peter University of Pittsburgh Adaptive Help Systems
Baskerville, Richard Georgia State University Iterative Design Prototyping
Burnett, Margaret M. Oregon State University Visual Programming
Batteau, Allen W. Wayne State University Anthropology and HCI
Cai, Guoray Pennsylvania State University Multimodal Interfaces
Ben-Arie, Jezekiel University of Illinois, Chicago Motion Capture and Recognition
Caldera, Cavinda T. Syracuse University Grid Computing
Bernard, Michael L. Wichita State University Website Design
Campbell, Murray IBM T.J. Watson Research Center Deep Blue
Bhavnani, Suresh K. University of Michigan Application Use Strategies
Carroll, John M. Pennsylvania State University Computer-Supported Cooperative Work Scenario-Based Design Chaparro, Barbara S. Wichita State University Website Design Chen, Lawrence Eastman Kodak Research Labs Affective Computing Chen, Wenhong University of Toronto Internet – Worldwide Diffusion Choset, Howie Carnegie Mellon University Search and Rescue Choueiry, Berthe Y. University of Nebraska, Lincoln Constraint Satisfaction Cockburn, Andy University of Canterbury Browsers Cohen, Ira Hewlett-Packard Research Labs, University of Illinois, Urbana-Champaign Affective Computing Collyer, Glenn iDacta, Inc. Online Education Cook, Diane J. University of Texas, Arlington Smart Homes Crawford, Gregory P. Brown University Cathode Ray Tubes Electronic Paper Technology Liquid Crystal Displays
Crosby, Martha E. University of Hawaii Information Filtering Danziger, James N. University of California, Irvine Political Science and HCI Davis, Adriane D. Georgia State University Brain-Computer Interfaces DeFanti, Thomas University of Illinois, Chicago Cave Delmonico, David L. Duquesne University Cybersex Dennerlien, Jack Tigh Harvard School of Public Health Repetitive Strain Injury Deshpande, Ashutosh P. Syracuse University Identity Authentication Detrie, Thomas Arizona State University Fonts Dewan, Prasun Microsoft Corporation Multiuser Interfaces Ubiquitous Computing Donath, Judith Massachusetts Institute of Technology Sociable Media Dongarra, Jack University of Tennessee Supercomputers
Draper, John V. Raven Research Telepresence
Fountain, Jane E. Harvard University Digital Government
Duchowski, Andrew T. Clemson University Eye Tracking
Franklin, Judy A. Smith College Musical Interaction
DuPont, Brandon Policy Research Institute Workforce
Freedman, Immanuel Dr. Immanuel Freedman, Inc. Video
El Rhalibi, Abdennour Liverpool John Moores University Animation Games
Friedman, Batya University of Washington Value Sensitive Design
England, David Liverpool John Moores University Graphical User Interface Erickson, Thomas IBM T. J. Watson Research Center Social Proxies Essa, Irfan Georgia Institute of Technology Facial Expressions Farkas, David K. University of Washington Hypertext and Hypermedia Instruction Manuals Feathers, David State University of New York, Buffalo Anthropometry Feinman, Alexander C. Brandeis University Groupware Fillmore, Charles J. International Computer Science Institute Lexicon Building
Fussell, Susan R. Carnegie Mellon University Social Psychology and HCI Goldman, Rich University of Maryland, Baltimore Touchscreen Goodchild, Michael F. University of California, Santa Barbara Geographic Information Systems Govindaraju, V. University at Buffalo Handwriting Recognition and Retrieval Optical Character Recognition Gray, Wayne D. Rensselaer Polytechnic Institute Errors in Interactive Behavior Griffin, Elizabeth J. Internet Behavior Consulting Cybersex Griffiths, Jose-Marie University of Pittsburgh Digital Libraries
Gross, Mark D. University of Washington Drawing and Design
Hauck, Roslin V. Illinois State University Law Enforcement
Grudin, Jonathan Microsoft Research Computer Science History of HCI
Hedge, Alan Cornell University Keyboard Pen and Stylus Input
Gustafson, John Sun Microsystems Atanasoff-Berry Computer
Hennecke, Marcus TEMIC Telefunken Microelectronic GmbH Speechreading
Guthrie, Ruth California Polytechnic University of Pomona Information Overload
Hickey, Timothy J. Brandeis University Groupware
Gutierrez-Osuna, Ricardo Texas A&M University Olfactory Interaction
Hodges, Larry F. University of North Carolina, Charlotte Virtual Reality
Guzman, Indira R. Syracuse University User Support
Hogan, Bernie University of Toronto Internet in Everyday Life
Hakken, David State University of New York Institute of Technology Ethnography
Holland, Arnold California State University, Fullerton Fonts
Hall, Thad E. Century Foundation Online Voting Halverson, Christine IBM T. J. Watson Research Center Work Harper, Mary P. Purdue University Speech Recognition Harper, V. Paul United States Patent and Trademark Office Speech Recognition
Hollis, Ralph L. Carnegie Mellon University Haptics Hollnagel, Erik University of Linköping Task Analysis Huang, Thomas S. University of Illinois, Urbana-Champaign Affective Computing Huff, Chuck Saint Olaf College Impacts
Jackson, Linda A. Michigan State University Digital Divide Gender and Computing Jensen, Michael J. University of California, Irvine Political Science and HCI Kaminka, Gal Bar Ilan University Multiagent systems Kazman, Richard Carnegie Mellon University Software Engineering Kellogg, Wendy A. IBM T. J. Watson Research Center Social Proxies
Klavans, Judith L. Columbia University Text Summarization Kline, Ronald R. Cornell University Information Theory Kobsa, Alfred University of California, Irvine Adaptive Interfaces Koenig, Sven Georgia Institute of Technology Planning Krishna, C. M. University of Massachusetts, Amherst Fly-by-Wire
Kendall, Lori State University of New York, Purchase College Cybercommunities
Kruse, Amy Strategic Analysis, Inc. Arpanet Augmented Cognition
Kettebekov, Sanshzar Oregon Health and Science University Multimodal Interfaces
Laff, Mark IBM T.J. Watson Research Center Client-Server Architecture
Khedekar, Swapnil University at Buffalo Optical Character Recognition
Lagoudakis, Michail G. Georgia Institute of Technology Planning
Kim, Hong-Gee Dankook University Markup Languages
Lampe, Cliff University of Michigan Recommender and Reputation Systems
Kim, Jee-In Konkuk University Desktop Metaphor
Lane, David M. Rice University Sonification WYSIWYG
Kirchhoff, Katrin University of Washington Machine Translation
Lenhart, Amanda B. Pew Internet & American Life Project Chatrooms
Lok, Benjamin C. University of Florida Three-Dimensional Graphics Virtual Reality Ludi, Stephanie Rochester Institute of Technology Icons Ma, Kwan-Liu University of California, Davis Data Visualization MacQueen, David University of Chicago Programming Languages Madey, Gregory R. University of Notre Dame Open Source Software Maloney-Krichmar, Diane Bowie State University User-Centered Design Manmatha, R. University of Massachusetts, Amherst Handwriting Recognition and Retrieval Martin, James H. University of Colorado, Boulder Natural-Language Processing N-grams May, Jon University of Sheffield Theory McKinnon, Robin A. Harvard University Digital Government McRoy, Susan W. University of Wisconsin, Milwaukee Dialog Systems
Miller, Sonia E. S. E. Miller Law Firm Law and HCI Moore, Melody M. Georgia State University Brain-Computer Interfaces Murtagh, Fionn Queen’s University, Belfast Information Spaces Neff, Gina University of California, Los Angeles Beta Testing Nissenbaum, Helen New York University Ethics Olson, Gary M. University of Michigan Collaboratories Olson, Judith S. University of Michigan Psychology and HCI Omojokun, Olufisayo University of North Carolina, Chapel Hill Ubiquitous Computing Paik, Woojin University of Massachusetts, Boston Compilers Spell Checker Paquet, Victor L. State University of New York, Buffalo Anthropometry Pargas, Roy Clemson University Online Questionnaires
Peres, S. Camille Rice University Sonification
Rogers, Erika California Polytechnic State University Human-Robot Interaction
Perrig, Adrian Carnegie Mellon University Denial-of-Service Attack
Rosenbaum, Howard Indiana University Social Informatics
Preece, Jenny University of Maryland, Baltimore County User-Centered Design
Rosenbloom, Joshua L. University of Kansas Workforce
Quek, Francis Wright State University Gesture Recognition
Rosson, Mary Beth Pennsylvania State University Computer-Supported Cooperative Work
Quintana, Chris University of Michigan Classrooms
Ruthruff, Joseph R. Oregon State University Visual Programming
Quiroga, Luz M. University of Hawaii Information Filtering
Sandin, Dan University of Illinois, Chicago CAVE
Rajlich, Vaclav Wayne State University Software Cultures
Sandor, Aniko Rice University Sonification
Raphael, Christopher S. University of Massachusetts, Amherst Musical Interaction
Scacchi, Walt University of California, Irvine Socio-Technical System Design
Resnick, Paul University of Michigan Recommender and Reputation Systems
Schmorrow, Dylan Defense Advanced Projects Agency Arpanet Augmented Cognition
Rhodes, Bradley Ricoh Innovations Wearable Computer Rieser, John J. Vanderbilt University Navigation
Scholtz, Jean National Institute of Standards and Technology Usability Evaluation Sears, Andrew L. University of Maryland, Baltimore County Touchscreen
Sears, J. Allen Corporation for National Research Initiatives Arpanet Selker, Ted Massachusetts Institute of Technology Attentive User Interface Sewalkar, Parag Syracuse University Identity Authentication Sengupta, Kuntal Advanced Interfaces Augmented Reality Sharma, Rajeev Advanced Interfaces Augmented Reality Multimodal Interfaces Shen, Yuan Yuan Liverpool John Moores University Animation Simpson, Richard C. University of Pittsburgh User Modeling Singh, Rita Carnegie Mellon University Sphinx
Stage, Jan Aalborg University Education in HCI Iterative Design Prototyping Stanton, Jeffrey M. Syracuse University Privacy Starkweather, Gary Microsoft Corporation Laser Printer Starner, Thad Georgia Institute of Technology Wearable Computers Stephenson, Robert S. Wayne State University Online Education Sterner, William H. University of Chicago Eliza Tekalp, A. Murat University of Rochester Video Summarization Tenopir, Carol University of Tennessee Electronic Journals
Soergel, Dagobert University of Maryland Information Organization Information Retrieval
Thomas, Douglas University of Southern California Hackers
St. Amant, Robert A. North Carolina State University Artificial Intelligence Statistical Analysis Support
Thuraisingham, Bhavani National Science Foundation Security Semantic Web Tran, Phuoc University of Toronto Internet — Worldwide Diffusion
Tretiakoff, Oleg C.A. Technology, Inc. Braille
Westfall, Ralph David California State Polytechnic University, Pomona Telecommuting
Tygar, J. D. University of California, Berkeley Digital Cash Spamming Viruses
Williams, Ronald D. University of Virginia Embedded Systems
Unicode Editorial Committee Unicode van Santen, Jan P.H. Oregon Health and Science University Speech Synthesis Vanderheiden, Gregg University of Wisconsin, Madison Universal Access Vassileva, Julita University of Saskatchewan Peer-to-Peer Architecture Wellman, Barry University of Toronto Internet - Worldwide Diffusion Internet in Everyday Life Welty, Christopher A. IBM T.J. Watson Research Center Ontology
Wilson, Michael CCLRC Rutherford Appleton Laboratory World Wide Web Witte, James Clemson University Online Questionnaires Yaar, Abraham Carnegie Mellon University Denial of Service Attack Youngblood, Michael University of Texas, Arlington Smart Homes Zakaria, Norhayati Syracuse University E-business Zaki, Mohammad Rensselaer Polytechnic Institute Data Mining Zhai, Shumin IBM Almaden Research Center Mouse
INTRODUCTION By William Sims Bainbridge
In hardly more than half a century, computers have become integral parts of everyday life, at home, work, and play. Today, computers affect almost every aspect of modern life, in areas as diverse as car design, filmmaking, disability services, and sex education. Human-computer interaction (HCI) is a vital new field that examines the ways in which people communicate with computers, robots, information systems, and the Internet. It draws upon several branches of social, behavioral, and information science, as well as on computer science and electrical engineering. The traditional heart of HCI has been user interface design, but in recent years the field has expanded to include any science and technology related to the ways that humans use or are affected by computing technology. HCI brings to the fore social and ethical issues that
hitherto existed only in the pages of science fiction. For a sense of the wide reach of HCI, consider the following vignettes:
■ Gloria, who owns a small fitness training business, is currently trying out a new system in which she and a client dance on sensor pads on the floor, while the computer plays rhythms and scores how quickly they are placing their feet on the designated squares.
■ Elizabeth has made friends through chatrooms connected to French and British music groups that are not well known in the United States. She occasionally shares music files with these friends before buying CDs from foreign online distributors, and she has helped one of the French bands translate its website into English.
■ Carl’s work team develops drivers for new color printers far more quickly and effectively than before, because the team comprises expert designers and programmers who live in different time zones around the world, from India to California, collectively working 24 hours a day, 7 days a week, by means of an Internet-based collaboration system.
■ Bella is blind, but her wearable computer uses the Internet and the Global Positioning System not only to find her way through the city safely but also to find any product or service she needs at the best price and to be constantly aware of her surroundings.
■ Anderson, whose Internet moniker is Neo, discovers that his entire life is an illusion, maintained by a vast computer plugged directly into his nervous system.
The first three stories are real, although the names are pseudonyms, and the scenarios are duplicated millions of times in the modern world of personal computers, office automation, and the World Wide Web. The fourth example could be realized with today’s technology, simply given a sufficient investment in infrastructure. Not only would it revolutionize the lives of blind people like Bella, it would benefit the sighted public too, so we can predict that it will in fact become true over the next decade or two. The story about Mr. Anderson is pure fiction, no doubt recognizable to many as the premise of the 1999 film The Matrix. It is doubtful that HCI ever could (or should) become indistinguishable from real life.
Background on HCI
In a brief history of HCI technology published in 1996, the computer scientist Brad Myers noted that most computer interface technology began as government-supported research projects in universities and only years later was developed by corporations and transformed into commercial products. He then listed six up-and-coming research areas: natural language and speech, computer-supported cooperative work, virtual and augmented reality, three-dimensional graphics, multimedia, and computer
recognition of pen or stylus movements on tablet or pocket computers. All of these have been very active areas of research or development since he wrote, and several are fundamental to commercial products that have already appeared. For example, many companies now use speech recognition to automate their telephone information services, and hundreds of thousands of people use stylus-controlled pocket computers every day. Many articles in the encyclopedia describe new approaches that may be of tremendous importance in the future.
Our entire perspective on HCI has been evolving rapidly in recent years. In 1997, the National Research Council—a private, nonprofit institution that provides science, technology, and health policy advice under a congressional charter—issued a major report, More Than Screen Deep, “to evaluate and suggest fruitful directions for progress in user interfaces to computing and communications systems.” This high-level study, sponsored by the National Science Foundation (NSF), concluded with three recommendations to the federal government and university researchers.
1. Break away from 1960s technologies and paradigms. Major attempts should be made to find new paradigms for human-machine interaction that employ new modes and media for input and output and that involve new conceptualizations of application interfaces. (192)
2. Invest in the research required to provide the component subsystems needed for every-citizen interfaces. Research is needed that is aimed at both making technological advances and gaining understanding of the human and organizational capabilities these advances would support. (195)
3. Encourage research on systems-level design and development of human-machine interfaces that support multiperson, multimachine groups as well as individuals. (196)
In 2002, John M. Carroll looked back on the history of HCI and noted how difficult it was at first to get computer science and engineering to pay attention to issues of hardware and software usability. He
argued that HCI was born as the fusion of four fields (software engineering, software human factors, computer graphics, and cognitive science) and that it continues to be an emerging area in computer science. The field is expanding in both scope and importance. For example, HCI incorporates more and more from the social sciences as computing becomes increasingly deeply rooted in cooperative work and human communication. Many universities now have research groups and training programs in HCI. In addition to the designers and engineers who create computer interfaces and the researchers in industry and academia who are developing the fundamental principles for success in such work, a very large number of workers in many industries contribute indirectly to progress in HCI. The nature of computing is constantly changing. The first digital electronic computers, such as ENIAC (completed in 1946), were built to solve military problems, such as calculating ballistic trajectories. The 1950s and 1960s saw a great expansion in military uses and extensive application of digital computers in commerce and industry. In the late 1970s, personal computers entered the home, and in the 1980s they developed more user-friendly interfaces. The 1990s saw the transformation of Internet into a major medium of communications, culminating in the expansion of the World Wide Web to reach a billion people. In the first decade of the twenty-first century, two trends are rushing rapidly forward. One is the extension of networking to mobile computers and embedded devices literally everywhere. The other is the convergence of all mass media with computing, such that people listen to music, watch movies, take pictures, make videos, carry on telephone conversations, and conduct many kinds of business on computers or on networks of which computers are central components. To people who are uncomfortable with these trends, it may seem that cyberspace is swallowing real life. To enthusiasts of the technology, it seems that human consciousness is expanding to encompass everything. The computer revolution is almost certainly going to continue for decades, and specialists in human-computer interaction will face many new challenges in the years to come. At least one other
technological revolution is likely to give computer technology an additional powerful boost: nanotechnology. The word comes from a unit for measuring tiny distances, the nanometer, which is one billionth of a meter (one millionth of a millimeter, or one millionth the thickness of a U.S. dime). The very largest single atoms are just under a nanometer in size, and much of the action in chemistry (including fundamental biological processes) occurs in the range between 1 nanometer and 100–200 nanometers. The smallest transistors in experimental computer chips are about 50 nanometers across. Experts working at the interface between nanotechnology and computing believe that nanoelectronics can support continued rapid improvements in computer speed, memory, and cost for twenty to thirty years, with the possibility of further progress after then by means of integrated design approaches and investment in information infrastructure. Two decades of improvement in computer chips would mean that a desktop personal computer bought in 2024 might have eight thousand times the power of one bought in 2004 for the same price—or could have the same power but cost only twenty cents and fit inside a shirt button. Already, nanotechnology is being used to create networks of sensors that can detect and identify chemical pollutants or biological agents almost instantly. While this technology will first be applied to military defense, it can be adapted to medical or personal uses in just a few years. The average person’s wristwatch in 2024 could be their mobile computer, telling them everything they might want to know about their environment— where the nearest Thai restaurant can be found, when the next bus will arrive at the corner up the road, whether there is anything in the air the person happens to be allergic to, and, of course, providing any information from the world’s entire database that the person might want to know. If advances in natural-language processing continue at the rate they are progressing today, then the wristwatch could also be a universal translator that allows the person to speak with anyone in any language spoken on the face of the planet. Of course, predictions are always perilous, and it may be that progress will slow down. Progress does not simply happen of its own
accord, and the field of human-computer interaction must continue to grow and flourish if computers are to bring the marvelous benefits to human life that they have the potential to bring.
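The eight-thousand-fold estimate above follows from the familiar rule of thumb, often associated with Moore’s law, that chip performance doubles roughly every eighteen months to two years. The figures below are only a back-of-the-envelope check, assuming nothing more than that such a doubling rate continues for the full twenty years:
\[
2^{20/2} = 1{,}024, \qquad 2^{13} = 8{,}192 \approx 8{,}000, \qquad 2^{20/1.5} \approx 10{,}000.
\]
Twenty years of doubling every two years gives about a thousandfold improvement, doubling every eighteen months gives about ten thousand, and the eight-thousand-fold figure corresponds to a doubling time of roughly a year and a half.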
My Own Experience with Computers
Computer and information technologies have progressed amazingly over the past fifty years, and they may continue to do so for the next half century. My first computer, if it deserves that word, was a Geniac I received for my sixteenth birthday in 1956. Costing only $20, it consisted of Masonite disks, wires, light bulbs, and a vast collection of nuts, bolts, and clips. From these parts I could assemble six rotary switches that could be programmed (by hardwiring them) to solve simple logic problems such as playing tick-tack-toe. I developed a great affection for the Geniac, as I did for the foot-long slide rule I lugged to my high school classes, but each was a very far cry from the pocket computer or even the programmable calculator my sixteen-year-old daughter carries in her backpack today. Geniac was not really an electronic computer because it lacked active components—which in 1956 meant relays or vacuum tubes, because transistors were still very new and integrated circuits had not yet been invented. The first real computer I saw, in the early 1960s, was the massive machine used by my father’s company, Equitable Life Insurance, to keep its records. Only decades later did I learn that my uncle, Angus McIntosh, had been part of a team in World War II that seized the German computer that was cracking Soviet codes, and that the secret Colossus computer at Bletchley Park where he worked had been cracking German codes. In the middle of the twentieth century, computers were huge, rare, and isolated from the general public, whereas at the beginning of the twenty-first century they are essential parts of everyday life.
My first experience programming computers came in 1974, when I was a graduate student in the sociology department at Harvard University, and I began using the machines for statistical analysis of data. Starting the next year at the University of Washington, where I was a beginning assistant professor, I would sit for hours at a noisy keypunch machine, making the punch cards to enter programs
and data. After a while I realized I was going deaf from the noise and took to wearing earplugs. Later, back at Harvard in a faculty position, I began writing my own statistical analysis programs for my first personal computer, an Apple II. I remember that one kind of analysis would take 36 hours to run, with the computer humming away in a corner as I went about my daily life. For a decade beginning in 1983, I programmed educational software packages in sociology and psychology, and after a series of computer-related projects found myself running the sociology program at the National Science Foundation and representing the social and behavioral sciences on the major computing initiatives of NSF and the federal government more generally. After eight years of that experience, I moved to the NSF Directorate for Computer and Information Science and Engineering to run the NSF’s programs in human-computer interaction, universal access, and artificial intelligence and cognitive science before becoming deputy director of the Division of Information and Intelligent Systems, which contains these programs.
My daughters, aged sixteen and thirteen, have used their considerable computer expertise to create the Center for Glitch Studies, a research project to discover and analyze programming errors in commercial video games. So far they have documented on their website more than 230 programming errors in popular video games. The hundreds of people who visit the website are not a passive audience, but send e-mail messages describing errors they themselves discovered, and they link their own websites into a growing network of knowledge and virtual social relationships.
A Personal Story—NSF’s FastLane
Computers have become vastly more important at work over recent decades, and they have come to play increasingly complex roles. For example, NSF has created an entire online system for reviewing grant proposals, called FastLane, and thousands of scientists and educators have become familiar with it through serving as reviewers or principal investigators.
A researcher prepares a description of the project he or she hopes to do and assembles ancillary information such as a bibliography and brief biographies of the team members. The researcher submits this material, along with data such as the dollar requests on the different lines of the formal budget. The only software required is a word processor and a web browser. As soon as the head of the institution’s grants office clicks the submit button, the full proposal appears at NSF, with the data already arranged in the appropriate data fields, so nobody has to key it in.
Peer review is the heart of the evaluation process. As director of the HCI program, I categorize proposals into review panels, then recruit panelists who are experts in the field with specializations that match the scope of the proposals. Each panelist reviews certain proposals and submits a written review electronically. Once the individual reviews have been submitted, the panel meets face-to-face to discuss the proposals and recommend funding for the best ones. The panelists all have computers with Electronic Panel System (EPS) groupware that provides easy access to all the proposals and reviews associated with the particular panel. During the discussion of a particular proposal, one panelist acts as “scribe,” keeping a summary of what was said in the EPS. Other panelists can read the summary, send written comments to the scribe, and may be asked to approve the final draft online.
Next the NSF program officer combines all the evaluations and writes a recommendation in the electronic system, for approval by the director of the division in which the program is located. More often than not, unfortunately, the decision is to decline to fund the proposal. In that case, the program officer and division director process the action quickly on their networked computers, and an electronic notification goes immediately to the principal investigator, who can access FastLane to read the reviews and summary of the panel discussion. In those rarer and happier situations when a grant is awarded, the principal investigator and program officer negotiate the last details and craft an abstract describing the research. The instant the award is made, the money goes electronically to
the institution, and the abstract is posted on the web for anyone to see. Each year, the researcher submits a report, electronically of course, and the full record of the grant accumulates in the NSF computer system until the work has been completed. Electronic systems connect the people— researcher, program director, and reviewers—into a system of information flow that is also a social system in which each person plays a specific role. Because the system was designed over a number of years to do a particular set of jobs, it works quite well, and improvements are constantly being incorporated. This is a prime example of Computer-Supported Cooperative Work, one of the many HCI topics covered in this encyclopedia.
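The workflow just sketched can be thought of as a small state machine linking the documents and roles involved. The outline below is purely illustrative—its class, state, and field names are invented for this summary and are not taken from the actual FastLane or EPS software—but it captures the order of events the text describes: electronic submission, panel review, and a decision recorded and transmitted online.

from dataclasses import dataclass, field
from enum import Enum, auto

# Hypothetical, highly simplified model of the proposal life cycle described
# above; the names are invented for illustration only.

class Stage(Enum):
    SUBMITTED = auto()   # grants office clicks "submit"; data arrives at NSF
    IN_PANEL = auto()    # panelists submit reviews, then meet using EPS groupware
    DECLINED = auto()    # decline processed; PI reads reviews and panel summary online
    AWARDED = auto()     # abstract posted on the web; funds transferred electronically

@dataclass
class Proposal:
    title: str
    stage: Stage = Stage.SUBMITTED
    reviews: list[str] = field(default_factory=list)
    panel_summary: str = ""

    def assign_to_panel(self) -> None:
        self.stage = Stage.IN_PANEL

    def add_review(self, text: str) -> None:
        self.reviews.append(text)

    def decide(self, fund: bool, summary: str) -> None:
        # Peer review comes first: a decision presupposes submitted reviews.
        assert self.stage is Stage.IN_PANEL and self.reviews
        self.panel_summary = summary
        self.stage = Stage.AWARDED if fund else Stage.DECLINED

# One proposal moving through the cycle.
p = Proposal("Adaptive Interfaces for Universal Access")
p.assign_to_panel()
p.add_review("Strong intellectual merit; broader impacts adequately addressed.")
p.decide(fund=False, summary="Competitive, but below the funding line this cycle.")
print(p.stage)  # Stage.DECLINED -- as the text notes, the more common outcome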
The Role of the Berkshire Encyclopedia of Human-Computer Interaction
Because the field of HCI is new, the Berkshire Encyclopedia of Human-Computer Interaction breaks new ground. It offers readers up-to-date information about several key aspects of the technology and its human dimensions, including
■ applications—major tools that serve human needs in particular ways, with distinctive usability issues.
■ approaches—techniques through which scientists and engineers design and evaluate HCI.
■ breakthroughs—particular projects that marked a turning point in the history of HCI.
■ challenges—problems and solutions, both technical and human, especially in controversial areas.
■ components—key parts of a software or hardware system that are central to how people use it.
■ disciplines—the contributions that various sciences and academic fields make to HCI.
■ interfaces—hardware or software systems that mediate between people and machines.
■ methods—general computer and information science solutions to wide classes of technical problems.
■ social implications—technological impacts on society and policy issues, and the potential of multiuser HCI systems to bring about social change.
These categories are not mutually exclusive; many articles fit in two or more of them. For example, the short article on laser printers concerns an output interface and explains how a laser printer puts words and pictures on paper. But this article also concerns a breakthrough, the actual invention of the laser printer, and it was written by the inventor himself, Gary Starkweather.
Contributors
The 175 contributors to the encyclopedia possess the full range and depth of expertise covered by HCI, and more. They include not only computer scientists and electrical engineers, but also social and behavioral scientists, plus practicing engineers, scientists, scholars, and other experts in a wide range of other fields. The oldest authors were born around the time that the very first experimental digital electronic computer was built, and the entire history of computing has taken place during their lives.
Among the influential and widely respected contributors is Jose-Marie Griffiths, who contributed the article on digital libraries. As a member of the U.S. President’s Information Technology Advisory Committee, Griffiths understands the full scope and social value of this new kind of public resource. Contributors Judith S. Olson, Gary M. Olson, and John M. Carroll are among the very few leaders who have been elected to the Academy of the Special Interest Group on Computer-Human Interaction of the Association for Computing Machinery (SIGCHI). In 2003 Carroll received the organization’s Lifetime Achievement Award for his extensive accomplishments, including his contributions to the Blacksburg Electronic Village, the most significant experiment on community participation in computer-mediated communication. Jack Dongarra, who wrote the contribution on supercomputers, developed the LINPACK Benchmark, which is used to test the speed of these upper-end machines and which is the basis of the twice-yearly list of the five hundred fastest computers in the world.
Building the Encyclopedia: Computer-Supported Cooperative Work
The creation of this encyclopedia is an example of computer-supported cooperative work, a main area
of HCI. I have written occasional encyclopedia articles since the early 1990s, when I was one of several subject matter editors of The Encyclopedia of Language and Linguistics. Often, an editor working on a specialized encyclopedia for one publisher or another would send me an e-mail message asking if I would write a particular essay, and I would send it in, also by e-mail. I had a very good experience contributing to the Encyclopedia of Community, edited by Karen Christensen and David Levinson of Berkshire Publishing. I suggested to Karen that Berkshire might want to do an encyclopedia of human-computer interaction and that I could recruit excellent authors for such a project. Berkshire has extensive experience developing high-quality reference works, both in partnership with other publishing houses and on its own.
Almost all the communication to create the encyclopedia was carried out online. Although I know many people in the field personally, it was a great help to have access to the public databases placed on the Web by NSF, including abstracts of all grants made in the past fifteen years, and to the online publications of organizations such as the Association for Computing Machinery and to the websites of all of the authors, which often provide copies of their publications. Berkshire created a special password-protected website with information for authors and a section where I could review all the essays as they were submitted.
For the Reader
There are many challenges ahead for HCI, and many are described in this encyclopedia. Difficult problems tend to have both technical and human aspects. For the benefit of the reader, the articles identify standard solutions and their ramifications, both positive and negative, and may also cover social or political controversies surrounding the problem and its possible solutions. Many of the articles describe how a particular scientific discipline or branch of engineering approaches HCI, and what it contributes to the multidisciplinary understanding of and improvement in how computers, robots, and information systems can serve human needs. Other articles focus on a particular interface, modality, or medium in which people receive information and control the
computer or system of which it is a part. These articles explain the technical features of the hardware or software; they also explain the way humans perceive, learn, and behave in the particular context. Still other articles concern how computer and information science has developed to solve a wide class of problems, using vivid examples to explain the philosophy of the method, paying some attention as well to the human side of the equation. Many articles—sometimes as their central focus and sometimes incidentally—examine the social implications of HCI, such as the impact of a particular kind of technology, the way that the technology fits into societal institutions, or a social issue involving computing. The technology can strengthen either cooperation or conflict between human beings, and the mutual relations between technological change and social change are often quite complex.
For information technology workers, this encyclopedia provides insight into specialties other than the one they work in and offers useful perspectives on the broad field. For policy makers, it provides a basis for thinking about the decisions we face in exploiting technological possibilities for maximum human benefit. For students, this encyclopedia lays out how to use the technology to make a better world and offers a glimpse of the rapidly changing computer-assisted human world in which they are living their lives.
To illuminate and expand on the articles themselves, the encyclopedia includes the following special features:
■ Approximately eighty sidebars with key primary text, glossary terms, quotes, and personal stories about how HCI has had an impact on the work and lives of professionals in the field.
■ Some seventy-five diverse illustrations, which range from “antique” photos of the ENIAC computer (c. 1940s) to cutting-edge computerized images.
■ A bibliography of HCI books and journal articles.
■ A popular culture appendix that includes more than 300 annotated entries on books, plays, movies, television shows, and songs that have connections to HCI.
William Sims Bainbridge
The views expressed are those of the author and do not necessarily reflect the position of the National Science Foundation
FURTHER READING
Asher, R. E., & Simpson, J. M. Y. (Eds.). (1994). The encyclopedia of language and linguistics. Oxford, UK: Pergamon.
Bainbridge, W. S. (1989). Survey research: A computer-assisted introduction. Belmont, CA: Wadsworth.
Bainbridge, W. S. (1992). Social research methods and statistics: A computer-assisted introduction. Belmont, CA: Wadsworth.
Carroll, J. M. (Ed.). (2002). Human-computer interaction in the new millennium. Boston: Addison-Wesley.
Christensen, K., & Levinson, D. (2003). Encyclopedia of community: From the village to the virtual world. Thousand Oaks, CA: Sage.
Myers, B. A. (1996). A brief history of human computer interaction technology. ACM Interactions, 5(2), 44–54.
National Research Council. (1997). More than screen deep. Washington, DC: National Academy Press.
Roco, M. C., & Bainbridge, W. S. (2001). Societal implications of nanoscience and nanotechnology. Dordrecht, Netherlands: Kluwer.
Roco, M. C., & Bainbridge, W. S. (2003). Converging technologies for improving human performance. Dordrecht, Netherlands: Kluwer.
PUBLISHER’S NOTE By Karen Christensen
The Berkshire Encyclopedia of Human-Computer Interaction (HCI) is our first independent title. We’ve done many other award-winning encyclopedias but HCI will always have a unique place in our hearts and in our history. Even though most of our work has been in the social sciences, when William Bainbridge at the National Science Foundation wrote to suggest the topic of HCI, I knew instantly that it was the right topic for our “knowledge and technology” company. I grew up with the computer industry. My father, a computer engineer in the Silicon Valley, tried very hard to explain the fundamentals of computing, and even built a machine out of plywood and blinking lights to show my sixth-grade class that information can be captured and communicated with nothing more than a combination of on-off switches. I was a reader, much more interested in human stories and
relationships than in binary code; but it was books—and a career in publishing—that at last brought home to me that computers can support and expand human connections and improve our lives in myriad ways. Berkshire Publishing Group, based in a tiny New England town, depends on human-computer interaction to maintain working relationships, and friendships too, with many thousands of experts around the world. We are convinced, in fact, that this topic is central to our development as a twenty-first-century publishing company. The Berkshire Encyclopedia of Human-Computer Interaction takes computing into new realms, introducing us to topics that are intriguing both in their technical complexity and because they present us—human beings—with a set of challenging questions about our relationship with "thinking" machines. There are opportunities and risks in any new technology, and
HCI has intrigued writers for many decades because it leads us to a central philosophical, religious, and even historical question: What does it mean to be human? We’ll be exploring this topic and related ones in further works about technology and society. Bill Bainbridge was an exceptional editor: organized, focused, and responsive. Working with him has been deeply rewarding, and it’s no surprise that the hundreds of computer scientists and engineers he helped us recruit to contribute to the encyclopedia were similarly enthusiastic and gracious. All these experts—computer scientists and engineers as well as people working in other aspects of HCI— truly wanted to work with us to ensure that their work would be accessible and understandable. To add even greater interest and richness to the work, we’ve added dozens of photographs, personal stories, glossary terms, and other sidebars. In addition to article bibliographies, there is a master bibliography at the end, containing all 2,590 entries in the entire encyclopedia listed together for easy reference. And we’ve added a characteristic Berkshire touch, an appendix designed to appeal to even the most resolute Luddite: “HCI in Popular Culture,” a database compilation listing with 300 sci-fi novels, nonfiction titles, television programs and films from The Six-Million Dollar Man to The Matrix (perhaps the quintessential HCI story), and even a handful of plays and songs about computers and technology. The encyclopedia has enabled us to develop a network of experts as well as a cutting-edge resource that will help us to meet the needs of students, professionals, and scholars in many disciplines. Many articles will be of considerable interest and value to librarians—Digital Libraries, Information Filtering, Information Retrieval, Lexicon Building, and much more—and even to publishers. For example, we have an article on “Text Summarization” written by Judith Klavans, Director of Research at the Center for Advanced Study of Language, University of Maryland. “Summarization is a technique for identifying the key points of a document or set of related documents, and presenting these selected points as a brief, integrated independent representation” and is essential to electronic publishing, a key aspect of publishing today and in the future.
The Berkshire Encyclopedia of Human-Computer Interaction provides us with an essential grounding in the most relevant and intimate form of technology, making scientific and technological research available to a wide audience. This topic and other aspects of what Bill Bainbridge likes to refer to as "converging technologies" will continue to be a core part of our print and online publishing program. And, as befits a project so closely tied to electronic technology, an online version of the Berkshire Encyclopedia of Human-Computer Interaction will be available through xreferplus. For more information, visit www.xreferplus.com.
Karen Christensen
CEO, Berkshire Publishing Group
[email protected]

Editor's Acknowledgements

Karen Christensen, cofounder of the Berkshire Publishing Group, deserves both thanks and praise for recognizing that the time had come when a comprehensive reference work about human relations with computing systems was both possible and sorely needed. Courtney Linehan at Berkshire was both skilled and tireless in working with the authors, editor, and copyeditors to complete a marvelous collection of articles that are technically accurate while communicating clearly to a broad public. At various stages in the process of developing the encyclopedia, Marcy Ross and George Woodward at Berkshire made their own indispensable contributions. Among the authors, Mary Harper, Bhavani Thuraisingham, and Barry Wellman were unstinting in their insightful advice. I would particularly like to thank Michael Lesk, who, as director of the Division of Information and Intelligent Systems of the National Science Foundation, gave me the opportunity to gain invaluable experience managing the grant programs in Universal Access and Human-Computer Interaction.
William Sims Bainbridge
Deputy Director, Division of Information and Intelligent Systems
National Science Foundation
ABOUT THE EDITOR
William Sims Bainbridge is deputy director of the Division of Information and Intelligent Systems of the National Science Foundation, after having directed the division's Human-Computer Interaction, Universal Access, and Knowledge and Cognitive Systems programs. He coedited Converging Technologies for Improving Human Performance, which explores the combination of nanotechnology, biotechnology, information technology, and cognitive science (National Science Foundation, 2002; www.wtec.org/ConvergingTechnologies). He has represented the social and behavioral sciences on five advanced technology initiatives: High Performance Computing and Communications, Knowledge and Distributed Intelligence, Digital Libraries, Information Technology Research, and Nanotechnology. Bill Bainbridge is also the author of ten books, four textbook-software packages, and some 150 shorter publications in information science, social science of technology, and the sociology of culture. He earned his doctorate from Harvard University.
ADAPTIVE HELP SYSTEMS Adaptive help systems (AHSs; also called intelligent help systems) are a specific kind of help system and a recognized area of research in the fields of artificial intelligence and human-computer interaction. The goal of an adaptive help system is to provide personalized help to users working with complex interfaces, from operating systems (such as UNIX) to popular applications (such as Microsoft Excel). Unlike traditional static help systems that serve by request the same information to different users, AHSs attempt to adapt to the knowledge and goals of individual users, offering the most relevant information in the most relevant way.
The first wave of research on adaptive help emerged in the early 1980s, when the UNIX system—due to its low cost and efficiency—reached many universities whose users lacked the advanced technical training (such as knowledge of complicated commands) needed to operate UNIX. Early work on adaptive and intelligent help systems focused almost exclusively on UNIX and its utilities, such as text editors and e-mail. From 1980 to 1995 this research direction involved more than a hundred researchers working on at least two dozen projects. The most representative projects of this generation were UNIX Consultant and EUROHELP. The widespread use of graphical user interfaces (GUIs) in the early 1990s caused a pause in AHS research, because GUIs resolved a number of the problems that the early generation of AHSs sought to address. In just a few years, however, GUIs reached the level of complexity where adaptive help again became important, giving rise to a second wave of research on AHSs. Lumière, the best-known project of this wave, introduced the idea of intelligent help to millions of users of Microsoft applications.
INTERFACE Interconnections between a device, program, or person that facilitate interaction.
Active and Passive AHSs Adaptive help systems are traditionally divided into two classes: active and passive. In a passive AHS, the user initiates the help session by asking for help. An active help system initiates the help session itself. Both kinds of AHSs have to solve three challenging problems: They must build a model of user goals and knowledge, they must decide what to present in the next help message, and they must decide how to present it. In addition, active AHSs also need to decide when to intervene with adaptive help. User Modeling To be useful, a help message has to present information that is new to the user and relevant to the user's current goal. To determine what is new and relevant, AHSs track the user's goals and the user's knowledge about the interface and maintain a user model. Two major approaches to user modeling in AHSs are "ask the user" and "observe the user." Most passive AHSs have exploited the first of these approaches. UNIX Consultant demonstrates that a passive AHS can be fairly advanced: It involves users in a natural-language dialogue to discover their goals and degree of knowledge and then provides the most relevant information. In contrast, active AHSs, introduced by the computer scientist Gerhard Fischer in 1985, strive to deduce a user's goals by observing the user at work; they then strive to identify the lack of knowledge by detecting errors and suboptimal behavior. EUROHELP provides a good example of an active help system capable of identifying a knowledge gap and filling it proactively. In practical AHSs the two approaches often coexist: The user model is initialized through a short interview with the user and then kept updated through observation.
Farewell Clippy
Many PC users through the years quickly learned how to turn off "Clippy," the Microsoft Office helper who appeared out of nowhere eagerly hoping to offer advice to the baffled. The Microsoft press release below was Clippy's swan song.
REDMOND, Wash., April 11, 2001—Whether you love him or you hate him, say farewell to Clippy automatically popping up on your screen. Clippy is the little paperclip with the soulful eyes and the Groucho eyebrows. The electronic ham who politely offers hints for using Microsoft Office software. But, after four years on-screen, Clippy will lose his starring role when Microsoft Office XP debuts on May 31. Clippy, the Office Assistant introduced in Office 97, has been demoted in Office XP. The wiry little assistant is turned off by default in Office XP, but diehard supporters can turn Clippy back on if they miss him. "Office XP is so easy to use that Clippy is no longer necessary, or useful," explained Lisa Gurry, a Microsoft product manager. "With new features like smart tags and Task Panes, Office XP enables people to get more out of the product than ever before. These new simplicity and ease-of-use improvements really make Clippy obsolete," she said. "He's quite down in the dumps," Gurry joked. "He has even started his own campaign to try to get his old job back, or find a new one."
Source: Microsoft. Retrieved March 10, 2004, from http://www.microsoft.com/presspass/features/2001/apr01/04-11clippy.asp

Many AHSs use two classic approaches to model the user. First, they track the user's actions to understand which commands and concepts the user knows and which are not known, and second, they use task models to deduce the user's current goal and missing knowledge. The first technology is reasonably simple: The system just records all used commands and parameters, assuming that if a command is used, it must be known. The second is based on plan recognition and advanced domain knowledge representation in such forms as a goal-plan-action tree. To identify the current goal and missing pieces of knowledge, the system first infers the user's goal from an observed sequence of commands. It then tries to find a more efficient (or simply correct) sequence of commands to achieve this goal. Next, it identifies the aspects of the interface that the user needs to know to build this sequence. These aspects are suspected to be unknown and become the candidates to be presented in help messages.
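The command-tracking side of this approach can be illustrated with a small Python sketch. It is not taken from any of the systems discussed in this article; it is a simplified, hypothetical example in which commands the user has been observed using are assumed to be known, and the candidates for the next help message are the commands of a better plan that the user has apparently never used. The command names, the inferred goal, and the "better plan" are invented, and a real system would obtain the plan through plan recognition over a goal-plan-action tree rather than by hand.

# Hypothetical sketch: commands the user has used are assumed known; help
# candidates are the steps of a better plan the user has never employed.
class CommandTrackingModel:
    def __init__(self):
        self.used = set()            # commands observed so far -> assumed known

    def observe(self, command):
        self.used.add(command)

    def help_candidates(self, better_plan):
        """Commands in the more efficient plan that the user has not yet used."""
        return [cmd for cmd in better_plan if cmd not in self.used]

# Usage: the user deletes a paragraph character by character; plan recognition
# (not shown) infers the goal and proposes a shorter sequence of commands.
model = CommandTrackingModel()
for cmd in ["delete-char", "delete-char", "delete-char"]:
    model.observe(cmd)

better_plan = ["mark-paragraph", "kill-region"]
print(model.help_candidates(better_plan))   # -> ['mark-paragraph', 'kill-region']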
Providing Adaptive Help: Deciding What to Present and How Deciding what should be the focus of the next help message is the most challenging job of an adaptive help system. A number of passive AHSs simply avoid this problem, allowing the users to determine what they need and focusing on adaptive presentation only. Classic AHSs, which use plan recognition, can determine quite precisely what the user needs, but this functionality requires elaborate knowledge representation. To bypass the knowledge representation barrier, modern practical AHSs use a range of alternative (though less precise) technologies that are either statistically or socially based. For example, Lumière used a complex probabilistic network to connect observed user actions with available help interventions, while the system developed by MITRE researchers Linton and Schaefer compared the skills of individual users with a typical set of interface skills assembled by observing multiple users. As soon as the focus of the next help message is determined, the AHS has to decide how to present the target content. While some AHSs ignore this part and focus solely on the selection part, it has been shown that adaptive presentation of help information can increase the user's comprehension speed and decrease errors. Most often the content presentation is adapted to the user's knowledge, with, for example, expert users receiving more specific details and novice users receiving more explanations. To present the adaptive content, classic AHSs that operated in a line-based UNIX interface relied mostly on a natural language generation approach. Modern AHSs operating in the context of graphical user interfaces exploit adaptive hypermedia techniques to present the content and links to further information that is most suitable for the given user. Research into adaptive help systems has contributed to progress in a number of subfields within artificial intelligence and HCI and has helped to establish research on intelligent interfaces and user modeling. A treasury of knowledge accumulated by various AHS projects over the last thirty years is being used now to develop practical adaptive help and adaptive performance support systems. Peter Brusilovsky
See also Artificial Intelligence; Task Analysis; User Modeling FURTHER READING Brusilovsky, P., Kobsa, A., & Vassileva, J. (Eds.). (1998). Adaptive hypertext and hypermedia. Dordrecht, Netherlands: Kluwer. Encarnação, L. M., & Stoev, S. L. (1999). An application-independent intelligent user support system exploiting action-sequence based user modeling. In J. Kay (Ed.), Proceedings of 7th International Conference on User Modeling, UM99, June 20–24, 1999 (pp. 245–254). Vienna: Springer. Fischer, G. (2001). User modeling in human-computer interaction. User Modeling and User-Adapted Interaction, 11(1–2), 65–86. Goodman, B. A., & Litman, D. J. (1992). On the interaction between plan recognition and intelligent interfaces. User Modeling and UserAdapted Interaction, 2(1), 83–115. Hegner, S. J., Mc Kevitt, P., Norvig, P., & Wilensky, R. L. (Eds.). (2001). Intelligent help systems for UNIX. Dordrecht, Netherlands: Kluwer. Horvitz, E., Breese, J., Heckerman, D., Hovel, D., & Rommelse, K. (1998). The Lumière project: Bayesian user modeling for inferring the goals and needs of software users. In Proceedings of Fourteenth Conference on Uncertainty in Artificial Intelligence (pp. 256–265). San Francisco: Morgan Kaufmann. Linton, F., & Schaefer, H.-P. (2000). Recommender systems for learning: Building user and expert models through long-term observation of application use. User Modeling and User-Adapted Interaction, 10(2–3), 181–208. Oppermann, R. (Ed.). (1994). Adaptive user support: Ergonomic design of manually and automatically adaptable software. Hillsdale, NJ: Lawrence Erlbaum Associates. Wilensky, R., Chin, D., Luria, M., Martin, J., Mayfield, J., & Wu, D. (1988). The Berkeley UNIX Consultant project. Computational Linguistics, 14(4), 35–84. Winkels, R. (1992). Explorations in intelligent tutoring systems and help. Amsterdam: IOS Press.
ADAPTIVE INTERFACES Computer interfaces are becoming ever richer in functionality, software systems are becoming more
complex, and online information spaces are becoming larger in size. On the other hand, the number and diversity of people who use computer systems are increasing as well. The vast majority of new users are thereby not computer experts, but rather laypersons such as professionals in nontechnical areas, elderly people, and children. These users vary with respect not only to their computer skills, but also to their fields of expertise, their tasks and goals, their mood and motivation, and their intellectual and physical capabilities. The traditional strategy for enabling heterogeneous user groups to master the complexity and richness of computers was to render computer interaction as simple as possible and thereby to cater to the lowest common denominator of all users. Increasingly, though, developers are creating computer applications that can be “manually” customized to users’ needs by the users themselves or by an available expert. Other applications go beyond this capability. They are able within certain limits to recognize user needs and to cater to them automatically. Following the terminology of Reinhard Oppermann, we will use the term adaptable for the manual type of application and adaptive for the automatic type.
Adaptable and Adaptive Systems Adaptable systems are abundant. Most commercial software allows users to modify system parameters and to indicate individual preferences. Web portals permit users to specify the information they want to see (such as stock quotes or news types) and the form in which it should be displayed by their web browsers. Web shops can store basic information about their customers, such as payment and shipping data, past purchases, wish lists for future purchases, and birthdates of friends and family to facilitate transactions online. In contrast, adaptive systems are still quite rare. Some shopping websites give purchase recommendations to customers that take into account what these customers bought in the past. Commercial learning software for high school mathematics adapts its teaching strategies to the presumed level of expertise of each student. Advertisements on mobile devices are already being targeted to users in certain geographical locations only or to users who
perform certain indicative actions (such as entering certain keywords in search engines). User adaptability and adaptivity recently gained strong popularity on the World Wide Web under the notion of "personalization." This popularity is due to the fact that the audiences of websites are often even less homogeneous than the user populations of commercial software. Moreover, personalization has been recognized as an important instrument for online customer relationship management.
Acquiring Information about Users To acquire the information about users that is needed to cater to them, people can use several methods. A simple way is to ask users directly, usually through an initial questionnaire. However, this questionnaire must be kept extremely short (usually to less than five questions) because users are generally reluctant to spend effort on work that is not directly related to their current tasks, even if this work would save them time in the long run. In certain kinds of systems, specifically tutoring systems, user interviews can be cast in the form of quizzes or games. In the future, basic information about users may be available on smartcards, that is, machine-readable plastic cards that users swipe through a reading device before the beginning of a computer session or that can even be read from a distance as users approach a computer terminal. Various methods draw assumptions about users based on their interaction behavior. These methods include simple rules that predict user characteristics or assign users to predetermined user groups with known characteristics when certain user actions are being observed (the latter method is generally known as the "stereotype approach" to user modeling). Probabilistic reasoning methods take uncertainty and evidence from different sources into account. Plan recognition methods aim at linking individual actions of users to presumed underlying plans and goals. Machine-learning methods try to detect regularities in users' actions (and to use the learned patterns as a basis for predicting future actions). Clique-based (collaborative) filtering methods determine those users who are closest to the current user in an n-dimensional attribute space and use them as predictors for unknown attributes of the current user. Clustering methods allow one to generalize groups of users with similar behaviors or characteristics and to generate user stereotypes.

Keeping Disabled People in the Technology Loop
AUSTIN, Texas (ANS)—If communications technology is fueling the economy and social culture of the 21st century, why should 18 percent of the population be left behind? Stephen Berger, a specialist in retrofitting the latest computer and phone technology for the disabled, is trying to make sure they're not. From an office in Austin, Berger works to make sure that those with hearing and vision impairments or other disabilities can benefit from the latest in Internet, cell phone and other technologies. As a project manager at Siemens Information and Communication Mobile, where he's responsible for standards and regulatory management, Berger works to unravel such problems as why those who use hearing aids couldn't use many brands of cell phones. "Some new cell phones make a buzz in hearing aids," Berger explained. "The Federal Communications Commission took note and said it needed to be resolved." But what was needed was either better technology or protocols that both the hearing impaired and the cell phone companies could agree on. Berger helped determine what types of hearing aids work with certain types of phones. The intelligence was passed around the industry, and the problem is now minimal. Berger is one of the many technology specialists in huge communications companies whose niche has been defined in recent years. While the proliferation of computers, home gadgets and gizmos is on the rise, it's workers like Berger who make sure the disabled aren't left out of the loop. Other workers in the field, according to Berger, are coming from educational institutions. For example, Neil Scott and Charlie Robinson, from Stanford University and Louisiana Tech University respectively, are working on the things the Hollywood movies are made of. […] "Guys like this are breaking the barrier between the blind and computers," he said. "(The blind) will soon have an interface with no visual, just audio computer controls with no touch, just head position and voice controls." Other devices, like the Home RF systems—that's home radio frequency—link all the major appliances and electronics of the home together. That means telephone, Dolby sound, Internet, entertainment electronics and other devices are all connected into one wireless network with voice control for those who aren't mobile. "It's microphones implanted in wallpaper, security systems by voice, household appliances that work on a vocal command," Berger said. "It's what the movies are made of and it's here today."
Source: Innovations keep disabled in the technology loop. American News Services, October 12, 2000.

Types of Information about the User Researchers have considered numerous kinds of user-related data for personalization purposes, including the following:
■ Data about the user, such as demographic data, and information or assumptions about the user's knowledge, skills, capabilities, interests, preferences, goals, and plans
■ Usage data, such as selections (e.g., of webpages or help texts with certain content), temporal viewing behavior (particularly "skipping" of webpages or streaming media), user ratings (e.g., regarding the usefulness of products or the relevance of information), purchases and related actions (e.g., in shopping carts, wish lists), and usage regularities (such as usage frequencies, high correlations between situations and specific actions, and frequently occurring sequences of actions)
■ Environmental data, such as data about the user's software and hardware environments and information about the user's current location (where the granularity ranges from country level to the precise coordinates) and personalization-relevant data of this location.
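The clique-based (collaborative) filtering method mentioned under "Acquiring Information about Users" can be sketched in a few lines of Python. The example below is purely illustrative: the attribute names, the ratings, the use of Euclidean distance, and the choice of two neighbors are assumptions made for demonstration, not features of any particular personalization system.

# Hypothetical sketch of clique-based (collaborative) filtering: find the users
# closest to the current user in an attribute space and average their values for
# an attribute that is unknown for the current user.
import math

def distance(a, b, attrs):
    """Euclidean distance between two users over the shared attributes."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in attrs))

def predict(current, others, target, k=2):
    """Average the target attribute over the k nearest neighbours."""
    shared = [key for key in current if key != target]
    neighbours = sorted(others, key=lambda u: distance(current, u, shared))[:k]
    return sum(u[target] for u in neighbours) / k

# Usage: interest ratings on a 0-1 scale; predict the current user's interest in "news".
others = [
    {"sports": 0.9, "finance": 0.2, "news": 0.8},
    {"sports": 0.8, "finance": 0.3, "news": 0.7},
    {"sports": 0.1, "finance": 0.9, "news": 0.2},
]
current = {"sports": 0.85, "finance": 0.25}
print(round(predict(current, others, "news"), 2))  # -> 0.75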
Privacy Storing information about users for personalization is highly privacy relevant. Numerous consumer surveys show consistently that users are concerned about their privacy online, which also affects personalized systems on the Web. Some popular personalization methods also seem in conflict with privacy laws that protect the data of identified or identifiable individuals in more than thirty countries. Such laws usually call for parsimony, purpose specificity, and user awareness or even user consent in the collecting and processing of personal data. The privacy laws of many countries also restrict the transborder flow of personal data or even extend their coverage beyond the national boundaries. Such laws then also affect personalized websites abroad that serve users in these regulated countries, even if there is no privacy law in place in the country where the websites are located. Well-designed user interaction will be needed in personalized systems to communicate to users at any point the prospective benefits of personalization and the resulting privacy consequences to enable users to make educated choices. A flexible architecture, moreover, will be needed to allow for optimal personalization within the constraints set by users’ privacy preferences and the legal environment. Alternatively, anonymous yet personalized interaction can be offered.
Empirical Evaluation A number of empirical studies demonstrate in several application areas that well-designed adaptive user interfaces may give users considerable benefits. Boyle and Encarnacion showed that the automatic adjustment of the wording of a hypertext document to users' presumed familiarity with technical vocabulary improved text comprehension and search times significantly in comparison with static hypertext. Conati and colleagues presented evidence that "adaptive prompts based on the student model effectively elicited self-explanations that improved students' learning" (Conati et al. 2000, 404). Corbett and Trask showed that a certain tutoring strategy (namely subgoal scaffolding based on a continuous knowledge trace of the user) decreases the average number of problems required to reach cognitive mastery of Lisp concepts. In studies reviewed by Specht and Kobsa, students' learning time and retention of learning material improved significantly if learners with low prior knowledge received "strict" recommendations on what to study next (which amounted to the blocking of all other learning material), while students with high prior knowledge received noncompulsory recommendations only. Strachan and colleagues found significantly higher user ratings for the personalized version of a help system in a commercial tax advisor system than for its nonpersonalized version. Personalization for e-commerce on the Web has also been positively evaluated to some extent, both from a business and a user point of view. Jupiter Communications reports that personalization at twenty-five consumer e-commerce sites boosted the number of new customers by 47 percent and revenues by 52 percent in the first year. Nielsen NetRatings reports that registered visitors to portal sites (who obtain the privilege of adapting the displayed information to their interests) spend more than three times longer at their home portal than other users and view three to four times more pages. Nielsen NetRatings also reports that e-commerce sites offering personalized services convert approximately twice as many visitors into buyers as do e-commerce sites that do not offer personalized services. In design studies on beneficial personalized elements in a Web-based procurement system, participants, however, "expressed their strong desire to have full and explicit control of data and interaction" and "to readily be able to make sense of site behavior, that is, to understand a site's rationale for displaying particular content" (Alpert et al. 2003, 373). User-adaptable and user-adaptive interfaces have shown their promise in several application areas. The increase in the number and variety of computer users is likely to increase their promise in the future. The observation of Browne still holds true, however: "Worthwhile adaptation is system specific. It is dependent on the users of that system and requirements
to be met by that system” (Browne 1993, 69). Careful user studies with a focus on expected user benefits through personalization are, therefore, indispensable for all practical deployments. Alfred Kobsa See also Artificial Intelligence and HCI; Privacy; User Modeling
FURTHER READING Alpert, S., Karat, J., Karat, C.-M., Brodie, C., & Vergo, J. G. (2003). User attitudes regarding a user-adaptive e-commerce web site. User Modeling and User-Adapted Interaction, 13(4), 373–396. Boyle, C., & Encarnacion, A. O. (1994). MetaDoc: An adaptive hypertext reading system. User Modeling and User-Adapted Interaction, 4(1), 1–19. Browne, D. (1993). Experiences from the AID Project. In M. SchneiderHufschmidt, T. Kühme, & U. Malinowski (Eds.), Adaptive user interfaces: Principles and practice (pp. 69–78). Amsterdam: Elsevier. Carroll, J., & Rosson, M. B. (1989). The paradox of the active user. In J. Carroll (Ed.), Interfacing thought: Cognitive aspects of humancomputer interaction (pp. 80–111). Cambridge, MA: MIT Press. Conati, C., Gertner, A., & VanLehn, K. (2002). Using Bayesian networks to manage uncertainty in student modeling. User Modeling and User-Adapted Interaction, 12(4), 371–417. Corbett, A. T., & Trask, H. (2000). Instructional interventions in computer-based tutoring: Differential impact on learning time and accuracy. Proceedings of ACM CHI’ 2000 Conference on Human Factors in Computing Systems (pp. 97–104). Hof, R., Green, H., & Himmelstein, L. (1998, October 5). Now it’s YOUR WEB. Business Week (pp. 68–75). ICONOCAST. (1999). More concentrated than the leading brand. Retrieved August 29, 2003, from http://www.iconocast.com/issue/1999102102.html Kobsa, A. (2002). Personalized hypermedia and international privacy. Communications of the ACM, 45(5), 64–67. Retrieved August 29, 2003, from http://www.ics.uci.edu/~kobsa/papers/2002-CACMkobsa.pdf Kobsa, A., Koenemann, J., & Pohl, W. (2001). Personalized hypermedia presentation techniques for improving customer relationships. The Knowledge Engineering Review, 16(2), 111–155. Retrieved August 29, 2003, from http://www.ics.uci.edu/~kobsa/papers/2001-KERkobsa.pdf Kobsa, A., & Schreck, J. (2003). Privacy through pseudonymity in useradaptive systems. ACM Transactions on Internet Technology, 3(2), 149–183. Retrieved August 29, 2003, from http://www.ics.uci.edu/ ~kobsa/papers/2003-TOIT-kobsa.pdf Oppermann, R. (Ed.). (1994). Adaptive user support: Ergonomic design of manually and automatically adaptable software. Hillsdale, NJ: Lawrence Erlbaum. Rich, E. (1979). User modeling via stereotypes. Cognitive Science, 3, 329–354. Rich, E. (1983). Users are individuals: Individualizing user models. International Journal of Man-Machine Studies, 18, 199–214.
Specht, M., & Kobsa, A. (1999). Interaction of domain expertise and interface design in adaptive educational hypermedia. Retrieved March 24, 2004, from http://wwwis.win.tue.nl/asum99/specht/specht.html Strachan, L., Anderson, J., Sneesby, M., & Evans, M. (2000). Minimalist user modeling in a complex commercial software system. User Modeling and User-Adapted Interaction, 10(2–3), 109–146. Teltzrow, M., & Kobsa, A. (2004). Impacts of user privacy preferences on personalized systems—A comparative study. In C.-M. Karat, J. Blom, & J. Karat (Eds.), Designing personalized user experiences for e-commerce (pp. 315–332). Dordrecht, Netherlands: Kluwer Academic Publishers.
AFFECTIVE COMPUTING Computations that machines make that relate to human emotions are called affective computations. Such computations include but are not limited to the recognition of human emotion, the expression of emotions by machines, and direct manipulation of the human user's emotions. The motivation for the development of affective computing is derived from evidence showing that the ability of humans to feel and display emotions is an integral part of human intelligence. Emotions help humans in areas such as decision-making and human-to-human communications. Therefore, it is argued that in order to create intelligent machines that can interact effectively with humans, one must give the machines affective capabilities. Although humans interact mainly through speech, we also use body gestures to emphasize certain parts of the speech and as one way to display emotions. Scientific evidence shows that emotional skills are part of what is called intelligence. A simple example is the ability to know when something a person says to another is annoying or pleasing to the other, and be able to adapt accordingly. Emotional skills also help in learning to distinguish between important and unimportant things, an integral part of intelligent decision-making. For computers to be able to interact intelligently with humans, they will need to have such emotional skills as the ability to display emotions (for example, through animated agents) and the ability to recognize the user's emotions. The ability to recognize emotions would be useful in day-to-day interaction, for example, when the user is Web browsing or searching: If the computer can recognize emotions, it will know if the user is bored or dissatisfied with the search results. Affective skills might also be used in education: A computer acting as a virtual tutor would be more effective if it could tell by students' emotional responses that they were having difficulties or were bored—or pleased. It would not, however, be necessary for computers to recognize emotions in every application. Airplane control and banking systems, for example, do not require any affective skills. However, in applications in which computers take on a social role (as a tutor, assistant, or even companion), it may enhance their functionality if they can recognize users' emotions. Computer agents could learn users' preferences through the users' emotions. Computers with affective capabilities could also help human users monitor their stress levels. In clinical settings, recognizing a person's inability to interpret certain facial expressions may help diagnose early psychological disorders. In addition to recognizing emotions, the affective computer would also have the ability to display emotions. For example, synthetic speech with emotions in the voice would sound more pleasing than a monotonous voice and would enhance communication between the user and the computer. For computers to be affective, they must recognize emotions, be capable of measuring signals that represent emotions, and be able to synthesize emotions.

FUNCTIONALITY The capabilities of a given program or parts of a program.
General Description of Emotions Human beings possess and express emotions in everyday interactions with others. Emotions are often reflected on the face, in hand and body gestures, and in the voice. The fact that humans understand emotions and know how to react to other people's expressions greatly enriches human interaction. There is no clear definition of emotions. One way to handle emotions is to give them discrete labels, such as joy, fear, love, surprise, sadness, and so on. One problem with this approach is that humans often feel blended emotions. In addition, the choice of words may be too restrictive or culturally dependent. Another way to describe emotions is to have multiple dimensions or scales. Instead of choosing discrete labels, emotions are described on several continuous scales, for example from pleasant to unpleasant or from simple to complicated. Two common scales are valence and arousal. Valence describes the pleasantness of the stimuli, with positive (or pleasant) on one end and negative (or unpleasant) on the other. For example, happiness has a positive valence, while disgust has a negative valence. The other dimension is arousal, or activation, which describes the degree to which the emotion stimulates the person experiencing it. For example, sadness has low arousal, whereas surprise has high arousal. The different emotional labels could be plotted at various positions on a two-dimensional plane spanned by these two axes to construct a two-dimensional emotion model. In 1954 the psychologist Harold Schlosberg suggested a three-dimensional model in which he added an axis for attention-rejection to the above two. This was reflected by facial expressions as the degree of attention given to a person or object. For example, attention is expressed by wide open eyes and an open mouth. Rejection shows contraction of eyes, lips, and nostrils. Although psychologists and others argue about what exactly emotions are and how to describe them, everyone agrees that a lack of emotions or the presence of emotional disorders can be so disabling that people affected are no longer able to lead normal lives or make rational decisions.
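The dimensional description above can be made concrete with a small Python sketch. The valence and arousal coordinates below are invented placeholders on a scale from -1 to 1, not values from the psychological literature; the example only shows how discrete labels can be positioned on the two axes and how an observed point might be mapped back to the nearest label.

# Illustrative valence-arousal model: valence runs from unpleasant (-1) to
# pleasant (+1), arousal from calm (-1) to highly stimulated (+1).
# The coordinates are assumptions for demonstration only.
import math

EMOTION_COORDS = {
    "happiness": (0.8, 0.5),
    "surprise": (0.2, 0.9),
    "sadness": (-0.7, -0.4),
    "disgust": (-0.6, 0.1),
    "fear": (-0.8, 0.7),
}

def nearest_label(valence, arousal):
    """Return the label whose coordinates are closest to the observed point."""
    return min(EMOTION_COORDS,
               key=lambda name: math.dist((valence, arousal), EMOTION_COORDS[name]))

print(nearest_label(0.6, 0.4))    # -> "happiness"
print(nearest_label(-0.75, 0.6))  # -> "fear"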
Technology for Recognizing Emotions Technologies for recognizing human emotions began to develop in the early 1990s. Three main modalities have been targeted as being relevant for this task: visual, auditory, and physiological signals. The visual modality includes both static images and videos containing information such as facial expressions and body motion. The audio modality uses primarily human voice signal as input, while the physiological signals measure changes in the human body, such as changes in temperature, blood pressure, heart rate, and skin conductivity.
Facial Expressions One of the most common ways for humans to display emotions is through facial expressions. The best-known study of facial expressions was done by the psychologist Paul Ekman and his colleagues. Since the 1970s, Ekman has argued that emotions are manifested directly in facial expressions, and that there are six basic universal facial expressions corresponding to happiness, surprise, sadness, fear, anger, and disgust. Ekman and his collaborator, the researcher Wallace Friesen, designed a model linking facial motions to expression of emotions; this model is known as the Facial Action Coding System (FACS). The facial action coding system codes facial expressions as a combination of facial movements known as action units. The action units have some relation to facial muscular motion and were defined based on anatomical knowledge and by studying videotapes of how the face changes its appearance. Ekman’s work inspired many other researchers to analyze facial expressions by means of image and video processing. Although the FACS is designed to be performed by human observers viewing a video frame by frame, there have been attempts to automate it in some fashion, using the notion that a change in facial appearance can be described in terms of a set of facial expressions that are linked to certain emotions. Work on automatic facial-expression recognition started in the early 1990s. In all the research, some method to extract measurements of the facial features from facial images or videos was used and a classifier was constructed to categorize the facial expressions. Comparison of facial expression recognition methods shows that recognition rates can, on limited data sets and applications, be very high. The generality of these results has yet to be determined.
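The idea of coding an expression as a combination of elementary facial movements can be illustrated schematically in Python. The prototype sets below are invented placeholders rather than Ekman and Friesen's actual coding tables, and the overlap score is a deliberately crude stand-in for the classifiers used in real recognition systems.

# Hypothetical FACS-style matching: an observed set of facial movements is scored
# against prototype combinations, and the best-matching expression is returned.
PROTOTYPES = {
    "happiness": {"cheek_raise", "lip_corner_pull"},
    "surprise": {"brow_raise", "upper_lid_raise", "jaw_drop"},
    "sadness": {"inner_brow_raise", "lip_corner_depress"},
}

def classify(detected_movements):
    """Score each prototype by overlap with the detected movements (Jaccard index)."""
    def score(prototype):
        overlap = len(prototype & detected_movements)
        return overlap / len(prototype | detected_movements)
    best = max(PROTOTYPES, key=lambda label: score(PROTOTYPES[label]))
    return best, round(score(PROTOTYPES[best]), 2)

print(classify({"cheek_raise", "lip_corner_pull", "jaw_drop"}))
# -> ("happiness", 0.67)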
Voice Quantitative studies of emotions expressed in the voice have had a longer history than quantitative studies of facial expressions, starting in the 1930s. Studies of the emotional content of speech have examined the pitch, duration, and intensity of the utterance. Automatic recognition systems of emotions from voice have so far not achieved high accuracy. In addition, there is no agreed-upon theory
of the universality of how emotions are expressed vocally, unlike the case for facial expressions. Research that began in the late 1990s concentrated on combining voice and video to enhance the recognition capabilities of voice-only systems.
Multimodal Input Many researchers believe that combining different modalities enables more accurate recognition of a user’s emotion than relying on any single modality alone. Combining different modalities presents both technological and conceptual challenges, however. On the technological side, the different signals have different sampling rates (that is, it may take longer to register signals in one modality than in another), and the existence of one signal can reduce the reliability of another (for example, when a person is speaking, facial expression recognition is not as reliable). On the conceptual side, emotions are not always aligned in time for different signals. For example, happiness might be evident visually before it became evident physiologically.
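One simple way to combine modalities along the lines just described is a weighted fusion in which each modality contributes per-emotion scores together with a reliability weight, and the weight is lowered when that modality is momentarily less trustworthy (for example, facial analysis while the person is speaking). The numbers, labels, and weighting scheme in the Python sketch below are illustrative assumptions only.

# Hypothetical weighted fusion of per-emotion scores from two modalities.
def fuse(modality_scores, reliabilities):
    """Weighted average of per-emotion scores across modalities."""
    emotions = next(iter(modality_scores.values())).keys()
    total_weight = sum(reliabilities.values())
    return {e: sum(reliabilities[m] * modality_scores[m][e] for m in modality_scores)
               / total_weight
            for e in emotions}

scores = {
    "face": {"happiness": 0.7, "sadness": 0.3},
    "voice": {"happiness": 0.4, "sadness": 0.6},
}
reliabilities = {"face": 0.3, "voice": 1.0}   # face weight reduced while speaking
fused = fuse(scores, reliabilities)
print(max(fused, key=fused.get))  # -> "sadness"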
Computers Displaying Emotions For affective computing, it is as important that computers display emotions as it is that they recognize them. There are a number of potential ways in which computers could evince an emotion.A computer might depend on facial expressions of animated agents or on synthesized speech, or emotion could be conveyed to a user through wearable devices and text messages. The method would be determined by the application domain and preset goals. For example, interaction in an office environment requires emotion to be expressed differently from the way it would be expressed during pursuit of leisure activities, such as video games; similarly, in computer-assisted tutoring, the computer’s goal is to teach the human user a concept, and the display of emotions facilitates this goal, while in a game of poker, the computer’s goal is to hide its intention and deceive its human adversary. A computer could also synthesize emotions in order to make intelligent decisions with regard to problems whose attributes cannot all be quantified exactly, or when the search space for the best solution is
large. By assigning valence to different choices based on different emotional criteria, many choices in a large space can be eliminated quickly, resulting in a quick and good decision.
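The valence-based pruning just described can be illustrated with a toy Python example. The choices, emotional criteria, weights, and threshold below are all invented for the purpose of demonstration and do not come from any particular affective system.

# Toy sketch: score candidate choices against emotional criteria and prune
# low-valence candidates before any detailed search.
def prune_by_valence(options, criteria_weights, threshold=0.0):
    """Keep only options whose weighted valence is above the threshold."""
    def valence(scores):
        return sum(criteria_weights[c] * scores[c] for c in criteria_weights)
    return {name: scores for name, scores in options.items()
            if valence(scores) > threshold}

options = {
    "reply_now":   {"urgency": 0.9, "frustration_risk": -0.2},
    "ignore":      {"urgency": -0.8, "frustration_risk": -0.6},
    "ask_details": {"urgency": 0.3, "frustration_risk": 0.4},
}
weights = {"urgency": 0.6, "frustration_risk": 0.4}
print(sorted(prune_by_valence(options, weights)))  # -> ['ask_details', 'reply_now']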
Research Directions Affective computing is still in its infancy. Some computer systems can perform limited recognition of human emotion with limited responses in limited application domains. As the demand for intelligent computing systems increases, however, so does the need for affective computing. Various moral issues have been brought up as relevant in the design of affective computers. Among them are privacy issues: If a computer can recognize human emotions, a user may want assurances that information on his or her emotional state will not be abused. There are also issues related to computers’ manipulation of people’s emotions: Users should have assurance that computers will not physically or emotionally harm them. There are also questions regarding who will have responsibility for computer actions. As affective technology advances, these issues will have increasing relevance. Ira Cohen, Thomas S. Huang, Lawrence S. Chen
FURTHER READING Darwin, C. (1890). The expression of the emotions in man and animals (2nd ed.). London: John Murray. Ekman, P. (Ed.). (1982). Emotion in the human face (2nd ed.). New York: Cambridge University Press. Ekman, P., & Friesen, W. V. (1978). Facial action coding system: Investigator's guide. Palo Alto, CA: Consulting Psychologists Press. James, W. (1890). The principles of psychology. New York: Henry Holt. Jenkins, J. M., Oatley, K., & Stein, N. L. (Eds.). (1998). Human emotions: A reader. Malden, MA: Blackwell. Lang, P. (1995). The emotion probe: Studies of motivation and attention. American Psychologist, 50(5), 372–385. Pantic, M., & Rothkrantz, L. J. M. (2000). Automatic analysis of facial expressions: The state of the art. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12), 1424–1445. Picard, R. W. (1997). Affective computing. Cambridge, MA: MIT Press. Picard, R. W., Vyzas, E., & Healey, J. (2001). Toward machine emotional intelligence: Analysis of affective physiological state. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10), 1175–1191. Schlosberg, H. (1954). Three dimensions of emotion. Psychological Review, 61(2), 81–88.
ALTAIR People have called the legendary Altair the first true personal computer. However, although it played an important role in the development of personal computers, we would be more correct to say that Altair was the last hobby computer that set many of the ambitions for the social movement that produced real personal computers during the 1970s. Thus, from the standpoint of human-computer interaction, Altair is worth remembering because it marked a crucial transition between two eras of amateur computing: an experimental era lasting from about 1950 until the mid-1970s, when home computers were among the more esoteric projects attempted by electronics hobbyists, and the true personal computer era, beginning with such computers as the Apple II in 1977. Altair was announced to the world late in 1974 in the issue of Popular Electronics magazine dated January 1975. Some controversy exists about how active a role the magazine played in launching Altair, but clearly Altair was actually designed and manufactured by the small company Micro Instrumentation and Telemetry Systems (MITS) in Albuquerque, New Mexico, headed by H. Edward “Ed” Roberts. Altair was a kit, costing $397, that required much skill from the builder and did not include sufficient memory or input-output devices to perform any real tasks. The central processing unit was the new Intel 8080 microprocessor chip. Altair came with only 256 bytes of memory, and a notoriously unreliable 4-kilobyte memory expansion board kit cost an additional $264. After a while MITS offered data input and output by means of an audio cassette recorder, but initially only a dedicated amateur was in a practical position to add a keyboard (perhaps a used teletype machine) or punched paper tape reader. Inputoutput for the original computer was accomplished by switches and lights on the front of the cabinet. Popular Electronics hyped the Altair as if it were a fully developed “minicomputer” and suggested some excessively demanding applications: an autopilot for airplanes or boats, a high-speed input-output device for a mainframe computer, a
brain for a robot, an automatic controller for an air-conditioning system, and a text-to-Braille converter to allow blind people to read ordinary printed matter. Altair rescued MITS from the verge of bankruptcy, but the company could never fully deliver on the promise of the computer and was absorbed by another company in 1977. Hundreds of amateurs built Altairs on the way to careers in the future personal computer industry, its subcomponent interface bus became the widely used S100 standard, and the computer contributed greatly to the revolution in human-computer interaction that occurred during its decade. Notably, the mighty Microsoft corporation began life as a tiny partner of MITS, producing a BASIC interpreter for programming the Altair. Altair was far from being the first hobby computer, however. That honor probably belongs to Edmund C. Berkeley’s Simon relaybased computer produced in 1950 and publicized among hobbyists in the pages of Radio-Electronics magazine. The most widely owned hobby digital computer before Altair was probably Berkeley’s GENIAC (Genius Almost-Automatic Computer), which cost less than twenty dollars in 1955. Lacking vacuum tubes, relays, or transistors, this assembly of Masonite board, rotary switches, lights, and wires instructed students in the rudiments of logic programming (programming the steps of logical deductions). Immediately prior to Altair, two less influential hobby computers were also based on Intel chips: the Scelbi 8H and the Titus Mark-8. The difference is that Altair was expandable and intended to evolve into a full-featured personal computer. The 1970s marked a turning point in the history of hobby electronics, and innovative projects such as Altair could be seen as desperate measures in the attempt to keep the field alive. Today some enthusiasts build electronic equipment from kits or from scratch, just as others build their own harpsichords, but they no longer have the same relationship to the electronics industry that they enjoyed during the middle decades of the twentieth century. Prior to the development of integrated circuits, factories constructed radios, televisions, and audio amplifiers largely by hand, laboriously
“Personal computers are notorious for having a half-life of about two years. In scientific terms, this means that two years after you buy the computer, half of your friends will sneer at you for having an outdated machine.” —Peter H. Lewis
soldering each wire, capacitor, and resistor into place manually. Any of these parts might burn out in use, so repair shops flourished, and companies such as Allied Radio and Lafayette Electronics sold individual parts to hobbyists and to anyone else who was willing to buy. For the novice, these distributors sold kits that provided all the parts needed to build a project, and more advanced amateurs followed instructions in a number of magazines to build projects from parts they bought separately from distributors. In purely financial terms building a stereo system from a kit during the 1960s, as tens of thousands of people did, made little sense, but the result was fully as good as the best ready-made system that could be bought in stores, and in some cases the designs were identical. The introduction of integrated circuits gradually reduced the role of repairpersons, and by the dawn of the twenty-first century much electronic equipment really could not be repaired effectively and was simply replaced when it broke down. Already by the late 1970s the electronics hobby was in decline, and the homebuilt computer craze during that decade was practically a last fling. For a decade after the introduction of Altair, a vibrant software hobbyist subculture prevailed as people manually copied programs from a host of amateur computer magazines, and many people “brewed” their own personally designed word processors and fantasy games. This subculture declined after the introduction of complicated graphical user interface operating systems by Apple and Microsoft, but it revived during the mid-1990s as vast numbers of people created their own websites in the initially simple HTML (hypertext markup language). During its heyday this subculture was a great training ground of personnel
for the electronics and computer industries because amateurs worked with the same technology that professionals worked with. Altair was a watershed personal computer in the sense that amateurs assembled it personally and that it transformed them personally into computer professionals. William Sims Bainbridge See also Alto FURTHER READING Freiberger, P., & Swaine, M. (1999). Fire in the valley: The making of the personal computer (2nd ed.). New York: McGraw-Hill. Roberts, H. E., & Yates, W. (1975). Altair minicomputer. Popular Electronics, 7(1), 33–38. Roberts, H. E., & Yates, W. (1975). Altair minicomputer. Popular Electronics, 7(2), 56–58. Mims, F. M. (1985, January). The tenth anniversary of the Altair 8800. Computers & Electronics, 23(1), 58–60, 81–82.
ALTO The Alto computer, developed at the Xerox Corporation’s Palo Alto Research Center (Xerox PARC) in the 1970s, was the prototype of the late twentieth-century personal computer. Input was by means of both keyboard and mouse; the display screen integrated text and graphics in a system of windows, and each computer could communicate with others over a local area network (LAN). The Alto was significant for human-computer interaction (HCI) in at least three ways. First, it established a new dominant framework for how humans would interact with computers. Second, it underscored the importance of theory and research in HCI. Third, the failure of Xerox to exploit Alto technology by gaining a dominant position in the personal computer industry is a classic case study of the relationship between innovators and the technology they create. During the late 1960s the Xerox Corporation was aware that it might gradually lose its dominant position in the office copier business, so it sought ways of expanding into computers. In 1969 it paid $920 million to buy a computer company named “Scientific
Data Systems” (SDS), and it established Xerox PARC near Stanford University in the area that would soon be nicknamed “Silicon Valley.” Xerox proclaimed the grand goal of developing the general architecture of information rather than merely producing a number of unconnected, small-scale inventions. The Alto was part of a larger system of software and hardware incorporating such innovations as object-oriented programming, which assembles programs from many separately created, reusable “objects,” the ethernet LAN, and laser printers. At the time computers were large and expensive, and a common framework for human-computer interaction was time sharing: Several users would log onto a mainframe or minicomputer simultaneously from dumb terminals, and it would juggle the work from all of the users simultaneously. Time sharing was an innovation because it allowed users to interact with the computer in real time; however, because the computer was handling many users it could not devote resources to the HCI experience of each user. In contrast, Alto emphasized the interface between the user and the machine, giving each user his or her own computer. In April 1973 the first test demonstration of an Alto showed how different using it would be from using the text-only computer terminals that people were used to when it began by painting on its screen a picture of the Cookie Monster from the television program Sesame Street. The Alto’s display employed bitmapping (controlling each pixel on the screen separately) to draw any kind of diagram, picture, or text font, including animation and pulldown menus. This capability was a great leap forward for displaying information to human beings, but it required substantial hardware resources, both in terms of memory size and processing speed, as well as radically new software approaches. During the 1970s the typical computer display consisted of letters, numbers, and common punctuation marks in a single crude font displayed on a black background in one color: white or green or amber. In contrast, the default Alto display was black on white, like printed paper. As originally designed, the screen was 606 pixels wide by 808 pixels high, and each of those 489,648 pixels could be separately controlled. The Xerox PARC researchers developed sys-
tems for managing many font sizes and styles simultaneously and for ensuring that the display screen and a paper document printed from it could look the same. All this performance placed a heavy burden on the computer’s electronics, so an Alto often ran painfully slow and, had it been commercialized, would have cost on the order of $15,000 each. People have described the Alto as a “time machine,” a computer that transported the user into the office of the future, but it might have been too costly or too slow to be a viable personal computer for the average office or home user of the period in which it was developed. Human-computer interaction research of the early twenty-first century sometimes studies users who are living in the future. This means going to great effort to create an innovation, such as a computer system or an environment such as a smart home (a computer-controlled living environment) or a multimedia classroom, that would not be practical outside the laboratory. The innovation then becomes a test bed for developing future systems that will be practical, either because the research itself will overcome some of the technical hurdles or because the inexorable progress in microelectronics will bring the costs down substantially in just a few years. Alto was a remarkable case study in HCI with respect to not only its potential users but also its creators. For example, the object-oriented programming pioneered at Xerox PARC on the Alto and other projects changed significantly the work of programmers. Such programming facilitated the separation between two professions: software engineering (which designs the large-scale structure and functioning of software) and programming (which writes the detailed code), and it increased the feasibility of dividing the work of creating complex software among many individuals and teams. People often have presented Alto as a case study of how short sighted management of a major corporation can fail to develop valuable new technology. On the other hand, Alto may have been both too premature and too ambitious. When Xerox finally marketed the Alto-based Star in 1981, it was a system of many small but expensive computers, connected to each other and to shared resources such as laser printers—a model of distributed personal
computing. In contrast, the model that flourished during the 1980s was autonomous personal computing based on stand-alone computers such as the Apple II and original IBM PC, with networking developing fully only later. The slow speed and limited capacity of the Alto-like Lisa and original 128-kilobyte Macintosh computers introduced by Apple in 1983 and 1984 suggest that Alto would really not have been commercially viable until 1985, a dozen years after it was first built. One lesson that we can draw from Alto’s history is that corporate-funded research can play a decisive role in technological progress but that it cannot effectively look very far into the future. That role may better be played by university-based laboratories that get their primary funding from government agencies free from the need to show immediate profits. On the other hand, Xerox PARC was so spectacularly innovative that we can draw the opposite lesson—that revolutions in human-computer interaction can indeed occur inside the research laboratories of huge corporations, given the right personnel and historical circumstances. William Sims Bainbridge See also Altair; Graphical User Interface FURTHER READING Hiltzik, M. (1999). Dealers of lightning: Xerox PARC and the dawn of the computer age. New York: HarperBusiness. Lavendel, G. (1980). A decade of research: Xerox Palo Alto Research Center. New York: Bowker. Smith, D. C., & Alexander, R. C. (1988). Fumbling the future: How Xerox invented, then ignored the first personal computer. New York: William Morrow. Waldrop, M. M. (2001). The dream machine: J. C. R. Licklider and the revolution that made computing personal. New York: Viking.
ANIMATION Animation, the creation of simulated images in motion, is commonly associated with cartoons, in which drawn characters are brought into play
to entertain. More recently, it has also become a significant addition to the rich multimedia material that is found in modern software applications such as the Web, computer games, and electronic encyclopedias.
Brief History Animations are formed by showing a series of still pictures rapidly (at least twelve images per second) so that the eye is tricked into viewing them as continuous motion. The sequence of still images is perceived as motion because of two phenomena, one optical (persistence of vision) and one psychological (the phi principle). Persistence of vision can be explained as the predisposition of the brain and eye to keep on seeing a picture even after it has moved out of the field of vision. In 1824 British scientist, physician, and lexicographer Peter Mark Roget (1779–1869) explained this phenomenon as the ability of the retina to retain the image of an object for 1/20 to 1/5 second after its removal; it was demonstrated two years later using a thaumatrope, a disk with images drawn on both sides that, when twirled rapidly, gives the illusion that the two images are combined to form one image. The other principle is the phi phenomenon, or stroboscopic effect. It was first studied by German psychologist Max Wertheimer (1880–1943) and German-American psycho-physiologist Hugo Munsterberg (1863–1916) during the period from 1912 to 1916. They demonstrated that film or animation watchers form a mental connection that completes the action frame-to-frame, allowing them to perceive a sequence of motionless images as an uninterrupted movement. This mental bridging means that even if there are small discontinuities in the series of frames, the brain is able to interpolate the missing details and thus allow a viewer to see a steady movement. In the nineteenth century, many animation devices, such as the zoetrope invented by William George Horner (1786–1837), the phenakistiscope (1832), the praxinoscope (1877), the flipbook, and the thaumatrope, were direct applications of the persistence of vision. For example, the zoetrope is a cylindrical device through which one can see an image in action. The rotating barrel has evenly spaced peepholes on the out-
side and a cycle of still images on the inside that show an image in successive stages of motion. When the barrel spins rapidly, the dark frames of the still pictures disappear and the picture appears to move. Another, even simpler example is the flipbook, a tablet of paper with a single drawing on each page. When the book is flicked through rapidly, the drawings appear to move. Once the basic principles of animation were discovered, a large number of applications and techniques emerged. The invention of these simple animation devices had a significant influence on the development of films, cartoons, computer-generated motion graphics and pictures, and, more recently, multimedia.
Walt Disney and Traditional Animation Techniques During the early to mid-1930s, animators at Walt Disney Studios created the “twelve animation principles” that became the basics of hand-drawn cartoon character animation. While some of these principles are limited to the hand-drawn cartoon animation genre, many can be adapted for computer animation production techniques. Here are the twelve principles:
1. Squash and stretch—Use shape distortion to emphasize movement.
2. Anticipation—Apply reverse movement to prepare for and bring out a forward movement.
3. Staging—Use the camera viewpoint that best shows an action.
4. Straight-ahead vs. pose-to-pose action—Apply the right procedure.
5. Follow-through and overlapping action—Avoid stopping movement abruptly.
6. Slow-in and slow-out—Allow smooth starts and stops by spacing frames appropriately.
7. Arcs—Allow curved motion in paths of action.
8. Secondary actions—Animate secondary actions to bring out even more life.
9. Timing—Apply time relations within actions to create the illusion of movement.
10. Exaggeration—Apply caricature to actions and timing.
11. Solid drawing—Learn and use good drawing techniques.
12. Appeal—Create and animate appealing characters.
Traditional animation techniques use cel animation in which images are painted on clear acetate sheets called cels. Animation cels commonly use a layering technique to produce a particular animation frame. The frame background layer is drawn in a separate cel, and there is a cel for each character or object that moves separately over the background. Layering enables the animator to isolate and redraw only the parts of the image that change between consecutive frames. There is usually a chief animator who draws the key-frames, the pivotal moments in the series of images, while in-between frames are drawn by others, the in-betweeners. Many of the processes and lingo of traditional cel-based animation, such as layering, key-frames, and tweening (generating intermediate frames between two images to give the appearance that the first image evolves smoothly into the next), have carried over into two-dimensional and three-dimensional computer animation.
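The mechanics of tweening can be illustrated with a short sketch. The following Python fragment is an illustrative example, not part of the original article; the key-frame values are invented. It generates the in-between positions connecting two key-frames by simple linear interpolation; production animation software applies the same idea to any animatable property, usually with eased rather than strictly linear timing.

# A minimal, hypothetical sketch of tweening: generating the in-between
# frames that connect two key-frames by linear interpolation.
def tween(key_a, key_b, num_inbetweens):
    """Return the interpolated (x, y) positions between two key-frames."""
    frames = []
    for i in range(1, num_inbetweens + 1):
        t = i / (num_inbetweens + 1)          # fraction of the way from A to B
        x = key_a[0] + (key_b[0] - key_a[0]) * t
        y = key_a[1] + (key_b[1] - key_a[1]) * t
        frames.append((x, y))
    return frames

# Example: a drawing moves from (0, 0) to (100, 40) with four in-betweens.
print(tween((0, 0), (100, 40), 4))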
Two-Dimensional Computer Animation In recent years, computer programs have been developed to automate the drawing of individual frames, the process of tweening frames between keyframes, and the animation of a series of frames. Animation techniques commonly used in two-dimensional (2D) computer animation are either frame-based or sprite-based. Frame-based animation is the simplest type of animation. It is based on the same principle as the flipbook: a collection of graphic files, each containing a single image, is displayed in sequence. Here again, to produce the illusion of motion, graphic images, each slightly different from the one before it in the sequence, are displayed at a high frame-rate (the number of frames of an animation displayed every second). Sprite-based animation uses a technique that is similar to the traditional animation technique in which an object is animated on top of a static graphic background. A sprite is any element of an animation that moves independently, such as a bouncing ball
or a running character. In sprite-based animation, a single image or sequence of images can be attached to a sprite. The sprite can animate in one place or move along a path. Many techniques—for example, tiling, scrolling, and parallax—have been developed to process the background layer more efficiently and to animate it as well. Sometimes sprite-based animation is called path-based animation. In path-based animation, a sprite is affixed to a curve drawn through the positions of the sprite in consecutive frames, called a motion path. The sprite follows this curve during the course of the animation. The sprite can be a single rigid bitmap (an array of pixels, in a data file or structure, that corresponds bit for bit with an image) that does not change or a series of bitmaps that form an animation loop. The animation techniques used by computers can be frame-by-frame, where each frame is individually created, or real-time, where the animator produces the key-frames and the computer generates the frames in between when the animation is displayed at run time. Two-dimensional computer animation techniques are widely used in modern software and can be seen in arcade games, on the Web, and even in word processors. The software packages used to design two-dimensional animations are animation studios, which allow animators to draw and paint cels, provide key-frames with moving backgrounds, use multiple layers for layering, support linking to fast video disk recorders for storage and playback, and allow scans to be imported directly. Examples of this software include Adobe Photoshop (to create animated GIFs), Macromedia Director (a multimedia authoring tool that includes sophisticated functions for animation), and Macromedia Flash (a vector-based authoring tool to produce real-time animation for the Web). Even some programming languages such as Java are used to produce good-quality animation (frame-by-frame and real-time) for the Web.
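As a concrete illustration of sprite-based and path-based animation, the following Python sketch is hypothetical; the class, file names, and frame rate are assumptions made for the example. It attaches a small animation loop of bitmaps to a sprite and moves the sprite along a motion path one frame at a time.

# A hypothetical sprite that cycles through an animation loop of bitmaps
# while following a motion path over a static background.
FRAME_RATE = 12  # frames per second; roughly the minimum for smooth motion

class Sprite:
    def __init__(self, bitmaps, path):
        self.bitmaps = bitmaps   # animation loop: a list of image identifiers
        self.path = path         # motion path: one (x, y) position per frame

    def frame(self, n):
        """Return what to draw on frame n: which bitmap, and where."""
        image = self.bitmaps[n % len(self.bitmaps)]   # cycle the animation loop
        x, y = self.path[n % len(self.path)]          # follow the motion path
        return image, x, y

ball = Sprite(bitmaps=["ball_up.png", "ball_mid.png", "ball_down.png"],
              path=[(10 * n, 50) for n in range(24)])  # drift left to right

for n in range(6):                       # the first half second at 12 fps
    print(round(n / FRAME_RATE, 2), ball.frame(n))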
Three-Dimensional Computer Animation Three-dimensional computer animations are based on a three-dimensional (3D) coordinate system, which is a mathematical system for describing
three-dimensional space. Space is measured along three coordinates, the X direction, Y direction, and Z direction. These coordinates correspond to the width, length, and depth of objects or space. The X, Y, Z coordinates of points in space are used to define polygons, and collections of polygons make up the definition of three-dimensional objects. The process of 3D animation involves at least the following stages: modeling, rendering, and animation. Modeling is the process of creating 3D objects from simple 2D objects by lofting (the process of transforming a two-dimensional cross-section object into a complete three-dimensional object) or from other simple 3D objects called “primitives” (spheres, cubes, cylinders, and so on). Primitives can be combined using a variety of Boolean operations (union, subtraction, intersection, and so on). They can also be distorted in different ways. The resulting model is called a mesh, which is a collection of faces that represent an object. Rendering is used to create an image from data that represents objects as meshes, and to apply colors, shading, textures, and lights to them. In its simplest form, the process of three-dimensional computer animation is very similar to the two-dimensional process of key-frames and tweening. The main differences are that three-dimensional animations are always vector-based and real-time. Spline-Based Animation Motion paths are more believable if they are curved, so animation programs enable designers to create spline-based motion paths. (Splines are algebraic representations of a family of curves.) To define spline-based curves, a series of control points is defined and then the spline is passed through the control points. The control points define the beginning and end points of different parts of the curve. Each point has control handles that enable designers to change the shape of the curve between two control points. The curves and the control points are defined in 3D space. Most computer animation systems enable users to change the rate of motion along a path. Some systems also provide very sophisticated control of the velocity of an object along paths.
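To suggest how a spline-based motion path is evaluated, the sketch below is illustrative Python with invented control points; real animation packages offer richer spline families. It computes points along a single cubic Bezier segment whose shape is set by two control points and their handles; spacing the parameter values unevenly is one way to vary the rate of motion along the path.

# A minimal sketch of one segment of a spline-based motion path: a cubic
# Bezier curve between two control points, shaped by two handles.
def bezier_point(p0, h0, h1, p1, t):
    """Evaluate a cubic Bezier at parameter t in [0, 1] for 3D points."""
    u = 1.0 - t
    return tuple(u**3 * a + 3 * u**2 * t * b + 3 * u * t**2 * c + t**3 * d
                 for a, b, c, d in zip(p0, h0, h1, p1))

# Control points and handles in 3D space (illustrative values only).
start, start_handle = (0, 0, 0), (2, 5, 0)
end_handle, end = (8, 5, 0), (10, 0, 0)

for i in range(5):                       # sample the path at five parameters
    t = i / 4
    print(t, bezier_point(start, start_handle, end_handle, end, t))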
Skeletal Structure Animation Skeletal structures are bone-based. Widely used to control three-dimensional creatures, they appear in practically all modern three-dimensional modeling software studios. They enable the artist to preset and control the rotation points of a three-dimensional creature, facilitating its animation. The animator can then model a geometric skin (representing how the creature would appear) and link it to the bone structure. Skeletal-structure software with powerful graphical interfaces provides rich environments in which artists can control the complex algorithms involved in creating animated three-dimensional creatures (human, animal, or imaginary). The availability of a skeletal animation environment characteristically brings another advantage—the exploitation of inverse kinematics (IK) to bring a character to life. Inverse Kinematics IK is a common technique for positioning multilinked objects, such as virtual creatures. When using an animation system capable of IK, a designer can position a hand in space by grabbing the hand and leading it to a position in that space. The connected joints rotate while remaining attached so that, for example, the body parts all stay connected. IK provides a goal-directed method for animating a 3D creature (see the sketch below). It allows the animator to control a three-dimensional creature’s limbs by treating them as kinematic chains. The points of control are attached to the ends of these chains and provide a single handle that can be used to control a complete chain. IK enables the animator to design a skeleton system that can also be controlled from data sets generated by a motion capture application. Motion Capture Motion capture is the digital recording of a creature’s movement for immediate or postponed analysis and playback. Motion capture for computer character animation involves the mapping of human action onto the motion of a computer character. The digital data recorded can be as simple as the position and orientation of the body in space, or as intricate as the deformations of facial expressions.
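The sketch referred to above illustrates inverse kinematics in its simplest form: an analytic solution for a two-link planar chain, such as an upper arm and forearm reaching a hand position. It is a hypothetical Python example with invented link lengths; production systems solve much longer kinematic chains, usually with iterative numerical methods.

# A hypothetical two-link planar IK solver: given link lengths and a target
# hand position, return the shoulder and elbow angles that reach it.
import math

def two_link_ik(l1, l2, x, y):
    """Return (shoulder_angle, elbow_angle) in radians for target (x, y)."""
    d2 = x * x + y * y
    # Law of cosines for the elbow; clamping handles unreachable targets.
    cos_elbow = max(-1.0, min(1.0, (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)))
    elbow = math.acos(cos_elbow)
    shoulder = math.atan2(y, x) - math.atan2(l2 * math.sin(elbow),
                                             l1 + l2 * math.cos(elbow))
    return shoulder, elbow

# Lead the "hand" to a point in space; the joint angles follow.
print([round(angle, 3) for angle in two_link_ik(1.0, 1.0, 1.2, 0.8)])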
Advances in Three-Dimensional Animation With the support of powerful computers, three-dimensional animation allows the production and rendering of a photo-realistic animated virtual world. Three-dimensional scenes are complex virtual environments composed of many elements and effects, such as cameras, lights, textures, shading, and environment effects, and all these elements can be animated. Although cel animation is traditionally two-dimensional, advances in three-dimensional rendering techniques and in camera animation have made it possible to apply three-dimensional techniques to make two-dimensional painted images appear visually three-dimensional. The 3D animation techniques described in this section are supported by modern 3D animation studios, software programs such as Maya (alias|wavefront), Softimage (Softimage), 3D Studio Max (Discreet), or Rhino3D (Robert McNeel & Associates). Examples of environment effects include rain, fire, fog, and dying stars. A technique widely used in real-time applications involving an environmental effect is called a particle system. A particle system is a method of graphically producing the appearance of amorphous substances, such as clouds, smoke, fire, or sparkles. The substance is described as a collection of particles that can be manipulated dynamically for animation effects. Some even more recent techniques include physics-based behavior such as the realistic animation of cloth, hair, or grass affected by the wind.
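A particle system of the kind just described can be sketched in a few lines. The following Python fragment is an illustrative, assumption-laden example; the emitter, constants, and frame rate are invented. It represents a burst of sparks as short-lived particles that are spawned, advanced one frame at a time under gravity, and discarded when their lifetimes expire.

# An illustrative particle-system sketch: an amorphous substance (a spark
# burst) represented as many short-lived particles updated every frame.
import random

GRAVITY = -9.8     # pulls particles downward (units per second squared)
DT = 1.0 / 24.0    # time step for a 24-frames-per-second animation

def spawn(n):
    """Emit n particles from the origin with random velocities and lifetimes."""
    return [{"pos": [0.0, 0.0],
             "vel": [random.uniform(-2, 2), random.uniform(2, 6)],
             "life": random.uniform(0.5, 1.5)}
            for _ in range(n)]

def update(particles):
    """Advance every particle one frame and drop the ones that have expired."""
    for p in particles:
        p["vel"][1] += GRAVITY * DT
        p["pos"][0] += p["vel"][0] * DT
        p["pos"][1] += p["vel"][1] * DT
        p["life"] -= DT
    return [p for p in particles if p["life"] > 0]

sparks = spawn(100)
for _ in range(48):                      # two seconds of animation
    sparks = update(sparks)
print(len(sparks), "particles still alive")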
Endless Possibilities Animation has become a ubiquitous component of human-computer interfaces. It has evolved from prehistoric paintings in the Altamira caves to realistic virtual worlds in sophisticated multimedia computers. The technologies supporting animation are still emerging and will soon support even more complex worlds, more realistic character animation, considerably easier 3D animation development, better-quality animations on the Web, and better interactions with virtual reality interfaces. The current
work in animation lies in physics-based modeling, in which objects or natural phenomena are animated according to their real physical properties; in real-time motion capture; and in goal-oriented animation. Considering the numerous applications of animation, from multimedia to archeology and chemistry, the future possibilities seem endless. Abdennour El Rhalibi and Yuanyuan Shen See also Data Visualization; Graphic Display; Graphical User Interface FURTHER READING CoCo, D. (1995). Real-time 3D games take off. Computer Graphics World, 8(12), 22–33. Corra, W. T., Jensen, R. J., Thayer, C. E., & Finkelstein, A. (1998). Texture mapping for cel animation. In Proceedings of SIGGRAPH ’98, Computer Graphics Proceedings, Annual Conference Series (pp. 435–446). Kerlow, I. V. (2000). The art of 3-D computer animation and imaging (2nd ed.). New York: Wiley. Lasseter, J. (1987). Principles of traditional animation applied to 3D computer animation. SIGGRAPH 87 (pp. 35–44). Maltin, L. (1987). Of mice and magic—A history of American animated cartoons. New York: Penguin Books. O’Rourke, M. (1995). Principles of three-dimensional computer animation. New York: W. W. Norton. Parent, R. (2001). Computer animation: Algorithms and techniques. San Francisco: Morgan-Kaufmann. Potter, C. D. (1995). Anatomy of an animation. Computer Graphics World, 18(3), 36–43. Solomon, C. (1994). The history of animation: Enchanted drawings. New York: Wings Books. Thomas, F., & Johnston, O. (1981). The illusion of life. New York: Abbeville Press. Watt, A., & Policarpo, F. (2001). 3D games—real-time rendering and software technology. New York: Addison-Wesley. Watt, A. H., & Watt, M. (1992). Advanced animation and rendering. New York: Addison-Wesley. Williams, R. (2001). The animator’s survival kit. New York: Faber & Faber.
ANTHROPOLOGY AND HCI As a social science that brings together social anthropology, linguistics, archaeology, and human biology, anthropology clearly has a major contribution to make to the study of human-computer interaction
(HCI). However, bringing that contribution into focus is at times a challenge, not only because of the extreme interdisciplinarity but also because of collaborations between anthropologists and computer scientists and the sometimes-blurred boundaries between anthropology and related disciplines, including sociology and psychology. Despite these challenges, anthropology has created distinctive methods and a distinctive epistemology, and has offered new insights for understanding human-computer interaction. Anthropology also poses three profound questions.
Methods Anthropology’s development of ethnographic methods is a notable contribution to research in HCI. More than simple naturalistic observation, ethnography is a structured process informed by theoretical models through which researchers attempt to elucidate the coherence of a context. For example, anthropologist Bonnie Nardi, in her study of end-user computing, used concepts of formalisms and communication to interpret how users developed their own programs; anthropologist Lucy Suchman used a mechanistic concept of cognition as a foil to understand how users interacted with an expert-system-based help facility embedded in a copying machine. In both these cases researchers combined intensive naturalistic observation with conceptual insights to develop new HCI models. A frequently employed variation on ethnographic methods is called ethnomethodology. As originally developed by sociologist Harold Garfinkel, ethnomethodology stipulates that individuals make sense out of a context in an ad hoc, almost indeterminate manner. In place of social order, the actors in a given context are synthesizing what appears to be order, accepting or rejecting information as it fits with their synthesis. The mutual intelligibility of an interaction is thus an ongoing achievement between the actors, a result rather than a starting point. Thus, two users can construct two quite different meanings out of similar interactions with computers, depending on the experiences they bring to the interaction. This suggests some obvious limitations on the abilities of computers to constrain or reproduce human actions.
A third concept, actually a method, employed by anthropologists in the study of HCI is actor-network theory. This theory views artifacts and social roles as coevolving nodes in a common network. Insofar as each node encodes information about the entire network (for example, in any country, electrical appliances are tailored to the specific power system of the country and the expectations of the users) and is capable of state changes based on network inputs, both artifacts and social roles can be considered to have agency within the network. This concept, originally developed by the sociologist Michel Callon, in his study of the French government’s involvement in technological projects, and elaborated by the sociologist John Law in a study of Portuguese sailing vessels in the sixteenth century, is very pertinent to rapidly changing technologies such as computers. Indeed, observing the shifting topology of the Internet and Internet computing makes it clear that user roles are anticipated and complemented by machine behavior (for instance, collaborative filtering), and machine states enable or constrain users’ agency within the network (for example, the structures of search engines). Although silicon and carbon units are distinct, for now, the image of the cyborg (cybernetic organism), and the emergence of integrated biological/computational systems, suggests other possibilities. This hints at the final, and perhaps most important anthropological contribution to HCI, the evolutionary perspective. All branches of anthropology have been concerned with the evolution of human societies, languages, and even genotypes. Although there is room for debate over the telos or chaos of evolutionary processes, understanding humans and their artifacts as goal-seeking objects who learn is fundamental to any anthropological viewpoint. Using the archaeological record and anthropological knowledge of societies with simpler toolkits, the anthropologist David Hakken has questioned the extent to which the widespread use of computers in society justifies being called a “revolution”; he concludes that due to their failure to transform the character of labor, computers are “just one more technology” in the implementation of an automated, massified Fordist model of production—a model inspired by Henry Ford in which large quantities of products are produced through the repetitive motions of unskilled workers.
Epistemology What distinguishes anthropology from other disciplines such as psychology and sociology that use similar methods is in many ways a matter of epistemology— that is, the stance it takes toward the subject matter. Central to this stance is the orthogonal view, that is, the ability to analyze a situation from a fresh and original yet plausible perspective. It is the orthogonal view that enables anthropologist Constance Perin to see office automation as a panopticon, that suggests to linguistic anthropologist Charlotte Linde that failed communication can improve performance, or that led Edwin Hutchins, a cognitive anthropologist, to understand a cockpit as a cognitive device. Orthogonal viewpoints originate from the experience of fieldwork, or rather, field immersion, preferably in a remote setting, which is the rite
of passage for most anthropologists. When researchers have lived for an extended period of time in an unfamiliar village, cut off from their normal social moorings, when cultural disorientation becomes embedded in their daily routine, they acquire a profound conviction that all social forms are conventional, that otherness is not alien, and that belonging and familiarity are rare and fragile flowers. It is this experience and this conviction more than any methodological or conceptual apparatus that defines anthropology and that enables the orthogonal view. It is this implicitly critical stance that has constrained anthropology’s contribution to the study of automation human factors. “Human factors” is an engineering discipline using engineering methods of analytic decomposition to solve engineering
A Personal Story—Eastern vs. Western Cultural Values My understanding of human communication using mediated technologies is primarily based on cultural assumptions. Cultural values could influence the way a person chooses his or her medium of communication. On the other hand, with the advancement of computer-mediated communication (CMC) technologies (e.g., e-mail, e-commerce sites, weblogs, bulletin boards, newsgroups), people could also change their communication patterns to suit the different forms of a medium. Whichever way, apparently, people will not adopt CMC unless and until it fits with their cultural values. Based on my interviews with a number of informants from different cultural backgrounds, I have observed some disparate yet interesting views on communication patterns and preferences, i.e., why and when people use CMC. Let me briefly illustrate one case of contrasting communication preferences and patterns. When I asked the informants from Eastern cultures why they would use CMC, one of the key responses was that they can express themselves better over mediated technologies than by voicing their opinions face-to-face. Public self-expression is avoided due to the value of “saving face.” Also, using an asynchronous medium such as e-mail does not require a spontaneous response. People could first think, reflect, and then express. On the contrary, the informants from Western cultures felt that using e-mail is best for complex and detailed information, as they require very explicit forms of instructions. Additionally, people send messages via CMC in order to get a quick response so that tasks can get completed. Also, based on a written format, the text becomes evidence, or “proof of say,” for a job accomplished. Getting a job or assignment done is perceived as a priority, and building a relationship is thus secondary. Cultural values could present a new lens to understand why and how a certain communication medium offers different functions or purposes. What is more important is the uniqueness of human beings with a set of cultural assumptions and values, and not the technological features. Anthropologist Edward T. Hall postulates that “communication is culture and culture is communication.” Hence, organizations need to understand fully the myriad cultural preferences before making a substantial investment in CMC technology. Without such understanding, technology will simply be another gadget that gets rusty and dusty! Norhayati Zakaria
Digital Technology Helps Preserve Tribal Language
(ANS)—The American Indian language of Comanche was once taught through conversation—a vocabulary passed on and polished as it moved from one generation to the next. But as fluency among Comanches declines, the tribe has turned to cutting-edge technology to preserve this indigenous language. By next winter, members hope to produce an interactive CD-ROM that will create a digital record of the language and help tribe members learn it. “You can’t say you’re Comanche without knowing your own language. That’s the way I feel,” said Billie Kreger of Cache, Okla., vice president of the Comanche Language and Cultural Preservation Committee. Kreger, 47, didn’t learn much Comanche as a child but has begun studying it in the past few years. Of the 10,000 Comanches that still remain in the United States, roughly 200 are fluent, according to Karen Buller, president and chief executive officer of the Santa Fe, N.M.-based organization that is paying for the CD-ROM project, the first of its kind in the United States. Tribe members are anxious to record the language while the fluent speakers, who are in their 70s and 80s, are still living, she said. Buller’s group, the National Indian Telecommunications Institute, is paying for the project with $15,000 in grant money from the Fund for the Four Directions. The CD-ROM will teach about 1,500 vocabulary words. Students will see Comanche elders pronouncing the words and hear the words used in conversations. Buller’s group is recording conversations on videotape. Other indigenous language revitalization efforts are under way around the country, too, including language immersion programs in Alaskan and Hawaiian schools. The institute provided teacher training for those projects. “All the tribes are saying, ‘We’ve got to save the language,’” said Leonard Bruguier, who heads the Institute
of American Indian Studies at the University of South Dakota in Vermillion. Students at that university, located in the midst of a large Sioux community, are increasingly interested in learning indigenous languages, he said. Under a federal policy of discouraging use of American Indian languages by allowing only English to be spoken by American Indian children at schools run by the Bureau of Indian Affairs, Comanche began faltering about 50 years ago. Without preservation efforts, researchers predict that 90 percent of the world’s languages, including those of the 554 American Indian tribes, will disappear in the next century, said Peg Thomas, executive director of The Grotto Foundation, a nonprofit organization in St. Paul, Minn., that provides funding to American Indian organizations. Each year about five languages fall into “extinction,” meaning that they have no youthful speakers, she said. According to some estimates, between 300 and 400 American Indian languages have become extinct since European settlers first arrived in North America. The point of preserving the languages is partly to maintain a connection to the past and learn the history of a culture, said Buller. Students of the Comanche language discover, for instance, that the words for food preparation are based on the root word for “meat”—because meat was a key part of the Comanche diet. She and others say that American Indian children who learn indigenous languages in addition to English appear to perform better in school. But language programs are targeting adults, too. Kreger, of the Comanche Language and Cultural Preservation Committee, says she is looking forward to using the CD-ROM for her own language studies. “I can hardly wait,” she said. Nicole Cusano Source: Digital technology helps preserve tribal language. American News Service, June 15, 2000.
problems—in other words, the improved performance of artifacts according to some preestablished set of specifications. Anthropology, by contrast, would begin by questioning the specifications, adopting a holistic point of view toward the entire project. Holism is the intellectual strategy of grasping the entire configuration rather than breaking it down into separate elements. From an anthropological viewpoint, specifications are not a given, but open to interrogation. A holistic viewpoint requires that the researcher adopt multiple disciplinary tools, including (but certainly not limited to) direct observation, interviewing, conversation analysis, engineering description, survey research, documentary study, and focus groups. For many, anthropology is highly interdisciplinary, assembling research tools as the contextualized problem requires. How far the anthropologist is permitted to go with this approach is one of the dilemmas of anthropologists working in software design. The emerging fields of “design ethnography” and “user-centered design” have employed ethnographers to better understand users’ requirements, and to elicit expert knowledge in the construction of expert systems. However, these efforts are at times compromised by a substantial disconnect between the anthropologists’ understanding of requirements and knowledge, and the engineers’ understanding of them. Anthropologists see human needs (that is, requirements) as emergent rather than given, and knowledge (even expert knowledge) as embedded in a culturally contingent body of assumptions called “common sense.” Many systems designers, as the late medical anthropologist Diana Forsythe put it, view common sense as unproblematic and universal. This assumption and others will be discussed below.
Insights The most important anthropological insight to HCI is the emphasis on context for understanding human behavior, including human interaction with cybernetic devices. The human organism is unique in its ability to integrate information from a variety of sensory inputs and to formulate an infinite array of potential behavioral responses to these inputs. These arrays of inputs and responses constitute—
that is, construct—the context of information, a structure of irreducible complexity. The context is far more than simply a compilation of information. Computers and other information technologies, by contrast, focus on the processing of information, stripping information of its contextual properties and thus of the attributes that humans use to turn information into (warranted, usable, and meaningful) knowledge. John Seely Brown, the former director of Xerox Palo Alto Research Center, and researcher Paul Duguid, for example, describe the importance of context for using information. “The news,” for instance, is not simply unfiltered information from a distant place; it is information that has been selected, aggregated, evaluated, interpreted, and warranted by human journalists, trained in face-to-face classrooms or mentored by over-the-shoulder coaches. Physicality is an important component of these relationships: Although people can learn technical skills online, they learn integrity and morality only interpersonally. Making a convincing case for the criticality of context for human users, Brown and Duguid describe six of the context-stripping mechanisms that are supposedly inherent in information technologies: demassification, decentralization, denationalization, despacialization, disintermediation, and disaggregation. “These are said to represent forces that, unleashed by information technology, will break society down into its fundamental constituents, primarily individuals and information” (Brown and Duguid 2000, 22). The sum of their argument is that such “6D thinking” is both unrealized and unrealizable. Information technology does not so much eliminate the social context of information, for this is either pointless or impossible, as it displaces and decomposes that context, thus posing new difficulties for users who need to turn information into knowledge. Contexts can be high (rich, detailed, and full of social cues) or low (impoverished and monochromatic), they can be familiar or unfamiliar, and they can include information channels that are broadband (a face-to-face conversation) or narrowband (reading tea leaves, for example). From a human perspective, all computer interaction, even the most multimedia-rich, is narrowband: Sitting where I am,
my computer screen and keyboard occupy no more than 25 percent of my field of vision, until I turn my head. Looking around, the percentage shrinks to under 5 percent. The other 95 percent is filled with other work and information storage devices (bookshelves and filing cabinets), task aids (charts on the wall), and reminders of relationships: a social context. As a low-context device, the computer must be supplemented by these other more social artifacts if it is to have human usefulness—that is, if it is to be used for knowledge work rather than mere information processing. Applying the concept of context to a specific technological problem, the design of intelligent systems, Suchman developed a concept of situated action as an alternative explanation for the rationality of human action. In place of seeing activity as the execution of a plan (or program), or inversely, seeing a plan as a retrospective rationalization of activity, Suchman’s concept of situated action sees plans as only one of several resources for making sense out of the ongoing flow of activity. Human action, or more accurately interaction (for all action is by definition social, even if only one actor is physically present), is an ongoing flow of message input and output. Traditionally social studies have assumed that actors have a scheme or mental program which they are enacting: a plan. In contrast to this, Suchman demonstrates that the rationality of an action is an ongoing construction among those involved in the action. The default state of this rationality is a transparent spontaneity in which the participants act rather than think. Only when the ongoing flow breaks down does it become necessary to construct a representation (that is, a plan or image) of what is happening. (Breakdowns, while frequent, are usually easily repaired.) Language, due to its ability to classify, is a powerful resource for constructing such representations, although it is only one of several channels that humans use for communication. Using language, the participants in an action understand what they are doing. Rationality (“understanding what they are doing”) is the achievement rather than the configuration state of interaction. The implications of this for constructing intelligent devices (such as expert systems) are profound. In order for an intelligent device to reproduce intelligi-
ble human action, according to Suchman, it must not attempt to anticipate every user state and response (for it cannot). Alternatively, a strategy of “real-time user modeling” that incorporates (a) continually updated models of user behavior, (b) detection (and adaptation to) diagnostic inconsistencies, (c) sensitivity to local conditions, and (d) learning from fault states (such as false alarms and misleading instructions) suggests a better approximation of situated action than preconceived “user models.” Suchman’s findings are based on the concept of “distributed cognition” originally developed by Edwin Hutchins. Instead of understanding cognition as information processing (searching, aggregating, parsing, and so on), Hutchins saw mental activity as contextually emergent, using contextual resources (including language and artifacts) as part of an interactive process. These insights are derived from efforts to use anthropological methods in the development of expert systems and other artificial intelligence devices. Expert systems hold out the hope that in classroom instruction, in routine bureaucratic problem solving, in medical diagnosis, and in other fields, certain low-level mental tasks could be accomplished by computers, in much the same manner as repetitive manual tasks have been automated. Building these systems requires a process of “knowledge acquisition” that is viewed as linear and unproblematic. An alternative view, suggested by anthropologist Jean Lave and computer scientist Etienne Wenger, is that learning is embedded in (and a byproduct of) social relationships and identity formation, and that people learn by becoming a member of a “community of practice.” The concept of “community of practice” is further developed by Wenger to describe how experts acquire, share, and use their expertise. Communities of practice are groups that share relationships, meaning, and identity around the performance of some set of tasks, whether processing insurance claims or delivering emergency medicine. The knowledge that they share is embedded in these relationships and identities, not something that can be abstracted and stored in a database (or “knowledge base”). Anthropologist Marietta Baba has applied these concepts along with the concept of “sociotechnical
systems” developed by the Tavistock Institute to examine the response of work groups to the introduction of office automation and engineering systems. At major corporations she found that efforts to introduce new automated systems frequently failed because they were disruptive of the work processes, social relationships, identities, and values of the work group, considered as a community of practice. Understanding cognitive activity as distributed among multiple agents is closely related to the issue of man/machine boundaries, an issue clearly of interest to anthropologists. “Cyborg anthropology” has been an ongoing professional interest at least since the 1991 publication of anthropologist Donna Haraway’s Simians, Cyborgs, and Women. Although most cyborg anthropology has focused on medical technology (such as imaging systems and artificial organs) rather than on computational technology, the basic concept—of human bodies and lives becoming increasingly embedded within automated information (control) circuits—will have increasing relevance for understanding the adaptation of humans to advanced information technology: As more and more human faculties, such as memory, skilled manipulation, and interpersonal sensitivity, are minimized, disaggregated, and shifted away from the individual organism to automated devices, the dependence of carbon-based humans on their artifactual prostheses will increase. Communities also form around technologies. Technology writer Howard Rheingold has described participation in a San Francisco-based online community as a form of community building. Hakken describes the influence of class on the experiences of users with computing in Sheffield, England. Sociologist Sherry Turkle describes the identity experimentation conducted by users of multiuser domains. Anthropologist Jon Anderson has examined how Middle Eastern countries have used and adapted the Internet with unique methods for unique social goals. These include the maintenance of diaspora relationships with countrymen scattered around the globe. Online communities quickly evolve (actually adapt from surrounding norms) distinctive norms, including styles of communication and categories of identity. Although such collections of norms and values fall short of full-fledged human cultures, they
are indicative of a propensity to create normative closure within any ongoing collectivity. Both these concepts, of work group cultures and online communities, point up the importance of culture for computing. As anthropology’s signature concept, culture has an important (if sometimes unstated) place in anthropological thinking about human-computer interaction.
Culture For anthropologists, culture is more profound than simply the attitudes and values shared by a population. As a system of shared understandings, culture represents the accumulated learning of a people (or a group), rooted in their history, their identity, and their relationship with other groups. Cultures evolve as shared projects with other groups. Although they are invented and imagined, cultures cannot be conjured up at will, as much of the recent management literature on corporate culture seems to suggest. This is significant, because much of computing use is in a corporate or organizational context (even if the organization is virtual). From an anthropological perspective, it is highly important to note that much of human-computer interaction is influenced either directly, by the regimes of instrumental rationality in which it takes place, or indirectly, by the fact that it follows protocols established by influential corporations. Several ethnographies of high-tech companies suggest that computerization and the high-tech expectations associated with it are creating new corporate cultures: sociologist Gideon Kunda and anthropologist Kathleen Gregory-Huddleston have described the working atmosphere of two high-tech corporations, noting that despite a technological aura and emancipatory rhetoric, their corporate cultures are still mechanisms of control. It should be noted that high tech is less an engineering concept for explaining functionality or performance than it is an aesthetic conceit for creating auras of power and authority. Others have taken note of the fact that computers create new forms of culture and identity and have described numerous microcultures that have sprung up around such systems as textual databanks, engineering design, and online instruction.
The culture of systems developers, as described by Diana Forsythe, is particularly notable. Insofar as developers and users have separate and distinctive cultural outlooks, there will be a mismatch between their tacit understandings of system functionality and system performance. The frequent experience of systems not living up to expectations when deployed in the field is less a consequence of poor engineering than of the fundamental cultural relationships (or disconnects) between developers and users. Finally, anthropology’s original interest in the remote and exotic has often taken its attention away from the laboratories and highly engineered environments in which the most advanced information technologies are found. In 2001 Allen Batteau, an industrial anthropologist, observed that many factories and field installations lack the reliable infrastructure of universities or development laboratories. As a consequence, computationally intensive applications that work so well in the laboratory (or in the movies) crash and burn in the field. This lack, however, is not simply a matter of these production environments needing to catch up to the laboratories: Moore’s Law for nearly forty years has accurately predicted a doubling of computational capability every eighteen months, a geometric growth that outstrips the arithmetic pace of technological diffusion. The dark side of Moore’s Law is that the gap between the technological capabilities of the most advanced regions and those of the remote corners of the human community will continue to grow. In 1995 Conrad Kottak, an anthropologist, observed that “High technology has the capacity to tear all of us apart, as it brings some of us closer together” (NSF 1996, 29). Many of these observations grew out of a workshop organized by the American Anthropological Association and the Computing Research Association called “Culture, Society, and Advanced Information Technology.” Held (serendipitously) at the time of the first deployment of graphical Web browsers (an event that as much as any could mark the beginning of the popular information revolution), this workshop identified seven areas of interest for social research in advanced information technology: (1) the nature of privacy, identity, and social roles in the new infor-
mation society; (2) family, work groups, and personal relationships; (3) public institutions and private corporations; (4) communities, both virtual and real; (5) public policy and decision-making; (6) the changing shapes of knowledge and culture; and (7) the globalization of the information infrastructure (NSF 1996). In many ways this workshop both captured and projected forward the anthropological research agenda for understanding the changing social face of advanced information technology.
Questions Anthropology’s orthogonal viewpoint poses several unique questions. Perhaps the first of these is the question of control versus freedom. On the one hand, cybernetic devices exist to create and integrate hierarchies of control, and the fifty-year history of the development of automation has demonstrated the effectiveness of this strategy. On the other hand, this poses the question of the proper role of a unique node in the control loop, the human user: How many degrees of freedom should the user be allowed? The designer’s answer, “No more than necessary,” can be unsatisfying: Systems that constrain the behavior of all their elements limit the users’ learning potential. The related concepts of system learning and evolution raise the second outstanding question, which has to do with the nature of life. Should systems that can evolve, learn from, and reproduce themselves within changing environments be considered “living systems”? Studies of artificial life suggest that they should. The possibility of a self-organizing system that can replicate itself within a changing environment has been demonstrated by computer scientist Christopher Langton, enlarging our perspective beyond the carbon-based naïveté that saw only biological organisms as living. The final question that this raises, which is the ultimate anthropological question, is about the nature or meaning of humanity. Etymologically, anthropology is the “science of man,” a collective term that embraces both genders, and possibly more. Anthropologists always anchor their inquiries on the question of “What does it mean to be human?” Otherwise, their endeavors are difficult to distinguish from com-
parative psychology, or comparative linguistics, or comparative sociology. However, the rise of information technology has fundamentally challenged some received answers to the question of what it means to be human. What are the human capabilities that computers will never mimic? As Pulitzer-prize-winning writer Tracy Kidder asked, Do computers have souls? Will there ever be a computer that meets the Turing test—that is, a computer that is indistinguishable from a fully social human individual? More specifically, how many generations are required to evolve a cluster of computers that will (unaided by human tenders) form alliances, reproduce, worship a deity, create great works of art, fall into petty bickering, and threaten to destroy the planet? As the abilities of silicon-based artifacts to think, feel, learn, adapt, and reproduce themselves continue to develop, the question of the meaning of humanity will probably become the most challenging scientific and philosophical question of the information age. Allen W. Batteau See also Ethnography; Sociology and HCI; Social Psychology and HCI
FURTHER READING Anderson, J. (1998). Arabizing the Internet. Emirates Occasional Papers # 30. Abu Dhabi, United Arab Emirates: Emirates Center for Strategic Studies and Research. Anderson, J., & Eickelman, D. (Eds.). (2003). New media in the Muslim world: The emerging public sphere (Indiana Series in Middle East Studies). Bloomington: Indiana University Press. Baba, M. L. (1995). The cultural ecology of the corporation: Explaining diversity in work group responses to organizational transformation. Journal of Applied Behavioral Science, 31(2), 202–233. Baba, M. L. (1999). Dangerous liaisons: Trust, distrust, and information technology in American work organizations. Human Organization, 58(3), 331–346. Batteau, A. (2000). Negations and ambiguities in the cultures of organization. American Anthropologist, 102(4), 726–740. Batteau, A. (2001). A report from the Internet2 ‘Sociotechnical Summit.’ Social Science Computer Review, 19(1), 100–105. Borofsky, R. (1994). Introduction. In R. Borofsky (Ed.), Assessing cultural anthropology. New York: McGraw-Hill. Brown, J. S., & Duguid, P. (2000). The social life of information. Boston: Harvard Business School Press. Callon, M. (1980). The state and technical innovation: A case study of the electrical vehicle in France. Research Policy, 9, 358–376.
Deal, T., & Kennedy, A. (1999). The new corporate cultures. Reading, MA: Perseus Books. Emery, F., & Trist, E. (1965). The causal texture of organizational environments. Human Relations, 18, 21–31. Forsythe, D. (2001). Studying those who study us: An anthropologist in the world of artificial intelligence. Stanford, CA: Stanford University Press. Garfinkel, H. (1967). Studies in ethnomethodology. Englewood Cliffs, NJ: Prentice-Hall. Gregory-Huddleston, K. (1994). Culture conflict with growth: Cases from Silicon Valley. In T. Hamada & W. Sibley (Eds.), Anthropological Perspectives on Organizational Culture. Washington, DC: University Press of America. Hakken, D. (1999).
Cyborgs@cyberspace?: An ethnographer looks to the future. New York: Routledge. Haraway, D. (1991). Simians, cyborgs, and women—The reinvention of nature. London: Free Association Books. Hutchins, E. (1994). How a cockpit remembers its speeds. Cognitive Science, 19, 265–288. Hutchins, E. (1995). Cognition in the wild. Cambridge, MA: MIT Press. Kidder, T. (1981). The soul of a new machine. Boston: Little, Brown. Kunda, G. (1992). Engineering culture: Control and commitment in a high-tech corporation. Philadelphia: Temple University Press. Langton, C. G. (Ed.). (1989). Artificial life (Santa Fe Institute Studies in the Sciences of Complexity, Proceedings, Volume 6). Redwood City, CA: Addison-Wesley. Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge, UK: Cambridge University Press. Law, J. (1987). Technology and heterogeneous engineering: The case of Portuguese expansion. In W. E. Bijker, T. P. Hughes, & T. Pinch (Eds.), The social construction of technological systems (pp. 111–134). Cambridge, MA: MIT Press. Linde, C. (1988). The quantitative study of communicative success: Politeness and accidents in aviation discourse. Language in Society, 17, 375–399. Moore, G. (1965, April 19). Cramming more components onto integrated circuits. Electronics. Nardi, B. A. (1993). A small matter of programming: Perspectives on end user computing. Cambridge, MA: MIT Press. National Science Foundation. (1996). Culture, society, and advanced information technology (Report of a workshop held on June 1–2, 1995). Washington, DC: U. S. Government Printing Office. Perin, C. (1991). Electronic social fields in bureaucracies. Communications of the ACM, 34(12), 74–82. Rheingold, H. (1993). The virtual community: Homesteading on the electronic frontier. Reading, MA: Addison-Wesley. Star, S. L. (Ed.). (1995). The cultures of computing. Oxford, UK: Blackwell Publishers. Stone, A. R. (1995). The war of desire and technology at the close of the mechanical age. Cambridge, MA: MIT Press. Suchman, L. (1987). Plans and situated actions: The problem of human-machine communication. Cambridge, UK: Cambridge University Press. Turkle, S. (1995). Life on the screen: Identity in the age of the Internet. New York: Simon & Schuster. Wenger, E. (1998). Communities of practice: Learning, meaning, and identity. Cambridge, UK: Cambridge University Press.
ANTHROPOMETRY The word anthropometry, which means “the measurement of the physical characteristics and physical abilities of people,” is derived from the Greek words anthropo meaning “human being” and metry meaning “measure.” Physical characteristics, also called “structural dimensions,” include such aspects as heights, widths, depths, and body segment circumferences. Physical abilities, also called “functional dimensions,” include such aspects as grip, push and pull strength, reaching capabilities, fields of vision, and functional task performance. Anthropologists, clinicians, and engineers use anthropometric information in a variety of ways. For engineers, in particular, anthropometry provides information that can be used for the design of occupational, public, and residential environments. The information can also be used for the design of tools, protective headgear, clothing, and workstation equipment. Doorway widths, tool handle lengths and circumferences, ranges of clothing sizes, and the location of displays and controls on workstations are some of the design applications. Anthropometry also provides information about body segment center of mass and joint center of rotation characteristics that is used for biomechanical modeling (the study of joint forces and torques on the body).
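As a hedged illustration of how anthropometric segment data can feed a biomechanical model, the short Python sketch below estimates the static torque about the elbow when the forearm and hand are held horizontally, optionally with a load in the hand. The segment mass, center-of-mass distance, and load values are invented for the example and are not taken from this article.

# A minimal static-torque sketch: torque = weight x horizontal distance.
G = 9.81  # gravitational acceleration, m/s^2

def static_elbow_torque(segment_mass, com_distance, load_mass=0.0, load_distance=0.0):
    """Torque (N*m) about the elbow for a horizontal forearm and hand."""
    torque_segment = segment_mass * G * com_distance  # segment weight acting at its center of mass
    torque_load = load_mass * G * load_distance       # any object held in the hand
    return torque_segment + torque_load

# Example: a 1.5 kg forearm-plus-hand with its center of mass 0.18 m from
# the elbow, holding a 2 kg object 0.35 m from the elbow.
print(round(static_elbow_torque(1.5, 0.18, 2.0, 0.35), 2), "N*m")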
A Brief History Although anthropometry was applied when Greek and Egyptian artists created standards (canons) for the human form centuries ago, not until the nineteenth century were thought and dialogue on anthropometry organized. Early work in anthropometry focused on the human anatomy, racial characteristics, skeletal remains, and human growth. Among the noteworthy work documented by physical anthropologist Ales Hrdlicka was that of French anthropologist Paul Pierre Broca and the Belgian scientist Adolphe Quetelet. During the mid-nineteenth century Quetelet used statistics to describe anthropometric information. Shortly after the Franco-Prussian War of 1870, a growing emphasis on individualism was evident in the proposals of the
German scholar Rodolpho von Ihering. These proposals called upon German anatomists and anthropologists to reinvestigate craniometric (relating to measurement of the skull) and anthropometric measurement methods. The German Anthropological Society convened in Munich and Berlin during the 1870s and early 1880s to establish what anthropometrist J. G. Garson and others have called the “Frankfort Agreement” of 1882. This agreement introduced craniometric methods distinct from the predominant French methods and established a new nomenclature and measurement methods. The existence of the French and German schools only further cemented the belief that international consensus on methods, nomenclature, and measurements was needed. During the early twentieth century people attempted to develop an international consensus on the nomenclature of body dimensions and measurement methods. In 1906, at the Thirteenth International Congress of Prehistoric Anthropology and Archaeology in Monaco, an international agreement on anthropometry took form. This congress and the Fourteenth International Congress in Geneva, Switzerland, in 1912 began to formalize the body of anthropometric work. The foundations of a normative framework and a standardization of anthropometric measurement had been laid and translated into French, German, and English by 1912. This framework standardized anthropometric measurements on both skeletal and living human subjects. Since 1912 several works by Hrdlicka, Rudolf Martin, and James Gavan have increased the awareness of anthropometry and its uses and added to its scientific rigor. After the initial congresses, people attempted to establish consensus throughout the twentieth century. Congresses meeting under the name of Hrdlicka convened on the topic of anthropometry and measurement methods. Other congresses aimed to create standards and databases for general use. During the late twentieth century authors such as Bruce Bradtmiller and K. H. E. Kroemer chronicled these congresses and offered unique ways to manage anthropometric data. During recent years the International Organization for Standardization (ISO) technical committee on ergonomics published ISO
7250: Basic Human Body Measurements for Technical Design (1996) to standardize the language and measurement methods used in anthropometry and ISO 15535: General Requirements for Establishing an Anthropometric Database (2003) to standardize the variables and reporting methods of anthropometric studies.
Structural Anthropometric Measurement Methods Structural anthropometric measurement methods require a person to be measured while standing or sitting. Anatomical landmarks—observable body features such as the tip of the finger, the corner of the eye, or the bony protrusion of the shoulder known as the “acromion process”—standardize the locations on the body from which measurements are made. The desire to achieve consistent measurements has led to the use of standardized measurement postures held by people who are being measured. The anthropometric standing posture requires the person to hold the ankles close together, standing erect, arms relaxed and palms facing medially (lying or extending toward the median axis of the body) or anteriorly (situated before or toward the front), the head erect and the corners of the eyes aligned horizontally with the ears. The anthropometric seated posture requires the person to be seated erect on a standard seating surface. The elbows and knees are flexed 90 degrees. The palms face medially with the thumb superior (situated above or anterior or dorsal to another and especially a corresponding part) to the other digits. Structural dimensions include the distances between anatomical landmarks, the vertical distance from a body landmark to the floor, and the circumferences of body segments and are measured with a variety of instruments. Among the most common instruments is the anthropometer, which is a rod and sliding perpendicular arm used to measure heights, widths, and depths. A spreading caliper having two curved arms that are hinged together is sometimes used to measure segment widths and depths defined by the distance between the tips of the arms. Graduated cones are used to measure grip
circumferences, and tape measures are used to measure other circumferences such as the distance around the waist. Scales are used to measure body weight. Photographs and video are used to measure body dimensions in two dimensions. One method uses grids that are attached behind and to the side of the person measured. Photographs are then taken perpendicular to the grids, and the space covered by the person in front of the grids can be used to estimate body segment heights, widths, and depths. A variant of this method uses digital photography for which an anthropometric measurement is obtained by comparing the number of pixels (small discrete elements that together constitute an image, as in a television or computer screen) for a dimension to the number of pixels of a reference object also located in the digital photograph. Attempts to develop three-dimensional computer human models with conventional anthropometric data reveal that limitations exist, such as the uncertainty about three-dimensional definition of key points on the body surface, locations of circumferences, and posture. These limitations have resulted in the development of more sophisticated three-dimensional anthropometric measurement methods. Digital anthropometry is the use of digital and computerized technology in the collection of information about body size and physical ability. In this use, computers are responsible for the actual collection of anthropometric data and are not relegated solely to data analysis or storage. Digital anthropometry varies greatly from conventional anthropometry. This variation has changed the nature of anthropometry itself for both the anthropometrist and the experimental context in which measurements are taken. Human factors engineer Matthew Reed and colleagues have identified some of the potential benefits of digital anthropometry:
■ The capacity to assemble more accurate models of human form, dimensions, and postures
■ The capacity to evaluate multiple body dimensions simultaneously
■ The capacity to measure the human and the environment together
■ The improved description of joint centers of rotation and movement in three dimensions
■ The capacity to make corrections to dimensions or create new dimensions after measurements have been recorded
Laser scanning is often used in digital anthropometry because it allows excellent resolution of the morphological (relating to the form and structure of an organism or any of its parts) features of the human body and can be completed rapidly. Laser scanning produces accurate three-dimensional representations of the complex body surfaces, and most protocols (detailed plans of a scientific or medical experiment, treatment or procedure) require the placement of surface markers on the body to ensure the proper location of bony protrusions that are used as measurement landmarks beneath the surface of the skin. Other protocols using laser scans have morphological extraction algorithms (procedures for solving a mathematical problem in a finite number of steps that frequently involve repetition of an operation) to estimate landmark locations based on morphological features. Potentiometry can also be used to collect digital anthropometric measurements. Electromechanical potentiometric systems allow the measurer to manually digitize points in three-dimensional space. The measurer guides a probe tip manually to render discrete points or body surface contours.
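The photographic pixel-comparison method described above amounts to a simple proportion. The sketch below illustrates the idea; the pixel counts and the reference length are hypothetical and not drawn from any survey or system mentioned in this article.

def pixels_to_mm(measured_pixels, reference_pixels, reference_length_mm):
    # Scale a pixel count by the known length of a reference object in the same photograph.
    return measured_pixels * (reference_length_mm / reference_pixels)

# Hypothetical values: a 300 mm ruler spans 450 pixels and the body segment spans 1200 pixels.
segment_length_mm = pixels_to_mm(1200, 450, 300.0)
print(round(segment_length_mm, 1))  # 800.0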
Functional Performance Measurements Conventional functional performance measurements include grip, push, and pull strength, and reaching abilities. For grip strength measurement, an individual is required to squeeze for several seconds at maximum effort a hand dynamometer (a force measurement device) set at one or more grip circumferences. For the measurement of push and pull strength, an individual usually holds a static (unchanging) posture while either pushing on or pulling against a force gauge at a maximum effort over several seconds. An individual’s reaching abilities can be evaluated with a number of methods, including
those that employ photography and potentiometry as described above, or methods that require an individual to mark with a hand-held pen or pencil the maximum or comfortable reach locations on a vertical or horizontal grid. Electromagnetic and video-based motion analysis systems provide new measures of physical abilities related to the way people move (kinematics) and can be used with other types of instrumentation, such as force plates (hardware that measure the force applied to it), to provide biomechanical (the mechanics of biological and especially muscular activity) information or measures of balance. These systems allow positions of body landmarks to be tracked over time during a physical activity. The data can be evaluated statistically or can serve as an example of a human task simulation. Such methods of data collection allow more lifelike dynamic digital human models that can be used to evaluate human performance in virtual environments. However, use of these methods is expensive and time consuming.
Measurement Consistency and Variation Anthropometric measurements are recordings of body dimensions and physical abilities that are subject to variability. No “correct” measurement exists because a measurement is simply an observation or recording of an attribute that is the cumulative contribution of many factors. Anthropometric studies have investigated the topic of measurement consistency in relation to intrinsic qualities of variability within a given measurement. J. A. Gavan (1950) graded anthropometry dimensions in terms of consistencies seen through expert anthropometrists and concluded that “consistency increased as: the number of technicians decreased, the amount of subcutaneous [under the skin] tissue decreased, the experience of the technician increased, and as the landmarks were more clearly defined” (Gavan 1950, 425). Claire C. Gordon and Bruce Bradtmiller (1992), Charles Clauser and associates (1998), Gordon and associates (1989), and others have also studied intra- and interobserver error contributions in anthropometric measurements,
including the contributions of different measurement instruments and the effects of breathing cycles. Other researchers, such as Katherine Brooke-Wavell and colleagues (1994), have evaluated the reliability of digital anthropometric measurement systems. These evaluations have brought about an awareness of anthropometric reliability and error as well as acceptable levels of reliability. Anthropometric data typically are collected for large samples of populations to capture distributional characteristics of a dimension so that it is representative of a target population. Many sources of anthropometric variability exist within populations. Men and women differ greatly in terms of structural and functional anthropometric dimensions. Additionally, the anthropometric dimensions of people have changed systematically through time. Today’s people are generally taller and heavier than those of previous generations, perhaps because of improved availability and nutrition of food in developed countries. Of course, a person’s body size also changes through time, even throughout the course of adulthood. As a person ages, for example, his or her height decreases. Other sources of anthropometric variability include ethnicity, geography, and occupational status. The distribution characteristics of an anthropometric dimension are often reported for different categories of age and gender, and sometimes for different ethnicities or countries. Because the variability of anthropometric dimensional values within such subgroups often takes the shape of a Gaussian (bell-shaped) distribution, the mean and standard deviation of the sample data are often used to describe the distributional characteristics of a dimension. The percentile value—the value of a dimension that is greater than or equal to a certain percentage of a distribution—also provides useful information. For example, the fifth and ninety-fifth percentiles of a dimensional value define the outer boundaries of the 90 percent midrange of a population distribution that might enable a designer to develop an adjustable consumer product or environment feature that can accommodate 90 percent or more of the target population. Multivariate data analysis includes the use of correlation and regression analyses, as well as human
modeling methods. The correlation between two dimensions provides a measure of how strongly two dimensions covary linearly. When two measurements are highly correlated the values of one measurement can be used to predict the values of another in a regression analysis, therefore reducing the total number of measurements needed to construct a comprehensive set of anthropometric tables and human models based on partially extrapolated data. When combinations of anthropometric dimensions are considered simultaneously in the evaluation of a product or environment, mockups and task trialing involving people or simulation approaches using digital human modeling of people are required.
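As a worked illustration of the univariate statistics described above, the boundaries of the 90 percent midrange can be computed directly from the mean and standard deviation when a dimension is approximately normally distributed. The numbers below are hypothetical and are not values from any published survey.

# Fifth and ninety-fifth percentiles of a normally distributed dimension,
# computed from hypothetical sample statistics (z = 1.645 bounds the central 90 percent).
mean_stature_mm = 1755.0
sd_stature_mm = 71.0
z = 1.645

p5 = mean_stature_mm - z * sd_stature_mm
p95 = mean_stature_mm + z * sd_stature_mm
print(f"5th percentile:  {p5:.0f} mm")   # about 1638 mm
print(f"95th percentile: {p95:.0f} mm")  # about 1872 mm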
Important Data Sources
The most comprehensive anthropometric studies have focused on military personnel, at least in part because the military needs such information to provide well-designed uniforms, equipment, land vehicles, and aircraft. Perhaps one of the most comprehensive studies was the 1988 U.S. Army Anthropometric Survey (ANSUR), which summarized 132 dimensions of approximately nine thousand army personnel. One of the most inclusive sources of civilian anthropometric data is a U.S. National Aeronautics and Space Administration (NASA) technical report produced by the staff of the Anthropology Research Project in 1978. This report contains anthropometric data across a variety of civilian and military populations for a large number of anthropometric variables, including information about the mass distribution of body segments. More recently, the Civilian American and European Surface Anthropometry Resource (CAESAR) project used laser scanning to collect the body surface contours and sizes of approximately twenty-four hundred North American and two thousand European civilians from 1998 to 2000. Measurements were recorded with people in standing, standardized seated, and relaxed seated postures. Thousands of points that define the location of the body’s surface were collected with each scan, providing extremely accurate three-dimensional representations of the body surface contours for individual human
models that can be used to evaluate the fit of a product or of the person in an environment. Because markers are also placed over key body landmarks, conventional descriptive analysis of dimensions has also been performed. CAESAR is the largest and most valuable anthropometric data source of its kind.
Using Anthropometric Data in Design
Conventional use of anthropometric data in design requires determining (1) the population for which a design is intended, known as the “target population,” (2) the critical dimension or dimensions of the design, (3) an appropriate anthropometric data source, (4) the percentage of the population to be accommodated by the design, (5) the portion of the distribution that will be excluded, usually the largest and/or smallest values of the distribution, and (6) the appropriate design values through the use of univariate or bivariate statistical methods. Conventional application of anthropometric data, however, is not able to address the design problems that require the evaluation of many design characteristics simultaneously. Multivariate analysis using mockups and task trialing requires recruiting people with the desired range of body size and ability and assessing human performance during the simulation, such as judging whether people can reach a control or easily see a display for a particular design. Static and dynamic digital human modeling approaches require manipulating models of various sizes in virtual environments to assess the person-design fit. Analysis methods for dynamic digital human modeling approaches are still in their infancy due to the limited number of studies recording the needed information and the complicated nature of the data. A variety of fields uses anthropometric data, including anthropology, comparative morphology, human factors engineering and ergonomics, medicine, and architectural design. Additionally, digital anthropometry has been used outside of scientific and research endeavors, as seen in the application of a new suit-making technology for Brooks Brothers (known as “digital tailoring”). The International Organization for Standardization has issued numerous publications
that apply anthropometric data to the development of design guidelines. These publications include ISO 14738 Safety of Machinery—Anthropometric Requirements for the Design of Workstations at Machinery (2002), ISO 15534 Ergonomic Design for the Safety of Machinery (2000), and ISO 9241 Documents on the Ergonomic Requirements for Office Work with Visual Display Terminals (1992–2001). The latter publications were developed to improve the fit between people and their computers at work.
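The univariate procedure outlined above can be sketched in a few lines. The survey statistics and allowances below are hypothetical and illustrative only; they are not taken from ANSUR, CAESAR, or any ISO document.

# Hypothetical statistics. For a clearance dimension (e.g., a doorway) the design value is set
# by the larger users; for a reach dimension it is set by the smaller users.
def percentile_value(mean, sd, z):
    return mean + z * sd

stature = (1755.0, 71.0)        # mean and standard deviation, mm
forward_reach = (735.0, 45.0)   # mean and standard deviation, mm

doorway_height = percentile_value(*stature, z=1.645) + 50.0    # 95th percentile plus an allowance
control_distance = percentile_value(*forward_reach, z=-1.645)  # 5th percentile reach
print(f"Doorway clearance: {doorway_height:.0f} mm, control placement: {control_distance:.0f} mm")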
Future Research
A major challenge of future research is how to summarize and interpret the information-rich but complex three-dimensional data that accompany the new methods of measurement described here. New methods of three-dimensional measurement of body dimensions such as whole-body scanning provide new opportunities to move conventional univariate anthropometric applications to complete three-dimensional static human models that can be used to evaluate design in new ways. Motion analysis methods in dynamic human modeling also provide a powerful tool to improve our understanding of the functional abilities of people. The reliability, accuracy, and applications of many of these anthropometric measurement methods, however, have yet to be fully explored. Perhaps what is most needed is simply more information about the physical dimensions and abilities in more diverse user groups. Lack of anthropometric information severely limits the use of anthropometry in the design of living and working spaces that can be used by diverse populations. U.S. government agencies, particularly the U.S. Architectural and Transportation Barriers Compliance Board (Access Board) and the U.S. Department of Education’s National Institute on Disability and Rehabilitation Research (NIDRR), recently have started to address the information gap by studying the physical abilities of people with disabilities, such as people who use wheelchairs. However, much work remains to be done. In particular, the need for anthropometric data to inform the design of occupational, public, and residential environments of the elderly is expected
to increase substantially as the proportion of the elderly in the population continues to grow during the years to come.
Victor Paquet and David Feathers
See also Motion Capture
FURTHER READING
Annis, J. F. (1989). An automated device used to develop a new 3-D database for head and face anthropometry. In A. Mital (Ed.), Advances in industrial ergonomics and safety (pp. 181–188). London: Taylor & Francis.
Annis, J. F., Case, H. W., Clauser, C. E., & Bradtmiller, B. (1991). Anthropometry of an aging work force. Experimental Aging Research, 17, 157–176.
Brooke-Wavell, K., Jones, P. R. M., & West, G. M. (1994). Reliability and repeatability of 3-D body scanner (LASS) measurements compared to anthropometry. Annals of Human Biology, 21, 571–577.
Clauser, C., Tebbetts, I., Bradtmiller, B., McConville, J., & Gordon, C. (1998). Measurer’s handbook: U.S. Army anthropometric survey (Technical Report No. TR-88/043). Natick, MA: U.S. Army Natick Research, Development and Engineering Center.
Damon, A., & Stout, H. (1963). The functional anthropometry of old men. Human Factors, 5, 485–491.
Dempster, W. T., Gabel, W. C., & Felts, W. J. L. (1959). The anthropometry of the manual work space for the seated subject. American Journal of Physical Anthropometry, 17, 289–317.
Eastman Kodak Company. (2003). Ergonomic design for people at work (2nd ed.). New York: Wiley.
Garson, J. (1885). The Frankfort Craniometric Agreement, with critical remarks thereon. Journal of the Anthropological Institute of Great Britain and Ireland, 14, 64–83.
Gavan, J. (1950). The consistency of anthropometric measurements. American Journal of Physical Anthropometry, 8, 417–426.
Gordon, C., & Bradtmiller, B. (1992). Interobserver error in a large scale anthropometric survey. American Journal of Human Biology, 4, 253–263.
Gordon, C., Bradtmiller, B., Clauser, C., Churchill, T., McConville, J., Tebbetts, I., & Walker, R. (1989). 1987–1988 anthropometric survey of U.S. Army personnel: Methods and summary statistics (Technical Report No. TR-89/027). Natick, MA: U.S. Army Natick Research, Development and Engineering Center.
Haddon, A. (1934). The history of anthropology. London: Watts & Co.
Hobson, D., & Molenbroek, J. (1990). Anthropometry and design for the disabled: Experiences with seating design for cerebral palsy population. Applied Ergonomics, 21(1), 43–54.
Hoekstra, P. (1997). On postures, percentiles and 3D surface anthropometry. Contemporary Ergonomics (pp. 130–135). London: Taylor & Francis.
Hrdlicka, A. (1918). Physical anthropology; its scope and aims, etc. American Journal of Physical Anthropometry, 1, 3–23.
International Organization for Standardization. (Ed.). (1992–2003). Ergonomics requirements for office work with visual display terminals (VDTs) (ISO Standard 9241). Geneva, Switzerland: International Organization for Standardization.
International Organization for Standardization. (Ed.). (1996). Basic human body measurements for technical design (ISO Standard 7250). Geneva, Switzerland: International Organization for Standardization.
International Organization for Standardization. (Ed.). (2000). Ergonomic design for the safety of machinery (ISO Standard 15534). Geneva, Switzerland: International Organization for Standardization.
International Organization for Standardization. (Ed.). (2002). Safety of machinery—Anthropometric requirements for the design of workstations at machinery (ISO Standard 14738). Geneva, Switzerland: International Organization for Standardization.
International Organization for Standardization. (Ed.). (2003). General requirements for establishing an anthropometric database (ISO Standard 15535). Geneva, Switzerland: International Organization for Standardization.
Kroemer, K. H. E., Kroemer, H. J., & Kroemer-Elbert, K. E. (1997). Engineering anthropometry. In K. H. E. Kroemer (Ed.), Engineering physiology (pp. 1–60). New York: Van Nostrand Reinhold.
Marras, W., & Kim, J. (1993). Anthropometry of industrial populations. Ergonomics, 36(4), 371–377.
Molenbroek, J. (1987). Anthropometry of elderly people in the Netherlands: Research and applications. Applied Ergonomics, 18, 187–194.
Paquet, V. (Ed.). (2004). Anthropometry and disability [Special issue]. International Journal of Industrial Ergonomics, 33(3).
Paquette, S., Case, H., Annis, J., Mayfield, T., Kristensen, S., & Mountjoy, D. N. (1999). The effects of multilayered military clothing ensembles on body size: A pilot study. Natick, MA: U.S. Soldier and Biological Chemical Command Soldier Systems Center.
Reed, M., Manary, M., Flannagan, C., & Schneider, L. (2000). Effects of vehicle interior geometry and anthropometric variables on automobile driving posture. Human Factors, 42, 541–552.
Reed, M., Manary, M., & Schneider, L. (1999). Methods for measuring and representing automobile occupant posture (SAE Technical Paper No. 1999-01-0959). Warrendale, PA: Society of Automotive Engineers.
Robinette, K. (1998). Multivariate methods in engineering anthropometry. In Proceedings of the Human Factors and Ergonomics Society 42nd annual meeting (pp. 719–721). Santa Monica, CA: Human Factors and Ergonomics Society.
Robinette, K. (2000). CAESAR measures up. Ergonomics in Design, 8(3), 17–23.
Roebuck, J., Kroemer, K. H. E., & Thomson, W. (1975). Engineering anthropometry methods. New York: Wiley.
Steenbekkers, L., & Molenbroek, J. (1990). Anthropometric data of children for non-specialized users. Ergonomics, 33(4), 421–429.
Ulijaszek, S., & Mascie-Taylor, C. G. N. (Eds.). (1994). Anthropometry: The individual and the population. Cambridge, UK: Cambridge University Press.
APPLICATION USE STRATEGIES Strategies for using complex computer applications such as word-processing programs and computeraided drafting (CAD) systems are general and goaldirected methods for performing tasks. These strategies are important to identify and learn because they can make users more efficient and effective in completing their tasks, and they are often difficult to acquire just by knowing commands on an interface. To understand strategies for using computer applications, consider the task of drawing three identical arched windows in a CAD system. As shown in Figure 1A, one way to perform the task is to draw all the arcs across the windows, followed by drawing all the vertical lines, followed by drawing all the horizontal lines. Another way to perform the same task (Figure 1B) is to draw all the elements of the first window, group the elements and then make three copies of the grouped elements. The first method is called sequence-by-operation because it organizes the drawing task by performing one set of identical operations (in this case draw arc), followed by performing the next set of similar operations (in this case draw line). The second method is called detail-aggregate-manipulate because it organizes the task by first detailing all the elements of the first object (in this case drawing the parts of the first window), aggregating the elements of the first object (in this case grouping all the parts of the first window), and then manipulating that aggregate (in this case making two copies of the grouped elements of the first window). Both the methods are strategies because they are general and goal-directed. For example, the detail-aggregatemanipulate strategy is general because it can be used to create multiple copies of sets of objects in a wide range of applications. The above example was for a CAD application, but the same strategy could be used to create many identical paragraphs for address labels in a word-processing application, such as Microsoft Word. The detail-aggregate-
manipulate strategy is also goal-directed because it can be used to complete the task of drawing three arched windows. The definition of a strategy given above subsumes more limited strategy definitions used in fields as diverse as business management and cognitive psychology. These definitions may be stated in terms of time (they may define strategy as a long-term plan for achieving a goal), the existence of alternate methods (they may consider a strategy to be any method that is nonobligatory), or performance outcomes (they may define a strategy as a method that results in a competitive advantage). However, excluding these particulars (time, existence of alternate methods, and performance outcomes) from the definition of strategy enables us to describe strategies in a more encompassing way, irrespective of whether they are short term or long term, unique or one of many, or efficient or inefficient.
The Costs and Benefits of Using Strategies
Although the two strategies shown in Figure 1 achieve the same goal, different costs and benefits are associated with each one’s use. By drawing all the arcs before the lines, the sequence-by-operation strategy reduces the cost of switching between the draw arc and draw line commands. Furthermore, the strategy uses simple commands that are useful for performing a large set of tasks. Therefore, the short-term learning cost of using this strategy is small. However, because the user is constructing every element in the drawing, the performance cost (measured in terms of time and effort) can become large when drawing repeated elements across many tasks, especially in the long term. In contrast, the detail-aggregate-manipulate strategy requires the user to draw the elements of only one window and makes the computer construct the rest of the windows using the group and copy commands. For a novice CAD user, the short-term learning cost for the detail-aggregate-manipulate strategy involves learning the group and copy commands and how to sequence them. However, as is common in the use of any new tool,
this short-term learning cost is amortized over the long term because of the efficiency gained over many invocations of the strategy. This amortization therefore lowers the overall performance cost. Research has shown that strategies like detail-aggregate-manipulate can save users between 40 percent and 70 percent of the time to perform typical drawing tasks, in addition to reducing errors. Furthermore, with properly designed strategy-based training, such strategies can be taught to novice computer users in a short amount of time. For users who care about saving time and producing accurate drawings, learning such strategies can therefore make them more efficient (save time) and more effective (reduce errors) with relatively short training.
FIGURE 1. Two strategies to perform the 3-window drawing task. A. Sequence-by-operation strategy: (1) draw arcs; (2) draw vertical lines; (3) draw horizontal lines. B. Detail-aggregate-manipulate strategy: (1) draw arc; (2) draw lines (detail); (3) group lines (aggregate); (4) copy group (manipulate). Source: Bhavnani, S. K., & John, B. E. (1996). Exploring the unrealized potential of computer-aided drafting. Proceedings of CHI ’96, 337. Copyright 1996 ACM, Inc. Reprinted by permission.
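The contrast in Figure 1 can also be sketched as code. The snippet below is only an illustration against a made-up drawing interface; the command names (draw_arc, draw_vertical_lines, group, copy_object) are hypothetical and do not correspond to any particular CAD system.

class CanvasStub:
    """Minimal stand-in for a drawing tool; it only records the commands the user issues."""
    def __init__(self):
        self.log = []
    def __getattr__(self, name):
        def command(*args, **kwargs):
            self.log.append((name, args, kwargs))
            return (name, args, kwargs)
        return command

def sequence_by_operation(canvas):
    # Organize the task by operation: all arcs, then all vertical lines, then all horizontal lines.
    for window in range(3):
        canvas.draw_arc(window=window)
    for window in range(3):
        canvas.draw_vertical_lines(window=window)
    for window in range(3):
        canvas.draw_horizontal_lines(window=window)

def detail_aggregate_manipulate(canvas):
    # Detail one window, aggregate its elements, then let the computer make the copies.
    parts = [canvas.draw_arc(window=0),
             canvas.draw_vertical_lines(window=0),
             canvas.draw_horizontal_lines(window=0)]
    group = canvas.group(parts)            # aggregate
    canvas.copy_object(group, copies=2)    # manipulate

first, second = CanvasStub(), CanvasStub()
sequence_by_operation(first)
detail_aggregate_manipulate(second)
print(len(first.log), "user actions")   # 9
print(len(second.log), "user actions")  # 5

The point of the sketch is simply that the second strategy shifts the repetitive work to the computer, which is where the time savings reported in the research come from.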
A Framework That Organizes Strategies for Using Complex Computer Applications Given the important role that strategies can play in improving overall productivity, researchers have attempted to identify and organize strategies for computer application use, such as authoring and information retrieval applications. Frameworks to organize strategies have suggested the design of: (1) training that teaches the strategies in a systematic way, (2) new systems that provide effective and efficient strategies to users with little experience, and (3) evaluation methods to ensure that designers consistently offer the commands for using efficient and effective strategies. One useful way to organize strategies is based on the general capabilities of computer applications that the strategies exploit. For example, the detail-aggregate-manipulate strategy described in Figure 1 exploits the iterative power of computers; it makes the computer (instead of the user) perform the repetitious task of copying the elements multiple times. Strategies have also been identified that exploit other powers of computers, such as the powers of propagation, organization, and visualization. Another way to organize strategies is by the scope of their use. For example, some strategies are
broad in scope because the powers they exploit are offered by a large range of computer applications such as authoring and information retrieval applications. Other strategies are narrower in scope and applicable to a smaller range of computer applications such as only to word processors. Large-Scope Strategies Given the ubiquity of graphical user interfaces (GUIs) across computer applications, most useful computer applications require some interaction with a visual interface. Such computer applications offer the power of visualization, that is, the power to selectively view information on the screen. For example, a common word-processing task is to compare information from one part of a document with information in another part of the document. When these two parts of the document cannot fit simultaneously on the screen, the user can perform the comparison task in several ways. One way is to scroll back and forth between the relevant parts of the document. This method is time-consuming and error-prone because it requires the user to remember the information that is not visible. Another way to perform the same comparison task is to first bring together on the computer screen the two relevant parts of the document, before
FIGURE 2. Strategies have been identified to exploit different powers of computers at different scope levels. Large-scope strategies (such as visualization strategies) are useful to many classes of computer applications, such as authoring and information retrieval applications. Medium-scope strategies (such as iteration and propagation strategies) apply to a single class of computer applications, such as authoring applications. Small-scope strategies (such as text transformation strategies for word processors, formula decomposition strategies for spreadsheets, and graphic precision strategies for drawing systems) apply to a single subclass of applications. The dotted lines represent how future strategies can be included in the framework.
comparing them. The information can be brought together on the screen by different commands, such as by opening two windows of the same document scrolled to the relevant parts of the document, or by using the split window command in Microsoft Word to view two parts of the document simultaneously. In addition to being useful for word-processing tasks, this visualization strategy is also useful when one is drawing a complex building in a CAD system, or when one is comparing information from two different webpages when retrieving information on the Web. Hence strategies that exploit the power of visualization have wide scope, spanning many different classes of computer applications. Medium-Scope Strategies While visualization strategies have the widest use across classes of computer applications, there are three sets of strategies that are limited in scope to only one class of computer applications: First, there are strategies that exploit the iterative power of computers, such as the detail-aggregate-manipulate strategy discussed earlier. These are useful mainly for authoring applications such as drawing systems and word processors.
Second, there are strategies that exploit the power of propagation provided by authoring applications. The power of propagation enables users to set up dependencies between objects, such that modifications automatically ripple through to the dependent objects. For example, often users have to change the font and size of headings in a document to conform to different publication requirements. One way to perform this task is to make the changes manually. This is time-consuming, especially when the document is long, and error-prone, because certain headings may be missed or incorrectly modified. A more efficient and effective method of performing the same task is to first make the headings in a document dependent on a style definition in Microsoft Word. When this style definition is modified, all dependent headings are automatically changed. This strategy is useful across such applications as spreadsheets (where different results can be generated by altering a variable such as an interest rate), and CAD systems (where it can be used to generate variations on a repeated window design in a building façade). Third, there are strategies that exploit the power of organization provided by authoring applications. The power of organization enables users to explic-
itly structure information in representations (such as in a table). These explicit representations enable users to make rapid changes to the content without having to manually update the structure of the representation. For example, one way to represent tabular information in a word-processing application is by using tabs between the words or numbers. However, because tabs do not convey to the computer an explicit tabular representation consisting of rows and columns, the tabular structure may not be maintained when changes are made to the content. A more efficient and effective way to perform this task is to first make the table explicit to the computer by using the command insert table, and then to add content to the table. Because the computer has an internal data structure for representing a table, the tabular representation will be maintained during modifications (such as adding more content to a cell in the table). Organization strategies are also useful in other authoring applications. For example, information can be stored using a set-subset representation in a spreadsheet (as when different sheets are used to organize sets of numbers) and in a CAD system (as when different layers are used to organize different types of graphic information). As discussed above, strategies that exploit the powers of iteration, propagation, and organization are useful mainly for authoring applications. However, it is important to note that the powers of iteration, propagation, and organization can also be offered by other classes of computer applications, such as information retrieval applications. For example, many Web browsers offer users ways to organize the addresses of different retrieved webpages. (The organizing features provided by the favorites command in Internet Explorer is one example.) However, while powers provided by authoring applications can be provided in other classes of computer applications, the strategies that they exploit will tend to be the same. Small-Scope Strategies Small-scope strategies exploit powers provided by particular subclasses of applications. For example, the power of graphic precision is offered mainly by drawing systems, such as CAD systems. Strategies that exploit graphic precision enable users to cre-
ate and manipulate precise graphic objects. For example, a common precision drawing task is to create a line that is precisely tangent and touching the end of an arc (as shown in the arched windows in Figure 1). One way to perform this task is to visually locate, and then click, the end of the arc when drawing the line. This is error-prone because the user relies on visual feedback to detect the precise location of the end of the arc. Another way is to use the snap-to-object command, which enables the user to click a point that is only approximately at the end of the arc. The computer responds by automatically locating the precise end of the arc, and therefore enables the user to draw a line that is precisely tangent to the end of the arc. Similar small-scope strategies have been identified for word-processing applications (such as those that assist in transforming text to generate summaries or translations) and for spreadsheets (such as those that decompose formulas into subformulas to enable quick debugging).
Future Extensions of the Strategy Framework
The strategy framework described above focuses on authoring applications. However, the framework can also be extended to organize the large number of search strategies that have been identified for use with information retrieval applications such as general-purpose search engines like Google. In contrast to computer powers that are useful in organizing strategies for use with authoring applications, strategies for use with information retrieval systems appear to be driven by attributes of how information sources are structured. For example, a large portion of the Web comprises densely connected webpages referred to as the core of the Web. The densely connected structure of information sources in the core suggests the importance of using a variety of browsing strategies (that rely on using hyperlinks to move from one page to another) to locate relevant sources. There is also a large portion of the Web that consists of new pages that are not linked to many other pages. Strategies to find these pages therefore require the use of different query-based search engines, given that no single search engine indexes all webpages.
While there has been much research on strategies for finding relevant sources of information, one set of strategies works by selecting and ordering relevant sources of information based on the way information is distributed across sources. For example, health care information is typically scattered across different health care portals. In this situation a useful strategy is to visit specific kinds of portals in a particular order to enable comprehensive accumulation of the relevant information. Such strategies become critical when incomplete information can have dangerous consequences (as is the case with incomplete information on health issues). An important difference between strategies for using authoring applications and strategies for using information retrieval systems is that search strategies are fundamentally heuristic—that is, they are rules of thumb that do not guarantee successful task completion. This is in part because users’ evaluation of what is relevant changes based on what is being learned during the search process.
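The “visit sources in a particular order” heuristic described above can be sketched as follows. The portal names and their topic coverage are invented for illustration; a real implementation would depend on how a domain portal characterizes its sources.

PORTAL_ORDER = [
    ("general_health_portal", {"overview", "symptoms"}),
    ("specialist_portal", {"treatment", "side_effects"}),
    ("support_community", {"daily_management"}),
]

def gather(topics_needed):
    # Visit portals in a fixed, quality-ranked order, stopping once every needed topic is covered.
    covered, visited = set(), []
    for portal, topics_offered in PORTAL_ORDER:
        if covered >= topics_needed:
            break
        visited.append(portal)
        covered |= topics_offered & topics_needed
    return visited, covered

visited, covered = gather({"overview", "treatment", "side_effects"})
print(visited)                                               # ['general_health_portal', 'specialist_portal']
print(covered == {"overview", "treatment", "side_effects"})  # True

Because the heuristic is a rule of thumb, the stopping condition here is only a stand-in for the user’s own, evolving judgment of what counts as comprehensive.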
How the Identification of Strategies Can Improve Human-Computer Interaction The identification and analysis of application use strategies suggests three practical developments: strategy-based instruction, new search systems, and an analysis method to ensure consistency in capabilities across applications. Strategy-Based Instruction Strategies for using authoring applications have led to the design of strategy-based instruction. Strategy-based instruction teaches commands in combination with the authoring strategies that make use of authoring applications’ powers of iteration, propagation, and organization. Research has shown that students who took the strategy-based training acquired more efficient and effective strategies and demonstrated a greater ability to transfer that knowl-
edge across applications than did students who were taught only commands. New Search Systems The identification of search strategies to deal with the scatter of information across the Web has led to the design of a new kind of domain portal called a Strategy Hub. This type of domain portal implements the heuristic search strategy of visiting sources of information in a particular order. Recent studies show that such a system enables users to find more comprehensive information on specific topics when compared to the information retrieved by users of other search systems. An Analysis Method To Ensure Consistency in Capabilities across Applications To enable the widest use of strategies across computer applications, designers must provide a consistent set of commands. Therefore, a method called “designs conducive to the use of efficient strategies” (Design-CUES) has been developed that enables designers to systematically check if their designs provide the commands necessary for users to implement efficient and effective strategies.
Looking Forward
Many years of research have shown that merely learning commands does not make for the best use of complex computer applications. The effective and efficient use of computer applications often requires the use of strategies in addition to commands. An important research goal has therefore been to identify strategies for using a wide range of computer applications. The strategies that have been identified to date have benefited users through strategy-based instruction, new forms of search systems, and new design methods. As research on strategy identification continues, we can expect more developments along those lines, all with the ultimate goal of making users more effective and efficient in the use of complex computer applications.
Suresh K. Bhavnani
FURTHER READING Bates, M. (1979). Information search tactics. Journal of the American Society for Information Science 30(4), 205–214. Bates, M. J. (1998). Indexing and access for digital libraries and the Internet: Human, database, and domain factors. Journal of the American Society for Information Science, 49(13), 1185–1205. Belkin, N., Cool, C., Stein, A., & Thiel, U. (1995). Cases, scripts, and information-seeking strategies: On the design of interactive information retrieval systems. Expert Systems with Applications, 9(3), 379–395. Bhavnani, S. K. (2002). Domain-specific search strategies for the effective retrieval of healthcare and shopping information. In Proceedings of CHI’02 (pp. 610–611). New York: ACM Press. Bhavnani, S. K. (in press). The distribution of online healthcare information: A case study on melanoma. Proceedings of AMIA ’03. Bhavnani, S. K., Bichakjian, C. K., Johnson, T. M., Little, R. J., Peck, F. A., Schwartz, J. L., et al. (2003). Strategy hubs: Next-generation domain portals with search procedures. In Proceedings of CHI ’03, (pp. 393–400). New York: ACM Press. Bhavnani, S. K., & John, B. E. (2000). The strategic use of complex computer systems. Human-Computer Interaction, 15(2–3), 107–137. Bhavnani, S. K., Reif, F., & John, B. E. (2001). Beyond command knowledge: Identifying and teaching strategic knowledge for using complex computer applications. In Proceedings of CHI ’01 (pp. 229–236). New York: ACM Press. Drabenstott, K. (2000). Web search strategies. In W. J. Wheeler (Ed.), Saving the user’s time through subject access innovation: Papers in honor of Pauline Atherton Cochrane (pp. 114–161). Champaign: University of Illinois Press. Mayer, R. E. (1988). From novice to expert. In M. Helander (Ed.), Handbook of human-computer interaction (pp. 781–796). Amsterdam: Elsevier Science. O’Day, V., & Jeffries, R. (1993). Orienteering in an information landscape: How information seekers get from here to there. In Proceedings of CHI 93 (pp. 438–445). New York: ACM Press. Shute, S., & Smith, P. (1993). Knowledge-based search tactics. Information Processing & Management, 29(1), 29–45. Siegler, R. S., & Jenkins, E. (1989). How children discover new strategies. Hillsdale, NJ: Lawrence Erlbaum Associates. Singley, M., & Anderson, J. (1989). The transfer of cognitive skill. Cambridge, MA: Harvard University Press.
ARPANET
The Arpanet, the forerunner of the Internet, was developed by the U.S. Department of Defense’s Advanced Research Projects Agency (ARPA) in the late 1960s. ARPA was created in 1958 by President Dwight D. Eisenhower to serve as a quick-response research and development agency for the Department of Defense, specifically in response to the launch of the Soviet satellite Sputnik. The agency, now the Defense
Advanced Research Projects Agency (DARPA), funded some of the most important research of the twentieth century.
The Arpanet Concept The Arpanet long-distance computer network was a collection of ideas, breakthroughs, and people. The roots of the Arpanet can be traced to one of ARPA’s most famous managers, J. C. R. Licklider. In 1962 Licklider was recruited to work at ARPA, then housed in the Pentagon, to start a behavioral sciences program. Although a psychologist by training, Licklider had a passion for the emergent field of computers and was adamant that the future of computing resided in the interactions between humans and computers. In his seminal work, a paper entitled “Man-Computer Symbiosis” written in 1960, Licklider predicted that computers would not be merely tools for people to use but also extensions of people, forming a symbiotic relationship that would revolutionize the way people interact with the world. Through ARPA Licklider began to interact with the brightest minds in computing—scientists at Stanford, Berkeley, UCLA, MIT, and a handful of companies that made up what Licklider considered to be his “intergalactic computer network.” Of course, this network existed only in theory because people had no way to bring these resources together other than telephone or face-to-face meetings. However, Licklider had the vision of gathering these people and resources, making the intergalactic network a physical network through an integrated network of computers. Although originally brought on board to work on behavioral science issues in command-and-control systems, Licklider was directly responsible for transforming his command-and-control research office into the Information Processing Techniques Office (IPTO), which would be responsible for critical advanced computing achievements for decades to come. Although Licklider left ARPA in 1964, he had a lasting effect on the field of computing and the development of the Arpanet. In 1966 another computer visionary, Bob Taylor, became director of IPTO and immediately began to address the computer networking problem. The
We were just rank amateurs, and we were expecting that some authority would finally come along and say, “Here’s how we are going to do it.” And nobody ever came along. —Vint Cerf on the design of Arpanet
computing field at that time suffered from duplication of research efforts, no electronic links between computers, little opportunity for advanced graphics development, and a lack of sharing of valuable computing resources. Taylor asked the director of ARPA, Charles Herzfeld, to fund a program to create a test network of computers to solve these problems. Herzfeld granted Taylor’s request, and Taylor’s office received more than one million dollars to address the problems. Thus, the Arpanet project was born. Taylor needed a program manager for the Arpanet project. He recruited Larry Roberts from MIT’s Lincoln Labs. Roberts, twenty-nine years old, arrived at the Pentagon in 1966 and was ready to address head on the problem of communications between computers.
Fundamental Issues in Networking Several fundamental issues existed in the networking of computers. Networking had been conceived of to solve the problem of resource sharing between computers. During the 1960s computers were extremely large, expensive, and time consuming to operate. ARPA had already invested in computing resources at several computing centers across the country, but these centers had no way to communicate among one another or to share resources. At the same time, Cold War concerns were causing U.S. scientists to take a hard look at military communications networks across the country and to evaluate the networks’ survivability in case of a nuclear strike. In the United Kingdom scientists were looking at networks for purely communications use and were evaluating digital communication methods to work around the inefficiency of the analogue telephone system. Both U.S. and United Kingdom scientists were researching distributed networks (digital data communication networks that extend across multi-
ple locations), packet switching, dynamic routing algorithms (computational means of directing data flows), and network survivability/redundancy. Packet switching would be a critical element of network design because it would allow information to be broken down into pieces or “packets” that would be sent over the network and reassembled at their final destination. This was a much more efficient way to move messages, particularly when contrasted with analogue phone lines. Additionally, the distributed network design would be more efficient and robust. Without central nodes (locations that contain all the resources and then distribute them to the rest of the system), the system could survive a loss of one or more nodes and still route data traffic. This design also would allow more efficient data trafficking when coupled with an adaptive networking algorithm capable of determining the most efficient path for any packet to travel. Researchers addressed these issues prior to the Arpanet project. RAND’s Paul Baran recommended a distributed switching network to the U.S. Air Force in 1965 for the communications network of the Strategic Air Command, but the network was not developed. In the United Kingdom Don Davies was working on packet switching and adaptive networking at the National Physical Laboratory. The two men independently came up with many of the same answers that would eventually be incorporated into the Arpanet.
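As a toy illustration of the packet idea only (not the Arpanet’s actual protocols), a message can be broken into numbered packets, sent independently, and reassembled in order at the destination:

import random

def to_packets(message, size=8):
    # Break a message into numbered, fixed-size packets.
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]

def reassemble(packets):
    # Sort by sequence number and join the payloads back into the original message.
    return "".join(payload for _, payload in sorted(packets))

packets = to_packets("Packets can take different routes through the network.")
random.shuffle(packets)        # simulate packets arriving out of order
print(reassemble(packets))     # the original message, restored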
The Arpanet Experiment Larry Roberts arrived at ARPA in 1966 with the charge to solve the computer networking problem. At an ARPA investigators meeting in Ann Arbor, Michigan, Roberts proposed a networking experiment that would become the Arpanet. He proposed that all of the ARPA time-sharing computers at various sites across the country be connected over dialup telephone lines. The time-sharing (or host) computers would serve double duty—both as resources and routers. Meeting participants met Roberts’s proposal with a great deal of skepticism. Why would people want to spend valuable computing resources to communicate between computers when people already had all the computing they needed at their site? At the time, sharing between
computing centers was a goal of ARPA and not necessarily of the scientific community itself. In addition, researchers would be reluctant to give up valuable computing power just so they could “share” with other researchers. However, a researcher at the meeting, Wes Clark, struck upon a solution that would allow the experiment to be carried out. Clark recommended keeping the host computers out of the networking duties. Instead, he suggested using a subnetwork of intermediary computers to handle packet switching and data trafficking. This subnetwork would reduce the computing demand on the host computers, and the use of a subnetwork of specialized computers would provide uniformity and control. This suggestion solved many problems, both technical and administrative, and would allow ARPA to control the subnetwork. The computers used at the subnetwork level were called “interface message processors” (IMPs). In addition to designing IMPs, researchers would have to develop protocols for how the IMPs would communicate with host computers and create the network. ARPA issued a request for proposals (RFP) in 1968, because the specifications for the network had become so detailed. These specifications included:
■ Transfer of digital bits from source to specified location should be reliable.
■ Transit time through the subnetwork should be one-half second or less.
■ The subnetwork had to operate autonomously.
■ The subnetwork had to function even when IMP nodes went down.
The ARPA RFP was issued to determine which company could build the Arpanet to these specifications. After much debate, the contract was awarded in 1969 to Bolt Beranek and Newman (BBN), which had assembled an amazing team of scientists to transform this vision into reality. The choice of BBN was a surprise to many people because BBN was considered to be a consulting firm, not a computing heavy hitter. However, its proposal was so detailed and exacting that it could begin work immediately upon awarding of the contract. BBN had only twelve months to do the work. In their 1996 book, Where Wizards Stay Up Late, Katie Hafner, co-author of Cyberpunk, and Matthew
Lyon, assistant to the president of the University of Texas, unveil the Sputnik-era beginnings of the Internet, the groundbreaking scientific work that created it, and the often eccentric, brilliant scientists and engineers responsible. The team, led by Frank Heart, was dedicated to building the Arpanet on time and to specifications and had only nine months to deliver the first IMP. Despite hardware setbacks, the team delivered the first IMP to UCLA early. UCLA was also the site of the network management center, the “test track” for the Arpanet. The team was charged with testing the network’s limits and exposing bugs, flaws, and oddities. The initial Arpanet experiment consisted of four nodes, with an IMP at UCLA, Stanford Research Institute (SRI), University of Utah, and University of California at Santa Barbara. BBN also was responsible for two critical elements: the IMPs themselves (including IMP-to-IMP communications) and the specifications for the IMP-to-host communications. The specifications for the IMP-to-host communications were drafted by Bob Kahn, who became the intermediary between the Arpanet research community and BBN. Graduate students of the host institutions digested those specifications and developed the code that would serve as the interface between host and IMP. They formed the Network Working Group to hammer out the details of protocols, shared resources, and data transfer. They created file transfer protocols (which layout the rules for how all computers handle the transfer of files) that became the backbone of the Arpanet and made it functional. This experiment was so successful that the Arpanet was expanded to include other research sites across the country until it grew to twenty-nine nodes. In 1972 the Arpanet made its public debut at the International Conference on Computer Communication. It was an unequivocal hit, and the computer networking concept was validated in the public arena.
The Arpanet Evolves As members of a user community, the researchers involved in the Arpanet were always adding, creating, experimenting. The Arpanet became a bargaining tool in the recruiting of computer science faculty and an impromptu communication tool for “network mail” or electronic mail (e-mail). In 1973 an
ARPA study showed that 75 percent of all traffic on the Arpanet was e-mail. Researchers eventually wrote dedicated software to handle this “side use” of the Arpanet. In 1972 Bob Kahn left BBN and went to work at ARPA with Larry Roberts. Kahn was now in charge of the network that he had helped create. He formed a fruitful collaboration with Vint Cerf of Stanford (who was a graduate student on the UCLA Arpanet project) that led to the next evolution of networking. Together they tackled the problem of packet switching in internetworking, which would eventually become the Internet. In 1975 Vint Cerf went to DARPA to take charge of all of the ARPA Internet programs, and the Arpanet itself was transferred to the Defense Communications Agency, a transfer that upset some people in the non-bureaucratic computing research community. The Internet began with the merging of the Arpanet, SATNET (Atlantic Packet Satellite Network), and a packet radio network—all based on the transmission-control protocol/Internet protocol (TCP/IP) standard that Cerf and Kahn created—and grew as more and more networks were created and connected. The Arpanet eventually burgeoned to 113 nodes before it adopted the new TCP/IP standard and was split into MILNET and Arpanet in 1983. In 1989 the Arpanet was officially “powered down,” and all of the original nodes were transferred to the Internet.
The Internet and Beyond The creation of the Arpanet—and then the Internet—was the work of many researchers. Only with difficulty can we imagine our modern society without the interconnectedness that we now share. The Arpanet was a testament to the ingenuity of the human mind and people’s perhaps evolutionary desire to be connected to one another. The Arpanet not only brought us closer together but also brought us one step closer to J. C. R. Licklider’s vision of human-computer interaction more than four decades ago. Amy Kruse, Dylan Schmorrow, and J. Allen Sears See also Internet—Worldwide Diffusion
FURTHER READING Adam, J. (1996, November). Geek gods: How cybergeniuses Bob Kahn and Vint Cerf turned a Pentagon project into the Internet and connected the world. Washingtonian Magazine, 66. Bolt Beranek and Newman. (1981, April). A history of the ARPANET: The first decade (NTIS No. AD A115440). Retrieved March 23, 2004, from http://www.ntis.gov Evenson, L. (1997, March 16). Present at the creation of the Internet: Now that we’re all linked up and sitting quietly, Vint Cerf, one of its architects, describes how the Internet came into being. San Francisco Chronicle (p. 3ff). Hafner, K., & Lyon, M. (1996). Where wizards stay up late: The origins of the Internet. New York: Simon & Schuster. Hughes, T. P. (1998). Rescuing Prometheus. New York: Pantheon Books. Norberg, A., & O’Neill, J. (1997). Transforming computer technology. Ann Arbor: Scholarly Publishing Office, University of Michigan Library. Salus, P. (1995). Casting the Net. Reading, MA: Addison-Wesley.
ARTIFICIAL INTELLIGENCE Most research in mainstream artificial intelligence (AI) is directed toward understanding how people (or even animals or societies) can solve problems effectively. These problems are much more general than mathematical or logical puzzles; AI researchers are interested in how artificial systems can perceive and reason about the world, plan and act to meet goals, communicate, learn, and apply knowledge such that they can behave intelligently. In the context of human-computer interaction (HCI), research in AI has focused on three general questions:
■ How can the process of designing and implementing interactive systems be improved?
■ How can an interactive system decide which problems need to be solved and how they should be solved?
■ How can an interactive system communicate most effectively with the user about the problems that need to be solved?
The first question deals with the development process in HCI, the others with user interaction,
specifically the issues of control and communication. These questions have been a central concern in HCI for the past thirty years and remain critical today. AI has been able to provide useful insights into how these questions can be answered. In sum, what AI brings to HCI development is the possibility of a more systematic exploration and evaluation of interface designs, based on automated reasoning about a given application domain, the characteristics of human problem solving, and general interaction principles. The AI approach can benefit end users because it encourages tailoring the behavior of an interactive system more closely to users’ needs.
The Concept of Search Almost all techniques for problem solving in AI are based on the fundamental concept of search. One way to understand search is by analogy to navigation on the World Wide Web. Imagine that my goal is to reach a specific webpage starting from my homepage, and that I have no access to automated facilities such as search engines. I proceed by clicking on the navigation links on my current page. For each new page that comes up, I decide whether I have reached my goal. If not, then I evaluate the new page, comparing it with the other pages that I have encountered, to see whether I am moving closer to my goal or farther away. Based on my evaluation, I may continue forward or go back to an earlier, more promising point to take a different path. An automated search process works in the same way. Pages correspond to states in a search space, or relevant information about the environment; navigation actions are operators, which transform one state into another; an evaluation function assesses information about the state to guide the selection of operators for further transformations. A large number of AI techniques have been developed to address specific classes of search problems, representing the problems in different ways. For example, planning systems search for sequences of interdependent operators to reach a set of goals; these systems can deal with complex tasks ranging from planning space missions to helping robots navigate over unfamiliar terrain. Expert systems, whether
acting as automated tax advisors, automobile repair advisors, or medical consultants, search opportunistically for combinations of if-then rules that derive plausible conclusions from input data and existing knowledge. Machine learning systems, including neural networks, incrementally refine an internal representation of their environment, in a search for improved performance on given tasks. Natural language understanding systems search for correct interpretations through a space of ambiguous word meanings, grammatical constructs, and pragmatic goals. These brief descriptions are only approximate, but they help us understand how a system can represent and deal with some of the problems that arise in interacting with users or interface developers in an intelligent way.
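The state, operator, and evaluation-function vocabulary above can be made concrete in a short program. The following Python sketch is purely illustrative and is not drawn from any system discussed in this article; the toy website, its links, and the character-overlap scoring function are invented stand-ins for pages, navigation actions, and an estimate of progress toward the goal.

```python
import heapq

def best_first_search(start, goal, neighbors, evaluate):
    """Generic best-first search: states, operators (neighbors), and an
    evaluation function that scores how promising each state looks."""
    frontier = [(evaluate(start), start, [start])]   # (score, state, path so far)
    visited = {start}
    while frontier:
        _, state, path = heapq.heappop(frontier)     # expand the most promising state
        if state == goal:
            return path                              # goal reached
        for nxt in neighbors(state):                 # apply the available operators
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(frontier, (evaluate(nxt), nxt, path + [nxt]))
    return None                                      # goal unreachable

# A toy "website": pages and the links (operators) leading out of them.
links = {
    "home":          ["products", "about", "news"],
    "products":      ["widgets", "gadgets"],
    "about":         ["contact"],
    "news":          [],
    "widgets":       ["widget-manual"],
    "gadgets":       [],
    "contact":       [],
    "widget-manual": [],
}
goal = "widget-manual"

def evaluate(page):
    # Crude progress estimate: how many characters of the goal's name the
    # page's name still fails to cover (0 means the goal itself).
    return len(set(goal) - set(page))

print(best_first_search("home", goal, lambda p: links[p], evaluate))
# ['home', 'products', 'widgets', 'widget-manual']
```

Real systems differ in how states are represented and how the evaluation function is obtained, but the skeleton of frontier, operators, and scoring is the common core.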
ARTIFICIAL INTELLIGENCE (AI) The subfield of computer science that is concerned with symbolic reasoning and problem solving.
AI and the Development of User Interfaces Considerable attention in AI has focused on the process of developing user interfaces. Experienced developers generally have a working knowledge of software engineering practice, interface architectures, graphic design, and related areas, plus information about the purpose for which the interface is to be used. If this knowledge can be captured in computational form, an intelligent development environment can aid developers by testing and validating design specifications, by producing alternative designs for a given specification, by generating potential improvements to a design, and by automating some of the more common implementation tasks. The motivation for a search-based approach can be seen most clearly in the problem of layout design. If an experienced designer were asked to organize ten loosely related items of information (represented in text, pictures, and buttons) on a company’s top-level webpage, the final product might be the result of comparing several alternatives, perhaps a few
dozen at most. The number of all possible layouts of ten items, however, runs into the millions and higher; this is far more than a designer could ever consider. Most of these layouts will be unacceptable (for example, all possible orderings of items diagonally across the page), but there may be many effective designs that are missed simply because the number of possibilities is so enormous. A system that can search through different spatial relationships and evaluate the results, even without perfect accuracy, can give designers a more comprehensive view of the problem and its solutions. Automated layout design is just one aspect of interface design. Research in the general area of model-based interface design aims to support developers in all stages of the design process. In MOBI-D and Mastermind, which are user interface generation tools, developers build and evaluate abstract models of computer applications (such as word processing applications, spreadsheet applications, or photographic design applications), interaction tasks and actions, presentations, even users and workplaces. The goal is to give developers decision-making tools that allow them to apply their design skills but do not overly restrict their choices. These tools test constraints, evaluate design implications, present suggestions, track changes, and so forth, facilitating the eventual construction of the actual interface. For example, if a developer specifies that the user must enter a number at some point, MOBI-D can present different interface alternatives, such as a slider (the software equivalent of a linear volume control) or a text box that the user can type into directly, for the developer to choose from. In Mastermind, the developer can switch between a number of visual formats, avoiding ones that are cumbersome. Current research in this area is helping to improve webpage design and build interfaces that meet the constraints of the next generation of interactive devices, including cell phones and handheld computers. AI research is also helping software companies with product evaluation. Partially automated testing of noninteractive software is now commonplace, but conventional techniques are not well suited to testing user interfaces. Software companies usually rely on limited user studies in the laboratory, plus a large population of alpha and beta testers (people who test
the product in real-world situations). Thanks to AI research, however, it is becoming possible to build artificial software agents that can stand in for real users. It is common to think about user interaction with the software in problem-solving terms, as goal-oriented behavior. For example, if my goal is to send an e-mail message, I divide this into subgoals: entering the recipient information and subject line information, writing a short paragraph of text, and attaching a picture. My paragraph subgoal breaks down further into writing individual sentences, with the decomposition continuing to the point of mouse movements and key presses. In AI terms, these decompositions can be represented by plans to be constructed and executed automatically. The PATHS system, a system designed to help automate the testing of graphical user interfaces, lets developers specify a beginning state, an end state, and a set of goals to be accomplished using the interface. PATHS then creates a comprehensive set of plans to achieve the goals. For example, given the goal of modifying a document, the planner will generate sequences of actions for opening the document, adding and deleting text, and saving the results, accounting for all the different ways that each action can be carried out. If a given sequence is found not to be supported when it should be, PATHS will record this as an error in the application. Similar work is carried out in the related field of cognitive modeling, which shares many concepts with AI. Cognitive modelers build computational models of human cognitive processing—perception, attention, memory, motor action, and so forth—in order to gain insight into human behavior. To make valid comparisons between a model’s performance and human performance, a common experimental ground is needed. User interfaces provide that common ground. Cognitive models comparable to planning systems have been developed for evaluating user interfaces, and they have the added benefit of giving developers information about the human side of interaction as well as the application side.
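Because PATHS is described here only in outline, the sketch below should be read as a generic stand-in rather than its actual algorithm. It enumerates, breadth first, the sequences of abstract interface actions that carry a starting state to a goal state, in the spirit of a planner-based test generator; the action names, preconditions, and effects are invented for illustration.

```python
from collections import deque

# Abstract interface actions in STRIPS style: preconditions, facts added,
# facts deleted.  Both the actions and the state facts are invented.
ACTIONS = {
    "open_document":  ({"doc_closed"}, {"doc_open"},     {"doc_closed"}),
    "type_text":      ({"doc_open"},   {"text_added"},   {"doc_saved"}),
    "delete_text":    ({"doc_open"},   {"text_deleted"}, {"doc_saved"}),
    "save_document":  ({"doc_open"},   {"doc_saved"},    set()),
    "close_document": ({"doc_open"},   {"doc_closed"},   {"doc_open"}),
}

def plans(start, goal, max_len=4):
    """Breadth-first enumeration of every action sequence (up to max_len)
    that turns the start state into one satisfying all the goal facts."""
    found = []
    queue = deque([(frozenset(start), [])])
    while queue:
        state, seq = queue.popleft()
        if goal <= state:
            found.append(seq)
            continue                      # do not extend an already successful plan
        if len(seq) == max_len:
            continue
        for name, (pre, add, delete) in ACTIONS.items():
            if pre <= state:              # the action is applicable here
                queue.append((frozenset((state - delete) | add), seq + [name]))
    return found

# Goal: a document that has been modified and then saved.
for p in plans({"doc_closed"}, {"text_added", "doc_saved"}):
    print(" -> ".join(p))
```

Run against a real interface, each generated sequence becomes a test case: if the interface cannot carry out a sequence that the model says should be possible, that discrepancy is logged as an error.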
Interaction The metaphor of tool use has come to dominate the way we understand human interaction with computers, especially with regard to graphical user interfaces. Just as a carpenter keeps specialized sets of tools
A Personal Story—Putting Humans First in Systems Design The field of augmented cognition is pushing the integration of human systems and information technology to the forefront, while also attempting to maximize human potential. My current (and anticipated future) experience with using an ever-increasing number of technologies during my everyday life compels me (propels me!) to help design a new class of systems for the user to interact with. Practitioners of traditional human-systems integration research and design have steadfastly urged that the human must be considered when designing systems for human use. An emerging concept is that not only are human beings the weak link in current human-systems relationships, but also that the number of systems that a single human interacts with is growing so rapidly that the human is no longer capable of using these technologies in truly meaningful ways. This specifically motivates me to develop augmented cognition technologies at the Defense Advanced Research Projects Agency (where I am a program manager). I want to decrease the number of system interfaces that we need to interact with, and increase the number of advanced systems that individuals are capable of using simultaneously. On any given day, I typically wear (carry) five computers: my wristwatch, cell phone, twoway pager with e-mailing capability, a personal digital assistant, and a laptop. I find these systems intrusive and the associated demands on my time to be unacceptable. My home is inundated with appliances that are evolving into computer devices—these systems have advanced “features” that require significant attention in order to use them optimally. Even with the world’s greatest human factors interface, I would never have time to interact with all of these systems that I use on a daily basis. Having said all of this, I need the systems that support me to exhibit some intelligence; I need them to be able to perceive and understand what is going on around and inside of me. I do not have time to overtly direct them. Ideally they will support me by “sensing” my limitations (and my capabilities) and determining how best to communicate with me if absolutely necessary. Augmented cognition technology will imbue into these systems the ability to interact with me. Indeed, augmented cognition is about maximizing human potential. If we humans are the “weak link,” it is because our current advanced computer systems are actually limiting our performance. In the future, we must have transparent technologies addressing our needs, or we will be overwhelmed by meaningless interactions. Dylan Schmorrow
for framing a house or building fine furniture, an experienced computer user has a variety of software tools for word processing, analyzing data with spreadsheets, or creating graphics and illustrations. User interfaces are often thought of as tool-using environments, which has important implications for the involvement of AI in user interaction. Let us extend the carpenter analogy. If I am intent on hammering a nail, I am not constantly reconsidering and recalibrating the relationship between the hammer and my hand, or the head of the hammer and the nail. Instead, after an initial adjustment, the hammer effectively becomes an extension of my arm, so that I can use it without thinking about it. Similarly, for a tool-based software environment, selecting individual tools should be intuitive, and applying a tool should quickly become second nature.
The principles of direct manipulation provide a foundation for tool-based environments. Direct-manipulation interfaces, as defined by Ben Shneiderman, the founding director of the Human-Computer Interaction Laboratory at the University of Maryland, provide a visual representation of objects, allow rapid operations with visible feedback, and rely mainly on physical actions (such as selecting and dragging or pressing buttons) to initiate actions. Modern graphical user interfaces can trace much of their power to direct-manipulation principles. Nevertheless, as powerful as direct-manipulation interfaces can be, they are not appropriate in all situations. For example, sometimes in using a piece of software I know what needs to be done—I can even describe in words what I would like to do—but I do not know exactly how to accomplish my task given the tools at hand.
These potential limitations, among others, have led AI researchers to consider alternatives to a strict tool-based approach. First, it is possible to build intelligent environments that take a more active role in assisting the user—for example, by automatically adapting their behavior to the user’s goals. Second, intelligent behavior can be encapsulated within a software agent that can take responsibility for different tasks in the environment, reducing the burden on the user. Third, these agents and environments can communicate with the user, rather than passively being acted upon by the user, as tools are. Intelligent Environments Some intelligent environments work by integrating AI search into an otherwise conventional interface. One recently developed technique, human-guided simple search, is intended to solve computationally intensive problems such as the traveling salesman problem. This problem involves a salesman who must visit a number of cities while keeping the distance traveled as small as possible. Finding the optimal route for even a small number of locations is beyond what can be done with pencil and paper; for ten locations there are over three million possible routes. Large problems are challenging even for the most sophisticated computer programs. The user works with the human-guided search (HUGSS) tool kit through a graphical display of routes that the system has found. By pressing a button, the user activates a search process that computes the best route it can find within a fixed period of time. The user examines the solution and modifies it by selecting parts of the route that need further refinement or identifying those parts that already have a reasonable solution. The user brings human perception and reasoning to bear on the problem by constraining the space that the search process considers (for example, by temporarily focusing the search on routes between five specific locations, rather than the entire set). Problem-solving responsibility is explicitly shared between the user and the system, with the amount and timing of the system’s effort always under the user’s control. HUGSS works faster than the best fully automated systems currently in use, and it produces results of equal quality.
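The division of labor in human-guided search can be suggested with a small program. The sketch below is a simplification, not the HUGSS tool kit itself: the computer repeatedly applies a standard 2-opt improvement step, but only to the cities the user has selected, so human judgment decides where the automated search spends its effort. The cities, coordinates, and selection are invented.

```python
import itertools
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tour_length(tour, coords):
    return sum(dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def improve(tour, coords, focus, passes=50):
    """2-opt improvement restricted to segments whose endpoints the user
    selected (the 'focus' set); the rest of the tour is left untouched."""
    tour = list(tour)
    for _ in range(passes):
        improved = False
        for i, j in itertools.combinations(range(len(tour)), 2):
            if not ({tour[i], tour[j]} <= focus):
                continue                       # outside the user's selection
            candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
            if tour_length(candidate, coords) < tour_length(tour, coords):
                tour, improved = candidate, True
        if not improved:
            break                              # no further gain in the focus region
    return tour

# Invented example: six locations and a deliberately poor starting tour.
coords = {"A": (0, 0), "B": (1, 5), "C": (2, 1), "D": (5, 4), "E": (6, 0), "F": (3, 3)}
start = ["A", "D", "C", "B", "E", "F"]
focus = {"A", "B", "C", "D"}                   # the part the user asked to refine

better = improve(start, coords, focus)
print(round(tour_length(start, coords), 1), "->", round(tour_length(better, coords), 1))
```

In the real tool kit the user also sets how long each burst of automated search may run, which the sketch omits.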
Other approaches to building intelligent environments, such as programming by example (PBE), involve more significant changes to user interaction. PBE systems watch the user perform a procedure a number of times and then automatically generalize from these examples to create a fully functional program that can execute the repetitive actions so the user does not have to. The SMARTedit system is an example of a machine-learning approach to PBE, in the context of a text-editing application. Suppose that the user moves the cursor to the beginning of the word apple, erases the lowercase a, and types an uppercase A. There are several ways that those actions could be interpreted. Perhaps, for example, the user wanted to move the cursor forward n characters and replace the arbitrary character at that location with A, or perhaps the user wanted to move to the next occurrence of the letter a and capitalize it, or to correct the capitalization of the first word in a sentence, or some other possibility. Each of these interpretations is a different hypothesis maintained by SMARTedit about the user’s intentions. As the user takes further actions, repeating similar sequences on different text, ambiguity is reduced. Some hypotheses become more plausible while others are pruned away because they predict actions inconsistent with the user’s behavior. At any point, the user can direct SMARTedit to take over the editing process and watch the system apply its most highly ranked hypothesis. If SMARTedit carries out a sequence incorrectly, the user can interrupt and correct the mistake, with the system learning from the feedback. Adaptive user interfaces are another type of intelligent environment. Their development is motivated by the observation that while the ideal software system is tailored to an individual user, for economic reasons a single system must be designed and released to thousands or even millions of users, who differ widely from one another in expertise, interests, needs, and so forth. The solution is a system that can adapt to its users when in use. A simple example is adaptive menus. A system can record how often the user selects different menu options, and modify the menu structure so that more frequently chosen options can be reached more efficiently. This basic idea
also works in more sophisticated adaptive systems, many of which compile detailed models of users and their particular tasks and adapt accordingly. Adaptive systems have become especially relevant in efforts to personalize the World Wide Web as well as in research on intelligent tutoring systems and other applications of AI to education.
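The adaptive-menu idea amounts to a small amount of bookkeeping, as the illustrative sketch below suggests. The menu items and the usage pattern are invented, and a real system would probably temper the reordering (for example, promoting only the few most frequent items) so the menu does not shift unpredictably.

```python
from collections import Counter

class AdaptiveMenu:
    """A menu that records how often each option is chosen and reorders
    itself so frequent choices are reached more quickly."""
    def __init__(self, options):
        self.original = list(options)      # designer's ordering, used to break ties
        self.counts = Counter()

    def select(self, option):
        if option not in self.original:
            raise ValueError(f"unknown option: {option}")
        self.counts[option] += 1

    def presentation_order(self):
        # Sort by descending selection count; ties keep the original order.
        return sorted(self.original,
                      key=lambda o: (-self.counts[o], self.original.index(o)))

menu = AdaptiveMenu(["New", "Open", "Save", "Print", "Export", "Close"])
for choice in ["Open", "Save", "Open", "Export", "Open", "Save"]:
    menu.select(choice)

print(menu.presentation_order())
# ['Open', 'Save', 'Export', 'New', 'Print', 'Close']
```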
Intelligent Agents The engineer Michael Huhns and the computer scientist Munindar Singh define intelligent agents as “active, persistent (software) components that perceive, reason, act, and communicate” (Huhns and Singh 1997, 1). For our purposes, the most important characteristic of an agent is its autonomy—its ability to carry out activities without the constant, direct supervision of a human being. Agents in use at present include animated characters or “believable” agents, autonomous agents such as softbots (software agents that perform tasks on the Internet) and physical robots, and mobile agents whose processing is not limited to a single computer platform. Agents are also used in multi-agent systems, which may involve mixed teams of humans and agents. Most relevant to HCI are interface agents, which act as intelligent assistants within a user interface, sometimes carrying out tasks on their own but also able to take instructions and guidance from the user. Letizia is an interface agent that assists users in browsing the World Wide Web. Letizia operates in conjunction with a standard Web browser, maintaining two open windows for its own use. As the user navigates through the Web, Letizia records the information on each page that the user visits and performs an independent search of nearby pages that the user may not have seen. Letizia’s evaluation function compares the information on the pages that it visits with the information that the user has seen up to the current point. In this way Letizia can make suggestions about what the user might be interested in seeing next. As Letizia visits pages, it displays the most promising ones for a short time in one window and the overall winner it has encountered in the other window. The user can watch what Letizia is doing and take control at will.
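Letizia’s evaluation function is not spelled out above, so the following sketch only suggests the general idea: build a profile of words from pages the user has already visited, then rank unvisited neighboring pages by how strongly their words overlap that profile. The browsing history, candidate pages, and scoring are invented placeholders for what a real agent would compute.

```python
from collections import Counter

def profile(visited_pages):
    """Aggregate word counts across everything the user has already seen."""
    prof = Counter()
    for text in visited_pages:
        prof.update(text.lower().split())
    return prof

def score(page_text, prof):
    """Score a candidate page by average overlap with the user's profile."""
    words = page_text.lower().split()
    if not words:
        return 0.0
    return sum(prof[w] for w in words) / len(words)

# Invented browsing history and candidate pages one link away.
visited = [
    "review of digital cameras and camera lenses",
    "comparing zoom lenses for travel photography",
]
candidates = {
    "lens-guide":   "a guide to choosing camera lenses for photography",
    "cake-recipes": "easy cake recipes and baking tips",
    "tripods":      "tripods and other camera accessories",
}

prof = profile(visited)
ranked = sorted(candidates, key=lambda p: score(candidates[p], prof), reverse=True)
print(ranked)   # ['lens-guide', 'tripods', 'cake-recipes']
```

An agent built along these lines would audition the highest-scoring pages for the user while leaving the browser itself entirely under the user’s control.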
Information retrieval is just one area in which agents have become popular. Agents have also appeared in help systems, planning and scheduling aids, scripting systems, intelligent tutoring systems, collaborative filtering applications, matchmaking applications, and electronic auctions. Work on agents is one of the fastest-growing areas of AI. An important topic within research on agents is how to make agents interact most effectively with users. Who should take the initiative—the user or the agent? And when? Should one ever interrupt the other? These are questions of mixed-initiative interaction. Some work on these questions is carried out in the area of rational decision making, wherein rationality is interpreted in an economic sense. If an agent has knowledge of the user’s preferences and can reason about the user’s goals, then it can, for example, determine that the value of the information it can contribute at some point will offset the cost of the user having to deal with an interruption. A different direction is taken by projects that are influenced by the ways that people interact with one another, especially in dialogue. TRIPS (The Rochester Interactive Planning System) is a mixed-initiative planning and scheduling assistant that collaborates with a human user to solve problems in crisis situations, such as planning and managing an evacuation. The COLLAGEN (from COLLaborative AGENt) system is a collaboration system that can be incorporated into agents to give them sophisticated collaboration capabilities across a range of application domains. TRIPS and COLLAGEN agents can interact with users via everyday natural language as well as through multimedia presentations, which leads to the topic of communication. Communication Some agents communicate by conventional means in a graphical user interface, for example by raising dialog windows and accepting typed input and button presses for responses. A common and reasonable expectation, however, is that if a system is intelligent, we should be able to talk with it as we would with other people, using natural language. (Natural language refers to the languages that people commonly use, such as English or French, in contrast to programming languages.) Unfortunately,
even a brief treatment of natural-language understanding and generation, not to mention voice recognition and speech output, is beyond the scope of this article. An example, however, may give some idea of the issues involved. Consider three remarks from the user’s side of a dialogue with a natural-language system (the bracketed text is not spoken by the user):
User (1): Show me document.txt.
User (2): What’s the last modification date [on the file document.txt]?
User (3): Okay, print it [i.e., document.txt].
To respond correctly, the system must be able to reason that modification dates are associated with files and that files rather than dates are usually printed (“it” could grammatically refer to either). Reading this dialogue, English-speaking humans make these inferences automatically, without effort or even awareness. It is only recently that computer systems have been able to match even a fraction of our abilities. The QuickSet communication system combines natural language and other methods of interaction for use in military scenarios. Shown a map on a tablet PC, the user can say, “Jeep 23, follow this evacuation route,” while drawing a path on the display. The system responds with the requested action. This interaction is striking for its efficiency: the user has two simultaneous modes of input, voice and pen-aided gesture, and the ambiguities in one channel (in this example, the interpretation of the phrase “this route”) are compensated for by information in the other channel (the drawn path). In general, voice and natural language can support a more engaging, natural style of interaction with the interface than approaches that use a single vector of communication. Embodied conversational agents take work in natural language a step further. When people speak with one another, communication is not limited to the words that are spoken. Gestures, expressions, and other factors can modify or even contradict the literal meaning of spoken words. Embodied conversational agents attempt to recognize and produce these broader cues in communication. REA, a simulated real estate agent research prototype developed at the Massachusetts Institute of Technology, is represented by a full body figure on a large-scale display. REA shows users around a house, making appropriate use of eye gaze, body posture, hand gestures, and facial expressions to enhance its spoken conversation. Users can communicate via speech or gesture, even by simply looking at particular objects, nonverbal behavior that is sensed by cameras. Systems like REA aim to make the computer side of face-to-face human-computer communication as rich and nuanced as the human side.
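The “print it” example can be made concrete with a toy resolver. This is not the method of any particular system named here; it merely illustrates one common idea, checking which recently mentioned entities have a type the verb can sensibly apply to. The entity types and the verb restrictions are invented.

```python
# Entities mentioned so far in the dialogue, most recent last, with their types.
discourse = [
    ("document.txt", "file"),
    ("last modification date", "date"),
]

# Which entity types each verb can plausibly take as its object.
SELECTIONAL = {
    "print":  {"file"},
    "show":   {"file", "date"},
    "delete": {"file"},
}

def resolve_pronoun(verb, discourse):
    """Resolve 'it' to the most recently mentioned entity whose type is
    compatible with the verb; fall back to the most recent entity."""
    allowed = SELECTIONAL.get(verb, set())
    for name, etype in reversed(discourse):      # prefer recent mentions
        if etype in allowed:
            return name
    return discourse[-1][0] if discourse else None

print(resolve_pronoun("print", discourse))   # -> document.txt, not the date
print(resolve_pronoun("show", discourse))    # -> last modification date (most recent)
```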
Future Directions This article has introduced the reader to AI approaches to HCI rather than giving a taxonomy of AI systems; many of the systems touched upon are much broader in scope than can be conveyed through a category assignment and a few sentences. Developments that do not fit neatly within the categories discussed are listed below. Smart Rooms and Intelligent Classrooms Much of what makes a software environment intelligent can be generalized to the physical domain. Smart rooms and intelligent classrooms rely on the same kind of technology as an embodied conversational agent; they register users’ gestures and spoken commands and accordingly adjust thermostats, change lighting, run presentations, and the like. Games and Virtual Environments Intelligent agents have begun to enrich games and virtual environments, acting as teammates or opponents. Extending this line of research, the Mimesis system imposes a nonscripted, dynamic narrative structure onto a virtual gaming environment, so that external goals (for example, education on a historical period) can be met without compromising the user’s direct control over the environment. Human-Robot Interaction Robots are appearing outside the laboratory, in our workplaces and homes. Human-robot interaction examines issues of interaction with physical agents in real-world environments, even in social situations. Robots can be used to explore otherwise inaccessible environments and in search-and-rescue missions. It should be clear from this discussion that the most interesting problems in HCI are no longer found in software technology, at the level of the visible components of the interface. Effective AI
approaches to HCI focus on issues at deeper levels, probing the structure of problems that need to be solved, the capabilities and requirements of users, and new ways of integrating human reasoning with automated processing. Robert St. Amant FURTHER READING Anderson, D., Anderson, E., Lesh, N., Marks, J., Mirtich, B., Ratajczak, D., et al. (2000). Human-guided simple search. In Proceedings of the National Conference on Artificial Intelligence (AAAI) (pp. 209–216). Cambridge, MA: MIT Press. Cassell, J. (Ed.). (2000). Embodied conversational agents. Cambridge, MA: MIT Press. Cassell, J., Bickmore, T., Billinghurst, M., Campbell, L., Chang, K., Vilhjálmsson, H., et al. (1999). Embodiment in conversational interfaces: REA. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI) (pp. 520–527). New York: ACM Press. Cypher, A., (Ed.). (1993). Watch what I do: Programming by demonstration. Cambridge, MA: MIT Press. Huhns, M. N., & Singh, M. P. (Eds.). (1997). Readings in agents. San Francisco: Morgan Kaufmann. Kobsa, A. (Ed.). (2001). Ten year anniversary issue. User Modeling and User-Adapted Interaction, 11(1–2). Lester, J. (Ed.). (1999). Special issue on intelligent user interfaces. AI Magazine, 22(4). Lieberman, H. (1995). Letizia: An agent that assists Web browsing. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) (pp. 924–929). San Francisco: Morgan Kaufmann. Lieberman, H. (Ed.). (2001). Your wish is my command. San Francisco: Morgan Kaufmann. Lok, S., & Feiner, S. (2001). A survey of automated layout techniques for information presentations. In Proceedings of the First International Symposium on Smart Graphics (pp. 61–68). New York: ACM Press. Maybury, M. T., & Wahlster, W. (Eds.). (1998). Readings in intelligent user interfaces. San Francisco: Morgan Kaufmann. Memon, A. M., Pollack, M. E., Soffa, M. L. (2001). Hierarchical GUI test case generation using automated planning. IEEE Transactions on Software Engineering, 27(2), 144–155. Newell, A., & Simon, H. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall. Oviatt, S. L., Cohen, P. R., Wu, L., Vergo, J., Duncan, L., Suhm, B., et al. (2002). Designing the user interface for multimodal speech and gesture applications: State-of-the-art systems and research directions. In J. Carroll (Ed.), Human-computer interaction in the new millennium (pp. 419–456). Reading, MA: Addison-Wesley. Puerta, A.R. (1997). A model-based interface development environment. IEEE Software, 14(4), 41–47. Ritter, F. E., & Young, R. M. (Eds.). (2001). Special issue on cognitive modeling for human-computer interaction. International Journal of Human-Computer Studies, 55(1). Russell, S., & Norvig, P. (1995). Artificial intelligence: A modern approach. Englewood Cliffs, NJ: Prentice-Hall.
Shneiderman, B. (1998). Designing the user interface: Strategies for effective human-computer interaction. Boston: Addison-Wesley. Shneiderman, B., & Maes, P. (1997). Debate: Direct manipulation vs. interface agents. Interactions, 4(6), 42–61. Sullivan, J. W., & Tyler, S. W. (Eds.). (1991). Intelligent user interfaces. New York: ACM Press. St. Amant, R., & Healey, C. G. (2001). Usability guidelines for interactive search in direct manipulation systems. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) (pp. 1179–1184). San Francisco: Morgan Kaufmann. Szekely, P., Sukaviriya, P., Castells, P., Muthukumarasamy, J., & Salcher, E. (1996). Declarative interface models for user interface construction tools: The Mastermind approach. In L. Bass & C. Unger (Eds.), Engineering for human-computer interaction (pp. 120–150). London and New York: Chapman & Hall. Wolfman, S. A., Lau, T. A., Domingos, P., & Weld, D. S. (2001). Mixed initiative interfaces for learning tasks: SMARTedit talks back. In Proceedings of the International Conference on Intelligent User Interfaces (pp. 167–174). New York: ACM Press.
ASIAN SCRIPT INPUT The Asian languages that employ Chinese characters in their writing systems present difficult challenges for entering text into computers and word processors. Many Asian languages, such as Korean and Thai, have their own alphabets, and the Devanagari alphabet is used to write Sanskrit, Hindi, and some other languages of India. Designing keyboards and fonts for alphabets of languages—such as Hebrew, Greek, Russian, and Arabic—that do not employ the Roman alphabet used by English and other western European languages is relatively simple. The challenge with Chinese, simply put, is that a standard national database contains 6,763 symbols (called “characters” rather than “letters”), and a keyboard with so many keys would be completely unwieldy. As was the case with ancient Egyptian hieroglyphics and Mesopotamian cuneiform, Chinese writing began as pictographs that represented particular things. Evolving through time and modified for graceful drawing with an ink brush, these pictographs became the current system of characters representing concepts and sounds in a complex interplay of functions. A person fully literate in Chinese today uses 3,000 to 4,000 characters; newspapers have 6,000 to 7,000 available, but some dictionaries list as many as 50,000. In 1958 a standardized phonetic system based
on the Roman alphabet and called “pinyin” was introduced, but it has not replaced the traditional system of writing. Japanese employs two phonetic alphabets called kana, as well as Chinese characters called kanji. In 1983 the Japan Industrial Standard listed 2,963 commonly used characters plus another 3,384 that appear only rarely. Korean also makes some use of Chinese characters, but the chief form of writing is with an alphabet historically based on Chinese but phonetically representing the sounds of spoken Korean. Because Japan has been a leader in developing computer technology for decades, its language is the best example. Around 1915 Japan began experimenting with typewriters, but they were cumbersome and rare. Typewriters could be made simply for the kana, a centuries-old phonetic system for writing Japanese syllables, either in the traditional hiragana form or in the equivalent katakana form used for writing foreign words or telegrams. Occasionally reformers have suggested that Chinese characters should be abandoned in favor of the kana or the Roman alphabet, but this reform has not happened. Thus, newspapers employed vast collections of Chinese type, and careful handwriting was used in business, schools, and forms of printing such as photocopying that could duplicate handwriting. During the 1980s word processors were introduced that were capable of producing the traditional mixture of kanji, hiragana, and katakana, along with occasional words in Roman script and other Western symbols. The Macintosh, which was the first commercially successful computer with bitmapped (relating to a digital image for which an array of binary data specifies the value of each pixel) screen and printing, became popular in Japan because it could handle the language, but all Windows-based computers can now as well, as, of course, can indigenous Japanese word processors. Kana computer keyboards exist in Japan, but the most common input method for Chinese characters in both China and Japan requires the user to enter text into a Western keyboard, romanizing the words. Suppose that someone is using Microsoft Word in Japanese and wants to type the word meaning “comment.” The writer would press the Western keys that phonetically spell the Japanese word kannsou. If the word processor is set to do so, it will automatically dis-
play the equivalent hiragana characters instead of Western letters on the screen.
Many Meanings The writer probably does not want the hiragana but rather the kanji, but many Japanese words can be romanized kannsou. Asian languages have many homonyms (words that sound similar but have different meanings), and Chinese characters must represent the one intended meaning. The standard way in which word processors handle this awkward fact, in Chinese as well as Japanese, is to open a selection window containing the alternatives. For example, let’s say the user typed “kannsou,” then hit the spacebar (which is not otherwise used in ordinary Japanese) to open the selection window with the first choice highlighted. The user can select the second choice, which is the correct Chinese characters for the Japanese word meaning “a comment” (one’s thoughts and impressions about something). If the user wanted kannsou to mean not “comment,” but rather “dry,” he or she would select the third choice. The fourth through ninth choices mean “welcome” and “farewell,”“a musical interlude,” “completion, as of a race,”“meditate,”“hay”(dry grass), and “telling people’s fortunes by examining their faces.” Good Asian-language word processing software presents the choices in descending order of likelihood, and if a person selects a particular choice repeatedly it will appear on the top of the list. The word processor can be set so that the first kanji choice, instead of the hiragana, appears in the text being written. Pressing the spacebar once would transform it to the second choice, and pressing again could select the next choice and open the selection window. The choices may include a katakana choice as well. Many choices exist, and some Chinese word processors often fill the selection window four times over. Thus, research on the frequency of usage of various Chinese words is important in establishing their most efficient ordering in the selection window. Human-computer interaction (HCI) research has explored other ways of making the word selection, including eye tracking to select the alternative that the user’s eyes focus upon. The chief substitutes for keyboard text input are speech recognition and handwriting recognition. Speech recognition systems developed for English are
unsuitable for Asian languages. Notably, spoken Chinese is a tonal language in which each syllable has a characteristic pitch pattern, an important feature absent from English. Experts have done a good deal of research on computer recognition of Japanese and Chinese, but speech input introduces errors while requiring the same selection among choices, as does keyboard input. Handwriting recognition avoids the problem of alternative ways of writing homonyms, but despite much research it remains excessively error prone. Three approaches are being tried with Chinese: recognizing (1) the whole word, (2) the individual characters, or (3) parts of characters, called “radicals,” that may appear in many characters. All three approaches have high error rates because many characters are graphically complex, and people vary considerably in how they draw them. Thus, keyboard input remains by far the most popular method.
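The behavior of the selection window described above, with candidates presented in descending order of likelihood and recently chosen ones promoted, reduces to simple bookkeeping, as the illustrative sketch below suggests. The reading, the candidate labels (which stand in for actual kanji strings), and the counts are invented; a real input method draws on large frequency dictionaries and on context.

```python
from collections import defaultdict

class CandidateRanker:
    """Orders conversion candidates for a phonetic reading by how often the
    user has picked each one, falling back to the dictionary's default order."""
    def __init__(self, dictionary):
        self.dictionary = dictionary                  # reading -> list of candidates
        self.picks = defaultdict(int)                 # (reading, candidate) -> count

    def candidates(self, reading):
        base = self.dictionary.get(reading, [])
        return sorted(base,
                      key=lambda c: (-self.picks[(reading, c)], base.index(c)))

    def choose(self, reading, candidate):
        self.picks[(reading, candidate)] += 1         # remember the user's choice

# Invented dictionary: one romanized reading, several homonymous candidates
# (the bracketed labels stand in for the actual kanji strings).
ime = CandidateRanker({"kannsou": ["<comment>", "<dry>", "<welcome/farewell>", "<interlude>"]})

print(ime.candidates("kannsou"))     # dictionary order before any use
ime.choose("kannsou", "<dry>")
ime.choose("kannsou", "<dry>")
ime.choose("kannsou", "<comment>")
print(ime.candidates("kannsou"))     # '<dry>' now comes first, then '<comment>'
```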
Modern word processors may change the balance of forces working for or against change in the traditional Asian scripts. They may degrade people’s Chinese character handwriting skills, but they may simultaneously help people employ more obscure characters. A well-established finding in the psychology of memory is that people can recognize things they would not have spontaneously produced. Chinese-language and Japanese-language word processors often include character palettes (on-screen arrays from which individual characters can be picked), allowing users to select even obscure characters with a single click of the mouse, thereby perhaps encouraging them to do so. Computer and information scientists and engineers are rapidly producing search engines and a whole host of other tools that are giving the ancient Asian scripts a new life on the Internet and the World Wide Web. William Sims Bainbridge and Erika Bainbridge
East and West China, Japan, and Korea have from time to time considered abandoning the traditional Chinese characters, with Korea coming the closest to actually doing so. A phonetic writing system is easier to learn, thus giving students more time to study other things. The traditional Chinese system supported an entrenched intellectual elite, who feared that a simple alphabet might democratize writing. On the other hand, one advantage of the traditional system is that a vast region of the world speaking many dialects and languages could be united by a single writing system, and even today a Chinese person can communicate to some extent with a Japanese person—even though neither knows the other’s spoken language—by drawing the characters. Fluent bilingual readers of an Asian language and a Western language sometimes say they can read Chinese characters more quickly because the characters directly represent concepts, whereas Western letters represent sounds and thus only indirectly relate to concepts. Some writers have conjectured that dyslexia should be rare in Chinese, if difficulties in learning to read are an inability to connect letters with sounds. However, dyslexia seems to exist in every language, although its causes and characteristics might be somewhat different in Asian languages than in English.
See also Handwriting Recognition and Retrieval; Keyboard FURTHER READING Apple Computer Company. (1993). Macintosh Japanese input method guide. Cupertino, CA: Apple. Asher, R. E., & Simpson, J. M. Y. (Eds.). (1994). The encyclopedia of language and linguistics. Oxford, UK: Pergamon. Fujii, H., & Croft, W. B. (1993). A comparison of indexing techniques for Japanese text retrieval. In Proceedings of the 16th annual ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 237–246). New York: ACM Press. Ho, F.-C. (2002). An analysis of reading errors in Chinese language. In L. Jeffrey (Comp.), AARE 2002 conference papers (n.p.). Melbourne, Australia: Australian Association for Research in Education. Li,Y., Ding, X., & Tan, C. L. (2002). Combining character-based bigrams with word-based bigrams in contextual postprocessing for Chinese script recognition, ACM Transactions on Asian Language Information Processing, 1(4), 297–309. Shi, D., Damper, R. I., & Gunn, S. R. (2003). Offline handwritten Chinese character recognition by radical decomposition. ACM Transactions on Asian Language Information Processing, 2(1), 27–48. Wang, J. (2003). Human-computer interaction research and practice in China. ACM Interactions, 10(2), 88–96. Wang, J., Zhai, S., & Su, H. (2001). Chinese input with keyboard and eyetracking. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 349–356). New York: ACM Press.
THE ATANASOFF-BERRY COMPUTER The Atanasoff-Berry Computer (ABC) was the first electronic digital computer and the inspiration for the better-publicized 1946 ENIAC. It was conceived in late 1938, prototyped in 1939 at Iowa State College (now Iowa State University) in Ames, Iowa, and made usable for production computing by 1941. John Atanasoff, a professor of mathematics and physics, collaborated with Clifford Berry, a graduate student, to develop the system.
Physical Description In contrast to the computers that followed in the 1940s, the ABC was compact, movable, and easily operated by a single user. The original system no longer exists except for a logic module and a memory drum, but a functioning replica was constructed in the late 1990s. The ABC weighed about 750 pounds. It had the weight and maneuverability of an upright piano and could roll on four heavy-duty casters. The total power it drew was less than a kilowatt, and the heat generated by its vacuum tubes was low enough to dissipate without requiring fan-forced air. The ABC used ordinary 117-volt line power. An electric motor synchronized to standard 60-hertz line voltage served as the system clock. The electromechanical parts of the ABC, like those of a modern computer, were for purposes other than calculation; the computing itself was completely electronic. The arithmetic modules were identical and could easily be interchanged, removed, and repaired.
Intended Applications and Production Use The ABC was intended to solve dense systems of up to thirty simultaneous linear equations with 15-decimal precision. Atanasoff targeted a workload
like that of current scientific computers: curve-fitting, circuit analysis, structural analysis, quantum physics, and problems in mechanics and astronomy. The desktop calculators of the era were not up to the equation-solving task, and Atanasoff identified their limits as a common bottleneck in scientific research. His conception of a high-speed solution made several unprecedented leaps: binary internal arithmetic (with automatic binary-decimal conversion), all-electronic operation using logic gates, dynamically refreshed memory separated from the arithmetic units, parallel operation of up to thirty simultaneous arithmetic units, and a synchronous system clock. The ABC achieved practical success at the curve-fitting application. Atanasoff collaborated with a statistician colleague at Iowa State, George Snedecor, who supplied a steady stream of small linear-system problems to the ABC. Snedecor’s secretary was given the task of checking the results by desk calculation, which was simpler than solving the equations and could be performed manually.
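The ABC’s workload, solving small dense linear systems by eliminating variables, corresponds to what is now called Gaussian elimination. The sketch below is a conventional floating-point version offered only for orientation; it does not attempt to model the machine’s 50-bit fixed-point arithmetic or its card-based scratch storage, and the example system is invented.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with
    partial pivoting, followed by back substitution."""
    n = len(b)
    a = [row[:] for row in a]          # work on copies
    b = b[:]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        # Eliminate the variable from the rows below.
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (b[r] - sum(a[r][c] * x[c] for c in range(r + 1, n))) / a[r][r]
    return x

# A small invented system of three equations in three unknowns.
a = [[2.0, 1.0, -1.0],
     [-3.0, -1.0, 2.0],
     [-2.0, 1.0, 2.0]]
b = [8.0, -11.0, -3.0]
print([round(v, 6) for v in solve(a, b)])   # [2.0, 3.0, -1.0]
```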
Human Interface Compared to modern interfaces, the ABC interface resembled that of an industrial manufacturing machine. The user controlled the system with throw switches and card readers (decimal for input and binary for intermediate results). The user was also responsible for moving a jumper from one pair of contacts to another to indicate a particular variable in the system of equations. The ABC communicated to the user through a few incandescent lamp indicators, an ohmmeter to indicate correct working voltages, a binary punch card output, and a cylindrical readout for decimal numbers that resembled a car odometer. The inventors clearly designed the machine for operation by themselves, not general users. None of the switches or lamps was labeled; it was up to the user to remember what each switch did and what each lamp meant. One switch instructed the ABC to read a base-10 punch card, convert it to binary, and store it in the dynamic memory, for example. Furthermore, the open design of the ABC provided far less protection from electric shock than a modern appliance does. Exposed surfaces only a few
centimeters apart could deliver a 120-volt shock to the unwary. A user entered the coefficients of the equations on standard punch cards, using an IBM card punch. Each coefficient required up to fifteen decimals and a sign, so five numbers fit onto a single eighty-column card. It was in the user’s best interest to scale up the values to use all fifteen decimals, since the arithmetic was fixed-point and accumulated rounding error. Because the ABC could hold only two rows of coefficients in its memory at once, it relied on a mass storage medium to record scratch results for later use. (The solution of two equations in two unknowns did not require scratch memory.) Since magnetic storage was still in its infancy, Atanasoff and Berry developed a method of writing binary numbers using high-voltage arcs through a paper card. The presence of a hole, representing a 1, was then readable with lower voltage electrodes. Both reading and writing took place at 1,500 bits per second, which was a remarkable speed for input/output in 1940. However, the reliability of this system was such that a 1-bit error would occur every 10,000 to 100,000 bits, and this hindered the ability to use the ABC for production computing beyond five equations in five unknowns. To obtain human-readable results, the ABC converted the 50-bit binary values stored in the memory to decimals on the odometer readout. The total process of converting a single 15-decimal number and moving the output dials could take anywhere from 1 second to 150 seconds depending on the value of the number. Atanasoff envisioned automating the manual steps needed for operation, but enhancement of the ABC was interrupted by World War II and never resumed. The ABC was a landmark in human-computer interaction by virtue of being the first electronic computer. Its use of punch cards for the input of high-accuracy decimal data, binary internal representation, operator console, and the management of mass storage and volatile storage were major advancements for the late 1930s when Atanasoff and Berry conceived and developed it. John Gustafson See also ENIAC
FURTHER READING Atanasoff, J. V. (1984). Advent of electronic digital computing. Annals of the History of Computing, 6(3), 229–282. Burks, A. R. (2003). Who invented the computer? The legal battle that changed computing history. Amherst, NY: Prometheus Books. Burks, A. R., & Burks, A. W. (1989). The first electronic computer: The Atanasoff story. Ann Arbor: University of Michigan Press. Gustafson, J. (2000). Reconstruction of the Atanasoff-Berry computer. In R. Rojas & U. Hashagen (Eds.), The first computers: History and architectures (pp. 91–106). Cambridge, MA: MIT Press. Mackintosh, A. R. (1988, August). Dr. Atanasoff’s computer. Scientific American (pp. 72–78). Mollenhoff, C. R. (1988). Atanasoff: Forgotten father of the computer. Ames: Iowa State University Press. Randell, B. (Ed.). (1982). The origins of digital computers (pp. 305–325). New York: Springer-Verlag. Reconstruction of the Atanasoff-Berry Computer. (n.d.). Retrieved on January 27, 2004, from http://www.scl.ameslab.gov/ABC Sendov, B. (2003). John Atanasoff: The electronic Prometheus. Sofia, Bulgaria: St. Kliment Ohridski University Press. Silag, W. (1984). The invention of the electronic digital computer at Iowa State College, 1930–1942. The Palimpsest, 65(5), 150–177.
ATTENTIVE USER INTERFACE An attentive user interface is a context-aware human-computer interface that uses a person’s attention as its primary input to determine and act upon a person’s intent. Although we can read a person’s attention in her every word and action (even the way a person moves a cursor on a computer interface shows what she is attending to), we usually read attention in what and how people look at things. Visual attentive user interfaces concentrate on the autonomic (involuntary) and social responses that eyes communicate and read such eye movements as a lingering stare, a roving gaze, and a nervous blink in a language of ocular attention. Such interfaces also monitor the order in which people visually scan objects.
Eye Tracking Eye tracking is a technique that monitors a person’s eye movements to determine where she is looking. Eye tracking has long held promise as the ultimate human-computer interface, although eye tracking products have not been a commercial success. Original eye tracking approaches used mechanical/optical instruments that tracked mirrored contact lens reflections or even instruments that measured eye muscle tension. Newer approaches illuminate the eye with infrared light and watch reflections with a camera. Researchers can indirectly determine where a person’s eye is focusing by noting that an electroencephalogram (EEG) signal is dominated by an ocular stimulus. Four or five video strobe rates on different parts of a display can be distinguished in an EEG. When a person attends to one of them, his EEG pulses at the video strobe rate. Codings of attention on a screen can be identified with an EEG frequency counter.
Attention Can Be Detected Whereas the advertising and psychology fields have long used eye movement to understand what a person is looking at, the human-computer interface field has struggled to use the eye as a controller. However, the breakthrough in visual attentive user interfaces is in observing what the eye does, not in giving it a tracking task. Interest Tracker is a system that monitors the time that a person spends gazing over a title area instead of the time that the person spends gazing at a specific character to determine selection. For example, the title of an article is presented at the bottom of a computer screen. A user might glance down to read the title; if his glance plays over the title for more than .3 seconds a window opens on the computer screen with the full article. That .3 seconds of dwell time is less than the typical 1 second that is required for a computer user to select something on a screen by using a pointing device. Interest Tracker registers whether a person is paying attention to, for example, news feeds, stock prices, or help information and learns what titles to audition at the bottom of the screen. MAGIC (Manual and Gaze Input Cascaded) pointing is a technique that lets a computer mouse manipulate what a user’s eyes look at on a screen. An
Researcher Mike Li demonstrates the technology used in the Invision eye-tracking experiment. The balls on the screen have names of companies that move around as he looks at them. The object under the screen is the eye tracker. Photo courtesy of Ted Selker.
eye tracker enables the user’s gaze to roughly position the cursor, which the mouse can then manipulate. If the user wants to change the application window he is working with, he stares at the application window that he wants to work in; this stare “warps” the cursor to that application window. MAGIC pointing speeds up context changes on the screen.
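Interest Tracker’s dwell criterion is easy to state in code. The sketch below is illustrative rather than the actual system: it reads a stream of timestamped gaze samples, accumulates continuous time spent inside a title’s screen region, and triggers a selection once the dwell exceeds a 0.3-second threshold. The screen regions and gaze samples are invented.

```python
DWELL_THRESHOLD = 0.3          # seconds of continuous gaze needed to select

def inside(region, x, y):
    left, top, width, height = region
    return left <= x <= left + width and top <= y <= top + height

def detect_dwell(samples, regions, threshold=DWELL_THRESHOLD):
    """Yield (title, time) whenever continuous gaze within a title's region
    exceeds the threshold.  Samples are (timestamp, x, y) tuples."""
    current, since = None, None
    for t, x, y in samples:
        hit = next((name for name, r in regions.items() if inside(r, x, y)), None)
        if hit != current:                     # gaze moved to a different region
            current, since = hit, t
        elif hit is not None and t - since >= threshold:
            yield hit, t
            current, since = None, None        # reset so we fire once per dwell

# Invented layout: two headline regions at the bottom of the screen.
regions = {
    "markets-story": (0, 740, 400, 28),
    "weather-story": (420, 740, 400, 28),
}
# Invented gaze samples at 20 ms intervals: a brief glance, then a real dwell.
samples = [(0.00, 100, 300), (0.02, 150, 750), (0.04, 500, 750)] + \
          [(0.06 + 0.02 * i, 480 + i, 750) for i in range(20)]

for title, t in detect_dwell(samples, regions):
    print(f"open article '{title}' at t={t:.2f}s")
```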
The Path of Attention Can Demonstrate Intention During the late 1960s it was shown that the way that a person’s eyes move while scanning a picture describes aspects of what she is thinking. When researchers asked viewers seven questions about a painting entitled The Unexpected Visitor, seven identifiable eye-scan patterns were recognizable. The order in which a person looks at things also is a key to what that person is thinking. Research on an experiment called “Invision” uses this fact in a user interface to prioritize activities. Invision’s grouping of things by the way a person looks improves eye tracking and uses gaze to group things of interest. Knowing that an eye moves between staring fixations can help find those fixations. By analyzing the
eye-travel vectors between fixation vertices, Invision gains a much more accurate idea of what a person is trying to look at than by analyzing that person’s dwell time on a particular item. Attending to the order in which people look at things provides a powerful interface tool. Invision demonstrates that an attentive user interface can be driven from insights about where people look. Scenarios are created in which the attentive pattern of the eye gaze can be “understood” by a computer. By watching the vertices of a person’s eye moving through a visual field of company names, the system notices which ones interest the person. The company names aggregate themselves into clusters on the screen based on the person’s scanning patterns. A similar approach uses an ecological interface that is an image of a kitchen with several problems. On the counter is a dish with some food on it; the oven door is slightly ajar, as are the dishwasher and refrigerator doors. The manner in which a person’s eyes move around the kitchen image allows the interface to understand whether the person is hungry, thinking of taking care of problems, or thinking about something else in the kitchen. The interface uses the order in which the person views things in the image to bring up a menu and so forth. This approach aggregates eye motions into a story of what the person wants to do. The attention model drives the interface. The vertices of change in direction of eye movements easily give focus locations that have eluded most eye tracking research.
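The Invision approach of locating fixations at the vertices where eye travel changes direction, and of grouping items by the order in which they are viewed, might be sketched roughly as follows. The turn-angle threshold and the pair-counting rule are assumptions for illustration, not the published algorithm.

```python
import math
from collections import Counter

def direction_change_vertices(path, angle_threshold_deg=45.0):
    """Return gaze points where the eye-travel vector turns sharply,
    taken here as candidate fixation locations."""
    vertices = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(path, path[1:], path[2:]):
        a = math.atan2(y1 - y0, x1 - x0)
        b = math.atan2(y2 - y1, x2 - x1)
        turn = abs(math.degrees(b - a))
        turn = min(turn, 360.0 - turn)  # wrap angles into [0, 180]
        if turn > angle_threshold_deg:
            vertices.append((x1, y1))
    return vertices

def interest_pairs(visit_order):
    """Count how often two items are viewed in succession; items that often
    follow one another are candidates to be clustered together on screen."""
    return Counter(zip(visit_order, visit_order[1:]))

# Example: this visit order suggests clustering "IBM" with "HP".
print(interest_pairs(["IBM", "HP", "IBM", "HP", "Ford"]))
```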
Ocular Attention without Eye Tracking EyeaRe is an ocular attention system that is based on the fact that many of the social cues that are made by an eye do not depend on where the eye is looking. In fact, EyeaRe has no eye tracking system. It simply measures reflected infrared (IR) from the sclera (the opaque white outer coat enclosing the eyeball except the part covered by the cornea) and pupil to a photo diode. The system uses this reflected infrared to determine whether the eye is open, closed, blinking, winking, or staring. Without a camera such
a sensor can recognize many aspects of attention. EyeaRe consists of a Microchip PIC microprocessor that records and runs the system, an LED and a photo diode looking at the eye, and another LED/photo diode pair that measures whether it is in front of other EyeaRe devices and communicates information. An IR channel communicates to a video base station or a pair of glasses. If an EyeaRe user is staring, the IR reflection off his eye does not change. Staring at a video base station starts a video; glancing away stops it. The system can detect whether a user is paying attention to the video image; if the user doesn't like it and blinks her eyes in frustration, the system puts up a more pleasing image. When two people stare at each other, EyeaRe uses the IR communication channel to exchange information. When one person stares at another person, the person being stared at receives the contact information of the person who is staring. People tend to move their eyes no more than about 15 degrees to the side; EyeaRe has an 18-degree horizontal field of view. Thus, gaze and blink detection occurs when a person looks at the EyeaRe base station or glasses. EyeaRe demonstrates that a system that doesn't even track the eye can understand the intentions of attention.
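As a rough illustration of how a single IR photodiode reading could yield the open/closed/blink/stare distinctions EyeaRe relies on, consider the sketch below; the thresholds, window length, and state labels are invented for the example and are not the device's actual logic.

```python
def classify_eye_state(samples, closed_threshold=0.2, window=30):
    """Classify recent photodiode readings (0 = little reflection, 1 = strong).

    A closed eye reflects little IR; brief dips suggest blinks or winks;
    a steady, unchanging reflection suggests staring."""
    recent = samples[-window:]
    closed = [s < closed_threshold for s in recent]

    if all(closed):
        return "eyes closed"
    # Count transitions from open to closed within the window.
    dips = sum(1 for prev, cur in zip(closed, closed[1:]) if cur and not prev)
    if dips >= 3:
        return "rapid blinking"      # e.g. frustration or dislike
    if dips >= 1:
        return "blink or wink"
    if max(recent) - min(recent) < 0.05:
        return "staring"             # reflection present and nearly constant
    return "glancing around"
```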
A Simple Attentive Eye-Gesture Language To take eye communication one step further, the Eye Bed interface uses an eye-gesture language to perform tasks that are helpful to a person lying in bed. The Eye Bed demonstrates that computers can be attentive to people’s need to be horizontal eight hours a day. The Eye Bed interface uses eye tracking housed in a converted lamp hanging over the head of the person in bed. This interface easily distinguishes between staring at an object on the ceiling and glancing around indifferently. A language of attentional eye gestures drives the scenario. Glancing around shows lack of attention, whereas staring demonstrates attention. Blinking a long wink-like blink means selection. Blinking rapidly means dislike. Closing the eyes could mean that the user is going to sleep; thus, a sunset and a nighttime
scenario begin. Opening the eyes makes a morning and wakeup scenario begin. Intelligent systems analyze a person's reactions to media on music and video jukeboxes. The media offerings are auditioned to detect the attention shown them. Blinking when one doesn't like the media lets the system know that it should choose other music or video to show the person. Winking or closing the eyes turns off the system. The reading of eye gestures becomes an attentive user interface. Understanding attention requires a model of what eye movement means. Researchers can build a variety of interfaces from some simple observations of eye behavior. As an output device the eye is a simpler user interface tool than is normally described. The eye can easily be used with a language of closing, opening, blinking, winking, making nervous movements, glancing around, and staring. This language can be sensed with eye-tracking cameras or with a simple reflected LED, as the EyeaRe system demonstrates.
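The Eye Bed's eye-gesture language amounts to a small mapping from detected gestures to bedside actions, roughly as follows; the gesture labels paraphrase the description above, and the dispatch structure itself is an assumption made for illustration.

```python
EYE_BED_ACTIONS = {
    "staring":         "select the object being looked at on the ceiling",
    "glancing around": "treat as inattention; do nothing",
    "long blink":      "confirm the current selection",
    "rapid blinking":  "dislike; choose other music or video",
    "eyes closed":     "begin the sunset and nighttime scenario",
    "eyes opened":     "begin the morning and wakeup scenario",
}

def react(gesture: str) -> str:
    """Map a detected eye gesture to a bedside action."""
    return EYE_BED_ACTIONS.get(gesture, "no recognized gesture")

print(react("rapid blinking"))
```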
Promises of the Future Attentive user interfaces hold great promise. People are now in a position to implement and extend such interfaces. The hardware to create and test them is easily accessible. With the use of the eye as a secondary indicator of intention, researchers can make robust and computationally simple visual interfaces. Models of human intention and attention are becoming part of all human-computer interfaces. The context of where we are and what we are doing can accomplish more than automatically opening the grocery store door. Many interfaces can be driven completely by noticing a person’s attention. Sensors in a given context can detect many things about human attention. For example, a sensor pad in front of an office door can detect if a person has arrived to visit. Many biometrics (relating to the statistical analysis of biological observations and phenomena) such as EEG changes, sweat responses, and heart rate variability are candidates for attentive user interfaces. People want to focus on what they are doing and on the people they are with. Attentive user interfaces can detect people’s intentions without taking
their attention—even encouraging their ocular focus to be on what they want to do. Attentive user interfaces allow people’s attention to make things happen. Ted Selker See also Eye Tracking
FURTHER READING Bolt, R. A. (1985). Conversing with computers. Technology Review, 88(2), 34–43. Gregory, R. L. (1997). Eye and brain: The psychology of seeing. Oxford, UK: Oxford University Press. Guo, X. (1999). Eye contact—Talking about non-verbal communication: A corpus study. Retrieved April 29, 2004, from http://www.languagemagazine.com/internetedition/ma99/sprpt35.html Maglio, P. P., Barrett, R., Campbell, C. S., & Selker, T. (2000). SUITOR: An attentive information system. New York: ACM Press. Morimoto, D., & Flickner, M. (2000). Pupil detection using multiple light sources. Image and Vision Computing, 18, 331–335. Nervous TV newscasters blink more. (1999). Retrieved April 29, 2004, from http://www.doctorbob.com/news/7_24nervous.html Rice, R., & Love, G. (1987). Electronic emotion: Socioemotional content in a computer-mediated communication. Communication Research, 14(1), 85–108. Russell, S., & Norvig, P. (1995). Artificial intelligence: A modern approach. Upper Saddle River, NJ: Prentice Hall. Selker, T., & Burleson, W. (2000). Context-aware design and interaction in computer systems. IBM Systems Journal, 39(3–4), 880–891. Shepard, R. N. (1967). Recognition memory for words, sentences and pictures. Journal of Verbal Learning and Verbal Behavior, 6, 156–163.
AUGMENTED COGNITION Augmented cognition is a field of research that seeks to extend a computer user’s abilities via technologies that address information-processing bottlenecks inherent in human-computer interaction (HCI). These bottlenecks include limitations in attention, memory, learning, comprehension, visualization abilities, and decision-making. Limitations in human cognition (the act or process of knowing) are due to intrinsic restrictions in the number of mental tasks that a person can execute at one time, and these restrictions may fluctuate from moment to moment
depending on a host of factors, including mental fatigue, novelty, boredom, and stress. As computational interfaces have become more prevalent in society and increasingly complex with regard to the volume and type of information presented, researchers have investigated novel ways to detect these bottlenecks and have devised strategies to aid users and improve their performance via technologies that assess users' cognitive status in real time. An augmented cognition system monitors the state of a user through behavioral, psychophysiological, and/or neurophysiological data and adapts or augments the computational interface to significantly improve users' performance on the task at hand.
Emergence of Augmented Cognition The cognitive science and HCI communities have researched augmented cognition for several decades. Scientific papers in this field increased markedly during the late 1990s and addressed efforts to build and use models of attention in information display and notification systems. However, the phrase "augmented cognition" associated with this research did not find widespread use until the year 2000, when a U.S. Defense Department Defense Advanced Research Projects Agency (DARPA) Information Science and Technology (ISAT) group study and a workshop on the field at the National Academy of Sciences were held. During the year 2002 the number of papers about augmented cognition increased again. This increase was due, in part, to the start of a DARPA research program in augmented cognition in 2001 with a focus on the challenges and opportunities of monitoring cognitive states in real time with physiological sensors. This substantial investment in these developing technologies helped bring together a research community and stimulated a set of thematically related projects on addressing cognitive bottlenecks via the monitoring of cognitive states. By 2003 the augmented cognition field extended well beyond the boundaries of those specific Defense Department research projects, but that initial investment provided impetus for the infant field to begin to mature.
Early Investments in Related Work Augmented cognition does not draw from just one scientific field—it draws from fields such as neuroscience, biopsychology, cognitive psychology, human factors, information technology, and computer science. Each of these fields has itself undergone a substantial revolution during the past forty years that has allowed the challenges raised by researchers to begin to be investigated. Although many individual research projects contributed to the general development and direction of augmented cognition, several multimillion-dollar projects helped shape the foundation on which the field is built. Since the invention of the electronic computer, scientists and engineers have speculated about the unique relationship between humans and computers. Unlike mechanized tools, which are primarily devices for extending human force and action, the computer became an entity with which humans forged an interactive relationship, particularly as computers came to permeate everyday life. In 1960 one of the great visionaries of intelligent computing, J. C. R. Licklider, wrote a paper entitled "Man-Computer Symbiosis." Licklider was director of the Information Processing Techniques Office (IPTO) at the Defense Department's Advanced Research Projects Agency (ARPA) during the 1960s. In his paper he stated, "The hope is that, in not too many years, human brains and computing machines will be coupled together very tightly, and that the resulting partnership will think as no human brain has ever thought and process data in a way not approached by the information-handling machines we know today" (Licklider 1960, 4). Almost prophetic, this description of the symbiotic relationship between humans and computers is one of the first descriptions of what could be considered an augmented cognition computational system. Although research on this topic was not conducted during his tenure at ARPA during the 1960s, Licklider championed the research that developed into the now-burgeoning field of computer science, including creation of the Arpanet computer network (forerunner of the Internet). His research, vision, and direction had a significant impact on both computer science and information technology and set the stage for the field of augmented cognition.
During the early 1960s researchers speculated that electrical signals emanating from a human brain in the form of electroencephalographic (EEG) recordings could be used as indicators of specific events in human cognitive processing. Several Department of Defense investigations into detecting these signals and other measurements occurred through the biocybernetics and learning strategies programs sponsored by ARPA during the 1970s and 1980s. The earliest program was biocybernetics, which tested the hypothesis that EEG activity might be able to control military devices and serve as indicators of user performance. In this program biocybernetics was defined as a real-time connection between the operator and computational system via physiological signals recorded during specific tasks. Both the biocybernetics and learning strategies programs centered around the creation of closed-loop feedback systems (the relationship between user and computational system, where changes in the computational interface are driven by detected changes in the user’s physiological status, which in turn change as a result of the new format of the interface) between operator and computer for the selection and training of personnel, display/control design, and online monitoring of operator status (although with slightly different military application domains between the two programs). In both programs researchers saw the real-time identification of cognitive events as critical to understanding the best methods for aiding military users in a rapid and contextually appropriate way. However, when this research was begun, both computational systems and neuroscience were in their infancy, and the results of this research were not incorporated into production military systems. Augmented cognition can be viewed as a descendant of these early programs. Another investigation in this field was the Pilot’s Associate (PA) program sponsored by DARPA during the 1980s and early 1990s. Pilot’s Associate was an integrated system of five components that incorporated AI (artificial intelligence) techniques and cognitive modeling to aid pilots in carrying out their missions with increased situational awareness and enhanced decision-making. Unlike biocybernetics, PA utilized cognitive modeling alone and did not in-
corporate any physiological monitoring. Cognitive modeling was the cornerstone of the pilot-vehicle interface (PVI), which had the critical task of managing all pilot interactions with the system by inferring the pilot’s intentions and communicating these intentions to the other components of the PA system. The PVI was also responsible for modeling pilot workload to adapt and configure the information displays in the cockpit, conveying workload information to the other subsystems, and compensating for pilot behavior that might result in an error. An example of this work was a PA program at NASA-Ames Research Center that explored the use of probabilistic models of a pilot’s goals and workload over time, based on multiple inputs and the use of models to control the content and complexity of displays. Such models did not employ physiological measures of a pilot’s cognitive status. Other research occurred in the academic and private sectors, including the attentional user interface (AUI) project at Microsoft Research during the late 1990s, which provided conceptual support to efforts in augmented cognition. Researchers developed methods for building statistical models of attention and workload from data. Researchers built architectures to demonstrate how cognitive models could be integrated with real-time information from multiple sensors (including acoustical sensing, gaze and head tracking, and events representing interaction with computing systems) to control the timing and communication medium of incoming notifications. AUI work that included psychological studies complemented the systems and architectures work.
Foundations of Augmented Cognition In light of these earlier research efforts, the logical question arises: What sets augmented cognition apart from what has already been done? As mentioned, augmented cognition relies on many fields whose maturity is critical for its success. Although programs such as biocybernetics during the 1970s had similar goals, they did not have access to the advanced computational power necessary to process brain signals in real time, nor did researchers know enough about those signals to use them to control displays or machines. Likewise, the Pilot’s Associate program
during the 1980s shared many aspirations of today’s augmented cognition, namely to develop adaptive interfaces to reduce pilot workload. However, PA could assess the status of a pilot from inferences and models based only on the pilot’s overt behavior and the status of the aircraft. What distinguishes augmented cognition is its capitalization on advances in two fields: behavioral/neural science and computer science. At the start of the twenty-first century researchers have an unparalleled understanding of human brain functioning. The depth of this understanding is due to the development of neuroscientific techniques funded by the U.S. National Institutes of Health (NIH) and other agencies during the 1990s, a period now referred to as the “Decade of the Brain.” The billion-dollar funding of the fields of neuroscience, cognitive science, and biopsychology resulted in some of the greatest advances in our understanding of the human biological system in the twentieth century. For example, using techniques such as functional magnetic resonance imaging (fMRI), scientists were able to identify discrete three-dimensional regions of the human brain active during specific mental tasks. This identification opened up the field of cognitive psychology substantially (into the new field of cognitive neuroscience) and enabled researchers to test their theories of the human mind and associate previously observed human thoughts and behaviors with neural activity in specific brain regions. Additional investment from the Department of Defense and other agencies during the twenty-first century has allowed researchers to develop even more advanced sensors that will eventually be used in augmented cognition systems. Novel types of neurophysiological signals that are measurable noninvasively include electrical signals—using electroencephalography and event-related potentials (identifiable patterns of activity within the EEG that occur either before specific behaviors are carried out, or after specific stimuli are encountered)— and local cortical changes in blood oxygenation (BOLD), blood volume, and changes in the scattering of light directly due to neuronal firing (using near infrared [NIR] light). Most of these signals, unlike fMRI, can be collected from portable measurement systems in real time, making them potentially available for everyday use. All augmented cognition systems do not necessarily contain advanced neurophysiolog-
ical sensors, but the field of augmented cognition is broadened even further by their inclusion.
INTELLIGENT AGENT: A software program that actively locates information for you based on parameters you set. Unlike a search engine or information filter, it actively seeks specific information while you are doing other things.
As a result of the "Decade of the Brain," researchers have an increased knowledge of the cognitive limitations that humans face. The HCI field focuses on the design, implementation, and evaluation of interactive systems in the context of a user's work. However, researchers in this field can work only with the data and observations easily accessible to them, that is, how people overtly behave while using interfaces. Through efforts in neuroscience, biopsychology, and cognitive neuroscience we can locate and measure activity from the brain regions that are actively involved in day-to-day information-processing tasks. Researchers will have a greater understanding of the cognitive resources that humans possess and how many of these resources are available during a computationally based task, whether or not the computational systems will include advanced sensors. After these cognitive resources are identified and their activity (or load) measured, designers of computational interfaces can begin to account for these limitations (and perhaps adapt to their status) in the design of new HCI systems. Finally, without advances in computer science and engineering, none of the neuroscientific developments listed here would be possible, and the field of augmented cognition would certainly not be feasible. During the past forty years society has experienced leaps in computational prowess and the sophistication of mathematical algorithms. These leaps have been due in part to the miniaturization of transistors and other silicon-based components so that more computational power is available per square inch of hardware. This miniaturization has allowed computers to shrink in size until they have permeated the very fabrics that people wear and even their environments. Computer code itself has become smaller and more flexible, with the emergence of
agent-based computing (the instantiation of active, persistent software components that perceive, reason, act, and communicate in software code), Java, and Internet services. Thus, augmented cognition has benefited from two computing advances—improvements in raw computational resources (CPUs, physical memory) and improvements in the languages and algorithms that make adaptive interfaces possible. Many other fields have benefited from these advances as well and in turn have fed into the augmented cognition community. These fields include user modeling, speech recognition, computer vision, graphical user interfaces, multimodal interfaces, and computer learning/artificial intelligence.
Components of an Augmented Cognition System At the most general level, augmented cognition harnesses computation and knowledge about human limitations to open bottlenecks and address the biases and deficits in human cognition. It seeks to accomplish these goals through continual background sensing, learning, and inferences to understand trends, patterns, and situations relevant to a user’s context and goals. At its most general level, an augmented cognition system should contain at least four components—sensors for determining user state, an inference engine or classifier to evaluate incoming sensor information, an adaptive user interface, and an underlying computational architecture to integrate the other three components. In reality a fully functioning system would have many more components, but these are the most critical. Independently, each of these components is fairly straightforward. Much augmented cognition research focuses on integrating these components to “close the loop” and create computational systems that adapt to their users. Thus, the primary challenge with augmented cognition systems is not the sensors component (although researchers are using increasingly complex sensors). The primary challenge is accurately predicting/assessing, from incoming sensor information, the correct state of the user and having the computer select an appropriate strategy to assist the user at that time. As discussed, humans have limitations in at-
tention, memory, learning, comprehension, sensory processing, visualization abilities, qualitative judgments, serial processing, and decision-making. For an augmented cognition system to be successful it must identify at least one of these bottlenecks in real time and alleviate it through a performance-enhancing mitigation strategy. Such mitigation strategies are conveyed to the user through the adaptive interface and might involve modality switching (between visual, auditory, and haptic [touch]), intelligent interruption, task negotiation and scheduling, and assisted context retrieval via bookmarking. When a user state is correctly sensed, an appropriate strategy is chosen to alleviate the bottleneck, the interface is adapted to carry out the strategy, and the resulting sensor information indicates that the aiding has worked—only then has a system "closed the loop" and successfully augmented the user's cognition.
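A schematic of the closed loop described above, with the four components reduced to stubs. The class names, the threshold-based workload classifier, and the particular mitigation strategies are illustrative assumptions, not a description of any fielded augmented cognition system.

```python
import random
import time

class SensorSuite:
    def read(self):
        # Placeholder for EEG, heart-rate, gaze, or interaction data.
        return {"workload_signal": random.random()}

class WorkloadClassifier:
    def infer(self, sample):
        # Placeholder inference: map the signal to a coarse user state.
        return "overloaded" if sample["workload_signal"] > 0.7 else "ok"

class AdaptiveInterface:
    def apply(self, strategy):
        print(f"adapting interface: {strategy}")

MITIGATIONS = {
    "overloaded": "switch notifications to audio and defer interruptions",
    "ok": "no change",
}

def closed_loop(cycles=5):
    """Sense, infer the user's state, adapt, and repeat."""
    sensors, classifier, ui = SensorSuite(), WorkloadClassifier(), AdaptiveInterface()
    for _ in range(cycles):
        state = classifier.infer(sensors.read())
        ui.apply(MITIGATIONS[state])
        time.sleep(0.1)  # in a real system the loop runs continuously

closed_loop()
```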
Applications of Augmented Cognition The applications of augmented cognition are numerous, and although initial investments in systems that monitor cognitive state have been sponsored by military and defense agencies, the commercial sector has shown interest in developing augmented cognition systems for nonmilitary applications. As mentioned, closely related work on methods and architectures for detecting and reasoning about a user’s workload (based on such information as activity with computing systems and gaze) have been studied for nonmilitary applications such as commercial notification systems and communication. Agencies such as NASA also have shown interest in the use of methods to limit workload and manage information overload. Hardware and software manufacturers are always eager to include technologies that make their systems easier to use, and augmented cognition systems would likely result in an increase in worker productivity with a savings of both time and money to companies that purchased these systems. In more specific cases, stressful jobs that involve constant information overload from computational sources, such as air traffic control, would also benefit from such technology. Finally, the fields of education and training are the next likely targets for augmented cognition technology after it reaches commercial vi-
ability. Education and training are moving toward an increasingly computational medium. With distance learning in high demand, educational systems will need to adapt to this new nonhuman teaching interaction while ensuring quality of education. Augmented cognition technologies could be applied to educational settings and guarantee students a teaching strategy that is adapted to their style of learning. This application of augmented cognition could have the biggest impact on society at large. Dylan Schmorrow and Amy Kruse See also Augmented Reality; Brain-Computer Interfaces; Information Overload FURTHER READING Cabeza, R., & Nyberg, L. (2000). Imaging cognition II: An empirical review of 275 PET and fMRI studies. Journal of Cognitive Neuroscience, 12(1), 1–47. Dix, A., Finlay, J., Abowd, G., & Beale, R. (1998). Human computer interaction (2nd ed.). London, New York: Prentice Hall. Donchin, E. (1989). The learning strategies project. Acta Psychologica, 71(1–3), 1–15. Freeman, F. G., Mikulka, P. J., Prinzel, L. J., & Scerbo, M. W. (1999). Evaluation of an adaptive automation system using three EEG indices with a visual tracking task. Biological Psychology, 50(1), 61–76. Gevins, A., Leong, H., Du, R., Smith, M. E., Le, J., DuRousseau, D., Zhang, J., & Libove, J. (1995). Towards measurement of brain function in operational environments. Biological Psychology, 40, 169–186. Gomer, F. (1980). Biocybernetic applications for military systems. Chicago: McDonnell Douglas. Gray, W. D., & Altmann, E. M. (2001). Cognitive modeling and human-computer interaction. In W. Karwowski (Ed.), International encyclopedia of ergonomics and human factors (pp. 387–391). New York: Taylor & Francis. Horvitz, E., Pavel, M., & Schmorrow, D. D. (2001). Foundations of augmented cognition. Washington, DC: National Academy of Sciences. Humphrey, D. G., & Kramer, A. F. (1994). Toward a psychophysiological assessment of dynamic changes in mental workload. Human Factors, 36(1), 3–26. Licklider, J. C. R. (1960). Man-computer symbiosis: IRE transactions on human factors in electronics. HFE-1 (pp. 4–11). Lizza, C., & Banks, S. (1991). Pilot’s Associate: A cooperative, knowledge-based system application. IEEE Intelligent Systems, 6(3), 18–29. Mikulka, P. J., Scerbo, M. W., & Freeman, F. G. (2002). Effects of a biocybernetic system on vigilance performance. Human Factors, 44, 654–664. Prinzel, L. J., Freeman, F. G., Scerbo, M. W., Mikulka, P. J., & Pope, A. T. (2000). A closed-loop system for examining psychophysiological measures for adaptive task allocation. International Journal of Aviation Psychology, 10, 393–410.
Wilson, G. F. (2001). Real-time adaptive aiding using psychophysiological operator state assessment. In D. Harris (Ed.), Engineering psychology and cognitive ergonomics (pp. 175–182). Aldershot, UK: Ashgate. Wilson, R. A., & Keil, F. C. (Eds.). (2001). The MIT encyclopedia of the cognitive sciences (MITECS). Cambridge, MA: MIT Press.
AUGMENTED REALITY Augmented reality is a new field of research that concentrates on integrating virtual objects into the real world. These virtual objects are computer graphics displayed so that they merge with the real world. Although in its infancy, augmented reality holds out the promise of enhancing people’s ability to perform certain tasks. As sensing and computing technologies advance, augmented reality is likely to come to play a significant role in people’s daily lives.
Augmented Reality and Virtual Reality An augmented-reality system merges the real scene viewed by the user with computer-generated virtual objects to generate a composite view for the user. The virtual objects supplement the real scene with additional and useful information. Sounds may be added through the use of special headphones that allow the user to hear both real sounds and synthesized sounds. There are also special gloves that a user can wear that provide tactile sensation such as hardness or smoothness. A user wearing such gloves could “feel” virtual furniture in a real room. In an augmented-reality system, users can walk around a real room, hear the echo of their footsteps, and feel the breeze from an air conditioning unit, while at the same time they can see computer-generated images of furniture or paintings. One of the requirements of an augmented-reality system is that it needs to be interactive in real time. Animation, sound, and textures are added in real time so that what the user sees, hears, and feels reflects the true status of the real world. The most important characteristic of augmented reality is the ability to render objects in three-dimensional space,
which makes them much more realistic in the eyes of the user. Virtual objects are drawn in relationship to the real objects around them, both in terms of position and size. If a virtual object is situated partially behind a real object (or vice versa) then the user should not see part of the obscured object. Occlusion of objects is the largest contributor to human depth perception. The major difference between augmented reality and virtual reality is that in virtual reality everything that is sensed by the user is computer generated. Therefore the virtual objects must be rendered as photorealistically as possible in order to achieve the feeling of immersion. Augmented reality uses both real and synthetic sights, sounds, and touches to convey the desired scene, so virtual objects do not bear the entire burden of persuading the user that the scene is real, and therefore they do not need to be so photorealistic. Augmented reality lies in the middle of the continuum between absolute reality (in which everything sensed is real) and virtual reality (in which everything that is sensed is created).
Different Types of Displays for Augmented Reality Most people depend on vision as their primary sensory input, so here we will discuss several types of visual displays that can be used with augmented reality, each with its own advantages and disadvantages. Visual displays include head-mounted displays (HMDs), monitor-based displays, projected images, and heads-up displays (HUDs). Head-Mounted Displays HMDs are headsets that a user wears. HMDs can either be see-through or closed view. The see-through HMD works as its name implies: The user looks through lenses to see the real world, but the lenses are actually display screens that can have graphics projected onto them. The biggest advantage of the see-through HMD mechanism is that it is simple to implement because the real world does not have to be processed and manipulated; the mechanism’s only task is to integrate the visual augmentations.
This reduces the safety risk, since the user can see the real world in real time. If there is a power failure, the user will still be able to see as well as he or she would when wearing dark sunglasses. If there is some kind of hazard moving through the area—a forklift, for example—the wearer does not have to wait for the system to process the image of the forklift and display it; the wearer simply sees the forklift as he or she would when not wearing the HMD. One disadvantage is that the virtual objects may appear to lag behind the real objects; this happens because the virtual objects must be processed, whereas real objects do not need to be. In addition, some users are reluctant to wear the equipment for fear of harming their vision, although there is no actual risk, and other users dislike the equipment's cumbersome nature. A new version of the see-through HMD is being developed to resemble a pair of eyeglasses, which would make it less cumbersome. Closed-view HMDs cannot be seen through. They typically comprise an opaque screen in front of the wearer's eyes that totally blocks all sight of the real world. This mechanism is also used for traditional virtual reality. A camera takes an image of the real world, merges it with virtual objects, and presents a composite image to the user. The advantage the closed-view has over the see-through version of the HMD is that there is no lag time for the virtual objects; they are merged with the real scene before being presented to the user. The disadvantage is that there is a lag in the view of the real world because the composite image must be processed before being displayed. There are two safety hazards associated with closed-view HMDs. First, if the power supply is interrupted, the user is essentially blind to the world around him. Second, the user does not have a current view of the real world. Users have the same concerns and inhibitions regarding closed-view HMDs as they do regarding see-through HMDs. Monitor-Based Displays Monitor-based displays present the composite view of real and virtual objects on an ordinary screen. There are several advantages to configuring an augmented-reality system this way. First, because a monitor is a separate display device, more information can be presented to
the user. Second, the user does not have to wear (or carry around) heavy equipment. Third, graphical lag time can be eliminated because the real world and virtual objects are merged in the same way they are for closed-view HMDs. The safety risk is avoided because the user can see the real world in true real time. There are also some drawbacks to using monitor-based displays instead of HMDs. First, the user must frequently look away from the workspace in order to look at the display. This can cause a slowdown in productivity. Another problem is that the user can see both the real world and—on the monitor—the lagging images of the real world. In a worst-case situation in which things in the scene are moving rapidly, the user could potentially see a virtual object attached to a real object that is no longer in the scene. Projected-Image Displays Projected-image displays project the graphics and annotations of the augmented-reality system onto the workspace. This method eliminates the need for extra equipment and also prevents the user from having to look away from the work area to check the monitor-based display. The biggest disadvantage is that the user can easily occlude the graphics and annotations by moving between the projector and the workspace. Users also can put their hands and arms through the projected display, reducing their sense of the reality of the display. Heads-Up Displays Heads-up displays are very similar to see-through HMDs. They do not require the user to wear special headgear, but instead display the data on a see-through screen in front of the user. As in see-through HMDs, these systems are easy to implement; however, there may be a lag time in rendering the virtual object.
Challenges in Augmented Reality A majority of the challenges facing augmented reality concern the virtual objects that are added to the real world. These challenges can be divided into two areas: registration and appearance. Registration
involves placing the virtual objects in the proper locations in the real world. This is an important element of augmented reality and includes sensing, calibration, and tracking. Appearance concerns what the virtual objects look like. In order to achieve seamless merging of real and virtual objects, the virtual objects must be created with realistic color and texture. In virtual-reality systems, tracking the relative position and motion of the user is an important research topic. Active sensors are widely used to track position and orientation of points in space. The tracking information thus obtained is fed into the computer graphics system for appropriate rendering. In virtual reality, small errors in tracking can be tolerated, as the user can easily overlook those errors in the entirely computer-generated scene. In augmented-reality systems, by contrast, the registration is performed in the visual field of the user. The type of display used in the system usually determines the accuracy needed for registration. One popular registration technique is visionbased tracking. Many times, there are fiducials (reference marks) marked out in the scene in which the virtual objects need to be placed. The system recognizes these fiducials automatically and determines the pose of the virtual object with respect to the scene before it is merged. There are also techniques that use more sophisticated vision algorithms to determine the pose without the use of fiducials. The motion of the user and the structure of the scene are computed using projective-geometry formulation. (Projective geometry is the branch of geometry that deals with projecting a geometric figure from one plane onto another plane; the ability to project points from one plane to another is essentially what is needed to track motion through space.) For a seamless augmented-reality system, it is important to determine the geometry of the virtual object with respect to the real scene, so that occlusion can be rendered appropriately. Stereo-based depth estimation and the z-buffer algorithm (an algorithm that makes possible the representation of objects that occlude each other) can be used for blending real and virtual objects. Also, using research results in radiosity (a technique for realistically
simulating how light reflects off objects), it is possible to “shade” the virtual object appropriately so that it blends properly with the background scene.
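The z-buffer occlusion idea can be illustrated with a per-pixel depth comparison: a virtual pixel is composited over the camera image only where the virtual surface is nearer than the estimated depth of the real scene. The array names and the assumption of an available real-scene depth map (for example, from stereo) are purely for illustration.

```python
import numpy as np

def composite(real_rgb, real_depth, virtual_rgb, virtual_depth, virtual_mask):
    """Blend a rendered virtual layer into a real camera image, letting
    nearer real surfaces occlude the virtual object.

    real_rgb, virtual_rgb: (H, W, 3) images
    real_depth, virtual_depth: (H, W) distances from the camera
    virtual_mask: (H, W) True where the virtual object was rendered
    """
    out = real_rgb.copy()
    # A virtual pixel wins only where it exists and is closer than the real scene.
    visible = virtual_mask & (virtual_depth < real_depth)
    out[visible] = virtual_rgb[visible]
    return out
```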
Applications Augmented reality has applications in many fields. In medicine, augmented reality is being researched as a tool that can project the output of magnetic resonance imaging (MRI), computed tomography (CT) scans, and ultrasound imaging onto a patient to aid in diagnosis and planning of surgical operations. Augmented reality can be used to predict more accurately where to perform a biopsy for a tiny tumor: All the information gathered from traditional methods such as MRIs can be projected onto the patient to reveal the exact location of the tumor. This enables a surgeon to make precise incisions, reducing the stress of the surgery and decreasing the trauma to the patient. In architecture and urban planning, annotation and visualization techniques can be used to show how the addition of a building will affect the surrounding landscape. Actually seeing the future building life sized, in the location it will occupy, gives a more accurate sense of the project than can be conveyed from a model. Augmented-reality simulations also make it easier to recognize potential problems, such as insufficient natural lighting for a building. Augmented reality also has the potential to let developers, utility companies, and home owners “see” where water pipes, gas lines, and electrical wires are run through walls, which is an aid when it comes to maintenance or construction work. In order for this technique to be implemented, the data must be stored in a format the augmented-reality system can use. Simply having a system that can project the images of electrical wiring on a wall would not be sufficient; the system first must know where all the wires are located. Augmented reality has the potential to make a big impact on the entertainment industry. A simple example is the glowing puck that is now used in many televised hockey games. In this application, the hockey puck is tracked and a brightly colored dot
is placed on top of it on the television video to make it easier for those watching the game on television to follow the rapid motion of the puck. Augmented reality could also make possible a type of virtual set, very similar to the blue-screen sets that are used today to film special effects. Augmented-reality sets would be interactive, would take up less space, and would potentially be simpler to build than traditional sets. This would decrease the overall cost of production. Another example, already developed is the game AR2 Hockey, in which the paddles and field (a table, as in air hockey) are real but the puck is virtual. The computer provides visual tracking of the virtual puck and generates appropriate sound effects when the paddles connect with the puck or when the puck hits the table bumpers. One military application is to use the technology to aim weapons based on the movement of the pilot’s head. Graphics of targets can be superimposed on a heads-up display to improve weapons’ accuracy by rendering a clearer picture of the target, which will be hard to miss. Many examples of assembly augmented-reality systems have been developed since the 1990s. One of the best known is the Boeing wire-bundling project, which was started in 1990. Although well known, this project has not yet been implemented in a factory as part of everyday use. The goal is relatively straightforward: Use augmented reality to aid in the assembly of wire bundles used in Boeing’s 747 aircraft. For this project, the designers decided to use a see-through HMD with a wearable PC to allow workers the freedom of movement needed to assemble the bundles, which were up to 19 meters long. The subjects in the pilot study were both computer science graduate students who volunteered and Boeing employees who were asked to participate. The developers ran into both permanent and temporary problems. One temporary problem, for example, was that the workers who participated in the pilot program were typically tired because the factory was running the pilot study at one of the busier times in its production cycle. Workers first completed their normal shift before working on the pilot project. Another temporary problem
was the curiosity factor: Employees who were not involved with the project often came over to chat and check out what was going on and how the equipment worked. More permanent problems were the employees’ difficulties in tracing the wires across complex subassemblies and their hesitance to wear the headsets because of fear of the lasers located close to their eyes and dislike of the “helmet head” effect that came from wearing the equipment. One of the strongest success points for this pilot study was that the bundles created using the augmented-reality system met Boeing’s quality assurance standards. Another good thing was that the general background noise level of the factory did not interfere with the acoustic tracker. In the pilot study, augmented reality offered no improvement in productivity and the only cost savings came from no longer needing to store the various assembly boards. (This can be, however, a significant savings.) The developers concluded that the reason there was no significant improvement in assembly time was because they still had some difficulty using the system’s interface to find specific wires. The developers are working on a new interface that should help to solve this problem. Augmented reality has also been used in BMW automobile manufacture. The application was designed to demonstrate the assembly of a door lock for a car door, and the system was used as a feasibility study. The annotations and graphics were taken from a CAD (computer-aided design) system that was used to construct the actual physical parts for the lock and the door. In this case, the augmentedreality system uses a see-through HMD and a voiceactivated computer—in part because the assembly process requires that the user have both hands free for the assembly process. Because this augmentedreality system mimicked an existing virtual-reality version of assembly planning for the door lock assembly, much of the required data was already available in an easily retrievable format, which simplified the development of the augmented-reality system. The developers had to overcome certain problems with the system in order to make the pilot work. The first was the issue of calibration. There is an ini-
tial calibration that must be performed as part of the start-up process. The calibration is then performed periodically when the system becomes confused or the error rate increases past a certain threshold. Users seemed to have difficulty keeping their heads still enough for the sensitive calibration process, so a headrest had to be built. Another problem was that the magnetic tracking devices did not work well because there were so many metal parts in the assembly. In addition, the speech recognition part of the system turned out to be too sensitive to background noise, so it was turned off. The pilot study for this project was used as a demonstration at a trade show in Germany in 1998. The program ran for one week without difficulty. Due to time considerations, the system was not calibrated for each user, so some people were not as impressed as the developers had hoped. Also, even with the headrest, some users never stayed still long enough for a proper calibration to be performed. Their reactions showed researchers that average users require some degree of training if they are to use this sort of equipment successfully. Despite setbacks, the developers considered the pilot a success because it brought the technology to a new group of potential users and it generated several possible follow-up ideas relating to the door lock assembly.
The Future Augmented reality promises to help humans in many of their tasks by displaying the right information at the right time and place. There are many technical challenges to be overcome before such interfaces are widely deployed, but driven by compelling potential applications in surgery, the military, manufacturing, and entertainment, progress continues to be made in this promising form of human-computer interaction. Rajeev Sharma and Kuntal Sengupta See also Augmented Cognition; Virtual Reality
FURTHER READING Aliaga, D. G. (1997). Virtual objects in the real world. Communications of the ACM, 40(3), 49–54. Azuma, R. (1997). A survey of augmented reality. Presence: Teleoperators and Virtual Environments, 6(3), 355–385. Bajura, M., Fuchs, H., & Ohbuchi, R. (1992). Merging virtual objects with the real world: Seeing Ultrasound imagery within the patient. Computer Graphics (Proceedings of SIGGRAPH’92), 26(2), 203–210. Das, H. (Ed.). (1994). Proceedings of the SPIE—The International Society for Optical Engineering. Bellingham, WA: International Society for Optical Engineering. Elvins, T. T. (1998, February). Augmented reality: “The future’s so bright I gotta wear (see-through) shades.” Computer Graphics, 32(1), 11–13. Ikeuchi, K., Sato, T., Nishino, K., & Sato, I. (1999). Appearance modeling for mixed reality: photometric aspects. In Proceedings of the 1999 IEEE International Conference on Systems, Man, and Cybernetics (SMC’99) (pp. 36–41). Piscataway, NJ: IEEE. Milgram, P., & Kishino, F. (1994, December). A taxonomy of mixed reality visual displays. IEICE Transactions on Information Systems, E77-D(12), 1321–1329. Neumann, U., & Majoros, A. (1998). Cognitive, performance, and systems issues for augmented reality applications in manufacturing and maintenance. In Proceedings of the IEEE 1998 Virtual Reality Annual International Symposium (pp. 4–11). Los Alamitos, CA: IEEE. Ohshima, T., Sato, K., Yamamoto, H., & Tamura, H. (1998). AR2 hockey: A case study of collaborative augmented reality. In Proceedings of the IEEE 1998 Virtual Reality Annual International Symposium (pp. 268–275) Los Alamitos, CA: IEEE. Ong, K. C., Teh, H. C., & Tan, T. S. (1998). Resolving occlusion in image sequence made easy. Visual Computer, 14(4), 153–165. Raghavan, V., Molineros, J., & Sharma, R. (1999). Interactive evaluation of assembly sequences using augmented reality. IEEE Transactions on Robotics and Automation, 15(3), 435–449. Rosenblum, L. (2000, January–February). Virtual and augmented reality 2020. IEEE Computer Graphics and Applications, 20(1), 38–39. State, A., Chen, D. T., Tector, C., Brandt, A., Chen, H., Ohbuchi, R., et al. (1994). Observing a volume rendered fetus within a pregnant patient. In Proceedings of IEEE Visualization 94 (pp. 364–368). Los Alamitos, CA: IEEE. Stauder, J. (1999, June). Augmented reality with automatic illumination control incorporating ellipsoidal models. IEEE Transactions on Multimedia, 1(2), 136–143. Tatham, E. W. (1999). Getting the best of both real and virtual worlds. Communications of the ACM, 42(9), 96–98. Tatham, E. W., Banissi, E., Khosrowshahi, F., Sarfraz, M., Tatham, E., & Ursyn, A. (1999). Optical occlusion and shadows in a “seethrough” augmented reality display. In Proceedings of the 1999 IEEE International Conference on Information Visualization (pp. 128–131). Los Alamitos, CA: IEEE. Yamamoto, H. (1999). Case studies of producing mixed reality worlds. In Proceedings of the 1999 IEEE International Conference on Systems, Man, and Cybernetics (SMC’99) (pp. 42–47). Piscataway, NJ: IEEE.
AVATARS
Avatar derives from the Sanskrit word avatarah, meaning "descent," and refers to the incarnation—the descent into this world—of a Hindu god. A Hindu deity embodied its spiritual being when interacting with humans by appearing in either human or animal form. In the late twentieth century, the term avatar was adopted as a label for digital representations of humans in online or virtual environments. Although many credit Neal Stephenson with being the first to use avatar in this new sense in his seminal science fiction novel Snow Crash (1992), the term and concept actually appeared as early as 1984 in online multiuser dungeons, or MUDs (role-playing environments), and the concept, though not the term, appeared in works of fiction dating back to the mid-1970s. This entry explores concepts, research, and
FIGURE 1. A representational schematic of avatars and embodied agents (labeled elements: Live Human Being, Agent, Digital Representation, Avatar, Embodied Agent). When a given digital representation is controlled by a human, it is an avatar, and when it is controlled by a computational algorithm it is an embodied agent. Central to the current definition is the ability for real-time behavior, in that the digital representation exhibits behaviors by the agent or human as they are performed.
ethical issues related to avatars as digital human representations. (We restrict our discussion to digital avatars, excluding physical avatars such as puppets and robots. Currently, the majority of digital avatars are visual or auditory information though there is no reason to restrict the definition as such.)
Agents and Avatars Within the context of human-computer interaction, an avatar is a perceptible digital representation whose behaviors reflect those executed, typically in real time, by a specific human being. An embodied agent, by contrast, is a perceptible digital representation whose behaviors reflect a computational algorithm designed to accomplish a specific goal or set of goals. Hence, humans control avatar behavior, while algorithms control embodied agent behavior. Both agents and avatars exhibit behavior in real time in accordance with the controlling algorithm or human actions. Figure 1 illustrates the fact that the actual digital form the digital representation takes has no bearing on whether it is classified as an agent or avatar: An algorithm or person can drive the same representation. Hence, an avatar can look nonhuman despite being controlled by a human, and an agent can look human despite being controlled by an algorithm. Not surprisingly, the fuzzy distinction between agents and avatars blurs for various reasons. Complete rendering of all aspects of a human’s actions (down to every muscle movement, sound, and scent) is currently technologically unrealistic. Only actions that can be tracked practically can be rendered analogously via an avatar; the remainder are rendered algorithmically (for example, bleeding) or not at all (minute facial expressions, for instance). In some cases avatar behaviors are under nonanalog human control; for example, pressing a button and not the act of smiling may be the way one produces an avatar smile. In such a case, the behaviors are at least slightly nonanalogous; the smile rendered by the button-triggered computer algorithm may be noticeably different from the actual human’s smile. Technically, then, a human representation can be and often is a hybrid of an avatar and an embodied agent, wherein the human controls the consciously generated verbal and nonverbal
gestures and an agent controls more mundane automatic behaviors. One should also distinguish avatars from online identities. Online identities are the distributed digital representations of a person. Humans are known to each other via e-mail, chat rooms, homepages, and other information on the World Wide Web. Consequently, many people have an online identity, constituted by the distributed representation of all relevant information, though they may not have an avatar.
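The avatar/embodied-agent distinction in Figure 1, and the hybrid control described above, can be summarized in a small sketch in which the same digital representation is driven either by tracked human input or by an algorithm. All names and behaviors here are purely illustrative assumptions.

```python
class DigitalRepresentation:
    """The rendered body or voice; indifferent to who or what drives it."""
    def perform(self, behavior: str) -> None:
        print(f"rendering: {behavior}")

class HumanController:            # the representation becomes an avatar
    def next_behavior(self, tracked_input: str) -> str:
        return tracked_input      # analog: render what was tracked

class AlgorithmController:        # the representation becomes an embodied agent
    def next_behavior(self, _tracked_input: str) -> str:
        return "idle blink"       # generated rather than tracked

def drive(representation, controllers, tracked_input: str) -> None:
    """Hybrid control: a human drives deliberate gestures while an
    algorithm fills in mundane automatic behaviors."""
    for controller in controllers:
        representation.perform(controller.next_behavior(tracked_input))

drive(DigitalRepresentation(),
      [HumanController(), AlgorithmController()],
      tracked_input="wave")
```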
Realism Avatars can resemble their human counterparts along a number of dimensions, but the two that have received the most attention in the literature are behavioral realism (reflected in the number of a given human’s behaviors the avatar exhibits) and photographic realism (reflected in how many of a given human’s static visual features the avatar possesses). Behavioral realism is governed by the capability of the implementation system to track and render behavior in real time. Currently, real-time behavioral tracking technology, while improving steadily, does not meet expectations driven by popular culture; for example, online representations of the character Neo in The Matrix (1999), Hiro from Snow Crash (1992), or Case from Neuromancer (1984). In those fictional accounts, the movements and gestures of avatars and the represented humans are generally perceptually indistinguishable. However, in actual practice, complete real-time behavior tracking is extremely difficult. Although gesture tracking through various mechanical, optical, and other devices has improved, the gap between actual movements and avatar movements remains large, reducing behavioral realism at least in situations requiring real-time tracking and rendering, such as online social interaction (for example, collaborative virtual work groups). Fewer barriers exist for photographic realism. Three-dimensional scanners and photogrammetric software allow for the photographically realistic recreation of static, digital human heads and faces that cannot be easily distinguished from photographs and videos of the underlying faces. Nonetheless, the key challenge to avatar designers is creating faces and
bodies in sufficient detail to allow for the realistic rendering of behavior, which brings us back to behavioral realism. In summary, static avatars currently can look quite a bit like their human controllers but can only perform a small subset of a dynamic human’s actions in real time.
Three views of a digital avatar modeled after a human head and face. This avatar is built by creating a three-dimensional mesh and wrapping a photographic texture around it. Photo courtesy of James J. Blascovich.
Current Use of Avatars Depending on how loosely one defines digital representation, the argument can be made that avatars are quite pervasive in society. For example, sound is transformed into digital information as it travels over fiber-optic cables and cellular networks. Consequently, the audio representation we perceive over phone lines is actually an avatar of the speaker. This example may seem trivial at first, but becomes less trivial when preset algorithms are applied to the audio stream to cause subtle changes in the avatar, for example, to clean and amplify the signal. This can only be done effectively because the voice is translated into digital information. More often, however, when people refer to avatars, they are referring to visual representations. Currently, millions of people employ avatars in online role-playing games as well as in chat rooms used for virtual conferencing. In these environments,
users interact with one another using either a keyboard or a joystick, typing messages back and forth and viewing one another’s avatars as they move around the digital world. Typically, these are avatars in the minimal sense of the word; behavioral and photographic realism is usually quite low. In the case of online role-playing games, users typically navigate the online world using “stock” avatars with limited behavioral capabilities.
Avatar Research Computer scientists and others have directed much effort towards developing systems capable of producing functional and effective avatars. They have striven to develop graphics, logic, and the tracking capabilities to render actual movements by humans on digital avatars with accuracy, and to augment those movements by employing control algorithms that supply missing tracking data or information about static visual features. Furthermore, behavioral scientists are examining how humans interact with one another via avatars. These researchers strive to understand social presence, or copresence, a term referring to the degree to which individuals respond socially towards others during interaction among their avatars, compared
with the degree to which they respond to actual humans. The behavioral scientist Jim Blascovich and his colleagues have created a theoretical model for social influence within immersive virtual environments that provides specific predictions for how the interplay of avatars’ photographic and behavioral realism will affect people’s sense of the relevance of the avatar-mediated encounter. They suggest that the inclusion of certain visual features is necessary if the avatar is to perform important, socially relevant behavioral actions. For example, an avatar needs to have recognizable eyebrows in order to lower them in a frown. Other data emphasize the importance of behavioral realism. In 2001 Jeremy Bailenson and his colleagues demonstrated that making a digital representation more photographically realistic does not increase its social presence in comparison with an agent that is more cartoon-like as long as both types of agents demonstrate realistic gaze behaviors. In findings presented in 2003, Maia Garau and her colleagues failed to demonstrate an overall advantage for more photographically realistic avatars; moreover, these researchers demonstrated that increasing the photographic realism of an avatar can actually cause a decrease in social presence if behavioral realism is not also increased. In sum, though research on avatars is still in its infancy, investigators are furthering our understanding of computer-mediated human interaction. As avatars become more commonplace, research geared towards understanding these applications should increase.
Ethical Issues Interacting via avatars allows for deceptive interactions. In 2003 Bailenson and his colleagues introduced the notion of transformed social interactions (TSIs). Using an avatar to interact with another person is qualitatively different from other forms of communication, including face-to-face interaction, standard telephone conversations, and videoconferencing. An avatar that is constantly rerendered in real time makes it possible for interactants to systematically filter their appearance and behaviors (or
to have systems operators do this for them) within virtual environments by amplifying or suppressing communication signals. TSI algorithms can impact interactants’ abilities to influence interaction partners. For example, system operators can tailor the nonverbal behaviors of an online teacher lecturing to multiple students within an immersive virtual classroom in ways specific to each student, independently and simultaneously. Student A might respond well to a teacher who smiles, and Student B might respond well to a teacher with a neutral expression. Via an avatar that is rendered separately for each student, the teacher can be represented simultaneously by different avatars to different students, thereby communicating with each student in the way that is optimal for that student. The psychologist Andrew Beall and his colleagues have used avatars to employ such a strategy using eye contact; they demonstrated that students paid greater attention to the teacher using TSI. However, there are ethical problems associated with TSIs. One can imagine a dismal picture of the future of online interaction, one in which nobody is who they seem to be and avatars are distorted so much from the humans they represent that the basis for judging the honesty of the communication underlying social interactions is lost. Early research has demonstrated that TSIs involving avatars are often difficult to detect. The challenge for researchers is to determine the best way to manage this issue as the use of avatars becomes more prevalent.
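The filtering idea behind transformed social interaction can be sketched in a few lines of code. The example below is hypothetical (the function, preference names, and behaviors are invented, and it does not describe the systems used by Bailenson or Beall); it simply shows how one sender's behavior could be rendered differently for each receiver.

# Hypothetical sketch of per-receiver rendering in a transformed social interaction.
# Each receiver sees a separately rendered avatar of the same sender.

def transform(behavior, preferences):
    """Amplify or suppress a signal according to one receiver's preferences."""
    if behavior == "smile" and not preferences.get("likes_smiling", True):
        return "neutral expression"
    return behavior

teacher_behavior = "smile"
students = {
    "Student A": {"likes_smiling": True},
    "Student B": {"likes_smiling": False},
}

for student, prefs in students.items():
    rendered = transform(teacher_behavior, prefs)
    print(f"{student} sees the teacher: {rendered}")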
State of the Art Currently, there are many examples of humans interacting with one another via avatars. For the most part, these avatars are simplistic and behaviorally and photographically unrealistic. The exception occurs in research laboratories, in which scientists are beginning to develop and test avatars that are similar in appearance and behavior to their human counterparts. As avatars become more ubiquitous, we may see qualitative changes in social interaction due to the decoupling and transformation of behavior from human to avatar. While
there are ethical dangers in transforming behaviors as they pass from physical actions to digital representations, there are also positive opportunities both for users of online systems and for researchers in human-computer interaction. Jeremy N. Bailenson and James J. Blascovich See also Animation; Telepresence; Virtual Reality
FURTHER READING Badler, N., Phillips, C., & Webber, B. (1993). Simulating humans: Computer graphics, animation, and control. Oxford, UK: Oxford University Press. Bailenson, J. N., Beall, A. C., Blascovich, J., & Rex, C. (in press). Examining virtual busts: Are photogrammetrically generated head models effective for person identification? PRESENCE: Teleoperators and Virtual Environments. Bailenson, J. N., Beall, A. C., Loomis, J., Blascovich, J., & Turk, M. (in press). Transformed social interaction: Decoupling representation from behavior and form in collaborative virtual environments. PRESENCE: Teleoperators and Virtual Environments. Bailenson, J. N., Blascovich, J., Beall, A. C., & Loomis, J. M. (2001). Equilibrium revisited: Mutual gaze and personal space in virtual environments. PRESENCE: Teleoperators and Virtual Environments, 10, 583–598. Beall, A. C., Bailenson, J. N., Loomis, J., Blascovich, J., & Rex, C. (2003). Non-zero-sum mutual gaze in immersive virtual environments. In Proceedings of HCI International 2003 (pp. 1108–1112). New York: ACM Press.
Blascovich, J. (2001). Social influences within immersive virtual environments. In R. Schroeder (Ed.), The social life of avatars. Berlin, Germany: Springer-Verlag. Blascovich, J., Loomis, J., Beall, A. C., Swinth, K. R., Hoyt, C. L., & Bailenson, J. N. (2001). Immersive virtual environment technology as a methodological tool for social psychology. Psychological Inquiry, 13, 146–149. Brunner, J. (1975). The shockwave rider. New York: Ballantine Books. Cassell, J., & Vilhjálmsson, H. (1999). Fully embodied conversational avatars: Making communicative behaviors autonomous. Autonomous Agents and Multi-Agent Systems, 2(1), 45–64. Garau, M., Slater, M., Vinayagamoorthy, V., Brogni, A., Steed, A., & Sasse, M. A. (2003). The impact of avatar realism and eye gaze control on perceived quality of communication in a shared immersive virtual environment. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 529–536). New York: ACM Press. Gibson, W. (1984). Neuromancer. New York: Ace Books. Morningstar, C., & Farmer, F. R. (1991). The lessons of Lucasfilm’s Habitat. In M. Benedikt (Ed.), Cyberspace: First steps. Cambridge, MA: MIT Press. Slater, M., Howell, J., Steed, A., Pertaub, D., Garau, M., & Springel, S. (2000). Acting in virtual reality. ACM Collaborative Virtual Environments, CVE’2000, 103–110. Slater, M., Sadagic, A., Usoh, M., & Schroeder, R. (2000). Small group behaviour in a virtual and real environment: A comparative study. PRESENCE: Teleoperators and Virtual Environments, 9, 37–51. Stephenson, N. (1993). Snow crash. New York: Bantam Books. Thalmann, M. N., & Thalmann, D. (Eds.). (1999). Computer animation and simulation 99. Vienna, Austria: Springer-Verlag. Turk, M., & Kolsch, M. (in press). Perceptual interfaces. In G. Medioni & S. B. Kang (Eds.), Emerging topics in computer vision. Upper Saddle River, NJ: Prentice-Hall. Turkle, S. (1995). Life on the screen: Identity in the age of the Internet. New York: Simon & Schuster. Yee, N. (2002). Befriending ogres and wood elves: Understanding relationship formation in MMORPGs. Retrieved January 16, 2004, from http://www.nickyee.com/hub/relationships/home.html
BETA TESTING Beta testing, a stage in the design and development process of computer software and hardware, uses people outside a company, called “beta testers,” to be sure that products function properly for typical end users outside the firm. Does a piece of software work under normal operating conditions? Can users navigate important features? Are there any critical programming flaws? These are the questions beta tests answer. The widespread use of beta tests warrants examination of the process. The trade literature in computer programming focuses on the mechanics of designing, conducting, and interpreting beta tests; less has been written on the social implications of the growing use of beta testing. For example, as
will be discussed below, beta tests make it possible for end users to contribute to the design and development of a product and may represent a shift in the organization of the production process.
Definitions of Beta Testing A beta test is an early (preshipping or prelaunch), unofficial release of hardware or software that has already been tested within the company for major flaws. In theory, beta versions are very close to the final product, but in practice beta testing is often simply one way for a firm to get users to try new software under real conditions. Beta tests expose software and hardware to real-world configurations of computing platforms, operating systems, hardware, and users. For example, a beta test of a website is “the time period just before a site’s official launch when a fully operational
product is used under normal operating conditions to identify any programming bugs or interface issues” (Grossnickle and Raskin 2001, 351). David Hilbert describes beta testing as a popular technique for evaluating the fit between application design and use. The term beta testing emerged from the practice of testing the unit, module, or components of a system first. This test was called alpha, whereas beta referred to the initial test of the complete system. Alpha and beta, derived from earlier nomenclature of hardware testing, were reportedly first used in the 1960s at IBM. Now alpha typically refers to tests conducted within the firm and beta refers to tests conducted externally. There is ample evidence that beta testing has increased in various forms over the last decade. James Daly, a technology business reporter and founder of the magazine Business 2.0, reports that by 1994, 50 percent of Fortune 1000 companies in the United States had participated in beta testing and 20 percent of those companies had used beta testing widely. However, the implementation—and the purposes—of beta testing vary by company. An online market-research handbook suggests that “for most ventures, standard beta-testing technique involves e-mailing friends, family, and colleagues with the URL of a new site” (Grossnickle and Raskin 2001, 351), which clearly would not produce a statistically representative sample of end users. A meta study of beta-test evaluations done more than a decade ago found that most beta testing was actually “driven by convenience or tradition rather than recognition of the costs and benefits involved” (Dolan and Matthews 1993, 318). In addition to determining whether or not a product works, a beta test can be used to increase a firm’s knowledge about the user base for its products, to support its marketing and sales goals, and to improve product support. More importantly, beta testers’ suggestions may be incorporated into the design of the product or used to develop subsequent generations of the product.
User Participation in Product Development Beta testing allows users to become involved in the product-development process. According to sociologists
Gina Neff and David Stark, establishing a cycle of testing, feedback, and innovation that facilitates negotiations about what is made can make it possible to incorporate broader participation into the design of products and organizations. However, in practice, beta tests may be poorly designed to incorporate user feedback. Advice in the trade literature suggests that beta tests may not be constructed to provide more than “bug squashing and usability testing” (Grossnickle and Raskin n.d., 1). Beta tests also present firms with a chance to conduct research on their users and on how their products are used. Ideally, beta testers are statistically representative of typical product users. However, empirical research suggests that beta testers may not accurately reflect end users because testers tend to have more technical training and hold more technical jobs than typical office workers.
Critical Views of Beta Testing The shift from total quality management to a testing-driven model of development means that “the generation and detection of error plays a renewed and desired role” in the production cycle (Cole 2002, 1052). With the rise of the acceptance of beta versions, companies and users alike may be more willing to tolerate flaws in widely circulated products, and end users (including beta testers) may bear an increased burden for the number of errors that companies allow in these products. Some criticism has emerged that companies are “releasing products for beta testing that are clearly not ready for the market” and are exploiting free labor by “using beta testers as unpaid consultants to find the bugs in their products” (Garman 1996, 6). Users may also be frustrated by the continually updated products that beta testing can enable. The distribution of software in non-shrink-wrapped versions means that products are not clean end-versions but destabilized and constantly changing. Technological advances in distribution, such as online distribution of software products, “makes it possible to distribute products that are continually updateable and almost infinitely customizable—products that, in effect, never leave a type of beta phase” (Neff and Stark 2003, 177).
Benefits to Beta Testers Because they are willing to risk bugs that could potentially crash their computers, beta testers accrue benefits such as getting a chance to look at new features and products before other users and contributing to a product by detecting software bugs or minor flaws in programming. More than 2 million people volunteered to be one of the twenty thousand beta testers for a new version of Napster. There is also an increase of beta retail products—early and often cheaper versions of software that are more advanced than a traditional beta version but not yet a fully viable commercial release. Although Apple’s public beta release of OS X, its first completely new operating system since 1984, cost $29.95, thousands downloaded it despite reports that it still had many bugs and little compatible software was available. These beta users saw the long-awaited new operating system six months before its first commercial release, and Apple fans and the press provided invaluable buzz about OS X as they tested it. Many scholars suggest that the Internet has compressed the product-development cycles, especially in software, often to the extent that one generation of product software is hard to distinguish from the next. Netscape, for example, released thirty-nine distinct versions between the beta stage of Navigator 1.0 and the release of Communicator 4.0.
Future Developments Production is an “increasingly dense and differentiated layering of people, activities and things, each operating within a limited sphere of knowing and acting that includes variously crude or sophisticated conceptualizations of the other” (Suchman 2003, 62). Given this complexity, beta testing has been welcomed as a way in which people who create products can interact with those who use them. The Internet facilitates this interaction, making the distribution of products in earlier stages of the product cycle both easier and cheaper; it also facilitates the incorporation of user feedback into the design process. While it is true that “most design-change ideas surfaced by a beta test are passed onto product development for incorporation into the next generation
of the product” (Dolan and Matthews 1993, 20), beta tests present crucial opportunities to incorporate user suggestions into the design of a product. Gina Neff See also Prototyping; User-Centered Design FURTHER READING Cole, R. E. (2002). From continuous improvement to continuous innovation. Total Quality Management, 13(8), 1051–1056. Daly, J. (1994, December). For beta or worse. Forbes ASAP, 36–40. Dolan, R. J., & Matthews, J. M. (1993). Maximizing the utility of consumer product testing: Beta test design and management. Journal of Product Innovation Management, 10, 318–330. Garman, N. (1996). Caught in the middle: Online professionals and beta testing. Online, 20(1), 6. Garud, R., Sanjay, J., & Phelps, C. (n.d.). Unpacking Internet time innovation. Unpublished manuscript, New York University, New York. Grossnickle, J., & Raskin, O. (2001). Handbook of online marketing research. New York: McGraw Hill. Grossnickle, J., & Raskin, O. (n.d.). Supercharged beta test. Webmonkey: Design. Retrieved January 8, 2004, from http://hotwired.lycos.com/webmonkey Hilbert, D. M. (1999). Large-scale collection of application usage data and user feedback to inform interactive software development. Unpublished doctoral dissertation, University of California, Irvine. Hove, D. (Ed.). The free online dictionary of computing. Retrieved March 10, 2004, from http://www.foldoc.org Kogut, B., & Metiu, A. (2001). Open source software development and distributed innovation. Oxford Review of Economic Policy, 17(2), 248–264. Krull, R. (2000). Is more beta better? Proceedings of the IEEE Professional Communication Society, 301–308. Metiu, A., & Kogut, B. (2001). Distributed knowledge and the global organization of software development. Unpublished manuscript, Wharton School of Business, University of Pennsylvania, Philadelphia. Neff, G., & Stark, D. (2003). Permanently beta: Responsive organization in the Internet era. In P. Howard and S. Jones (Eds.), Society Online. Thousand Oaks, CA: Sage. O'Mahony, S. (2002). The emergence of a new commercial actor: Community managed software projects. Unpublished doctoral dissertation, Stanford University, Palo Alto, CA. Retrieved on January 8, 2004, from http://opensource.mit.edu/ Raymond, E. (1999). The cathedral and the bazaar: Musings on Linux and open source from an accidental revolutionary. Sebastopol, CA: O'Reilly and Associates. Ross, R. (2002). Born-again Napster takes baby steps. Toronto Star, E04. Suchman, L. (2002). Located accountabilities in technology production. Retrieved on January 8, 2004, from http://www.comp.lancs.ac.uk/sociology/soc039ls.html. Centre for Science Studies, Lancaster University.
Techweb (n.d.). Beta testing. Retrieved on January 8, 2004, from http://www.techweb.com/encyclopedia Terranova, T. (2000). Free labor: Producing culture for the digital economy. Social Text 18(2), 33–58.
BRAILLE Access to printed information was denied to blind people until the late 1700s, when Valentin Haüy, having founded an institution for blind children in Paris, embossed letters in relief on paper so that his pupils could read them. Thus, more than three hundred years after the invention of the printing press by the German inventor Johannes Gutenberg, blind people were able to read but not to write.
Historical Background In 1819 a French army officer, Charles Barbier, invented a tactile reading system, using twelve-dot codes embossed on paper, intended for nighttime military communications. Louis Braille, who had just entered the school for the blind in Paris, learned of the invention and five years later, at age fifteen, developed a much easier-to-read six-dot code, providing sixty-three dot patterns. Thanks to his invention, blind people could not only read much faster, but also write by using the slate, a simple hand tool made of two metal plates hinged together between which a sheet of paper could be inserted and embossed through cell-size windows cut in the front plate. Six pits were cut in the bottom plate to guide a hand-held embossing stylus inside each window. In spite of its immediate acceptance by his fellow students, Braille’s idea was officially accepted only thirty years later, two years after his death in 1852. Eighty more years passed before English-speaking countries adapted the Braille system in 1932, and more than thirty years passed before development of the Nemeth code, a Braille system of scientific notation, in 1965. Braille notation was also adopted by an increasing number of countries. In spite of its immense benefits for blind people, the Braille system embossed on paper was too bulky
and too expensive to give its users unlimited and quick access to an increasing amount of printed material: books, newspapers, leaflets, and so forth. The invention of the transistor in 1947 by three U.S. physicists and of integrated circuits in the late 1960s provided the solution: electromechanical tactile displays. After many attempts, documented by numerous patents, electronic Braille was developed simultaneously during the early 1970s by Klaus-Peter Schönherr in Germany and Oleg Tretiakoff in France.
First Electronic Braille Devices In electronic Braille, Braille codes—and therefore Braille books—are stored in numerical binary format on standard mass storage media: magnetic tapes, magnetic disks, and so forth. In this format the bulk and cost of Braille books are reduced by several orders of magnitude. To be accessible to blind users, electronically stored Braille codes must be converted into raised-dot patterns by a device called an “electromechanical Braille display.” An electromechanical Braille display is a flat reading surface that has holes arranged in a Braille cell pattern. The hemispherical tip of a cylindrical pin can either be raised above the reading surface to show a Braille dot or lowered under the reading surface to hide the corresponding Braille dot. The Braille dot vertical motion must be controlled by some kind of electromechanical actuator. Two such displays were almost simultaneously put onto the market during the mid-1970s. The Schönherr Braille calculator had eight Braille cells of six dots each, driven by electromagnetic actuators and a typical calculator keyboard. The dot spacing had to be increased to about 3 millimeters instead of the standard 2.5 millimeters to provide enough space for the actuators. The Tretiakoff Braille notebook carried twelve Braille standard cells of six dots each, driven by piezoelectric (relating to electricity or electric polarity due to pressure, especially in a crystalline substance) reeds, a keyboard especially designed for blind users, a cassette tape digital recorder for Braille codes storage, and a communication port to transfer data between the Braille notebook and other electronic devices. Both devices were portable and operated on replaceable or
Enhancing Access to Braille Instructional Materials (ANS)—Most blind and visually impaired children attend regular school classes these days, but they are often left waiting for Braille and large-print versions of class texts to arrive while the other children already have the books. There are 93,000 students in kindergarten through 12th grade who are blind or have limited vision. Because this group represents a small minority of all schoolchildren, little attention has been paid to updating the cumbersome process of translating books into Braille, advocates said. Traditionally, publishers have given electronic copies of their books to transcribers, who often need to completely reformat them for Braille. Lack of a single technological standard and little communication between publishing houses and transcribers led to delays in blind students receiving instructional materials, experts said. The solution, said Mary Ann Siller, a national program associate for the American Foundation for the Blind who heads its Textbook and Instructional Materials Solutions Forum, is to create a single electronic file format and a national repository for textbooks that would simplify and shorten the production process. And that's exactly what is happening. In October, the American Printing House for the Blind in Louisville, Ky., took the first step in creating a repository by listing 140,000 of its own titles on the Internet. The group is now working to get publishers to deposit their text files, which transcribers could readily access. “Everyone is excited about it,” said Christine Anderson, director of resource services for the Kentucky organization. By having a central database with information about the files for all books available in Braille, large print, sound recording or computer files, costly duplications can be eliminated, she said. Pearce McNulty, director of publishing technology at Houghton Mifflin Co. in Boston, which is a partner in the campaign, said he is hopeful the repository will help solve the problem. Publishers and Braille producers historically
have misunderstood each other's business, he said, which led to frustration on both sides. Most blind children are mainstreamed into public school classrooms and receive additional help from a cadre of special teachers of the blind. Technology is also giving blind students more options. Scanning devices now download texts into Braille and read text aloud. Closed circuit television systems can enlarge materials for low-vision students. “These kids have very individual problems,” noted Kris Kiley, the mother of a 15-year-old who has limited vision. “It's not one size fits all. But if you don't teach them to read you've lost part of their potential.” New tools also bring with them new problems. For example, the new multimedia texts, which are available to students on CD-ROM, are completely inaccessible to blind students. And because graphics now dominate many books, lots of information, especially in math, does not reach those with limited vision. Simply recognizing the challenges faced by the blind would go a long way toward solving the problem, said Cara Yates. Yates, who recently graduated from law school, lost her sight at age 5 to eye cancer. She recalls one of her college professors who organized a series of tutors to help her “see” star charts when she took astrophysics. “A lot of it isn't that hard,” she said. “It just takes some thought and prior planning. The biggest problem for the blind is they can't get enough information. There's no excuse for it. It's all available.” Siller said the foundation also hoped to raise awareness about educational assessment; the importance of parental participation; better preparation for teachers; a core curriculum for blind students in addition to the sighted curriculum; and better Braille skills and a reduced caseload for teachers who often travel long distances to assist their students. Mieke H. Bomann Source: Campaign seeks to end blind students' wait for Braille textbooks. American News Service, December 16, 1999.
rechargeable batteries. The Tretiakoff Braille notebook, called “Digicassette,” measured about 20 by 25 by 5 centimeters. A read-only version of the Digicassette was manufactured for the U.S. National Library Services for the Blind of the Library of Congress.
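Storing Braille in numerical binary format is straightforward because a six-dot cell needs only six bits, one per dot. The sketch below is an illustration rather than a description of any particular device's encoding: it packs raised dots into a bitmask and, for display, uses the Unicode Braille Patterns block (U+2800), whose code points share the same dot-to-bit layout. The letter table covers only the first few letters of the standard alphabet.

# Illustrative sketch of storing Braille cells as binary values.
# Dot numbering: 1-3 are the left column (top to bottom), 4-6 the right column.

LETTER_DOTS = {            # partial table, for illustration only
    "a": (1,), "b": (1, 2), "c": (1, 4), "d": (1, 4, 5), "e": (1, 5),
}

def dots_to_byte(dots):
    """Pack a set of raised dots (1-6) into one byte: bit 0 = dot 1, ..., bit 5 = dot 6."""
    mask = 0
    for dot in dots:
        mask |= 1 << (dot - 1)
    return mask

for letter, dots in LETTER_DOTS.items():
    mask = dots_to_byte(dots)
    # The Unicode Braille Patterns block starts at U+2800 and uses this bit layout,
    # so the mask can be rendered directly for inspection.
    print(letter, format(mask, "06b"), chr(0x2800 + mask))

Stored this way, a full Braille page of a thousand cells occupies about a kilobyte, which is the sense in which electronic storage reduces the bulk of Braille books by several orders of magnitude.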
Personal Braille Printers Braille books consist of strong paper pages embossed with a Braille dot pattern by high-speed machines and then bound together much like ordinary books. A typical Braille page can carry up to twenty-five lines of forty Braille characters each and can be explored rapidly from left to right and from top to bottom by a blind reader. Electronic Braille displays generally consist of a single line of eighteen to forty Braille characters, a limit that keeps the displays portable and affordable for individual users. The shift from a full page to a single line delayed the acceptance of Braille displays in spite of their ability to provide easy and high-speed access to electronically stored information. Personal Braille printers, also made possible by the development of integrated circuits, appeared soon after the first personal computers to fill the gap between industrially produced Braille books and single-line Braille displays. Similar in concept to dot-matrix ink printers, personal Braille printers allowed a blind user to emboss on a sheet of strong paper a few lines of Braille characters per minute from Braille codes received from an external source.
Tactile Graphics Although the first personal Braille printers were designed to print only a regularly spaced Braille pattern—at .6 centimeter spacing between characters—some were outfitted with print heads capable of printing regularly spaced dots, in both the horizontal and the vertical directions, allowing the production of embossed tactile graphics. Although the first electronic Braille displays were built with horizontally stacked piezoelectric reeds, whose length—about 5 centimeters—prevented the juxtaposition of more than two Braille lines, the mid-1980s brought the first “vertical” piezoelectric Braille
cells used in Tretiakoff's extremely portable Braille notebook, the P-Touch. In these “vertical” cells each piezoelectric actuator was located underneath the corresponding tactile dot, allowing tactile dots to be arranged in arrays of regularly spaced rows and columns for the electronic display of graphics. These vertical cells were about twice as high as conventional “horizontal” cells and no less expensive. Multiline or graphic displays were thus made technically feasible but remained practically unaffordable at about $12 per dot for the end user as early as 1985.
Active versus Passive Reading Since Louis Braille, blind people have performed tactile reading by moving the tip of one to three fingers across a Braille page or along a Braille line while applying a small vertical pressure on the dot pattern in a direction and at a speed fully controlled by the reader, hence the name “active reading.” Louis Braille used his judgment to choose tactile dot height and spacing; research performed during the last thirty years has shown that his choices were right on the mark. Objective experiments, in which the electrical response of finger mechanoreceptors (neural end organs that respond to a mechanical stimulus, such as a change in pressure) is measured from an afferent (conveying impulses toward the central nervous system) nerve fiber, have shown that “stroking”—the horizontal motion of the finger—plays an essential role in touch resolution, the ability to recognize closely spaced dots. Conversely, if a blind reader keeps the tip of one or more fingers still on an array of tactile dots that is moved in various patterns up or down under the fingertips, this is called “passive reading.” Passive reading has been suggested as a way to reduce the number of dots, and therefore the cost of tactile displays, by simulating the motion of a finger across a wide array of dots by proper control of vertical dot motion under a still finger. The best-known example of this approach is the Optacon (Optical to Tactile Converter), invented during the mid-1970s by John Linvill to give blind people immediate and direct access to printed material. The Optacon generated a vibrating tactile image of a small area of an object viewed by its camera placed and moved against its surface.
Research has shown that touch resolution and reading speed are significantly impaired by passive reading, both for raised ordinary character shapes and for raised-dot patterns.
Current and Future Electronic Tactile Displays At the beginning of the twenty-first century, several companies make electromechanical tactile cells, which convert electrical energy into mechanical energy and vice versa, but the dominant actuator technology is still the piezoelectric (relating to electricity or electric polarity due to pressure, especially in a crystalline substance) bimorph reed, which keeps the price per tactile dot high and the displays bulky and heavy. The majority of electronic tactile displays are single-line, stand-alone displays carrying up to eighty characters or Braille computers carrying from eighteen to forty characters on a single line. Their costs range from $3,000 to more than $10,000. A small number of graphic tactile modules carrying up to sixteen by sixteen tactile dots are also available from manufacturers such as KGS in Japan. Several research-and-development projects, using new actuator technologies and designs, are under way to develop low-cost graphic tactile displays that could replace or complement visual displays in highly portable electronic communication devices and computers. Oleg Tretiakoff See also Sonification; Universal Access FURTHER READING American Council of the Blind. (2001). Braille: History and use of Braille. Retrieved May 10, 2004, from http://www.acb.org/resources/braille.html Blindness Resource Center. (2002). Braille on the Internet. Retrieved May 10, 2004, from http://www.nyise.org/braille.html
BRAIN-COMPUTER INTERFACES A brain-computer interface (BCI), also known as a direct brain interface (DBI) or a brain-machine interface (BMI), is a system that provides a means for people to control computers and other devices directly with brain signals. BCIs fall into the category of biometric devices, which are devices that detect and measure biological properties as their basis of operation. Research on brain-computer interfaces spans many disciplines, including computer science, neuroscience, psychology, and engineering. BCIs were originally conceived in the 1960s, and since the late 1970s have been studied as a means of providing a communication channel for people with very severe physical disabilities. While assistive technology is still the major impetus for BCI research, there is considerable interest in mainstream applications as well, to provide a hands-free control channel that does not rely on muscle movement. Despite characterizations in popular fiction, BCI systems are not able to directly interpret thoughts or perform mind reading. Instead, BCI systems monitor and measure specific aspects of a user’s brain signals, looking for small but detectable differences that signal the intent of the user. Most existing BCI systems depend on a person learning to control an aspect of brain signals that can be detected and measured. Other BCI systems perform control tasks, such as selecting letters from an alphabet, by detecting brain-signal reactions to external stimuli. Although BCIs can provide a communications channel, the information transmission rate is low compared with other methods of control, such as keyboard or mouse. The best reported user performance with current BCI systems is an information transfer rate of sixty-eight bits per minute, which roughly translates to selecting eight characters per minute from an alphabet. BCI studies to date have been conducted largely in controlled laboratory settings, although the field is beginning to target real-world environments for BCI use. A driving motivation behind BCI research has been the desire to help people with severe physical disabilities
such as locked-in syndrome, a condition caused by disease, stroke, or injury in which a person remains cognitively intact but is completely paralyzed and unable to speak. Traditional assistive technologies for computer access depend on small muscle movements, typically using the limbs, eyes, mouth, or tongue to activate switches. People with locked-in syndrome have such severely limited mobility that system input through physical movement is infeasible or unreliable. A BCI system detects tiny electrophysiological changes in brain signals to produce control instructions for a computer, thereby making it unnecessary for a user to have reliable muscle movement. Researchers have created applications for nondisabled users as well, including gaming systems and systems that allow hands-free, heads-up control of devices, such as landing an aircraft. Brain signal interfaces have been used in psychotherapy to monitor relaxation responses and to teach meditation, although these are biofeedback rather than control interfaces.
Brain Signal Characteristics Brain signals are recorded using two general approaches. The most ubiquitous approach is the electroencephalogram (EEG), a recording of signals representing activity over the entire surface of the brain or a large region of the brain, often incorporating the activity of millions of neurons. An EEG can be recorded noninvasively (without surgery) from electrodes placed on the scalp, or invasively (requiring surgery) from electrodes implanted inside the skull or on the surface of the brain. Brain signals can also be recorded from tiny electrodes placed directly inside the brain cortex, allowing researchers to obtain signals from individual neurons or small numbers of colocated neurons. Several categories of brain signals have been explored for BCIs, including rhythms from the sensorimotor cortex, slow cortical potentials, evoked potentials, and action potentials of single neurons. A BCI system achieves control by detecting changes in the voltage of a brain signal, the frequency of a signal, and responses to stimuli. The type of brain signal processed has implications for the nature of the user’s interaction with the system.
Sensorimotor Cortex Rhythms Cortical rhythms represent the synchronized activity of large numbers of brain cells in the cortex that create waves of electrical activity over the brain. These rhythms are characterized by their frequency of occurrence; for example, a rhythm occurring between eight and twelve times a second is denoted as mu, and one occurring between eighteen and twenty-six times a second is referred to as beta. When recorded over the motor cortex, these rhythms are affected by movement or intent to move. Studies have shown that people can learn via operant-conditioning methods to increase and decrease the voltage of these cortical rhythms (in tens of microvolts) to control a computer or other device. BCIs based on processing sensorimotor rhythms have been used to operate a binary spelling program and two-dimensional cursor movement.
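A sensorimotor-rhythm BCI of this kind must estimate, moment to moment, how much power the EEG contains in the mu (8–12 Hz) or beta (18–26 Hz) band. The following is a minimal sketch of such an estimate, assuming a single EEG channel sampled 256 times per second; the synthetic signal and all numbers are illustrative, not taken from any cited system.

import numpy as np

def band_power(signal, sample_rate, low_hz, high_hz):
    """Estimate power in [low_hz, high_hz] from one EEG channel via the FFT."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    return spectrum[band].sum()

sample_rate = 256                       # samples per second (assumed)
t = np.arange(sample_rate)              # one second of data
# Synthetic stand-in for an EEG channel: a 10 Hz component plus noise.
eeg = np.sin(2 * np.pi * 10 * t / sample_rate) + 0.3 * np.random.randn(len(t))

mu = band_power(eeg, sample_rate, 8, 12)      # mu rhythm, 8-12 Hz
beta = band_power(eeg, sample_rate, 18, 26)   # beta rhythm, 18-26 Hz
print(mu > beta)   # a simple controller might compare or threshold band power like this

In a working system the band-power value, computed over a sliding window, is what the user learns to raise or lower through practice and feedback.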
Slow Cortical Potentials Slow cortical potentials (SCPs) are low-frequency shifts of cortical voltage that people can learn to control with practice. SCP shifts can occur in durations of a hundred milliseconds up to several seconds. SCP signals are based over the frontal and central cortex area, and are typically influenced by emotional or mental imagery, as well as imagined movement. SCPs are recorded from electrodes on the scalp; an operant-conditioning approach with positive reinforcement is used to train users to alter their SCPs. Both nondisabled and locked-in subjects have been able to learn to affect their SCP amplitude, shifting it in either an electrically positive or negative direction. Locked-in subjects have used SCPs to communicate, operating a spelling program to write letters and even surfing with a simple web browser.
Evoked Potentials The brain's responses to stimuli can also be detected and used for BCI control. The P300 response occurs when a subject is presented with something familiar, such as a photo of a loved one, or of interest, such as a letter selected from an alphabet. The P300 response can be evoked by almost any stimulus, but most BCI systems employ either visual or auditory
stimuli. Screening for the P300 is accomplished through an “oddball paradigm,” where the subject views a series of images or hears a series of tones, attending to the one that is different from the rest. If there is a spike in the signal power over the parietal region of the brain approximately 300 milliseconds after the “oddball” or different stimulus, then the subject has a good P300 response. One practical application that has been demonstrated with P300 control is a spelling device. The device works by flashing rows and columns of an alphabet grid and averaging the P300 responses to determine which letter the subject is focusing on. P300 responses have also been used to enable a subject to interact with a virtual world by concentrating on flashing virtual objects until the desired one is activated.
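The row-and-column speller described above can be outlined as follows: each flash of a row or column yields an EEG epoch time-locked to the flash, epochs are averaged per row and per column, and the letter at the intersection of the strongest row and column responses is selected. The sketch below uses made-up scores in place of averaged post-stimulus signal power and is an illustration only, not the implementation of the cited devices.

# Simplified sketch of P300 row/column selection in a matrix speller.
# "scores" stand in for averaged signal power ~300 ms after each flash.

GRID = ["ABCDEF", "GHIJKL", "MNOPQR", "STUVWX", "YZ1234", "567890"]

def select_letter(row_scores, col_scores):
    """Pick the letter at the intersection of the best-scoring row and column."""
    row = max(range(len(row_scores)), key=row_scores.__getitem__)
    col = max(range(len(col_scores)), key=col_scores.__getitem__)
    return GRID[row][col]

# Hypothetical averaged responses after many flashes of each row and column.
row_scores = [0.2, 0.3, 1.4, 0.2, 0.3, 0.1]   # row 2 ("MNOPQR") stands out
col_scores = [0.1, 0.2, 0.2, 1.1, 0.3, 0.2]   # column 3
print(select_letter(row_scores, col_scores))   # -> "P"

Averaging over many flashes is what makes the scheme workable, since a single P300 response is small relative to the background EEG.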
Action Potentials of Single Neurons Another approach to BCI control is to record from individual neural cells via an implanted electrode. In one study, a tiny hollow glass electrode was implanted in the motor cortices of three locked-in subjects, enabling neural firings to be captured and recorded. Subjects attempted to control this form of BCI by increasing or decreasing the frequency of neural firings, typically by imagining motions of paralyzed limbs. This BCI was tested for controlling two-dimensional cursor movement in communications programs such as virtual keyboards. Other approaches utilizing electrode arrays or bundles of microwires are being researched in animal studies.
Interaction Styles With BCIs How best to map signals from the brain to the control systems of devices is a relatively new area of study. A BCI transducer is a system component that takes a brain signal as input and outputs a control signal. BCI transducers fall into three general categories: continuous, discrete, and direct spatial positioning. Continuous transducers produce a stream of values within a specified range. These values can be mapped to cursor position on a screen, or they can directly change the size or shape of an object (such as a progress bar). A user activates a continuous transducer
by learning to raise or lower some aspect of his or her brain signals, usually amplitude or frequency. Continuous transducers have enabled users to perform selections by raising or lowering a cursor to hit a target on a screen. A continuous transducer is analogous to a continuous device, such as a mouse or joystick, that always reports its current position. A discrete transducer is analogous to a switch device that sends a signal when activated. Discrete transducers produce a single value upon activation. A user typically activates a discrete transducer by learning to cause an event in the brain that can be detected by a BCI system. Discrete transducers have been used to make decisions, such as whether to turn in navigating a maze. Continuous transducers can emulate discrete transducers by introducing a threshold that the user must cross to “activate” the switch. Direct-spatial-positioning transducers produce a direct selection out of a range of selection choices. These transducers are typically associated with evoked responses, such as P300, that occur naturally and do not have to be learned. Direct transducers have been used to implement spelling, by flashing letters arranged in a grid repeatedly and averaging the brain signal response in order to determine which letter the user was focusing on. A direct spatial positioning transducer is analogous to a touch screen. BCI system architectures have many common functional aspects. Figure 1 shows a simplified model of a general BCI system design as described by Mason and Birch (2003). Brain signals are captured from the user by an acquisition method, such as scalp electrodes or implanted electrodes. The signals are then processed by an acquisition component called a feature extractor that identifies signal changes that could signify intent. A signal translator then maps the extracted signals to device controls, which in turn send signals to a control interface for a device, such as a cursor, a television, or a wheelchair. A display may return feedback information to the user. Feedback is traditionally provided to BCI users through both auditory and visual cues, but some testing methods allow for haptic (touch) feedback and electrical stimulation. Which feedback mechanisms are most effective usually depends on the abilities and disabilities of the user; many severely disabled
users have problems with vision that can be compensated for by adding auditory cues to BCI tasks. Some research teams have embraced usability testing to determine what forms of feedback are most effective; this research is under way.
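The transducer categories described above can be made concrete in code. In the sketch below, which is a hypothetical illustration rather than a description of any cited system, a continuous transducer maps a brain-signal feature onto a bounded control value, and a discrete, switch-like transducer is emulated by thresholding that value, as the text notes.

# Sketch of BCI transducer styles: continuous output, plus a discrete "switch"
# emulated by thresholding the continuous value. Names and numbers are illustrative.

class ContinuousTransducer:
    """Maps a brain-signal feature (e.g., band power) onto a 0.0-1.0 control value."""
    def __init__(self, feature_min, feature_max):
        self.feature_min, self.feature_max = feature_min, feature_max
    def output(self, feature):
        span = self.feature_max - self.feature_min
        value = (feature - self.feature_min) / span
        return min(max(value, 0.0), 1.0)        # could drive cursor position

class DiscreteTransducer:
    """Emulates a switch by thresholding a continuous transducer's output."""
    def __init__(self, continuous, threshold=0.8):
        self.continuous, self.threshold = continuous, threshold
    def activated(self, feature):
        return self.continuous.output(feature) >= self.threshold

cursor = ContinuousTransducer(feature_min=5.0, feature_max=25.0)
switch = DiscreteTransducer(cursor)
print(cursor.output(18.0))     # continuous control, e.g., vertical cursor position
print(switch.activated(24.0))  # discrete control, e.g., "turn left in the maze"

A direct-spatial-positioning transducer, by contrast, would return one of several selection targets directly, in the manner of the P300 speller sketched earlier.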
Applications for BCIs As the BCI field matures, considerable interest has arisen in applying BCI techniques to real-world problems. The principal goal has been to provide a communication channel for people with severe motor disabilities, but other applications may also be possible. Researchers are focusing on applications for BCI technologies in several critical areas:
Communication Making communication possible for a locked-in person is a critical and very difficult task. Much of the work in BCI technology centers around communication, generally in the form of virtual keyboards or iconic selection systems.
Environmental Control The ability to control the physical environment is also an important quality-of-life issue. Devices that permit environmental control make it possible for locked-in people to turn a TV to a desired channel and to turn lights on and off, as well as controlling other physical objects in their world.
Internet Access The Internet has the potential to enhance the lives of locked-in people significantly. Access to the Internet can provide shopping, entertainment, education, and sometimes even employment opportunities to people with severe disabilities. Efforts are under way to develop paradigms for BCI interaction with Web browsers.
FIGURE 1. BCI system architecture
Neural Prosthetics A BCI application with significant implications is neural prostheses, which are orthoses or muscle stimulators controlled by brain signals. In effect, a neural prosthesis could reconnect the brain to paralyzed limbs, essentially creating an artificial nervous system. BCI controls could be used to stimulate muscles in paralyzed arms and legs to enable a subject to learn to move them again. Preliminary work on a neurally controlled virtual hand was reported in 2000 with implanted electrodes; a noninvasive BCI has been demonstrated to control a hand-grasp orthosis for a person whose hand was paralyzed. A BCI based on steady-state visual evoked potentials (SSVEPs) has also been used to control a functional electrical stimulator to activate paralyzed muscles for knee extension.
Mobility Restoring mobility to people with severe disabilities is another area of research. A neurally controlled wheelchair could provide a degree of freedom and greatly improve the quality of life for locked-in people. Researchers are exploring virtual navigation tasks, such as virtual driving and a virtual apartment, as well as maze navigation. A noninvasive BCI was used to direct a remote-control vehicle, with the aim of eventually transferring driving skills to a power wheelchair.
Issues and Challenges for BCI There are many obstacles to overcome before BCIs can be used in real-world scenarios. The minute electrophysiological changes that characterize BCI controls are subject to interference from both electrical and cognitive sources. Brain-signal complexity and variability make detecting and interpreting changes very difficult except under controlled circumstances. Especially with severely disabled users, the effects of
medications, blood sugar levels, and stimulants such as caffeine can all be significant. Cognitive distractions such as ambient environmental noise can affect a person’s ability to control a BCI in addition to increasing the cognitive load the person bears. Artifacts such as eye blinks or other muscle movements can mask control signals. BCIs and other biometric devices are also plagued by what is termed the Midas touch problem: How does the user signal intent to control when the brain is active constantly? Hybrid discrete/continuous transducers may be the answer to this problem, but it is still a major issue for BCIs in the real world. Another important issue currently is that BCI systems require expert assistance to operate. As BCI systems mature, the expectation is that more of the precise tuning and calibration of these systems may be performed automatically. Although BCIs have been studied since the mid-1980s, researchers are just beginning to explore their enormous potential. Understanding brain signals and patterns is a difficult task, but only through such an understanding will BCIs become feasible. Currently there is a lively debate on the best approach to acquiring brain signals. Invasive techniques, such as implanted electrodes, could provide better control through clearer, more distinct signal acquisition. Noninvasive techniques, such as scalp electrodes, could be improved by reducing noise and incorporating sophisticated filters. Although research to date has focused mainly on controlling output from the brain, recent efforts are also focusing on input channels. Much work also remains to be done on appropriate mappings to control signals. As work in the field continues, mainstream applications for BCIs may emerge, perhaps for people in situations of imposed disability, such as jet pilots experiencing high G-forces during maneuvers, or for people in situations that require hands-free, heads-up interfaces. Researchers in the BCI field are just beginning to explore the possibilities of real-world applications for brain signal control. Melody M. Moore, Adriane B. Davis, and Brendan Allison See also Physiology
FURTHER READING Bayliss, J. D., & Ballard, D. H. (2000). Recognizing evoked potentials in a virtual environment. Advances in Neural Information Processing Systems, 12, 3–9. Birbaumer, N., Kubler, A., Ghanayim, N., Hinterberger, T., Perelmouter, J., Kaiser, J., et al. (2000). The thought translation device (TTD) for completely paralyzed patients. IEEE Transactions on Rehabilitation Engineering, 8(2), 190–193. Birch, G. E., & Mason, S. G. (2000). Brain-computer interface research at the Neil Squire Foundation. IEEE Transactions on Rehabilitation Engineering, 8(2), 193–195. Chapin, J., & Nicolelis, M. (2002). Closed-loop brain-machine interfaces. In J. R. Wolpaw & T. Vaughan (Eds.), Proceedings of Brain-Computer Interfaces for Communication and Control: Vol. 2. Moving Beyond Demonstration, Program and Abstracts (p. 38). Rensselaerville, NY. Donchin, E., Spencer, K., & Wijesinghe, R. (2000). The mental prosthesis: Assessing the speed of a P300-based brain-computer interface. IEEE Transactions on Rehabilitation Engineering, 8(2), 174–179. Kandel, E., Schwartz, J., & Jessell, T. (2000). Principles of neural science (4th ed.). New York: McGraw-Hill Health Professions Division. Kennedy, P. R., Bakay, R. A. E., Moore, M. M., Adams, K., & Goldwaithe, J. (2000). Direct control of a computer from the human central nervous system. IEEE Transactions on Rehabilitation Engineering, 8(2), 198–202. Lauer, R. T., Peckham, P. H., Kilgore, K. L., & Heetderks, W. J. (2000). Applications of cortical signals to neuroprosthetic control: A critical review. IEEE Transactions on Rehabilitation Engineering, 8(2), 205–207. Levine, S. P., Huggins, J. E., BeMent, S. L., Kushwaha, R. K., Schuh, L. A., Rohde, M. M., et al. (2000). A direct-brain interface based on event-related potentials. IEEE Transactions on Rehabilitation Engineering, 8(2), 180–185. Mankoff, J., Dey, A., Moore, M., & Batra, U. (2002). Web accessibility for low bandwidth input. In Proceedings of ASSETS 2002 (pp. 89–96). Edinburgh, UK: ACM Press. Mason, S. G., & Birch, G. E. (In press). A general framework for brain-computer interface design. IEEE Transactions on Neural Systems and Rehabilitation Technology. Moore, M., Mankoff, J., Mynatt, E., & Kennedy, P. (2002). Nudge and shove: Frequency thresholding for navigation in direct brain-computer interfaces. In Proceedings of SIGCHI 2001 Conference on Human Factors in Computing Systems (pp. 361–362). New York: ACM Press. Perelmouter, J., & Birbaumer, N. (2000). A binary spelling interface with random errors. IEEE Transactions on Rehabilitation Engineering, 8(2), 227–232. Pfurtscheller, G., Neuper, C., Guger, C., Harkam, W., Ramoser, H., Schlögl, A., et al. (2000). Current trends in Graz brain-computer interface (BCI) research. IEEE Transactions on Rehabilitation Engineering, 8(2), 216–218. Tomori, O., & Moore, M. (2003). The neurally controllable Internet browser. In Proceedings of SIGCHI 03 (pp. 796–798). Wolpaw, J. R., Birbaumer, N., McFarland, D., Pfurtscheller, G., & Vaughan, T. (2002). Brain-computer interfaces for communication and control. Clinical Neurophysiology, 113, 767–791.
Wolpaw, J. R., McFarland, D. J., & Vaughan, T. M. (2000). Brain-computer interface research at the Wadsworth Center. IEEE Transactions on Rehabilitation Engineering, 8(2), 222–226.
BROWSERS For millions of computer users worldwide, a browser is the main interface with the World Wide Web, the world’s foremost Internet information exchange service. Banking, shopping, keeping in contact with friends and family through e-mail, accessing news, looking words up in the dictionary, finding facts and solving puzzles—all of these activities and many more can be carried out on the Web. After the 1993 release of the first graphical user interface Web browser (NCSA Mosaic), the Web rapidly evolved from a small user base of scientists accessing a small set of interlinked text documents to approximately 600 million users accessing billions of webpages that make use of many different media, including text, graphics, video, audio, and animation. Economies of scale clearly apply to the effectiveness of Web browsers: even small improvements in browser design benefit an enormous number of users. Although there has been substantial work on the “webification” of sources of information (for example, educational course materials), there has been surprisingly little research into understanding and characterizing Web users’ tasks, developing better browsers to support those tasks, and evaluating the browsers’ success. But ethnographic and field studies can give us a contextual understanding of Web use, and longitudinal records of users’ actions make possible long-term quantitative analyses, which in turn are leading to low-level work on evaluating and improving browsers.
What Do Web Users Do?
The best way to understand fully what users do with their browsers, why they do it, and the problems they encounter is to observe and question users directly as they go about their everyday work. Unfortunately this approach puts inordinate demands on researchers' time, so it is normally used only with small sets of participants. The study that best demonstrates
this ethnographic style of contextually immersed investigation is that of Michael Byrne and his colleagues (1999), who used their observations to create a taxonomy of Web-browsing tasks. Their method involved videotaping eight people whenever they used a browser in their work. The participants were encouraged to continually articulate their objectives and tasks, essentially thinking aloud. A total of five hours of Web use was captured on video and transcribed, and a six-part taxonomy of stereotypical tasks emerged:
1. Use information: activities relating to the use of information gathered on the Web;
2. Locate on page: searching for particular information on a page;
3. Go to: the act of trying to get the browser to display a particular URL (Web address);
4. Provide information: sending information to a website through the browser (for example, providing a billing address or supplying search terms to a search engine);
5. Configure browser: changing the configuration of the browser itself; and
6. React to environment: supplying information required for the browser to continue its operation (for example, responding to a dialog box that asks where a downloaded file should be saved).
Although these results were derived from only a few hours of Web use by a few people, they provide initial insights into the tasks and actions accomplished using a browser. Another approach to studying how people use the Web is to automatically collect logs of users' actions. The logs can then be analyzed to provide a wide variety of quantitative characterizations of Web use. Although this approach cannot provide insights into the context of the users' actions, it has the advantage of being implementable on a large scale. Months or years of logged data from dozens of users can be included in an analysis. Two approaches have been used to log Web-use data. Server-side logs collect data showing which pages were served to which IP address, allowing Web designers to see, for instance, which parts of their sites are particularly popular or unpopular. Unfortunately,
server-side logs only poorly characterize Web usability issues. The second approach uses client-side logs, which are established by equipping the Web browser (or a client-side browser proxy) so that it records the exact history of the user’s actions with the browser. The first two client-side log analyses of Web use were both conducted in 1995 using the then-popular XMosaic browser. The participants in both studies were primarily staff, faculty, and students in university computing departments. Lara Catledge and James Pitkow logged 3 weeks of use by 107 users in 1995, while Linda Tauscher and Saul Greenberg analyzed 5 to 6 weeks of use by 23 users in 1995. The studies made several important contributions to our understanding of what users do with the Web. In particular, they revealed that link selection (clicking on links in the Web browser) accounts for approximately 52 percent of all webpage displays, that webpage revisitation (returning to previously visited webpages) is a dominant navigation behavior, that the Back button is very heavily used, and that other navigation actions, such as typing URLs, clicking on the Forward button, or selecting bookmarked pages, were only lightly used. Tauscher and Greenberg also analyzed the recurrence rate of page visits—“the probability that any URL visited is a repeat of a previous visit, expressed as a percentage” (Tauscher and Greenberg 1997, 112). They found a recurrence rate of approximately 60 percent, meaning that on average users had previously seen approximately three out of five pages visited. In a 2001 study, Andy Cockburn and Bruce McKenzie showed that the average recurrence rate had increased to approximately 80 percent—four out of five pages a user sees are ones he or she has seen previously. Given these high recurrence rates, it is clearly important for browsers to provide effective tools for revisitation. The 1995 log analyses suggested that people rarely used bookmarks, with less than 2 percent of user actions involving bookmark use. However, a survey conducted the following year (Pitkow, 1996) indicates that users at least had the intention of using bookmarks, with 84 percent of respondents having more than eleven bookmarks. Pitkow reported from a survey of 6,619 users that organizing retrieved information is one of the top three problems people report relating to using the Web (reported by 34 percent
of participants). Cockburn and McKenzie’s log analysis suggested that bookmark use had evolved, with users either maintaining large bookmark collections or almost none: The total number of bookmarks in participants’ collections ranged from 0 to 587, with a mean of 184 and a high standard deviation of 166. A final empirical characterization of Web use from Cockburn and McKenzie’s log analysis is that Web browsing is surprisingly rapid, with many or most webpages being visited for only a very brief period (less than a couple of seconds). There are two main types of browsing behavior that can explain the very short page visits. First, many webpages are simply used as routes to other pages, with users following known trails through the series of links that are displayed at known locations on the pages. Second, users can almost simultaneously display a series of candidate “interesting” pages in independent top-level windows by shift-clicking on the link or by using the link’s context menu. For example, the user may rapidly pop up several new windows for each of the top result links shown as a result of a Google search.
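The recurrence rate that Tauscher and Greenberg defined can be computed directly from a client-side log. The sketch below is a minimal illustration, assuming the log has already been reduced to an ordered list of visited URLs; the function name and the sample data are hypothetical and are not drawn from any of the studies described above.

```python
def recurrence_rate(visits):
    """Percentage of page visits that are revisits to a URL seen earlier
    in the same log (Tauscher and Greenberg's recurrence rate)."""
    seen = set()
    revisits = 0
    for url in visits:
        if url in seen:
            revisits += 1
        else:
            seen.add(url)
    return 100.0 * revisits / len(visits) if visits else 0.0

# Hypothetical client-side log, already reduced to an ordered URL list.
log = [
    "http://example.edu/",        # first visit
    "http://example.edu/news",    # first visit
    "http://example.edu/",        # revisit (e.g., via the Back button)
    "http://example.edu/search",  # first visit
    "http://example.edu/",        # revisit
]
print(f"recurrence rate: {recurrence_rate(log):.0f}%")  # prints 40%
```

Under this definition, the roughly 60 percent rate observed in 1995 and the roughly 80 percent rate observed in 2001 correspond to about three and four revisits, respectively, in every five page displays.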
Improving the Web Browser User Interface
The studies reported above inform designers about what users do with the current versions of their browsers. Naturally, there is a chicken-and-egg problem in that stereotypical browser use is strongly affected by the support provided by browsers. Browser interfaces can be improved both by designing to better support the stereotypes and by innovative design that enables previously difficult or impossible tasks. The problems of hypertext navigation were well known long before the Web. As users navigate through the richly interconnected information nodes of the Web (or any hypertextual information space) their short-term memory becomes overloaded with the branches made, and they become "lost in hyperspace." In the late 1980s many researchers were experimenting with graphical depictions of hypertext spaces in order to help users orient themselves: For example, Apple's popular hypermedia system HyperCard provided a thumbnail graphical representation of the
recent cards displayed, and gIBIS provided a network diagram of design argumentation. Soon after the Web emerged in 1991, similar graphical techniques were being constructed to aid Web navigation. Example systems included MosaicG, which provided thumbnail images of the visited pages arranged in a tree hierarchy, WebNet, which drew a hub-and-spoke representation of the pages users visited and the links available from them, and the Navigational View Builder, which could generate a wide variety of two-dimensional and three-dimensional representations of the Web. Despite the abundance of tools that provide graphical representations of the user’s history, none have been widely adopted. Similarly, log analyses of Web use show that users seldom use the history tools provided by all of the main Web browsers. Given that Web revisitation is such a common activity, why are these history tools so lightly used? The best explanation seems to be that these tools are not needed most of the time, so they are unlikely to be on permanent display, where they would compete with other applications for screen real estate. Once iconified, the tools are not ready to hand, and it is overhead for users to think of using them, take the actions to display them, orient themselves within the information they display, and make appropriate selections. While the projects above focus on extending browser functionality, several other research projects have investigated rationalizing and improving browsers’ current capabilities. The interface mechanisms for returning to previously visited pages have been a particular focus. Current browsers support a wide range of disparate facilities for revisitation, including the Back and Forward buttons and menus, menus that allow users to type or paste the URLs of websites the user wants to visit, the history list, bookmarks or lists of favorites, and the links toolbar. Of these utilities, log analyses suggest that only the Back button is heavily used. The WebView system and Glabster both demonstrate how history facilities and bookmarks can be enhanced and integrated within the Back menu, providing a powerful and unified interface for all revisitation tasks. Both WebView and Glabster automatically capture thumbnail images of webpages, making it easier for the user to identify previously visited pages from the set displayed within the back menu.
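All of these revisitation facilities sit on top of the browser's session history. As a rough illustration (a hypothetical Python model, not the source code of any particular browser), the sketch below captures the stack discipline behind the Back and Forward buttons: visiting a new page after going back discards the pages that were previously reachable with Forward. As the next paragraph explains, it is exactly this discipline that users tend to misunderstand.

```python
class SessionHistory:
    """Hypothetical stack-based model of Back and Forward behavior."""

    def __init__(self, start_url):
        self.pages = [start_url]   # visited pages kept on a stack
        self.index = 0             # position of the current page

    def visit(self, url):
        # Following a link or typing a URL truncates everything "in front
        # of" the current page, so those pages can no longer be reached
        # with Forward (or recovered with Back).
        self.pages = self.pages[: self.index + 1]
        self.pages.append(url)
        self.index += 1

    def back(self):
        if self.index > 0:
            self.index -= 1
        return self.pages[self.index]

    def forward(self):
        if self.index < len(self.pages) - 1:
            self.index += 1
        return self.pages[self.index]


h = SessionHistory("A")
h.visit("B"); h.visit("C")          # history: A B C
h.back()                            # now at B
h.visit("D")                        # C is discarded; history: A B D
assert h.pages == ["A", "B", "D"]   # C is no longer reachable via Back or Forward
```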
Another problem users have with current Web browsers is that they misunderstand the behavior of the Back button. An experiment showed that eight of eleven computer scientists incorrectly predicted the behavior of Back in simple Web navigation tasks. The problem stems from users believing that Back provides access to a complete history of previously visited pages, rather than the stack-based subset that can actually be accessed. Cockburn and his colleagues describe and evaluate a true history-based Back system, but results indicate that the pros and cons of the technique are closely balanced, such that the advantages do not outweigh the difficulties inherent in making a switch from current behavior. The World Wide Web revolution has been a great success in bringing computer technology to the masses. The widespread adoption and deployment of the Web and the browsers used to access it happened largely without input from researchers in human-computer interaction. Those researchers are now improving their understanding of the usability issues associated with Web browsers and browsing. As the technology and understanding mature, we can expect browser interfaces to improve, enhancing the efficiency of Web navigation and reducing the sensation of becoming lost in the Web.
Andy Cockburn
See also Mosaic; Website Design
FURTHER READING
Abrams, D., Baecker, R., & Chignell, M. (1998). Information archiving with bookmarks: Personal Web space construction and organization. In Proceedings of CHI'98 Conference on Human Factors in Computing Systems (pp. 41–48). New York: ACM Press.
Ayers, E., & Stasko, J. (1995). Using graphic history in browsing the World Wide Web. In Proceedings of the Fourth International World Wide Web Conference (pp. 451–459). Retrieved January 19, 2004, from http://www.w3j.com/1/ayers.270/paper/270.html
Bainbridge, L. (1991). Verbal protocol analysis. In J. Wilson & E. Corlett (Eds.), Evaluation of human work: A practical ergonomics methodology (pp. 161–179). London: Taylor and Francis.
Byrne, M., John, B., Wehrle, N., & Crow, D. (1999). The tangled Web we wove: A taskonomy of WWW use. In Proceedings of CHI'99 Conference on Human Factors in Computing Systems (pp. 544–551). New York: ACM Press.
Catledge, L., & Pitkow, J. (1995). Characterizing browsing strategies in the World Wide Web. Computer Networks and ISDN Systems: Proceedings of the Third International World Wide Web Conference, 27, 1065–1073.
Chi, E., Pirolli, P., & Pitkow, J. (2000). The scent of a site: A system for analyzing and predicting information scent, usage, and usability of a Web site. In Proceedings of CHI'2000 Conference on Human Factors in Computing Systems (pp. 161–168). New York: ACM Press.
Cockburn, A., Greenberg, S., McKenzie, B., Jason Smith, M., & Kaasten, S. (1999). WebView: A graphical aid for revisiting Web pages. In Proceedings of the 1999 Computer Human Interaction Specialist Interest Group of the Ergonomics Society of Australia (OzCHI'99) (pp. 15–22). Retrieved January 19, 2004, from http://www.cpsc.ucalgary.ca/Research/grouplab/papers/1999/99-WebView.Ozchi/Html/webview.html
Cockburn, A., & Jones, S. (1996). Which way now? Analysing and easing inadequacies in WWW navigation. International Journal of Human-Computer Studies, 45(1), 105–129.
Cockburn, A., & McKenzie, B. (2001). What do Web users do? An empirical analysis of Web use. International Journal of Human-Computer Studies, 54(6), 903–922.
Cockburn, A., McKenzie, B., & Jason Smith, M. (2002). Pushing Back: Evaluating a new behaviour for the Back and Forward buttons in Web browsers. International Journal of Human-Computer Studies, 57(5), 397–414.
Conklin, J. (1988). Hypertext: An introduction and survey. In I. Greif (Ed.), Computer supported cooperative work: A book of readings (pp. 423–475). San Mateo, CA: Morgan Kaufmann.
Conklin, J., & Begeman, M. (1988). gIBIS: A hypertext tool for exploratory discussion. ACM Transactions on Office Information Systems, 6(4), 303–313.
Coulouris, G., & Thimbleby, H. (1992). HyperProgramming. Wokingham, UK: Addison-Wesley Longman.
Fischer, G. (1998). Making learning a part of life: Beyond the 'gift-wrapping' approach of technology. In P. Alheit & E. Kammler (Eds.), Lifelong learning and its impact on social and regional development (pp. 435–462). Bremen, Germany: Donat Verlag.
Kaasten, S., & Greenberg, S. (2001). Integrating Back, History and bookmarks in Web browsers. In Proceedings of CHI'01 (pp. 379–380). New York: ACM Press.
Mukherjea, S., & Foley, J. (1995). Visualizing the World Wide Web with the navigational view builder. Computer Networks and ISDN Systems, 27(6), 1075–1087.
Nielsen, J. (1990). The art of navigating through HyperText: Lost in hyperspace. Communications of the ACM, 33(3), 296–310.
Pirolli, P., Pitkow, J., & Rao, R. (1996). Silk from a sow's ear: Extracting usable structures from the Web. In R. Bilger, S. Guest, & M. J. Tauber (Eds.), Proceedings of CHI'96 Conference on Human Factors in Computing Systems (pp. 118–125). New York: ACM Press.
Pitkow, J. (n.d.). GVU's WWW User Surveys. Retrieved January 19, 2004, from http://www.gvu.gatech.edu/user_surveys/
Tauscher, L., & Greenberg, S. (1997). How people revisit Web pages: Empirical findings and implications for the design of history systems. International Journal of Human-Computer Studies, 47(1), 97–138.
C
CATHODE RAY TUBES · CAVE · CHATROOMS · CHILDREN AND THE WEB · CLASSROOMS · CLIENT-SERVER ARCHITECTURE · COGNITIVE WALKTHROUGH · COLLABORATORIES · COMPILERS · COMPUTER-SUPPORTED COOPERATIVE WORK · CONSTRAINT SATISFACTION · CONVERGING TECHNOLOGIES · CYBERCOMMUNITIES · CYBERSEX · CYBORGS
CATHODE RAY TUBES
The cathode ray tube (CRT) has been the dominant display technology for decades. Products that utilize CRTs include television and computer screens in the consumer and entertainment market, and electronic displays for medical and military applications. CRTs are of considerable antiquity, originating in the late nineteenth century when William Crookes (1832–1919) studied the effects of generating an electrical discharge in tubes filled with various gases. (The tubes were known as discharge tubes.) It was over thirty years later in 1929 that the CRT was utilized to construct actual imagery for television applications by Vladimir Zworykin (1889–1982) of Westinghouse Electric Corporation. The further development and optimization of the CRT for television and radar over the next fifty years provided the impetus for continual improvements. With the emergence of desktop computing in the 1980s, the CRT market expanded, and its performance continued to evolve. As portability has come to be more and more important in the consumer electronics industry, the CRT has been losing ground. The development of flat panel technologies such as liquid crystal displays and plasma displays for portable products, computer screens, and television makes the CRT very vulnerable. Because of the CRT's maturity and comparatively low cost, however, its application will be assured for many years to come.
How Cathode Ray Tubes Work
A CRT produces images when an electron beam is scanned over a display screen in a pattern that is
determined by a deflection mechanism. The display screen is coated with a thin layer of phosphor that luminesces under the bombardment of electrons. By this means the display screen provides a two-dimensional visual display, corresponding to information contained in the electron beam. There are four major components of a CRT display: the vacuum tube, the electron source (known as the electron "gun"), the deflection mechanism, and the phosphor screen. The tube (sometimes referred to as a bulb) is maintained at a very high vacuum level to facilitate the flow of electrons in the electron beam. The front surface of the tube defines the visual area of the display, and it is this front surface that is covered with phosphor, which is in turn covered by the anode (the electron-collecting electrode). The tube has three main sections: the front surface, the funnel, and the neck. The entire tube is typically made of glass so that very high vacuums can be sustained, but in some cases the funnel and neck can be fabricated from metal or ceramic. For demanding applications that require additional robustness, an implosion-proof faceplate may be secured to the front tube surface for durability. This typically comes at the expense of optical throughput, but antireflection coatings are often used to improve contrast and to compensate for the transmission losses. The electron source, a hot cathode at the far end from the front surface, generates a high-density electron beam whose current can be modulated. The electron beam can be focused and deflected by electrostatic or magnetic methods, and this deflection steers the electron beam to designated positions of the front surface to create visual imagery. The phosphor screen on the inside front surface of the tube converts the electron beam into visible light output. On top of the phosphor particles is the thin layer of conducting material (usually aluminum) that serves as the anode, drawing the electrons toward the screen. The directions on how to manipulate the electron stream are contained in an electronic signal called a composite video signal. This signal contains information on how intense the electron beam
must be and on when the beam should be moved across different portions of the screen.
Displaying Color
One of the most important tasks of the modern display is rendering full-color images. Shadow-masking configurations are by far the most successful way to create full color images in CRT displays. The shadow mask CRT typically uses three electron beams deflected by one coil (the simplest configuration). The electron beams traverse a perforated metal mask (shadow mask) before impinging on selected phosphor materials (there are three sorts of phosphor that can emit red, green, and blue light). The shadow mask apertures are typically configured as stripes, circles, or slots. The arrangement of the electron optics and the deflection system is such that three electron beams converge onto the screen after passing through the shadow mask, each beam impinging on one phosphor, which, when bombarded with electrons, emits red, green, or blue visible light. The red, green, and blue phosphors are spatially arranged on the viewing screen. The Trinitron design, invented by Sony Corporation, uses vertical stripe arrays rather than circular or slotted apertures. These arrays alternate red, green, and blue when viewed from the faceplate side of the tube. There is a single electron source, rather than three, which eliminates the problem of beam convergence. The Trinitron also has superior resolution in the vertical direction since its apertures are not limited in that direction. The only negative attribute of the Trinitron is that the mask is not self-supporting, which ultimately limits the size of the vacuum tube. The advantages of CRT displays include their maturity, their well-understood manufacturing process, their ability to provide full-color and high-resolution imaging, and the comparatively low cost for high information content. CRTs are vulnerable to competition from liquid crystal displays and plasma displays (both of which make possible flat-panel displays), however, because CRTs are bulky, heavy, and big power consumers. In addition to the utility of flat-panel display for portable applications for which CRTs could never be considered, flat-
panel displays have made significant inroads into desktop monitors and large-area televisions. As the price of flat-panel displays continues to plummet, they are certain to capture even more of the CRT market in the future.
Gregory Philip Crawford
See also Liquid Crystal Displays
FURTHER READING
Castellano, J. (1992). Handbook of display technology. San Diego, CA: Academic Press.
Keller, P. A. (1997). Electronic display measurement. New York: Wiley SID.
MacDonald, L. W., & Lowe, A. C. (1997). Display systems: Design and applications. New York: Wiley SID.
CAVE
The CAVE is a virtual reality (VR) room, typically 3 by 3 by 3 meters in size, whose walls, floor, and sometimes ceiling are made entirely of computer-projected screens. Viewers wear a six-degree-of-freedom location sensor called a tracker so that when they move within the CAVE, correct viewer-centered perspectives and surround-stereo projections are produced fast enough to give a strong sense of 3D visual immersion. Viewers can examine details of a complex 3D object simply by walking up to and into it. The CAVE was invented in 1991 for two reasons: to help scientists and engineers achieve scientific insight without compromising the color and distortion-free resolution available then on workstations and to create a medium worthy of use by fine artists. CAVE viewers see not only projected computer-generated stereo scenes but also their own arms and bodies, and they can interact easily with other people. The CAVE uses active stereo, which produces different perspective views for the left and right eyes of the viewer in synchrony with special electronic shutter glasses that go clear in front of the left eye when the
left eye image should be seen by the left eye and are opaque otherwise. Similarly, the right eye gets the right image. Images need to be generated at 100 to 120 hertz so each eye can get a flicker-free 50- to 60-hertz display. All screens need to be synchronized so that each eye sees the same phase stereo image on every screen, a requirement that until 2003 meant that only the most expensive SGI (Silicon Graphics, Inc.) computer graphics systems could be used. Synchronizing PC graphics cards now reduce the cost of CAVE computing and image generation by 90 percent. The CAVE’s projection onto the screens does not need to keep up with the viewer’s head motion nearly as much as is required in a head-mounted VR display (HMD), which needs to have small screens attached in front of the eyes. Of course, any movement of the viewer’s body within the space requires updating the scene perspective, but in normal investigative use, the CAVE needs to keep up only with body motion, not head rotation; the important result is that the delay of trackers is dramatically less of a problem with CAVEs than with HMDs. In addition, although only one viewer is tracked, other people can share the CAVE visuals at the same time; their view is also in stereo and does not swing with the tracked user’s head rotation, although their perspective is still somewhat skewed. Often the person
in the role of guide or instructor handles the controls (a 3D mouse called "Wanda") and the student wears the tracker to get the best view, a mode of usage that is quite functional for both learning and demonstrations. The CAVE uses a rear-screen projection for the walls so the viewer does not block the light and cast shadows. The floor is typically projected down from the top, which creates a small shadow around the viewer's feet. A CAVE with three walls and a floor minimally requires a 13- by 10-meter space with a ceiling 4.5 meters high. Six-sided CAVEs have rear projections from every direction, which require much higher ceilings, more elaborate support structures, and floor screens that can withstand the weight of several people. Someday, 3-square-meter flat-panel displays suspended as a ceiling, positioned vertically as walls, and tough enough to walk on would allow CAVEs in normal rooms. However, current-technology panel displays refresh too slowly to use shutter glasses, so they must be otherwise modified for stereo display. The Varrier method involves placing a barrier screen so that the computed views to each eye are seen through perfectly placed thin black bars; that is, the correctly segmented image is placed in dynamic perspective behind the barrier in real time. Varrier viewers wear no special glasses since the image separation is performed spatially by the barrier screen.
The CAVE is a multi-person, room-sized, high-resolution, 3D video and audio environment. Photo courtesy of National Center for Supercomputing Applications.
CAVE Variants
The Personal Augmented Reality Immersive System (PARIS) has a half-silvered mirror at an angle in front of the user. The screen, above the desk facing down, superimposes a stereo image on the user's hands working beyond the mirror. Photo courtesy of the Electronic Visualization Laboratory.
Variants of the CAVE include the ImmersaDesk, a drafting-table-size rear-projected display with a screen set at an angle so that the viewer can look down as well as forward into the screen; looking down gives a strong sense of being in the scene. PARIS uses a similarly angled half-silvered screen that is projected from the top; the viewer's hands work under the screen and are superimposed on the 3D graphics (rather than blocking them, as with normal projections). The CAVE originally used three-tube stereo projectors with special phosphors to allow a 100- to 120-hertz display without ghosting from slow green phosphor decay. Tube projectors are now rather dim by modern standards, so the CAVE was rebuilt to use bright digital mirror-based projectors, like those used in digital cinema theaters. Projectors require significant alignment and maintenance; wall-sized flat-panel screens will be welcomed since they need no alignment and have low maintenance and no projection distance. The GeoWall, a passive stereo device, works differently, polarizing the output of two projectors onto a single screen. Viewers wear the throw-away polarized glasses used in 3D movies to see stereo. In addition to visual immersion, the CAVE has synchronized synthetic and sampled surround sound. The PARIS system features a PHANTOM tactile device, which is excellent for manipulating objects the size of a bread box or smaller.
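What the CAVE and all of these variants share is viewer-centered projection: the image drawn on each fixed screen is recomputed every frame from the tracked eye position so that the perspective stays correct as the viewer moves. The following sketch illustrates the idea for the simplest case, a single screen of width W and height H centered at the origin of the tracking coordinate system with the viewer in front of it; it is a simplified illustration of the general technique, not the actual CAVE library code, and the function and parameter names are invented for this example.

```python
def off_axis_frustum(eye, screen_w, screen_h, near):
    """Asymmetric (off-axis) view frustum for one fixed screen.

    Assumes the screen is centered at the origin in the x-y plane and the
    tracked eye sits at eye = (ex, ey, ez) with ez > 0 in front of it.
    Returns the left/right/bottom/top extents of the near plane, i.e. the
    values a glFrustum-style projection call would take.
    """
    ex, ey, ez = eye
    scale = near / ez  # project the screen edges onto the near plane
    left   = (-screen_w / 2.0 - ex) * scale
    right  = ( screen_w / 2.0 - ex) * scale
    bottom = (-screen_h / 2.0 - ey) * scale
    top    = ( screen_h / 2.0 - ey) * scale
    return left, right, bottom, top

# A centered viewer gets a symmetric frustum...
print(off_axis_frustum((0.0, 0.0, 1.5), 3.0, 3.0, 0.1))
# ...while a viewer who steps to the right gets an asymmetric one, which
# is what keeps the imagery geometrically correct from the new position.
print(off_axis_frustum((1.0, 0.2, 1.5), 3.0, 3.0, 0.1))
```

For stereo, the same computation is performed twice per frame, once for each eye offset by half the interocular distance, which is why the display must run at roughly twice the per-eye refresh rate described earlier in this article.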
CAVEs for Tele-Immersion
The CAVE was originally envisioned as a tele-immersive device to enable distance collaboration between viewers immersed in their computer-generated scenes, a kind of 3D phone booth. Much work has gone into building and optimizing ultrahigh-speed computer networks suitable for sharing gigabits of information across a city, region, nation, or indeed, the world. In fact, scientists, engineers, and artists in universities, museums, and commercial manufacturing routinely use CAVEs and variants in this manner.
Tom DeFanti and Dan Sandin
See also Virtual Reality; Telepresence; Three-Dimensional Graphics
FURTHER READING
Cruz-Neira, C., Sandin, D., & DeFanti, T. A. (1993). Virtual reality: The design and implementation of the CAVE. Proceedings of the SIGGRAPH 93 Computer Graphics Conference, USA, 135–142.
Czernuszenko, M., Pape, D., Sandin, D., DeFanti, T., Dawe, G. L., & Brown, M. D. (1997). The ImmersaDesk and Infinity Wall projection-based virtual reality displays [Electronic version]. Computer Graphics, 31(2), 46–49.
DeFanti, T. A., Brown, M. D., & Stevens, R. (Eds.). (1996). Virtual reality over high-speed networks. IEEE Computer Graphics & Applications, 16(4), 14–17, 42–84.
DeFanti, T., Sandin, D., Brown, M., Pape, D., Anstey, J., Bogucki, M., et al. (2001). Technologies for virtual reality/tele-immersion applications: Issues of research in image display and global networking. In R. Earnshaw, et al. (Eds.), Frontiers of Human-Centered Computing, Online Communities and Virtual Environments (pp. 137–159). London: Springer-Verlag.
Johnson, A., Leigh, J., & Costigan, J. (1998). Multiway tele-immersion at Supercomputing '97. IEEE Computer Graphics and Applications, 18(4), 6–9.
Johnson, A., Sandin, D., Dawe, G., Qiu, Z., Thongrong, S., & Plepys, D. (2000). Developing the PARIS: Using the CAVE to prototype a new VR display [Electronic version]. Proceedings of IPT 2000, CD-ROM.
Korab, H., & Brown, M. D. (Eds.). (1995). Virtual Environments and Distributed Computing at SC'95: GII Testbed and HPC Challenge Applications on the I-WAY. Retrieved November 5, 2003, from http://www.ncsa.uiuc.edu/General/Training/SC95/GII.HPCC.html
Lehner, V. D., & DeFanti, T. A. (1997). Distributed virtual reality: Supporting remote collaboration in vehicle design. IEEE Computer Graphics & Applications (pp. 13–17).
Leigh, J., DeFanti, T. A., Johnson, A. E., Brown, M. D., & Sandin, D. J. (1997). Global tele-immersion: Better than being there. ICAT '97, 7th Annual International Conference on Artificial Reality and Tele-Existence, pp. 10–17. University of Tokyo, Virtual Reality Society of Japan.
Leigh, J., Johnson, A., Brown, M., Sandin, D., & DeFanti, T. (1999). Tele-immersion: Collaborative visualization in immersive environments. IEEE Computer Graphics & Applications (pp. 66–73).
Sandin, D. J., Margolis, T., Dawe, G., Leigh, J., & DeFanti, T. A. (2001). The Varrier™ auto-stereographic display. Proceedings of Photonics West 2001: Electronics Imaging, SPIE. Retrieved November 5, 2003, from http://spie.org/web/meetings/programs/pw01/home.html
Stevens, R., & DeFanti, T. A. (1999). Tele-immersion and collaborative virtual environments. In I. Foster & C. Kesselman (Eds.), The grid: Blueprint for a new computing infrastructure (pp. 131–158). San Francisco: Morgan Kaufmann.
CHATROOMS
Defined most broadly, chatrooms are virtual spaces where conversations occur between two or more users in a synchronous or nearly synchronous fashion. Many different types of chat spaces exist on the Internet. One type is Internet Relay Chat (IRC), a multiuser synchronous chat line often described as the citizens band radio of the Internet. Another type of virtual space where computer-mediated communication (CMC) takes place is Multi-User Domains (MUDs, sometimes called "Multi-User Dungeons," because of their origin as virtual locations for a Dungeons and Dragons role-playing type of networked gaming). MUDs were initially distinguished from IRC by their persistence, or continued existence over time, and their malleability, where users may take part in the building of a community or even a virtual world, depending on the tools and constraints built into the architecture of their particular MUD. Web-based chatrooms are a third type of chat space where users may converse synchronously in a persistent location hosted by Internet Service Providers (ISPs) or websites, which may be either large Web portals like Yahoo.com or small individual sites.
UNIX was not designed to stop people from doing stupid things, because that would also stop them from doing clever things. —Doug Gwyn
Another type of chat function on the Internet is instant messaging (IM), which allows users to chat with individuals (or invited groups) in “real time,” provided that they know a person’s user name. Instant messaging is distinguished from other chat functions in that it is often used to hold multiple, simultaneous, private one-on-one chats with others. IM is also unusual in that the user can also monitor a list of online friends to see when they are logged in to the instant messaging service. IM chats also differ from other types of chatrooms in that they are not persistent—that is, a user cannot log in to the same chat after the last chatterer has logged off. Instant
message chats are more likely to occur among a group of people with some personal or professional connection than among a group of strangers with only one shared interest who happen to be in the same virtual space at the same time.
History of Internet Chat
The first function that allowed for synchronous or nearly synchronous communication over a network was Talk, available on UNIX machines and the networks that connected them. Developed in the early 1980s, Talk allowed for the nearly synchronous exchange of text between two parties; however, unlike its descendants, it displayed text messages as they were written, character by character, rather than as completed messages posted to the discussion all at once. Talk and its sister program Phone fell into disuse after the introduction of the World Wide Web in 1991 and of graphical, multiuser interfaces.
Home computers are being called upon to perform many new functions, including the consumption of homework formerly eaten by the dog. —Doug Larson
Internet Relay Chat
Jarkko Oikarinen, a Finnish researcher, developed IRC in 1988 based on the older Bulletin Board System (BBS). BBSs were central servers, usually dedicated to a certain topic or interest group, that users could dial in to with a modem to leave messages and hold discussions. Oikarinen wrote the IRC program to allow users to have "real-time" discussions not available on the BBS. First implemented on a server at the University of Oulu where Oikarinen worked, IRC quickly spread to other Finnish universities, and then to universities and ISPs throughout Scandinavia and then the world. Each "channel" on IRC (the name was taken from the Citizen's Band radio community) represents a specific topic. Initially each channel was desig-
nated by a hatch mark (#) and a number. Because that proved difficult to use as IRC expanded, each channel was also given a text label, like #hottub or #gayboston. IRC channels were originally not persistent—anyone could create a channel on any conceivable topic, and when the last person logged out of that channel it ceased to exist. Only with the introduction in 1996 of Undernet and later DALnet did it become possible to create persistent channels. IRC runs through client software—the client software is what allows the user to see the text in the chat channel that they're using and to see who else is currently in that channel. The most popular client is mIRC, a Windows-compatible client; others include Xircon and Pirch. IRC does not have a central organizing system; organizations like universities and research groups simply run the software on their servers and make it available to their users. In the mid-1990s, IRC's decentralized architecture contributed to a system breakdown. In mid-1996, when one IRC server operator, based in North America, started abusing the IRC system, other North American IRC server operators expelled the abuser; however, when he disconnected his server they discovered that he was also the main link between North American and European IRC networks. After weeks of negotiations between North American and European IRC server operators, who disagreed over the handling of the expulsion, the impasse was not resolved. While interconnectivity between continents has been restored, the two IRC networks remain separate (IRCnet and Efnet [Eris Free Net]); they have their own separate channels and have developed separately. Other networks, including DALnet and Undernet, have developed since the separation.
MUDs
Pavel Curtis, a researcher at Xerox PARC who specializes in virtual worlds, gives this definition of a Multi-User Domain: "A MUD is a software program that accepts 'connections' from multiple users across some kind of network and provides to each user access to a shared database of 'rooms', 'exits' and other objects. Each user browses and manipulates this database from inside one of the rooms. A MUD is a kind
of virtual reality, an electronically represented 'place' that users can visit" (Warschauer 1998, 212). MUDs provide more virtual places to visit, hang out, socialize, play games, teach, and learn than IRC or Web-based chatrooms do. Some MUDs have been used to hold meetings or conferences because they allow participants to convene without travel hassles—virtual conferences may have different rooms for different topics and a schedule of events similar to that of a geographically located conference. Two British graduate students, Richard Bartle and Roy Trubshaw, developed the first MUD in 1979, as a multiuser text-based networked computer game. Other MUDs followed, and new subtypes grew, including MOOs (Multiuser domains Object Oriented), used primarily for educational purposes, and MUSHs (Multi-user Shared Hallucinations). MOOs allow for greater control because the users of the virtual space can build objects and spaces as well as contribute text. Because MUDs are complex virtual environments that require users to master commands and understand protocols, rules, and mores, their use and appeal have been limited to a tech-savvy group of users.
A Personal Story—Life Online
In the mid-1990s, I went to visit my first online chatroom as part of a larger project on computer-mediated communication. I had no idea what to expect—whether the people would be who they said they were, whether I'd have anything in common with other visitors, or what it would be like to interact in a text-based medium. I found myself enjoying the experience of talking to people from all over the world and came to spend much time in this virtual community. I soon learned that the community was much larger than the chatroom I had visited, connected by telephone, e-mail, letters, and occasional face-to-face visits. Over the past five years, I've spoken or emailed with many new acquaintances, and have had the pleasure of meeting my online friends in person when my travels take me to their part of the country. Participation in a virtual community has provided me opportunities to talk in depth with people from around the world, including Australia, New Zealand, South America, Mexico, Europe, and even Thailand. The virtual community also brings together people from a wide range of socioeconomic backgrounds that might ordinarily never have mixed. It's been fascinating to get to know such a diverse group of individuals. My personal experiences in an online community have helped shape my research into the societal dimensions of computing and computer-mediated communication. One of my current projects investigates the effects of participation in online support communities on people's mental and physical well-being. In addition, the success with which I've been able to meet and become acquainted with others using a text-only medium has had a strong impact on my theories about how technologies can successfully support remote communication and collaboration.
Susan R. Fussell
Web-Based Chat
Web-based chatting became possible after the World Wide Web was first released onto the Internet in December 1990, but it didn't become popular until after the release of the Java programming language a few years later. Java allowed developers to create user-friendly graphical interfaces to chat spaces on websites or ISP portals that could function across different computing and Internet browsing platforms. Web-based chatting, like IRC, tends to be based around common themes, issues, or specific discussion topics—it has given rise to rooms like
[email protected] or sports- or hobby-themed rooms like The Runners Room on Yahoo Chats. Other chat spaces may be on an individual website devoted to a common theme (like the chat on the Atlantic County Rowing Association site, hosted by ezteams).
Chatroom Users
All the different iterations of chatrooms discussed here have some common elements to help users navigate and quickly understand how to use the software.
After entering a chatroom, channel, or domain, a user is confronted with a screen that is split into two or more parts: One side, usually the left, shows the discussion in progress. In another box on the screen is a list of who is logged in to the room. Generally below these is the box where the user enters text or commands to begin the conversation or move about the space (in the case of MUDs). In some chat spaces, users can create their own private chat with a single individual from the chatroom. In some Web-based tools, the chatroom is designed to use an instant messaging program to conduct one-on-one chats. In others the private chat tool is built-in—in MUDs, a user uses the "whisper" command to direct a comment or conversation to a particular individual, and in some Web-based chats a private chat may be opened in another smaller window in the same chatting interface. In a survey in the summer of 2002, the Pew Internet & American Life Project found that only one-quarter of Internet users had ever visited a chatroom or participated in an online discussion, and only 4 percent had visited a chatroom on a typical day. Men are more likely to use chatrooms than women, as are those who are less well off; those earning less than $50,000 a year are much more likely to chat than those earning more. Younger people are also more likely to chat, particularly those between eighteen and twenty-nine, although among teens, particularly adolescent girls, chatting is frequently perceived as unsafe. Nevertheless, in spite of (or because of) chat's reputation, 55 percent of young people between twelve and seventeen have visited a chatroom. Chatrooms have become the favorite playgrounds of many Internet users because they enable them to assume a character or a role different from the one they play in their offline life. As social psychologist Erving Goffman noted in his 1959 book Presentation of Self in Everyday Life, we present different images of ourselves to different people, and some theorists have described chatrooms as spaces of performance where an identity is "performed" for the audience of other chatters. In certain chatrooms, like MUDs, where gaming or role-playing is often the reason users go there, it is expected that visitors do not bear any resemblance to their selves at the keyboard. In IRC and
Web-based chat, however, there is the expectation that users are presenting themselves honestly. Nevertheless, all chat spaces give users the opportunity to explore portions of their identity, whether it is by choosing to have the opposite gender, a different race, or a different set of personal experiences, or in the case of some games, by exploring what it is like to be something other than human. Anonymity or pseudonymity on line gives many users a feeling of freedom and safety that allows them to explore identities that they dare not assume in the offline world. Users are separated by geographic distances so it is unlikely that actions taken or phrases uttered will come back to haunt them later. And finally, in chat environments without audio or video, communication is mediated by the technology so there are none of the cues that can make a conversation emotional. All of this leads to lower levels of inhibitions, which can either create greater feelings of friendship and intimacy among chat participants or lead to a greater feeling of tension and lend an argumentative, even combative quality to a chat space.
The Future of Chat
In 1991 researchers at Cornell University created CU-SeeMe, the first video chat program to be distributed freely online. Video and audio chat did not truly enter mainstream use until the late 1990s, and with the advent of Apple's iChat and Microsoft's improved chatting programs, video chat utilizing speakers and web cams looks to be the future direction of chatting. Today Yahoo.com and other portal-provided Web-based chatrooms allow audio and video chat in their rooms, though the number of users taking advantage of the technology is still relatively small. A user's bandwidth and hardware capabilities are still limiting factors in the use of the bandwidth-intensive video chat, but as broadband Internet connectivity percolates through the population, the quality of video Web-based chatting available to most users will improve, and its adoption will undoubtedly become more widespread. MUDs and MOOs are also moving into HTML-based environments, which will make it much easier for the average Internet user to adopt them,
and will perhaps move Multi-User Domains from the subculture of academics and devotees into everyday use.
Amanda Lenhart
See also E-mail; MUDs
FURTHER READING
Bartle, R. (1990). Early MUD History. Retrieved July 31, 2003, from http://www.ludd.luth.se/mud/aber/mud-history.html
Bevan, P. (2002). The circadian geography of chat. Paper presented at the conference of the Association of Internet Researchers, Maastricht, Netherlands.
Campbell, J. E. (2004). Getting it on online: Cyberspace, gay male sexuality and embodied identity. Binghamton, NY: The Haworth Press.
Dibbell, J. (1998). A rape in cyberspace. In My tiny life: Crime and passion in a virtual world (chapter 1). Owl Books. Retrieved July 31, 2003, from http://www.juliandibbell.com/texts/bungle.html
IRC.Net. IRC net: Our history. Retrieved July 30, 2003, from http://www.irc.net/
Hudson, J. M., & Bruckman, A. S. (2002). IRC Français: The creation of an Internet-based SLA community. Computer Assisted Language Learning, 1(2), 109–134.
Kendall, L. (2002). Hanging out in the virtual pub: Masculinities and relationships online. Berkeley: University of California Press.
Lenhart, A., et al. (2001). Teenage life online: The rise of the instant message generation and the Internet's impact on friendships and family relationships. Pew Internet & American Life Project. Retrieved August 21, 2003, from http://www.pewinternet.org/
Murphy, K. L., & Collins, M. P. (1997). Communication conventions in instructional electronic chats. First Monday, 11(2).
Pew Internet & American Life Project. (2003). Internet activities (Chart). Retrieved July 31, 2003, from http://www.pewinternet.org/reports/index.asp
Pew Internet & American Life Project. (2003). Unpublished data from June–July 2002 on chatrooms. Author.
Reid, E. M. (1994). Cultural formation in text-based virtual realities. Unpublished doctoral dissertation, University of Melbourne, Australia. Retrieved July 31, 2003, from http://www.aluluei.com/cult-form.htm
Rheingold, H. (1993). The virtual community: Homesteading on the electronic frontier. Cambridge, MA: MIT Press.
Rheingold, H. (1998). Building fun online learning communities. Retrieved July 30, 2003, from http://www.rheingold.com/texts/education/moose.html
Schaap, F. (n.d.). Cyberculture, identity and gender resources (online hyperlinked bibliography). Retrieved July 31, 2003, from http://fragment.nl/resources/
Surkan, K. (n.d.). The new technology of electronic text: Hypertext and CMC in virtual environments. Retrieved July 31, 2003, from http://english.cla.umn.edu/GraduateProfiles/Ksurkan/etext/etable.html
Talk mode. (n.d.). The jargon file. Retrieved November 1, 2002, from http://www.tuxedo.org/~esr/jargon/html/entry/talk-mode.html
Taylor, T. L. (1999). Life in virtual worlds: Plural existence, multimodalities and other online research challenges. American Behavioral Scientist, 4(3).
Turkle, S. (1995). Life on the screen: Identity in the age of the Internet. New York: Simon & Schuster.
Warshauer, S. C. (1998). Multi-user environment studies: Defining a field of study and four approaches to the design of multi-user environments. Literary and Linguistic Computing, 13(4).
Young, J. R. (1994). Textuality and cyberspace: MUD's and written experience. Retrieved July 31, 2003, from http://ftp.game.org/pub/mud/text/research/textuality.txt
CHILDREN AND THE WEB
Children are among the millions of people who have been introduced to new ways of accessing information on the World Wide Web, which was launched in 1991 and began to become popular with the adoption of a graphical user interface in 1993. The fact that the Web utilizes hypertext (content with active links to other content) and a graphical user interface has made it more congenial and much easier to use than earlier menu-driven, text-based interfaces to the Internet (e.g., Gopher, Jughead, Veronica).
Children’s Web Use Children use the Web inside and outside the classroom, and they navigate it to find information for both simple and complex projects. They recognize the Web as a rich source of up-to-date information, hard-to-find information, and compelling images. Research by Dania Bilal (2000) and Jinx Watson (1998) has revealed that children who use the Web have a sense of independence, authority, and control. They are motivated, challenged, and selfconfident. They prefer the Web to print sources due to the vast amount of information available and their ability to search by keyword and browse subject hierarchies quickly. Research conducted for the Pew Internet & American Life Project revealed that both parents and children believe that the Internet helps with learning. While these positive perceptions of the Internet are encouraging, children’s success in finding information on the Web is questioned. Given
the Web's increasing complexity and the abundance of information available there, it is worth asking how well children handle the challenges of using the Web. Researchers from library and information science, educational psychology, sociology, cognitive science, and human-computer interaction have studied children's interaction with the Web. In the field of information science, researchers have investigated children's search strategies, their relative preferences for browsing and searching, their successes and failures, the nature of tasks and success, Web design, and children's navigational skills, relevance judgment, and affective states (feelings, perception, motivation). Findings and conclusions from these studies have begun to provide a rich framework for improving system design and developing more effective Web training programs.
Two of the many books available that educate children on the perils and promise of the Web. Safety on the Internet is geared to ages 6–9, while Cyber Space is meant for ages 9–12.
The first study in library and information science appeared in 1997 when Yasmin Kafai and Marcia Bates examined elementary schoolchildren's Web literacy skills. They found that children were enthusiastic about using the Web and were able to scroll webpages and use hyperlinks effectively. However, the researchers perceived that many websites had too much text to read and too much diffi-
cult vocabulary for elementary schoolchildren to understand. Children in that age range preferred sites with high visual content, animation, and short, simple textual content. In 1998 the researchers John Schacter, Gregory Chung, and Aimee Dorr studied the effect of types of tasks on the success of fifth and sixth graders in finding information. They found that children browsed more than they searched by keyword and performed better on open-ended (complex) than factual (simple) tasks. By contrast, in 2000 Terry Sullivan and colleagues found that middle and high school students were more successful on simple tasks than complex ones. Results from Dania Bilal's research in 2000–2002 echoed Sullivan's results and revealed that middle school students were more successful on tasks that they chose themselves than they were on tasks that were assigned. In 1999 Andrew Large, Jamshid Beheshti, and Haidar Moukdad examined the Web activities of Canadian sixth graders. These researchers found that children browsed more than they searched by keyword, had difficulty finding relevant information, and, although they had been given basic Web training, lacked adequate navigational skills. The children's use of the Netscape "Back" command to
return to the previous page, for example, accounted for 90 percent of their total Web moves; they activated online search help only once. In fact, frequent use of the "Back" command is common among children and young adults. Various studies in the late 1990s and early 2000s found similar results. In a follow-up to a 1999 study, Andrew Large and Jamshid Beheshti (2000) concluded that children valued the Web for finding information on hard topics, speed of access, and the availability of color images, but perceived it as more difficult to use than print sources. Children expressed frustration with information overload and with judging relevance of the retrieved results. Information overload and problems determining relevance seem to be widespread among children and young adults using the Web; a study of elementary, middle, and high school students in England corroborated Large and Beheshti's finding. Most children assume that the Web is an efficient and effective source for all types of information. Consequently, they rarely question the accuracy and authority of what they find. If they retrieve results that are close enough to the topic, they may cease to pursue their initial inquiry and take what they get at face value. Most studies focused on using the Web as a whole and on search engines that are developed for adult users rather than children. Bilal has investigated the information-seeking behavior of children who used Yahooligans!, a search engine and directory specifically designed for children aged seven through twelve. She found that 50 percent of the middle school children were successful on an assigned, fact-based task, 69 percent were partially successful on an assigned, research-based task, and 73 percent were successful on tasks they selected themselves. The flexibility children had in topic selection and modification combined with their satisfaction with the results may have influenced their success rate on the self-selected task. Children were more motivated, stimulated, and engaged in completing their tasks when they selected topics of personal interest. The children used concrete concepts (selected from the search questions) in their searches and, when these concepts failed to generate relevant information, they utilized abstract ones (synonyms or related terms). The children had trouble finding informa-
tion, despite the fact that most of the concepts they employed were appropriate. The failure to find results can be attributed largely to the poor indexing of the Yahooligans! database. Overall, the children took initiative and attempted to counteract their information retrieval problems by browsing subject categories. Indeed, they were more successful when they browsed than when they searched by keyword. Children’s low success rates on the assigned tasks were attributed to their lack of awareness of the difference between simple and complex tasks, especially in regard to the approach to take to fulfill the assignment’s requirements. On the complex assigned task, for example, children tended to seek specific answers rather than to develop an understanding of the information found. On the positive side, children were motivated and persistent in using the Web. When asked about reasons for their motivation and persistence, children cited convenience, challenge, fun, and ease of use. Ease of use was described as the ability to search by keyword. On the negative side, children expressed frustration at both information overload and the zero retrieval that resulted from keyword searching. Indeed, this feature was central to most of the search breakdowns children experienced. Although Yahooligans! is designed for children aged seven through twelve, neither its interface design nor its indexing optimized children’s experience. Children’s inadequate knowledge of how to use Yahooligans! and their insufficient knowledge of the research process hindered their success in finding information.
Optimizing the Web for Children Children’s experiences with the Web can be greatly improved by designing Web interfaces that build on their cognitive developmental level, information-seeking behaviors, and information needs. Since 2002, Bilal (working in the United States) and Large, Beheshti, and Tarjin Rahman (working together in Canada) have begun projects that involve children in the design of such interfaces. Both groups have concluded that children are articulate about their information needs and can be effective design partners. Based on the ten interfaces that children designed for search engines, Bilal was able to identify the types
of information architecture, functionality, and visual design that children needed and sought. In sum, both Bilal and the Canadian-based team concluded that children are creative searchers who are more successful when they browse than when they search by keyword. Typically, children prefer keyword searching but resort to browsing when they experience continued information-retrieval problems. Children do not take advantage of the search features provided in search engines and rarely activate the help file for guidance. The research also revealed that children have both positive and negative feelings when it comes to the Web. They associate the Web with motivation, challenge, convenience, fun, authority, independence, and self-control, but also with frustration, dissatisfaction, and disappointment caused by information overload, lack of success in searches, and inability to make decisions about document relevance. As to information literacy, children possess inadequate information-seeking skills, naïve Web navigational skills, and an insufficient conceptual understanding of the research process. These problems cry out to teachers and information specialists to provide more effective Web training and to design instructional strategies that successfully integrate the Web into effective learning. With regard to system design, it appears that websites, Web directories, and search engines are not easy for children to use. Too much text, difficult vocabulary, long screen displays, deep subject hierarchies, ineffective help files, poor indexing, and misleading hyperlink titles all hinder children’s successful use.
Education, Design, and Future Research Use of the Web in school and its increased use at home do not ensure that children possess effective skills in using it. Information professionals (e.g., school and public librarians) who serve children need to collaborate with teachers to identify how the Web can effectively support meaningful learning. Teachers cannot make the Web an effective learning and research tool unless they first receive effective, structured training
in its use. Children, too, should be taught how to use the Web effectively and efficiently. With critical-thinking skills and an understanding of how to manipulate the Web, children can move from being active explorers of the Web to becoming discerning masters of it. In discussing how usable Web interfaces are for children, Jakob Nielsen notes that “existing Web [interfaces] are based at best by insights gleaned from when designers observe their own children, who are hardly representative of average kids, typical Internet skills, or common knowledge about the Web” (Nielsen 2002, 1). Thus, it is not surprising to find that children experience difficulty in using the Web. System developers need to design interfaces that address children’s cognitive developmental level, information needs, and information-seeking behaviors. Developing effective Web interfaces for children requires a team effort involving information scientists, software engineers, graphic designers, and educational psychologists, as well as the active participation of representative children. We have a growing understanding of the strengths and weaknesses of the Web as a tool for teaching and learning. We also know much about children’s perceptions of and experiences with the Web, as well as their information-seeking behavior on the Web. The rapid progress made in these areas of study is commendable. However, research gaps remain to be filled. We do not have sufficient research on working with children as partners in designing Web interfaces. We have investigated children’s information-seeking behavior in formal settings, such as schools, to meet instructional needs, but with the exception of Debra J. Slone’s 2002 study, we have little information on children’s Web behavior in informal situations, when they are using it to meet social or entertainment needs. We also lack a sophisticated model that typifies children’s information-seeking behavior. We need to develop a model that more fully represents this behavior so that we can predict successful and unsuccessful outcomes, diagnose problems, and develop more effective solutions. Dania Bilal See also Classrooms; Graphical User Interface; Search Engines
FURTHER READING
Bilal, D. (1998). Children’s search processes in using World Wide Web search engines: An exploratory study. Proceedings of the 61st ASIS Annual Meeting, 35, 45–53.
Bilal, D. (1999). Web search engines for children: A comparative study and performance evaluation of Yahooligans!, Ask Jeeves for Kids, and Super Snooper. Proceedings of the 62nd ASIS Annual Meeting, 36, 70–82.
Bilal, D. (2000). Children’s use of the Yahooligans! Web search engine, I: Cognitive, physical, and affective behaviors on fact-based tasks. Journal of the American Society for Information Science, 51(7), 646–665.
Bilal, D. (2001). Children’s use of the Yahooligans! Web search engine, II: Cognitive and physical behaviors on research tasks. Journal of the American Society for Information Science and Technology, 52(2), 118–137.
Bilal, D. (2002). Children’s use of the Yahooligans! Web search engine, III: Cognitive and physical behaviors on fully self-generated tasks. Journal of the American Society for Information Science and Technology, 53(13), 1170–1183.
Bilal, D. (2003). Draw and tell: Children as designers of Web interfaces. Proceedings of the 66th ASIST Annual Meeting, 40, 135–141.
Bilal, D. (In press). Research on children’s use of the Web. In C. Cole & M. Chelton (Eds.), Youth information seeking: Theories, models, and approaches. Lanham, MD: Scarecrow Press.
Druin, A., Bederson, B., Hourcade, J. P., Sherman, L., Revelle, G., Platner, M., et al. (2001). Designing a digital library for young children. In Proceedings of the first ACM/IEEE-CS Joint Conference on Digital Libraries (pp. 398–405). New York: ACM Press.
Fidel, R., Davies, R. K., Douglass, M. H., Holder, J. K., Hopkins, C. J., Kushner, E. J., et al. (1999). A visit to the information mall: Web searching behavior of high school students. Journal of the American Society for Information Science, 50(1), 24–37.
Hirsh, S. G. (1999). Children’s relevance criteria and information seeking on electronic resources. Journal of the American Society for Information Science, 50(14), 1265–1283.
Kafai, Y. B., & Bates, M. J. (1997). Internet Web-searching instruction in the elementary classroom: Building a foundation for information literacy. School Library Media Quarterly, 25(2), 103–111.
Large, A., & Beheshti, J. (2000). The Web as a classroom resource: Reactions from the users. Journal of the American Society for Information Science and Technology, 51(12), 1069–1080.
Large, A., Beheshti, J., & Moukdad, H. (1999). Information seeking on the Web: Navigational skills of grade-six primary school students. Proceedings of the 62nd ASIS Annual Meeting, 36, 84–97.
Large, A., Beheshti, J., & Rahman, T. (2002). Design criteria for children’s Web portals: The users speak out. Journal of the American Society for Information Science and Technology, 53(2), 79–94.
Lenhart, A., Rainie, L., & Oliver, L. (2003). Teenage life online: The rise of the instant-message generation and the Internet’s impact on friendships and family relationships. Washington, DC: Pew Internet and American Life Project. Retrieved January 4, 2004, from http://www.pewinternet.org/reports/pdfs/PIP_Teens_Report.pdf
Lenhart, A., Simon, M., & Graziano, M. (2001). The Internet and education: Findings of the Pew Internet & American Life Project. Washington, DC: Pew Internet and American Life Project. Retrieved January 4, 2004, from http://www.pewinternet.org/reports/pdfs/PIP_Schools_Report.pdf
Nielsen, J. (2002). Kids’ corner: Website usability for children. Retrieved January 4, 2004, from http://www.useit.com/alertbox/20020414.html
Schacter, J., Chung, G. K. W. K., & Dorr, A. (1998). Children’s Internet searching on complex problems: Performance and process analyses. Journal of the American Society for Information Science, 49(9), 840–849.
Shenton, A. K., & Dixon, P. (2003). A comparison of youngsters’ use of CD-ROM and the Internet as information resources. Journal of the American Society for Information Science and Technology, 54(11), 1049–2003.
Slone, D. J. (2002). The influence of mental models and goals on search patterns during Web interaction. Journal of the American Society for Information Science and Technology, 53(13), 1152–1169.
Wallace, R. M., Kupperman, J., & Krajcik, J. (2002). Science on the Web: Students on-line in a sixth-grade classroom. The Journal of the Learning Sciences, 9(1), 75–104.
Watson, J. S. (1998). If you don’t have it, you can’t find it: A close look at students’ perceptions of using technology. Journal of the American Society for Information Science, 49(11), 1024–1036.
CLASSROOMS People have regarded electronic technology throughout its evolution as an instrument for improving learning in classrooms. Television and video were early examples of electronic technology used in classrooms, and now personal computers have shown how electronic technology can enhance teaching and learning. Some people have lauded the new kinds of learning activities afforded by electronic technology, but others maintain that such technology can be detrimental in classrooms. Despite such criticisms, researchers in different fields—education, computer science, human-computer interaction—continue to explore how such technology, paired with innovative curricula and teacher training, can improve classrooms.
Early Visions of Learning Technologies Early visions of how technology could be applied to learning included so-called behaviorist teaching machines inspired by the U.S. psychologist B. F. Skinner in the 1960s. Skinner believed that classrooms
suffered from a lack of individual attention and that individualized instruction would improve learning. The idea was that individual students could use a computer that would teach and test them on different topics. Students would receive positive reinforcement from the computer through some reward mechanism (e.g., praise and advancement to the next level of instruction) if they gave correct responses.
History Comes Alive in Cyberspace OLD DEERFIELD, Mass. (ANS)—On a blustery spring morning, 18 students from Frontier Regional High School made their way down Main Street here in this colonial village, jotting down notes on the Federal and Italianate architecture and even getting a look at an early 18th-century kitchen. But this was no ordinary field trip. The students were gathering information for an Internet-based project that is integrating state-of-the-art computer technology with the social studies curriculum throughout this rural western Massachusetts school district. The project, titled Turns of the Centuries, focuses on life at the turns of the last three centuries, beginning in 1700 and continuing through 1900. It’s an unusual partnership between three distinct entities—a secondary school, a university and a museum. In the project, the primary sources of the Pocumtuck Valley Memorial Association, a nationally recognized museum of frontier life in this region, will be available to students through a web site that teachers, students and researchers are putting together. Central to the project are the over 30,000 museum artifacts—diaries, letters and other ‘primary sources’—made available to students through the developing web site. The marriage of technology with the museum archives has made possible new opportunities for “inquiry-based” education, which focuses on developing the student as active learner. In essence, the educational project here is a cyberspace version of the museum, enabling students to access archives through the Internet, either from their homes or through
computer labs that are being established throughout district schools. But as the trip to Old Deerfield demonstrated, students will also add to the pool of knowledge and contribute original data to the web site as well. “This is not just an electronic test book,” said Tim Neumann, executive director of Pocumtuck Valley Memorial Association and one of the project’s designers. “Students are not just surfing the web but actively engaging with the text and images on the screen.” Students also address questions posed by teachers and then conduct research via the Internet, as well as other field studies, he said. Building the web sites, from teachers’ notes and classroom lesson plans, are students and technicians at the University of Massachusetts Center for Computer-Based Instructional Technology. The students in Friday morning’s expedition were responding to an assignment to choose a colonial family and track them over time, using the resources at the museum. Those results will eventually be incorporated into the Turns of the Centuries web site, where other students throughout the K-12 district will be able to access them. In addition to helping acquaint students with emerging technologies, the Turns of the Centuries project instructs teachers how to teach social studies with a web-based curriculum, and how to access these resources in their classrooms, as well as exploring the potential partnerships among school and communities linked by the information highway. Robin Antepara Source: Students learn about history with classroom computers of tomorrow. American News Service, June 17, 1999.
Incorrect responses would prevent advancement to the next level of questions, giving students the opportunity to consider how they could correct their responses. Software adopting this approach is frequently called “drill and practice” software, but few examples of such software exist outside of educational games and other kinds of “flash card” programs that teach skills such as spelling and arithmetic.
A different vision is found in the work of Seymour Papert, an MIT professor who has explored technology in education since the 1960s, advocating learning theories proposed by the Swiss psychologist Jean Piaget. Papert’s vision treats computers as tools that children use for exploratory and constructive activities. Through these activities children create and shape their own understanding of concepts. Papert incorporated these ideas in the Logo programming language, which was intended to let children write programs that create computer graphics while exploring deeper concepts, such as the mathematical concepts needed to draw geometric figures. A related vision came from Alan Kay, a renowned computer science researcher, who proposed the Dynabook concept during the early 1970s. The Dynabook was envisioned as a device similar to today’s laptop computer that children could use in information-rich and constructive activities. The Dynabook would have basic core software functionality (using the Smalltalk computer language). However, children could extend their Dynabook’s functionality through Smalltalk programming. This would allow children to create new tools for creative expression, information gathering, simulation, and so forth by learning not only programming but also the fundamentals of the underlying content (e.g., to create a music tool, students would need to learn musical concepts).
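The spirit of this approach can be suggested with a brief sketch. Logo itself is not shown here; instead the example below uses Python's standard turtle module, which borrows Logo's turtle-graphics model, to show how even drawing a simple square engages a child with the underlying mathematics (equal sides, 90-degree turns).

    import turtle  # Python's turtle module borrows Logo's turtle-graphics model

    t = turtle.Turtle()
    for _ in range(4):     # a square has four equal sides...
        t.forward(100)     # move forward 100 units
        t.right(90)        # ...and four 90-degree turns (4 x 90 = 360)
    turtle.done()          # keep the drawing window open

In Papert's terms, the point is not the picture but the reasoning the child must do to produce it.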
Initial Attempts at Technology-Enhanced Classrooms Each technological vision has brought promises of how technology can improve classrooms and learning. With the advent of personal computers, educators rushed to place computers in classrooms with the hope of implementing different visions of learning technologies. However, many initial attempts at technology-enhanced classrooms fell short of their promise because of technological and contextual issues in classrooms. One issue was that although people had some ideas about what kinds of learning activities and goals computers might support, they had little
concrete design information to guide software developers in developing and assessing effective software for learning. Many software projects had little real grounding in learning theories and the nature of children. Thus, for every successful software project, many others had educational content that was lacking and classroom use that was less than successful. For instance, many software projects involved the development of educational games (sometimes called “edutainment” software) whose educational content was dubious and whose initial appeal to children soon wore off. Other approaches involved tools such as HyperCard, which allowed laypeople to create software with the hope that educators could create software for their students. However, although the idea of teacher-created software was well intentioned and although teachers have educational knowledge, they lack software design knowledge, again resulting in few major successes. Other issues were contextual. Many early attempts at educational computing were technocentric and lacked a full understanding of the support needed in classrooms. Without adequate training, busy teachers can find electronic technology a large enough burden that they simply bypass it. Furthermore, technology has been introduced into the classroom without a good understanding by teachers (and sometimes by researchers developing the technology) of how the technology interacts with the classroom curriculum and learning goals. Again, technology has little impact if it is not a good fit with the activities that teachers desire. Finally, schools have lacked adequate technological resources to make full use of technology, so disparities in the number of computers in classrooms and in network connectivity have hindered effective use of technology.
Designing Learner-Centered Technology Simply developing stand-alone technology for classrooms is not enough. If effective technology-enhanced classrooms are to become a reality, then designers must design an overall learning system
that integrates three factors: technology, curriculum, and teacher support and development. During the last ten to fifteen years designers have developed many learning systems in the classroom by considering these three factors. Research in many content areas, such as science education, is shedding light on effective technology-enhanced classrooms. In such educational approaches technology acts as a cognitive tool to support learners as they engage in curricular activities. For example, many educational approaches in science education use an inquiry-based technique in which students engage in the same kinds of scientific activity—finding scientific articles, gathering and visualizing scientific data, building scientific models, and so forth—in which experts engage. For instance, learners can use software to search digital libraries for information, use handheld computers with probes to gather data in different locations, use software to build graphs and scientific models, and so forth. Such technology should be designed to support learners in mindfully engaging in curricular activities so that learners can meet the learning goals that their teachers have outlined. Given this motivation, the approach for designing learner-centered technologies shifts from simply designing technologies whose hallmark is “ease of use” to designing technologies that learners can use in new, information-rich activities. Developing learner-centered technologies requires designers to understand the kinds of work that learners should engage in (i.e., curricular activities) and the learning goals of those learners. Then designers need to understand the areas where learners may face difficulties in performing those kinds of work (e.g., learners may not know what kinds of activities comprise a science investigation or how to do those activities) so that designers can create support features that address those difficulties. Furthermore, such support features differ from usability-oriented traditional software design. Although ease of use is still important, learner-centered technologies should not necessarily make tasks as easy as possible. Rather, just as a good teacher guides students toward an answer without giving the answer outright, learner-centered technologies must provide enough support to make tasks accessible to novice learners but leave
enough challenge that learners still work in the mindful manner needed for real learning to occur. Teacher support and development are also key for technology-enhanced classrooms. Teacher schedules and the classroom environment are busy, and introducing technology into classrooms can make matters more complex for teachers. Teachers need support and development to show them how technology works, how they can integrate technology into their classroom activities, and how they can use technology effectively in the classroom.
New Visions of Technology-Enhanced Classrooms Current classroom technology includes primarily desktop-based software. Some software implements “scaffolding” features that support learners by addressing the difficulties they encounter in their learning activities. For example, one particular software feature implementing a scaffolding approach would be a visual process map that displays the space of activities that learners should perform (e.g., the activities in a science investigation) in a way that helps them understand the structure of the activities. Other classroom software includes intelligent tutoring systems that oversee students as they engage in new activity. Intelligent tutoring systems can sense when students have encountered problems or are working incorrectly and can provide “just-in-time” advice to help them see their errors and understand their tasks. Aside from traditional desktop-based software, researchers are exploring new technology. For example, handheld computers (e.g., Palm or PocketPC computers) are becoming more pervasive among students. The mobility of handheld computers lets students take them to any learning context, not just the classroom. Thus, researchers are exploring how to develop learning tools for handheld computers. An example of such tools is probes that can be attached to handhelds for scientific data gathering (e.g., probes to measure oxygen levels in a stream). Handhelds with wireless networking capability can be used to gather information (e.g., access digital libraries) from a range of locations outside of a classroom. Additionally, handhelds can be part of
A Personal Story—Learning through Multimedia When I talk about why I became interested in exploring computers in education, I like to tell a story from my early graduate school days in the late 1990s. My research group was working in a local Michigan high school using the MediaText software they had developed earlier. MediaText was a simple text editor that made it possible to incorporate different media objects, such as images, sounds, or video, into the text one was writing. In one class, students had been given an assignment to explain a series of physics terms. One particular student sometimes had difficulty writing, but with MediaText, she could use other media types for her explanations. For example, using video clips from the movie Who Framed Roger Rabbit? she explained potential energy with a clip of a cartoon baby sitting on top of a stack of cups and saucers, swaying precariously without falling over. Then she explained kinetic energy with a clip of the same baby sliding across the floor of a room. What struck me was that it was clear from her choice of video clips that she understood those physics concepts. If she had been confined to textual explanations, she might not have been able to convey as much understanding. But because she had a choice of media types, she was able to successfully convey that she knew those concepts. This episode helped me realize how computers could impact learners by giving them a range of different media types for self-expression. Now sometimes this story gets me in trouble with people who say that if you give students all these alternatives to writing, they’ll never learn to write correctly. I’ll buy that…to a certain extent. But people are diverse— they learn differently and they express themselves differently. My response to the naysayers is that effectively incorporating different media in software tools isn’t for flash, but to give people different “languages” to learn from and use. By offering these alternatives, we open new educational doors, especially for today’s diverse, tech-savvy kids. After all, if one student can explain physics terms using a movie about a cartoon rabbit, then multimedia in the classroom is working. Chris Quintana
new kinds of learning activities called “participatory simulations” in which groups of students can use the “beaming” feature of wireless handhelds to be part of a simulation in which they exchange information. For example, students can explore epidemiological crises in a simulation in which they “meet” other people by exchanging information with their handhelds. During the simulation a student’s handheld might announce that it is “sick,” at which point students would engage in a discussion to understand how disease might spread through a community. Researchers are also exploring the other end of the spectrum, looking at how large displays and virtual reality can be used as learning tools. Such tools can help students explore virtual worlds and engage in activities such as “virtual expeditions.” Students can explore environments that might be difficult or impossible to explore in person (e.g., different ecosystems), thus allowing them to engage in inquiry-based activities throughout a range of locations and gather otherwise inaccessible information.
Meeting the Challenge Technology-enhanced classrooms have had failures as researchers have struggled to understand not only the kinds of effective learning technologies, but also the role of technology in classrooms and the support needed for effective technology-enhanced classrooms. Critics of technology in classrooms still exist. Education professor Larry Cuban has written extensively on the problems and failures of technology in classrooms. Scientist and author Clifford Stoll has also written about the possible adverse effects of technology and the caution that must be exercised with children. However, successes and new visions of how technology-enhanced classrooms can support learners also exist. Designers of learning technologies need to understand that designing software for ease of use is not enough. Designers must understand learning theories, the nature of learners, and the classroom context to design cognitive learning technologies that students use to mindfully engage in substantive
learning activities. People implementing technology-enhanced classrooms need to consider other issues, such as a classroom curriculum that complements the technology and teachers who have the support and development they need in order to understand and make full use of it. As new technology arises, people will always attempt to see how that technology can be used to enhance learning. By understanding the classroom context and the local design issues involved in developing learner-centered technology, the human-computer interaction community can make significant contributions to realizing the promise of technology-enhanced classrooms. Chris Quintana See also Children and the Internet; Psychology and HCI
FURTHER READING
Bransford, J. D., Brown, A. L., & Cocking, R. R. (Eds.). (2000). How people learn: Brain, mind, experience, and school (Exp. ed.). Washington, DC: National Academy Press.
Cuban, L. (1986). Teachers and machines: The classroom use of technology since 1920. New York: Teachers College Press.
Kay, A., & Goldberg, A. (1977). Personal dynamic media. IEEE Computer, 10(3), 31–41.
Papert, S. (1980). Mindstorms. New York: Basic Books.
Quintana, C., Soloway, E., & Krajcik, J. (2003). Issues and approaches for developing learner-centered technology. In M. Zelkowitz (Ed.), Advances in computers: Volume 57. Information repositories (pp. 272–323). New York: Academic Press.
Reiser, B. J. (2002). Why scaffolding should sometimes make tasks more difficult for learners. Proceedings of CSCL 2002, 255–264.
Soloway, E., Guzdial, M., & Hay, K. E. (1994). Learner-centered design: The challenge for HCI in the 21st century. Interactions, 1(2), 36–48.
CLIENT-SERVER ARCHITECTURE Client-server architecture is one of the many ways to structure networked computer software. Developed during the 1980s out of the personal
computer (PC) explosion, client-server architecture provides a distributed synthesis of the highly interactive personal computer (the client) with a remotely located computer providing data storage and computation (the server). The goal of client-server architecture is to create structures and communication protocols between the client computer and the server computer in order to optimize access to a set of computational resources.
Motivating Example To understand client-server architecture, one can consider a manufacturing company using computer technology to support day-to-day business operations and long-range strategic planning. Product orders come from the sales department, inventory is maintained by the manufacturing department, and the raw materials orders are generated by the planning department. Furthermore, the accounting department tracks the money, and the chief executive officer (CEO) wants a perspective on all aspects of the company. To be judged successful, the software solution implemented should provide data storage and update capability for all aspects of the company operation. Further, the appropriate database segments should be accessible by all of the employees based on their particular job responsibility, regardless of where they are physically located. Finally, the application views of the database should be highly usable, interactive, and easy to build and update to reflect ongoing business growth and development.
Conflicting Goals One key feature of any software application is the database, the dynamic state of the application. For example, the status of inventory and orders for a factory would be maintained in a database management system (DBMS). Modern database management technology is quite well developed, supporting database lookup and update in a secure, high-performance fashion. DBMS computers, therefore, are typically high-performance, focused on the task, and have large permanent storage capacity (disk) and large working memory. The cost of this hardware, the
critical need for consistency, and the complexity of system management dictate that the DBMS be centrally located and administered. This goal was realized in the mainframe architecture of the 1960s and the time-sharing architecture of the 1970s. On the other hand, personal computer applications such as the spreadsheet program VisiCalc, introduced in 1979, demonstrate the power of highly interactive human-computer interfaces. Responding instantly to a user’s every keystroke and displaying results using graphics as well as text, the PC has widened the scope and number of users whose productivity would be enhanced by access to computing. These inexpensive computers bring processing directly to the users but do not provide the same scalable, high-performance data-storage capability of the DBMS. Furthermore, the goal of information management and security is counter to the personal computer architecture, in which each user operates on a local copy of the database. The network—the tie that binds together the DBMS and the human-computer interface—has evolved from proprietary system networks, such as IBM System Network Architecture (SNA), introduced in 1974, to local area networks, such as Ethernet, developed at Xerox’s Palo Alto Research Center (PARC) and introduced in 1976, to the Internet, which began as the U.S. Department of Defense’s Advanced Research Projects Agency network (Arpanet) in 1969 and continues to evolve. A networking infrastructure allows client software, operating on a PC, to make requests of the server for operations on the user’s behalf. In other words, the network provides for the best of both worlds: high-performance, high-reliability components providing centralized data computation and user interface components located on the personal computer providing high interactivity and thereby enhanced usability. Furthermore, by defining standardized message-passing protocols for expressing the requests from client to server, a level of interoperability is achieved. Clients and servers coming from different vendors or implementing different applications may communicate effectively using protocols such as Remote Procedure Call (RPC) or Structured Query Language (SQL), together with binding services such as the
Common Object Request Broker Architecture (CORBA) or the Component Object Model (COM). Returning to our motivating example, the software solution would include a separate interactive PC application designed for each business function: sales, manufacturing, accounting, planning, and the CEO. Each of these individual PC applications would use an RPC call for each query or update operation to the company database server. This partitioning of function is effective both in terms of hardware cost-performance (relatively inexpensive client computers for each user versus a relatively expensive database server computer shared between all users) and end-user application design. As the number of simultaneous users grows, the portion of a server’s computation time spent managing client-server sessions grows as well. To mitigate this processing overhead, it is useful to introduce an intermediary server to help handle the client-server requests. Called a “message queuing server,” this software system accepts operations to be performed on the database and manages the request queues asynchronously. Priority information allows intelligent management and scheduling of the operations. Result queues, returning answers back to the requesting client, provide for asynchronous delivery in the other direction as well. Through a message server the queuing operations are offloaded from the database server, providing enhanced throughput. The message server also leads to increased flexibility because the message queuing provides a layer of translation and independence between the client software and the DBMS server.
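The division of labor described above can be sketched in a few lines of Python. The sketch is illustrative only: the department names, the operations, and the dictionary standing in for the company database are all invented, and a worker thread plus two queues stand in for the client applications, the message-queuing server, and the DBMS server.

    import queue
    import threading

    inventory = {"widgets": 120, "gears": 40}   # stand-in for the central company database

    request_q = queue.PriorityQueue()   # the "message queuing server": orders requests by priority
    result_q = queue.Queue()            # reply queue carrying answers back to clients

    def db_server():
        """Stand-in for the DBMS server: process queued requests against the database."""
        while True:
            priority, client, op, item = request_q.get()
            if op == "stop":
                break
            if op == "lookup":
                result_q.put((client, item, inventory.get(item, 0)))
            elif op == "decrement":   # e.g., the sales application records an order
                inventory[item] = inventory.get(item, 0) - 1
                result_q.put((client, item, inventory[item]))

    # Client applications enqueue requests; lower priority numbers are served first.
    request_q.put((2, "planning", "lookup", "gears"))
    request_q.put((1, "sales", "decrement", "widgets"))
    request_q.put((9, "admin", "stop", None))

    server = threading.Thread(target=db_server)
    server.start()
    server.join()

    while not result_q.empty():
        print(result_q.get())   # ('sales', 'widgets', 119) then ('planning', 'gears', 40)

A real deployment would, of course, place the clients, the queue manager, and the database on separate machines connected by the network protocols discussed above; the sketch only shows how priorities and reply queues decouple the clients from the database server.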
Business Processes Although PC-client access to a server-based DBMS was an early client-server scenario and continues to be important, client-server architectures can also include other types of network services. For example, an application server hosts computation rather than data storage, as with a DBMS server. The business processes for an enterprise may be implemented using an application server. Like the message queuing server, the application server sits between the client software and the DBMS, encapsulating functions that may be common across many clients, such as policies and procedures.
The Future of Client-Server Computing Client-server computing will continue to be important long into the future. PCs continue to drop in price, and new networked devices such as personal digital assistants (PDAs) and the World Wide Web are driving network accessibility to a broader audience. The client-server architecture, which lives on the network through standardized messaging protocols, will continue to have wide applicability, especially in business. Mark R. Laff See also Peer-to-Peer Architecture
FURTHER READING
Berson, A. (1992). Client/server architecture. New York: McGraw-Hill.
Berson, A. (1995). Sybase and client/server computing. New York: McGraw-Hill.
Comer, D. (1994). Internetworking with TCP/IP: Vol. 3. Client-server programming and applications. Englewood Cliffs, NJ: Prentice Hall.
Corbin, J. R. (1991). The art of distributed applications: Programming techniques for remote procedure calls. New York: Springer-Verlag.
Edelstein, H. (1994). Unraveling client/server architecture. Redwood City, CA: M & T Publishing.
Hall, C. (1994). Technical foundations of client/server systems. New York: Wiley.
IBM Corporation. (2002). Websphere MQ application message interface (SC34-6065-00). Armonk, NY: International Business Machines Corporation.
Krantz, S. R. (1995). Real world client server: Learn how to successfully migrate to client/server computing from someone who’s actually done it. Gulf Breeze, FL: Maximum Press.
Metcalfe, R. M., & Boggs, D. R. (1976). Ethernet: Distributed packet switching for local computer networks. Communications of the ACM, 19(5), 395–404.
Sims, O. (1994). Business objects: Delivering cooperative objects for client-server. New York: McGraw-Hill.
COGNITIVE WALKTHROUGH The cognitive walkthrough (CW) is a usability evaluation approach that predicts how easy it will be for people to learn to do particular tasks on a
computer-based system. It is crucial to design systems for ease of learning, because people generally learn to use new computer-based systems by exploration. People resort to reading manuals, using help systems, or taking formal training only when they have been unsuccessful in learning to do their tasks by exploration. CW has been applied to a wide variety of systems, including automatic teller machines (ATMs), telephone message and call forwarding systems, websites, computerized patient-record systems for physicians, programming languages, multimedia authoring tools, and computer-supported cooperative work systems. HCI researcher Andrew J. Ko and his associates innovatively applied CW (in lieu of pilot experiments) to predict problems that experimental participants might have with the instructions, procedures, materials, and interfaces used in experiments for testing the usability of a system (the system was a visual programming language).
Cognitive Walkthrough Methodology The CW approach was invented in 1990 and has evolved into a cluster of similar methods with the following four defining features: 1. The evaluation centers on particular users and their key tasks. Evaluators start a CW by carefully analyzing the distinctive characteristics of a particular user group, especially the relevant kinds of background knowledge these users can call upon when learning to perform tasks on the system. Next, CW evaluators select a set of key tasks that members of the user group will do on the system. Key tasks are tasks users do frequently, tasks that are critical even if done infrequently, and tasks that exhibit the core capabilities of the system. 2. The steps designers prescribe for doing tasks are evaluated. For each key task, CW evaluators record the full sequence of actions necessary to do the task on the current version of the system. Then CW evaluators walk through the steps, simulating users’ action selections and mental processes while doing the task. The simplest CW version asks two questions at each
step: (1) Is it likely that these particular users will take the “right action”—meaning the action designers expect them to take—at this step? and (2) If these particular users do the “right action” and get the feedback the system provides (if any), will they know they made a good choice and realize that their action brought them closer to accomplishing their goal? To answer each question evaluators tell a believable success story or failure story. They record failure stories and have the option of adding suggestions for how to repair the problems and turn failures into successes. Anchoring the evaluation to the steps specified by designers communicates feedback to designers in their own terms, facilitating design modifications that repair the usability problems. 3. Evaluators use theory-based, empirically verified predictions. The foundation for CW is a theory of learning by exploration that is supported by extensive research done from the 1960s to the 1980s on how people attempt to solve novel problems when they lack expert knowledge or specific training. According to this theory, learning to do tasks on a computer-based system requires people to solve novel problems by using general problem-solving methods, general reading knowledge, and accumulated experience with computers. “The key idea is that correct actions are chosen based on their perceived similarity to the user’s current goal” (Wharton et al. 1994, 126). For software applications, the theory predicts that a user scans available menu item labels on the computer screen and picks the menu item label that is most similar in meaning to the user’s current goal. CW evaluators answer the first question with a success story if the “right action” designated by the designer is highly similar in meaning to the user’s goal and if the menu item labels on the screen use words familiar to the user. 4. Software engineers can easily learn how to make CW evaluations. It is crucial to involve software engineers and designers in CW, because they are the individuals responsible for revising the design to repair the problems. There is strong evidence that software engineers and
designers can readily learn CW, but they have a shallower grasp of the underlying theory than usability experts trained in cognitive psychology and consequently find less than half as many usability problems. A group CW, including at least one usability expert trained in cognitive psychology, can find a higher percentage of usability problems than an individual evaluator—up to 50 percent of the problems that appear in usability tests of the system. CW was one of several evaluation methods pioneered in the early 1990s to meet a practical need, the need to identify and repair usability problems early and repeatedly during the product development cycle. The cost of repairing usability problems rises steeply as software engineers invest more time in building the actual system, so it is important to catch and fix problems as early as possible. For a product nearing completion the best evaluation method is usability testing with end users (the people who will actually use the system), but CW is appropriate whenever it is not possible to do usability testing. Early versions of CW were tedious to perform, but the 1992 cognitive jogthrough and the streamlined CW of 2000, which still preserve all the essential CW features, are much quicker to perform.
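Although a walkthrough is a human inspection method rather than a program, the per-step record of the two questions described above can be kept in a simple structured form. The sketch below is purely illustrative; the task, actions, and stories are invented rather than drawn from any published walkthrough.

    # Illustrative record of one walkthrough task; every label and story here is hypothetical.
    task = "Forward the office phone to a mobile number"
    steps = [
        {"action": "Press the key labeled 'Menu'",
         "right_action_likely": True,     # Question 1: will these users take the right action?
         "feedback_understood": True,     # Question 2: will they know it was a good choice?
         "story": "Success: 'Menu' is the only labeled key and matches the user's goal."},
        {"action": "Select 'Call options'",
         "right_action_likely": False,
         "feedback_understood": True,
         "story": "Failure: users look for the word 'forward', which never appears; "
                  "suggest relabeling the item 'Forwarding and call options'."},
    ]

    # Designers mainly need the failure stories, so report only those.
    print("Task:", task)
    for number, step in enumerate(steps, start=1):
        if not (step["right_action_likely"] and step["feedback_understood"]):
            print(f"  Step {number} ({step['action']}): {step['story']}")

Keeping the record in this form makes it easy to hand the failure stories, with their suggested repairs, back to the design team in their own terms.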
Transforming CW to Predict User Actions Faster and More Accurately The cognitive walkthrough for the Web (CWW) has transformed the CW approach by relying on Latent Semantic Analysis (LSA)—instead of on the subjective judgments of usability experts and software engineers—to predict whether users are likely to select the “right action.” LSA is a computer software system that objectively measures semantic similarity—similarity in meaning—between any two passages of text. LSA also assesses how familiar words and phrases are for particular user groups. While analyzing the distinctive characteristics of the particular user group, CWW evaluators choose the LSA semantic space that best represents the background knowledge of the particular user group—the space built from documents that these users
are likely to have read. For example, CWW currently offers a college-level space for French and five spaces that accurately represent general reading knowledge for English at college level and at third-, sixth-, ninth-, and twelfth-grade levels. CWW uses LSA to measure the semantic similarity between a user’s information search goal (described in 100 to 200 words) and the text labels for each and every subregion of the web page and for each and every link appearing on a web page. CWW then ranks all the subregions and link labels in order of decreasing similarity to the user’s goal. CWW predicts success if the “right action” is the highest-ranking link, if that link is nested within the highest-ranking subregion, and if the “right action” link label and subregion avoid using words that are liable to be unfamiliar to members of the user group. Relying on LSA produces the same objective answer every time, and laboratory experiments confirm that actual users almost always encounter serious problems whenever CWW predicts that users will have problems doing a particular task. Furthermore, using CWW to repair the problems produces twoto-one gains in user performance. So far, CWW researchers have tested predictions and repairs only for users with college-level reading knowledge of English, but they expect to prove that CWW gives comparably accurate predictions for other user groups and semantic spaces.
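A rough sense of the ranking step can be conveyed in code. Real LSA derives word and passage vectors from a singular-value decomposition of a large term-document matrix built from the chosen semantic space; the sketch below substitutes crude word-count vectors purely to show how link labels might be ordered by cosine similarity to a goal statement. The goal text and labels are invented.

    import math
    from collections import Counter

    def vector(text):
        """Crude bag-of-words vector; real LSA maps text into a reduced semantic space."""
        return Counter(text.lower().split())

    def cosine(a, b):
        dot = sum(a[word] * b[word] for word in set(a) & set(b))
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    goal = ("find the schedule of upcoming junior soccer league games "
            "and learn how to register a new team for the season")

    link_labels = ["Team registration", "League schedule and game locations",
                   "Board meeting minutes", "Fundraising and sponsors"]

    goal_vector = vector(goal)
    ranked = sorted(link_labels, key=lambda label: cosine(goal_vector, vector(label)), reverse=True)
    for label in ranked:
        print(round(cosine(goal_vector, vector(label)), 3), label)

CWW would predict success for a task like this only if the designer's intended link were the top-ranked label and contained no words unfamiliar to the user group.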
APPLICATION A software program that performs a major computing function (such as word processing or Web browsing).
Research by cognitive psychologist Rodolfo Soto suggests that CW evaluations of software applications would be improved by relying on LSA, but to date CW has consistently relied on subjective judgments of human evaluators. Consequently the agreement between any two CW evaluators is typically low, raising concerns about the accuracy of CW predictions. Many studies have tried to assess the accuracy and cost-effectiveness of CW compared to usability testing and other evaluation methods. The results are inconclusive, because there is controversy
about the experimental design and statistics of these studies. Relying on LSA opens the door to fully automating CWW and increasing its cost-effectiveness. If other CW methods start to rely on LSA they, too, could be automated. The streamlined CW is more efficient than earlier CW methods, but it still consumes the time of multiple analysts and relies on subjective judgments of uncertain accuracy.
Objectively Predicting Actions for Diverse Users Relying on LSA makes it possible for CWW to do something that even usability experts trained in cognitive psychology can almost never do: objectively predict action selections for user groups whose background knowledge is very different from the background knowledge of the human evaluators. For example, selecting the sixth-grade semantic space enables LSA to “think” like a sixth grader, because the sixth-grade LSA semantic space contains only documents likely to have been read by people who have a sixth-grade education. In contrast, a college-educated analyst cannot forget the words, skills, and technical terms learned since sixth grade and cannot, therefore, think like a sixth grader. Since CW was invented in 1990, the number and diversity of people using computers and the Internet have multiplied rapidly. Relying on LSA will enable the CW approach to keep pace with these changes. In cases where none of the existing LSA semantic spaces offers a close match with the background knowledge of the target user group, new semantic spaces can be constructed for CWW (and potentially for CW) analyses—in any language at any level of ability in that language. Specialized semantic spaces can also be created for bilingual and ethnic minority user groups and user groups with advanced background knowledge in a specific domain, such as the domain of medicine for evaluating systems used by health professionals. Marilyn Hughes Blackmon See also Errors in Interactive Behavior; User Modeling
FURTHER READING
Blackmon, M. H., Kitajima, M., & Polson, P. G. (2003). Repairing usability problems identified by the cognitive walkthrough for the web. In CHI 2003: Proceedings of the Conference on Human Factors in Computing Systems, 497–504.
Blackmon, M. H., Polson, P. G., Kitajima, M., & Lewis, C. (2002). Cognitive walkthrough for the Web. In CHI 2002: Proceedings of the Conference on Human Factors in Computing Systems, 463–470.
Desurvire, H. W. (1994). Faster, cheaper!! Are usability inspection methods as effective as empirical testing? In J. Nielsen & R. L. Mack (Eds.), Usability inspection methods (pp. 173–202). New York: Wiley.
Gray, W. D., & Salzman, M. D. (1998). Damaged merchandise? A review of experiments that compare usability evaluation methods. Human-Computer Interaction, 13(3), 203–261.
Hertzum, M., & Jacobsen, N. E. (2003). The evaluator effect: A chilling fact about usability evaluation methods. International Journal of Human-Computer Interaction, 15(1), 183–204.
John, B. E., & Marks, S. J. (1997). Tracking the effectiveness of usability evaluation methods. Behaviour & Information Technology, 16(4/5), 188–202.
John, B. E., & Mashyna, M. M. (1997). Evaluating a multimedia authoring tool. Journal of the American Society for Information Science, 48(11), 1004–1022.
Ko, A. J., Burnett, M. M., Green, T. R. G., Rothermel, K. J., & Cook, C. R. (2002). Improving the design of visual programming language experiments using cognitive walkthroughs. Journal of Visual Languages and Computing, 13, 517–544.
Kushniruk, A. W., Kaufman, D. R., Patel, V. L., Lévesque, Y., & Lottin, P. (1996). Assessment of a computerized patient record system: A cognitive approach to evaluating medical technology. MD Computing, 13(5), 406–415.
Lewis, C., Polson, P., Wharton, C., & Rieman, J. (1990). Testing a walkthrough methodology for theory-based design of walk-up-and-use interfaces. In CHI ’90: Proceedings of the Conference on Human Factors in Computing Systems, 235–242.
Lewis, C., & Wharton, C. (1997). Cognitive walkthroughs. In M. Helander, T. K. Landauer, & P. Prabhu (Eds.), Handbook of human-computer interaction (2nd ed., revised, pp. 717–732). Amsterdam: Elsevier.
Pinelle, D., & Gutwin, C. (2002). Groupware walkthrough: Adding context to groupware usability evaluation. In CHI 2002: Proceedings of the Conference on Human Factors in Computing Systems, 455–462.
Polson, P., Lewis, C., Rieman, J., & Wharton, C. (1992). Cognitive walkthroughs: A method for theory-based evaluation of user interfaces. International Journal of Man-Machine Studies, 36, 741–773.
Rowley, D. E., & Rhoades, D. G. (1992). The cognitive jogthrough: A fast-paced user interface evaluation procedure. In CHI ’92: Proceedings of the Conference on Human Factors in Computing Systems, 389–395.
Sears, A., & Hess, D. J. (1999). Cognitive walkthroughs: Understanding the effect of task description detail on evaluator performance. International Journal of Human-Computer Interaction, 11(3), 185–200.
Soto, R. (1999). Learning and performing by exploration: Label quality measured by Latent Semantic Analysis. In CHI ’99: Proceedings of the Conference on Human Factors in Computing Systems, 418–425.
Spencer, R. (2000). The streamlined cognitive walkthrough method, working around social constraints encountered in a software development company. In CHI 2000: Proceedings of the Conference on Human Factors in Computing Systems, 353–359.
Wharton, C., Rieman, J., Lewis, C., & Polson, P. (1994). The cognitive walkthrough method: A practitioner’s guide. In J. Nielsen & R. L. Mack (Eds.), Usability inspection methods (pp. 105–140). New York: Wiley.
COLLABORATIVE INTERFACE See Multiuser Interfaces
COLLABORATORIES A collaboratory is a geographically dispersed organization that brings together scientists, instrumentation, and data to facilitate scientific research. In particular, it supports rich and recurring human interaction oriented to a common research area and provides access to the data sources, artifacts, and tools required to accomplish research tasks. Collaboratories have been made possible by new communication and computational tools that enable more flexible and ambitious collaborations. Such collaborations are increasingly necessary. As science progresses, the unsolved problems become more complex, the need for expensive instrumentation increases, larger data sets are required, and a wider range of expertise is needed. For instance, in high-energy physics, the next generation of accelerators will require vast international collaborations and will have a collaboratory model for remote access. At least 150 collaboratories representing almost all areas of science have appeared since the mid-1980s. Collaboratories offer their participants a number of different capabilities that fall into five broad categories: communication (including tools such as audio or video conferencing, chat, or instant messaging), coordination (including tools relating to access rights, group calendaring, and project management), information access (including tools for
accessing online databases, digital libraries, and document repositories), computational access (including access to supercomputers), and facility access (including tools for remotely accessing specialized facilities or instruments, such as a particle accelerator or a high-powered microscope). Research on collaboratories has focused mostly on solving technical problems. However, substantial gains in the practice of science are likely to be the combined effect of social and technical transformations. The gap between the raw performance capability of collaboratory tools (based on bandwidth, storage capacity, processor speed, and so forth) and the realized performance (usage for scientific purposes, which is limited by factors such as usability and fit to the work and culture) can limit the potential of collaboratories. This point will be discussed in greater detail later.
Types of Collaboratories: Research-Focused Collaboratories There are a number of different kinds of collaboratories. A collaboratory that satisfies all elements of the definition given above is a prototypical collaboratory— a distributed research center. Other kinds of collaboratories are missing one or more of the elements of that definition. The following four types of collaboratories focus on enabling geographically distributed research. Distributed Research Center This type of collaboratory functions like a full-fledged research center or laboratory, but its users are geographically dispersed—that is, they are not located at the research center. It has a specific area of interest and a general mission, with a number of specific projects. A good example of a distributed research center is the Alliance for Cellular Signaling, a large, complex distributed organization of universities whose goal is to understand how cells communicate with one another to make an organism work. Shared Instrument A shared-instrument collaboratory provides access to specialized or geographically remote facilities.
As the frontiers of science are pushed back, the instrumentation required for advances becomes more and more esoteric, and therefore usually more and more expensive. Alternatively, certain scientific investigations require instrumentation in specific geographic settings, such as an isolated or inhospitable area. A typical example is the Keck Observatory, which provides access to an astronomical observatory on the summit of Mauna Kea in Hawaii to a consortium of California universities. Community Data System An especially common collaboratory type is one in which a geographically dispersed community agrees to share their data through a federated or centralized repository. The goal is to create a more powerful data set on which more sophisticated or powerful analyses can be done than would be possible if the parts of the data set were kept separately. A typical example of a community data system is the Zebrafish Information Network (ZFIN), an online aggregation of genetic, anatomical, and methodological information for zebra fish researchers. Open-Community Contribution System Open-community contribution systems are an emerging organizational type known as a voluntary association. Interested members of a community (usually defined quite broadly) are able to make small contributions (the business scholar Lee Sproull calls them microcontributions) to some larger enterprise. These contributions are judged by a central approval organization and placed into a growing repository. The classic example is open-source software development, which involves hundreds or even thousands of contributors offering bug fixes or feature extensions to a software system. In science, such schemes are used to gather data from a large number of contributors. Two examples will help illustrate this. The NASA Ames Clickworkers project invited members of the public to help with the identification of craters on images from a Viking mission to Mars. They received 1.9 million crater markings from over 85,000 contributors, and the averaged results of these community contributions were equivalent in quality to those of expert geologists. A second example is MIT’s Open Mind Common Sense Initiative, which is collecting
examples of commonsense knowledge from members of the public “to help make computers smarter” (Singh n.d.).
Types of Collaboratories: Practice-Focused Collaboratories The next two collaboratory types support the professional practice of science more broadly, as opposed to supporting the conduct of research itself. Virtual Community of Practice This is a network of individuals who share a research area of interest and seek to share news of professional interest, advice, job opportunities, practical tips on methods, and the like. A good example of this kind of collaboratory is Ocean US, which supports a broad community of researchers interested in ocean observations. A listserv is another mechanism that is used to support a virtual community of practice, but much more common these days are websites and wikis. Virtual Learning Community This type of collaboratory focuses on learning that is relevant to research, but not research itself. A good example is the Ecological Circuitry Collaboratory, whose goal is to train doctoral students in ecology in quantitative-modeling methods.
Evolution and Success of Collaboratories Collaboratories that last more than a year or two tend to evolve. For example, a collaboratory may start as a shared-instrument collaboratory. Those who share the instrument may add a shared database component to it, moving the collaboratory toward a community data system. Then users may add communication and collaboration tools so they can plan experiments or data analyses, making the collaboratory more like a distributed research center. Some collaboratories are quite successful, while others do not seem to work very well. There are a number of factors that influence whether or not a
collaboratory is successful. What follow are some of the most important factors.

Readiness for Collaboration Participants must be ready and willing to collaborate. Science is by its very nature a delicate balance of cooperation and competition. Successful collaborations require cooperation, but collaboration is very difficult and requires extra effort and motivation. Technologies that support collaboration will not be used if the participants are not ready or willing to collaborate. Various fields or user communities have quite different traditions of sharing. For instance, upper-atmospheric physicists have had a long tradition of collaboration; the Upper Atmospheric Research Collaboratory (UARC) began with a collaborative set of users. On the other hand, several efforts to build collaboratories for biomedical research communities (for instance, for researchers studying HIV/AIDS or depression) have had difficulty in part because of the competitive atmosphere. Readiness for collaboration can be an especially important factor when the collaboratory initiative comes from an external source, such as a funding agency.

Technical Readiness The participants, the supporting infrastructure, and the design of the tools must be at a threshold technical level. Some communities are sufficiently collaborative to be good candidates for a successful collaboratory, but their experience with collaborative technologies or the supporting infrastructure is not sufficient. Technical readiness can be of three kinds.

INDIVIDUAL TECHNICAL READINESS People in various organizations or fields have different levels of experience with collaboration tools. A specific new technology such as application sharing may be a leap for some and an easy step for others. It is important to take account of users' specific experience when introducing new tools.

INFRASTRUCTURE READINESS Collaborative technologies require good infrastructure, both technical and social. Poor networks, incompatible workstations, or a lack of control over different versions of software can cause major problems. It is also very important to have good technical support personnel, especially in the early phases of a collaboratory. The Worm Community System (WCS) was a very early collaboratory project, intended to support a community of researchers who studied the organism C. elegans (a type of nematode). Sophisticated software was developed for the WCS on a UNIX platform that was not commonly used in the laboratories of the scientists. Since the tools were thus not integrated with everyday practice, they were seldom used. Furthermore, the necessary technical support was not generally present in the lab, so when there were problems, they were showstoppers.

SOCIAL ERGONOMICS OF TOOLS The social interactions that take place in teams are affected both by the characteristics of team members and by the tools that are used. The study of the impact of technology characteristics on this process may be called social ergonomics (ergonomics is the application of knowledge about humans to the design of things). For example, video conferencing systems often ignore such details as screen size, display arrangement in relation to participants, camera angle, and sound volume. But it turns out that these details can have social effects. For example, a study conducted by the researchers Wei Huang, Judith Olson, and Gary Olson found that the apparent height of videoconference participants, as conveyed via camera angle, influenced a negotiation task. The apparently taller person was more influential in shaping the final outcome than the apparently shorter person.
Aligned Incentives Aligning individual and organizational incentives is an important element of successful collaborations. Consider the incentives to participate in a community data system: What motivates a researcher to contribute data to a shared database? By contributing, the researcher gives up exclusive access to the data he or she has collected. There are a variety of incentive schemes for encouraging researchers to collaborate.

GOODWILL ZFIN has relied on the goodwill of its members. Most of the members of this community had a connection to one specific senior researcher who both pioneered the use of zebra fish as a model organism and also created for the community a spirit of generosity and collaboration. Although goodwill among the community of researchers has been a sufficient incentive for participation, ZFIN is now expanding its participation beyond its founders, and it will be interesting to see how successful the goodwill incentive is in the context of the expanded community.

GOODWILL PLUS KARMA POINTS Slashdot is a very large and active community of open-source software developers who share and discuss news. Slashdot rewards those who make the most informative contributions by bringing them more into the center of attention and allocating them karma points. Karma points are allocated in accordance with how highly a contributor's postings are rated by others. These karma points give contributors some additional privileges on the site, but their main value is as a tangible measure of community participation and status. Karma points are a formalization of goodwill, valuable primarily because the members of the community value them as an indicator of the quality of the sharing done by specific individuals.
REQUIRING CONTRIBUTION AS A PREREQUISITE FOR OTHER ACTIVITY In order to get the details of gene sequences out of published articles in journals, a consortium of high-prestige journals in biology requires that those who submit articles to the consortium's journals have a GenBank accession number indicating that they have stored their gene sequences in the shared database.

NEW FORMS OF PUBLICATION The Alliance for Cellular Signaling has taken a novel approach to providing researchers with an incentive to contribute molecule pages to the Alliance's database. Because the molecule pages represent a lot of work, the Alliance has worked out an agreement with Nature, one of the high-prestige journals in the field, to count a molecule page as a publication in Nature. Nature coordinates the peer reviews, and although molecule-page reviews do not appear in print, the molecule pages are published online and carry the prestige of the Nature Publishing Group. The Alliance's editorial director has written letters in support of promotion and tenure cases indicating that molecule page contributions are of journal-publication quality. This agreement is a creative attempt to ensure that quality contributions will be made to the database; it also represents an interesting evolution of the scholarly journal to include new forms of scholarly publication.

Data Issues Data are a central component of all collaborations. There are numerous issues concerning how data are represented and managed; how these issues are resolved affects collaboratory success. For example, good metadata—data about data—are critical as databases increase in size and complexity. Library catalogs and indexes to file systems are examples of metadata. Metadata are key to navigation and search through databases. Information about the provenance or origins of the data is also important. Data have often been highly processed, and researchers will want to know what was done to the original raw data to arrive at the processed data currently in the database. Two related collaboratories in high-energy physics, GriPhyN and iVDGL, are developing schemes for showing investigators the paths of the transformations that led to the data in the database. This will help researchers understand the data and will also help in identifying and correcting any errors in the transformations. For some kinds of collaboratories, the complex jurisdictional issues that arise when data are combined into a large database pose an interesting new issue. The BIRN project is facing just such an issue as it works to build up a database of brain images. The original brain images were collected at different universities or hospitals under different institutional review boards, entities that must approve any human data collection and preservation, and so the stipulations under which the original images were collected may not be the same in every case.
Other Issues Many collaboratory projects involve cooperation between domain scientists, who are the users of the collaboratory, and computer scientists, who are responsible for the development of the tools. In many
projects there are tensions between users, who want reliable tools that do what they need done, and computer scientists, who are interested in technical innovations and creative software ideas. There is little incentive for the computer scientists to go beyond the initial demonstration versions of tools to the reliable and supported long-term operational infrastructure desired by the users. In some fields, such as high-energy physics, this tension has been at least partially resolved. The field has used advanced software for so long that it is understood that the extra costs associated with having production versions of tools must be included in a project. Other fields are only just discovering this. The organization of the George E. Brown, Jr., Network for Earthquake Engineering Simulation (NEES) project represents an innovation in this regard. The National Science Foundation, which funds the project, established it in two phases, an initial four-year system-integration phase in which the tools are developed and tested, and a ten-year operational phase overseen by a NEES consortium of user organizations. Any large organization faces difficult management issues, and practicing scientists may not always have the time or the skills to properly manage a complex enterprise. Management issues get even more complicated when the organization is geographically distributed. Many large collaboratories have faced difficult management issues. For instance, the two physics collaboratories mentioned earlier, GriPhyN and iVDGL, found that it was necessary to hire a full-time project manager for each collaboratory in order to help the science project directors manage the day-to-day activities of the projects. The Alliance for Cellular Signaling has benefited from a charismatic leader with excellent management skills who has set up a rich management structure to oversee the project. The BIRN collaboratory has an explicit governance manual that contains guidelines for a host of tricky management issues; it also has a steering committee that is responsible for implementing these management guidelines.
Collaboratories in the Future Geographically distributed research projects are becoming commonplace in all the sciences. This
proliferation is largely driven by what is required to work at the frontiers of science. In the future, widely shared knowledge about how to put together successful collaboratories will be essential. Of course, scientists are not alone in attempting geographically distributed collaborations. Similar issues are faced in industry, education, government, and the nonprofit sector. Good tools for collaboration and the social and organizational knowledge to make effective use of them will be critical in all domains. Gary M. Olson
See also Computer-Supported Cooperative Work; Groupware

FURTHER READING

Aldhous, P. (1993). Managing the genome data deluge. Science, 262, 502–503.
Birnholtz, J., & Bietz, M. (2003). Data at work: Supporting sharing in science and engineering. In Proceedings of Group 2003. New York: ACM Press.
Cinkosky, M. J., Fickett, J. W., Gilna, P., & Burks, C. (1991). Electronic data publishing and GenBank. Science, 252, 1273–1277.
Finholt, T. A. (2002). Collaboratories. In B. Cronin (Ed.), Annual Review of Information Science and Technology, 36, 74–107. Washington, DC: American Society for Information Science and Technology.
Finholt, T. A., & Olson, G. M. (1997). From laboratories to collaboratories: A new organizational form for scientific collaboration. Psychological Science, 8(1), 28–36.
Huang, W., Olson, J. S., & Olson, G. M. (2002). Camera angle affects dominance in video-mediated communication. In Proceedings of CHI 2002, short papers (pp. 716–717). New York: ACM Press.
National Science Foundation. (2003). Revolutionizing science and engineering through cyberinfrastructure: Report of the National Science Foundation blue-ribbon panel on cyberinfrastructure. Retrieved December 24, 2003, from http://www.communitytechnology.org/nsf_ci_report/
Olson, G. M., Finholt, T. A., & Teasley, S. D. (2000). Behavioral aspects of collaboratories. In S. H. Koslow & M. F. Huerta (Eds.), Electronic collaboration in science (pp. 1–14). Mahwah, NJ: Lawrence Erlbaum Associates.
Olson, G. M., & Olson, J. S. (2000). Distance matters. Human-Computer Interaction, 15(2–3), 139–179.
Raymond, E. S. (1999). The cathedral and the bazaar: Musings on Linux and open source by an accidental revolutionary. Sebastopol, CA: O'Reilly.
Schatz, B. (1991). Building an electronic community system. Journal of Management Information Systems, 8(3), 87–107.
Singh, P. (n.d.). Open mind common sense. Retrieved December 22, 2003, from http://commonsense.media.mit.edu/cgi-bin/search.cgi
Sproull, L., Conley, C., & Moon, J. Y. (in press). Pro-social behavior on the net. In Y. Amichai-Hamburger (Ed.), The social net: The social psychology of the Internet. New York: Oxford University Press.
Sproull, L., & Kiesler, S. (in press). Public volunteer work on the Internet. In B. Kahin & W. Dutton (Eds.), Transforming enterprise. Cambridge, MA: MIT Press.
Star, S. L., & Ruhleder, K. (1994). Steps towards an ecology of infrastructure: Complex problems in design and access for large-scale collaborative systems. In Proceedings of CSCW 94 (pp. 253–264). New York: ACM Press.
Teasley, S., & Wolinsky, S. (2001). Scientific collaborations at a distance. Science, 292, 2254–2255.
Torvalds, L., & Diamond, D. (2001). Just for fun: The story of an accidental revolutionary. New York: Harper Business.
Wulf, W. A. (1993). The collaboratory opportunity. Science, 261, 854–855.

COMPILERS
Compilers are computer programs that translate one programming language into another. The original program is usually written in a high-level language by a programmer and then translated into a machine language by a compiler. Compilers help programmers develop user-friendly systems by allowing them to program in high-level languages, which are more similar to human language than machine languages are.
Background The first compilers had to be written in machine languages because the compilers themselves needed to run on the computers in order to perform the translation. However, most compilers for new computers are now developed in high-level languages, which conform to a highly constrained syntax to ensure that there is no ambiguity. Compilers are responsible for many aspects of information system performance, especially run-time performance, and they make it possible for programmers to use the full power of a programming language. Although compilers hide the complexity of the hardware from ordinary programmers, compiler development requires programmers to solve many practical algorithmic and engineering problems. Computer hardware architects constantly create new challenges
for compiler developers by building more complex machines. Compilers translate programming languages, and the following are the tasks performed by each specific translator type:

■ Assemblers translate low-level language instructions into machine code and map low-level language statements to one or more machine-level instructions.
■ Compilers translate high-level language instructions into machine code. High-level language statements are translated into more than one machine-level instruction (see the brief illustration below).
■ Preprocessors usually perform text substitutions before the actual translation occurs.
■ High-level translators convert programs written in one high-level language into another high-level language. The purpose of this translation is to avoid having to develop machine-language-based compilers for every high-level language.
■ Decompilers and disassemblers translate the object code in a low-level language into the source code in a high-level language. The goal of this translation is to regenerate the source code.

In the 1950s compilers were often synonymous with assemblers, which translated low-level language instructions into directly executable machine code. The evolution from an assembly language to a high-level language was a gradual one, and the FORTRAN compiler developers who produced the first successful high-level language did not invent the notion of programming in a high-level language and then compiling the source code to the object code. The first FORTRAN compiler was designed and written between 1954 and 1957 by an IBM team led by John W. Backus, and it took about eighteen person-years of effort to develop. The main goal of the team led by Backus was to produce object code that could execute as efficiently as code written by human machine coders.
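To make the contrast concrete, the short C fragment below is an invented example; the intermediate instructions in the comment are purely illustrative and are not the output of any particular compiler. It shows how a single high-level statement typically expands into several lower-level instructions, whereas each assembly-language statement corresponds to roughly one machine instruction.

    /* Illustrative only: one high-level statement usually expands into
     * several lower-level instructions. For the statement below, a compiler
     * might emit intermediate code roughly like:
     *     t1 = b * c
     *     t2 = a + t1
     *     x  = t2
     * The exact instructions depend on the compiler and the target machine.
     */
    int example(int a, int b, int c) {
        int x = a + b * c;
        return x;
    }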
Translation Steps Programming language translators, including compilers, go through several steps to accomplish their task, and use two major processes—an analytic process and a synthetic process. The analytic process
takes the source code as input and then examines the source program to check its conformity to the syntactic and semantic constraints of the language in which the program was written. During the synthetic process, the object code in the target language is generated. Each major process is further divided. The analytic process, for example, consists of a character handler, a lexical analyzer, a syntax analyzer, and a constraint analyzer. The character handler identifies characters in the source text, and the lexical analyzer groups the recognized characters into tokens such as operators, keywords, strings, and numeric constants. The syntax analyzer combines the tokens into syntactic structures, and the constraint analyzer checks to be sure that the identified syntactic structures meet scope and type rules. The synthetic process consists of an intermediate code generator, a code optimizer, and a code generator. An intermediate code generator produces code that is less specific than the machine code, which will be further processed by another language translator. A code optimizer improves the intermediate code with respect to the speed of execution and the computer memory requirement. A code generator takes the output from the code optimizer and then generates the machine code that will actually be executed on the target computer hardware.
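As a rough sketch of the first two analytic stages (character handling and lexical analysis), the following minimal C program, invented for illustration, scans a hypothetical source line and groups its characters into tokens. A real lexical analyzer would recognize many more token classes and would pass its output to the syntax analyzer rather than printing it.

    #include <ctype.h>
    #include <stdio.h>

    /* A minimal lexical analyzer: groups characters into tokens. */
    int main(void) {
        const char *src = "count = count + 42;";   /* hypothetical source line */
        const char *p = src;

        while (*p != '\0') {
            if (isspace((unsigned char)*p)) {              /* character handler skips blanks */
                p++;
            } else if (isdigit((unsigned char)*p)) {       /* numeric constant */
                char buf[64]; int n = 0;
                while (isdigit((unsigned char)*p) && n < 63) buf[n++] = *p++;
                buf[n] = '\0';
                printf("NUMBER(%s)\n", buf);
            } else if (isalpha((unsigned char)*p) || *p == '_') {   /* identifier or keyword */
                char buf[64]; int n = 0;
                while ((isalnum((unsigned char)*p) || *p == '_') && n < 63) buf[n++] = *p++;
                buf[n] = '\0';
                printf("IDENT(%s)\n", buf);
            } else {                                       /* single-character operator or punctuation */
                printf("OPERATOR(%c)\n", *p++);
            }
        }
        return 0;
    }

Run on the sample line, this sketch prints IDENT(count), OPERATOR(=), IDENT(count), OPERATOR(+), NUMBER(42), and OPERATOR(;), which is the kind of token stream the syntax analyzer would then combine into syntactic structures.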
Interpreters and Interpretive Compilers In general, compilers produce object code that executes at full speed, and compilers are usually designed to compile the entire source code before executing the resulting object code. However, it is common for programmers to expect to execute one or more parts of a program before completing the program. In addition, many programmers want to write programs using a trial-and-error or what-if strategy. These cases call for the use of an interpreter in lieu of a traditional compiler because an interpreter, which executes one instruction at a time, can take the source program as input and then execute the instructions without generating any object code. Interpretive compilers generate simple intermediate code, which satisfies the constraints of the
practical interpreters. The intermediate code is then sent as input to an interpreter, which executes the algorithm embedded in the source code by utilizing a virtual machine. Within the virtual machine setting, the intermediate code plays the role of executable machine code.
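The following minimal C sketch, with an instruction set invented purely for illustration, shows the idea: the intermediate code is a sequence of simple stack-machine instructions, and a small interpreter (the virtual machine) executes them one at a time, so the intermediate code effectively plays the role of machine code.

    #include <stdio.h>

    /* Invented intermediate-code instruction set, for illustration only. */
    enum { PUSH, ADD, MUL, PRINT, HALT };

    struct instr { int op; int arg; };

    int main(void) {
        /* Intermediate code a front end might emit for: print (2 + 3) * 4 */
        struct instr code[] = {
            { PUSH, 2 }, { PUSH, 3 }, { ADD, 0 },
            { PUSH, 4 }, { MUL, 0 }, { PRINT, 0 }, { HALT, 0 }
        };

        int stack[256];
        int sp = 0;   /* stack pointer */

        for (int pc = 0; code[pc].op != HALT; pc++) {   /* fetch-execute loop */
            switch (code[pc].op) {
            case PUSH:  stack[sp++] = code[pc].arg; break;
            case ADD:   sp--; stack[sp - 1] += stack[sp]; break;
            case MUL:   sp--; stack[sp - 1] *= stack[sp]; break;
            case PRINT: printf("%d\n", stack[sp - 1]); break;
            }
        }
        return 0;
    }

An interpretive compiler would generate a code array like this from the source program; the loop above then stands in for the virtual machine that executes it.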
Famous Compiler: GNU Compiler Collection (GCC) Many high-level language compilers have been implemented using the C programming language and generating C code as output. Because almost all computers come with a C compiler, source code written in C is very close to being truly hardware-independent and portable. The GNU Compiler Collection (GCC) provides code generation for many programming languages such as C, C++, and Java, and supports more than two hundred different software and hardware platforms. The source code of GCC is free and open, based on the GNU General Public License, which allows people to distribute the compiler's source code as long as the original copyright is not violated and the changes are published under the same license. This license enables users to port GCC to their platform of choice. Presently almost all operating systems for personal computers are supported by GCC and ship the compiler as an integrated part of the platform. For example, Apple's Mac OS X is compiled using GCC 3.1. Other companies such as Sun and The Santa Cruz Operation also offer GCC as their standard system compiler. These examples show the flexibility and portability of GCC.
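As a simple illustration of GCC in everyday use, the small C program below can be compiled with standard GCC options; the file name and the commands shown in the comments are an example of typical usage rather than a prescribed procedure.

    /* hello.c: a minimal C program for illustrating GCC usage.
     *
     * Typical invocations (standard GCC options):
     *   gcc -S hello.c            stop after compilation and write the
     *                             generated assembly to hello.s
     *   gcc -O2 -o hello hello.c  compile with optimization and link an
     *                             executable named "hello"
     */
    #include <stdio.h>

    int main(void) {
        printf("hello, world\n");
        return 0;
    }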
Compiler Constructor: Lex and Yacc Roughly speaking, compilers work in two stages. The first stage is reading the source code to discover its structure. The second stage is generating the executable object code based on the identified structure. Lex, a lexical-analyzer generator, and Yacc, a compiler-compiler, are programs used to discover the structure of the source code. Lex splits the source code into tokens. Its input, called the Lex source, is a table of regular expressions and the program fragments associated with them; from this table Lex writes a program whose control flow is directed by instances of the regular expressions in the input stream. Regular expressions consist of normal characters, which include upper- and lower-case letters and digits, and metacharacters, which have special meanings. For example, a dot is a metacharacter that matches any one character other than the new-line character. The generated program reads the input stream and produces the output stream by partitioning the input into strings that match the given regular expressions. Yacc is a general tool for describing the input to a program. After the Yacc user specifies the structures to be recognized and the code to be invoked as each structure is found, Yacc transforms these specifications into subroutines that recognize the hierarchical structures and process the input.
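A minimal Lex source file, sketched below with invented token names and actions, shows the table of regular expressions and associated program pieces that the text describes. Running lex (or flex) on such a file produces a C scanner, conventionally in a file named lex.yy.c, which is then compiled with a C compiler; the yytext variable and the yylex and yywrap functions are part of the standard Lex interface.

    %{
    /* Definitions section: C code copied verbatim into the generated scanner. */
    #include <stdio.h>
    %}

    %%
    [0-9]+                  { printf("NUMBER(%s)\n", yytext); }
    [A-Za-z_][A-Za-z0-9_]*  { printf("IDENTIFIER(%s)\n", yytext); }
    "+"|"-"|"*"|"/"|"="     { printf("OPERATOR(%s)\n", yytext); }
    [ \t\n]+                { /* skip white space */ }
    .                       { printf("UNKNOWN(%s)\n", yytext); }
    %%

    /* User subroutines section. */
    int main(void)   { yylex(); return 0; }
    int yywrap(void) { return 1; }

Each line in the rules section pairs a regular expression with a program fragment; the generated scanner runs the fragment whenever its pattern matches the next portion of the input stream.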
The Future of Compilers Proebsting's Law states that “compiler advances double computing power every 18 years” (Proebsting, n.d., 1). This implies that compiler-optimization work makes a very minor contribution: While the processing power of computer hardware increases by about 60 percent per year, compiler optimization improves performance by only about 4 percent per year (a doubling every eighteen years corresponds to an annual gain of 2^(1/18), or roughly 4 percent). Furthermore, some people claim that compilers will become obsolete with the increased use of scripting languages, which rely on interpreters or interpretive compilers. Scripting languages, such as Python, are popular among new programmers and people who do not care about minute efficiency differences. However, there are arguments for the continued existence of compilers. One of the arguments is that there has to be machine code on which the interpreters rely in order for a programmer's intended algorithm to be executed. In addition, there will always be new and better hardware, which will then rely on new compilers. It will also be impossible to extinguish the continuing desire to achieve even minute performance improvements and compile-time error-detection capability. One of the proposed future directions for compilers is to aid in increasing the productivity of programmers by optimizing the high-level code. Another possible direction is to make compilers smarter by making them self-steering and self-tuning, which would allow them to adapt to input by incorporating artificial-intelligence techniques.

Woojin Paik

See also Programming Languages
FURTHER READING

Aho, A. V., Sethi, R., & Ullman, J. D. (1986). Compilers: Principles, techniques and tools. Reading, MA: Addison-Wesley.
Aho, A. V., & Ullman, J. D. (1977). Principles of compiler design. Reading, MA: Addison-Wesley.
Bauer, A. (2003). Compilation of functional programming languages using GCC—Tail calls. Retrieved January 20, 2004, from http://home.in.tum.de/~baueran/thesis/baueran_thesis.pdf
A brief history of FORTRAN. (1998). Retrieved January 20, 2004, from http://www.ibiblio.org/pub/languages/FORTRAN/ch1-1.html
Catalog of free compilers and interpreters. (1998). Retrieved January 20, 2004, from http://www.idiom.com/free-compilers/
Clodius, W. (1997). Re: History and evolution of compilers. Retrieved January 20, 2004, from http://compilers.iecc.com/comparch/article/97-10-008
Compiler Connection. (2003). Retrieved January 20, 2004, from http://www.compilerconnection.com/index.html
Compiler Internet Resource List. (n.d.). Retrieved January 20, 2004, from http://www.eg3.com/softd/compiler.htm
Cooper, K., & Torczon, L. (2003). Engineering a compiler. Burlington, MA: Morgan Kaufmann.
Cooper, K., Kennedy, K., & Torczon, L. (2003). COMP 412 overview of the course. Retrieved January 20, 2004, from http://www.owlnet.rice.edu/~comp412/Lectures/L01Intro.pdf
Crenshaw, J. (1997). Let's build a compiler. Retrieved January 20, 2004, from http://compilers.iecc.com/crenshaw/
GCC homepage. (2004, January 26). Retrieved January 26, 2004, from http://gcc.gnu.org/
Free Software Foundation. (1991). GNU General Public License. Retrieved January 20, 2004, from http://www.fsf.org/licenses/gpl.html
Joch, A. (2001, January 22). Compilers, interpreters and bytecode. Retrieved January 20, 2004, from http://www.computerworld.com/softwaretopics/software/story/0,10801,56615,00.html
Lamm, E. (2001, December 8). Lambda the Great. Retrieved January 20, 2004, from http://lambda.weblogs.com/2001/12/08
Mansour, S. (1999, June 5). A tao of regular expressions. Retrieved January 20, 2004, from http://sitescooper.org/tao_regexps.html
Manzoor, K. (2001). Compilers, interpreters and virtual machines. Retrieved January 20, 2004, from http://homepages.com.pk/kashman/jvm.htm
Pizka, M. (1997). Design and implementation of the GNU INSEL compiler gic (Technical Report TUM-I 9713). Munich, Germany: Munich University of Technology.
Proebsting, T. (n.d.). Todd Proebsting's home page. Retrieved January 20, 2004, from http://research.microsoft.com/~toddpro/
Rice compiler group. (n.d.). Retrieved January 20, 2004, from http://www.cs.rice.edu/CS/compilers/index.html
Terry, P. D. (1997). Compilers and compiler generators: An introduction with C++. London: International Thomson Computer Press.
The comp.compilers newsgroup. (2002). Retrieved January 20, 2004, from http://compilers.iecc.com/index.html
The Lex and Yacc page. (n.d.). Retrieved January 20, 2004, from http://dinosaur.compilertools.net/
Why compilers are doomed. (2002, April 14). Retrieved January 20, 2004, from http://www.equi4.com/jcw/wiki.cgi/56.html
COMPUTER-SUPPORTED COOPERATIVE WORK Computer-supported cooperative work (CSCW) is the subarea of human-computer interaction concerned with the communication, collaboration, and work practices of groups, organizations, and communities, and with information technology for groups, organizations, and communities. As the Internet and associated networked computing activities have become pervasive, research in CSCW has expanded rapidly, and its central concepts and vocabulary are still evolving. For the purposes of this discussion, we understand cooperative work as any activity that includes or is intended to include the coordinated participation of at least two individuals; we take computer support of such work to be any information technology used to coordinate or carry out the shared activity (including archiving of the records of an activity to allow subsequent reuse by another). Several themes dominate research and practice in CSCW: studies of work, in which activities and especially tool usage patterns are observed, analyzed, and interpreted through rich qualitative descriptions; design and use of computer-mediated communication (CMC) systems and of groupware, designed to aid with collaborative planning, acting, and sense making; and analyses of the adoption and adaptation of CSCW systems.
A Personal Story—Social Context in Computer-Supported Cooperative Work (CSCW) In the early 1980s, our research group at the IBM Watson Research Center focused on the early stages of learning word processing systems, like the IBM Displaywriter. We carried out an extensive set of studies over several years. In these investigations, we noticed that people tried to minimize the amount of rote learning they engaged in, preferring to adopt action-oriented approaches in their own learning. Eventually, we developed a description of the early stages of learning to use computer applications that helped to define new design approaches and learning support. But this work also made us wonder what more advanced learning might be like. To investigate this, my colleague John Gould and I visited an IBM customer site, to observe experienced users of Displaywriters as they worked in their everyday environments. These individuals were competent and confident in their use of the software. However we observed a pattern of distributed expertise: Each member of the staff had mastered one advanced function. Whenever someone needed to use an advanced function, she contacted the corresponding expert for personal, one-on-one coaching. This was a win-win situation: the requestors received customized help, and the specialized experts earned an increase in status. These field observations taught us the importance of people’s social context in the use and evaluation of information technology, something we now take for granted in CSCW. Mary Beth Rosson
Studies of Work A fundamental objective of CSCW is to understand how computers can be used to support everyday work practices. Early research in the 1980s focused on workflow systems. This approach codifies existing business procedures (for example, relating to the hiring of a new employee) in a computer model and embeds the model in a tracking system that monitors execution of the procedures, providing reminders, coordination across participants, and assurance that appropriate steps are followed. Computerized workflow systems are highly rational technological tools whose goal is to support the effective execution of normative procedures. Ironically, a major lesson that emerged from building and studying the use of these systems is that exceptions to normative business procedures are pervasive in real activity, and that handling such exceptions characteristically involves social interactions that need to be fluid and nuanced in order to succeed. Indeed, the failure of direct and rational workflow support was to a considerable extent the starting point for modern CSCW, which now emphasizes balance between structured performance
and learning support and flexibility in the roles and responsibilities available to human workers. Studies of work often employ ethnographic methods adapted from anthropology. In ethnographic research, the activities of a group are observed over an extended period of time. This allows collaborative activity to be seen in context. Thus, tasks are not characterized merely in terms of the steps comprising procedures, but also in terms of who interacts with whom to carry out and improvise procedures, what tools and other artifacts are used, what information is exchanged and created, and the longer-term collateral outcomes of activity, such as personal and collective learning and the development of group norms and mutual trust. This work has demonstrated how, for example, the minute interdependencies and personal histories of doctors, nurses, patients, administrators, and other caregivers in the functioning of a hospital must be analyzed to properly understand actions as seemingly simple as a doctor conveying a treatment protocol to a nurse on the next shift.
Sometimes the observer tries to be invisible in ethnographic research, but sometimes the investigator joins the group as a participant-observer. Typically video recordings of work activities are made, and various artifacts produced in the course of the work are copied or preserved to enable later analysis and interpretation. Ethnographic methods produce elaborate and often voluminous qualitative descriptions of complex work settings. These descriptions have become central to CSCW research and have greatly broadened the notion of context with respect to understanding human activity. Theoretical frameworks such as activity theory, distributed cognition, and situated action, which articulate the context of activity, have become the major paradigms for science and theory in CSCW. Much of what people do in their work is guided by tacit knowledge. A team of engineers may not realize how much they know about one another’s unique experience, skills, and aptitudes, or how well they recruit this knowledge in deciding who to call when problems arise or how to phrase a question or comment for best effect. But if an analyst observes them at work, queries them for their rationale during problem-solving efforts, and asks for reflections on why things happen, the tacit knowledge that is uncovered may point to important trade-offs in building computerized support for their work processes. For instance, directing a question to an expert colleague provides access to the right information at the right time, but also establishes and reinforces a social network. Replacing this social behavior with an automated expert database may answer the query more efficiently, but may cause employees to feel more disconnected from their organization. A persistent tension in CSCW studies of work springs from the scoping of activities to be supported. Many studies have shown how informal communication—dropping by a coworker’s office, encountering someone in the hall, sharing a coffee— can give rise to new insights and ideas and is essential in creating group cohesion and collegiality, social capital to help the organization face future challenges. But communication is also time consuming and often
ambiguous, entailing clarifications and confirmations. And of course informal interactions are also often unproductive. Balancing direct support for work activities with broader support for building and maintaining social networks is the current state of the classic workflow systems challenge.
Computer-Mediated Communication The central role of communication in the behavior of groups has led to intense interest in how technology can be used to enable or even enhance communication among individuals and groups. Much attention has been directed at communication among group members who are not colocated, but even for people who share an office, CMC channels such as e-mail and text chat have become pervasive. Indeed, e-mail is often characterized as the single most successful CSCW application, because it has been integrated so pervasively into everyday work activities. The medium used for CMC has significant consequences for the communicators. Media richness theory suggests that media supporting video or voice are most appropriate for tasks that have a subjective or evaluative component because the nonverbal cues provided by a communicator's visual appearance or voice tone provide information that helps participants better understand and evaluate the full impact of one another's messages. In contrast, text-based media like e-mail or chat are better for gathering and sharing objective information. Of course, even text-based channels can be used to express emotional content or subjective reactions to some extent; a large and growing vocabulary of character-based icons and acronyms is used to convey sadness, happiness, surprise, and so on. Use of CMC has also been analyzed from the perspective of the psychologist Herbert Clark's theory of common ground in language—the notion that language production, interpretation, and feedback rely extensively on communicators' prior knowledge about one another, the natural language they are using, the setting they are in, and their group and cultural affiliations. In CMC settings some of this information may be missing. Furthermore, many of the acknowledgement and feedback mechanisms that
humans take for granted in face-to-face conversation (for example, head nods and interjected uh-huhs and so on) become awkward or impossible to give and receive in CMC. The theory of common ground argues that these simple acknowledgement mechanisms are crucial for fluid conversation because they allow conversation partners to monitor and track successful communication: A head nod or an uh-huh tells the speaker that the listener understands what the speaker meant, is acknowledging that understanding, and is encouraging the speaker to continue. Despite the general acknowledgement that text-based CMC media such as e-mail and chat are relatively poor at conveying emotion and subjective content, these channels have advantages that make them excellent choices for some tasks. E-mail, for example, is usually composed and edited in advance of sending the message; it can be read and reviewed multiple times; and it is very easily distributed to large groups. E-mail is also easy to archive, and its text content can be processed in a variety of ways to create reusable information resources. Because e-mail is relatively emotion-free, it may be appropriate for delicate or uncomfortable communication tasks. With so many CMC options, people are now able to make deliberate (or tacit) choices among CMC channels, using a relatively informal and unobtrusive medium like text chat for low-cost interaction, more formally composed e-mail for business memos, and video or audio conferencing for important decision-making tasks. The relative anonymity of CMC (particularly with text-based channels) has provoked considerable research into the pros and cons of anonymous communication. Communicators may use their real names or screen names that only loosely convey their identity; in some situations virtual identities may be adopted explicitly to convey certain aspects of an invented personality or online persona. Anonymity makes it easier to express sensitive ideas and so can be very effective when brainstorming or discussion is called for but social structures would otherwise inhibit a high degree of sharing. However, the same factors that make anonymity an aid to brainstorming also can lead to rude or inappropriate exchanges and may make it difficult to establish common ground
and to build trusting relationships. Indeed, there have been a number of well-publicized episodes of cruel behavior in CMC environments such as chatrooms and MUDs (multiuser dungeons or domains). During the 1990s, cell phones, pagers, personal digital assistants, and other mobile devices rendered people and their work activities more mobile. As a consequence, the context of CMC became quite varied and unpredictable. A research area that has developed in response to users’ changing environments is context-aware computing, wherein the technology is used not only to support work activities, but also to gather information about the users’ situation. For example, it is relatively straightforward to set up distinct settings for how a cell phone will operate (e.g., ring tone or volume) at work, home, outdoors, and so on, but it takes time and attention to remember to activate and deactivate them as needed. Thus the goal is to build devices able to detect changes in people’s environment and to activate the appropriate communication options or tasks. Whether such mode changes take place automatically or are managed by the individual, the resulting context information can be important to collaborators, signaling if and when they can initiate or return to a shared activity.
Groupware CSCW software is often categorized by the timing of the collaboration it supports: Synchronous groupware supports interaction at the same point in time, while asynchronous groupware supports collaboration across time. Another distinction is the collaborators’ relative location, with some groupware designed for colocated interaction and some for distributed activities. For example, group decision support systems are typically used for synchronous and colocated interaction: As part of a face-to-face meeting, group members might use a shared online environment to propose, organize, and prioritize ideas. In contrast, an online forum might be used for asynchronous discussions among distributed group members. A longstanding goal for many groupware developers has been building support for virtual meetings— synchronous group interactions that take place
A Personal Story—Internet Singing Lessons Having immigrated to the United States from India at an early age, I have always had a problem mastering the fine melodic nuances required to sing traditional Hindi songs. This problem has limited my singing repertoire. Last winter, during a religious meeting of Indian immigrants living in the Ann Arbor area, I was struck with how well a young Indian man sang a haunting Hindu chant. Later that evening I asked him to help me improve how I sang Hindu chants, which he did willingly. However, he soon informed me that he was returning to India the following week as he was in the U.S. on a temporary work visa. I was disappointed in losing such a willing teacher, so my friend suggested a technological solution. He suggested that I set up an account with Yahoo! Messenger and buy a stereo headset through which we could continue our music interaction. Yahoo! Messenger is an instant messaging system that enables logged-in users to exchange text messages, and to talk free of charge on the Internet. When my friend returned to India, we had to deal with two problems. First, we had to deal with the time difference. India is 10½ hours ahead of the U.S. Second, we had to deal with the problem that my friend only had access to an Internet connection at the office where he worked. This is because computers and Internet connections are still quite expensive for the average Indian. We therefore decided that the best time for undisturbed instant voice messaging would be at 7:30 a.m. Indian Standard Time when other employees had not yet arrived in my friend's office. This time also worked out well for me because it would be 9:00 p.m. (EST), the time when I liked to pluck on my guitar and sing. The above plan worked well—on February 8th, 2004, I had my first “transcontinental singing lesson.” Despite a slight delay in sound transmission due to the Internet bandwidth problem, my friend was able to correct the fine melodic nuances that I missed when I sang my favorite Hindu chant. I can now sing a Hindu chant with nuances approved by a singing teacher sitting in front of a computer many oceans away. Suresh Bhavnani
entirely online as a substitute for traditional face-to-face meetings. As businesses have become increasingly international and distributed, support for virtual meetings has become more important. A virtual meeting may use technology as simple as a telephone conference call or as complex as a collaborative virtual environment (CVE) that embodies attendees and their work resources as interactive objects in a three-dimensional virtual world. Because virtual meetings must rely on CMC, attendees have fewer communication cues and become less effective at turn taking, negotiation, and other socially rich interaction. It is also often difficult to access and interact with meeting documents in a CVE, particularly when the meeting agenda is open and information needs to evolve during the meeting. Some researchers have argued that online meetings will never equal face-to-face interaction, and that researchers should focus instead on the special qualities offered by a virtual medium—for example, the archiving, reviewing, and revising of content that is a natural consequence of working together online. When collaborators meet online, participant authentication is an important issue. Many work situations have policies and procedures that must be respected; for example, meetings may have a specified attendee list or restricted documents, or decisions may require the approval of a manager. Enforcing such restrictions creates work for both the organizer of the activity (who must activate the appropriate controls) and the participants (who must identify themselves if and when required). Depending on a group's culture and setting, the meeting organizers may choose to make no restrictions at all (for example, they may meet in an online chatroom and rely on group members to self-enforce relevant policies and group behavior), or they may rely on a set of roles (such as leader, attendee, or scribe) built into the groupware system to manage information access and interaction.
A significant technical challenge for synchronous groupware is ensuring data consistency. When collaborators are able to communicate or edit shared data in parallel, there is the possibility that simultaneous requests will conflict: One participant might correct the spelling of a word at the same time that another member deletes a phrase containing the word, for example. The simplest technique for avoiding consistency problems is to implement a floor control mechanism that permits only one participant at a time to have the virtual pen, with others waiting until it is passed to them. Because such mechanisms can be awkward and slow, many groupware systems have explored alternatives, including implicit locking of paragraphs or individual words, and fully optimistic serialization, which processes all input in the order in which it is received, with the assumption that well-learned social protocols of turn taking and coordination will reduce conflict and ensure smooth operation. Many other technical challenges plague the smooth operation of groupware. For instance, it is quite common for collaborators to be interacting with rather different hardware and software platforms. Although work groups may settle on a standard set of software, not all group members may follow all aspects of the standard, and beyond the work group, there may be no standards. Thus interoperability of data formats, search tools, editing or viewing software, and analysis tools is a constant concern. As work settings have become more mobile and dynamic, the variety of technical challenges has increased: Some members at a virtual meeting may join by cell phone, while others may use a dedicated broadband network connection. It is increasingly common for groupware systems to at least provide an indicator of such variation, so that collaborators can compensate as necessary (for example, by recognizing that a cell phone participant may not be able to see the slides presented at a meeting). The general goal of promoting awareness during CSCW interactions has many facets. During synchronous work, groupware often provides some form of workspace awareness, with telepointers or miniaturized overviews showing what objects are selected or in view by collaborators. In more extended collaborations, partners depend on social awareness to
know which group members are around, available for interaction, and so on. Social awareness can be provided through mechanisms such as buddy lists, avatars (online representations of group members), or even regularly updated snapshots of a person in their work setting. For a shared project that takes place over weeks or months, collaborators need activity awareness: They must be aware of what project features have changed, who has done what, what goals or plans are currently active, and how to contribute. However, promoting activity awareness remains an open research topic; considerable work is needed to determine how best to integrate across synchronous and asynchronous interactions, what information is useful in conveying status and progress, and how this information can be gathered and represented in a manner that supports rather than interrupts collaborative activities.
Adoption and Adaptation of CSCW Systems Even when great care is taken in the design and implementation of a CSCW system, there is no guarantee that it will be successfully adopted and integrated into work practices—or that when it is adopted it will work as originally intended. Many case studies point to a sociotechnical evolution cycle: Initially, delivered CSCW systems do not fit onto existing social and organizational structures and processes. During a process of assimilation and accommodation, the organization changes (for example, a new role may be defined for setting up and facilitating virtual meetings) in concert with the technology (for example, a set of organization-specific templates may be defined to simplify agenda setting and meeting management). Several implications follow from this view of CSCW adoption. One is that participatory design of the software is essential—without the knowledge of praxis provided by the intended users, the software will not be able to evolve to meet their specific needs; furthermore if users are included in the design process, introduction of the CSCW system into the workplace will already have begun by the time the system is deployed. Another implication is that
CSCW software should have as open an architecture as possible, so that when the inevitable need for changes is recognized months or years after deployment, it will be possible to add, delete, or otherwise refine existing services. A third implication is that organizations seeking CSCW solutions should be ready to change their business structures and processes—and in fact should undertake business process reengineering as they contribute to the design of a CSCW system. A frequent contributing factor in groupware failure is uneven distribution of costs and benefits across organizational roles and responsibilities. There are genuine costs to collaboration: When an individual carries out a task, its subtasks may be accomplished in an informal and ad hoc fashion, but distributing the same task among individuals in a group is likely to require more advance planning and negotiation, recordkeeping, and explicit tracking of milestones and partial results. Collaboration implies coordination. Of course the benefits are genuine as well: One can assign tasks to the most qualified personnel, one gains multiple perspectives on difficult problems, and social recognition and rewards accrue when individuals combine efforts to reach a common goal. Unfortunately, the costs of collaboration are often borne by workers, who have new requirements for online planning and reporting, while its benefits are enjoyed by managers, who are able to deliver on-time results of higher quality. Therefore, when designing for sociotechnical evolution, it is important to analyze the expected costs and benefits and their distribution within the organization. Equally important are mechanisms for building social capital and trust, such that individuals are willing to contribute to the common good, trusting that others in the group will reward or care for them when the time comes. Critical mass is another determinant of successful adoption—the greater the proportion of individuals within an organization who use a technology, the more sense it makes to begin using it oneself. A staged adoption process is often effective, with a high-profile individual becoming an early user and advocate who introduces the system to his or her group. This group chronicles its adoption experience and passes the technology on to other groups, and so on. By the time the late adopters begin to use
the new technology, much of the sociotechnical evolution has taken place, context-specific procedures have been developed and refined in situ, and there are local experts to assist new users. As more and more of an organization’s activities take place online—whether through e-mail or videoconferencing or shared file systems—via CSCW technology, the amount of online information about the organization and its goals increases exponentially. The increased presence of organizational information online has generated great interest in the prospects for organizational memory or knowledge management. The hope is that one side effect of carrying out activities online will be a variety of records about how and why tasks are decomposed and accomplished, and that these records can provide guidance to other groups pursuing similar goals. Of course once again, there are important cost-benefit issues to consider: Recording enough information to be helpful to future groups takes time, especially if it is to be stored in any useful fashion, and the benefit in most cases will be enjoyed by other people. One solution is to give computers the job of recording, organizing, and retrieving. For example, even a coarse-grained identification of speakers making comments in a meeting can simplify subsequent browsing of the meeting audiotape.
Research Directions Much of the active research in CSCW is oriented toward new technologies that will enhance awareness, integrate multiple devices, populations, and activities, and make it possible to visualize and share rich data sets and multimedia documents. The need to interconnect people who are using diverse devices in diverse settings entails many research challenges, some related to the general issues of multiplatform computing and others tied to understanding and planning for the social and motivational differences associated with varied work settings. The rapidly expanding archives in organizations offer many research opportunities related to data processing and analysis as well as information visualization and retrieval. At the same time, these digital storehouses raise important questions about individual privacy and identity—the more information an organization
collects about an individual, the more opportunity there is for inappropriate access to and use of this information. A methodological challenge for CSCW is the development of effective evaluation methods. Field studies and ethnographic analyses yield very rich data that can be useful in understanding system requirements and organizational dynamics. But analyzing such detailed records to answer precise questions is time consuming and sometimes impossible due to the complexity of real-world settings. Unfortunately, the methods developed for studying individual computer use do not scale well to the evaluation of multiple users in different locations. Because social and organizational context is a key component of CSCW activities, it is difficult to simulate shared activities in a controlled lab setting. Groupware has been evolving at a rapid rate, so there are few if any benchmark tasks or results to use for comparison studies. One promising research direction involves fieldwork that identifies interesting collaboration scenarios; these are then scripted and simulated in a laboratory setting for more systematic analysis. In the 1980s human-computer interaction focused on solitary users finding and creating information using a personal computer. Today, the focus is on several to many people working together at a variety of times and in disparate places, relying heavily on the Internet, and communicating and collaborating more or less continually. This is far more than a transformation of human-computer interaction; it is a transformation of human work and activity. It is still under way, and CSCW will continue to play a large role.

Mary Beth Rosson and John M. Carroll

See also Collaboratories; Ethnography; MUDs; Social Psychology and HCI

FURTHER READING

Ackerman, M. S. (2002). The intellectual challenge of CSCW: The gap between social requirements and technical feasibility. In J. M. Carroll (Ed.), Human-computer interaction in the new millennium (pp. 303–324). New York: ACM Press.
Baecker, R. M. (1993). Readings in groupware and computer-supported cooperative work: Assisting human-human collaboration. San Francisco: Morgan-Kaufmann. Beaudouin-Lafon, M. (Ed). (1999). Computer supported co-operative work. Chichester, UK: John Wiley & Sons. Bikson, T. K., & Eveland, J. D. (1996). Groupware implementation: Reinvention in the sociotechnical frame. In Proceedings of the Conference on Computer Supported Cooperative Work: CSCW ’96 (pp. 428–437). New York: ACM Press. Carroll, J. M., Chin, G., Rosson, M .B., & Neale, D. C. (2000). The development of cooperation: Five years of participatory design in the virtual school. In Designing interactive systems: DIS 2000 (pp. 239–251). New York: ACM Press. Carroll, J. M., & Rosson, M.B. (2001). Better home shopping or new democracy? Evaluating community network outcomes. In Proceedings of Human Factors in Computing Systems: CHI 2001 (pp. 372–379). New York: ACM Press. Dourish, P., & Bellotti, V. (1992). Awareness and coordination in shared workspaces. In Proceedings of the Conference on Computer Supported Cooperative Work: CSCW ’92 (pp. 107–114). New York: ACM Press. Grudin, J. (1994). Groupware and social dynamics: Eight challenges for developers. Communications of the ACM, 37(1), 92–105. Gutwin, C., & Greenberg, S. (1999). The effects of workspace awareness support on the usability of real-time distributed groupware. ACM Transactions on Computer-Human Interaction, 6(3), 243–281. Harrison, S., & Dourish, P. (1996). Re-placing space: The roles of place and space in collaborative systems. In Proceedings of the Conference on Computer Supported Cooperative Work: CSCW ’96 (pp. 67–76). New York: ACM Press. Hughes, J., King, V., Rodden, T., & Andersen, H. (1994). Moving out from the control room: Ethnography in system design. In Proceedings of the Conference on Computer Supported Cooperative Work: CSCW ’94 (pp. 429–439). New York: ACM Press. Hutchins, E. (1995). Distributed cognition. Cambridge, MA: MIT Press. Malone, T. W., & Crowston, K. (1994). The interdisciplinary study of coordination. ACM Computing Surveys, 26(1), 87–119. Markus, M. L. (1994). Finding a happy medium: Explaining the negative effects of electronic communication on social life at work. ACM Transactions on Information Systems, 12(2), 119–149. Nardi, B. A. (1993). A small matter of programming. Cambridge, MA: MIT Press. Nardi, B. A. (Ed). (1996). Context and consciousness: Activity theory and human-computer interaction. Cambridge, MA: MIT Press. Olson, G. M., & Olson, J. S. (2000). Distance matters. Human Computer Interaction, 15(2–3), 139–179. Orlikowski, W. J. (1992). Learning from notes: Organizational issues in groupware implementation. In Proceedings of the Conference on Computer Supported Cooperative Work: CSCW ’92 (pp. 362–369). New York: ACM Press. Roseman, M., & Greenberg, S. (1996). Building real time groupware with Groupkit, a groupware toolkit. ACM Transactions on Computer Human Interaction, 3(1), 66–106. Sproull, L., & Kiesler, S. (1991). Connections: New ways of working in the networked organization. Cambridge, MA: MIT Press. Streitz, N. A., Geißler, J., Haake, J., & Hol, J. (1994). DOLPHIN: Integrated meeting support across local and remote desktop environments and liveboards. In Proceedings of the Conference on Computer Supported Cooperative Work: CSCW ’94 (pp. 345–358). New York: ACM Press.
Suchman, L. (1987). Plans and situated actions: The problem of human-machine communication. Cambridge, UK: Cambridge University Press. Sun, C., & Chen, D. (2002). Consistency maintenance in real-time collaborative graphics editing systems. ACM Transactions on Computer Human Interaction, 9(1), 1–41. Tang, J., Yankelovich, N., Begole, J., Van Kleek, M., Li, F., & Bhalodia, J. (2001). Connexus to awarenex: Extending awareness to mobile users. In Proceedings of Human Factors in Computing Systems: CHI 2001 (pp. 221–228). New York: ACM Press. Winograd, T. (1987/1988). A language/action perspective on the design of cooperative work. Human-Computer Interaction, 3(1), 3–30.
CONSTRAINT SATISFACTION Constraint satisfaction refers to a set of representation and processing techniques useful for modeling and solving combinatorial decision problems; this paradigm emerged from the artificial intelligence community in the early 1970s. A constraint satisfaction problem (CSP) is defined by three elements: (1) a set of decisions to be made, (2) a set of choices or alternatives for each decision, and (3) a set of constraints that restrict the acceptable combinations of choices for any two or more decisions. In general, the task of a CSP is to find a consistent solution—that is, a choice for every decision such that all the constraints are satisfied. More formally, each decision is called a variable, the set of alternative choices for a given variable is the set of values or domain of the variable, and the constraints are defined as the set of allowable combinations of assignments of values to variables. These combinations can be given in extension as the list of consistent tuples, or defined in intension as a predicate over the variables.
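Expressed in code, these three elements map naturally onto a few simple data structures. The sketch below is a minimal illustration only, written in Python for concreteness (the entry itself prescribes no language), and all names are hypothetical: it represents a two-variable CSP whose single constraint is given in intension as a predicate, and checks whether an assignment is consistent.

```
# A minimal, illustrative CSP representation: variables, domains,
# and one constraint given "in intension" as a predicate over its scope.
variables = ["A", "B"]
domains = {"A": {8, 9, 10, 11, 12}, "B": {7, 8, 9, 10, 11}}
constraints = [
    (("A", "B"), lambda a, b: b - a >= 1),  # B must take a value at least 1 greater than A
]

def is_consistent(assignment):
    """Return True if every constraint whose variables are all assigned
    is satisfied by the given (possibly partial) assignment."""
    for scope, predicate in constraints:
        if all(v in assignment for v in scope):
            if not predicate(*(assignment[v] for v in scope)):
                return False
    return True

print(is_consistent({"A": 8, "B": 9}))    # True: the constraint is satisfied
print(is_consistent({"A": 10, "B": 10}))  # False: the constraint is violated
```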
The 4-Queen Problem A familiar example of a CSP is the 4-queen problem. In this problem, the task is to place four queens on a 4×4 chessboard in such a way that no two queens attack each other. One way to model the 4-queen problem as a CSP is to define a decision variable for each square on the board. The square can be either empty (value 0) or have a queen (value 1). The constraints specify that exactly four of the decision variables have value 1 (“queen in this square”) and that there cannot be two queens in the same row, column, or diagonal. Because there are sixteen variables (one for each square) and each can take on two possible values, there are a total of 2¹⁶ (65,536) possible assignments of values to the decision variables. There are other ways of modeling the 4-queen problem within the CSP framework. One alternative is to treat each row on the board as a decision variable. The values that can be taken by each variable are the four column positions in the row. This formulation yields 4⁴ (256) possibilities. This example illustrates how the initial formulation or model affects the number of possibilities to be examined, and ultimately the performance of problem solving.
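The difference between the two formulations can be checked by brute force. The following sketch is a hypothetical illustration in Python rather than anything from the original entry; it enumerates all 4⁴ (256) assignments of the row-based model and keeps those that satisfy the column and diagonal constraints.

```
from itertools import product

N = 4  # board size: one variable per row, whose value is the column of that row's queen

def attacks(r1, c1, r2, c2):
    # Two queens attack each other if they share a column or a diagonal;
    # sharing a row is impossible in this formulation.
    return c1 == c2 or abs(r1 - r2) == abs(c1 - c2)

solutions = [cols for cols in product(range(N), repeat=N)
             if not any(attacks(r1, cols[r1], r2, cols[r2])
                        for r1 in range(N) for r2 in range(r1 + 1, N))]

print(N ** N, "candidate assignments in this model")  # 256
print(solutions)  # [(1, 3, 0, 2), (2, 0, 3, 1)]
```

The two counts, 256 versus 65,536, make concrete the entry’s point that the choice of model determines the size of the space a solver must examine.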
CSP Representations A CSP is often represented as an undirected graph (or network), which is a set of nodes connected by a set of edges. This representation opens up the opportunity to exploit the properties and algorithms developed in graph theory for processing and solving CSPs. In a constraint graph, the nodes represent the variables and are labeled with the domains of the variables. The edges represent the constraints and link the nodes corresponding to the variables to which the constraints apply. The arity of a constraint designates the number of variables to which the constraint applies, and the set of these variables constitutes the scope of the constraint. Constraints that apply to two variables are called binary constraints and are represented as edges in the graph. Constraints that apply to more than two variables are called nonbinary constraints. Although most early research focused on solving binary CSPs, techniques for solving nonbinary CSPs are now being investigated.
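One common way to realize this graph view in software, sketched here in Python purely for illustration and with hypothetical names, is an adjacency structure in which each edge between two variables carries the binary constraint that relates them.

```
from collections import defaultdict

# Nodes are variables labeled with their domains; edges carry binary constraints.
domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}, "Z": {1, 2, 3}}

edges = defaultdict(dict)

def add_binary_constraint(u, v, predicate):
    """Record a binary constraint as an (undirected) edge between u and v."""
    edges[u][v] = predicate
    edges[v][u] = lambda b, a: predicate(a, b)  # the same constraint seen from v's side

add_binary_constraint("X", "Y", lambda x, y: x < y)
add_binary_constraint("Y", "Z", lambda y, z: y != z)

# The neighbors of a node are the variables it shares a constraint with.
print(sorted(edges["Y"]))  # ['X', 'Z']
```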
The Role of CSPs in Science Beyond puzzles, CSPs have been used to model and solve many tasks (for example, temporal reasoning, graphical user interfaces, and diagnosis) and have been applied in many real-world settings (for example, scheduling, resource allocation, and product configuration and design). They have been used
in various areas of engineering, computer science, and management to handle decision problems. A natural extension of the CSP is the constrained optimization problem (COP), where the task is to find an optimal solution to the problem given a set of preferences and optimization criteria. The problems and issues studied in the constraint processing (CP) community most obviously overlap with those investigated in operations research, satisfiability and theoretical computer science, databases, and programming languages. The 1990s witnessed a sharp increase in the interactions and cross-fertilization among these areas. Special emphasis is placed in CP on maintaining the expressiveness of the representation. Ideally, a human user should be able to naturally express the various relations governing the interactions among the entities of a given problem without having to recast them in terms of complex mathematical models and tools, as would be necessary in mathematical programming. The area of constraint reformulation is concerned with the task of transforming the problem representation in order to improve the performance of problem solving or allow the use of available solution techniques. Sometimes such transformations are truthful (that is, they preserve the essence of the problem), but often they introduce some sufficient or necessary approximations, which may or may not be acceptable in a particular context.
Solution Methods The techniques used to solve a CSP can be divided into two categories: constraint propagation (or inference) and search. Further, search can be carried out as a systematic, constructive process (which is exhaustive) or as an iterative repair process (which often has a stochastic component). Constraint Propagation Constraint propagation consists in eliminating, from the CSP, combinations of values for variables that cannot appear in any solution to the CSP. Consider for example two CSP variables A and B representing two events. Assume that A occurred between 8 a.m. and 12 p.m. (the domain of A is the interval [8, 12]), B occurred between 7 a.m. and 11 a.m. (the
domain of B is the interval [7, 11]), and B occurred one hour after A (B-A ≥ 1). It is easy to infer that the domains of A and B must be restricted to [8, 10] and [9, 11] respectively, because B cannot possibly occur before 9, or A after 10, without violating the constraint between A and B. This filtering operation considers every combination of two variables in a binary CSP. It is called 2-consistency. A number of formal properties have been proposed to characterize the extent to which the alternative combinations embedded in a problem description are likely to yield consistent solutions, as a measure of how “close is the problem to being solved.” These properties characterize the level of consistency of the problem (for example, k-consistency, minimality, and decomposability). Algorithms for achieving these properties, also known as constraint propagation algorithms, remain the subject of intensive research. Although the cost of commonly used constraint propagation algorithms is a polynomial function of the number of variables of the CSP and the size of their domains, solving the CSP remains, in general, an exponential-cost process. An important research effort in CP is devoted to finding formal relations between the level of consistency in a problem and the cost of the search process used for solving it. These relations often exploit the topology of the constraint graph or the semantic properties of the constraint. For example, a tree-structured constraint graph can be solved backtrack-free after ensuring 2-consistency, and a network of constraints of bounded differences (typically used in temporal reasoning) is solved by ensuring 3-consistency. Systematic Search In systematic search, the set of consistent combinations is explored in a tree-like structure starting from a root node, where no variable has a value, and considering the variables of the CSP in sequence. The tree is typically traversed in a depth-first manner. At a given depth of the tree, the variable under consideration (current variable) is assigned a value from its domain. This operation is called variable instantiation. It is important that the value chosen for the current variable be consistent with the instantiations of the past variables. The process of checking the consistency of a value for the current variable
with the assignments of past variables is called backchecking. It ensures that only instantiations that are consistent (partial solutions) are explored. If a consistent value is found for the current variable, then this variable is added to the list of past variables and a new current variable is chosen from among the un-instantiated variables (future variables). Otherwise (that is, no consistent value exists in the domain of the current variable), backtracking is applied. Backtracking undoes the assignment of the previously instantiated variable, which becomes the current variable, and the search process attempts to find another value in the domain of this variable. The process is repeated until all variables have been instantiated (thus yielding a solution) or backtracking has reached the root of the tree (thus proving that the problem is not solvable). Various techniques for improving the search process itself have been proposed. For systematic search, these techniques include intelligent backtracking mechanisms such as backjumping and conflict-directed backjumping. These mechanisms attempt to remember the reasons for failure and exploit them during search in order to avoid repeatedly exploring barren portions of the search space (a behavior commonly called thrashing). The choices of the variable to be instantiated during search and that of the value assigned to the variable are handled, respectively, by variable and value ordering heuristics, which attempt to reduce the search effort. Such heuristics can be applied statically (that is, before the search starts) or dynamically (that is, during the search process). The general principles that guide these selections are “the most constrained variable first” and “the most promising value first.” Examples of the former include the least domain heuristic (where the variable with the smallest domain is chosen for instantiation) and the minimal-width heuristic (where the variables are considered in the ordering of minimal width of the constraint graph).
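The following sketch gives one minimal, illustrative rendering of this process in Python (helper names are hypothetical and not part of the original entry): a depth-first backtracking search with back-checking and a least-domain variable-ordering heuristic, applied to the row-based 4-queen model described earlier.

```
def consistent(var, val, assignment):
    """Back-checking: is val for var compatible with every past assignment?
    Variables are rows, values are columns; the constraints forbid shared
    columns and shared diagonals."""
    return all(val != c and abs(var - r) != abs(val - c)
               for r, c in assignment.items())

def backtrack(assignment, domains):
    if len(assignment) == len(domains):
        return assignment                      # every variable instantiated: a solution
    # Least-domain heuristic: pick the uninstantiated variable with the fewest values.
    var = min((v for v in domains if v not in assignment),
              key=lambda v: len(domains[v]))
    for val in domains[var]:                   # value ordering: here, simply domain order
        if consistent(var, val, assignment):   # back-check against past variables
            assignment[var] = val
            result = backtrack(assignment, domains)
            if result is not None:
                return result
            del assignment[var]                # undo the instantiation and try the next value
    return None                                # no consistent value: backtrack to the caller

N = 4
domains = {row: list(range(N)) for row in range(N)}
print(backtrack({}, domains))                  # one solution: {0: 1, 1: 3, 2: 0, 3: 2}
```

In this tiny example all domains start with the same size, so the least-domain heuristic only becomes informative when domains are filtered during search; it is shown here simply to make the variable-ordering step explicit.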
Iterative-Repair Search In iterative repair (or iterative improvement) search, all the variables are instantiated (usually randomly) regardless of whether or not the constraints are satisfied. This set of complete instantiations, which is not necessarily a solution, constitutes a state. Iterative-repair search operates by moving from one
state to another and attempting to find a state where all constraints are satisfied. The move operator and the state evaluation function are two important components of an iterative-repair search. The move is usually accomplished by changing the value of one variable (thus the name local search). However, a technique operating as a multiagent search allows any number of variables to change their values. The evaluation function measures the cost or quality of a given state, usually in terms of the number of broken constraints. Heuristics, such as the min-conflict heuristic, are used to choose among the states reachable from the current state (neighboring states). The performance of iterative-repair techniques depends heavily on their ability to explore the solution space. The performance is undermined by the existence in this space of local optima, plateaux, and other singularities caused by the nonconvexity of the constraints. Heuristics are used to avoid falling into these traps or to recover from them. One heuristic, a breakout strategy, consists of increasing the weight of the broken constraints until a state is reached that satisfies these constraints. Tabu search maintains a list of states to which search cannot move back. Other heuristics use stochastic noise such as random walk and simulated annealing.
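As a minimal illustration of the iterative-repair approach just described, the sketch below (Python, hypothetical names, not part of the original entry) applies the min-conflict heuristic to the row-based 4-queen model: it starts from a random complete state and repeatedly moves one conflicted variable to the value that breaks the fewest constraints, stopping when no constraint is broken or a step limit is reached.

```
import random

N = 4  # rows are variables, values are column positions

def conflicts(state, row, col):
    """Number of constraints broken if the queen in `row` sits in `col`."""
    return sum(1 for r, c in state.items()
               if r != row and (c == col or abs(r - row) == abs(c - col)))

def min_conflicts(max_steps=1000):
    # Start from a complete (random) instantiation: a "state", not necessarily a solution.
    state = {row: random.randrange(N) for row in range(N)}
    for _ in range(max_steps):
        conflicted = [r for r in state if conflicts(state, r, state[r]) > 0]
        if not conflicted:
            return state                      # every constraint is satisfied
        row = random.choice(conflicted)       # repair one conflicted variable...
        # ...by moving it to the value with the fewest conflicts (the min-conflict heuristic).
        state[row] = min(range(N), key=lambda col: conflicts(state, row, col))
    return None                               # stuck, e.g. on a local optimum or plateau

print(min_conflicts())  # e.g. {0: 2, 1: 0, 2: 3, 3: 1}, or None if the step limit is hit
```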
Blending Solution Techniques Constraint propagation has been successfully combined with backtrack search to yield effective lookahead strategies such as forward checking. Combining constraint propagation with iterative-repair strategies is less common. On the other hand, randomization, which has long been utilized in local search, is now being successfully applied in backtrack search.
Research Directions The use of constraint processing techniques is widespread due to the success of the constraint programming paradigm and the increase of commercial tools and industrial achievements. While research on the above topics remains active, effort is also being invested in the following directions: user interaction; discovery and exploitation of symmetry relations; propagation algorithms for high-arity constraints
and for continuous domains; preference modeling and processing; distributed search techniques; empirical assessment of problem difficulty; and statistical evaluation and comparison of algorithms. Berthe Y. Choueiry See also Artificial Intelligence; N-grams
FURTHER READING Bistarelli, S., Montanari, U., & Rossi, F. (1997). Semiring-based constraint satisfaction and optimization. Journal of the ACM, 44(2), 201–236. Borning, A., & Duisberg, R. (1986). Constraint-based tools for building user interfaces. ACM Transactions on Graphics, 5(4), 345–374. Cohen, P. R. (1995). Empirical methods for artificial intelligence. Cambridge, MA: MIT Press. Dechter, R. (2003). Constraint processing. San Francisco: Morgan Kaufmann. Ellman, T. (1993). Abstraction via approximate symmetry. In Proceedings of the 13th IJCAI (pp. 916–921). Chambéry, France. Freuder, E. C. (1982). A sufficient condition for backtrack-free search. JACM, 29(1), 24–32. Freuder, E. C. (1985). A sufficient condition for backtrack-bounded search. JACM, 32(4), 755–761. Freuder, E. C. (1991). Eliminating interchangeable values in constraint satisfaction problems. In Proceedings of AAAI-91 (pp. 227–233). Anaheim, CA. Gashnig, J. (1979). Performance measurement and analysis of certain search algorithms. Pittsburgh, PA: Carnegie-Mellon University. Glaisher, J. W. L. (1874). On the problem of the eight queens. Philosophical Magazine, 4(48), 457–467. Glover, F. (1989). Tabu Search—Part I. ORSA Journal on Computing, 1(3), 190–206. Gomes, C. P. (2004). Randomized backtrack search. In M. Milano (Ed.), Constraint and Integer Programming: Toward a Unified Methodology (pp. 233–291). Kluwer Academic Publishers. Haralick, R. M., & Elliott, G. L. (1980). Increasing tree search efficiency for constraint satisfaction problems. Artificial Intelligence, 14, 263–313. Hogg, T., Huberman, B. A., & Williams, C. P. (Eds.). (1996). Special volume on frontiers in problem solving: Phase transitions and complexity. Artificial Intelligence, 81(1–2). Burlington, MA: Elsevier Science. Hooker, J. (2000). Logic-based methods for optimization: Combining optimization and constraint satisfaction. New York: Wiley. Hoos, H. H., & Stützle, T. (2004). Stochastic local search. San Francisco: Morgan Kaufmann. Kirkpatrick, S., Gelatt, J. C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671–680. Liu, J., Jing, H., & Tang, Y. Y. (2002). Multi-agent oriented constraint satisfaction. Artificial Intelligence, 136(1), 101–144. Minton, S., et al. (1992). Minimizing conflicts: A heuristic repair method for constraint satisfaction and scheduling problems. Artificial Intelligence, 58, 161–205.
Montanari, U. (1974). Networks of constraints: Fundamental properties and application to picture processing. Information Sciences, 7, 95–132. Prosser, P. (1993). Hybrid algorithms for the constraint satisfaction problem. Computational Intelligence, 9(3), 268–299. Régin, J.-C. (1994). A filtering algorithm for constraints of difference in constraint satisfaction problems. In Proceedings from the National Conference on Artificial Intelligence (AAAI 1994) (pp. 362–437). Seattle, WA. Revesz, P. (2002). Introduction to constraint databases. New York: Springer. Stuckey, K. M. (1998). Programming with constraints: An introduction. Cambridge, MA: MIT Press. Tsang, E. (1993). Foundations of constraint satisfaction. London, UK: Academic Press. Yokoo, M. (1998). Distributed constraint satisfaction. New York: Springer.
CONVERGING TECHNOLOGIES Human-computer interaction (HCI) is a multidisciplinary field arising chiefly in the convergence of computer science, electrical engineering, information technology, and cognitive science or psychology. In the future it is likely to be influenced by broader convergences currently in progress, reaching out as far as biotechnology and nanotechnology. Together, these combined fields can take HCI to new levels where it will unobtrusively but profoundly enhance human capabilities to perceive, to think, and to act with maximum effectiveness.
The Basis for Convergence During the twentieth century a number of interdisciplinary fields emerged, bridging the gaps between separate traditionally defined sciences. Notable examples are astrophysics (astronomy plus physics), biochemistry (biology plus chemistry), and cognitive science (psychology plus neurology plus computer science). Many scientists and engineers believe that the twenty-first century will be marked by a broader unification of all of the sciences, permitting a vast array of practical breakthroughs—notably in the convergence of nanotechnology, biotechnology, information technology, and cognitive technology—
based on the unification of nanoscience, biology, information science, and cognitive science. HCI itself stands at the junction between the last two of these four, and it has the potential to play a major role in the emergence of converging technologies. A number of scientific workshops and conferences, organized by scientists and engineers associated with the U.S. National Science Foundation and building upon the United States National Nanotechnology Initiative, have concluded that nanoscience and nanotechnology will be especially important in convergence. Nanoscience and nanotechnology concern scientific research and engineering (respectively) at the nanoscale, the size range of physical structures between about 1 nanometer and 100 nanometers in shortest dimension. A nanometer is 1 billionth of a meter, or 1 millionth of a millimeter, and a millimeter is about the thickness of a dime (the thinnest U.S. coin). Superficially, nanoscience and nanotechnology
seem remote from HCI because the human senses operate at a much larger scale. However, we can already identify a number of both direct and indirect connections, and as work at the nanoscale promotes convergence between other fields it will create new opportunities and challenges for HCI. The largest single atoms, such as those of uranium, are just smaller than 1 nanometer. The structures of complex matter that are fundamental to all sciences originate at the nanoscale. That is the scale at which complex inorganic materials take on the characteristic mechanical, electrical, and chemical properties they exhibit at larger scales. The nanoscale is where the fundamental structures of life arise inside biological cells, including the human DNA (deoxyribonucleic acid) molecule itself. The double helix of DNA has the proportions of a twisted piece of string, about 2.5 nanometers thick but as much as 4 centimeters (40 million nanometers) long
The BRLESC-II, a solid-state digital computer introduced in 1967. It was designed to be 200 times faster than the ORDVAC computer it replaced. Photo courtesy of the U.S. Army.
if uncoiled. The synaptic gaps between neurons in the human brain, and the structures that contain the neurotransmitter chemicals essential to their functioning, are on the order of 20 to 50 nanometers. Nanotechnology and nanoscience are chiefly a partnership of physics, chemistry, and materials science (an interdisciplinary field at the intersection of physics, chemistry, and engineering that deals with the properties of materials, including composite materials with complex structures). In the near term nanotechnology offers engineering a host of new materials, including powders with nanoscale granules, thin coatings that transform the properties of surfaces, and composite materials having nanoscale structure that gives them greater strength, durability, and other characteristics that can be precisely designed for many specific uses. In the midterm to long term, nanotechnology is expected also to achieve practical accomplishments with complex nanostructures, including new kinds of electronic components and nanoscale machines. Biotechnology applies discoveries in biology to the invention and production of products that are valuable for human health, nutrition, and economic well-being. The traditional application areas for biotechnology are medicine and agriculture, including the production of chemicals and construction materials having organic origins. Biotechnology has a long history, extending back thousands of years to ancient industries such as fermentation of alcohol, tanning of hides, dyeing of clothing, and baking of bread. The pace of innovation accelerated throughout the nineteenth and twentieth centuries, leading to the latest developments in genomics (a branch of biotechnology concerned with applying the techniques of genetics and molecular biology to the genetic mapping and DNA sequencing of sets of genes or the complete genomes of selected organisms) and a growing understanding of the structures and processes inside the living cell. Information technology is a creation of the second half of the twentieth century, revolutionizing traditional communication technologies through the introduction of electronic computation. It comprises computers, information systems, and communication networks such as Internet and the World
Wide Web, both hardware and software. Many of the early applications have been new ways of accomplishing old tasks, for example, word processors, digital music and television, and more recently digital libraries. The integration of mobile computing with the Internet is expected to unleash a wave of radically different innovations, many of which cannot even be imagined today, connected to ubiquitous availability of information and of knowledge tools. Cognitive science is the study of intelligence, whether human, nonhuman animal, or machine, including perception, memory, decision, and understanding. It is itself a convergence of fields, drawing upon psychology, social psychology, cultural anthropology, linguistics, economics, sociology, neuroscience, artificial intelligence, and machine learning. The fundamental aim is a profound understanding of the nature of the human mind. By the beginning of the twenty-first century a new universe of cognitive technologies clearly was opening up, especially in partnerships between humans and computers. The result could be technologies that overcome breakdowns in human awareness, analysis, planning, decision making, and communication. Each of these four fields is a fertile field of scientific research and technological development, but in combination they can achieve progress much more rapidly and broadly than they can alone. Following are examples of the science and engineering opportunities in each of the six possible pairs. Nanotechnology–Biotechnology Research at the nanoscale can reveal the detailed, dynamic geometry of the tiny structures that carry out metabolism, movement, and reproduction inside the living cell, thereby greatly expanding biological science. Biology provides conceptual models and practical tools for building inorganic nanotechnology structures and machines of much greater complexity than currently possible. Nanotechnology–Information Technology Nanoelectronic integrated circuits will provide the fast, efficient, highly capable hardware to support new systems for collecting, managing, and distributing information wherever and whenever it is
needed. Advances in information technology will be essential for the scientific analysis of nanoscale structures and processes and for the design and manufacture of nanotechnology products. Nanotechnology–Cognitive Technology New research methods based on nanoscale sensor arrays will enable neuroscientists to study the fine details of neural networks in the brain, including the dynamic patterns of interaction that are the basis of human thought. Cognitive science will help nanoscientists and educators develop the most readily intelligible models of nanoscale structures and the innovative curriculum needed for students to understand the world as a complex hierarchy of systems built up from the nanoscale. Biotechnology–Information Technology Principles from evolutionary biology can be applied to the study of human culture, and biologically inspired computational methods such as genetic algorithms (procedures for solving a mathematical problem in a finite number of steps that frequently involve repetition of an operation) can find meaningful patterns in vast collections of information. Bioinformatics, which consists of biologically oriented databases with lexicons for translating from one to another, is essential for managing the huge trove of data from genome (the genetic material of an organism) sequencing, ecological surveys, large-scale medical and agricultural experiments, and systematic comparisons of evolutionary connections among thousands of species. Biotechnology–Cognitive Technology Research techniques and instruments developed in biotechnology are indispensable tools for research on the nature and dynamics of the nervous system, in both humans and nonhuman animals, understood as the products of millions of years of biological evolution. Human beings seem to have great difficulty thinking of themselves as parts of complex ecological systems and as the products of evolution by natural selection from random evolution, so advances will be needed to design fresh approaches to scientific education and new visual-
ization tools to help people understand biology and biotechnology correctly. Information Technology–Cognitive Technology Experiments on human and nonhuman animal behavior depend upon computerized devices for data collection and on information systems for data analysis, and progress can be accelerated by sharing information widely among scientists. Discoveries by cognitive scientists about the ways the human mind carries out a variety of judgments provide models for how machines could do the same work, for example, to sift needed information from a vast assembly of undigested data.
HCI Contributions to Convergence Attempting to combine two scientific disciplines would be futile unless they have actually moved into adjacent intellectual territories and proper means can be developed to bridge between them. Disciplines typically develop their own distinctive assumptions, terminologies, and methodologies. Even under the most favorable conditions, transforming tools are needed, such as new concepts that can connect the disparate assumptions of different disciplines, ontologies—category schemes and lexicons of concepts in a particular domain—that translate language across the cultural barriers between disciplines, and research instrumentation or mathematical analysis techniques that can be applied equally well in either discipline. Because many of these transforming tools are likely to be computerized, human-computer interaction research will be essential for scientific and technological convergence. One of the key ways of developing fresh scientific conceptualizations, including models and metaphors that communicate successfully across disciplinary barriers, is computer visualization. For example, three-dimensional graphic simulations can help students and researchers alike understand the structures of complex molecules at the nanoscale, thus bridging between nanoscience and molecular biology, including genomics and the study of the structures inside the living cell. In trying to understand the behavior of protein molecules, virtual reality (VR)
may incorporate sonification (the use of sounds to represent data and information) in which a buzzing sound represents ionization (the dissociation of electrons from atoms and molecules, thus giving them an electric charge), and haptics (relating to the sense of touch) may be used to represent the attraction between atoms by providing a counteracting force when a VR user tries to pull them apart. For data that do not have a natural sensory representation, a combination of psychology and user-centered design, focusing on the needs and habitual thought patterns of scientists, will identify the most successful forms of data visualization, such as information spaces that map across the conceptual territories of adjacent sciences. HCI is relevant not only for analyzing statistical or other data that have already been collected and computerized, but also for operating scientific instruments in real time. Practically every kind of scientific research uses computerized instruments today. Even amateur astronomical telescopes costing under $500 have guidance computers built into them. In the future expensive computerized instruments used in nanoscience, such as atomic force microscopes (tools for imaging individual atoms on a surface, allowing one to see the actual atoms), may provide haptic feedback and three-dimensional graphics to let a user virtually feel and see individual atoms when manipulating them, as if they have been magnified 10 million times. In any branch of science and engineering, HCI-optimized augmented cognition and augmented reality may play a useful role, and after scientists and engineers in different fields become accustomed to the same computer methods for enhancing their abilities, they may find it easier to communicate and thus collaborate with each other. For example, primate cognitive scientists, studying the behavior of baboons, may collaborate with artificial-intelligence researchers, and both can employ augmented reality to compare the behavior of a troop of real animals with a multiagent system designed to simulate them. Internet-based scientific collaboratories can not only provide a research team at one location with a variety of transforming tools, but also let researchers from all around the world become members of the team through telepresence.
Researchers in many diverse sciences already have established a shared data infrastructure, such as international protein structure and genomics databases and the online archives that store thousands of social and behavioral science questionnaire datasets. The development of digital libraries has expanded the range of media and the kinds of content that can be provided to scholars, scientists, and engineers over the Internet. Grid computing, which initially served the supercomputing community by connecting geographically distributed “heavy iron” machines, is maturing into a vast, interconnected environment of shared scientific resources, including data collection instrumentation, information storage facilities, and major storehouses of analytic tools. As more and more research traditions join the grid world, they will come to understand each other better and find progressively more areas of mutual interest. This convergence will be greatly facilitated by advances in human-computer interaction research.
Implications for Computing Because HCI already involves unification of information and cognitive technologies, distinctive effects of convergence will primarily occur in unification with the two other realms: nanotechnology and biotechnology. Nanotechnology is likely to be especially crucial because it offers the promise of continued improvement in the performance of computer components. Already a nanoscale phenomenon called the “giant magnetoresistance” (GMR) effect has been used to increase the data density on mass production computer hard disks, giving them much greater capacity at only slight cost. The two key components of a computer hard disk are a rotatable magnetic disk and a read-and-write head that can move along the radius of the disk to sense the weak magnetism of specific tiny areas on the disk, each of which represents one bit (a unit of computer information equivalent to the result of a choice between two alternatives) of data. Making the active tip of the read-and-write head of precisely engineered materials constructed in thin (nanoscale) layers significantly increases its sensitivity. This sensitivity, in turn, allows the disk to be formatted into a larger number of smaller areas, thereby increasing its capacity.
Since the beginning of the human-computer interaction field, progress in HCI has depended not only on the achievements of its researchers, but also on the general progress in computer hardware. For example, early in the 1970s the Alto computer pioneered the kind of graphic user interface employed by essentially all personal computers at the end of the twentieth century, but its memory chips were too expensive, and its central processing unit was too slow. A decade later the chips had evolved to the point where Apple could just barely market the Macintosh, the first commercially successful computer using such an interface. Today many areas of HCI are only marginally successful, and along with HCI research and development, increased power and speed of computers are essential to perfect such approaches as virtual reality, real-time speech recognition, augmented cognition, and mobile computing. Since the mid-1960s the density of transistors on computer chips has been doubling roughly every eighteen months, and the cost of a transistor has been dropping by half. So long as this trend continues, HCI can count on increasingly capable hardware. At some point, possibly before 2010, manufacturers will no longer be able to achieve progress by cramming more and more components onto a chip of the traditional kind. HCI progress will not stop the next day, of course, because a relatively long pipeline of research and development exists and cannot be fully exploited before several more years pass. Progress in other areas, such as parallel processing and wireless networking, will still be possible. However, HCI would benefit greatly if electronic components continued to become smaller and smaller because this miniaturization means they will continue to get faster, use progressively less power, and possibly also be cheaper. Here is where nanotechnology comes in. Actually, the transistors on computer chips have already shrunk into the nanoscale, and some of them are less than 50 nanometers across. However, small size is only one of the important benefits of nanotechnology. Equally important are the entirely new phenomena, such as GMR, that do not even exist at larger scales. Nanotechnologists have begun exploring alternatives to the conventional microelectronics that we have been using for decades, notably molecular logic gates (components made of individual molecules
that perform logical operations) and carbon nanotube transistors (transistors made of nanoscale tubes composed of carbon). If successful, these radically new approaches require development of an entire complex of fresh technologies and supporting industries; thus, the cost of shifting over to them may be huge. Only a host of new applications could justify the massive investments, by both government and industry, that will be required. Already people in the computer industry talk of “performance overhang,” the possibility that technical capabilities have already outstripped the needs of desirable applications. Thus, a potential great benefit for HCI becomes also a great challenge. If HCI workers can demonstrate that a range of valuable applications is just beyond the reach of the best computers that the old technology can produce, then perhaps people will have sufficient motivation to build the entire new industries that will be required. Otherwise, all of computer science and engineering may stall. During the twentieth century several major technologies essentially reached maturity or ran into social, political, or economic barriers to progress. Aircraft and automobiles have changed little in recent years, and they were certainly no faster in 2000 than in 1960. The introduction of high-definition television has been painfully slow, and applications of haptics and multimodal augmented reality outside the laboratory move at a snail’s pace. Space flight technology has apparently stalled at about the technical level of the 1970s. Nuclear technology has either been halted by technical barriers or blocked by political opposition, depending on how one prefers to analyze the situation. In medicine the rate of introduction of new drugs has slowed, and the great potential of genetic engineering is threatened by increasing popular hostility. In short, technological civilization faces the danger of stasis or decline unless something can rejuvenate progress. Technological convergence, coupled with aggressive research at the intersections of technical fields, may be the answer. Because HCI is a convergent field itself and because it can both benefit from and promote convergence, HCI can play a central role. In addition to sustaining progress as traditionally defined, convergence enables entirely new
applications. For example, nanotechnology provides the prospect of developing sensors that can instantly identify a range of chemicals or microorganisms in the environment, and nano-enabled microscale sensor nets can be spread across the human body, a rain forest, and the wing of an experimental aircraft to monitor their complex systems of behavior.
Paradigm Transformation Convergence is not just a matter of hiring a multidisciplinary team of scientists and engineers and telling them to work together. To do so they need effective tools, including intellectual tools such as comprehensive theories, mathematical techniques for analyzing dynamic systems, methods for visualizing complex phenomena, and well-defined technical words with which to talk about them. Decades ago historian Thomas Kuhn described the history of science as a battle between old ways of thought and new paradigms (frameworks) that may be objectively better but inevitably undergo opposition from the old-guard defenders of the prevailing paradigm. His chief example was the so-called Copernican Revolution in astronomy, when the notion that the Earth is the center of the universe was displaced by a new notion that the sun is the center of the solar system and of a vast, centerless universe far beyond. The problem today is that many paradigms exist across all branches of science and engineering. Some may be equivalent to each other, after their terms are properly translated. Others may be parts of a larger intellectual system that needs to be assembled from them. However, in many areas inferior paradigms that dominate a particular discipline will need to be abandoned in favor of one that originated in another discipline, and this process is likely to be a hard-fought and painful one taking many years. The human intellectual adventure extends back tens of thousands of years. In their research and theoretical work on the origins of religion, Rodney Stark and William Sims Bainbridge observed that human beings seek rewards and try to avoid costs— a commonplace assumption in economics and other branches of social science. To solve the problems they faced every day, ancient humans sought
explanations—statements about how and why rewards may be obtained and costs are incurred. In the language of computer science, such explanations are algorithms. Some algorithms are very specific and apply only under certain narrowly defined circumstances. If one wants meat, one takes a big stick from the forest, goes into the meadow, and clobbers one of the sheep grazing there. If one wants water, one goes to the brook at the bottom of the valley. These are rather specific explanations, assuming that only one meadow, one kind of animal, one brook, and one valley exist. As the human mind evolved, it became capable of working out much more general algorithms that applied to a range of situations. If one wants meat, one takes a club, goes to any meadow, and sees what one can clobber there. If one wants water, the bottoms of deep valleys are good places to look. In the terms of artificial intelligence, the challenge for human intelligence was how to generalize, from a vast complexity of experience, by reasoning from particular cases to develop rules for solving particular broad kinds of problems. Stark and Bainbridge noted how difficult it is for human beings to invent, test, and perfect very general explanations about the nature of the universe and thereby to find empirically good algorithms for solving the problems faced by our species. In other words, science and technology are difficult enterprises that could emerge only after ten thousand years of civilization and that cannot be completed for many decades to come. In the absence of a desired reward, people often will accept algorithms that posit attainment of the reward in the distant future or in some other non-verifiable context. Thus, first simple magic and then complex religious doctrines emerged early in human history, long before humans had accurate explanations for disease and other disasters, let alone effective ways of dealing with them. If the full convergence of all the sciences and technologies actually occurs, as it may during the twenty-first century, one can wonder what will become not only of religion but of all other forms of unscientific human creativity, what are generally called the “humanities.” The U.S. entomologist and sociobiologist Edward O. Wilson has written about the convergence that
is far advanced among the natural sciences, calling it “consilience,” and has wondered whether the humanities and religion will eventually join in to become part of a unified global culture. Here again human-computer interaction may have a crucial role to play because HCI thrives exactly at the boundary between humans and technology. During the first sixty years of their existence, computers evolved from a handful of massive machines devoted to quantitative problems of engineering and a few physical sciences to hundreds of millions of personal tools, found in every school or library, most prosperous people’s homes, and many people’s pockets. Many people listen to music or watch movies on their computers, and thousands of works of literature are available over the Internet. A remarkable number of digital libraries are devoted to the humanities, and the U.S. National Endowment for the Humanities was one of the partner agencies in the Digital Library Initiative led by the U.S. National Science Foundation. The same HCI methods that are used to help scientists visualize complex patterns in nature can become new ways of comprehending schools of art, tools for finding a desired scholarly reference, or even new ways of creating the twenty-second-century equivalents of paintings, sculptures, or symphonies. The same virtual reality systems that will help scientists collaborate across great distances can become a new electronic medium, replacing television, in which participants act out roles in a drama while simultaneously experiencing it as theater. Cyberinfrastructure resources such as geographic information systems, automatic language translation machines, and online recommender systems can be used in the humanities as easily as in the sciences. The conferences and growing body of publications devoted to converging technologies offer a picture of the world a decade or two in the future when information resources of all kinds are available at all times and places, organized in a unified but malleable ontology, and presented through interfaces tailored to the needs and abilities of individual users. Ideally, education from kindergarten through graduate school will be organized around a coherent set of concepts capable of structuring reality in ways that are simultaneously accurate and congenial to
human minds of all ages. Possibly no such comprehensive explanation of reality (an algorithm for controlling nature) is possible. Or perhaps the intellectuals and investors who must build this future world may not be equal to the task. Thus, whether it succeeds or fails, the technological convergence movement presents a huge challenge for the field of human-computer interaction, testing how well we can learn to design machines and information systems that help humans achieve their maximum potential. William Sims Bainbridge See also Augmented Cognition; Collaboratories FURTHER READING Atkins, D. E., Drogemeier, K. K., Feldman, S. I., Garcia-Molina, H., Klein, M. L., Messerschmitt, D. G., Messina, P., Ostriker, J. P., & Wright, M. H. (2003). Revolutionizing science and engineering through cyberinfrastructure. Arlington, VA: National Science Foundation. Kuhn, T. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press. Roco, M. C., & Bainbridge, W. S. (2001). Societal implications of nanoscience and nanotechnology. Dordrecht, Netherlands: Kluwer. Roco, M. C., & Bainbridge, W. S. (2003). Converging technologies for improving human performance. Dordrecht, Netherlands: Kluwer. Roco, M. C., & Montemagno, C. D. (Eds.). (2004). The coevolution of human potential and converging technologies. Annals of the New York Academy of Sciences, 1013. New York: New York Academy of Sciences. Stark, R., & Bainbridge, W. S. (1987). A theory of religion. New York: Toronto/Lang. Wilson, E. O. (1998). Consilience: The unity of knowledge. New York: Knopf.
CYBERCOMMUNITIES For many people, the primary reason for interacting with computers is the ability, through computers, to communicate with other people. People form cybercommunities by interacting with one another through computers. These cybercommunities are conceived of as existing in cyberspace, a conceptual realm created through the networking and interconnection that computers make possible.
Cybercommunity Definition and History The prefix cyber first appeared in the word cybernetics, popularized by Norbert Wiener (1894–1964) in the 1940s to refer to the science of “control and communication in the animal and the machine” (the subtitle of Wiener’s 1948 book Cybernetics). Since that time, cyber has prefixed many other words to create new terms for various interconnections between computers and humans. One of the terms, cyberspace, has become a popular metaphor for the perceived location of online interactions. Coined by William Gibson in his 1984 novel Neuromancer, cyberspace originally referred to a graphical representation of computerized data to which people connected through direct electrical links to the brain. Since then, the term has come to mean any virtual forum in which people communicate through computers, whether the form of communication involves text, graphics, audio, or combinations of those. Cybercommunities predate widespread use of the Internet, with the first forming in localized systems called bulletin board services (BBSs). BBSs usually ran on a single computer, and participants connected through modems and a local phone line. This meant that most participants lived within a limited geographical area. Thus many BBSs were able to hold occasional face-to-face get-togethers, enhancing community relationships. Communication on BBSs was usually asynchronous; that is, people logged on at different times and posted messages in various topical forums for others to read and respond to later. (E-mail and similar bulletin boards now available on the World Wide Web are also asynchronous forms of online communication, while the various types of online chat and instant messaging are considered to be synchronous forms of communication, since participants are present on a forum simultaneously and can spontaneously respond to each other’s communications.) From the earliest days of the Internet and its military-funded precursor, the Arpanet (established in 1969), online participants began forming cybercommunities. E-mail immediately emerged as the largest single use of the Internet, and remained so until 2002, when it was matched by use of the World Wide Web (an information-exchange service
available on the Internet first made available in 1991 and given a graphical interface in 1993). People also began using the Internet in the early 1980s to run bulletin board services such as Usenet, which, unlike the earlier local BBSs, could now be distributed to a much larger group of people and accessed by people in widely dispersed geographical locations. Usenet expanded to include many different cybercommunities, most based around a common interest such as Linux programming or soap operas. Existing Cybercommunities Cybercommunities have risen in number with the increasing availability and popularity of the Internet and the World Wide Web. Even within the short overall history of cybercommunities, some cybercommunities have been short-lived. However, there are several, begun in the early days of computer networking, that still exist online and therefore present a useful view of factors involved in the formation and maintenance of online communities. One of the oldest still-extant cybercommunities is The WELL, which began in 1985 as a local BBS in the San Francisco Bay Area in California. Laurence Brilliant, a physician with an interest in computer conferencing, and Stewart Brand, editor of the Whole Earth Review and related publications, founded The WELL with the explicit goal of forming a virtual community. One savvy method the founders used to attract participants was to give free accounts to local journalists, many of whom later wrote about their participation, generating further interest and publicity. In the early years, when most participants lived in the same geographical area, The WELL held monthly face-to-face meetings. Currently owned by Salon.com, The WELL is now accessible through the World Wide Web. Another venerable cybercommunity, LambdaMOO, also began as an experiment in online community. In contrast to The WELL, LambdaMOO provided a forum for synchronous communication and allowed people to create a virtual environment within which to interact. LambdaMOO is an example of a type of program called a MUD (for multiuser dimension or multiuser dungeon). MUDs are similar to online chatrooms, but also allow
participants to create their own virtual spaces and objects as additions to the program, with which they and others can then interact. These objects enhance the feel of being in a virtual reality. Created by the computer scientist Pavel Curtis as a research project for Xerox, LambdaMOO opened in 1990. A 1994 article about it in Wired magazine led to a significant increase in interest in it and to dramatic population growth. Pavel Curtis has moved on to other projects, and LambdaMOO is no longer associated with Xerox. But although it has undergone considerable social changes over the years, it still attracts hundreds of participants. MUDs began as interactive text-based roleplaying games inspired by similar face-to-face roleplaying games such as Dungeons and Dragons (hence dungeon in one expansion of the acronym). More recently, similar online games have become available with the enhancement of a graphical interface. People have used MMORPGs (massively multiplayer online role-playing games) such as Everquest as forums for socializing as well as gaming, and cybercommunities are forming amongst online gamers. Web logs, or blogs, are a relatively new and increasingly popular platform for cybercommunities. Blogs are online journals in which one can post one’s thoughts, commentary, or reflections, sometimes also allowing others to post comments or reactions to these entries. Many blogs provide a forum for amateur (or, in some cases, professional) journalism, however others more closely resemble online personal diaries. There are many different programs available for blogging, and some give specific attention to community formation. LiveJournal, for instance, enables each participant to easily gather other journals onto a single page, making it easy to keep up with friends’ journals. Links between participants are also displayable, enabling people to see who their friends’ friends are and to easily form and expand social networks. Community Networks Some cybercommunities grow out of existing offline communities. In particular, some local municipalities have sought to increase citizen participation in
Welcome to LambdaMOO
Below is an introduction to the cybercommunity LambdaMOO, as presented on www.lamdamoo.info:

LambdaMOO is sort of like a chat room. It's a text-only based virtual community of thousands of people from all over the world. It's comprised of literally thousands of "rooms" that have been created by the users of LambdaMOO, and you endlessly navigate (walk around) north, south, etc. from room to room, investigating, and meeting people that you can interact with to your hearts content. You get there not thru an HTML browser like Netscape or IE but through another program called TELNET (search). Your computer most likely has Telnet but enhanced versions can be found. (Telnet address: telnet://lambda.moo.mud.org:8888/). You can try the Lambda button at the top of this page to see if all goes well. If so, a window will open and you'll be able to log in. When you get the hang of it, you can create a character who has a name and a physical description, and who can be "seen" by all who meet you. As you walk around from room to room you are given a description of the room and a list of contents (including other people). You can "look" at each person to get a more detailed description and when you do, they see a message stating that you just checked them out. You can talk to them and they see your words in quotes, like reading spoken words in a book. You can also emote (communicate with body language) using gestures such as a smile or a nod of the head. In time you'll learn to create your own rooms or other objects, which are limited only by your imagination. There are many people to meet up with and build "cyber-friendships" with. When you first get there you'll be asked to log in. First timers can sign in as a guest. After that you can apply for a permanent character name and password. Give it a try and if you see me around say hi.

Felis~Rex
Status: Programmer/108/33%Fogy/PC Parent: Psychotic Class of Players Seniority: 1320/4093, (33%) MOO-age: 108 months. (1995 January 28, Saturday)
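The connection instructions in the sidebar amount to opening a plain-text session on a TCP port, which is why any telnet program works. As a rough illustration only, the short Python sketch below does the same thing with the standard socket library; the host and port are taken from the telnet address quoted above, while the exact guest login command ("connect guest") and the server's continued availability are assumptions rather than details drawn from this article.

import socket

# Host and port from the telnet address quoted in the sidebar (assumed still valid).
HOST, PORT = "lambda.moo.mud.org", 8888

with socket.create_connection((HOST, PORT), timeout=10) as conn:
    # A MOO greets every new connection with a plain-text welcome banner.
    print(conn.recv(4096).decode("utf-8", errors="replace"))

    # All interaction is lines of text; first-time visitors sign in as guests
    # (the exact command below is an assumption).
    conn.sendall(b"connect guest\r\n")
    print(conn.recv(4096).decode("utf-8", errors="replace"))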
the local government and community by forming community computer networks that allow people access to government officials and provide forums for community discussion. The first of these, the Public Electronic Network (PEN), started in Santa Monica, California, in 1989. It was particularly successful in providing access to low-income citizens who might not otherwise have had access to computers or computer networks. In more recent years, some offline communities have gone beyond providing an online forum specifically related to the community and have also sought to promote computer use and connectivity in general. For instance, the town of Blacksburg, Virginia, with the help of Virginia Polytechnic Institute and State University (known as Virginia Tech and located in town) and local businesses, is attempting to provide the infrastructure necessary to bring Internet connectivity to every household in town and in the surrounding rural area. This project, called the Blacksburg Electronic Village (BEV) has had several goals, including expanding the local economy through the promotion of high-tech industry, increasing citizen access to online resources, and promoting a stronger sense of community. Recent evaluations by project leaders indicate that BEV has been more successful in the first two areas than in the third. Access Issues As the BEV project recognized, in order to participate in cybercommunities, people need access to computers and to computer networks, especially to the Internet and the World Wide Web. Although such access has been expanding rapidly, people in poorer nations and disadvantaged populations in more affluent countries still have limited access to the Internet, if they have it at all. This issue has been particularly salient for community networks, which are often created with the specific goal of making it possible for disadvantaged groups to access and influence their local governmental structures. Thus many community networks, in addition to setting up websites, have provided publicly accessible terminals for the use of those who do not have access to computers at home or at work.
As more and more online sites use multimedia and bandwidth-intensive enhancements (that is, enhancements that can only be successfully transmitted across a wide range—or band—of electromagnetic frequencies), speed of access has also become a crucial issue. People with older equipment—slower modems and computer processors—are disadvantaged in their ability to access online materials, especially at multimedia sites. Some governments, notably in South Korea and Japan, have sought to address that problem by subsidizing the development of broadband networks, enabling widespread relatively inexpensive access in those countries to high-speed Internet connections. In addition to access to equipment and networks, people need the skills that enable them to use that access. Research has also shown that people are unlikely to take advantage of the availability of computers and the Internet if they do not consider computer-related activities useful and do not have social support for such activities from people they know, especially their peers. This is particularly apparent in wealthier nations such as the United States, where the usefulness of and accessibility to online resources is taken for granted by more affluent members of society but where such online resources are less likely to be perceived as desirable by members of less affluent communities. To address that problem, several nonprofit groups in the United States have set up community computing centers in poorer neighborhoods, where they provide both training in necessary computer skills and a communitybased context for valuing such skills. Another approach to broadening community access to the Internet has been to integrate Internet connections into the construction of new buildings or entire neighborhoods. However, these types of developments also benefit only those who can afford to buy into them. Interfaces The direct-brain interfaces envisioned by Gibson, if possible at all, are likely many years in the future (although there have been some promising early experiments in recent years, including one in which a blind person was given partial sight through a video feed wired to the optical nerve). Most people cur-
rently access and participate in cybercommunity through personal computers. Usually, these computers are connected to the Internet by a modem or other wired connection to an Internet service provider. However, wireless services are increasing, and in some countries, most notably Japan, cell phones are commonly used to access the Internet and to communicate textually with others. In other countries, including the United States, people are also beginning to use cell phones and personal digital assistants (PDAs) for these purposes. Most communication in cybercommunities occurs through text, although some forums use graphics or voice communication, often supplemented by text. Some of the oldest existing cybercommunities are still text-only and therefore require a high level of literacy as well as comfort with computers. Early text-based forums were not always particularly easy to use, either. The WELL’s original interface was notoriously difficult to work with. This meant that only those with an understanding of computers and a strong interest in the possibilities of cybercommunity had the motivation and ability to participate. Currently, The WELL has a much more accessible Web interface and a concomitantly more diverse population of users. As available Internet bandwidth and computer processing speeds have increased, cybercommunities are able to use graphical representations of people and objects within the cyberspace. One of the earliest examples, from 1985, was Habitat, a role-playing game and socializing space that emulated an offline community. Habitat featured a local economy (based on points rather than real money) and such social structures as a church and sheriff ’s office. Habitat used two-dimensional cartoon-like drawings to represent people and objects within the forum. Many current graphical worlds also use flat cartoon-type representations. Habitat originated the use of the term avatar to refer to the representation of people in such graphical worlds, and that term has persisted in most such systems. The technical difficulties inherent in rendering three-dimensional spaces through which characters can move and in which people can manipulate virtual objects, along with the high level of computer processing power required to make possible three-
dimensional virtual spaces, has slowed the development of cybercommunities using three-dimensional spaces and avatars. One such community, Active Worlds (introduced in 1995), provides a three-dimensional view of the environment similar to those first used in first-person shooter computer games (games in which you see on the screen what your character sees, rather than watching your character move about) such as Doom and Quake. Technical considerations, including the simple problem of the amount of "real estate" available on a computer screen, meant that participants in the early years of Active Worlds could see only the twelve closest avatars. This contrasts with text-only interactive forums such as MUDs and chat, in which thirty to fifty participants can be simultaneously involved in overlapping textual conversations. Graphical interfaces provide both limitations and enhancements to online communications.

Identity in Cybercommunities

Aside from the more technical aspects of interface design, cybercommunities have also had to grapple with the question of self-representation. How do participants appear to one another? What can they know about one another at the outset, and what can they find out? How accountable are cybercommunity members for their words and behavior within the virtual space? In purely text-based systems such as chat forums or MUDs, participants are generally expected to provide some sort of description or personal information, although on some systems it is understood that this information may be fanciful. On LambdaMOO, for instance, many participants describe themselves as wizards, animals, or creatures of light. However, each participant is required to choose a gender for their character, partly in order to provide pronoun choice for text generated by the MUD program, but also as an indication of the assumed importance of this aspect of identity. In a divergence from real life, LambdaMOO provides ten choices for gender identification. Despite this, most participants choose either male or female. LambdaMOO participants choose what other personal information they wish to reveal. On other MUDs, especially those intended as professional spaces or as forums for discussions
relating to “real life” (offline life), participants may be required to provide e-mail addresses or real names. In graphical forums, participants are represented both by the textual information they provide about themselves and by their avatar. Design choices involved in avatar creation in different virtual spaces often reveal important underlying social assumptions, as well as technical limitations. In the early years on Active Worlds, for instance, participants were required to choose from a limited number of existing predesigned avatars. In part this stemmed from the difficulties of rendering even nominally human-seeming avatars in the three-dimensional space. However, the particular avatars available also revealed biases and assumptions of the designers. In contrast to MUDs such as LambdaMOO, all avatars were human. At one point, participants exploited a programming loophole to use other objects, such as trees and walls, as personal representations, but this loophole was quickly repaired by the designers, who felt strongly that human representations promoted better social interaction. Active Worlds’ avatars also displayed a very limited range of human variation. Most were white, and the few non-white avatars available tended to display stereotypical aspects. For instance, the single Asian avatar, a male, used kung-fu moves, the female avatars were all identifiable by their short skirts, and the single black male avatar sported dreadlocks. Since then, programming improvements and feedback from users has enabled Active Worlds to improve their graphics (the avatars now have distinct facial features) and expand their representational offerings. In two-dimensional graphical environments such as Worlds Away (introduced in 1995), variation tended to be greater from the beginning, and participants were given the ability to construct avatars from components. They could even change avatar appearance at will by (for example) buying new heads from the “head shop.” In some systems, participants can also import their own graphics to further customize their online self-representation. Cybercommunities with greater ties to offline communities also have to deal with interface and representations issues. In order to provide community access to as wide a range of townspeople as possible, networks such as PEN need an easy-to-use inter-
face that can recover well from erroneous input. Names and accountability are another crucial issue for community forums. PEN found that anonymity tended to facilitate and perhaps even encourage “flaming” (caustic criticism or verbal abuse) and other antisocial behavior, disrupting and in some cases destroying the usefulness of the forums for others. Conflict Management and Issues of Trust Cybercommunities, like other types of communities, must find ways to resolve interpersonal conflicts and handle group governance. In the early years of the Internet, users of the Internet were primarily white, male, young, and highly educated; most were connected to academic, government, or military institutions, or to computing-related businesses. However, in the mid-1990s the Internet experienced a great increase in participation, especially from groups who had previously been on private systems not connected to the Internet, notably America Online (AOL). This sudden change in population and increase in diversity of participants created tensions in some existing cybercommunities. In one now-famous Usenet episode in 1993, participants in a Usenet newsgroup called alt.tasteless, a forum for tasteless humor frequented primarily by young men, decided to stage an “invasion” of another newsgroup, rec.pets.cats, whose participants, atypically for Usenet newsgroups at the time, were largely women, older than Usenet participants in general, and in many cases relatively new to the Internet. The alt.tasteless participants flooded rec.pets.cats with gross stories of cat mutilation and abuse, disrupting the usual discussions of cat care and useful information about cats. Some of the more computer-savvy participants on rec.pets.cats attempted to deal with the disruption through technical fixes such as kill files (which enable a participant to automatically eliminate from their reading queue messages posted by particular people), but this was difficult for participants with less understanding of the somewhat arcane Usenet system commands. The invaders, meanwhile, found ways around those fixes. The conflict eventually spread to people’s offline lives, with some rec.pets.cats participants receiving physical threats, and at least one alt.tasteless participant having their Internet access terminated for abusive be-
havior. Eventually, the invaders tired of their sport and rec.pets.cats returned to normal. Some newsgroups have sought to avoid similar problems by establishing a moderator, a single person who must approve all contributions before they are posted to the group. In high traffic groups, however, the task of moderation can be prohibitively time-consuming. LambdaMOO experienced a dramatic population surge in the late 1990s, causing not only social tensions, but also technical problems as the LambdaMOO computer program attempted to process the increasing numbers of commands. LambdaMOO community members had to come up with social agreements for slowing growth and for limiting commands that were particularly taxing on the server. For instance, they instituted a limit on the numbers of new participants that could be added each day, started deleting (“reaping”) the characters and other information of participants who had been inactive for several months, and set limits on the number of new virtual objects and spaces that participants could build. This created some tension as community members attempted to find fair ways to determine who would be allowed to build and how much. The solution, achieved through vote by participants, was to create a review board elected by the community that would reject or approve proposed projects. Designers of cybercommunity forums have also had to consider what types of capabilities to give participants and what the social effects of those capabilities might be. For instance, Active Worlds originally did not allow participants to have private conversations that were not visible to all other participants in the same virtual space. The designers felt that such conversations were antisocial and might lead to conflicts. However, participants continued to request a command that enabled such “whispered” conversations, and also implemented other programs, such as instant messaging, in order to work around the forum’s limitations. The designers eventually acquiesced and added a whisper command. Similarly, some MUDs have a command known as mutter. Rather than letting you talk only to one other person, as is the case with whisper, mutter lets you talk to everyone else in the virtual room except a designated person; in other words, it enables you to talk behind a person’s back while that person is
present—something not really possible offline. As a positive contribution, this command can allow community members to discuss approaches to dealing with a disruptive participant. However, the command can also have negative consequences. The use of avatars in graphical forums presents another set of potential conflicts. In the twodimensional space of Worlds Away, participants found that they could cause another participant to completely disappear from view by placing their own avatar directly on top of the other’s. With no available technical fix for this problem, users had to counter with difficult-to-enforce social sanctions against offenders. Trust The potential for conflicts in cybercommunities is probably no greater than that in offline communities. On the one hand, physical violence is not possible online (although in theory escalating online conflicts can lead to offline violence). On the other hand, the difficulty in completely barring offenders from a site (since people can easily reappear using a different e-mail address) and the inability to otherwise physically enforce community standards has increased cybercommunities’ vulnerability to disruption. In some cases, the greater potential for anonymity or at least pseudonymity online has also facilitated antisocial behavior. Many cybercommunities have therefore tried to find ways to enhance trust between community members. Some have sought to increase accountability by making participants’ e-mail addresses or real life names available to other participants. Others have set rules for behavior with the ultimate sanction being the barring of an individual from the forum (sometimes technologically tricky to implement). LambdaMOO, for instance, posts a set of rules for polite behavior. Because it is also one of the most famous (and most documented) cybercommunities, LambdaMOO’s opening screen also displays rules of conduct for journalists and academic researchers visiting the site. LiveJournal requires potential participants to acquire a code from an existing user in order to become a member, which it is hoped ensures that at least one person currently a member of the community
vouches for the new member. LiveJournal is considering abandoning this practice in favor of a complex system of interpersonal recommendations that give each participant a trust rating, theoretically an indication of their trustworthiness and status within the community. Although perhaps not as complex, similar systems are in use at other online forums. Slashdot, a bulletin board service focusing primarily on computer-related topics, allows participants to rank postings and then to filter what they read by aggregated rank. A participant can, for instance, decide to read only messages that achieve the highest average rating, as averaged from the responses of other participants. The online auction site eBay has a feedback system through which buyers and sellers rate one another’s performance after each transaction, resulting in a numerical score for each registered member. Each instance of positive feedback bestows a point, and each instance of negative feedback deletes one. A recent change in the way auctions are displayed now lists a percentage of positive feedback for each seller. Users can also read the brief feedback messages left for other users. These features are intended to allow users to evaluate a person’s trustworthiness prior to engaging in transactions with that person. The degree to which these types of trustpromotion systems work to foster and enhance community is unclear. Participants in various cybercommunities continue to consider issues of trust and to work on technological enhancements to the virtual environment that will help suppress antisocial behavior and promote greater community solidarity. Future Directions As cybercommunities first developed, mainstream media commentary discussed a variety of hyperbolic fears and hopes. People feared that cybercommunities would replace and supplant other forms of community and that cybercommunities were less civilized, with greater potential for rude and antisocial behavior. On the other hand, people also hoped that cybercommunities might provide forms of interconnectedness that had otherwise been lost in modern life. Some people also suggested that cybercommunities could provide a forum in which pre-
vious prejudices might be left behind, enabling a utopian meeting of minds and ideas. So far, it appears that cybercommunities tend to augment rather than supplant people's other social connections. They appear to contain many of the same positive and negative social aspects present in offline communities. Further, many cybercommunities emerge from existing offline groups, also include an offline component (including face-to-face contact between at least some participants), or utilize other technologies such as the telephone to enhance connections. Whatever form cybercommunities take in the future, their presence and popularity from the earliest days of computer networks make it clear that such interconnections will continue to be a significant part of human-computer interaction.

Lori Kendall

See also Avatars; Digital Divide; MUDs

FURTHER READING

Baym, N. K. (2000). Tune in, log on: Soaps, fandom, and online community. Thousand Oaks, CA: Sage.
Belson, K., & Richtel, M. (2003, May 5). America's broadband dream is alive in Korea. The New York Times, p. C1.
Benedikt, M. (Ed.). (1992). Cyberspace: First steps. Cambridge, MA: MIT Press.
Blacksburg Electronic Village. (n.d.). About BEV. Retrieved August 12, 2003, from http://www.bev.net/about/index.php
Cherny, L. (1999). Conversation and community: Chat in a virtual world. Stanford, CA: CSLI Publications.
Damer, B. (1998). Avatars! Berkeley, CA: Peachpit Press.
Dibbell, J. (1998). My tiny life: Crime and passion in a virtual world. New York: Henry Holt and Company.
Gibson, W. (1984). Neuromancer. New York: Ace Books.
Hafner, K. (2001). The Well: A story of love, death & real life in the seminal online community. Berkeley, CA: Carroll & Graf.
Hampton, K. (2001). Living the wired life in the wired suburb: Netville, glocalization and civil society. Unpublished doctoral dissertation, University of Toronto, Ontario, Canada.
Herring, S. C., with D. Johnson & T. DiBenedetto. (1995). "This discussion is going too far!" Male resistance to female participation on the Internet. In M. Bucholtz & K. Hall (Eds.), Gender articulated: Language and the socially constructed self (pp. 67–96). New York: Routledge.
Jones, S. (Ed.). (1995). Cybersociety: Computer-mediated communication and community. Thousand Oaks, CA: Sage.
Jones, S. (Ed.). (1997). Virtual culture: Identity and communication in cybersociety. London: Sage.
Kavanaugh, A., & Cohill, A. (1999). BEV research studies, 1995–1998. Retrieved August 12, 2003, from http://www.bev.net/about/research/digital_library/docs/BEVrsrch.pdf
Kendall, L. (2002). Hanging out in the virtual pub. Berkeley, CA: University of California Press.
Kiesler, S. (1997). Culture of the Internet. Mahwah, NJ: Lawrence Erlbaum Associates.
McDonough, J. (1999). Designer selves: Construction of technologically mediated identity within graphical, multi-user virtual environments. Journal of the American Society for Information Science, 50(10), 855–869.
McDonough, J. (2000). Under construction. Unpublished doctoral dissertation, University of California at Berkeley.
Morningstar, C., & Farmer, F. R. (1991). The lessons of Lucasfilm's Habitat. In M. Benedikt (Ed.), Cyberspace: First steps (pp. 273–302). Cambridge, MA: The MIT Press.
Porter, D. (1997). Internet culture. New York: Routledge.
Renninger, K. A., & Shumar, W. (Eds.). (2002). Building virtual communities. Cambridge, UK: Cambridge University Press.
Rheingold, H. (1993). The virtual community: Homesteading on the electronic frontier. Reading, MA: Addison-Wesley.
Smith, M., & Kollock, P. (Eds.). (1999). Communities and cyberspace. New York: Routledge.
Taylor, T. L. (2002). Living digitally: Embodiment in virtual worlds. In R. Schroeder (Ed.), The social life of avatars: Presence and interaction in shared virtual environments. London: Springer Verlag.
Turkle, S. (1995). Life on the screen: Identity in the age of the Internet. New York: Simon & Schuster.
Wellman, B. (2001). The persistence and transformation of community: From neighbourhood groups to social networks. Report to the Law Commission of Canada. Retrieved August 12, 2003, from http://www.chass.utoronto.ca/~wellman/publications/lawcomm/lawcomm7.htm
Wellman, B., & Haythornthwaite, C. (Eds.). (2002). The Internet in everyday life. Oxford, UK: Blackwell.
Wellman, B., Boase, J., & Chen, W. (2002). The networked nature of community online and offline. IT & Society, 1(1), 151–165.
Wiener, N. (1948). Cybernetics, or control and communication in the animal and the machine. Cambridge, MA: MIT Press.
WELL, The. (2002). About the WELL. Retrieved August 2003, from http://www.well.com/aboutwell.html
CYBERSEX

The term cybersex is a catch-all word used to describe various sexual behaviors and activities performed while on the Internet. The term does not indicate that a particular behavior is good or bad, only that the sexual behavior occurred in the context of the Internet. Examples of behaviors or activities that may be considered cybersex include sexual conversations in Internet chatrooms, retrieving sexual media (for example, photographs, stories, or videos) via the
Internet, visiting sex-related websites, masturbating to sexual media from the Internet, engaging in sexualized videoconferencing activities, creating sexual materials for use/distribution on the Internet, and using the Internet to obtain/enhance offline sexual behaviors. A broader term used to describe Internet sexual behavior is “online sexual activity” (OSA), which includes using the Internet for any sexual purpose, including recreation, entertainment, exploration, or education. Examples of OSA are using online services to meet individuals for sexual /romantic purposes, seeking sexual information on the Internet (for instance, about contraception and STDs), and purchasing sexual toys/paraphernalia online. What distinguishes cybersex from OSA is that cybersex involves online behaviors that result in sexual arousal or gratification, while other online sexual activities may lead to offline sexual arousal and gratification. Sexual arousal from cybersex is more immediate and is due solely to the online behavior.
Venues Many people assume that the World Wide Web is the main venue for cybersex. In fact, the Web represents only a small portion of the places where cybersex activities can occur. Other areas of the Internet where cybersex may take place include the following: ■
Newsgroups— This area serves as a bulletin board where individuals can post text or multimedia messages, such as sexual text, pictures, sounds, and videos; ■ E-mail—E-mail can be used for direct communication with other individuals or groups of individuals. In the case of cybersex, the message may be a sexual conversation, story, picture, sound, or video; ■ Chatrooms—Both sexualized conversation and multimedia can be exchanged in chatrooms. Casual users are familiar with Web-based chatting such as Yahoo Chat or America Online (AOL) Chat. Most Web-based chat areas have sections dedicated to sexual chats. However, the largest chat-based system is the Internet Relay Chat (IRC), an area largely unfamiliar to most casual users. In addition to text-based chatting,
IRC contains a number of chatrooms specifically dedicated to the exchange of pornography through “file servers”; ■ Videoconferencing/Voice Chatting—The use of these areas is rapidly increasing. As technology improves and connection speeds increase, the use of the Internet for “live” cybersex sessions will become commonplace. Videoconferencing combined with voice chat constitutes a high-tech version of a peep show mixed with an obscene phone call; and ■ Peer-to-Peer File Sharing—Software packages such as Napster and Kazaa have made file sharing a popular hobby. Casual users of this software know its use for exchanging music files, but any file can be shared on the network, including sexual images, sounds, and videos.
Statistics Although the term cybersex often has negative connotations, research in this area suggests that nearly 80 percent of individuals who engage in Internet sex report no significant problems in their lives associated with their online sexual activities. Although this may be an underestimate since the research relied on the self-reports of respondents, it is safe to assume that the majority of individuals who engage in cybersex behavior report this activity to be enjoyable and pleasurable, with few negative consequences. However, there are individuals who engage in cybersex who do report significant negative consequences as a result of their online sexual behavior. These individuals often report that their occupational, social, or educational life areas have been negatively impacted or are in jeopardy as a result of their sexual use of the Internet. Often these individuals report a sense of being out of control or compulsive in their sexual use of the Internet and often compare it to addictions like gambling, eating, shopping, or working. Several large-scale studies estimate the percentage of individuals who are negatively impacted by cybersex behaviors. While exact numbers are impossible given the size of the Internet, estimates are
that from 11 to 17 percent of individuals who engaged in cybersex report some consequences in their life and score moderately high on measures of general sexual compulsivity. In addition, approximately 6 percent report feeling out of control with their Internet sexual behavior and scored high on measures of sexual compulsivity.
Healthy Versus Problematic Cybersex One of the difficulties in defining cybersex as either healthy or problematic is the fact that there are few agreed-upon definitions about what constitutes sexually healthy behavior. Society has clearly delineated some behaviors as unhealthy, for example, sex with children or other non-consenting partners. However, people disagree about whether masturbation, multiple affairs, bondage, and fetishes are healthy or unhealthy. In the world of cybersex, these same gray areas exist between healthy and unhealthy and are often even more difficult to define since the behavior does not include actual sexual contact. It is also important not to assume that frequency is the key factor in determining whether an individual is engaged in unhealthy cybersex. Some individuals engage in cybersex at a high frequency and have few problems, while others who engage in it only a few hours a week have significant negative consequences. Physician and researcher Jennifer Schneider proposed three criteria to help determine if someone’s behavior has become compulsive—that is, whether the person has crossed the line from a “recreational” to a “problematic” user of cybersex. The three criteria are (1) loss of freedom to choose whether to stop the behavior; (2) negative consequences as a result of the behavior; and (3) obsessive thinking about engaging in the behavior. The Internet Sex Screening Test (ISS) described by counseling professor David Delmonico and professor of school psychology Jeffrey Miller can be used to conduct initial screening of whether an individual has a problem with cybersex.
The Appeal of the Internet With an estimated 94 million users accessing it regularly, it is difficult to dispute the Internet’s widespread appeal. In 2001 Delmonico, Moriarity, and marriage and family therapist Elizabeth Griffin, proposed a model called “the Cyberhex” for understanding why the Internet is so attractive to its users. Their model lists the following six characteristics: Integral: The Internet is nearly impossible to avoid. Even if a cybersex user decided to never use the Internet again, the integral nature of the Internet would make that boundary nearly impossible, since many need the Internet for work, or to access bank information, and so on. In addition, public availability, the use of e-mail, and other activities like shopping and research make the Internet a way of life that is integrated into our daily routines. Imposing: The Internet provides an endless supply of sexual material 7 days a week, 365 days a year. The amount of information and the imposing nature of marketing sexual information on the Internet contributes to the seductiveness of the world of cybersex. Inexpensive: For a relatively small fee, twenty to forty dollars per month, a user can access an intoxicating amount of sexual material on the Internet. In the offline world such excursions can be cost-prohibitive to many. Isolating: Cybersex is an isolating activity. Even though interpersonal contact may be made during the course of cybersex, these relationships do not require the same level of social skills or interactions that offline behaviors require. The Internet becomes a world in itself, where it is easy to lose track of time, consequences, and real-life relationships. The isolation of cybersex often provides an escape from the real world, and while everyone takes short escapes, cybersex often becomes the drug of choice to anesthetize any negative feelings associated with real-life relationships. Interactive: While isolating in nature, the Internet also hooks individuals into pseudorelationships. These pseudorelationships often approximate reality without running the risks of real relationships—like emotional and physical vulnerability and intimacy. This close approximation to reality can be fuel for the fantasy life of those who experience problems with their cybersex behaviors.
Cybersex Addiction
The Center for Online and Internet Addiction (www.netaddiction.com) offers the following test to help diagnose cybersex addiction:
1. Do you routinely spend significant amounts of time in chat rooms and private messaging with the sole purpose of finding cybersex? 2. Do you feel preoccupied with using the Internet to find on-line sexual partners? 3. Do you frequently use anonymous communication to engage in sexual fantasies not typically carried out in real-life? 4. Do you anticipate your next on-line session with the expectation that you will find sexual arousal or gratification? 5. Do you find that you frequently move from cybersex to phone sex (or even real-life meetings)? 6. Do you hide your on-line interactions from your significant other? 7. Do you feel guilt or shame from your on-line use? 8. Did you accidentally become aroused by cybersex at first, and now find that you actively seek it out when you log on-line? 9. Do you masturbate while on-line while engaged in erotic chat? 10. Do you provide less investment with your real-life sexual partner only to prefer cybersex as a primary form of sexual gratification? Source: Are you addicted to cybersex. Center for Online and Internet Addiction. Retrieved March 23, 2004, from http://www.netaddiction.com/ resources/cybersexual_addiction_test.htm
Intoxicating: This is what happens when the preceding five elements are added together. This combination makes for an incredibly intoxicating experience that is difficult for many to resist. The intoxication of the Internet is multiplied when cybersex is involved since behaviors are reinforced with one of the most powerful rewards, sex. Any single aspect of the Internet can be powerful enough to entice a cybersex user. However, it is typically a combination of these six factors that draws problematic cybersex users into their rituals
and leads to their loss of control over their cybersex use.
Special Populations Engaged in Cybersex The following subgroups of cybersex users have been studied in some detail: Males and Females: In the early to mid-1990s there were three times as many males online as females. Recent research shows that the gap has closed and that the split between male and female Internet users is nearly fifty-fifty. As a result, research on cybersex behavior has also included a significant number of females who engage in cybersex. Most of this research suggests that men tend to engage in more visual sex (for example, sexual media exchange), while women tend to engage in more relational sex (for example, chatrooms and e-mail). Females may find the Internet an avenue to sexual exploration and freedom without fear of judgment or reprisal from society. In this way, the Internet can have genuine benefits. Gays and Lesbians: Researchers have reported that homosexuals tend to engage in cybersex at higher levels than heterosexuals, which may be because they don’t have to fear negative cultural responses or even physical harm when they explore sexual behaviors and relationships on the Internet. Some homosexuals report that cybersex is a way to engage in sexual behavior without fear of HIV or other sexually transmitted diseases. By offering homosexuals a safe way to explore and experience their sexuality, the Internet gives them freedom from the stigma often placed on them by society. Children and Adolescents: Studies conducted by AOL and Roper Starch revealed that children use the Internet not only to explore their own sexuality and relationships, but also to gather accurate sexual health information. Since many young adults have grown up with the Internet, they often see it through a different lens than adults. Children, adolescents, and young adults use the Internet to seek answers to a multitude of developmental questions, including sexuality, which they may be afraid to address directly with other adults. Although the Internet can
be useful in educating children and adolescents about sexuality, it can also be a dangerous venue for the development of compulsive behavior and victimization by online predators. Although the effect of hardcore, explicit pornography on the sexual development of children and adolescents has yet to be researched, early exposure to such pornography may impact their moral and sexual development. Physically or Developmentally Challenged People: Only recently have questions been raised about the appropriate use of the Internet for sexual and relational purposes among physically challenged individuals. This area warrants more research and exploration, but initial writings in this area suggest that the Internet can confer a tremendous benefit for sexual and relationship exploration for persons with disabilities. While sex on the Internet can be a positive experience for these subpopulations, it can also introduce the people in these groups to the same problems associated with cybersex that other groups report.
Implications Cybersex is changing sexuality in our culture. The positive side is that sexual behavior is becoming more open and varied, and better understood. The negative implications are that sexuality may become casual, trivial, and less relational. The pornography industry continues to take advantage of the new technologies with the primary goal of profit, and these new technologies will allow for faster communication to support better video and voice exchanges. The eventual development of virtual reality technologies online will further enhance the online sexual experience, and perhaps make the sexual fantasy experience more pleasurable than real life. These technological advances will continue to alter the way we interact and form relationships with others. Researchers are just starting to realize the implications of sex on the Internet. Theories like Cyberhex are helpful in understanding why people engage in cybersex, but the best methods for helping those struggling with cybersex have yet to be discovered. However, society will continue to be impacted by the Internet and cybersex. Parents, teachers, and others who
have not grown up with the Internet will fail future generations if they discount the significant impact it can have on social and sexual development. Continued research and education will be necessary to help individuals navigate the Internet and the world of cybersex more safely.

David L. Delmonico and Elizabeth J. Griffin

See also Chatrooms; Cybercommunities

FURTHER READING

Carnes, P. J. (1983). Out of the shadows. Minneapolis, MN: CompCare.
Carnes, P. J., Delmonico, D. L., Griffin, E., & Moriarity, J. (2001). In the shadows of the Net: Breaking free of compulsive online behavior. Center City, MN: Hazelden Educational Materials.
Cooper, A. (Ed.). (2000). Sexual addiction & compulsivity: The journal of treatment and prevention. New York: Brunner-Routledge.
Cooper, A. (Ed.). (2002). Sex and the Internet: A guidebook for clinicians. New York: Brunner-Routledge.
Cooper, A., Delmonico, D., & Burg, R. (2000). Cybersex users, abusers, and compulsives: New findings and implications. Sexual Addiction and Compulsivity: Journal of Treatment and Prevention, 7, 5–29.
Cooper, A., Scherer, C., Boies, S. C., & Gordon, B. (1999). Sexuality on the Internet: From sexual exploration to pathological expression. Professional Psychology: Research and Practice, 30(2), 154–164.
Delmonico, D. L. (1997). Internet sex screening test. Retrieved August 25, 2003, from http://www.sexhelp.com/
Delmonico, D. L., Griffin, E. J., & Moriarity, J. (2001). Cybersex unhooked: A workbook for breaking free of compulsive online behavior. Wickenburg, AZ: Gentle Path Press.
Delmonico, D. L., & Miller, J. A. (2003). The Internet sex screening test: A comparison of sexual compulsives versus non-sexual compulsives. Sexual and Relationship Therapy, 18(3), 261–276.
Roper Starch Worldwide, Inc. (1999). The America Online/Roper Starch Youth Cyberstudy. Author. Retrieved December 24, 2003, from http://www.corp.aol.com/press/roper.html/
Schneider, J. P. (1994). Sex addiction: Controversy within mainstream addiction medicine, diagnosis based on the DSM-III-R and physician case histories. Sexual Addiction & Compulsivity: The Journal of Treatment and Prevention, 1(1), 19–44.
Schneider, J. P., & Weiss, R. (2001). Cybersex exposed: Recognizing the obsession. Center City, MN: Hazelden Educational Materials.
Tepper, M. S., & Owens, A. (2002). Access to pleasure: Onramp to specific information on disability, illness, and other expected changes throughout the lifespan. In A. Cooper (Ed.), Sex and the Internet: A guidebook for clinicians. New York: Brunner-Routledge.
Young, K. S. (1998). Caught in the Net. New York: Wiley.
Young, K. S. (2001). Tangled in the web: Understanding cybersex from fantasy to addiction. Bloomington, IN: 1st Books Library.
CYBORGS

A cyborg is a technologically enhanced human being. The word means cybernetic organism. Because many people use the term cybernetics for computer science and engineering, a cyborg could be the fusion of a person and a computer. Strictly speaking, however, cybernetics is the science of control processes, whether they are electronic, mechanical, or biological in nature. Thus, a cyborg is a person, some of whose biological functions have come under technological control, by whatever means. The standard term for a computer-simulated person is an avatar, but when an avatar is a realistic copy of a specific real person, the term cyclone (a cybernetic clone, or virtual cyborg) is sometimes used.
Imaginary Cyborgs The earliest widely known cyborg in literature, dating from the year 1900, is the Tin Woodman in The Wonderful Wizard of Oz by L. Frank Baum. Originally he was a man who earned his living chopping wood in the forest. He wanted to marry a beautiful Munchkin girl, but the old woman with whom the girl lived did not want to lose her labor and prevailed upon the Wicked Witch of the East to enchant his axe. The next time he went to chop wood, the axe chopped off his left leg instead. Finding it inconvenient to get around without one of his legs, he went to a tinsmith who made a new one for him. The axe then chopped off his right leg, which was then also replaced by one made of tin. This morbid process continued until there was nothing left of the original man but his heart, and when that finally was chopped out, he lost his love for the Munchkin girl. Still, he missed the human emotions that witchcraft and technology had stolen from him, and was ready to join Dorothy on her journey to the Emerald City, on a quest for a new heart. This story introduces one of the primary themes associated with cyborgs: the idea that a person accepts the technology to overcome a disability. That is, the person is already less than complete, and the technology is a substitute for full humanity, albeit an inferior one. This is quite different from assimilating
new technology in order to become more than human, a motive severely criticized by the President’s Council on Bioethics in 2003. A very different viewpoint on what it means to be “disabled” has been expressed by Gregor Wolbring, a professor at the University of Calgary. Who decides the meanings of disability and normality is largely a political issue, and Wolbring argues that people should generally have the power to decide for themselves. He notes the example of children who are born without legs because their mothers took thalidomide during pregnancy, then forced to use poorly designed artificial legs because that makes them look more normal, when some other technology would have given them far better mobility.
C makes it easy to shoot yourself in the foot. C++ makes it harder, but when you do, it blows away your whole leg. Bjarne Stroustrup
A common variation on the disability theme is the hero who suffers a terrible accident, is rebuilt, and becomes a cyborg superhero. A well-known example is The Six Million Dollar Man, a television series that aired 1973–1978 and was based on the 1972 novel Cyborg by Martin Caidin. Test pilot Steve Austin is severely injured in a plane crash, then rebuilt with bionic (biological plus electronic) technology. A spinoff series, The Bionic Woman (1976–1978) focuses on tennis player Jaime Sommers who is similarly disabled in a parachute accident. Both become superhero special agents, perhaps to justify the heavy investment required to insert and maintain their bionics. An especially striking example is the motion picture Robocop (1987). Policeman Alex Murphy lives in a depressing future Detroit, dominated by a single, exploitative corporation. To control the increasingly violent population, the corporation develops robot police possessing overwhelming firepower but lacking the judgment to interact successfully with human beings. Thus, when Murphy is blown to pieces by criminals, the corporation transforms him into a cyborg that combines human judgment with machine power. The corporation denies Murphy the right to be considered human, thereby forcing him to
become its enemy. This reflects a second persistent literary theme associated with cyborgs: They reflect the evils of an oppressive society in which technology has become a tool by which the masters enslave the majority.

By far the most extensive treatment of the idea that cyborg technology is wicked can be found in the Dalek menace from the long-running BBC television series Doctor Who. Sometimes mistaken for robots, Daleks are metal-clad beings that resemble huge salt shakers, wheeled trash cans, or British post boxes. They have been extremely popular villains since their first appearance in 1963. Two low-budget feature films that retold the first TV serials, Dr. Who and the Daleks (1965) and Dr. Who: Daleks Invasion Earth 2150 A.D. (1966), added to their fame. Inside a Dalek's metal shell lurks a helpless, sluggish creature with vestigial claws, yet the combination of biology and technology gives it the possibility of conquering the universe. Their motto describes how they treat all other living creatures: "Exterminate." The secret of their origins is revealed in the 1975 serial, "Genesis of the Daleks." The protagonist of Doctor Who, The Doctor, lands his time machine on the battle-scarred planet Skaro, just as the nuclear war between the Thals and the Kaleds reaches its climax. Davros, the evil (and disabled) Kaled scientist, recognizes that chemical weapons are causing his people to mutate horribly, and rather than resist this trend, he accelerates it, transforming them into the vile Dalek cyborgs.
Real Cyborg Research Since human beings began wearing clothing, the boundary between ourselves and our technology has blurred. Arguably, everybody who wears a wristwatch or carries a cell phone is already a cyborg. But the usual definition implies that a human body has been modified, typically by insertion of some nonbiological technology. In the early years of the twentieth century, when surgeons first gained technological control over pain and infection, many brave or irresponsible doctors began experimenting with improvements to their patients. Sir William Arbuthnot Lane, the British royal physician, theorized that many illnesses were caused by a sluggish movement of food
through the bowels that supposedly flooded the system with poisonous toxins. Diagnosing this chronic intestinal stasis in many cases, Lane performed surgery to remove bands and adhesions, and free the intestines to do their job. Some of his colleagues operated on neurotic patients, believing that moving the abdominal organs into their proper places could alleviate mental disorders. Later generations of doctors abandoned these dangerous and useless procedures, but one of Lane's innovations has persisted. He was the first to "plate" a bone—that is, to screw a supportive metal plate onto a broken bone. Today many thousands of people benefit from artificial hip and knee joints.

In World War I, even before the introduction of antibiotics, rigorous scientific techniques were sufficiently effective to prevent death from infection in most wounded cases, thereby vastly increasing the number of people who survived with horrendous war-caused disabilities. The Carrel-Dakin technique was especially impressive, employing an antiseptic solution of sodium hypochlorite in amazingly rigorous procedures. Suppose a soldier's leg had been badly torn by an artillery shell. The large and irregular wound would be entirely opened up and cleaned. Then tubes would be placed carefully in all parts of the wound to drip the solution very slowly, for days and even for weeks. Daily, a technician would take samples from every part of the wound, examining them under the microscope, until no more microbes were seen and the wound could be sewn up. Restorative plastic surgery and prosthetics could often help the survivors live decent lives.

In the second half of the twentieth century, much progress was achieved with transplants of living tissue—such as kidneys from donors and coronary artery bypass grafts using material from the patient. Inorganic components were also successfully introduced, from heart valves to tooth implants. Pacemakers to steady the rhythm of the heart and cochlear implants to overcome deafness are among the relatively routine electronic components inserted into human bodies, and experiments are being carried out with retina chips to allow the blind to see. There are many difficult technical challenges, notably how to power artificial limbs, how to connect large components to the structure of the human body
safely, and how to interface active or sensory components to the human nervous system. Several researchers, such as Miguel Nicolelis of Duke University, have been experimenting with brain implants in monkeys that allow them to operate artificial arms, with the hope that this approach could be applied therapeutically to human beings in the near future.
Visions of the Future Kevin Warwick, professor of Cybernetics at Reading University in Britain, is so convinced of the near-term prospects for cyborg technology, that he has experimented on his own body. In 1998, he had surgeons implant a transponder in his left arm so a computer could monitor his movements. His first implant merely consisted of a coil that picked up power from a transmitter and reflected it back, letting the computer know where he was so it could turn on lights when he entered a room. In 2002 he had neurosurgeons connect his nervous system temporarily to a computer for some very modest experiments, but in the future he imagines that implants interfacing between computers and the human nervous system will allow people to store, playback, and even share experiences. He plans someday to experiment with the stored perceptions associated with drinking wine, to see if playing them back really makes him feel drunk. His wife, Irena, has agreed that someday they both will receive implants to share feelings such as happiness, sexual arousal, and even pain. In the long run, Warwick believes, people will join with their computers to become superhuman cyborgs. In so doing, they will adopt a radically new conception of themselves, including previously unknown understandings, perceptions, and desires. Natasha Vita-More, an artist and futurist, has sketched designs for the cyborg posthuman she calls Primo, based on aesthetics and general technological trends. Although she is not building prototypes or experimenting with components at the present time, she believes that her general vision could be achieved within this century. Primo would be ageless rather than mortal, capable of upgrades whenever an organ wore out or was made obsolete by technical progress, and able to change gender
whenever (s)he desires. Nanotechnology would give Primo 1,000 times the brainpower of a current human, and thus capable of running multiple viewpoints in parallel rather than being locked into one narrow frame of awareness. Primo’s senses would cover a vastly wider bandwidth, with sonar mapping onto the visual field at will, an internal grid for navigating and moving anywhere like an acrobatic dancer with perfect sense of direction, and a nervous system that can transmit information from any area of the body to any other instantly. Primo’s nose could identify any chemical or biological substance in the environment, and smart skin will not only protect the body, but provide vastly enhanced sensations. Instead of the depression and envy that oppress modern humans, (s)he would be filled with ecstatic yet realistic optimism. The old fashioned body’s need to eliminate messy wastes will be transcended by Primo’s ability to recycle and purify. William J. Mitchell, the director of Media Arts and Sciences at the Massachusetts Institute of Technology, argues that we have already evolved beyond traditional homo sapiens by become embedded in a ubiquitous communication network. The title of his book, ME++: The Cyborg Self and the Networked City (2003), offers a nice metaphor derived from the C language for programming computers. C was originally developed by a telephone company (Bell Labs) and has become possibly the most influential language among professional programmers, especially in the modular version called C++. In C (and in the Java language as well), “++” means to increment a number by adding 1 to it. Thus, C++ is one level more than C, and ME++ is one level more than me, in which
technology takes me above and beyond myself. A person who is thoroughly plugged in experiences radically transformed consciousness: “I construct, and I am constructed, in a mutually recursive process that continually engages my fluid, permeable boundaries and my endlessly ramifying networks. I am a spatially extended cyborg” (Mitchell 2003, 39).

William Sims Bainbridge

FURTHER READING

Bainbridge, W. S. (1919). Report on medical and surgical developments of the war. Washington, DC: Government Printing Office.
Barnes, B. A. (1977). Discarded operations: Surgical innovation by trial and error. In J. P. Bunker, B. A. Barnes, & F. Mosteller (Eds.), Costs, risks, and benefits of surgery (pp. 109–123). New York: Oxford University Press.
Baum, L. F. (1900). The wonderful wizard of Oz. Chicago: G. M. Hill.
Bentham, J. (1986). Doctor Who: The early years. London: W. H. Allen.
Caidin, M. (1972). Cyborg. New York: Arbor House.
Haining, P. (Ed.). (1983). Doctor Who: A celebration. London: W. H. Allen.
Mitchell, W. J. (2003). ME++: The cyborg self and the networked city. Cambridge, MA: MIT Press.
Nicolelis, M. A. L., & Srinivasan, M. A. (2003). Human-machine interaction: Potential impact of nanotechnology in the design of neuroprosthetic devices aimed at restoring or augmenting human performance. In M. C. Roco & W. S. Bainbridge (Eds.), Converging technologies for improving human performance (pp. 251–255). Dordrecht, Netherlands: Kluwer.
President’s Council on Bioethics. (2003). Beyond therapy: Biotechnology and the pursuit of happiness. Washington, DC: President’s Council on Bioethics.
Warwick, K. (2000). Cyborg 1.0. Wired, 8(2), 144–151.
Wolbring, G. (2003). Science and technology and the triple D (Disease, Disability, Defect). In M. C. Roco & W. S. Bainbridge (Eds.), Converging technologies for improving human performance (pp. 232–243). Dordrecht, Netherlands: Kluwer.
DATA MINING
Data mining is the process of automatic discovery of valid, novel, useful, and understandable patterns, associations, changes, anomalies, and statistically significant structures from large amounts of data. It is an interdisciplinary field merging ideas from statistics, machine learning, database systems and data warehousing, and high-performance computing, as well as from visualization and human-computer interaction. It was engendered by the economic and scientific need to extract useful information from data that has grown phenomenally in all spheres of human endeavor. It is crucial that the patterns, rules, and models that are discovered be valid and generalizable not only in the data samples already examined, but also in
future data samples. Only then can the rules and models obtained be considered meaningful. The discovered patterns should also be novel, that is, not already known to experts; otherwise, they would yield very little new understanding. Finally, the discoveries should be useful as well as understandable. Typically data mining has two high-level goals: prediction and description. The former answers the question of what and the latter the question of why. For prediction, the key criterion is the accuracy of the model in making future predictions; how the prediction decision is arrived at may not be important. For description, the key criterion is the clarity and simplicity of the model describing the data in understandable terms. There is sometimes a dichotomy between these two aspects of data mining in that the most accurate prediction model for a problem may not be easily understandable, and the
most easily understandable model may not be highly accurate in its predictions.
Steps in Data Mining
Data mining refers to the overall process of discovering new patterns or building models from a given dataset. There are many steps involved in the mining enterprise. These include data selection, data cleaning and preprocessing, data transformation and reduction, data mining task and algorithm selection, and finally, postprocessing and the interpretation of discovered knowledge. Here are the most important steps:

Understand the application domain: A proper understanding of the application domain is necessary to appreciate the data mining outcomes desired by the user. It is also important to assimilate and take advantage of available prior knowledge to maximize the chance of success.

Collect and create the target dataset: Data mining relies on the availability of suitable data that reflects the underlying diversity, order, and structure of the problem being analyzed. Therefore, it is crucial to collect a dataset that captures all the possible situations relevant to the problem being analyzed.

Clean and transform the target dataset: Raw data contain many errors and inconsistencies, such as noise, outliers, and missing values. An important element of this process is the unduplication of data records to produce a nonredundant dataset. Another important element of this process is the normalization of data records to deal with the kind of pollution caused by the lack of domain consistency.

Select features and reduce dimensions: Even after the data have been cleaned up in terms of eliminating duplicates, inconsistencies, missing values, and so on, there may still be noise that is irrelevant to the problem being analyzed. These noise attributes may confuse subsequent data mining steps, produce irrelevant rules and associations, and increase computational cost. It is therefore wise to perform a dimension-reduction or feature-selection step to separate those attributes that are pertinent from those that are irrelevant.

Apply data mining algorithms: After performing the preprocessing steps, apply appropriate data mining algorithms—association rule discovery, sequence mining, classification tree induction, clustering, and so on—to analyze the data.

Interpret, evaluate, and visualize patterns: After the algorithms have produced their output, it is still necessary to examine the output in order to interpret and evaluate the extracted patterns, rules, and models. It is only by this interpretation and evaluation process that new insights on the problem being analyzed can be derived.
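These steps can be made concrete with a brief sketch in Python. The file name, the column names, and the use of the pandas and scikit-learn libraries are illustrative assumptions rather than part of the process description, and the dataset is assumed to hold numeric attributes plus a class label.

# A minimal sketch of the data mining steps described above, under the
# assumptions stated in the text: a tabular, numeric dataset with a "label"
# column stored in a hypothetical file named transactions.csv.
import pandas as pd
from sklearn.feature_selection import VarianceThreshold
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

# Collect and create the target dataset.
data = pd.read_csv("transactions.csv")

# Clean and transform: remove duplicate records and fill missing values.
data = data.drop_duplicates()
data = data.fillna(data.median(numeric_only=True))

# Select features and reduce dimensions: drop near-constant attributes.
X = data.drop(columns=["label"])
y = data["label"]
X_reduced = VarianceThreshold(threshold=0.01).fit_transform(X)

# Apply a data mining algorithm: classification tree induction.
X_train, X_test, y_train, y_test = train_test_split(X_reduced, y, random_state=0)
model = DecisionTreeClassifier(max_depth=4).fit(X_train, y_train)

# Interpret and evaluate the extracted model on held-out data.
print(classification_report(y_test, model.predict(X_test)))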
Data Mining Tasks
In verification-driven data analysis the user postulates a hypothesis, and the system tries to validate it. Common verification-driven operations include querying and reporting, multidimensional analysis, and statistical analysis. Data mining, on the other hand, is discovery driven—that is, it automatically extracts new hypotheses from data. The typical data mining tasks include the following:

Association rules: Given a database of transactions, where each transaction consists of a set of items, association discovery finds all the item sets that frequently occur together, and also the rules among them. For example, 90 percent of people who buy cookies also buy milk (60 percent of grocery shoppers buy both).

Sequence mining: The sequence-mining task is to discover sequences of events that commonly occur together. For example, 70 percent of the people who buy Jane Austen’s Pride and Prejudice also buy Emma within a month.

Similarity search: An example is the problem where a person is given a database of objects and a “query” object, and is then required to find those objects in the database that are similar to the query object. Another example is the problem where a person is given a database of objects, and is then required to find all pairs of objects in the database that are within some distance of each other.

Deviation detection: Given a database of objects, find those objects that are the most different from the other objects in the database—that is, the outliers. These objects may be thrown away as noise, or they may be the “interesting” ones, depending on the specific application scenario.
Classification and regression: This is also called supervised learning. In the case of classification, someone is given a database of objects that are labeled with predefined categories or classes. They are required to develop from these objects a model that separates them into the predefined categories or classes. Then, given a new object, the learned model is applied to assign this new object to one of the classes. In the more general situation of regression, instead of predicting classes, real-valued fields have to be predicted.

Clustering: This is also called unsupervised learning. Here, given a database of objects that are usually without any predefined categories or classes, the individual is required to partition the objects into subsets or groups such that elements of a group share a common set of properties. Moreover, the partition should be such that the similarity between members of the same group is high and the similarity between members of different groups is low.
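The two measures behind the association rule example given above (support, the share of shoppers buying both items, and confidence, the share of cookie buyers who also buy milk) can be sketched in a few lines of Python; the tiny transaction list is an illustrative assumption.

# A minimal sketch of support and confidence for the rule "cookies -> milk";
# the five toy transactions below are hypothetical.
transactions = [
    {"cookies", "milk"},
    {"cookies", "milk", "bread"},
    {"milk"},
    {"cookies", "milk"},
    {"bread"},
]

def support(itemset, transactions):
    # Fraction of all transactions that contain every item in the itemset.
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    # Of the transactions containing the antecedent, the fraction that
    # also contain the consequent.
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

print(support({"cookies", "milk"}, transactions))       # 0.6 (both items bought together)
print(confidence({"cookies"}, {"milk"}, transactions))  # 1.0 (every cookie buyer also bought milk)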
Challenges in Data Mining
Many existing data mining techniques are ad hoc; however, as the field matures, solutions are being proposed for crucial problems like the incorporation of prior knowledge, handling missing data, adding visualization, improving understandability, and other research challenges. These challenges include the following:

Scalability: How does a data mining algorithm perform if the dataset has increased in volume and in dimensions? This may call for some innovations based on efficient and sufficient sampling, or on a trade-off between in-memory and disk-based processing, or on an approach based on high-performance distributed or parallel computing.

New data formats: To date, most data mining research has focused on structured data, because it is the simplest and most amenable to mining. However, support for other data types is crucial. Examples include unstructured or semistructured (hyper)text, temporal, spatial, and multimedia databases. Mining these is fraught with challenges, but it is necessary because multimedia content and digital libraries proliferate at astounding rates.

Handling data streams: In many domains the data changes over time and/or arrives in a constant
stream. Extracted knowledge thus needs to be constantly updated.

Database integration: The various steps of the mining process, along with the core data mining methods, need to be integrated with a database system to provide common representation, storage, and retrieval. Moreover, enormous gains are possible when these are combined with parallel database servers.

Privacy and security issues in mining: Privacy-preserving data mining techniques are invaluable in cases where one may not look at the detailed data, but one is allowed to infer high-level information. This also has relevance for the use of mining for national security applications.

Human interaction: While a data mining algorithm and its output may be readily handled by a computer scientist, it is important to realize that the ultimate user is often not the developer. In order for a data mining tool to be directly usable by the ultimate user, issues of automation—especially in the sense of ease of use—must be addressed. Even for computer scientists, the use and incorporation of prior knowledge into a data mining algorithm is often a challenge; they too would appreciate data mining algorithms that can be modularized in a way that facilitates the exploitation of prior knowledge.

Data mining is ultimately motivated by the need to analyze data from a variety of practical applications—from business domains such as finance, marketing, telecommunications, and manufacturing, or from scientific fields such as biology, geology, astronomy, and medicine. Identifying new application domains that can benefit from data mining will lead to the refinement of existing techniques, and also to the development of new methods where current tools are inadequate.

Mohammed J. Zaki
FURTHER READING

Association for Computing Machinery’s special interest group on knowledge discovery and data mining. Retrieved August 21, 2003, from http://www.acm.org/sigkdd.
Dunham, M. H. (2002). Data mining: Introductory and advanced topics. Upper Saddle River, NJ: Prentice Hall.
Han, J., & Kamber, M. (2000). Data mining: Concepts and techniques. San Francisco: Morgan Kaufmann.
Hand, D. J., Mannila, H., & Smyth, P. (2001). Principles of data mining. Cambridge, MA: MIT Press.
Kantardzic, M. (2002). Data mining: Concepts, models, methods, and algorithms. Somerset, NJ: Wiley-IEEE Press.
Witten, I. H., & Frank, E. (1999). Data mining: Practical machine learning tools and techniques with Java implementations. San Francisco: Morgan Kaufmann.
DATA VISUALIZATION
Data visualization is a new discipline that uses computers to make pictures that elucidate a concept, phenomenon, relationship, or trend hidden in a large quantity of data. By using interactive three-dimensional (3D) graphics, data visualization goes beyond making static illustrations or graphs and emphasizes interactive exploration. The pervasive use of computers in all fields of science, engineering, medicine, and commerce has resulted in an explosive growth of data, presenting people with unprecedented challenges in understanding data. Data visualization transforms raw data into pictures that exploit the superior visual processing capability of the human brain to detect patterns and draw inferences, revealing insights hidden in the data. For example, data visualization allows us to capture trends, structures, and anomalies in the behavior of a physical process being modeled or in vast amounts of Internet data. Furthermore, it provides us with a visual and remote means to communicate our findings to others. Since publication of a report on visualization in scientific computing by the U.S. National Science Foundation in 1987, both government and industry have invested tremendous research and development in data-visualization technology, resulting in advances in visualization and interactive techniques that have helped lead to many scientific discoveries, better engineering designs, and more timely and accurate medical diagnoses.
Visualization Process
A typical data-visualization process involves multiple steps, including data generation, filtering, mapping, rendering, and viewing. The data-generation step can be a numerical simulation, a laboratory experiment, a collection of sensors, an image scanner, or a recording of Web-based business transactions. Filtering removes noise, extracts and enhances features, or rescales data. Mapping derives appropriate representations of data for the rendering step. The representations can be composed of geometric primitives such as points, lines, polygons, and surfaces, supplemented with properties such as colors, transparency, and textures. Whereas the visualization of a computerized tomography (CT) scan of a fractured bone should result in an image of a bone, plenty of room for creativity exists when making a visual depiction of the trend of a stock market or the chemical reaction in a furnace. Rendering generates two-dimensional or three-dimensional images based on the mapping results and other rendering parameters, such as the lighting model, viewing position, and so forth. Finally, the resulting images are displayed for viewing. Both photorealistic and nonphotorealistic rendering techniques exist for different purposes of visual communication. Nonphotorealistic rendering, which mimics how artists use brushes, strokes, texture, color, layout, and so forth, is usually used to increase the clarity of the spatial relationship between objects, improve the perception of an object’s shape and size, or give a particular type of media presentation. Note that the filtering and mapping steps are largely application dependent and often require domain knowledge to perform. For example, the filtering and mapping steps for the visualization of website structure or browsing patterns would be quite different from those of brain tumors or bone fractures. A data-visualization process is inherently iterative. That is, after a visualization is made, the user should be able to go back to any previous steps, including the data-generation step, which consists of a numerical or physical model, to make changes such that more information can be obtained from the revised visualization. The changes may be made in a systematic way or by trial and error. The goal is to improve the model and understanding of the corresponding problem via this visual feedback process.
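The pipeline of filtering, mapping, rendering, and viewing can be illustrated with a brief Python sketch applied to a noisy two-dimensional scalar field; the synthetic data and the use of NumPy, SciPy, and Matplotlib are illustrative assumptions, not part of the process description.

# A minimal sketch of the visualization process described above.
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter

# Data generation: a synthetic simulation output with measurement noise.
x, y = np.meshgrid(np.linspace(-3, 3, 200), np.linspace(-3, 3, 200))
field = np.exp(-(x**2 + y**2)) + 0.1 * np.random.randn(200, 200)

# Filtering: remove noise before any visual mapping.
smooth = gaussian_filter(field, sigma=2)

# Mapping: derive a visual representation (color-mapped cells plus contour lines).
fig, ax = plt.subplots()
image = ax.imshow(smooth, extent=(-3, 3, -3, 3), cmap="viridis")
ax.contour(x, y, smooth, colors="white", linewidths=0.5)

# Rendering and viewing: draw the image and display it for inspection.
fig.colorbar(image, ax=ax, label="field value")
plt.show()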
Computational Steering
Data visualization should not be performed in isolation. It is an integral part of data analysis and the scientific discovery process. Appropriate visualization tools integrated into a modeling process can greatly enhance scientists’ productivity, improve the efficiency of hardware utilization, and lead to scientific breakthroughs. The use of visualization to drive scientific discovery processes has become a trend. However, we still lack adequate methods to achieve computational steering—the process of interacting with as well as changing states, parameters, or resolution of a numerical simulation—and to be able to see the effect immediately, without stopping or restarting the simulation. Consequently, the key to successful data visualization is interactivity, the ability to effect change while watching the changes take effect in real time on the screen. If all the steps in the modeling and visualization processes can be performed in a highly interactive fashion, steering can be achieved. The ability to steer a numerical model makes the visualization process a closed loop, becoming a scientific discovery process that is self-contained. Students can benefit from such a process because they can more easily move from concepts to solutions. Researchers can become much more productive because they can make changes according to the interactive graphical interpretation of the simulation states without restarting the simulation every time. Computational steering has been attempted in only a few fields. An example is the SCIRun system used in computational medicine. To adopt computational steering, researchers will likely have to redesign the computational model to incorporate the feedback and changes needed in a steering process. More research is thus needed to make computational steering feasible in general.
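The idea of steering can be sketched as a simulation loop that polls for parameter changes between time steps, so an updated value takes effect immediately without a restart. The simple decay model, the queue-based interface, and the render hook below are illustrative assumptions, not the design of any particular steering system.

# A minimal sketch of a steerable simulation loop, under the assumptions above.
import queue

steering_queue = queue.Queue()   # a user interface would push parameter updates here

def render(step, state):
    print(f"step {step}: state = {state:.2f}")   # stand-in for the visualization step

def run_simulation(steps, params):
    state = 100.0                                # e.g., a temperature being modeled
    for step in range(steps):
        # Steering point: apply any pending parameter changes before this step.
        while not steering_queue.empty():
            name, value = steering_queue.get()
            params[name] = value
        # Advance the model one step and hand the state to visualization.
        state -= params["cooling_rate"] * state
        render(step, state)

# A GUI callback could steer the run while it executes, for example:
# steering_queue.put(("cooling_rate", 0.2))
run_simulation(steps=5, params={"cooling_rate": 0.05})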
Computer-Assisted Surgery
During the past ten years significant advances have been made in rendering software and hardware technologies, resulting in higher fidelity and real-time visualization. Computer-assisted surgery is an application of such advanced visualization technologies with a direct societal impact. Computer visualization can offer better 3D spatial acuity than humans have, and computer-assisted surgery is more reliable and reproducible. However, computer-assisted surgery has several challenges. First, the entire visualization process—consisting of 3D reconstruction, segmentation, rendering, and image transport and display—must be an integrated part of the end-to-end surgical planning and procedure. Second, the visualization must be in real time, that is, flicker free. Delayed visual response could lead to dangerous outcomes for a surgery patient. Most important, the visualization must attain the required accuracy and incorporate quantitative measuring mechanisms. Telesurgery, which allows surgeons at remote sites to participate in surgery, will be one of the major applications of virtual reality and augmented reality (where the virtual world and real world are allowed to coexist and a superimposed view is presented to the user). Due to the distance between a patient and surgeons, telesurgery has much higher data visualization, hardware, and network requirements. Nevertheless, fast-improving technology and decreasing costs will make such surgery increasingly appealing. Whether and how much stereoscopic viewing can benefit surgeons remains to be investigated. The most needed advance, however, is in interface software and hardware.
User Interfaces for Data Visualization
Most data visualization systems supply the user with a suite of visualization tools that requires the user to be familiar with both the corresponding user interfaces and a large visualization parameter space (a multidimensional space which consists of those input variables used by the visualization program). Intuitive and intelligent user interfaces can greatly assist the user in the process of data exploration. First, the visual representation of the process of data exploration and results can be incorporated into the user interface of a visualization system. Such an interface can help the user to keep track of the visualization experience, use it to generate new visualizations, and share it with others. Consequently, the interface needs to display not only the visualizations but also the visualization process to the user. Second, the
task of exploring large and complex data and visualization parameter space during the mapping step can be delegated to an intelligent system such as a neural network. One example is to turn the 3D segmentation problem into a simple 2D painting process for the user, leaving the neural network to classify the multidimensional data. As a result, the user can focus on the visualizations rather than on the user interface widgets (e.g., a color editor, plotting area, or layout selector) for browsing through the multidimensional parameter space. Such next-generation user interfaces can enhance data understanding while reducing the cost of visualization by eliminating the iterative trial-and-error process of parameter selection. For routine analysis of large-scale data sets, the savings can be tremendous.
Research Directions
The pervasiveness of the World Wide Web in average people’s lives has led to a data explosion. Some data are relevant to some people’s needs, but most are not. Nevertheless, many people do their everyday jobs by searching huge databases of information distributed in locations all over the world. A large number of computer services repeatedly operate on these databases. Information visualization, a branch of visualization, uses visual-based analysis of data with no spatial references, such as large amounts of text and documents. A data mining step (the procedure to reduce the size, dimensionality, and/or complexity of a data set), which may be considered as the filtering step, usually precedes the picture-making step of visualization. The mapping step often converts reduced relations into graphs or charts. Most information visualizations are thus about displaying and navigating 2D or 3D graphs. People need new reduction, mapping, and navigation methods so that they can manage, comprehend, and use the fast-growing information on the World Wide Web. Other important research directions in data visualization include improving the clarity of visualizations, multidimensional and multivariate data (a data set with a large number of dependent variables) visualization, interaction mechanisms for large and shared display space, visualization designs guided
by visual perception study, and user studies for measuring the usability of visualization tools and the success of visualizations.

Kwan-Liu Ma

See also Information Spaces; Sonification

FURTHER READING

Johnson, C., & Parker, S. (1995). Applications in computational medicine using SCIRun: A computational steering programming environment. The 10th International Supercomputer Conference (pp. 2–19).
Ma, K.-L. (2000). Visualizing visualizations: Visualization viewpoints. IEEE Computer Graphics and Applications, 20(5), 16–19.
Ma, K.-L. (2004). Visualization—A quickly emerging field. Computer Graphics, 38(1), 4–7.
McCormick, B., DeFanti, T., & Brown, M. (1987). Visualization in scientific computing. Computer Graphics, 21(6).
DEEP BLUE
In 1997, the chess machine Deep Blue fulfilled a long-standing challenge in computer science by defeating the human world chess champion, Garry Kasparov, in a six-game match. The idea that a computer could defeat the best humanity had to offer in an intellectual game such as chess brought many important questions to the forefront: Are computers intelligent? Do computers need to be intelligent in order to solve difficult or interesting problems? How can the unique strengths of humans and computers best be exploited?
Early History
Even before the existence of electronic computers, there was a fascination with the idea of machines that could play games. The Turk was a chess-playing machine that toured the world in the eighteenth and nineteenth centuries, to much fanfare. Of course, the technology in the Turk was mainly concerned with concealing the diminutive human chess master hidden inside the machine.
In 1949, the influential mathematician Claude Shannon (1916–2001) proposed chess as an ideal domain for exploring the potential of the then-new electronic computer. This idea was firmly grasped by those studying artificial intelligence (AI), who viewed games as providing an excellent test bed for exploring many types of AI research. In fact, chess has often been said to play the same role in the field of artificial intelligence that the fruit fly plays in genetic research. Although breeding fruit flies has no great practical value, they are excellent subjects for genetic research: They breed quickly, have sufficient variation, and a large population is cheap to maintain. Similarly, chess avoids some aspects of complex real-world domains that have proven difficult, such as natural-language understanding, vision, and robotics, while having sufficient complexity to allow an automated problem solver to focus on core AI issues such as search and knowledge representation. Chess programs made steady progress in the following decades, particularly after researchers abandoned the attempt to emulate human thought processes and instead focused on doing a more thorough and exhaustive exploration of possible move sequences. It was soon observed that the playing strength of such “brute-force” chess programs correlated strongly with the speed of the underlying computer, and chess programs gained in strength both from more sophisticated software and from faster computers.
Deep Blue
The Deep Blue computer chess system was developed in 1989–1997 by a team (Murray Campbell, A. Joseph Hoane, Jr., Feng-hsiung Hsu) from IBM’s T. J. Watson Research Center. Deep Blue was a leap ahead of the chess-playing computers that had gone before it. This leap resulted from a number of factors, including:

■ a computer chip designed specifically for high-speed chess calculations,
■ a large-scale parallel processing system, with more than five hundred processors cooperating to select a move,
■ a complex evaluation function to assess the goodness of a chess position, and
■ a strong emphasis on intelligent exploration (selective search) of the possible move sequences.

The first two factors allowed the full Deep Blue system to examine 100–200 million chess positions per second while selecting a move, and the complex evaluation function allowed Deep Blue to make more informed decisions. However, a naive brute-force application of Deep Blue’s computational power would have been insufficient to defeat the top human chess players. It was essential to combine the computer’s power with a method to focus the search on move sequences that were “important.” Deep Blue’s selective search allowed it to search much more deeply on the critical move sequences. Deep Blue first played against world champion Garry Kasparov in 1996, with Kasparov winning the six-game match by a score of 4–2. A revamped Deep Blue, with improved evaluation and more computational power, won the 1997 rematch by a score of 3.5–2.5.
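The kind of game tree search and static evaluation that brute-force chess programs build on can be sketched briefly in Python. Deep Blue’s special-purpose hardware, evaluation function, and selective extensions were far more elaborate than this; the toy tree of scores below is an illustrative assumption standing in for real chess positions.

# A minimal sketch of fixed-depth alpha-beta search over a toy game tree.
def alphabeta(node, depth, alpha, beta, maximizing):
    children = TREE.get(node, [])
    if depth == 0 or not children:
        return SCORES[node]                 # static evaluation at a leaf
    if maximizing:
        value = float("-inf")
        for child in children:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:                # prune: this line is already refuted
                break
        return value
    value = float("inf")
    for child in children:
        value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value

TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
SCORES = {"a1": 3, "a2": 5, "b1": -2, "b2": 9}
print(alphabeta("root", depth=2, alpha=float("-inf"), beta=float("inf"), maximizing=True))  # prints 3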
Human and Computer Approaches to Chess
It is clear that systems like Deep Blue choose moves using methods radically different from those employed by human experts. These differences result in certain characteristic strengths of the two types of players. Computers tend to excel at the shortrange tactical aspects of a game, mainly due to an extremely thorough investigation of possible move sequences. Human players can only explore perhaps a few dozen positions while selecting a move, but can assess the long-term strategic implications of these moves in a way that has proven difficult for a computer. The combination of human and computer players has proven to be very powerful. High-level chess players routinely use computers as part of their preparation. One typical form of interaction would have the human player suggest strategically promising moves that are validated tactically by the computer player.
Research Areas
While computer programs such as Deep Blue achieve a very high level of play, most of the knowledge in such systems is carefully crafted by human experts. Much research is needed to understand how to have future systems learn the knowledge necessary to play the game without extensive human intervention. TD-Gammon, a neural-network system that plays world-class backgammon, was an early leader in this area. Determining the best mode for humans and computers to interact in a more cooperative manner (for example, with one or the other acting as assistant, trainer, or coach) is another area worthy of further research. Game-playing programs that are based on large-scale searches have problems in translating the search results into forms that humans can deal with easily. Some types of games, such as the Chinese game Go, have too many possible moves to allow a straightforward application of the methods used for chess, and pose significant challenges. Games with hidden information and randomness, such as poker or bridge, also require new and interesting approaches. Interactive games, which employ computer-generated characters in simulated worlds, can be more realistic and entertaining if the characters can behave in intelligent ways. Providing such intelligent characters is a key goal for future AI researchers.

Murray Campbell

See also Artificial Intelligence

FURTHER READING

Campbell, M., Hoane, A. J., & Hsu, F. (2002). Deep Blue. Artificial Intelligence, 134(1–2), 57–83.
Frey, P. W. (Ed.). (1983). Chess skill in man and machine. New York: Springer-Verlag.
Hsu, F. (2002). Behind Deep Blue: Building the computer that defeated the world chess champion. Princeton, NJ: Princeton University Press.
Laird, J. E., & van Lent, M. (2000). Human-level AI’s killer application: Interactive computer games. AI Magazine, 22(2), 15–26.
Marsland, T. A., & Schaeffer, J. (Eds.). (1990). Computers, chess, and cognition. New York: Springer-Verlag.
Newborn, M. (2003). Deep Blue: An artificial intelligence milestone. New York: Springer-Verlag.
Schaeffer, J. (2001). A gamut of games. AI Magazine, 22(3), 29–46.
Schaeffer, J., & van den Herik, J. (Eds.). (2002). Chips challenging champions: Games, computers, and artificial intelligence. New York: Elsevier.
Shannon, C. (1950). Programming a computer for playing chess. Philosophical Magazine, 41, 256–275.
Standage, T. (2002). The Turk: The life and times of the famous eighteenth-century chess-playing machine. New York: Walker & Company.

DENIAL-OF-SERVICE ATTACK
A denial-of-service (DoS) attack causes the consumption of a computing system’s resources—typically with malicious intent—on such a scale as to compromise the ability of other users to interact with that system. Virgil Gligor coined the term denial-of-service attack in reference to attacks on operating systems (OS) and network protocols. Recently the term has been used specifically in reference to attacks executed over the Internet. As governments and businesses increasingly rely on the Internet, the damage that a DoS attack can cause by disrupting computer systems has given attackers an incentive to launch such attacks and system operators an incentive to defend against them.

Evolution of Denial-of-Service Attacks
DoS vulnerabilities occur when a poor resource-allocation policy allows a malicious user to allocate so many resources that insufficient resources are left for legitimate users. Early DoS attacks on multiuser operating systems involved one user spawning a large number of processes or allocating a large amount of memory, which would exhaust the memory available and result in operating system overload. Early network DoS attacks took advantage of the fact that the early Internet was designed with implicit trust in the computers connected to it.
The unintended result of this trust was that users paid little attention to handling packets (the fundamental unit of data transferred between computers on the Internet) that did not conform to standard Internet protocols. When a computer received a malformed packet that its software was not equipped to handle, it might crash, thus denying service to other users. These early DoS attacks were relatively simple and could be defended against by upgrading the OS software to identify and reject malformed packets. However, network DoS attacks rapidly increased in complexity over time. A more serious threat emerged from the implicit trust in the Internet’s design: The protocols of the Internet themselves could be exploited to execute a DoS attack. The difference between exploiting an Internet software implementation (as the previous class of DoS attacks did) and exploiting an Internet protocol itself was that the former was easy to identify (malformed packets typically did not occur outside of an attack) and, once identified, could be defended against, whereas protocol-based attacks could simply look like normal traffic and were difficult to defend against without affecting legitimate users as well. An example of a protocol attack was TCP SYN flooding. This attack exploited the fact that much of the communication between computers over the Internet was initiated by a TCP handshake in which the communicating computers exchanged specialized packets known as “SYN packets.” By completing only half of the handshake, an attacker could leave the victim computer waiting for the handshake to complete. Because computers could accept only a limited number of connections at one time, by repeating the half-handshake many times an attacker could fill the victim computer’s connection capacity, causing it to reject new connections from legitimate users or, even worse, causing the OS to crash. A key component of this attack was that an attacker was able to hide the origin of his or her packets and pose as different computer users, so the victim had difficulty knowing which handshakes were initiated by the attacker and which were initiated by legitimate users. Then another class of DoS attack began to appear: distributed denial-of-service (DDoS) attacks. The difference between traditional DoS attacks
and the DDoS variant was that attackers were beginning to use multiple computers in each attack, thus amplifying the attack. Internet software implementation and protocol attacks did not require multiple attackers to be successful, and effective defenses were designed (in some cases) against them. A DDoS attack, however, did not require a software implementation or protocol flaw to be present. Rather, a DDoS attack would consist of an attacker using multiple computers (hundreds to tens of thousands) to send traffic at the maximum rate to a victim’s computer. The resulting flood of packets was sometimes enough to either overload the victim’s computer (causing it to slow to a crawl or crash) or overload the communication line from the Internet to that computer. The DDoS attacker would seize control of other people’s computers for use in the attack, often by exploiting flaws in the computers’ control code (much as in Internet software implementation DoS attacks) or simply by attaching the attack code to an e-mail virus or Internet worm. The presence of attacking computers on many portions of the Internet gave this class of attacks its name.
Defense against DoS Attacks
Defending against DoS attacks is often challenging because the very design of the Internet allows them to occur. The Internet’s size requires that even the smallest change to one of its fundamental protocols be compatible with legacy systems that do not implement the change. However, users can deploy effective defenses without redesigning the entire Internet. For example, the defense against Internet software implementation DoS attacks is as simple as updating the software on a potential victim’s computer; because the packets in this type of attack are usually malformed, the compatibility restriction is easy to meet. Defending against a protocol-level attack is more difficult because of the similarity of the attack itself to legitimate traffic. Experts have proposed several mechanisms, which mostly center on the concept of forcing all computers initiating a handshake to show that they have performed some amount of “work” during the handshake. The expectation is that an attacker will not have the computing power to
impersonate multiple computers making handshake requests. Unfortunately, this class of defenses requires a change in the Internet protocol that must be implemented by all computers wanting to contact the potential victim’s computer. More protocol-compliant solutions involve placing, between the victim’s computer and the Internet, specialized devices that are designed to perform many handshakes at once and to pass only completed handshakes to the victim. The DDoS variant of DoS attacks is the most difficult to defend against because the attack simply overwhelms the victim’s computer with too many packets or, worse, saturates the victim’s connection to the Internet so that many packets are dropped before ever reaching the victim’s computer or network. Some businesses rely on overprovisioning, which is the practice of buying computer resources far in excess of expected use, to mitigate DDoS attacks; this practice is expensive, but it raises the scale of attack necessary to disable a victim. Proposed defenses against this type of attack—more so than proposed defenses against other types of attacks—have focused on changing Internet protocols. Many proposals favor some type of traceback mechanism, which allows the victim of an attack to determine the identity and location of the attacking computers, in the hope that filters can be installed in the Internet to minimize the flood of traffic while leaving legitimate traffic unaffected. At the time of this writing, no DDoS defense proposal has been accepted by the Internet community.
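The "work"-based handshake idea discussed above can be sketched as a client puzzle: the server hands out a challenge, the client must find a value whose hash has a required number of leading zero bits, and the server verifies the answer with a single cheap hash before committing any connection state. This hash-based construction in Python is an illustrative assumption and not the specific puzzle schemes cited in the Further Reading.

# A minimal sketch of a client puzzle, under the assumptions stated above.
import hashlib
import os

DIFFICULTY = 16                        # required number of leading zero bits

def new_puzzle():
    return os.urandom(8)               # server-chosen challenge

def leading_zero_bits(digest):
    bits = bin(int.from_bytes(digest, "big"))[2:].zfill(len(digest) * 8)
    return len(bits) - len(bits.lstrip("0"))

def solve(puzzle):
    # Client-side work: brute-force a solution; cost grows with DIFFICULTY.
    counter = 0
    while True:
        candidate = counter.to_bytes(8, "big")
        if leading_zero_bits(hashlib.sha256(puzzle + candidate).digest()) >= DIFFICULTY:
            return candidate
        counter += 1

def verify(puzzle, candidate):
    # Server-side check is one cheap hash, so verification itself cannot be
    # used to exhaust the server's resources.
    return leading_zero_bits(hashlib.sha256(puzzle + candidate).digest()) >= DIFFICULTY

puzzle = new_puzzle()
print(verify(puzzle, solve(puzzle)))   # True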
The Future
DoS attacks are likely to trouble the Internet for the foreseeable future. These attacks, much like urban graffiti, are perpetrated by anonymous attackers and require a substantial investment to defend against, possibly requiring a fundamental change in the Internet’s protocols. Although several DoS attacks have succeeded in bringing down websites of well-known businesses, most attacks are not as wildly successful, nor have all businesses that have been victimized reported attacks for fear of publicizing exactly how weak their computing infrastructure is.
We must wait to see whether DoS attacks will further threaten the Internet, provoking the acceptance of radical defense proposals, or will simply fade into the background and become accepted as a regular aspect of the Internet.

Adrian Perrig and Abraham Yaar

See also Security; Spamming

FURTHER READING

Aura, T., Nikander, P., & Leiwo, J. (2000). DoS-resistant authentication with client puzzles. Security Protocols—8th International Workshop.
Gligor, V. D. (1983). A note on the denial of service problem. Proceedings of 1983 Symposium on Security and Privacy (pp. 139–149).
Gligor, V. D. (1986). On denial of service in computer networks. Proceedings of International Conference on Data Engineering (pp. 608–617).
Gligor, V. D. (2003). Guaranteeing access in spite of service-flooding attacks. Proceedings of the Security Protocols Workshop.
Savage, S., Wetherall, D., Karlin, A., & Anderson, T. (2000). Practical network support for IP traceback. Proceedings of ACM SIGCOMM 2000 (pp. 295–306).
Wang, X., & Reiter, M. K. (2003). Defending against denial-of-service attacks with puzzle auctions. Proceedings of the 2003 IEEE Symposium on Security and Privacy (pp. 78–92).
Yaar, A., Perrig, A., & Song, D. (2003). Pi: A path identification mechanism to defend against DDoS attacks. IEEE Symposium on Security and Privacy (pp. 93–107).
DESKTOP METAPHOR
An interactive software system uses the desktop metaphor when its interface is designed such that its objects and actions resemble objects and actions in a traditional office environment. For example, an operating system designed using the desktop metaphor represents directories as labeled folders and text documents as files. In graphical user interfaces (GUIs), the bitmap display and pointing devices such as a mouse, a trackball, or a light pen are used to create the metaphor: The bitmap display presents a virtual desk, where documents can be created, stored, retrieved, reviewed, edited, and discarded. Files, folders, the trash can (or recycle bin), and so
forth are represented on the virtual desktop by graphical symbols called icons. Users manipulate these icons using the pointing devices. With pointing devices, the user can select, open, move, or delete the files or folders represented by icons on the desktop. Users can retrieve information and read it on the desktop just as they would read actual paper documents at a physical desk. The electronic document files can be stored and organized in electronic folders just as physical documents are saved and managed in folders in physical file cabinets. Some of the accessories one finds in an office are also present on the virtual desktop; these include the trash can, a clipboard, a calendar, a calculator, a clock, a notepad, telecommunication tools, and so on. The metaphor of the window is used for the graphical boxes that let users look into information in the computer. Multiple windows can be open on the desktop at once, allowing workers to alternate quickly between multiple computer applications (for example, a worker may have a word processing application, a spreadsheet application, and an Internet browser open simultaneously, each in its own window). Computer users can execute, hold, and resume their tasks through multiple windows.
BITMAP
An array of pixels, in a data file or structure, that corresponds bit for bit with an image.
Beginning in the late 1970s, as personal computers and workstations became popular among knowledge workers (people whose work involves developing and using knowledge—engineers, researchers, and teachers, for example), the usability of the computers and the productivity of those using them became important issues. The desktop metaphor was invented in order to make computers more usable, with the understanding that more-usable computers would increase users’ productivity. The desktop metaphor enabled users to work with computers in a more familiar, more comfortable manner and to spend less time learning how to use them. The invention of the desktop metaphor greatly enhanced the quality of human-computer interaction.
Historical Overview
In the 1960s and 1970s, several innovative concepts in the area of HCI were originated and implemented using interactive time-shared computers, graphics screens, and pointing devices.

Sketchpad
Sketchpad was a pioneering achievement that opened the field of interactive computer graphics. In 1963, Ivan Sutherland used a light pen to create engineering drawings directly on the computer screen for his Ph.D. thesis, Sketchpad: A Man-Machine Graphical Communications System. His thesis initiated a totally new way to use computers. Sketchpad was executed on the Lincoln TX-2 computer at MIT. A light pen and a bank of switches were the user interface for this first interactive computer graphics system. Sketchpad also pioneered new concepts of memory structures for storing graphical objects, rubberbanding of lines (stretching lines as long as a user wants) on the screen, the ability to zoom in and out on the screen, and the ability to make perfect lines, corners, and joints.

NLS
The scientist and inventor Doug Engelbart and his colleagues at the Stanford Research Institute (SRI) introduced the oN Line System (NLS) to the public in 1968. They also invented the computer mouse. The NLS was equipped with a mouse and a pointer cursor for the first time; it also was the first system to make use of hypertext. Among other features, the system provided multiple windows, an online context-sensitive help system, outline editors for idea development, two-way video conferencing with shared workspace, word processing, and e-mail.

Smalltalk
In the early 1970s, Alan Kay and his colleagues at Xerox’s Palo Alto Research Center (PARC) invented an object-oriented programming language called Smalltalk. It was the first integrated programming environment, and its user interface was designed using the desktop metaphor. It was designed not only for expert programmers of complex software, but also for novice users, including children: Its
designers intended it to be an environment in which users learned by doing. In 1981 Xerox PARC integrated the innovations in the fields of human-computer symbiosis, personal computing, object-oriented programming languages, and local-area networks and arrived at the Xerox 8010 Star information system. It was the first commercial computer system that implemented the desktop metaphor, using a mouse, a bitmap display, and a GUI. In interacting with the system, users made use of windows, icons, menus, and pointing devices (WIMP). Most of the workstations and personal computers that were developed subsequently, including Apple Computer’s Lisa (1983) and Macintosh (1984) and Microsoft’s Windows (1985), were inspired by Star; like the Star system, they too adopted the desktop metaphor. Apple’s Lisa was designed to be a high-quality, easy-to-use computer for knowledge workers such as secretaries, managers, and professionals in general office environments. Its design goals were:

User friendliness: The developers of the Lisa wanted users to use the computer not only because doing so was part of their job, but also because it was fun to use. The users were expected to feel comfortable because the user interface resembled their working environment.

Standard method of interaction: A user was provided with a consistent look and feel in the system and all applications, which meant that learning time could be dramatically decreased and training costs lowered.

Gradual and intuitive learning: A user should be able to complete important tasks easily with minimal training. The user should not be concerned with more sophisticated features until they are necessary. Interaction with the computer should be intuitive; that is, the user should be able to figure out what he or she needs to do.

Error protection: A user should be protected from obvious errors. For example, Lisa allowed users to choose from a collection of possible operations that were proper for the occasion and the object. By limiting the choices, fatal errors and obvious errors could be avoided. Any error from a user should be processed in a helpful manner by generating a warning message or providing a way of recovering from the error.
Personalized interaction: A user could set up attributes of the system in order to customize the interaction with the system. The personalized interaction did not interfere with the standard interaction methods.

Multiple tasks: Because workers in office environments perform many tasks simultaneously and are often interrupted in their work, Lisa was designed to be able to hold the current work while users attended to those interruptions and to other business. The idea was that the user should be able to switch from one task to another freely and instantly.

Apple’s Lisa provided knowledge workers with a virtual desktop environment complete with manipulable documents, file folders, calculators, electronic paper clips, a wastebasket, and other handy tools. The documents and other office-based objects were represented by naturalistic icons. The actions defined for the icons, such as selecting, activating, moving, and copying, were implemented by means of mouse operations such as clicking, double-clicking, and dragging. Lisa users did not have to memorize commands such as “delete” (“del”), “remove” (“rm”), or “erase” in order to interact with the system. The first version of Microsoft Windows was introduced in 1985. It provided an interactive software environment that used a bitmap display and a mouse. The product included a set of desktop applications, including a calendar, a card file, a notepad, a calculator, a clock, and telecommunications programs. In 1990 Windows 3.0 was introduced, the first real GUI-based system running on IBM-compatible PCs. It became widely popular. The Windows operating system evolved through many more incarnations, and in the early 2000s was the most popular operating system in the world.
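The direct-manipulation mapping described above, in which mouse gestures on icons stand in for typed commands, can be sketched minimally in Python; the classes and gesture names are hypothetical illustrations, not the Lisa or Windows implementation.

# A minimal sketch of mapping mouse gestures on icons to object operations.
class Document:
    def __init__(self, name):
        self.name = name

    def open(self):
        print(f"opening {self.name} in an editor window")

class Trash:
    def __init__(self):
        self.contents = []

    def receive(self, item):
        self.contents.append(item)
        print(f"{item.name} moved to the trash")

class Icon:
    def __init__(self, target):
        self.target = target

    def on_double_click(self):           # activating the icon opens the document
        self.target.open()

    def on_drag_to(self, destination):   # dragging onto the trash icon discards it
        destination.receive(self.target)

report = Icon(Document("quarterly-report.txt"))
trash = Trash()
report.on_double_click()
report.on_drag_to(trash)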
Research Directions
The desktop metaphor, implemented through a graphical user interface, has been the dominant metaphor for human-computer interfaces since the 1980s. What will happen to the human-computer interaction paradigm in the future? Will the desktop metaphor continue to dominate? It is extremely difficult to predict the future in the computer world. However, there are several pioneering researchers
exploring new interaction paradigms that could replace the desktop-based GUI.
OPERATING SYSTEM Software (e.g., Windows 98, UNIX, or DOS) that enables a computer to accept input and produce output to peripheral devices such as disk drives and printers.
A Tangible User Interface
At present, interactions between humans and computers are confined to a display, a keyboard, and a pointing device. The tangible user interface (TUI) proposed by Hiroshi Ishii and Brygg Ullmer at MIT’s Media Lab in 1997 bridges the space between humans and computers in the opposite direction. The user interface of a TUI-based system can be embodied in a real desk and other real objects in an office environment. Real office objects such as actual papers and pens could become meaningful objects for the user interface of the system. Real actions on real objects can be recognized and interpreted as operations applied to the objects in the computer world, so that, for example, putting a piece of paper in a wastebasket could signal the computer to delete a document. This project attempts to bridge the gap between the computer world and the physical office environment by making digital information tangible. While the desktop metaphor provides the users with a virtual office environment, in a TUI the physical office environment, including the real desktop, becomes the user interface. Ishii and Ullmer designed and implemented a prototype TUI called metaDESK for Tangible Geospace, a physical model of landmarks such as the MIT campus. The metaDESK was embodied in real-world objects and regarded as a counterpart of the virtual desktop. The windows, icons, and other graphical objects in the virtual desktop corresponded to physical objects such as the activeLENS (a physically embodied window), the phicon (a physically embodied icon—in this case, models of MIT buildings such as the Great Dome and the Media Lab building), and so forth. In the prototype system, the activeLENS was equivalent to a window of the virtual desktop and was used in navigating and examining the
three-dimensional views of the MIT geographical model. Users could physically control the phicons by grasping and placing them so that a two-dimensional map of the MIT campus appeared on the desk surface beneath the phicons. The locations of the Dome and the Media Lab buildings on the map matched the physical locations of the phicons on the desk.

Ubiquitous Computing
In 1988 Mark Weiser at Xerox PARC introduced a computing paradigm called ubiquitous computing. The main idea was to enable users to access computing services wherever they might go and whenever they might need them. Another requirement was that the computers be invisible to the users, so the users would not be conscious of them. The users do what they normally do, and the computers in the background recognize the intentions of the users and provide the best services for them. This means that the users do not have to learn how to operate computers, how to type on a keyboard, how to access the Internet, and so on. Therefore, the paradigm requires that new types of computing services and computer systems be created. New technologies such as context awareness, sensors, and intelligent distributed processing are required. Their interaction methods must be based on diverse technologies such as face recognition, character recognition, gesture recognition, and voice recognition.
OPEN-SOURCE SOFTWARE Open-source software permits sharing of a program’s original source code with users, so that the software can be modified and redistributed to other users.
As new computing services and technologies are introduced, new types of computing environments and new interaction paradigms will emerge. The desktop metaphor will also evolve to keep pace with technological advances. However, the design goals of the user interfaces will not change much. They should be designed to make users more comfortable, more effective, and more productive in using their computers.

Jee-In Kim
See also Alto; Augmented Reality; Graphical User Interface

FURTHER READING

Goldberg, A. (1984). Smalltalk-80: The interactive programming environment. Reading, MA: Addison-Wesley.
Ishii, H., & Ullmer, B. (1997). Tangible bits: Towards seamless interfaces between people, bits and atoms. In Proceedings of CHI ’97 (pp. 234–241). New York: ACM Press.
Kay, A. (1993). The early history of Smalltalk. ACM SIGPLAN Notices, 28(3), 69–95.
Kay, A., & Goldberg, A. (1977). Personal dynamic media. IEEE Computer, 10(3), 31–42.
Myers, B., Ioannidis, Y., Hollan, J., Cruz, I., Bryson, S., Bulterman, D., et al. (1996). Strategic directions in human computer interaction. ACM Computing Survey, 28(4), 794–809.
Perkins, R., Keller, D., & Ludolph, F. (1997). Inventing the Lisa user interface. Interactions, 4(1), 40–53.
Shneiderman, B. (1998). Designing the user interface: Strategies for effective human-computer interaction (3rd ed.). Reading, MA: Addison-Wesley.
Weiser, M. (1991). The computer for the 21st century. Scientific American, 265(3), 94–104.
DIALOG SYSTEMS
Speech dialog systems are dialog systems that use speech recognition and speech generation to allow a human being to converse with a computer, usually to perform some well-defined task such as making travel reservations over the telephone. A dialog is a two-way interaction between two agents that communicate. Dialogs are incremental and can be adapted dynamically to improve the effectiveness of the communication. While people communicate efficiently and effectively using dialog, computers do not typically engage in dialogs with people. More common are presentation systems, which are concerned with the effective presentation of a fixed content, subject to a limited number of constraints. Unlike dialogs, presentations are planned and displayed in their entirety (without intermediate feedback from the user) and thus do not allow the system to monitor the effectiveness of the presentation or allow the user to interrupt and request clarification. Dialog systems have been less common than presentation systems because they
are more resource intensive; however, despite the extra costs, the need to reach a broader community of users and the desire to create systems that can perform tasks requiring collaboration with users have led to a shift toward more dialog-based systems. Speech is a good modality for remote database access systems, such as telephone information services, which would otherwise require a human operator or a tedious sequence of telephone keystrokes. Spoken interaction is also useful when a user’s hands are busy with other tasks, such as operating mechanical controls, or for tasks for which the sound of the user’s speech is important, such as tutoring speakers in oral reading. Speech interaction can also make computers more accessible to people with vision impairments. Speech dialog systems have been used for tutoring in oral reading; for providing information about public transportation, train schedules, hotels, and sightseeing; for making restaurant and real estate recommendations; for helping people diagnose failures in electronic circuits; and for making travel reservations. Spoken dialog is most successful when the scope of the task is well defined and narrow, such as providing airline reservations or train schedules, because the task creates expectations of what people will say—and the more limited the scope, the more limited the expectations. These expectations are needed for the system to interpret what has been said; in most cases the same group of speech sounds will have several different possible interpretations, but the task for which the dialog system is used makes one of the interpretations by far the most likely.
The Architecture of Speech Dialog Systems Speech dialog systems include the following components or processes: speech recognition, natural-language parsing, dialog management, natural-language generation, and speech synthesis. There is also an application or database that provides the core functionality of the system (such as booking a travel reservation) and a user interface to transmit inputs from the microphone or telephone to the speech-recognition component.
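The interaction among these components can be pictured, at its simplest, as a chain in which each stage hands its output to the next. The short Python sketch below is purely illustrative; the function bodies are stubs and the slot names are invented for this example rather than taken from any particular system.

# Minimal sketch of one turn in a pipelined spoken-dialog system.
# All names and values are invented for illustration; the stages are stubs.
def recognize(audio):            # speech recognition: audio signal -> word sequence
    return "i want a flight to boston on friday"

def parse(words):                # natural-language parsing: words -> case frame
    return {"intent": "book_flight", "destination": "boston", "date": "friday"}

def manage(frame, state):        # dialog management: decide the next action
    missing = [slot for slot in ("origin", "destination", "date") if slot not in frame]
    return ("ask", missing[0]) if missing else ("confirm", frame)

def generate(action):            # natural-language generation: action -> text
    kind, value = action
    return "What city are you leaving from?" if kind == "ask" else f"Booking confirmed: {value}"

def synthesize(text):            # speech synthesis: text -> audio (stubbed as a string)
    return f"<audio: {text}>"

state = {}
print(synthesize(generate(manage(parse(recognize(b"...")), state))))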
Speech Recognition Understanding speech involves taking the sound input and mapping it onto a command, request, or statement of fact to which the application can respond. Speech recognition is the first step, which involves mapping the audio signal into words in the target language. Early approaches, such as HEARSAY II and HARPY, which were developed in the 1970s, were based on rule-based artificial intelligence. They were not very successful. Current approaches are based on statistical models of language that select the most probable interpretation of each sound unit given immediately preceding or following ones and the context of the task (which determines the vocabulary). For spoken-dialog systems, the level of speech recognition quality that is desired is known as telephone quality, spontaneous speech (TQSS). This level of recognition is necessary if spoken-dialog applications such as reservation services are to be successful over a telephone line. TQSS is more difficult to understand than face-to-face speech because over the telephone the audio signal normally includes background and channel noise, acoustical echo and channel variations, and degradation due to bandwidth constraints. Moreover, spontaneous speech includes pauses, disfluencies (such as repetitions and incomplete or ill-formed sentences), pronunciation variations due to dialects, as well as context-dependent formulations and interruptions or overlapping speech (known as “barge-in”). One way that speech recognition is made more accurate is by limiting the vocabulary that the system allows. To determine a sublanguage that will be sufficiently expressive to allow people to use the application effectively and comfortably, two techniques are generally used: One approach is to observe or stage examples of two people engaged in the domain task; the other approach is to construct a simulated man-machine dialog (known as a Wizard of Oz [WOZ] simulation) in which users try to solve the domain task. In a WOZ simulation, users are led to believe they are communicating with a functioning system, while in reality the output is generated by a person who simulates the intended functionality of the system. This approach allows the designers to see what language people will use in response to the limited vocabulary of the proposed system and
its style of dialog interaction. In either case, a vocabulary and a language model can be obtained that may only require a few thousand (and possibly only a few hundred) words. Natural-Language Parsing Natural-language parsing maps the sequence of words produced by the speech recognizer onto commands, queries, or propositions that will be meaningful to the application. There are a variety of approaches to parsing; some try to identify general-purpose linguistic patterns, following a so-called syntactic grammar, while others look for patterns that are specific to the domain, such as a semantic grammar. Some systems use simpler approaches, such as word spotting, pattern matching, or phrase spotting. The output of the parser will typically be a slot-and-filler-based structure called a case frame, in which phrases in the input are mapped to slots corresponding to functions or parameters of the application. A key requirement of parsing for spoken dialog is that the parser be able to handle utterances that do not form a complete sentence or that contain the occasional grammatical mistake. Such parsers are termed robust. Syntactic and semantic parsers work best when the input is well formed structurally. Simpler methods such as pattern matching and phrase spotting can be more flexible about structural ill-formedness, but may miss important syntactic variations such as negations, passives, and topicalizations. Also, the simpler approaches have little information about how to choose between two different close matches. To be useful in practice, another requirement is that the parser be fast enough to work in real time, which is usually only possible if the analysis is expectation driven. By the late 1990s, most spoken-dialog systems still focused on getting the key components to work together and had not achieved real-time interpretation and generation of speech; only a few systems ran in real time. Dialog Management Dialog management involves interpreting the representations created by the natural-language parser and deciding what action to take. This process is often the central one and drives the rest of the
system. Dialog management may involve following a fixed pattern of action defined by a grammar (for example, answering a question), or it may involve reasoning about the users’ or the system’s current knowledge and goals to determine the most appropriate next step. In this second instance, then, dialog management may also keep track of the possible goals of the users and their strategies (plans) for achieving them. It may also try to identify and resolve breakdowns in communication caused by lack of understanding, misunderstanding, or disagreement. One factor that distinguishes dialog managers is the distribution of control between the system and the user. This has been referred to as initiative, or the mode of communication, with the mode being considered from the perspective of the computer system. When the computer has complete control, it is responsible for issuing queries to the user, collecting answers, and formulating a response. This has been called directive mode. At the opposite extreme, some systems allow the user to have complete control, telling the system what the user wants to do and asking the system to provide answers to specific queries. This is known as passive mode. In the middle are systems that share initiative with the user. The system may begin by issuing a query to the user (or receiving a query from the user) but control may shift if either party wishes to request clarification or to obtain information needed for a response. Control may also shift if one party identifies a possible breakdown in communication or if one party disagrees with information provided by the other. Dialogs have been shown to be more efficient if control can shift to the party with the most information about the current state of the task. Natural-Language Generation Natural-language generation is used to generate answers to the user’s queries or to formulate queries for the user in order to obtain the information needed to perform a given task. Natural-language generation involves three core tasks: content selection (deciding what to say), sentence planning (deciding how to organize what to say into units), and realization (mapping the planned response onto a grammatically correct sequence of words).
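These three generation tasks, operating on the kind of slot-and-filler case frame produced by the parser, can be made concrete with a minimal sketch. The frame contents, templates, and selection rule below are invented for illustration and do not come from any particular toolkit.

# Minimal sketch of content selection, sentence planning, and realization.
# The case frame, templates, and heuristic are invented for this example.
TEMPLATES = {
    "ask_slot": "What {slot} would you like?",
    "confirm":  "Booking a trip to {destination} on {date}.",
}

def generate(frame):
    # Content selection: decide whether to ask for missing information or confirm.
    missing = [slot for slot in ("destination", "date") if slot not in frame]
    # Sentence planning: package the decision and its arguments as one utterance unit.
    plan = ("ask_slot", {"slot": missing[0]}) if missing else ("confirm", frame)
    # Realization: fill a template rather than searching a generation grammar,
    # which is what keeps this kind of approach fast enough for real-time use.
    name, arguments = plan
    return TEMPLATES[name].format(**arguments)

print(generate({"destination": "boston"}))                   # What date would you like?
print(generate({"destination": "boston", "date": "friday"})) # Booking a trip to boston on friday.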
Historically, natural-language generation components have not run in real time, with the realization component being an important bottleneck. These systems can be slow if they follow an approach that is essentially the inverse of parsing—taking a structural description of a sentence, searching for grammar rules that match the description, and then applying each of the rules to produce a sequence of words. As a result, many spoken-dialog systems have relied on preformulated answers (canned text). More recently, real-time approaches to text generation have been developed that make use of fixed patterns or templates that an application can select and thereby bypass the need to perform a search within the generation grammar. Speech Synthesis Speech synthesis allows the computer to respond to the user in spoken language. This may involve selecting and concatenating pieces of prerecorded speech or generating speech two sounds at a time, a method known as diphone-based synthesis. (Diphone refers to pairs of sounds.) Databases of utterances to be prerecorded for a domain can be determined by analyzing the utterances produced by a human performing the same task as the information system and then selecting the most frequent utterances. Diphone-based synthesis also requires a database of prerecorded sound; however, instead of complete utterances the database will contain a set of nonsense words (that have examples of all pairs of sounds), containing all phone-phone transitions for the target output language. Then when the synthesizer wants to generate a pair of sounds, it selects a word that contains the sound-pair (diphone) and uses the corresponding portion of the recording. Although these basic components of speech dialog systems can be combined in a number of ways, there are three general approaches: pipelined architectures, agent-based architectures, and hub-and-spoke-based architectures. In a pipelined architecture, each component in the sequence processes its input and initiates the next component in the sequence. Thus, the audio interface would call the speech recognizer, which would call the natural-language parser, and so on, until the speech synthesis
component is executed. In an agent-based approach, a centralized component (typically the dialog manager) initiates individual components and determines what parameters to provide them. This may involve some reasoning about the results provided by the components. In a hub-and-spoke architecture there is a simple centralized component (the hub) which brokers communication among the other components, but performs no reasoning. Since 1994, a hub-and-spoke architecture called Galaxy Communicator has been under development. It has been proposed as a standard reference architecture that will allow software developers to combine “plug-and-play”-style components from a variety of research groups or commercial vendors. The Galaxy Communicator effort also includes an open-source software infrastructure.
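The following toy sketch suggests, in schematic form, how a hub might broker traffic among components without doing any reasoning of its own. The class, routing table, and handlers are invented for this example and are not the actual Galaxy Communicator interface.

# Toy hub that forwards messages among registered components.
# Purely schematic; component names and routes are invented.
class Hub:
    def __init__(self):
        self.components = {}
        self.routes = {"recognizer": "parser", "parser": "dialog_manager",
                       "dialog_manager": "generator", "generator": "synthesizer"}

    def register(self, name, handler):
        self.components[name] = handler

    def send(self, source, message):
        # The hub only brokers traffic from one component to the next; it performs no reasoning.
        target = self.routes.get(source)
        while target:
            message = self.components[target](message)
            source, target = target, self.routes.get(target)
        return message

hub = Hub()
hub.register("parser", lambda words: {"intent": "train_schedule", "city": "boston"})
hub.register("dialog_manager", lambda frame: ("lookup", frame))
hub.register("generator", lambda action: "The next train to Boston leaves at 9:05.")
hub.register("synthesizer", lambda text: f"<audio: {text}>")
print(hub.send("recognizer", "when is the next train to boston"))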
Dialog System Toolkits Creating speech dialog systems is a major undertaking because of the number and complexity of the components involved. This difficulty is mitigated by the availability of a number of software toolkits that include many, if not all, of the components needed to create a new spoken-dialog system. Currently such toolkits are available both from commercial vendors (such as IBM, which markets ViaVoice toolkits for speech recognition and speech synthesis) and academic institutions. Academic institutions generally distribute their software free for noncommercial use, but sell licenses for commercial applications. Below, we consider a few of the major (academically available) speech dialog toolkits. In addition to these toolkits, a number of institutions distribute individual components useful in building speech dialog systems. For example, the Festival Speech Synthesis System developed by the University of Edinburgh has been used in a number of applications. The Communicator Spoken Dialog Toolkit, developed by researchers at Carnegie Mellon University, is an open-source toolkit that provides a complete set of software components for building and deploying spoken-language dialog systems for both desktop and telephone applications. It is built on top of the Galaxy Communicator software
infrastructure and is distributed with a working implementation for a travel-planning domain. It can be downloaded freely from the Internet. The group also maintains a telephone number connected to their telephone-based travel-planning system that anyone can try. The Center for Spoken Language Research (CSLR) at the University of Colorado in Boulder distributes the Conversational Agent Toolkit. This toolkit includes modules that provide most of the functionality needed to build a spoken-dialog system, although code must be written for the application itself. As a model, CSLR distributes their toolkit with a sample (open-source) application for the travel domain that can be used as a template; it is based on the Galaxy Communicator hub architecture. TRINDIKIT is a toolkit for building and experimenting with dialog move engines, mechanisms for updating what a dialog system knows based on dialog moves (single communicative actions such as “giving positive feedback”) and information states (the information stored by the dialog system). It has been developed in the TRINDI and SIRIDUS projects, two European research projects that investigate human-machine communication using natural language. TRINDIKIT specifies formats for defining information states, rules for updating the information state, types of dialog moves, and associated algorithms.
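The idea of update rules applied to an information state can be suggested with a small sketch. The state fields, move names, and rule below are invented for this example and do not reflect TRINDIKIT’s actual notation.

# Schematic information-state update driven by dialog moves.
# Field names, move names, and the rule are invented for illustration.
state = {
    "shared_beliefs": [],                  # what both parties are taken to agree on
    "open_questions": ["destination?"],    # questions the system has raised
    "last_move": None,
}

def apply_move(state, move, content=None):
    # A dialog move triggers update rules that modify the information state.
    if move == "answer" and state["open_questions"]:
        question = state["open_questions"].pop(0)
        state["shared_beliefs"].append((question, content))
    elif move == "positive_feedback" and state["last_move"] == "answer":
        pass                               # grounding: the previous answer is acknowledged
    state["last_move"] = move
    return state

apply_move(state, "answer", "boston")
apply_move(state, "positive_feedback")
print(state)
# {'shared_beliefs': [('destination?', 'boston')], 'open_questions': [], 'last_move': 'positive_feedback'}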
The Evaluation of Speech Dialog Systems Prior to 1990, methods of evaluating speech dialog systems concentrated on the number of words that the speech recognizer identified correctly. In the early 1990s, there was a shift to looking at the quality of the responses provided by spoken-dialog systems. For example, in 1991, the U.S. Defense Advanced Research Projects Agency community introduced a metric that evaluates systems based on the number of correct and incorrect answers given by the system. Systems are rewarded for correct answers and penalized for bad answers, normalized by the total number of answers given. (The effect is that it is better to give a nonanswer such as “I do not understand”
or “please rephrase your request” than to give an incorrect answer.) This approach relies on the existence of a test database with a number of sample sentences from the domain along with the correct answer, as well as a set of answers from the system to be evaluated. Starting in the late 1990s, approaches to evaluating dialog success have looked at other measures, such as task-completion rates and user satisfaction (as determined by subjective questionnaires). Subjective factors include perceived system-response accuracy, likeability, cognitive demand (how much effort is needed to understand the system), habitability (how comfortable or natural the system is to use), and speed. There has also been success in predicting user satisfaction or task completion on the basis of objectively observable features of the dialog, such as task duration, the number of system words per turn, the number of user words per turn, the number of overlapping turns, sentence error rates, and perceived task completion. Statistical methods such as multiple regression models and classification trees are then used to predict user satisfaction and task-completion scores.
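Because the exact normalization used in that answer-based metric is not reproduced here, the short sketch below should be read as one plausible interpretation, not the official formula: correct answers add to the score, incorrect answers subtract from it, and nonanswers count only toward the total, which is what makes declining to answer preferable to answering wrongly.

# One plausible reading of the early-1990s answer-based scoring (an assumption,
# not the exact DARPA formula): reward correct answers, penalize incorrect ones,
# and treat nonanswers ("please rephrase") as neutral.
def answer_score(correct, incorrect, nonanswers):
    total = correct + incorrect + nonanswers
    return (correct - incorrect) / total if total else 0.0

# A system that declines to answer scores better than one that answers wrongly:
print(answer_score(correct=70, incorrect=0, nonanswers=30))   # 0.7
print(answer_score(correct=70, incorrect=30, nonanswers=0))   # 0.4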
The Research Community for Speech Dialog Systems Research on speech dialog systems is interdisciplinary, bringing together work in computer science, engineering, linguistics, and psychology. There are a number of journals, conferences, and workshops through which researchers and developers of spoken-dialog systems disseminate their work. Important journals include Computer Speech and Language and Natural Language Engineering. Conferences most focused on such systems include Eurospeech and Interspeech (the International Conference on Spoken Language Processing). In addition, the Special Interest Group on Discourse and Dialog (SIGdial) organizes an annual workshop. SIGdial is a Special Interest Group (SIG) of both the Association for Computational Linguistics and the International Speech Communication Association (ISCA). SIGdial is an international, nonprofit cooperative organization that includes researchers from academia, industry,
and government. Among its goals are promoting, developing, and distributing reusable discourse-processing components; encouraging empirical methods in research; sharing resources and data among the international community; exploring techniques for evaluating dialog systems; promoting standards for discourse transcription, segmentation, and annotation; facilitating collaboration between developers of various system components; and encouraging student participation in the discourse and dialog community. Susan W. McRoy See also Natural-Language Processing; Open Source Software; Speech Recognition; Speech Synthesis
FURTHER READING

Allen, J. F., Schubert, L. K., Ferguson, G., Heeman, P., Hwang, C. H., Kato, T., et al. (1995). The TRAINS project: A case study in building a conversational planning agent. Journal of Experimental and Theoretical Artificial Intelligence, 7, 7–48.
Bernsen, N. O., Dybkjaer, H., & Dybkjaer, L. (1998). Designing interactive speech systems: From first ideas to user testing. New York: Springer Verlag.
Fraser, N. (1997). Assessment of interactive systems. In D. Gibbon, R. Moore, and R. Winski (Eds.), Handbook of standards and resources for spoken language systems (pp. 564–614). New York: Mouton de Gruyter.
Grosz, B. J., & Sidner, C. (1986). Attention, intention, and the structure of discourse. Computational Linguistics, 12(3), 175–204.
Haller, S., Kobsa, A., & McRoy, S. (Eds.). (1999). Computational models for mixed-initiative interaction. Dordrecht, Netherlands: Kluwer Academic Press.
Huang, X. D., Alleva, F., Hon, H. W., Hwang, M. Y., Lee, K. F., & Rosenfeld, R. (1993). The Sphinx II Speech Recognition System: An overview. Computer Speech and Language, 7(9), 137–148.
Jurafsky, D., & Martin, J. (2000). Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition. Upper Saddle River, NJ: Prentice-Hall.
Larsson, S., & Traum, D. (2000). Information state and dialogue management in the TRINDI Dialogue Move Engine Toolkit [Special issue on best practice in spoken dialogue systems]. Natural Language Engineering, 6(3–4), 323–340.
Luperfoy, S. (Ed.). (1998). Automated spoken dialog systems. Cambridge, MA: MIT Press.
McRoy, S. W. (Ed.). (1998). Detecting, repairing, and preventing human-machine miscommunication [Special issue]. International Journal of Human-Computer Studies, 48(5).
McRoy, S. W., Channarukul, S., & Ali, S. S. (2001). Creating natural language output for real-time applications. Intelligence: New Visions of AI in Practice, 12(2), 21–34.
McRoy, S. W., Channarukul, S., & Ali, S. S. (2003). An augmented template-based approach to text realization. Natural Language Engineering, 9(2), 1–40.
Minker, W., Bühler, D., & Dybkjær, L. (2004). Spoken multimodal human-computer dialog in mobile environments. Dordrecht, Netherlands: Kluwer.
Mostow, J., Roth, S. F., Hauptmann, A., & Kane, M. (1994). A prototype reading coach that listens. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94) (pp. 785–792). Seattle, WA: AAAI Press.
Pellom, B., Ward, W., Hansen, J., Hacioglu, K., Zhang, J., Yu, X., & Pradhan, S. (2001, March). University of Colorado dialog systems for travel and navigation. Paper presented at the Human Language Technology Conference (HLT-2001), San Diego, CA.
Roe, D. B., & Wilpon, J. G. (Eds.). (1995). Voice communication between humans and machines. Washington, D.C.: National Academy Press.
Seneff, S., Hurley, E., Lau, R., Pau, C., Schmid, P., & Zue, V. (1998). Galaxy II: A reference architecture for conversational system development. Proceedings of the 5th International Conference on Spoken Language Processing, 931–934.
Smith, R., & Hipp, D. R. (1995). Spoken natural language dialog systems: A practical approach. New York: Oxford University Press.
Smith, R., & van Kuppevelt, J. (Eds.). (2003). Current and new directions in discourse and dialogue. Dordrecht, Netherlands: Kluwer.
van Kuppevelt, J., Heid, U., & Kamp, H. (Eds.). (2000). Best practice in spoken dialog systems [Special issue]. Natural Language Engineering, 6(3–4).
Walker, M., Litman, D., Kamm, C., & Abella, A. (1998). Evaluating spoken dialogue agents with PARADISE: Two case studies. Computer Speech and Language, 12(3), 317–347.
Walker, M. A., Kamm, C. A., & Litman, D. J. (2000). Towards developing general models of usability with PARADISE [Special issue on best practice in spoken dialogue systems]. Natural Language Engineering, 6(3–4).
Wilks, Y. (Ed.). (1999). Machine conversations. Dordrecht, Netherlands: Kluwer.
DIGITAL CASH The use of digital cash has increased in parallel with the use of electronic commerce; as we purchase items online, we need to have ways to pay for them electronically. Many systems of electronic payment exist.
Types of Money Most systems of handling money fall into one of two categories:
1. Token-based systems store funds as tokens that can be exchanged between parties. Traditional currency falls in this category, as do many types of stored-value payment systems, such as subway fare cards, bridge and highway toll systems in large metropolitan areas (e.g., FastPass, EasyPass), and electronic postage meters. These systems store value in the form of tokens, either a physical token, such as a dollar bill, or an electronic register value, such as is stored by a subway fare card. During an exchange, if the full value of a token is not used, then the remainder is returned (analogous to change in a currency transaction)—either as a set of smaller tokens or as a decremented register value. Generally, if tokens are lost (e.g., if one’s wallet is stolen or one loses a subway card), the tokens cannot be recovered.
2. Account-based systems charge transactions to an account. Either the account number or a reference to the account is used to make payment. Examples include checking accounts, credit card accounts, and telephone calling cards. In some instances, the account is initially funded and then spent down (e.g., checking accounts); in other instances, debt is increased and periodically must be paid (e.g., credit cards). In most account-based systems, funds (or debt) are recorded by a trusted third party, such as a bank. The account can be turned off or renumbered if the account number is lost.
The more complex an electronic payment system is, the less likely consumers are to use it. (As an example, a rule of thumb is that merchants offering “one-click ordering” for online purchases enjoy twice the order rate of merchants requiring that payment data be repeatedly entered with each purchase.)
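The distinction between the two categories can be sketched in a few lines of code. The class names, fields, and amounts below are invented for illustration only.

# Minimal sketch contrasting token-based and account-based payment systems.
class FareCard:                      # token-based: the value lives on the card itself
    def __init__(self, value_cents):
        self.value_cents = value_cents
    def pay(self, amount_cents):
        if amount_cents > self.value_cents:
            raise ValueError("insufficient stored value")
        self.value_cents -= amount_cents    # the register value is decremented;
                                            # if the card is lost, the remaining value is lost too

class CreditAccount:                 # account-based: a trusted third party keeps the books
    def __init__(self, account_id):
        self.account_id = account_id
        self.debt_cents = 0
    def charge(self, amount_cents):
        self.debt_cents += amount_cents     # debt accumulates and is settled periodically;
                                            # the issuer can cancel or renumber a lost account

card = FareCard(500); card.pay(175)         # subway fare: 5.00 stored, 1.75 deducted
account = CreditAccount("example-account"); account.charge(2999)
print(card.value_cents, account.debt_cents) # 325 2999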
Electronic Payment Using Credit Cards The most common form of electronic payment on the Internet today is credit card payment. Credit cards are account based. They are issued by financial institutions to consumers and in some cases to
organizations. A consumer presents the credit card number to a merchant to pay for a transaction. On the World Wide Web credit card account numbers are typically encrypted using the Secure Socket Layer (SSL) protocol built into most Web browsers. The merchant often attempts to verify the card holder by performing address verification (checking numbers appearing in an address) or by using a special verification code (typically printed on the reverse side of the credit card). In the United States credit card users typically enjoy strong rights and can reverse fraudulent transactions. Although the SSL protocol (in typical configurations) provides strong encryption preventing third parties from observing the transaction, risks still exist for the credit card holder. Many merchants apply inadequate security to their database of purchases, and attackers have gained access to large numbers of credit cards stored online. Moreover, some merchants charge incorrect amounts (or charge multiple times) for credit card transactions. Although fraudulent transactions are generally reversible for U.S. residents, time and effort are required to check and amend such transactions. In some instances, criminals engage in identity theft to apply for additional credit by using the identity of the victim. To reduce these risks, some experts have proposed a system that uses third parties (such as the bank that issued the card) to perform credit card transactions. A notable example of this type of system is Verified by Visa. However, the additional work required to configure the system has deterred some consumers, and as a result Verified by Visa and similar systems remain largely unused. The most elaborate of these systems was the Secure Electronic Transactions (SET) protocol proposed by MasterCard International and Visa International; however, the complexity of SET led to its being abandoned. In these systems the processing of credit card purchases is usually funded by a fee charged to the merchant. Although rates vary, typical fees are fifty cents plus 2 percent of the purchase amount.
Third-Party Payment Accounts A merchant must be able to process credit card payments. This processing is often inconvenient for small merchants, such as people who sell items in online
auctions. As a result, a market has opened for third-party payment processors. Today, the largest third-party payment processor is PayPal, owned by the eBay auction service. Third-party payment processor systems are account based. Consumers can pay for third-party purchases in three ways: by paying from an account maintained with the third party, by paying from a credit card account, and by paying from a checking account. Merchants’ rates for accepting funds from a credit card-funded third-party account are slightly higher than their rates for accepting funds from a conventional credit card account. Third-party payment accounts are convenient because they are simple to use and provide consumers with protection against being overcharged. However, they tend not to provide the same degree of protection that a credit card-funded purchase provides. Because third-party payment accounts are widely used with auction systems, where fraud rates are unusually high, the degree of protection is a serious consideration.
Smartcards and Other Stored-Value Systems Stored-value systems store value on a card that is used as needed. Smartcards are a token-based payment system. Many smartcards use an integrated circuit to pay for purchases. They are widely used in Europe for phone cards and in the GSM cellular telephone system. Mondex is a consumer-based system for point-of-sale purchases using smartcards. Use of smartcards is limited in Asia, and smartcards are largely unused in North America. (In North America only one major vendor, American Express, has issued smartcards to large numbers of users, and in those cards the smartcard feature is currently turned off.) Experts have raised a number of questions about the security of smartcards. Successful attacks conducted by security testers have been demonstrated against most smartcard systems. Experts have raised even deeper questions about the privacy protection provided by these systems. For example, in Taiwan, where the government has been moving to switch from paper records to a smartcard system for processing National Health Insurance payments,
considerable public concern has been raised about potential privacy invasions associated with the use of health and insurance records on a smartcard system. A number of devices function like a smartcard but have different packaging. For example, some urban areas have adopted the FastPass system, which allows drivers to pay bridge and highway tolls using radio link technology. As a car passes over a sensor at a toll booth, value stored in the FastPass device on the car is decremented to pay the toll. The state of California recently disclosed that it uses the same technology to monitor traffic flow even when no toll is charged. The state maintains that it does not gather personal information from FastPass-enabled cars, but experts say that it is theoretically possible.
Anonymous Digital Cash A number of researchers have proposed anonymous digital cash payment systems. These would be token-based systems in which tokens would be issued by a financial institution. A consumer could “blind” such tokens so that they could not be traced to the consumer. Using a cryptographic protocol, a consumer could make payments to merchants without merchants being able to collect information about the consumer. However, if a consumer attempted to copy a cryptographic token and use it multiple times, the cryptographic protocol would probably allow the consumer’s identity to be revealed, allowing the consumer to be prosecuted for fraud. Anonymous digital cash payment systems have remained primarily of theoretical interest, although some trials have been made (notably of the Digicash system pioneered by David Chaum). Anonymous payment for large purchases is illegal in the United States, where large purchases must be recorded and reported to the government. Moreover, consumers generally want to record their purchases (especially large ones) to have maximum consumer protection. Some researchers have demonstrated that anonymous digital cash payment systems are not compatible with atomic purchases (that is, guaranteed exchange of goods for payment). The principal demand for anonymous payment appears to be for transactions designed to evade taxes, transactions of contraband, and transactions of socially undesirable material.
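The blinding step can be illustrated numerically with textbook RSA, as in the toy sketch below. The parameters are tiny and insecure, and the sketch is not the protocol of Digicash or any other deployed system; it only shows how an issuer can sign a token without seeing it.

# Toy numeric illustration of a Chaum-style blind signature (insecure parameters).
n, e, d = 3233, 17, 2753          # toy RSA modulus and key pair (p = 61, q = 53)

token = 1234                      # the value the consumer wants signed as a "coin"
r = 7                             # blinding factor chosen by the consumer, coprime to n

blinded = (token * pow(r, e, n)) % n            # sent to the bank; the bank cannot see `token`
blind_signature = pow(blinded, d, n)            # the bank signs the blinded value
signature = (blind_signature * pow(r, -1, n)) % n   # the consumer removes the blinding factor
                                                    # (pow(r, -1, n) is the modular inverse, Python 3.8+)

assert signature == pow(token, d, n)            # identical to a direct signature on `token`
assert pow(signature, e, n) == token            # any merchant can verify it with the public key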
Micropayments One of the most interesting types of electronic payment is micropayments. In many instances consumers wish to purchase relatively small-value items. For example, consider a website that vends recipes. Each recipe might be sold for only a few cents, but sold in volume, such recipes could be worth a considerable amount. (Similarly, consider a website that offers online digital recordings of songs for ninety-nine cents each.) Currently, making small payments online using traditional payment methods is not feasible. For example, as mentioned, credit card companies typically charge merchants a processing fee of fifty cents plus 2 percent of the purchase amount for credit card transactions—clearly making credit card purchases for items that cost less than fifty cents impractical. Most merchants refuse to deal with small single-purchase amounts and require that consumers either buy a subscription or purchase the right to buy large numbers of items. For example, newspaper websites that offer archived articles typically require that consumers either purchase a subscription to access the articles or purchase a minimum number of archived articles—they refuse to sell archived articles individually. To enable small single purchases, a number of researchers have proposed micropayment systems that are either token based or account based. An example of an account-based micropayment system is the NetBill system designed at Carnegie Mellon University. This system provides strong protection for both consumers and merchants and acts as an aggregator of purchase information. When purchases across a number of merchants exceed a certain threshold amount, that amount is charged in a single credit card purchase. An example of a token-based micropayment system is the Peppercoin system proposed by Ron Rivest and Silvio Micali and currently being commercialized. Peppercoin uses a unique system of “lottery tickets” for purchases. For example, if a consumer wishes to make a ten-cent purchase, he might use a lottery ticket that is worth ten dollars with a probability of 1 percent. The expected value paid by the consumer would be the same as the price of the items he purchased; but any single charge would be large enough to justify being charged using a traditional payment mechanism (such as a credit card).
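A back-of-the-envelope simulation makes the economics of the lottery-ticket idea concrete. The figures below reuse the ten-cent purchase, ten-dollar ticket, and 1 percent probability from the example above, together with the typical card fee of fifty cents plus 2 percent mentioned earlier; the code is illustrative and is not the actual Peppercoin protocol.

# Rough simulation of probabilistic micropayments (illustrative only).
import random

random.seed(0)
ITEM_PRICE = 0.10
TICKET_VALUE = 10.00
WIN_PROBABILITY = ITEM_PRICE / TICKET_VALUE      # 0.01

purchases = 100_000
charges = sum(TICKET_VALUE for _ in range(purchases) if random.random() < WIN_PROBABILITY)
# The $0.50 + 2 percent card fee is paid only on the winning tickets, not on every purchase.
fees = (charges / TICKET_VALUE) * (0.50 + 0.02 * TICKET_VALUE)

print(round(charges / purchases, 3))     # average cost per purchase, close to the 0.10 item price
print(round(fees / purchases, 4))        # per-purchase processing cost, well under one cent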
Despite the promise of micropayment systems, they remain largely unused. Most merchants prefer to support small-value items by using Web-based advertising or subscriptions. Nonetheless, advocates of micropayment systems maintain that such systems enable new classes of electronic commerce.
Challenges for Digital Cash Although digital cash is being increasingly used, a number of challenges remain. The principal challenge is associating payment with delivery of goods (this challenge is often known as the “atomic swap” or “fair exchange” problem). Merchants also need to be protected from buyers who use stolen payment information, and consumers need to be protected from merchants who inadequately protect payment information (or, even worse, engage in fraud). Finally, effective payment methods need to be developed and accepted to support both large and small purchases. A balance must be reached between consumers who want anonymous purchases and government authorities who want to tax or record purchases. These challenges make digital cash a rapidly developing research area. J. D. Tygar See also E-business
FURTHER READING

Chaum, D., Fiat, A., & Naor, M. (1990). Untraceable electronic cash. In G. Blakley & D. Chaum (Eds.), Advances in cryptology (pp. 319–327). Heidelberg, Germany: Springer-Verlag.
Electronic Privacy Information Center. (2003). Privacy and human rights 2003. Washington, DC: Author.
Evans, D., & Schmalensee, R. (2000). Paying with plastic: The digital revolution in buying and borrowing. Cambridge, MA: MIT Press.
Kocher, P., Jaffe, J., & Jun, B. (1999). Differential power analysis. In M. Weiner (Ed.), Advances in cryptology (pp. 388–397). Heidelberg, Germany: Springer-Verlag.
Mann, R., & Winn, J. (2002). Electronic commerce. Gaithersburg, MD: Aspen Publishers.
O’Mahony, D., Peirce, M., & Tewari, H. (2001). Electronic payment systems for e-commerce (2nd ed.). Norwood, MA: Artech House.
Tygar, J. D. (1998). Atomicity in electronic commerce. Networker, 2(2), 23–43.
Wayner, P. (1997). Digital cash: Commerce on the Net (2nd ed.). San Francisco: Morgan-Kaufmann.
DIGITAL DIVIDE There is both optimism and pessimism about the ultimate impact of the digital revolution on individual, societal, and global well-being. On the optimistic side are hopes that access to information and communication technologies, particularly the Internet, will facilitate a more equitable distribution of social, economic, and political goods and services. On the pessimistic side are beliefs that lack of access to these technologies will exacerbate existing inequalities, both globally and among groups within societies. The phrase digital divide was coined to refer to this gap between the technology haves and have-nots— between those who have access to information and communications technologies, most notably the Internet, and those who do not. The overriding concern is that a world divided by geographic, religious, political, and other barriers will become further divided by differing degrees of access to digital technologies.
Evidence of a Digital Divide Evidence supporting the existence of a global digital divide is overwhelming. Of the estimated 430 million people online in 2001, 41 percent resided in the United States and Canada. The remaining Internet users were distributed as follows: 25 percent in Europe, 20 percent in the Asian Pacific (33 percent of this group in Asia, 8 percent in Australia and New Zealand), 4 percent in South America, and 2 percent in the Middle East and Africa. Even among highly developed nations there are vast differences in Internet access. For example, in Sweden 61 percent of homes have Internet access compared to 20 percent of homes in Spain. In light of the global digital divide, evidence that Internet use is rapidly increasing takes on additional significance. According to data compiled by a variety of sources, the rise in Internet use extends throughout both the developed and developing world. The rapidly increasing global reach of the Internet intensifies concerns about its potential to exacerbate existing global economic and social disparities.
HomeNetToo Tries to Bridge Digital Divide “If I’m stressed out or depressed or the day is not going right, I just get on the computer and just start messing around and I come up with all sorts of things like ‘okay, wow.’ ”
Begun in the fall of 2000, HomeNetToo was an eighteen-month field study of home Internet use in low-income families. Funded by an Information Technology Research grant from the National Science Foundation, the project recruited ninety families who received in-home instruction on using the Internet, and agreed to have their Internet use recorded and to complete surveys on their experiences. In exchange, each family received a new home computer, Internet access, and in-home technical support. The comments of the HomeNetToo participants about their computer use provide a broad range of views about the pleasures and problems of computer interactions:
“You get a lot of respect because you have a computer in your house. I think people view you a little differently.” “A lot of times I’m real busy, and it was hard for me to get a turn on the computer too. My best chance of getting time on the computer is I get up at 6 AM and the rest of the family gets up at seven. So if I finish my bath and get ready quickly I can get on before anyone else is up. And I can have an hour space to do whatever I want while they’re sleeping and getting up and dressed themselves.”
“When somebody’s on the computer whatever it is they’re doing on that computer at that time, that’s the world they’re in…it’s another world.”
“I feel like I don’t have time ...who has time to watch or play with these machines. There’s so much more in life to do.”
“With the computer I can do things…well, I tell the computer to do things nobody else will ever know about, you know what I am saying? I have a little journal that I keep that actually nobody else will know about unless I pull it up.”
“Instead of clicking, I would like to talk to it and then say ‘Can I go back please?’ ” “They talk in computer technical terms. If they could talk more in layman’s terms, you know, we could understand more and solve our own problems.”
“I escape on the computer all the time...I like feeling ‘connected to the world’ and I can dream.”
Source: Jackson, L. A., Barbatsis, G., von Eye, A., Biocca, F. A., Zhao, Y., & Fitzgerald, H. E. (2003c). Implications for the digital divide of Internet use in low-income families. IT & Society, 1(5), 219–244.
Evidence for a digital divide within the United States is a bit more controversial, and has shifted from irrefutable in 1995 to disputable in 2002. In its first Internet report in 1995, the U.S. Department of Commerce noted large disparities in Internet access attributable to income, education, age, race or ethnicity, geographic location, and gender. By the time of its fifth Internet report in 2002, all disparities had shrunk substantially. However, only a few disappeared entirely. Although 143 million U.S. citizens now have access to the Internet (54 percent of the population), gaps attributable to the following factors have been observed in all surveys to date:
■ Income: Income is the best predictor of Internet access. For example, only 25 percent of households with incomes of less than $15,000 had Internet access in 2001, compared to 80 percent of households with incomes of more than $75,000.
■ Education: Higher educational attainment is associated with higher rates of Internet use. For example, among those with bachelor’s degrees or
better, over 80 percent use the Internet, compared to 40 percent of those with only a high school diploma.
■ Age: Internet use rates are highest between the ages of twelve and fifty; they drop precipitously after age fifty-five.
■ Race or ethnicity: Asian/Pacific Islanders and whites are more likely to use the Internet (71 percent and 70 percent, respectively) than are Hispanics and African-Americans (32 percent and 40 percent, respectively). However, growth in Internet use has been greater among the latter than the former groups.
The gender gap so evident in the 1995 U.S. Department of Commerce survey disappeared by the 2002 survey. However, gender-related differences remain. Among those over sixty years old, men had higher Internet use rates than did women. Among the twenty-to-fifty-year-old group, women had higher Internet use rates than did men. Also diminishing if not disappearing entirely are gaps related to geographic location. Internet use rates in rural areas climbed to 53 percent in 2002, almost as high as the national average, but use rates for central-city residents were only 49 percent, compared to 57 percent for urban residents outside the central city.
In addition to the five Internet reports by the U.S. Department of Commerce, a number of other organizations have been tracking Internet use and issues related to the digital divide. The Pew Internet and American Life Project devoted one of its several reports to African-Americans and the Internet, focusing on how African-Americans’ Internet use differs from whites’ use. These differences are important to understanding the racial digital divide in the United States and are potentially important to understanding global digital-divide issues that may emerge as access to the Internet becomes less problematic. The Pew Internet and American Life Project reported the following findings:
■ African-Americans are more likely than whites to use the Internet to search for jobs, places to live, entertainment (for example, music and videos), religious or spiritual information and health care information, and as a means to pursue hobbies and learn new things.
■ African-Americans are less likely than whites to say the Internet helps them to stay connected to family and friends.
■ Women and parents are driving the growth of the African-American Internet population.
■ Mirroring the pattern of gender differences in the general population, African-American women are much more likely than African-American men to search for health, job, and religious information online. African-American men are much more likely than African-American women to search for sports and financial information and to purchase products online.
■ Compared with older African-Americans, those under age thirty are more likely to participate in chat rooms, play games, and use multimedia sources. Older African-Americans are more likely to search for religious information than are younger African-Americans.
■ The gap in Internet access between African-Americans and whites is closing, but African-Americans still have a long way to go. Moreover, those with access to the Internet do not go online as often on a typical day as do whites, and online African-Americans do not participate on a daily basis in most Web activities at the same level as do online whites.
A number of researchers have also been interested in race differences in U.S. Internet access. Donna Hoffman and Thomas Novak, professors of management at Vanderbilt University, examined the reasons for race differences in Internet access and concluded that income and education cannot fully explain them. Even at comparable levels of income and education, African-Americans were less likely to have home PCs and Internet access than were whites. The psychologist Linda Jackson and her colleagues have found race differences in Internet use among college students who had similar access to the Internet. The United States is not the only country to report a domestic digital divide. In Great Britain the digital divide separates town and country, according to a 2002 joint study by IBM and Local Futures, a research and strategy consultancy. According to the study’s findings, Britain’s digital divide may soon
grow so wide that it will not be bridgeable. People in Great Britain’s rural areas currently do not have the same degree of access to new technologies, such as cell phones, as do people in cities and the areas surrounding them.
Why Is There a Digital Divide? The global digital divide appears to have an obvious cause. In the absence of evidence to the contrary, it is reasonable to assume that the divide is attributable to differing degrees of access to digital technologies, especially the Internet. Of course there are a host of reasons why access may be lacking, including the absence of necessary infrastructure, government policy, and abject poverty. Regardless of the specific factor or factors involved, the access explanation assumes that if access were available, then the global divide would disappear. In other words, Internet access would translate readily into Internet use. Explaining the U.S. digital divide in terms of access to digital technologies is a bit more problematic. Indeed, some have argued that there is no digital divide in the U.S. and that the so-called information have-nots are really information want-nots. Those advocating this perspective view the U.S. Department of Commerce 2002 report as evidence that individuals without access have exercised their free choice to say no to the Internet in favor of higher priorities. Moreover, those who argue that the divide is disappearing say that because the growth rate in Internet use is much higher for low-income groups than it is for high-income groups (25 percent as opposed to 15 percent), the gap between rich and poor will eventually be negligible without any intervention from government or the private sector. Those who argue that a digital divide persists in the United States despite increasing low-income access suggest that the divide be reconceptualized to focus on use rather than access. This reconceptualization highlights the importance of understanding people’s motivations for Internet use and nonuse, an understanding that will be even more important if the global digital divide proves to be more than a matter of access to digital technologies.
The Divide between Digital Use and Nonuse Why do individuals choose to use or not use the Internet, assuming they have access to it? A number of studies have examined people’s motivations for using or not using the Internet. According to the “uses and gratifications” model of media use, individuals should use the Internet for the same reasons they use other media, namely, for information, communication, entertainment, escape, and transactions. Research generally supports this view, although the relative importance of these different motivations varies with demographic characteristics of the user and changes in the Internet itself. For example, older users are more likely to use the Internet for information, whereas younger users are more likely to use it for entertainment and escape. Entertainment and escape motives are more important today than they were when the World Wide Web was first launched in 1991. A report issued in 2000 by the Pew Internet and American Life Project focused specifically on why some Americans choose not to use the Internet. The authors noted that 32 percent of those currently without Internet access said they would definitely not be getting access—about 31 million people. Another 25 percent of non-Internet users said they probably would not get access. Reasons for not going online centered on beliefs that the Internet is a dangerous place (54 percent), that the online world has nothing to offer (51 percent), that Internet access is too expensive (39 percent), and that the online world is confusing and difficult to navigate (36 percent). The strongest demographic predictor of the decision not to go online was age. Older Americans apparently perceived few personal benefits to participating in the online world; 87 percent of those sixty-five and older did not have Internet access, and 74 percent of those over fifty who were not online said they had no plans to go online. In contrast, 65 percent of those under fifty said they planned to get Internet access in the near future. Ipsos-Reid, a research firm, used an international sample to examine people’s reasons for not going online. Their findings, published in 2000, were similar to the Pew report findings: Thirty-three percent
of respondents said they had no intention of going online. Their reasons included lack of need for the online world (40 percent), lack of a computer (33 percent), lack of interest in going online (25 percent), lack of necessary technical skills, and general cost concerns (16 percent). The Children’s Partnership, which also published a report in 2000 on why people do not go online, offered four reasons why low-income and underserved Americans may choose to stay away from the Internet. First, the Internet may lack the local information of interest to low-income and underserved Americans; second, there may be literacy barriers; third, there may be language barriers; and fourth, the lack of cultural diversity on the Internet may keep them from participating. Lack of local information disproportionately affects users living on limited incomes. Literacy barriers come into play because online content is often directed at more educated Internet users, particularly users who have discretionary money to spend online. Reading and understanding Web content may be especially difficult for the less educated and those for whom English is a second language (32 million Americans). An estimated 87 percent of the documents on the Internet are in English. The lack of cultural diversity on the Internet may be rendering the Internet less interesting to millions of Americans. Others have argued that access alone may not be enough to produce equity in Internet use in the United States. Gaps will persist due to differences in education, interest in Web topics, and interpersonal contact with others familiar with these topics. All of these factors may affect how eagerly an individual seeks out and consumes information on the Internet.
Whose Responsibility Is the Digital Divide? Opinions vary about whose responsibility it is to address the digital divide, whether it be the global divide, the U.S. divide, or the divide between users and nonusers. At the global level, in June 2002 the United Nations’ telecommunications agency argued that it would take concerted global action to keep the digital divide from growing. The U.N. adopted a resolution to organize a world summit on the information society, the first in Geneva in 2003, and the second in Tunisia in 2005. The summits are expected to promote universal access to the information, knowledge, and communications technologies needed for social and economic development. In April 2002, Erkki Liikanen, the European commissioner for the Enterprise Directorate General and the Information Society Directorate General, argued that developing countries must be included in the shift to a networked, knowledge-based global economy. He stressed the importance of strong political leadership, top-level involvement and contributions from both the public and private sectors. In 2000, the European Commission launched an action plan, the goal of which was to bring all of Europe online by 2002. As a result of this action plan, decision making on telecommunications and e-commerce regulation accelerated and Internet access has moved to the top of the political agenda in all European Union member countries. In the coming years the focus will move to the user and usage of the Internet. The goal is to encourage more profound and inclusive use of the Internet. In the United States a number of nonprofit organizations have looked to the federal government to address the digital divide. For example, upon release of the U.S. Department of Commerce’s fifth digital divide report in 2002, the Benton Foundation issued a policy brief stating that “Targeted [government] funding for community technology is essential to maintain national digital divide leadership” (Arrison 2002). The government, however, continues to minimize the importance of the digital divide, asserting that for all intents and purposes it no longer exists. Thus, while some call for broad-based approaches to eliminating the global digital divide and government intervention to eliminate the U.S. digital divide, others argue that nothing at all needs to be done, that market forces will bridge the digital divide without any other action being taken. Still others believe that access to and use of digital technologies, particularly the Internet, are neither necessary for everyday life nor solutions to social and economic problems in the United States or elsewhere. Linda A. Jackson
See also Economics and HCI; Internet—Worldwide Diffusion
FURTHER READING

Arrison, S. (2002, April 19). Why digital dividers are out of step. Retrieved July 17, 2003, from http://www.pacificresearch.org/press/opd/2002/opd_02-04-19sa.html
Associated Press. (2002, June 22). U.N. warns on global digital divide. Retrieved July 18, 2003, from http://lists.isb.sdnpk.org/pipermail/comp-list/2002-June/001053.html
BBC News. (2002, March 10). Digital divisions split town and country. Retrieved July 18, 2003, from http://news.bbc.co.uk/2/hi/science/nature/1849343.stm
Carvin, A. (2000). Mind the gap: The digital divide as the civil rights issue of the new millennium. Multimedia Schools, 7(1), 56–58. Retrieved July 17, 2003, from http://www.infotoday.com/mmschools/jan00/carvin.htm
Cattagni, A., & Farris, E. (2001). Internet access in U.S. public schools and classrooms: 1994–2000 (NCES No. 2001-071). Retrieved July 18, 2003, from http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2001071
Children’s Partnership. (2000). Online content for low-income and underserved Americans: The digital divide’s new frontier. Retrieved July 17, 2003, from http://www.childrenspartnership.org/pub/low_income/
Cooper, M. N. (2002, May 30). Does the digital divide still exist? Bush administration shrugs, but evidence says “yes.” Retrieved July 18, 2003, from http://www.consumerfed.org/DigitalDivideReport20020530.pdf
Digital Divide Network staff. (2003). Digital divide basics fact sheet. Retrieved July 18, 2003, from http://www.digitaldividenetwork.org/content/stories/index.cfm?key=168
eEurope. (1995–2002). An information society for all. Retrieved July 18, 2003, from http://europa.eu.int/information_society/eeurope/index_en.htm
European Union. (2002, May 4). e-Government and development: Bridging the gap. Retrieved July 18, 2003, from http://europa.eu.int/rapid/start/cgi/guesten.ksh?p_action.gettxt=gt&doc=SPEECH/02/157|0|RAPID&lg=EN&display=
Gorski, P. (2002, Fall). Dismantling the digital divide: A multicultural education framework. Multicultural Education, 10(1), 28–30.
Hoffman, D. L., & Novak, T. P. (1998, April). Bridging the racial divide on the Internet. Science, 280, 390–391.
Hoffman, D. L., Novak, T. P., & Schlosser, A. E. (2000). The evolution of the digital divide: How gaps in Internet access may impact electronic commerce. Journal of Computer Mediated Communication, 5(3), 1–57.
Jackson, L. A., Ervin, K. S., Gardner, P. D., & Schmitt, N. (2001a). The racial digital divide: Motivational, affective, and cognitive correlates of Internet use. Journal of Applied Social Psychology, 31(10), 2019–2046.
Jackson, L. A., Ervin, K. S., Gardner, P. D., & Schmitt, N. (2001b). Gender and the Internet: Women communicating and men searching. Sex Roles, 44(5–6), 363–380.
Jackson, L. A., von Eye, A., Biocca, F., Barbatsis, G., Fitzgerald, H. E., & Zhao, Y. (2003, May 20–24). The social impact of Internet use: Findings from the other side of the digital divide. Paper presented at the twelfth International World Wide Web Conference, Budapest, Hungary.
Lenhart, A. (2000). Who’s not online: 57% of those without Internet access say they do not plan to log on. Washington, DC: Pew Internet & American Life Project. Retrieved July 18, 2003, from http://www.pewinternet.org/reports/pdfs/Pew_Those_Not_Online_Report.pdf
Local Futures. (2001). Local futures research: On the move—mobile and wireless communications. Retrieved July 18, 2003, from http://www.localfutures.com/article.asp?aid=41
National Telecommunications and Information Administration, Economics and Statistics Administration. (n.d.). A nation online: How Americans are expanding their use of the Internet. Retrieved July 18, 2003, from http://www.ntia.doc.gov/ntiahome/dn/html/toc.htm
Ombwatch. (2002, August 18). Divided over digital gains and gaps. Retrieved July 18, 2003, from http://www.ombwatch.org/article/articleview/1052/
The relevance of ICT in development. (2002, May–June). The Courier ACP-EU, 192, 37–39. Retrieved July 17, 2003, from http://europa.eu.int/comm/development/body/publications/courier/courier192/en/en_037_ni.pdf
Spooner, T., & Rainie, L. (2000). African-Americans and the Internet. Washington, DC: Pew Internet & American Life Project. Retrieved July 18, 2003, from http://www.pewinternet.org/reports/pdfs/PIP_African_Americans_Report.pdf
UCLA Center for Communication Policy. (2000). The UCLA Internet report: Surveying the digital future. Retrieved July 18, 2003, from http://www.ccp.ucla.edu/UCLA-Internet-Report-2000.pdf
UCLA Center for Communication Policy. (2003). The UCLA Internet report: Surveying the digital future, year three. Retrieved July 18, 2003, from http://www.ccp.ucla.edu/pdf/UCLA-Internet-Report-Year-Three.pdf
U.S. Department of Commerce. (1995). Falling through the Net: A survey of the “have nots” in rural and urban America. Retrieved July 18, 2003, from http://www.ntia.doc.gov/ntiahome/fallingthru.html
U.S. Department of Commerce. (2000). Falling through the Net: Toward digital inclusion. Retrieved July 18, 2003, from http://search.ntia.doc.gov/pdf/fttn00.pdf
U.S. Department of Commerce. (2002). A nation online: How Americans are expanding their use of the Internet. Retrieved July 18, 2003, from http://www.ntia.doc.gov/ntiahome/dn/anationonline2.pdf
Weiser, E. B. (2002). The functions of Internet use and their social and psychological consequences. Cyberpsychology and Behavior, 4(2), 723–743.
DIGITAL GOVERNMENT Electronic government (e-government) is intimately connected to human-computer interaction (HCI). Critical HCI issues for e-government include technical and social challenges and interactions between the two. First, at a broad, societal level, the adaptation of
government and civic engagement to increasingly computerized environments raises political, organizational, and social questions concerning use, the appropriate contexts or environments for use, reciprocal adaptation mechanisms, learning and the design of government work, the design of political and civic communities of interest, and the design of nations themselves as well as international governance bodies. Second, HCI focuses on human characteristics and their relationship to computing. The significant human characteristics of importance to e-government include cognition, motivation, language, social interaction, and ergonomics or human factors issues. The usability and feasibility of e-government require a deep understanding by designers of individual, group, and societal cognition and behavior. On the technological side HCI is concerned with the outputs and processes of design and development of systems and interfaces. Third, HCI and e-government intersect in the design of computer systems and interface architectures. Design questions apply to input and output devices, interface architectures (including all types of dialogue interfaces for individuals and shared spaces for multiple users), computer graphics, maps, visualization tools, and the effects of these systems and interface architectures on the quality of interaction among individuals, groups, and government. Fourth, HCI examines the development process itself, ranging from how designers and programmers work to the evaluations of human-computer systems in terms of feasibility, usability, productivity and efficiency and, more recently, their likelihood to promote and sustain democratic processes. These issues may be described separately; however, e-government projects require attention to several of these issues simultaneously. For example, user-friendly and socially effective applications that cannot be implemented in a government setting for reasons of privacy, fairness, cost, or user resistance prove infeasible for e-government. Multiple constraints and demands therefore make this area challenging for governments. Electronic government is typically defined as the production and delivery of information and services inside government and between government and the
public using a range of information and communication technologies (ICTs). The public includes individuals, interest groups, and organizations, including nonprofit organizations, nongovernmental organizations, firms, and consortia. The definition of e-government used here also includes e-democracy, that is, civic engagement and public deliberation using digital technologies. Governments in industrialized and developing countries are experimenting with interactive systems to connect people with government information and officials. Many observers have claimed that interactive technologies will revolutionize governance. We must wait to see how and to what extent individuals and groups will use computing to affect civic engagement and how governments will use computing to influence political and civic spheres.
Development Paths of E-government Initial efforts by government agencies to develop e-government entailed simply digitizing and posting static government information and forms on the World Wide Web using the language, displays, and design of existing paper-based documents. Beginning during the 1990s and continuing into the present many government agencies have begun to adapt operations, work, and business processes and their interface with the public to simplify and integrate information and services in online environments. The federal governments of the United States, Canada, Finland, and Singapore are among those at the forefront of e-government in terms of the amount of information and interactivity available to the public and attention to system development and interface architecture. The country-level Web portal designed to help people navigate and search information for entire federal governments is one of the key types of e-government initiatives. The U.S. government Web portal (www.FirstGov.gov) is an interface with a search tool meant to serve as a single point of entry to U.S. government information and services. The federal government of Singapore developed a single Web portal, called “Singov” (www.gov.sg), to simplify access to government information for visitors, citizens, and businesses. Similarly, the Web portal for the government of
Canada (www.canada.gc.ca) was designed in terms of three main constituents: Canadians, non-Canadians, and Canadian business.
Organizational Redesign through Cross-Agency Integration During the 1990s several federal agencies and state governments created “virtual agencies”—online sources of information and services from several agencies organized in terms of client groups. For example, during the early 1990s the U.S. federal government developed the websites students.gov, seniors.gov, and business.gov to organize and display information using interfaces designed specifically for these populations with a single point of entry into a government portal focused on each population’s interests. By the end of the administration of President Bill Clinton approximately thirty cross-agency websites existed in the U.S. federal government. Beginning in 2001, the U.S. federal government continued this development path by undertaking more than twenty-five cross-agency e-government projects. The development path shifted from a loose confederation of interested designers in the government to an enterprise approach to e-government managed and controlled centrally and using lead agencies to control projects. The desire for internal efficiencies, as much as desire for service to the public, drives these projects. Several payroll systems are being consolidated into a few payroll systems for the entire government. Multiple and abstruse (difficult to comprehend) requirements for finding and applying for government grants are being streamlined into one federal online grants system called “e-grants.” Myriad rulemaking processes in agencies throughout the federal government, although not consolidated, have been captured and organized in the interface architecture of one Web portal, called “e-rulemaking.” The website recreation.gov uses an architecture that organizes recreation information from federal, state, and local governments. System design and interface architecture simplify search, navigation, and use of information concerning recreational activities, recreational areas, maps, trails, tourism sites, and weather reports by location.
Standardization, consolidation, and integration of information, operations, and interfaces with the public have been the key drivers for e-government in most federal government efforts. The ability to digitize visual government information is an additional development path for e-government. To note one example: The U.S. House Committee on Government Reform Subcommittee on Technology, Information Policy, Intergovernmental Relations, and the Census Web-casts its hearings and makes testimony before the committee searchable online. Previously, testimony was not searchable until a transcript of a hearing was produced—a process that could take up to six months. Considerable human, financial, and technical resources are required to design, develop, build, and maintain state-of-the-art e-government. For this reason, many local governments in poor economic areas or in cities and towns with small and medium populations lack resources to build interactive e-government unless resources are provided by federal and state governments. In the United States some of the most developed state-level e-government sites are in the states of Washington and Virginia. Municipal government websites vary dramatically in quality and level of development.
Interactivity and E-government Interactive e-government services include online tax payments, license applications and renewals, and grants applications and renewals. The city of Baltimore website (http://www.ci.baltimore.md.us/) has won awards for implementation of computing technology in government. The website allows citizens to pay parking fines, property taxes, and water and other bills. Users can search crime statistics by geographic area within the city and track several city services, including trash removal and street cleaning. The city of Baltimore has implemented an online version of the 311 service available in some other large U.S. cities, which allows citizens to request city information and services over the telephone. Citizens can report and then track the status of a request for city services, including removal of abandoned vehicles, repair of potholes, removal of graffiti, and requests for a change in traffic signs. These requests
not only provide interactivity but also promote government compliance and accountability to voters by making provision of city services more transparent to the public. Interactivity is increasing as governments continue to develop systems and as citizens adapt to government online. To note a few trends: In the United States the number of online federal tax filings increased from 20,000 in 1999 to 47 million, or about 36 percent of individual filings, in 2002. The Environmental Protection Agency reports that it saves approximately $5 million per year in printing and mailing costs by providing information digitally to the public. Public health agencies at all levels of government increasingly access centralized information online through the Centers for Disease Control and Prevention of the U.S. Public Health Service.
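The report-and-track pattern behind such 311-style services can be pictured with a short sketch. The class name, status values, and street address below are hypothetical illustrations, not details of Baltimore's actual system.

```python
from dataclasses import dataclass, field
from datetime import datetime
from itertools import count

_ids = count(1)

@dataclass
class ServiceRequest:
    """A hypothetical 311-style request, e.g. pothole repair or graffiti removal."""
    category: str
    location: str
    request_id: int = field(default_factory=lambda: next(_ids))
    history: list = field(default_factory=list)

    def update(self, status: str) -> None:
        # Each status change is time-stamped so a citizen can audit progress.
        self.history.append((datetime.now(), status))

    def current_status(self) -> str:
        return self.history[-1][1] if self.history else "submitted"

# Example: a citizen reports a pothole and later checks on it.
req = ServiceRequest(category="pothole repair", location="500 N. Charles St.")
req.update("assigned to crew")
req.update("repair completed")
print(req.request_id, req.current_status())
```

Transparency in this sketch comes simply from keeping the full, time-stamped history rather than only the latest status.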
Usability and E-government Usability studies in HCI examine the ease and efficiency with which users of a computer system can accomplish their goals as well as user satisfaction with a system. Usability in e-government is important because it is likely to affect public participation in ways that might result in unequal access or discrimination due to biases built into design and architecture. One area of usability concerns disabled people. Many governments around the world have passed laws to ensure usability to the disabled. Section 508 of the U.S. Rehabilitation Act (29 U.S.C. 794d), as amended by the Workforce Investment Act of 1998 (P.L. 105-220), 7 August 1998, mandates a set of requirements for U.S. federal government sites to assist disabled users. These requirements include standards for Web-based software and applications, operating systems, telecommunications products, personal computers, video, and multimedia products. Major federal services initiatives have been delayed and others upgraded to ensure compliance with Section 508 requirements. Disabilities increase as a population ages and chiefly include visual impairment and decreases in cognitive and motor skills important in an online environment. A research initiative, Toolset for Making Web Sites Accessible to Aging Adults in a Multi-
cultural Environment (http://www.cba.nau.edu/ facstaff/becker-a/Accessibility/main.html), focuses on the development of tools for government agencies to assess the usability of systems and sites for the elderly as well as standards of measurement for evaluating such sites. Developers will use evaluation tools to measure a site’s accessibility in terms of reading complexity and potential usability issues such as font size and font style, background images, and text justification. Transformational tools will convert a graphical image to one that can be seen by those users with color-deficiency disabilities. Developers are creating simulation tools to model many of the problems that elderly users experience, such as yellowing and darkening of images. Finally, compliance tools will be designed to modify webpages to comply with usability requirements for the elderly. Other U.S. researchers are working with the Social Security Administration, the Census Bureau, and the General Services Administration to better provide for their visually impaired users in a project entitled “Open a Door to Universal Access.” Project researchers are building and prototyping key technologies for disabled employees at the partner agencies. These technologies will later be transferred to the private sector for wider dissemination in work settings. Usability includes all elements of accessibility, including “look and feel,” readability, and navigability. For example, usability research focused on local government websites indicates that the reading level required to comprehend information on websites often exceeds that of the general population, raising concerns about accessibility, comprehension, interpretation, and associated potential for discrimination. Ongoing research regarding e-government and usability focuses primarily on development of tools for usability, including navigability and information representation in text, tabular, graphical, and other visual forms.
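Reading complexity, mentioned above as one accessibility measure, can be estimated automatically. The sketch below computes a rough Flesch-Kincaid-style grade level with a naive syllable count; it illustrates the idea only and is not the method used by the toolset described above.

```python
import re

def syllables(word: str) -> int:
    # Naive estimate: count groups of consecutive vowels; every word gets at least one.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade: 0.39*(words/sentence) + 11.8*(syllables/word) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syl = sum(syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syl / len(words) - 15.59

sample = ("Applicants must remit the supplementary documentation "
          "prior to adjudication of eligibility determinations.")
print(round(fk_grade(sample), 1))  # flags text far above a general-audience reading level
```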
Internet Voting One of the most important developments in e-government, with great significance for issues in HCI, is Internet voting. People have debated three main possibilities for Internet voting. First, computerized voting can be used at polling places in a “closed
system” within a secure computer local area network (LAN). Local votes would be recorded from individual voting consoles and tallied at local polling stations. Second, voting consoles or kiosks can be located in areas widely accessible to the general population, such as public libraries or shopping malls. Third, Internet voting might take place from remote locations, such as homes or offices. Many observers predicted that Internet voting would simplify voting processes and thereby increase voter participation. These predictions are far from reality at present. Current systems and architectures lack the security and reliability required for Internet voting of the third type. In addition to questions of feasibility, experts are uncertain of how Internet voting would affect participation and the cognitive, social, and political process of voting itself. A current research study, Human Factors Research on Voting Machines and Ballot Design (http://www.capc.umd.edu/rpts/MD_EVoteHuFac.html), focuses on the human-machine interface in voting. Given the prominence of issues surrounding traditional voting methods during the 2000 U.S. presidential election, researchers from the University of Maryland are developing a process to evaluate several automated voting methods and ballot designs. The study compares technologies such as optical scanning and direct-recording electronic equipment and evaluates the effect of various voting methods and ballot designs on the precision with which voters’ intentions are recorded and other critical variables.
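The first, "closed system" option amounts to recording and tallying ballots on machines that never leave a polling place's local network. The toy sketch below shows only the record-and-tally core; it deliberately omits the authentication, encryption, and auditing machinery whose absence is precisely why remote Internet voting is not yet considered feasible. All contest and candidate names are invented.

```python
from collections import Counter

class PollingStationTally:
    """Toy tally for a single polling place on a closed local network."""
    def __init__(self, contest: str, candidates: list[str]):
        self.contest = contest
        self.candidates = set(candidates)
        self.counts = Counter()

    def record_ballot(self, choice: str) -> bool:
        # Reject anything that is not a listed candidate (a stand-in for validation).
        if choice not in self.candidates:
            return False
        self.counts[choice] += 1
        return True

    def results(self) -> dict[str, int]:
        return {c: self.counts[c] for c in sorted(self.candidates)}

tally = PollingStationTally("mayor", ["Lopez", "Nguyen"])
for vote in ["Lopez", "Nguyen", "Lopez", "write-in?"]:
    tally.record_ballot(vote)
print(tally.results())  # {'Lopez': 2, 'Nguyen': 1}
```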
Representing Complex Government Information Government statistics are a powerful source of information for policymakers and the public. Large, democratic governments produce and distribute a vast quantity of statistical information in printed and electronic form. Yet, vital statistics continue to be stored in databases throughout governments and in forms that are not easily accessible, navigable, or usable by most citizens. A U.S. project called “Quality Graphics for Federal Statistics” (http://www.geovista .psu.edu/grants/dg-qg/intro.html) focuses on de-
velopment of graphical tools to simplify complex information. This project will develop and assess quality graphics for federal statistical summaries considering perceptual and cognitive factors in reading, interaction, and interpretation of statistical graphs, maps, and metadata (data about data). The project addresses four areas: conversion of tables to graphs, representation of metadata, interaction of graphs and maps, and communication of the spatial and temporal relationships among multiple variables. The project uses Web-based “middleware”—software which connects applications—to enable rapid development of graphics for usability testing. Another research project, Integration of Data and Interfaces to Enhance Human Understanding of Government Statistics: Toward the National Statistical Knowledge Network (http://ils.unc.edu/ govstat/), takes a different HCI approach. Members of the civically engaged public often struggle to access and combine the vast and increasing amount of statistical data—often in a variety of formats— available from government agency websites. Researchers working in cooperation with government agencies are developing standardized data formats and studying social processes to facilitate integration of search results. In addition, the project’s research team is developing a solutions architecture to accommodate users with a variety of communications and hardware needs and providing for broad-based usability requirements.
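The first of the four areas named above, conversion of tables to graphs, can be sketched in a few lines: a flat statistical table is regrouped into labeled series that charting middleware could render directly. The regions, years, and values below are invented for illustration.

```python
from collections import defaultdict

# A flattened statistical table: (region, year, value). Values are invented.
table = [
    ("Northeast", 2000, 41.2), ("Northeast", 2001, 44.0),
    ("South",     2000, 35.7), ("South",     2001, 38.9),
]

def table_to_series(rows):
    """Regroup rows into one labeled series per region, ordered by year."""
    series = defaultdict(list)
    for region, year, value in rows:
        series[region].append((year, value))
    return {region: sorted(points) for region, points in series.items()}

# The result maps directly onto most charting APIs (one line per region).
for region, points in table_to_series(table).items():
    print(region, points)
```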
Ways Forward The technological potential exists for individuals, groups, and communities to participate in and shape government in new ways. Some observers speculate that increased access to government online will lead to greater interest, knowledge, and discussion of politics. The Internet might allow citizens to organize and mobilize resources in powerful new ways. The Internet enables groups and communities to deliberate in new, possibly more effective ways. Some observers have also speculated that computing will lead to direct democracy, with individuals voting on a wide range of issues. Currently, little evidence shows that this potential is being realized. Those groups already civically engaged are using
computing to enhance their activities. The propensity to simplify and distort information in public discourse is not abated by changes in media. Unequal access to the Internet and to a wide range of computerized information and communication tools, divided roughly between people with more education and people with less (a divide highly correlated with income and political participation), creates a digital divide in e-government in spite of advances in HCI. Lack of literacy and lack of computer literacy worsen the digital divide in access. Disparities among rich and poor nations parallel digital-divide challenges within countries. Yet, innovations in several developing countries and in rural areas invite some degree of optimism. Rural farmers and craftspeople are beginning to connect through the Internet to enhance their economic well-being. Rural communities in China are using the Internet, as yet on a modest scale, to decry local corruption and in some cases have forced the central government to intervene in local affairs. Interfaces for preliterate populations are being developed. Human-computer interaction begins with the study of the mutual adaptation of social and technical systems. We cannot predict the path or the outcome of the many and varied complex adaptation processes now in play. One of the chief sources of learning for designers of e-government has been to focus on tools for building and sustaining democracy rather than to focus merely on efficiency. While researchers learn more about human cognition, social interaction, and motivation in computer-mediated environments and while designers develop new tools and interfaces to encompass a wider range of activities and discourse in online environments, large-scale adaptation continues between societies, governments, and technology.
Jane E. Fountain and Robin A. McKinnon
See also Online Voting; Political Science and HCI
FURTHER READING Abramson, M. A., & Means, G. E. (Eds.). (2001). E-government 2001 (ISM Center for the Business of Government). Lanham, MD: Rowman & Littlefield.
Alvarez, R. M. (2002). Ballot design options, California Institute of Technology. Retrieved February 17, 2004, from http://www.capc.umd.edu/rpts/MD_EVote_Alvarez.pdf Ceaparu, I. (2003). Finding governmental statistical data on the Web: A case study of FedStats. IT & Society, 1(3), 1–17. Retrieved February 17, 2004, from http://www.stanford.edu/group/siqss/itandsociety/v01i03/v01i03a01.pdf Conrad, F. G. (n.d.). Usability and voting technology: Bureau of Labor Statistics. Retrieved February 17, 2004, from http://www.capc.umd.edu/rpts/MD_EVote_Conrad.pdf Davis, R. (1999). The web of politics: The Internet's impact on the American political system. New York: Oxford University Press. Dutton, W. H. (1999). Society on the line: Information politics in the digital age. Oxford, UK: Oxford University Press. Dutton, W. H., & Peltu, M. (1996). Information and communication technologies—Visions and realities. Oxford, UK: Oxford University Press. Echt, K. V. (2002). Designing Web-based health information for older adults: Visual considerations and design directives. In R. W. Morrell (Ed.), Older adults, health information, and the World Wide Web (pp. 61–88). Mahwah, NJ: Lawrence Erlbaum Associates. Fountain, J. E. (2001). Building the virtual state: Information technology and institutional change. Washington, DC: Brookings Institution Press. Fountain, J. E. (2002). Information, institutions and governance: Advancing a basic social science research program for digital government. Cambridge, MA: National Center for Digital Government, John F. Kennedy School of Government. Fountain, J. E., & Osorio-Urzua, C. (2001). The economic impact of the Internet on the government sector. In R. E. Litan & A. M. Rivlin (Eds.), The economic payoff from the Internet revolution (pp. 235–268). Washington, DC: Brookings Institution Press. Harrison, T. M., & Zappen, J. P. (2003). Methodological and theoretical frameworks for the design of community information systems. Journal of Computer-Mediated Communication, 8(3). Retrieved February 17, 2004, from http://www.ascusc.org/jcmc/vol8/issue3/harrison.html Harrison, T. M., Zappen, J. P., & Prell, C. (2002). Transforming new communication technologies into community media. In N. W. Jankowski & O. Prehn (Eds.), Community media in the information age: Perspectives and prospects (pp. 249–269). Cresskill, NJ: Hampton Press Communication Series. Hayward, T. (1995). Info-rich, info-poor: Access and exchange in the global information society. London: K. G. Saur. Heeks, R. (Ed.). (1999). Reinventing government in the information age: International practice in IT-enabled public sector reform. London and New York: Routledge. Hill, K. A., & Hughes, J. E. (1998). Cyberpolitics: Citizen activism in the age of the Internet. Lanham, MD: Rowman & Littlefield. Holt, B. J., & Morrell, R. W. (2002). Guidelines for website design for older adults: The ultimate influence of cognitive factors. In R. W. Morrell (Ed.), Older adults, health information, and the World Wide Web (pp. 109–129). Mahwah, NJ: Lawrence Erlbaum Associates. Internet Policy Institute. (2001). Report of the National Workshop on Internet Voting: Issues and research agenda. Retrieved February 17, 2004, from http://www.netvoting.org Kamarck, E. C., & Nye, J. S., Jr. (2001). Governance.com: Democracy in the information age. Washington, DC: Brookings Institution Press.
Margolis, M., & Resnick, D. (2000). Politics as usual: The cyberspace “revolution.” Thousand Oaks, CA: Sage. Nass, C. (1996). The media equation: How people treat computers, televisions, and new media like real people and places. New York: Cambridge University Press. Norris, P. (2001). Digital divide: Civic engagement, information poverty, and the Internet worldwide. Cambridge, UK: Cambridge University Press. O’Looney, J. A. (2002). Wiring governments: Challenges and possibilities for public managers. Westport, CT: Quorum Books. Putnam, R. (2000). Bowling alone: The collapse and revival of American community. New York: Simon & Schuster. Rash, W. (1997). Politics on the nets: Wiring the political process. New York: Freeman. Schwartz, E. (1996). Netactivism: How citizens use the Internet. Sebastopol, CA: Songline Studios. Wilhelm, A. G. (2000). Democracy in the digital age: Challenges to political life in cyberspace. New York: Routledge.
DIGITAL LIBRARIES For centuries the concept of a global repository of knowledge has fascinated scholars and visionaries alike. Yet, from the French encyclopedist Denis Diderot's L'Encyclopedie to the British writer H. G. Wells's book World Brain to Vannevar Bush's (director of the U.S. Office of Scientific Research and Development) Memex (a desktop system for storing and retrieving information) to Ted Nelson's Project Xanadu (a vision of an information retrieval system based on hyperlinks among digital content containers), the dream of such an organized and accessible collection of the totality of human knowledge has been elusive. However, recent technological advances and their rapid deployment have brought the far-reaching dream into renewed focus. The technologies associated with computing, networking, and presentation have evolved and converged to facilitate the creation, capture, storage, access, retrieval, and distribution of vast quantities of data, information, and knowledge in multiple formats. During the late 1980s and early 1990s the term digital libraries emerged to denote a field of interest to researchers, developers, and practitioners. The term encompasses specific areas of development such as electronic publishing, online databases, information retrieval, and data mining—the process of information extraction with the goal of discovering
hidden facts or patterns within databases. The term digital libraries has been defined in many ways. For example: ■
“The Digital Library is the collection of services and the collection of information objects that support users in dealing with information objects available directly or indirectly via electronic/ digital means” (Fox and Urs 2002, 515). ■ “Digital libraries are organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily available for use by a defined community or set of communities”(Fox and Urs 2002, 515). ■ “A collection of information which is both digitized and organized” (Lesk 1997, 1). ■ “Digital libraries are a set of electronic resources and associated technical capabilities for creating, searching, and using information . . . they are an extension and enhancement of information storage and retrieval systems that manipulate digital data in any medium (text, images, sounds, static or dynamic images) and exist in distributed networks” (Borgman et al. 1996, online). Clifford Lynch, a specialist in networked information, made a clear distinction between content that is born digital and content that is converted into digital format and for which an analogue counterpart may or may not continue to exist. The definitions of digital libraries have considerable commonality in that they all incorporate notions of purposefully developed collections of digital information, services to help the user identify and access content within the collections, and a supporting technical infrastructure that aims to organize the collection contents as well as enable access and retrieval of digital objects from within the collections. Yet, the term digital libraries may have constrained development in that people have tended to restrict their view of digital libraries to a digital version of more traditional libraries. They have tended to focus on textual content rather than on the full spectrum of content types—data, audio, visual images, simulations, etc. Much of the development to date has
focused on building collections and tools for organizing and extracting knowledge from them. Experts only recently have acknowledged the role of the creators and users of knowledge and the contexts in which they create and use. Dagobert Soergel, a specialist in the organization of information, characterizes much of the digital library activity to date as “still at the stage of ‘horseless carriage’; to fulfill its fullest potential, the activity needs to move on to the modern automobile” (Soergel 2002, 1).
Key Concepts Whereas researchers have expended considerable effort in developing digital libraries, theoretical
Vannevar Bush on the “Memex”
Scientist Vannevar Bush’s highly influential essay “As We May Think” (1945) introduced the idea of a device he called the “memex”—inspiring others to develop digital technologies that would find and store a vast amount of information. The owner of the memex, let us say, is interested in the origin and properties of the bow and arrow. Specifically he is studying why the short Turkish bow was apparently superior to the English long bow in the skirmishes of the Crusades. He has dozens of possibly pertinent books and articles in his memex. First he runs through an encyclopedia, finds an interesting but sketchy article, leaves it projected. Next, in a history, he finds another pertinent item, and ties the two together. Thus he goes, building a trail of many items. Occasionally he inserts a comment of his own, either linking it into the main trail or joining it by a side trail to a particular item. When it becomes evident that the elastic properties of available materials had a great deal to do with the bow, he branches off on a side trail which takes him through textbooks on elasticity and tables of physical constants. He inserts a page of longhand analysis of his own. Thus he builds a trail of his interest through the maze of materials available to him. Source: Bush, V. (1945, July). As we may think. The Atlantic Monthly, 176(1). Retrieved March 25, 2004, from http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm
development has been elusive. Because of the multidisciplinary roots of the field, the different perspectives, and the lack of consensus on definition, we can have difficulty understanding the basic constructs of digital libraries. At its simplest interpretation, the term digital libraries brings together the notions of digital computing, networking, and content with those of library collections, services, and community. Researchers are giving attention to the 5S framework developed by Edward A. Fox, director of Digital Libraries Laboratory at Virginia Tech, Marcos André Gonçalves of Digital Libraries Research, and Neill A. Kipp of Software Architecture. This framework defines streams, structures, spaces, scenarios, and societies to relate and unify the concepts of documents, metadata (descriptions of data or other forms of information content), services, interfaces, and information warehouses that are used to define and explain digital libraries (a simple illustrative sketch of these constructs follows the list below):
■ Streams: sequences of information-carrying elements of all types—can carry static content and dynamic content
■ Structures: specifications of how parts of a whole are arranged or organized, for example, hypertext, taxonomies (systems of classification), user relationships, data flow, work flow, and so forth
■ Spaces: sets of objects and operations performed on those objects, for example, measure, probability, and vector spaces (a form of mathematical representation of sets of vectors) used for indexing, visualizations, and so forth
■ Scenarios: events or actions that deliver a functional requirement, for example, the services that are offered—data mining, information retrieval, summarization, question answering, reference and referral, and so forth
■ Societies: understanding of the entities and their interrelationships, individual users, and user communities
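As a way of making the five constructs concrete, the sketch below represents each as a minimal data structure. The class and field names are illustrative only; they are not the formal definitions given by Fox, Gonçalves, and Kipp.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Stream:            # a sequence of information-carrying elements
    elements: list

@dataclass
class Structure:         # how parts of a whole are organized (e.g., a taxonomy)
    relations: dict      # maps a part to its related parts

@dataclass
class Space:             # objects plus operations defined over them
    objects: list
    distance: Callable   # e.g., a vector-space similarity measure used for indexing

@dataclass
class Scenario:          # an event or action that delivers a service
    name: str
    steps: list

@dataclass
class Society:           # users, communities, and their interrelationships
    members: list
    relationships: dict = field(default_factory=dict)

# In this sketch, a digital library is simply the five tied together.
@dataclass
class DigitalLibrary:
    streams: list
    structures: list
    spaces: list
    scenarios: list
    societies: list
```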
Digital Libraries Today A report from the President’s Information Technology Advisory Committee (PITAC) in 2001 acknowledges the need for much more work to be accomplished before we can think of digital libraries as fully successful in the United States. The report
identifies deficiencies in digital content availability and accessibility: Less than 10 percent of publicly available information is available in digital form, and less than 1 percent of the digital content is indexed, and therefore identifiable, via Web search engines. Thus, the “visible Web” is still small relative to the total potential Web. The report goes on to acknowledge the need to create digital library collections at a faster rate and much larger scale than are currently available. The report also identifies the need for improved metadata standards and mechanisms for identifying and providing access to digital library content and the need to advance the state of the art in user interfaces so that digital library users with different needs and circumstances can use interfaces better suited to their contexts. The PITAC report acknowledges that much of the progress to date in digital libraries has resulted from the federal government’s investments through multiagency digital-library research and development initiatives and through provision of access to libraries of medical and scientific data. In 1993 the National Science Foundation (NSF) funded Mosaic, the first Web browser to run on multiple platforms, thereby encouraging widescale access to digital content via the Internet and the Web. In 1994 the Digital Libraries Initiative (DLI)—involving NSF, Defense Advanced Research Projects Agency (DARPA), and National Aeronautics and Space Administration (NASA)—funded six university-led consortia to conduct research and development to make large distributed digital collections accessible and interoperable. In 1998 the program was expanded to include the National Institutes of Health/National Library of Medicine (NIH/NLM), the Library of Congress, National Endowment for the Humanities (NEH), Federal Bureau of Investigation (FBI), National Archives and Records Administration (NARA), the Smithsonian Institution, and the Institute for Museum and Library Services. Other federal agencies compiled some of the largest publicly accessible databases, such as Earthobserving satellite data, weather data, climate data, and so forth. Most recently, new forms of digital data library collections have been initiated, including digital libraries of molecules, cells, genomes, proteins, and so forth. The PITAC report calls for the federal
government to play a more aggressive and proactive role in provision of digital content to all and to use digital library technologies and content to transform the way it services its citizens. Another key area identified by the PITAC report is the opportunities and challenges of digital libraries and their long-term preservation. Experts see a slow and steady leakage of digital content from the Web as content is updated, archived, or removed. They also see a need for both standards for digital preservation and archival processes for periodic transfer/transformation to new formats, media, and technologies. Finally, the PITAC report says the issue of intellectual property rights needs to be addressed for digital libraries to achieve their full potential. In particular, clarification was sought by the PITAC Committee on access to information subject to copyright, the treatment of digital content of unknown provenance or ownership, policies about federally funded digital content, and the role of the private sector.
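The metadata and preservation needs the report identifies can be illustrated with a small sketch: a descriptive record for a digital object that also logs every migration to a new format, so provenance survives the periodic transfer and transformation the report calls for. The field names are loosely modeled on common descriptive-metadata practice and are not any particular standard.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class DigitalObject:
    identifier: str
    title: str
    creator: str
    media_type: str                                    # current storage format
    migrations: list = field(default_factory=list)     # preservation history

    def migrate(self, new_format: str, note: str) -> None:
        # Record each transfer to a new format so provenance is never lost.
        self.migrations.append((date.today(), self.media_type, new_format, note))
        self.media_type = new_format

obj = DigitalObject("doc-0001", "Annual climate summary", "Example Agency", "TIFF")
obj.migrate("PDF/A", "batch conversion for long-term access")
print(obj.media_type, len(obj.migrations))
```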
The Significance for HCI The first decade of digital library research and development provided ample evidence that our ability to generate and collect digital content far exceeds our ability to organize, manage, and effectively use it. We need not look further than our own experiences with the growth of digital content and services on the Web. Although the Web may be perceived by the majority of the using public as a “vast library,” it is not a library in several important aspects. Experts acknowledge the importance of understanding how people interact with digital libraries, how their needs relate to new types of information available, and the functionality that is needed by these new types of information. Numerous experts have called for more “user-centric” approaches to the design and operation of digital libraries. However, these same calls tend to still see the user involved only in reaction to the development of certain collections. Thus, “user-centric” seems to mean “user involvement” rather than “placement of the user and potential user at the center of digital library activity.” For a truly user-centric approach to emerge, we must start by understanding user need and
context. This understanding includes recognizing users both as individuals and as part of their social context. Who the users are, what they are trying to do, and how they interact with others are all meaningful areas of discovery for future digital library development. Social groups—be they families, work groups, communities of practice, learning communities, geographic communities, and so on— grow up with, create, structure, accept, and use information and knowledge. Digital library content, tools, and services are needed to support these groups and the individuals whom they encompass. Issues of trust, reputation, belief, consistency, and uncertainty of information will continue to prevail,
especially in digital distributed environments in which people question assumptions about identity and provenance. Similarly, economic, business, and market frameworks will complicate digital library use and development. For people interested in human-computer interaction, digital libraries offer a complex, widespread environment for the research, development, and evaluation of technologies, tools, and services aimed at improving the connection of people to digital environments and to each other through those environments. Effective interaction between human and computer is essential for successful digital libraries.
Research Digital-library research has made significant progress in demonstrating our ability to produce digital versions of traditional library collections and services. However, what began as an effort to create “digital” libraries has been transformed into something much more dynamic than was originally envisioned. The idea of curated, network-accessible repositories was (and remains) a fundamental need of scholarly inquiry and communication, as was the idea that these repositories should support information in multiple formats, representations, and media. However, not until people made serious efforts to build such resources, particularly for nontextual digital content (audio, image, and video, for example), did they realize that this venture would stretch the limits of existing disciplinary boundaries and require involvement of new interdisciplinary collaborations. The NSF recently sponsored a workshop of digital-library scholars and researchers to frame the longterm research needed to realize a new scholarly inquiry and communication infrastructure that is ubiquitous in scope and intuitive and transparent in operation. Five major research directions were recommended. The first direction is expansion of the range of digital content beyond traditional text and multimedia to encompass all types of recorded knowledge and artifacts (data, software, models, fossils, buildings, sculptures, etc.). This content expansion requires improved tools for identification, linkage, manipulation, and visualization. The second research direction is the use of context for retrieving information. Such context has two dimensions: the relationships among digital information objects and the relationship between these objects and users’ needs. In particular, because of continually accumulating volumes of digital content, such context needs to be automatically and dynamically generated to the extent possible. However, automated tools could be supplemented with contextual additions from the using community (akin to reader input to the Amazon.com website, for example). Involvement of the using community in building the knowledge environment will also build a sense of ownership and stewardship relative to the particular content and services of interest. The third research direction is the integration of information spaces into
everyday life. Such integration requires customized and customizable user interfaces that encompass dynamic user models (with knowledge of the history, needs, preferences, and foibles of the users and their individual and social roles). The fourth direction is the reduction of data to actionable information. This reduction requires developing capabilities to reduce human effort and provide focused, relevant, and useful information to the user; to do this again requires an in-depth understanding of the users and their individual and social contexts. The fifth research direction is to improve accessibility and productivity through developments in information retrieval, image processing, artificial intelligence, and data mining.
Realizing the Potential The PITAC report offers a vision for digital libraries (universally accessible collections of human knowledge): All citizens anywhere anytime can use an Internetconnected digital device to search all of human knowledge. Via the Internet, they can access knowledge in digital collections created by traditional libraries, museums, archives, universities, government agencies, specialized organizations, and even individuals around the world. These new libraries offer digital versions of traditional library, museum, and archive holdings, including text, documents, video, sound and images. But they also provide powerful new technological capabilities that enable users to refine their inquiries, analyze the results, and change the form of the information to interact with it, such as by turning statistical data into a graph and comparing it with other graphs, creating animated maps of wind currents over time, or exploring the shapes of molecules. Very-high-speed networks enable groups of digital library users to work collaboratively, communicate with each other about their findings, and use simulation environments, remote scientific instruments, and streaming audio and video. No matter where digital information resides physically, sophisticated search software can find it and present it to the user. In this vision, no classroom, group or person is ever isolated from the world’s greatest knowledge resources. (PITAC 2001, 1)
Clearly underlying this vision is the notion of engaged communities of both information providers and information users. Sometimes called “knowledge
communities,” these communities, defined by a shared interest in knowing or wanting to know about a subject area, are in constant flux. Understanding the dynamics of knowledge communities, why, when, and how they form or cease to function, will be important to the realization of the PITAC vision. Similarly, researchers need to acknowledge the social construction of knowledge and the roles of various members in the communities over time. Traditional publishers, libraries, museums, archives, and other information collection and distribution entities that are bound by the physicality of their collections and audiences can clearly be represented in virtual environments. However, the real power of the emerging technologies is their unleashing of human creativity, connection, and collaboration in their creation, discovery, and sharing of new knowledge. Developing technologies that are more human-centric in their design and function is a critical element in achieving this future. Perhaps the greatest potential change that may result from digital libraries of the future will be in the institutional framework. When collection content no longer needs to be physically colocated, when service providers no longer need to be physically close to their intended user communities, and when the roles of provider and user blend, people will question the continued need for physical institutions and information-professional roles. Such a future may well see librarians, museum professionals, and others working within knowledge communities, not just as providers to those communities. As digital libraries and their contents are dispersed across the Internet, and as permanent availability and access to those contents are assured, the need for individual institutions to “own” and house collections and service access points (the means by which individuals can request and receive service, i.e. an online catalog, a physical library, a reference desk, or an online help desk) will diminish. For institutions whose reputations have grown with the growth and maintenance of their scholarly library collections, how will this future play out? Although the opportunities are significant and the technological developments astounding, the abilities of institutions to change at a similar pace are not clear. Issues of trust and control are likely to
constrain the kinds of institutional developments that we envision.
José-Marie Griffiths
See also Information Organization; Information Retrieval
FURTHER READING Atkins, D. (1999). Visions for digital libraries. In P. Schauble & A. F. Smeaton (Eds.), Summary report of the series of joint NSF-EU working groups on future directions for digital libraries research (pp. 11–14). Washington, DC: National Science Foundation. Bishop, A. P., & Starr, S. L. (1996). Social informatics of digital library use and infrastructure. Annual Review of Information Science and Technology (ARIST), 31, 301–401. Borgman, C. L., Bates, M. J., Cloonan, M. V., Efthimiadis, E. N., Gilliland-Swetland, A., Kafai, Y., Leazer, G. H., & Maddox, A. B. (1996). Social aspects of digital libraries: Final report to the National Science Foundation. Los Angeles: Graduate School of Library & Information Studies, UCLA. Retrieved January 26, 2004, from http://dlis.gseis.ucla.edu/DL/UCLA_DL_Report.html Bush, V. (1945). As we may think. In J. Nyce & P. Kahn (Eds.), From Memex to hypertext: Vannevar Bush and the mind’s machine (pp. 85–110). San Diego, CA: Academic Press. Diderot, D., & le Rond D’ Alembert, J. (Eds.). (1758–1776). Encyclopedie ou dictionnaire raisonne des sciences, des arts et des métiers, par une societe de gens de letteres (Encyclopedia or rational dictionary of sciences, arts, and the professions, by a society of people of letters) (2nd ed). Luca, Italy: André Le Breton. Fox, E. A., Gonçalves, M. A., & Kipp, N. A. (2002). Digital libraries. In H. Adelsberger, B. Collis, & J. Pawlowski (Eds.), Handbook on information systems (pp. 623–641). Berlin: Springer-Verlag. Fox, E. A., & Urs, S. R. (2002). Digital libraries. Annual Review of Information and Science and Technology (ARIST), 46, 503–589. Griffiths, J.-M. (1998). Why the Web is not a library. In B. Hawkins & P. Battin (Eds.), The mirage of continuity: Reconfiguring academic information resources for the twenty-first century (pp. 229–246). Washington, DC: Council on Library and Information Resources, Association of American Universities. Lesk, M. (1997). Practical digital libraries: Books, bytes and bucks. San Francisco: Morgan Kaufmann. Lynch, C. A. (2002). Digital collections, digital libraries, and the digitization of cultural heritage information. First Monday, 7(5). National Science Foundation. (2003, June). Report of the NSF workshop on digital library research directions. Chatham, MA: Wave of the Future: NSF Post Digital Library Futures Workshop. Nelson, T. H. (1974). Dream machines: New freedoms through computer screens—A minority report (p. 144). Chicago: Nelson. President’s Information Technology Advisory Committee, Panel on Digital Libraries. (2001). Digital libraries: Universal access to human knowledge, report to the president. Arlington, VA: National Coordination Office for Information Technology Research and Development.
Soergel, D. (2002). A framework for digital library research: Broadening the vision. D-Lib Magazine, 8(12). Retrieved January 26, 2004 from http://www.dlib.org/dlib/december02/soergel/12soergel.html Waters, D. J. (1998). The Digital Library Federation: Program agenda. Washington, DC: Digital Libraries, Council of Library and Information Resources. Wells, H. G. (1938). World brain. Garden City, NY: Doubleday, Doran.
DRAWING AND DESIGN Ever since the Sketchpad system of computer graphics pioneer Ivan Sutherland, designers have dreamed of using drawing to interact with intelligent systems. Built in the early 1960s, Sketchpad anticipated modern interactive graphics: The designer employed a light pen to make and edit a drawing and defined its behavior by applying geometric constraints such as parallel, perpendicular, and tangent lines. However, the widespread adoption of the windows-mouse interface paradigm on personal computers in the 1980s relegated pen-based interaction to a specialized domain, and for many years little research was done on computational support for freehand drawing. The re-emergence of stylus input and flat display output hardware in the 1990s renewed interest in pen-based interfaces. Commercial software has mostly focused on text interaction (employing either a stylized alphabet or full-fledged handwriting recognition), but human-computer interfaces for computer-aided design must also support sketching, drawing, and diagramming. Computer-aided design (CAD) is widely used in every design discipline. CAD software supports making and editing drawings and three-dimensional computer graphics models, and in most design firms, computer-aided design applications have replaced the old-fashioned drawing boards and parallel rules. Digital representations make it easier for a design team to share and edit drawings and to generate computer graphics renderings and animated views of a design. The predominant use of computers in design is simply to make and edit drawings and models, leaving it to human designers to view, evaluate, and make design decisions. However, computational design assistants are being increasingly brought in to help not only with creating drawings and mod-
els, but also with design evaluation and decision making. Despite the almost universal adoption of computer-aided design software, it is typically used in the later—design development—phases of a design process, after many of the basic design decisions have already been made. One reason for this, and a primary motivation for supporting sketching, diagramming, and drawing interfaces in computer-aided design, is that during the conceptual phases many designers prefer to work with pencil and paper. The history of computers and human-computer interaction shows a strong tendency to favor a problem-solving approach, and computer languages have quite properly focused on requiring programmers to state problems precisely and definitely. This has, in turn, colored a great deal of our software, including computer-aided design, which demands of its users that they be able to precisely articulate what they are doing at all times. Yet designing in particular, and drawings more generally, seem at least sometimes ill-suited to this historical paradigm. Although the goal of designing is to arrive at definite design decisions that make it possible to construct an artifact, during the process designers are often quite willing to entertain (or tolerate) a great deal of uncertainty. This makes building human-computer interfaces for computer-aided design an interesting challenge, and one that may ultimately demand new forms of computational representations. The development of freehand interfaces for computer-aided design will certainly depend on technical advances in pen-based interaction. However, successful drawing-based interfaces for design will ultimately also be informed by research on design processes (how designing works and how people do design) as well as by the efforts of cognitive psychologists to understand the role of drawing and visual representations in thinking. Along with the development of freehand-drawing software systems, research on design and visual cognition has recently enjoyed a resurgence of interest. In addition to human-computer interaction, relevant work is being done in design research, artificial intelligence, and cognitive science. An increasing number of conferences, workshops, and journals are publishing work in this growing research area.
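A toy version of the constraint idea that Sketchpad pioneered is sketched below: given a reference segment, a second segment that shares its endpoint is rotated so the two meet at a right angle. Real constraint solvers satisfy whole networks of such relations simultaneously; this example, with invented coordinates, enforces just one.

```python
import math

def make_perpendicular(a, b, seg):
    """Rotate segment `seg` (which starts at point b) so it is perpendicular to a->b.

    a and b are (x, y) endpoints of the reference segment; seg is ((x, y), (x, y)).
    Returns the adjusted segment, preserving its length.
    """
    (sx, sy), (ex, ey) = seg
    length = math.hypot(ex - sx, ey - sy)
    # Direction of the reference segment, rotated 90 degrees and normalized.
    dx, dy = b[0] - a[0], b[1] - a[1]
    norm = math.hypot(dx, dy)
    px, py = -dy / norm, dx / norm
    return ((sx, sy), (sx + px * length, sy + py * length))

print(make_perpendicular((0, 0), (4, 0), ((4, 0), (7, 1))))
# ((4, 0), (4.0, 3.162...)): the second segment now meets the first at a right angle
```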
Drawing as an Interface to Everything People sketch, diagram, and draw to compose, consider, and communicate ideas. Information about the ideas is coded in the lines and other drawing marks and the spatial relationships among them. Some people may think of drawings as belonging to the realm of aesthetics, or as ancillary representations to “real” thinking carried out in words or mathematics, but many disciplines—from logic to sports, from physics to music, from mathematics and biology to design—employ drawings, sketches, and diagrams to represent ideas and reason about them. Many scientific and engineering disciplines use well-defined formal diagram languages such as molecular structure diagrams, analog and digital circuit diagrams, or Unified Modeling Language (UML) diagrams in human-computer interactions. Drawing is seldom the sole vehicle for communicating information, but in many domains it is either a primary representation or an important auxiliary one, as a look at whiteboards in any school or company will confirm. Drawing plays a special role in design. In physical domains such as mechanics, structural engineering, and architecture, and in graphic (and graphical user interface) designs, a design drawing correlates directly with the ultimate artifact: The drawing’s geometry corresponds directly to the geometry of the artifact being designed. For example, a circle represents a wheel. In other disciplines, a diagram may correlate symbolically, for example a supply-demand curve in economics. Even in domains where a graphic representation only abstractly represents the artifact being designed, drawing supports the supposing, proposing, and disposing process of design decision making. For these reasons, drawing can be an interaction modality to a wide range of computational processes and applications—drawing as an interface to everything.
What’s in a Drawing? Drawings range from conceptual diagrams to rough sketches to precisely detailed drawings. The purposes of these representations differ, although designers may employ them all in the course of designing: Beginning with a conceptual diagram of an idea, they
develop it through a series of sketches, ultimately producing a precise and detailed drawing. Both diagram and sketch are typically preliminary representations used in early design thinking to capture the essence of an idea or to rapidly explore a range of possibilities. A diagram employs shapes and spatial relations to convey essentials concisely. A sketch, however, is often more suggestive than definitive, and it may convey details while simultaneously avoiding specificity. A schematic drawing involves more detail and complexity than a diagram and is usually intended as a more precise and definitive representation. Features of a drawing that are potentially relevant include shape and geometry, topology, curvature and points of inflection of lines, absolute and relative dimensions, positions of drawing marks and spatial relationships among them, line weights, thickness and color, speed and sequence of execution, and relationships with nearby text labels. In any particular drawing only some of these features may be relevant. For example, a diagram of digital logic is largely indifferent to geometry but the drawing topology (connections among the components) is essential to its meaning. On the other hand, in a schematic drawing of a mechanism or a sketch map, geometry and scale are essential. In addition to the information that a designer decides deliberately to communicate, a drawing also conveys information about the designer’s intent, that is, metainformation about the designing process. For example, the speed with which a sketch is executed, the extent to which a designer overtraces drawing marks, the pressure of the pen, or the darkness of the ink all offer important information. Intense overtracing in one area of the drawing may indicate that the designer is especially concerned with that part of the design, or it may reveal that the drawing represents several alternative design decisions. A quickly made sketch may reflect broad, high-level thinking, whereas a slow drawing may reveal a high degree of design deliberation. During design brainstorming it is common to find several sketches and diagrams on the same sheet of paper or whiteboard; they may be refinements of a single design, alternative designs, designs for different components of the artifact, or even representations of entirely unrelated ideas.
Input Issues Two different approaches—ink-based and stroke-based—to building pen-based interaction systems are currently being followed, and each has certain advantages. An ink-based system registers the drawing marks the user makes in an array of pixels captured by a video camera or scanner, which serves as input for an image-processing system to parse and interpret. A stroke-based system records the motion of the user’s pen, usually as a sequence of x,y (and sometimes pressure and tilt) coordinates. To an ink-based system any drawing previously made on paper can serve as scanned input, whereas one that is stroke-based must capture input as it is produced. This makes dynamic drawing information such as velocity, pen pressure, and timing available to stroke-based systems. Many stroke-based systems, for example, use timing information to segment drawing input into distinct drawing elements, or glyphs (a simple illustration appears at the end of this section). Designers traditionally distinguish between freehand and hard-line drawings. Freehand drawings are typically made with only a stylus, whereas hard-line drawings are made using a structured interface, previously a triangle and parallel rule, today the menus and tool palettes of a conventional computer-aided design program. The structured interface has certain advantages: In selecting drawing elements from a tool palette the designer also identifies them, eliminating the need for the low-level recognition of drawing marks that freehand drawing systems typically require. While this helps the computer program to manage its representation of the design, many designers feel that the structured interface imposes an unacceptable cognitive load and requires a greater degree of commitment and precision than is appropriate, especially during the early phases of designing. Designers also complain that menus and tool systems get in the way of their design flow. A freehand drawing conveys more subtle nuances of line and shape than a hard-line drawing. Freehand drawings are often less formal and precise and more ambiguous than hard-line representations, all arguably advantageous characteristics in the early phases of design thinking. Some computer-based drawing systems automatically replace hand-drawn sketchy shapes and lines with “beautified” ones. Other systems retain the
user’s original drawing, even if the system has recognized sketched components and could replace them with precise visual representations. Many designers consider the imprecise, rough, and suggestive nature of a sketch or diagram to be of great value and therefore prefer a hand-drawn sketch to a refined, geometrically precise beautified drawing. On the other hand, some users strongly prefer to work with perfectly straight lines and exact right angles rather than crude-looking sketches. This depends at least in part on the user’s own experience with drawing: Novices are more likely to feel uncomfortable with their sketching ability and prefer to work with beautified drawings, whereas seasoned designers tend to see the nuances of their hand-drawn sketches as positive characteristics. Whether beautification is considered helpful or harmful also depends in part on the drawing’s intended purpose.
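The stroke-based input model described above lends itself to a small worked example. The sketch below illustrates one plausible version of the timing heuristic mentioned earlier, starting a new glyph whenever the pen pauses; the pause threshold and the input format are assumptions made for illustration, not the method of any particular system.

```python
# Illustrative sketch of segmenting pen samples into glyphs by timing gaps.
# The 0.3-second threshold and the (x, y, time) tuple format are assumptions.
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (x, y, time in seconds)

def segment_by_timing(samples: List[Sample], pause: float = 0.3) -> List[List[Sample]]:
    """Start a new glyph whenever the pen pauses longer than `pause` seconds."""
    glyphs: List[List[Sample]] = []
    current: List[Sample] = []
    last_time = None
    for sample in samples:
        _, _, t = sample
        if last_time is not None and t - last_time > pause:
            glyphs.append(current)
            current = []
        current.append(sample)
        last_time = t
    if current:
        glyphs.append(current)
    return glyphs

# Two quick strokes separated by a half-second pause become two glyphs.
data = [(0, 0, 0.00), (5, 0, 0.05), (10, 0, 0.10),
        (10, 10, 0.60), (10, 20, 0.65)]
print(len(segment_by_timing(data)))  # -> 2
```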
Recognition Issues A great deal of research in interactive drawing aims at recognizing sketches, diagrams, and drawings for semantic information processing by intelligent systems that apply domain knowledge to reason about designs. Once the system has extracted the semantics of a proposed design from the drawing, various knowledge-based design aids, such as simulation programs, expert systems, case-based reasoning tools, and other automated advisors, can be brought to bear. An interface that recognizes and interprets the design semantics of sketches and diagrams enables a designer to employ these programs in the early phases of designing. For example, a program that recognizes the components and connections of a mechanical diagram can construct and execute a computer simulation of the mechanism. A program that recognizes the layout of an architectural floor plan can retrieve from a database other similar or analogous floor plans. A program that recognizes a sketched layout of a graphical user interface can generate code to construct that interface. A variety of recognition approaches have been explored, including visual-language parsing and statistical methods. Parsing approaches consider a drawing as an expression in a visual language composed of glyphs (simple drawing marks such as
arrows, circles, and rectangles) arranged in various spatial relations into configurations. Typically a low-level recognizer first identifies the glyphs. Some systems restrict glyphs to a single stroke, requiring, for example, that a box be drawn without lifting the pen; others allow multiple-stroke glyphs, allowing the box to be drawn as four distinct strokes. After the glyph recognizer has identified the basic drawing elements, the parser identifies legal visual expressions by matching the drawing against grammar rules. Smaller visual units—initially glyphs, then configurations arranged in specific spatial relations—make up more complex visual expressions. Each design domain has its own visual language, so parsing approaches to general-purpose sketch recognition must either be told which visual language to use or must determine this information from the context. Statistical methods such as Bayesian networks and hidden Markov models have proved successful in other kinds of recognition, notably speech recognition and natural-language understanding. Statistical techniques make it possible to build visual-language recognizers without having to manually construct a grammar for each domain-specific language. One argument leveled against sketch recognition is that people are highly sensitive to recognizer failure and will not tolerate imperfect recognizer performance. Experience (for instance, with speech-to-text systems and early handwriting recognizers) shows that users become quite frustrated unless recognition is extremely reliable, that is, has accuracy rates above 99 percent. On the other hand, unlike speech and character recognition—where it can be assumed that the input has only one intended interpretation—uncertainty in various forms may be more acceptable in drawing, especially when a designer wants to preserve ambiguity. For sketch recognition, then, methods of sustaining ambiguity and vagueness may be at least as important as accuracy. An intermediate approach to recognition asks the user to label the elements of a sketch rather than attempt low-level glyph recognition. In this hybrid approach the user enters a freehand drawing; then after the user has labeled the elements (choosing from a palette of symbols) the system can reason about the drawing’s spatial organization.
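As a rough illustration of the parsing approach, the following sketch assumes that a low-level recognizer has already labeled the glyphs and shows a single invented grammar rule that combines a box, an arrow, and a box into a larger configuration. The data types, the rule, and the domain are hypothetical, intended only to show the flavor of matching a drawing against grammar rules.

```python
# Toy visual-language parsing step: one invented grammar rule over labeled glyphs.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Glyph:
    kind: str          # e.g., "box", "arrow" (assigned by a low-level recognizer)
    x: float           # position of the glyph's center
    y: float

@dataclass
class Configuration:
    kind: str
    parts: List[Glyph]

def rule_connected_boxes(glyphs: List[Glyph]) -> Optional[Configuration]:
    """Grammar rule: box + arrow + box, read left to right, forms a 'link'."""
    boxes = sorted((g for g in glyphs if g.kind == "box"), key=lambda g: g.x)
    arrows = [g for g in glyphs if g.kind == "arrow"]
    for arrow in arrows:
        left = [b for b in boxes if b.x < arrow.x]
        right = [b for b in boxes if b.x > arrow.x]
        if left and right:
            return Configuration("link", [left[-1], arrow, right[0]])
    return None

sketch = [Glyph("box", 0, 0), Glyph("arrow", 5, 0), Glyph("box", 10, 0)]
print(rule_connected_boxes(sketch))  # -> Configuration(kind='link', ...)
```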
Designers often sketch during the early stages of design thinking, and therefore a sketch may serve the dual purpose of (1) recording what the designer already has decided and (2) exploring possible alternatives. Sketches in general will vary along the dimensions of ambiguity and precision, and even within a single sketch some parts may record definite and precise design decisions while other parts are vague, amorphous, and imprecise, representing work-in-progress exploration. Recognition-based drawing systems must be able to deal with these extremes as well as with the range of representations in between, and they must also be able to determine autonomously—from the drawing itself—what degree of ambiguity and imprecision the designer intended to convey. For example, a recognition-based system might be able to distinguish between its own failure to recognize precise input and a drawing that is deliberately indeterminate. The ability of a system to sustain ambiguous and imprecise representations is for this reason especially important, and this may pertain not only to the interface-recognition algorithms, but also to any back-end processes behind the interface that later represent or reason about the designs. A recognizer can support imprecision and ambiguity in several ways. Recognition-based interfaces can catch, resolve, or mediate potential errors and ambiguities at input time, for example, by presenting the user with a sorted list of alternative interpretations. Visual-language interpreters can employ fuzzy-logic techniques, representing match probabilities in the parse, or they may allow the parse to carry multiple alternative interpretations. Rather than requiring an entire drawing to represent a single visual sentence, a recognizer may take a bottom-up approach that identifies some parts of the drawing while allowing others to remain uninterpreted.
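One concrete way a recognizer can preserve ambiguity is to return a ranked list of alternative interpretations rather than committing to a single label, which an interface could then present to the user or carry forward unresolved. The sketch below illustrates that idea; the scoring scheme and the numbers are placeholders, not any particular system's method.

```python
# Illustrative n-best ranking of alternative interpretations for one glyph.
# Raw scores and labels are invented; a real system would derive them from matching.
from typing import Dict, List, Tuple

def rank_interpretations(scores: Dict[str, float],
                         keep: int = 3) -> List[Tuple[str, float]]:
    """Normalize raw match scores and return the top `keep` alternatives."""
    total = sum(scores.values()) or 1.0
    ranked = sorted(((label, s / total) for label, s in scores.items()),
                    key=lambda item: item[1], reverse=True)
    return ranked[:keep]

# Hypothetical similarity scores for an ambiguous, round-ish glyph.
print(rank_interpretations({"circle": 0.8, "ellipse": 0.7, "rectangle": 0.1}))
# -> [('circle', 0.5), ('ellipse', 0.4375), ('rectangle', 0.0625)]
```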
Avoiding Recognition: Annotation and Multimodal Systems Another response to the problem of recognition is to avoid it entirely and simply manage drawings as design representations independent of their semantic content. This approach is taken in systems that
treat drawings as components of a collection of multimodal conversations. Despite a popular myth of the lone creative designer, real-world design typically involves a team of participants that includes experts from a variety of design disciplines as well as other stakeholders, and a process that can range in duration from weeks to years. The record of the designing process (the design history) can therefore include successive and alternative versions over time and the comments of diverse participants, along with suggestions, revisions, discussions, and arguments. Sketches, diagrams, and drawings are important elements in the record of this design history. Design drawings are inevitably expressions in a larger context of communication that includes spoken or written information, photographs and video, and perhaps computational expressions such as equations or decision trees. This gives rise to a wide range of multimodalities. For example, a designer may (a) mark up or “redline” a drawing, photograph, 3D model, or video to identify problems or propose changes, or add text notes for similar reasons; (b) insert a drawing to illustrate an equation or descriptive text or code; (c) annotate a drawing with spoken comments, recording an audio (or video) track of a collaborative design conversation as the drawing is made or attaching audio annotations to the drawing subsequently. Associated text and audio/video components of the design record can then be used in conjunction with the drawing; for example, text can be indexed and used to identify the role, function, or intentions of the accompanying drawings.
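The design-history record described above can be sketched as a simple data model. The following illustrative fragment attaches annotations in several modalities to a drawing version and indexes the text ones for retrieval; the field names, modality labels, and retrieval method are assumptions made for the example, not a description of any existing system.

```python
# Minimal, hypothetical data model for multimodal annotations in a design history.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Annotation:
    author: str
    modality: str        # e.g., "text", "audio", "video", "redline"
    content: str         # text body, or a path/URI to a media file

@dataclass
class DrawingRecord:
    drawing_id: str
    version: int
    annotations: List[Annotation] = field(default_factory=list)

    def find_by_text(self, term: str) -> List[Annotation]:
        """Simple index: return text annotations mentioning a search term."""
        return [a for a in self.annotations
                if a.modality == "text" and term.lower() in a.content.lower()]

record = DrawingRecord("floor-plan", version=3)
record.annotations.append(Annotation("reviewer", "text", "Widen the entry stair."))
record.annotations.append(Annotation("designer", "audio", "notes/stair-discussion.wav"))
print(record.find_by_text("stair"))
```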
From Sketch to 3D Designers in physical domains such as mechanical, product, and industrial engineering and architecture often sketch isometric and perspective drawings to describe three-dimensional artifacts. Therefore, sketch-recognition research has long sought to build systems that can generate three-dimensional models from two-dimensional sketches. Although this goal has not yet been achieved in the general case of arbitrary 2D sketches, a variety of approaches have been pursued, each with particular strengths and limitations, and each supporting specific kinds of sketch-to-3D constructions. Recent representative
efforts include SKETCH!, Teddy, Chateau, SketchVR, and Stilton. Despite its name, the SKETCH! program does not interpret line drawings; rather, the designer controls a 3D modeler by drawing multistroke gestures, for example, three lines to indicate a corner of a rectangular solid. Teddy enables a user to generate volumes with curved surfaces (such as teddy bears) by “inflating” 2D curve drawings. It uses simple heuristics to generate a plausible model from a sketch. Chateau is a “suggestive” interface: It offers alternative 3D completions of a 2D sketch as the user draws, asking in effect, “Do you mean this? Or this?” SketchVR generates three-dimensional models from 2D sketches by extrusion. It identifies symbols and configurations in the drawing and replaces them in the 3D scene with modeling elements chosen from a library. In Stilton, the user draws on top of the display of a 3D scene; the program uses heuristics about likely projection angles to interpret the sketch.
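Extrusion, the simplest of the 2D-to-3D constructions mentioned above, can be sketched in a few lines; systems such as SketchVR add symbol recognition and heuristics on top of such a step. The function below is an illustrative simplification, not the implementation of any of the systems named.

```python
# Illustrative extrusion of a closed 2D outline into the vertices of a 3D solid.
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]

def extrude(outline: List[Point2D], height: float) -> List[Point3D]:
    """Return the bottom and top vertices of the extruded outline."""
    bottom = [(x, y, 0.0) for x, y in outline]
    top = [(x, y, height) for x, y in outline]
    return bottom + top

# Extruding a sketched rectangle into a box 2.5 units tall.
print(extrude([(0, 0), (4, 0), (4, 3), (0, 3)], height=2.5))
```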
The Future Much of the personal computer era has been dominated by interfaces that depend on text or on interacting with mouse-window-menu systems. A renewed interest in sketch-based interaction has led to a new generation of systems that manage and interpret hand-drawn input. Today, human-computer interaction research is enabling computer-aided design software to take advantage of sketching, drawing, and diagramming, which have long been essential representations in design, as well as in many other activities. Progress in freehand-drawing interaction research will go hand in hand with research in design processes and cognitive studies of visual and diagrammatic reasoning. Mark D. Gross See also Evolutionary Engineering; Pen and Stylus Input
FURTHER READING Davis, R. (2002). Sketch understanding in design: Overview of work at the MIT AI lab. In R. Davis, J. Landay, & T. F. Stahovich (Eds.), Sketch understanding: Papers from the 2002 AAAI Symposium
(pp. 24–31). Menlo Park, CA: American Association for Artificial Intelligence (AAAI). Do, E. Y.-L. (2002). Drawing marks, acts, and reacts: Toward a computational sketching interface for architectural design. AIEDAM (Artificial Intelligence for Engineering Design, Analysis and Manufacturing), 16(3), 149–171. Forbus, K., Usher, J., & Chapman, V. (2003). Sketching for military courses of action diagrams. In International Conference on Intelligent User Interfaces (pp. 61–68). San Francisco: ACM Press. Goel, V. (1995). Sketches of thought. Cambridge MA: MIT Press. Gross, M. D., & Do, E. Y.-L. (2000). Drawing on the back of an envelope: A framework for interacting with application programs by freehand drawing. Computers and Graphics, 24(6), 835–849. Igarashi, T., & Hughes, J. F. (2001). A suggestive interface for 3-D drawing. In Proceedings of the ACM Symposium on User Interface Software and Technology (UIST) (pp. 173–181). New York: ACM Press. Igarashi, T., Matsuoka, S., & Tanaka, H. (1999). Teddy: A sketching interface for 3-D freeform design. In Proceedings of the SIGGRAPH 1999 Annual Conference on Computer Graphics (pp. 409–416). New York: ACM Press/Addison-Wesley Publishing Co. Kurtoglu, T., & Stahovich, T. F. (2002). Interpreting schematic sketches using physical reasoning. In R. Davis, J. Landay, & T. Stahovich. (Eds.), AAAI Spring Symposium on Sketch Understanding (pp. 78–85). Menlo Park, CA: AAAI Press. Landay, J. A., & Myers, B. A. (1995). Interactive sketching for the early stages of interface design. In CHI ’95—Human Factors in Computing Systems (pp. 43–50). Denver, CO: ACM Press. Larkin, J., & Simon, H. (1987). Why a diagram is (sometimes) worth 10,000 words. Cognitive Science, 11, 65–99. Mankoff, J., Hudson, S. E., & Abowd, G. D. (2000). Providing integrated toolkit-level support for ambiguity in recognition-based
interfaces. In Proceedings of the Human Factors in Computing (SIGCHI) Conference (pp. 368–375). The Hague, Netherlands: ACM Press. Negroponte, N. (1973). Recent advances in sketch recognition. In AFIPS (American Federation of Information Processing) National Computer Conference, 42, 663–675. Boston: American Federation of Information Processing. Oviatt, S., & Cohen, P. (2000). Multimodal interfaces that process what comes naturally. Communications of the ACM, 43(3), 45–53. Pinto-Albuquerque, M., Fonseca, M. J., & Jorge, J. A. (2000). Visual languages for sketching documents. In Proceedings, 2000 IEEE International Symposium on Visual Languages (pp. 225–232). Seattle, WA: IEEE Press. Saund, E., & Moran, T. P. (1994). A perceptually supported sketch editor. Paper presented at the ACM Symposium on User Interface Software and Technology, Marina del Rey, CA. Sutherland, I. (1963). Sketchpad: A man-machine graphical communication system. In Proceedings of the 1963 Spring Joint Computer Conference (pp. 329–346). Baltimore: Spartan Books. Suwa, M., & Tversky, B. (1997). What architects and students perceive in their sketches: A protocol analysis. Design Studies, 18, 385–403. Turner, A., Chapman, D., & Penn, A. (2000). Sketching space. Computers and Graphics, 24, 869–876. Ullman, D., Wood, S., & Craig, D. (1990). The importance of drawing in the mechanical design process. Computers and Graphics, 14(2), 263–274. Zeleznik, R., Herndon, K. P., & Hughes, J. F. (1996). SKETCH: An interface for sketching 3-D scenes. In SIGGraph ’96 Conference Proceedings (pp. 163–170). New York: ACM Press.
E
E-BUSINESS
EDUCATION IN HCI
ELECTRONIC JOURNALS
ELECTRONIC PAPER TECHNOLOGY
ELIZA
E-MAIL
EMBEDDED SYSTEMS
ENIAC
ERGONOMICS
ERRORS IN INTERACTIVE BEHAVIOR
ETHICS
ETHNOGRAPHY
EVOLUTIONARY ENGINEERING
EXPERT SYSTEMS
EYE TRACKING
E-BUSINESS Although traditional business barriers such as time and distance now matter less, new challenges arise when people conduct electronic business (e-business). These challenges arise from two fundamental sources: global customers’ cultural values and culturally sensitive technology applications. Important cultural questions affect e-business and global customers, including (1) Why is culture important to consider when conducting e-business? and (2) How do companies leverage their information technology (IT) applications in light of the cultural differences exhibited by global customers? Answering these questions can help companies that use IT in a multicultural market.
The Technological Revolution A new landscape for conducting e-business has arisen with the proliferation of technologies that facilitate e-business, such as information communication technologies (ICT) (any communication device or application encompassing radio, television, cellular phones, satellite systems, etc.); enterprise resource planning (ERP) (any software system designed to support and automate the business processes of medium and large businesses); electronic data interchange (EDI) (the computer-to-computer exchange of business documents in a standard format); and manufacturing resource planning (MRP) (a system for effectively managing material requirements in a manufacturing process). Agility, flexibility, speed, and change are the conditions for developing e-business models. In
addition, by using information and telecommunication systems, companies are able to communicate with their global customers where barriers such as time zones, currencies, languages, and legal systems are reduced or eliminated. As a result, global customers can be reached anywhere and at any time. Services and products can be obtained whenever, wherever, and by whomever. The digital economy is blazing a new path for doing business where the notion of “value through people” becomes the driving force for a successful model of e-business. With the advent of the World Wide Web, business is increasingly conducted in an online environment. Traditional brick-and-mortar businesses have evolved into “click-and-mortar” businesses. Additionally, the Internet has changed from a communications tool used mostly by scientists to a business tool used by companies to reach millions of customers across the globe. As a result, the Internet has become a powerful business resource because its technology enables firms to conduct business globally (Simeon 1999). In addition, online sales easily penetrate global markets. Some companies treat Web customers as a new type of audience—so united in their use of the Internet that national differences no longer apply. Other companies, such as IBM, Microsoft, and Xerox, have developed local versions of their websites. These versions run off regional servers, address technical issues (such as the need to display different character sets), and provide information about local services and products. Occasionally they reflect aesthetic differences—such as cultural biases for or against certain colors—but few companies actively consider cultural variations that might enhance the delivery of their products.
What Are E-business and E-commerce? The terms e-business and e-commerce have slightly different meanings. E-business is “. . . a broader term that encompasses electronically buying, selling, servicing customers, and interacting with business partners and intermediaries over the Internet. Some experts see e-business as the objective and e-commerce as the means of achieving that objective”
(Davis and Benamati 2003, 8). In essence, e-business means any Internet or network-enabled business; for example, companies can buy parts and supplies from each other, collaborate on sales promotion, and conduct joint research. On the other hand, e-commerce is a way of doing business using purely the Internet as a means, whether the business occurs between two partners (business to business—B2B), between a business and its customers (business to customers—B2C), between customers (C2C), between a business and employees (B2E), or between a business and government (B2G). According to Effy Oz (2002), an expert in information technology and ethics, there are three categories of organizations that want to incorporate the Web into their e-business: (1) organizations that have a passive presence online and focus on online advertising, (2) organizations that use the Web to improve operations, and (3) organizations that create stand-alone transaction sites as their main or only business. In contrast, although the ultimate goal of business is profit generation, e-commerce is not exclusively about buying and selling. Instead, the real goal of e-commerce is to improve efficiency by the deployment of technologies. Factors that influence the development of e-commerce are a competitive environment, strategic commitment of the company, and the required competencies. Thus, the definition of e-commerce has a more restricted application than that of e-business.
Understanding Cultural Concepts Explained below are three different categories of culture. The first category is national culture, in which differences in cultural values are described along four key dimensions (Hofstede 1980). First is the individualism-collectivism dimension, which denotes a culture’s level of freedom and independence of individuals. Second is the power-distance dimension, which denotes the levels of inequality expected and accepted by people in their jobs and lives. Third is the uncertainty-avoidance dimension, which denotes how societies deal with the unknown aspects of a
different environment and how much people are willing to accept risks. Fourth is the masculinity-femininity dimension, which denotes a culture’s ranking of values such as being dominant, assertive, tough, and focused on material success. The second category of culture is related to organizational culture. According to Edgar H. Schein, an organizational psychologist, organizational culture “is a property of a group. It arises at the level of department, functional groups, and other organizational units that have a common occupational core and common experience. It also exists at every hierarchical level of the organizations and at the level of the whole organization” (Schein 1999, 13–14). Thus, intense organizational culture can result in manifestations such as the phrases “the way we do things around here,” “the rites and rituals of our company,” “our company climate,” “our common practices and norms,” and “our core values.” A third category of culture can be called “information technology culture.” Information technology culture often overlaps national and organizational cultures. Indeed, IT culture is part of the organizational culture, which determines whether the user (i.e., customer) accepts or resists the technology to be used. IT culture can be defined as the set of values and practices shared by those members of an organization who are involved in IT-related activities (i.e., programming, system analysis and design, and database management), such as information system professionals and managers.
Global Customers: Challenges of Cultural Differences IT applications in the context of e-business have become more important because today companies of all sizes and in all sectors are adopting the principles of cultural diversity, as opposed to cultural convergence, when reaching out to global customers. Some questions that are worth considering are why global customers resist new IT implementation, how organizational culture affects new customers’ attitudes toward new IT implementation, and why many companies fail to consider the role
of culture when developing and implementing IT applications. Companies have difficulty in understanding or even recognizing cultural factors at a deeper level because the factors are complex and subtle. Companies’ understanding of cultural factors is normally only superficial, which is why people have difficulty observing the magnitude of the impact of such factors on the success or failure of e-business companies. Although people have conducted an increasing amount of research in global IT, this research has been primarily limited to descriptive cross-cultural studies where comparison analyses were made between technologies in different national cultures. A universal interface should not be mistakenly considered as one interface for all customers. The concept of universalism is somewhat misleading in this context. The most important goal is to ensure that customers feel at home when exploring the Internet. Fundamentally, cultural factors have strong influences on global customers’ preferences. Each customer has his or her own culturally rooted values, beliefs, perceptions, and attitudes. When loyal customers are satisfied with the way they have been buying goods and services, they resist changes. Making purchases online is less desirable to many customers. The fact that customers cannot touch or smell the products that they want makes some resistant to e-business. Customers also can be resistant because they lack the skills to use new technologies and an understanding of how e-business is conducted. Different ethnic cultures demonstrate different cognitive reactions, requiring different environmental stimuli (Tannen 1998). Similarly, Web-marketing psychology depends on different mixtures of cognitive and behavioral elements (Foxall 1997). Language, values, and infrastructure can also be barriers to e-business. For example, the preference of many Chinese people for a cash-based payment system or “cash-on-delivery” is the main obstacle to conducting e-business in China. The phenomenon can be explained by factors such as a lack of real credit cards, a lack of centralized settlement systems (the ability for credit cards to be used anywhere), and a lack of trust in conducting business via the Internet (Bin, Chen, and Sun 2003).
When people of Malaysia and Australia were asked to evaluate eight websites from their countries, the findings confirmed that the subjects had no preference for one-half the websites but had a preference associated with subject nationality for the other one-half (Fink and Laupase 2000). A study of mobile-phone use in Germany and China showed that people accepted support information and rated it as more effective when it was culturally localized. Similar studies have shown that cultural factors influence how long customers stay on the Internet, how likely they are to buy products online when the content is presented in their native language, and the usability of Web design elements. Researchers believe that culturally specific elements increase international participation in conducting e-business more than do genre-specific elements. Interestingly, the role of culture in user interface design can be seen in the localization elements that serve as cultural markers. These cultural markers are influenced by a specific culture or specific genre (Barber and Badre 1998). Examples of cultural markers are interface design elements that reflect national symbols, colors, or forms of spatial organization. After reviewing hundreds of websites from different countries and in different languages, Barber and Badre posited that different cultural groups prefer different types of cultural markers.
E-business Strategies Because global customers come from all over the world, their demands, needs, and values are more divergent than similar. Cultural context and cultural distance may have an impact on how goods and services can be delivered to them—that is, on marketing channels and logistics. Hence, e-business companies must fully understand the values that affect customers’ preferences. Companies need to tailor their products to customers’ electronic requirements. Selling products electronically means that businesses must consider international channels of distribution that fit with customers’ values. The electronic environment can become a barrier to successful business endeavors. For example, in some cultures a business transaction is best conducted face
to face. Customers can negotiate better with a seller because such a setting allows the reciprocal communication of interests and intentions. Once seller and customer establish rapport, a more trusting relationship develops, and this trust can lead to repeated transactions. Trusting the seller or buyer is crucial in certain cultures. Companies that conduct e-business need to find new mechanisms and strategies that overcome such cultural differences. In a situation such as the study of Chinese customers, where credit cards were not the common system of payment, a pragmatic strategy might be to buy online and pay offline. Research on customer interfaces for World Wide Web transactions indicates that there are significant cultural variations in why people use the Internet (O’Keefe et al. 2000). The U.S. subjects in that research used the Internet solely to search for information, whereas the Hong Kong subjects used the Internet to communicate socially. A wise e-business strategy for a company is thus to emphasize personal competence for customers holding Western values and to emphasize social relationships and shared loyalty for customers holding Eastern values. Another e-business strategy emphasizes leveraging technology. Electronic businesses have two options when designing websites for customers in different countries—design one website for all or “localized” websites for each country. If the audience crosses national borders, a single website may be appropriate. For instance, websites exist for Arctic researchers and astronomers. However, this strategy is less likely to be successful when no overriding professional or occupational focus unifies the audience. The alternative is for companies to develop local versions of their websites. These local versions may be run off regional servers to enhance performance or to display different character sets. They also can emphasize different product lines. Unfortunately, unless the company is highly decentralized, variations in the basic message or mode of presentation that might enhance delivery of its products to people in another culture are rarely seen. Melissa Cole and Robert O’Keefe (2000) believe that Amazon.com and Autobytel.com (an auto sales website) have transcended global differences by employing a standardized transaction-oriented
interface. Such an interface may be practical for people who have a limited goal (such as deciding which book to buy) but may not be practical for people who do not. Because different audiences use the Internet for different purposes, standardized features may not be practical for all the nuances of cultural values. Designing interfaces for people who are searching for social relationships, rather than seeking information, imposes different requirements on Web retailers and designers. Culture has significant impacts on global customers and software designers. The merging concepts of culture and usability have been termed “cultural user interfaces” by Alvin Yeo (1996) and “culturability” by Barber and Badre (1998). Yeo talks about culture’s effect on overt and covert elements of interface design. Tangible, observable elements such as character sets and calendars are overt and easy to change, whereas metaphors, colors, and icons may reflect covert symbols or taboos and be difficult to recognize and manipulate. Barber and Badre assert that what is user friendly to one nation or culture may suggest different meanings and understandings to another. Therefore, efforts to build a generic global interface may not be successful. Instead, cultural markers should be programmatically changed to facilitate international interactions.
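A minimal sketch of what programmatically changed cultural markers might look like is a locale-keyed configuration with a neutral fallback, as below. The locales, colors, and date formats are invented placeholders for illustration, not design recommendations or findings from the studies cited above.

```python
# Illustrative locale-keyed selection of cultural markers with a neutral fallback.
# All marker values below are hypothetical placeholders.
from typing import Dict

CULTURAL_MARKERS: Dict[str, Dict[str, str]] = {
    "en-US": {"accent_color": "#003366", "date_format": "MM/DD/YYYY"},
    "zh-CN": {"accent_color": "#C8102E", "date_format": "YYYY-MM-DD"},
    "default": {"accent_color": "#444444", "date_format": "YYYY-MM-DD"},
}

def markers_for(locale: str) -> Dict[str, str]:
    """Fall back to a neutral marker set when a locale has no tailored entry."""
    return CULTURAL_MARKERS.get(locale, CULTURAL_MARKERS["default"])

print(markers_for("zh-CN"))   # tailored marker set
print(markers_for("fr-FR"))   # falls back to the default set
```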
Implications Because of globalization and competitiveness in international business, many multinational and local companies have considered implementing technologies to conduct e-business. Among the primary factors that companies must consider are the effect of culture on customers’ technology acceptance and customers’ cultural values. Companies should address the appropriateness of management policies and practices across countries. For example, managers need to make decisions concerning certain situations such as whether a global company can override national cultural differences and when local policies are best. IT provides vast opportunities for companies to compete in the global and electronic arena. At the same time, customers from different cultures can differ significantly in their perceptions, beliefs,
attitudes, tastes, selection, and participation in e-business. Hence, companies need to fully understand cultural variances in order to make decisions on which e-business strategies work best. Some basic questions for future research would be: (1) What makes for universally appealing IT practices? (2) Does acceptability or familiarity drive global IT use? (3) How does one successfully introduce technology applications that are unusual or not appropriate in a country? (4) How can cultural differences be considered in the planning of IT practices? In a nutshell, companies and Web designers need to be sensitive to the different needs of global customers and to build strategies and interfaces that consider cultural assumptions and characteristics. Taking advantage of national differences and preferences provides resource-based competencies and competitive advantages for e-businesses. Companies need a more innovative e-business model. With new e-business practices, success is centered on people’s values, agility, speed, flexibility, and change. Hence, the common business phrase “Think globally, act locally” may not be as practical as “Think locally, act globally.” Reaching out to global customers means reflecting their local cultures, language, and currency.
Norhayati Zakaria See also Anthropology and HCI; Ethnography; Website Design
FURTHER READING Barber, W., & Badre, A. (1998). Culturability: The merging of culture and usability. Human Factors and the Web. Retrieved March 1, 2004, from http://www.research.att.com/conf/hfweb/proceedings/barber/index.htm Bin, Q., Chen, S., & Sun, S. (2003). Cultural differences in e-commerce: A comparison between the U.S. and China. Journal of Global Information Management, 11(2), 48–56. Cole, M., & O’Keefe, R. M. (2000). Conceptualizing the dynamics of globalization and culture in electronic commerce. Journal of Global Information Technology Management, 3(1), 4–17. Cooper, R. B. (1994). The inertia impact of culture on IT implementation. Information and Management, 17(1), 17–31. Davis, W. S., & Benamati, J. (2003). E-commerce basics: Technology foundations and e-business applications. New York: Addison-Wesley. Fink, D., & Laupase, R. (2000). Perceptions of web site design characteristics: A Malaysian/Australian comparison. Internet Research, 10(1), 44–55.
Foxall, G. R. (1997). Marketing psychology: The paradigm in the wings. London: Macmillan. Hofstede, G. (1980). Culture's consequences: International differences in work-related values. Beverly Hills, CA: Sage. Honald, P. (1999). Learning how to use a cellular phone: Comparison between German and Chinese users. Technical Communication: Journal of the Society for Technical Communication, 46(2), 196–205. Janssens, M., Brett, J. M., & Smith, F. J. (1995). Confirmatory crosscultural research: Testing the viability of a corporate-wide safety policy. Academy of Management Journal, 38, 364–382. Johnston, K., & Johal, P. (1999). The Internet as a “virtual cultural region”: Are extant cultural classifications schemes appropriate? Internet Research: Electronic Networking Applications and Policy, 9(3), 178–186. Kowtha, N. R., & Choon, W. P. (2001). Determinants of website development: A study of electronic commerce in Singapore. Information & Management, 39(3), 227–242. O'Keefe, R., Cole, M., Chau, P., Massey, A., Montoya-Weiss, M., & Perry, M. (2000). From the user interface to the consumer interface: Results from a global experiment. International Journal of Human Computer Studies, 53(4), 611–628. Oz, E. (2002). Foundations of e-commerce. Upper Saddle River, NJ: Pearson Education. Rosenbloom, B., & Larsen, T. (2003). Communication in international business-to-business marketing channels: Does culture matter? Industrial Marketing Management, 32(4), 309–317. Ryan, A. M., McFarland, L., Baron, H., & Page, R. (1999). An international look at selection practices: Nation and culture as explanations for variability in practice. Personnel Psychology, 52, 359–391. Sanders, M. (2000). World Net commerce approaches hypergrowth. Retrieved March 1, 2004, from http://www.forrester.com/ER/Research/Brief/0,1317,9229,FF.html Schein, E. H. (1999). The corporate culture survival guide: Sense and nonsense about cultural change. San Francisco: Jossey-Bass. Simeon, R. (1999). Evaluating domestic and international web-sites strategies. Internet Research: Electronic Networking Applications and Policy, 9(4), 297–308. Straub, D., Keil, M., & Brenner, W. (1997). Testing the technology acceptance model across cultures: A three country study. Information & Management, 31(1), 1–11. Tannen, R. S. (1998). Breaking the sound barrier: Designing auditory displays for global usability. Human Factors and the Web. Retrieved March 1, 2004, from http://www.research.att.com/conf/hfweb/proceedings/tannen/index.htm Wargin, J., & Dobiey, D. (2001). E-business and change: Managing the change in the digital economy. Journal of Change Management, 2(1), 72–83. Yeo, A. (1996). World-wide CHI: Cultural user interfaces, a silver lining in cultural diversity. SIGCHI Bulletin, 28(3), 4–7. Retrieved March 1, 2004, from http://www.acm.org/sigchi/bulletin/1996.3/international.html.
ECONOMICS AND HCI See Denial-of-Service Attack; Digital Cash; E-business; Hackers
EDUCATION IN HCI Education in human-computer interaction (HCI) teaches students about the development and use of interactive computerized systems. Development involves analysis, design, implementation, and evaluation, while use emphasizes the interplay between the human users and the computerized systems. The basic aim of instruction in HCI is that students learn to develop systems that support users in their activities. Education in HCI is primarily conducted in two contexts: academia and industry. HCI is an important element in such diverse disciplines as computer science, information systems, psychology, arts, and design. Key elements of HCI are also taught as industry courses, usually with more focus on the design and development of interactive systems.
Development of HCI as a Discipline The first education programs in computer science and computer engineering were developed in the 1970s and 1980s. They dealt extensively with hardware and software; mathematics was the main supporting discipline. As the field of computer science has developed, other disciplines have been added to accommodate changes in use and technology. HCI is one such discipline; it has been added to many computer science curricula during the 1990s and early 2000s. In order to promote a more unified and coherent approach to education in HCI, the Special Interest Group on Human-Computer Interaction (SIGCHI), part of the Association for Computing Machinery (ACM), the world’s oldest and largest international computing society, decided in 1988 to initiate development of curriculum recommendations. The result of this work was published in 1992 under the title ACM SIGCHI Curricula for Human-Computer Interaction. The report defined the discipline and presented six main content areas. The report also provided four standard courses: CS1 (User Interface Design and Development), CS2 (Phenomena and Theories of Human-Computer Interaction), PSY1 (Psychology of Human-Computer
A Personal Story—Bringing HCI Into the “Real World” In teaching HCI concepts, I often try to make connections to interaction with the real world. One of the classrooms in which I teach is adjacent to a chemistry lab. A solid wooden door connects the two rooms. Until recently, a large white sign with red lettering was posted on the door, visible to all in the classroom, reading, “Fire door. Do not block.” I found nothing remarkable about this arrangement until one day I noticed that the door has no knob, no visible way of opening it. Further examination showed that the hinges are on the inside of the door, so that it opens into the classroom. A bit of thought led to the realization that the door is for the students in the chemistry lab; if a fire breaks out in the lab they can escape into the classroom and then out into the corridor and out of the building. All well and good, but where does that leave students in the classroom? Imagine a fire alarm going off and the smell of smoke in the air. My students rush to what looks to be the most appropriate exit, and find that there's no way of opening the door marked “Fire door,” and that pushing on it is not the solution in any case. When I describe this scenario to my HCI students in the classroom, as an example of inadequate design in our immediate surroundings, it usually gets a few chuckles, despite the context. Still, they can learn a few lessons about design from this example. Messages are targeted at specific audiences, and messages must be appropriate for their audience. Here we have two potential audiences, the students in each of the two adjoining rooms. For the students in the chemistry lab, the sign would be perfectly appropriate if it were visible on the other side of the door. For the students in the classroom, less information would actually improve the message: “Important: Do not block this door” would be sufficient. This avoids drawing attention to the function of the door, functionality that is not targeted at those reading the sign. In general, conveying an unambiguous message can be difficult and requires careful thought. The sign no longer hangs on the door, which now stands blank. Robert A. St. Amant
Interaction), and MIS1 (Human Aspects of Information Systems). CS1 and CS2 were designed to be offered in sequence in a computer science or computer engineering department. CS1 focused on HCI aspects of software, dealing primarily with practical development of interfaces. It was defined as a general course that complemented basic programming and software engineering courses. CS2 was for students specializing in HCI, and it examined HCI in a broader context, presented more-refined design and evaluation techniques, and placed more emphasis on scientific foundations. The PSY1 course was designed to be offered in a psychology, human factors, or industrial engineering department. It stressed the theoretical and empirical foundations of human-computer interaction. Here too the emphasis was more on design and evaluation techniques and less on implementation. The MIS1 course was designed to be offered in an information systems department. It focused on
use in order to contribute to consumer awareness of interactive systems. It emphasized the role of computers in organizations and evaluation of the suitability of technological solutions. Although the students were not thought of as system builders, the ACM SIGCHI report recommended teaching program design and implementation as well as the use of tools such as spreadsheets and databases that have considerable prototyping and programming capability. This classical curriculum has been very influential as a framework and source of inspiration for the integration of HCI into many educational programs. Several textbooks have been created to cover these areas, including Prentice-Hall’s 1993 Human-Computer Interaction, Addison-Wesley’s 1994 Human-Computer Interaction, and Addison-Wesley’s 1998 Designing the User Interface. A classical reference for the graduate level is Readings in Human-Computer Interaction: Toward the Year 2000, published in 1995 by Morgan Kaufmann.
Typical Problems HCI literature includes numerous guidelines and methods for analyzing users’ work, for implementation, and for evaluation. There are also many discussions of specific designs for interactive systems, since the systematic design of a user interface is an essential activity in the development process. However, although courses expose students to a rich variety of systems, as soon as the students are confronted with the task of designing a new system, they are equipped with very little in the way of methodologies. Current education in HCI also does not pay enough attention to the interplay between design and implementation. Design and implementation can be seen as separate activities, but the tools used for implementation support certain designs and impede others. When the two activities are treated separately, this fundamental relation is ignored. Another weakness of many introductory courses is that they focus solely on design and implementation and fail to stress the importance of evaluation—of defining and measuring usability in a systematic manner. Within a single course, it is impossible to master all the issues involved in the development of a user interface, but students should be exposed to all the issues and understand their importance and how they are related. If they only learn about design and implementation and not about evaluating the usability of their products, we risk ending up with systems that are attractive on the surface but are of no practical use to a real user. The opposite approach is to focus primarily on evaluation from the very beginning. Students learn to evaluate the usability of an existing system through a course in usability engineering, which they can take in the first semester of an undergraduate program. Field evaluations and other, more complicated, forms of evaluations can then be introduced in later semesters.
New Challenges HCI education continues to be challenged by new technological developments. The PC revolution that occurred in the middle of the 1990s and the widespread use of graphical user interfaces required more focus on graphical design. Many courses have adapted to these developments.
Since the late 1990s, small mobile computers and Web-based applications have presented new challenges. The design of interfaces for such technologies is only supported to a very limited extent by the methods and guidelines that are currently taught in many HCI courses. Textbooks that deal with these challenges are beginning to appear. Web Site Usability, published in 1999 by Morgan Kaufmann, teaches design of Web-based applications. HCI courses have to some extent been adjusted to include brief descriptions of novel systems and devices to inspire students to use their imaginations, but most education in HCI still focuses on designing and developing traditional computer systems. Guidelines for developing interactive interfaces typically include careful analysis of the context of use, which has traditionally been work activities. Yet the new technologies are used in a multitude of other contexts, such as entertainment, and these new contexts must be taken into consideration for future guidelines.
Integrating Practical Development For students of HCI truly to understand the nature of the field, they must try putting their knowledge into action. There are two basically different ways of giving students experience with practical development: through course exercises and student projects. The ACM SIGCHI curriculum contains proposals for limited development tasks that students can solve as exercises in a course. CS1 encourages a focus on design and implementation, using interface libraries and tools. CS2 suggests having students begin from less well-defined requirements, thereby changing the focus more toward user work and task analysis. It is suggested that the students also complete design, implementation, and evaluation activities. The problem with such exercises is that they are limited in time and therefore tend to simplify the challenges of interface development. In addition, exercises are usually conducted in relation to just one course. Therefore, they usually involve topics from that one course only. A more radical approach is to have students work on projects that involve topics from a cluster of
courses. There are some courses of study in which HCI is one element in a large project assignment that student teams work to complete. These courses introduce general issues and support work with the project assignment—for example, an assignment to develop a software application for a specific organization might be supported with courses in HCI, analysis and design, programming, and algorithmics and data structures. This basic pedagogical approach introduces students to theories and concepts in a context that lets the students see the practical applications of those theories and concepts. Projects undertaken during different semesters can be differentiated by overall themes. Such themes might reflect key challenges for a practitioner—for example, software development for a particular organization or design of software in collaboration with users. Using projects as a major building block in each semester increases an educational program’s flexibility, for while the content of a course tends to be static and difficult to change, the focus of the projects is much easier to change and can accommodate shifting trends in technology or use. Thus while courses and general themes of the projects can be fixed for several years, the content of the projects can be changed regularly, so that, for example, one year students work on administrative application systems and the next on mobile devices. Managers from organizations that hire students after graduation have emphasized the importance of projects. The students get experience with large development projects that are inspired by actual real-world problems. In addition, the students learn to work with other people on solving a task. The managers often say that a student with that sort of training is able to become a productive member of a project team in a very short time.
The Future In the last decades of the twentieth century, HCI was integrated into many educational programs, and there are no signs that the subject will diminish in importance in the years to come. On the contrary, one can expect that many programs that have a basic focus on computing and information systems but that lack courses in HCI will take up the subject.
There are a growing number of cross-disciplinary programs that involve development and use of computers. In several of these, HCI is becoming a key discipline among a number of scientific approaches that are merged and integrated in one institutional setting. Finally, multidisciplinary education programs with an explicit and strong focus on design are beginning to appear. These programs handle the challenge from emerging technologies by using an overall focus on design to treat such diverse disciplines as computer science, architecture, industrial design, communication and interaction theory, culture and organization theory, art, media, and aesthetics. The goal is to educate students to think of themselves as designers who possess a rich and constructive understanding of how modern information technology can be used to support human interaction and communication. HCI will be a core subject in such programs. Jan Stage See also Classrooms
FURTHER READING Baecker, R. M., Grudin, J., Buxton, W. A. S., & Greenberg, S. (Eds.). (1995). Readings in human-computer interaction: Toward the year 2000 (2nd ed.). Los Altos, CA: Morgan Kaufmann. Dahlbom, B. (1995). Göteborg informatics. Scandinavian Journal of Information Systems, 7(2), 87–92. Denning, P. J. (1992). Educating a new engineer. Communications of the ACM, 35(12), 83–97. Dix, A., Finlay, J., Abowd, G., & Beale, R. (1993). Human-computer interaction. Hillsdale, NJ: Prentice-Hall. Hewett, T. T., Baecker, R., Card, S., Carey, T., Gasen, J., Mantei, M., et al. (1992). ACM SIGCHI curricula for human-computer interaction. New York: ACM. Retrieved July 24, 2003, from http://www.acm.org/sigchi/cdg/ Kling, R. (1993). Broadening computer science. Communications of the ACM, 36(2), 15–17. Mathiassen, L., & Stage, J. (1999). Informatics as a multi-disciplinary education. Scandinavian Journal of Information Systems, 11(1), 13–22. Nielsen, J. (1993). Usability engineering. San Francisco: Morgan Kaufmann. Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S., & Carey, T. (1995). Human-computer interaction. Reading, MA: Addison-Wesley. Rubin, J. (1994). Handbook of usability testing. New York: Wiley.
Shneiderman, B. (1998). Designing the user interface (3d ed.). Reading, MA: Addison-Wesley. Skov, M. B., & Stage, J. (2003). Enhancing usability testing skills of novice testers: A longitudinal study. Proceedings of the 2nd Conference on Universal Access in Computer-Human Interaction. Mahwah, NJ: Lawrence-Erlbaum. Spool, J. M., Scanlon, T., Schroeder, W., Snyder, C., & DeAngelo, T. (1999). Web site usability. Los Altos, CA: Morgan Kaufmann.
ELECTRONIC JOURNALS Scholarly journals, which include substantive research articles and other materials such as letters to the editor, book reviews, and announcements of meetings, trace their origins back to 1665, with Le Journal des Sçavans (trans., “Journal of the experts”) in Paris and the Philosophical Transactions of the Royal Society of London in London. These journals developed to share scientific discoveries among interested parties and to establish who was first to have made a given discovery or to have advanced a given theory. Peer review is an important part of publication in scholarly journals. It is a system whereby scholars who are experts in the same field as the author (the author’s peers) read, comment on, and recommend publication or rejection of an article. This process is usually single-blind (the author does not know who the reviewers are, but the reviewers know who the author is) or double-blind (the author does not know who the reviewers are and the reviewers do not know the identity of the author), which gives both readers and authors increased confidence in the validity of the published articles. Although it has been criticized from time to time, peer review remains one of the most valued aspects of publication in scholarly journals, which are also referred to as peer-reviewed or refereed journals.
Status of Electronic Journals Today Today, according to Ulrich’s Periodicals Directory, there are approximately 15,000 peer-reviewed journals actively published in all fields. (This number should be considered approximate, as new journals are constantly being launched and old ones constantly ceasing publication. In addition, journals
sometimes change their titles, making it difficult to arrive at an exact figure.) Beginning in the 1960s, the first attempts were made to convert scholarly journals or articles from journals into digital format. As information technologies and telecommunications infrastructure developed, digital, or electronic, journals have become a viable alternative to print. As of 2003, over 80 percent (approximately 12,000) of peer-reviewed journals are available in some electronic form. Fulltext Sources Online, published twice a year by Information Today, Inc., lists by title the scholarly journals, magazines, newspapers, and newsletters that are available in some digital form. The number of listings in Fulltext Sources Online grew from about 4,400 in 1993 to over 17,000 by the end of 2002. The formats of electronic journals (or e-journals) vary considerably, however.
Electronic Journals: Journal Focused or Article Focused E-journals can be categorized as either journal focused or article focused. Journal-focused e-journals are complete replacements for print, providing an entire journal and, often, even more information than is available in any extant print version. A journal-focused e-journal generally has a recognizable journal title, an editorial process, a collection of articles on related topics, and may even have volumes and issue numbers. These complete e-journals often permit browsing through tables of contents and often feature a search engine that lets readers search for specific information. Complete electronic journals provide the same branding function that print journals provide. They are typically available directly from the primary journal publisher, usually for a subscription charge. Article-focused e-journals are just databases of separate articles extracted from print or electronic versions of the complete journal. Commercial databases of separate articles may be available either from the primary publisher or from an aggregator service such as ProQuest, InfoTrac, or EbscoHost. Article-focused e-journals typically emphasize searching over browsing and mix articles from many different journals.
In these databases it is selected articles, rather than complete journal titles, that are made available. Even within journal-focused e-journals, there are many variations. The scholars Rob Kling and Ewa Callahan describe four kinds of electronic journals: pure e-journals distributed only in digital form; e-p-journals, which are primarily distributed electronically, but are also distributed in paper form in a limited way; p-e-journals, which are primarily distributed in paper form, but are also distributed electronically; and p- + e-journals, which have parallel paper and electronic editions. Electronic journals may be mere replicas of a print version, with papers presented in PDF format for handy printing, or they may provide a new e-design with added functionality, color graphics, video clips, and links to data sets. Both browsing and searching may be possible, or only one or the other. The availability of back issues also varies considerably. The American Astronomical Society has an advanced electronic-journals system, with added functions, links to other articles and to data sets, and extensive back files of old issues. Aggregators of electronic-journal articles are companies that act as third parties to provide access to journal articles from a variety of publishers. The advantage of an aggregator or a publisher that offers many titles is, of course, the availability of many articles from many journals in just one system. The system may offer articles from a wide variety of publishers and the originals may be print, electronic, or both.
Publishers of Scholarly Journals From their early days, scholarly journals were published by scholarly societies, commercial publishers, university presses, and government agencies. These main categories of publishers continue today with both print and electronic-journal publishing. The number of journals published by each is not equally distributed, however. Societies may be the most visible to scholars, yet only approximately 23 percent of scholarly journals are published by societies. They have a core constituency to serve, and publishing activities are almost always seen as a money-making venture to pay
for member services. Members may receive a subscription to a print or electronic journal with their society membership or, increasingly, pay extra for it. Society publishers' main revenue source is from subscriptions paid for by libraries. Some say that for-profit companies (commercial publishers) should not publish scholarly publications because research and scholarship should be freely available to all. A for-profit company owes its primary allegiance to its shareholders and the “bottom line” rather than only to the propagation of knowledge. Subscription fees create a barrier that means only those who can pay, or who belong to an institution that can pay, have access to important research information. Still, in scholarly journal publishing, commercial publishers such as Elsevier Science, Wiley, and Springer publish the largest percentage of the scholarly journals, and that percentage is growing. For-profit publishers range from those giants to relatively tiny publishers, and together they publish approximately 40 percent of all scholarly journals. Libraries are the main subscribers to both print and electronic journals and provide access to library constituents either by password or Internet protocol address (the address, given in numbers, that corresponds to an Internet location). University presses mostly publish monographs, but universities and other educational institutions also account for about 16 percent of scholarly journals. Other publishers, mostly government agencies, contribute 21 percent of the titles published. Many scientists and social scientists prefer electronic journals for the convenience of desktop access and additional functions, such as the ability to e-mail an article to a colleague. E-journals also allow scholars to save time locating and retrieving articles. Since almost all electronic journals have a subscription charge, libraries are the main customers, providing seamless access for faculty, students, staff, or researchers.
Article-Focused Alternatives to E-journals Article-focused e-journals, being collections of articles organized in subject-related databases,
are particularly good for in-depth reading over time or for access to articles that come from unfamiliar sources. They extend, rather than replace, a library’s journal collection and, like journals, are provided to library constituents on a secure basis through passwords or other authentication. Article databases are changing the nature of scholarship: In the late 1970s, scientists and social scientists read articles from an average of thirteen journal titles each year; with electronic-journal databases they now read from an average of twenty-three journal titles. In addition to taking advantage of aggregators’ article databases, readers can also choose to get individual articles from special electronic services, such as the Los Alamos/Cornell arXiv.org service or those linked to by the Department of Energy, Office of Scientific and Technical Information PrePrint Network (http://www.osti.gov/preprints/). These services provide access to articles that may be preprints of articles that will be submitted to peerreviewed journals by the author, postprints (copies of articles that are also published in journals), or papers that will never be submitted to traditional journals. Individual electronic articles may also be accessed at an author’s website or at institutional repositories. The Open Archives Initiative has led the way in alternatives to traditional journal publishing and has inspired related initiatives that move the responsibility for distributing scholarship from publishers to the scholars themselves or to the scholars’ institutions. Institutional repositories are now at the early planning and development stage, but ideally will include the entire intellectual capital of a university faculty, including papers, data, graphics, and other materials. The Open Archives Initiative promotes software standards for establishing institutional or individual e-print services (access to digital “preprints” or “postprints”) so many institutions are establishing OAI-compliant sites. E-print services are well established in some academic disciplines, in particular high-energy physics and astrophysics. They are not as common in disciplines such as medicine and chemistry, which rely heavily on peer review.
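The Open Archives Initiative protocol mentioned above (OAI-PMH) works over ordinary HTTP requests, which is part of why so many institutions can expose OAI-compliant sites. The sketch below is a minimal illustration, not a reference implementation: the verb and metadataPrefix parameters follow the OAI-PMH 2.0 specification, but the repository address repository.example.edu is a placeholder and the lack of error handling is a simplification.

```python
# Minimal sketch of an OAI-PMH harvesting request against a hypothetical repository.
# "ListRecords" and "oai_dc" are standard OAI-PMH 2.0 values; the endpoint is invented.
from urllib.request import urlopen
from urllib.parse import urlencode

BASE_URL = "http://repository.example.edu/oai"  # assumed endpoint, not a real service

def list_records(metadata_prefix="oai_dc", set_spec=None):
    """Fetch one page of Dublin Core records from an OAI-PMH repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec          # optional: restrict to one collection
    with urlopen(BASE_URL + "?" + urlencode(params)) as response:
        return response.read().decode("utf-8")   # XML listing of records

# print(list_records()[:500])   # would print the start of the XML response
```

A harvester built this way simply repeats the request with the resumption token the repository returns, which is how aggregators and institutional services gather e-print metadata from many sites.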
The Impact of E-publishing Alternatives The fact that authors are now using a variety of publishing venues leads to worries about duplicate versions, as it is hard to tell which is the definitive or archival version of a paper when multiple versions of the same paper are posted over time. Also, it may be difficult to distinguish low-quality papers from high-quality papers when it is so easy for all papers to be posted. The positive impact of speedy access to research literature overshadows these fears in many scholars’ minds, however, and so far some scholars and students report being able to assess the definitiveness and quality of articles without too much difficulty. All of the new electronic models, formats, and choices show us clearly that scholarly publishing is at a crossroads. To understand what impact these new options for reading and publishing scholarly materials may have, it is useful first to consider what the traditional structure and fundamental purposes of scholarly publishing have been. Traditionally, many people have been involved in the business of moving scholarly ideas from the hands of the author to the hands of the reader. If the people and stages involved are seen as links in a chain, the first link is the author and the last link is the reader, but there are many intervening links—peer review, editing, distribution, indexing, subscription, and so forth. Each link adds value, but it also adds costs and time delays. Some of the links are by-products of a print distribution system and reflect the limitations of print access. Electronic distribution may be one way to cut out the intervening links, so an article moves directly from the author to the reader. But it is important to remember the functions of those links and the value they add. Peer review, for example, adds authority; editing adds quality; distribution adds accessibility; and archiving adds longevity. Online alternatives that protect these functions to some degree will be the most successful in the long run, although the relative value versus cost of these functions is hotly debated.
The Future Online journals today range from simplistic (and quite old-fashioned-looking) ASCII texts (texts that rely on the American Standard Code for Information Interchange, or ASCII, for data transmission) of individual articles available from aggregator services such as Lexis-Nexis to complex multimedia and interactive electronic journals available on the publisher’s website. Fully electronic journals without print equivalents are still rare, but they are expected to become more common in many disciplines. Fully electronic journals can be highly interactive and can include multimedia, links to data sets, and links to other articles; they can also encourage a sense of community among their readers. Therefore their impact on scholarship in the future is likely to continue to grow.
Carol Tenopir See also Digital Libraries FURTHER READING Borgman, C. L. (2000). From Gutenberg to the Global Information Infrastructure: Access to information in the networked world. Cambridge, MA: MIT Press. Fjallbrant, N. (1997). Scholarly communication: Historical development and new possibilities. Retrieved July 28, 2003, from http://internet.unib.ktu.lt/physics/texts/schoolarly/scolcom.htm Ginsparg, P. (2001). Creating a global knowledge network. Retrieved July 28, 2003, from http://arxiv.org/blurb/pg01unesco.html Harnad, S. (2001). For whom the gate tolls? How and why to free the refereed research literature online through author/institution self-archiving, now. Retrieved July 28, 2003, from http://cogprints.soton.ac.uk/documents/disk0/00/00/16/39/index.html King, D. W., & Tenopir, C. (2001). Using and reading scholarly literature. In M. E. Williams (Ed.), Annual review of information science and technology: Vol. 34. 1999–2000 (pp. 423–477). Medford, NJ: Information Today. Kling, R., & Callahan, E. (2003). Electronic journals, the Internet, and scholarly communication. In B. Cronin (Ed.), Annual review of information science and technology: Vol. 37. 2003 (pp. 127–177). Medford, NJ: Information Today. Meadows, A. J. (1998). Communicating research. New York: Academic Press. Nature Webdebates. (2001). Future e-access to the primary literature. Retrieved July 28, 2003, from http://www.nature.com/nature/debates/e-access/ Page, G., Campbell, R., & Meadows, A. J. (1997). Journal publishing (2nd ed.). Cambridge, UK: Cambridge University Press. Peek, R. P., & Newby, G. B. (1996). Scholarly publishing: The electronic frontier. Cambridge, MA: MIT Press. Pullinger, D., & Baldwin, C. (2002). Electronic journals and user behaviour. Cambridge, UK: Deedot Press. Rusch-Feja, D. (2002). The Open Archives Initiative and the OAI protocol for metadata harvesting: Rapidly forming a new tier in the scholarly communication infrastructure. Learned Publishing, 15(3), 179–186. Schauder, D. (1994). Electronic publishing of professional articles: Attitudes of academics and implications for the scholarly communication industry. Journal of the American Society for Information Science, 45(2), 73–100. Tenopir, C., King, D. W., Boyce, P., Grayson, M., Zhang, Y., & Ebuen, M. (2003). Patterns of journal use by scientists through three evolutionary phases. D-Lib Magazine, 9(5). Retrieved July 29, 2003, from http://www.dlib.org/dlib/may03/king/05king.html Tenopir, C., & King, D. W. (2000). Towards electronic journals: Realities for scientists, librarians, and publishers. Washington, DC: Special Libraries Association. Weller, A. C. (2001). Editorial peer review: Its strengths and weaknesses. Medford, NJ: Information Today.
ELECTRONIC PAPER TECHNOLOGY For nearly two thousand years, ink on paper has been the near-universal way to display text and images on a flexible, portable, and inexpensive medium. Paper does not require any external power supply, and images and text can be preserved for hundreds of years. However, paper is not without limitations. Paper cannot be readily updated with new images or text sequences, nor does it remain lightweight when dealing with large quantities of information (for example, books). Nevertheless, although laptop computers have enabled people to carry around literally thousands of documents and images in a portable way, they still have not replaced ink on paper. Imagine a thin film that possesses the look and feel of paper, but whose text and images could be readily changed with the press of a button. Imagine downloading an entire book or newspaper from the web onto this thin medium, rolling it up, and taking it to work with you. The technology to make this and similar concepts possible is currently being developed. There are several different approaches to creating what has become known as electronic ink or electronic paper.
Ink on paper is a very powerful medium for several reasons. Not only is it thin, lightweight, and inexpensive, but ink on paper reflects ambient light, has extraordinary contrast and brightness, retains its text and images indefinitely, has essentially a 180˚ viewing angle (a viewing angle is the angle at which something can be seen correctly), is flexible, bendable, and foldable, and, perhaps most importantly, consumes no power. Objectively speaking, paper is an extraordinary technology. Creating a new electronic technology that will serve as a successful paper surrogate and match all the positive attributes of paper is no easy task. In fact, it is one of the biggest challenges facing technologists today. Broadly defined, electronic display materials that can be used in electronic paper applications can be made from a number of different substances, reflect ambient light, have a broad viewing angle, have a paper-like appearance, and, most importantly, have bistable memory. Bistable memory—a highly sought-after property—is the ability of an electrically created image to remain indefinitely without the application of any additional electrical power. There are currently three types of display technologies that may make electronic paper or ink applications possible. These technologies are bichromal rotating ball dispersions, electrophoretic devices, and cholesteric liquid crystals.
Rotating Ball Technology: Gyricon Sheets A Gyricon sheet is a thin layer of transparent plastic in which millions of small beads or balls, analogous to the toner particles in a photocopier cartridge, are randomly dispersed in an elastomer sheet. The beads are held within oil-filled cavities within the sheet; they can rotate freely in those cavities. The beads are also bichromal in nature; that is, the hemispheres are of two contrasting colors (black on one hemisphere and white on the other hemisphere). Because the beads are charged, they move when voltage is applied to the surface of the sheet, turning one of their colored faces toward the side of the sheet that will be viewed. The beads may rotate
all the way in one direction or the other, in which case the color viewed will be one of the contrasting colors or the other, or they may rotate partially, in which case the color viewed will be a shade between the two. For example, if the contrasting colors are black and white, then complete rotation in one direction will mean that black shows, complete rotation in the other will mean white shows, and partial rotation will mean a shade of gray. The image that is formed by this process remains stable on the sheet for a long time (even for days) with no additional electrical addressing. This innovative technology was pioneered at Xerox’s Palo Alto Research Center and is currently being commercialized by Gyricon Media. Given contrasting colors of black and white, the white side of each bead has a diffuse white reflecting appearance that mimics the look and effect of paper, while the other side of the ball is black to create optical contrast. Gyricon displays are typically made with 100-micrometer balls. An important factor in this technology’s success is the fact that the many millions of bichromal beads that are necessary can be inexpensively fabricated. Molten white and black (or other contrasting colors) waxlike plastics are introduced on opposite sides of a spinning disk, which forces the material to flow to the edges of the disk, where it forms a large number of ligaments (small strands) protruding past the edge of the disk. These jets are black on one side and white on the other, and quickly break up into balls as they travel through the air and solidify. The speed of the spinning disk controls the balls’ diameter. There are many applications envisioned for this type of display technology. As a paper substitute (electronic paper), it can be recycled several thousand times; it could be fed through a copy machine such that its old image is erased and the new one is presented, or a wand can be pulled across the paperlike surface to create an image. If the wand is given a built-in input scanner, it becomes multifunctional: It can be a printer, copier, fax, and scanner all in one. This technology is very cheap because the materials used and the manufacturing techniques are inexpensive.
Electrophoretic Technology Electrophoretic materials are particles that move through a medium in response to electrical stimulation. Researchers at the Massachusetts Institute of Technology pioneered a technique to create microcapsules with diameters of 30–300 micrometers that encase the electrophoretic materials, which may be white particles in a dark dye fluid or black and white particles in a clear fluid. They have coined the name electronic ink (or e-ink) to identify their technology. Material containing these microcapsules is then coated onto any conducting surface. By encapsulating the particles, the researchers solved the longstanding problem of electrophoretic materials’ instability. (Electrophoretic materials have tendencies toward particle clustering, agglomeration, and lateral migration.) By having the particles encapsulated in discrete capsules, the particles cannot diffuse or agglomerate on any scale larger than the capsule size. In the technology using white particles in a dark dye, when a voltage of one polarity is applied to a surface that has been coated with this material, the tiny white encapsulated particles are attracted to the top electrode surface so that the viewer observes a diffuse white appearance. By changing the polarity of the applied voltage, the white particles then migrate back to the rear electrode where they are concealed by the dye and the pixel appears dark to the viewer. In either state, once migration has occurred, the white particles stay in their location indefinitely, even after the voltage is removed. Gray scale is possible by controlling the degree of particle migration with applied voltage. This innovative technology is currently being commercialized by E Ink. In the system using black and white particles in a clear fluid, each microcapsule contains positively charged white particles and negatively charged black particles suspended in a transparent fluid. When one polarity of the voltage is applied, the white particles move to the top of the microcapsule where they become visible to the user (this part appears white). At the same time, the oppositely charged black particles are pulled to the bottom of the microcapsules where they are no longer visible to the viewer. By reversing this process, the black particles migrate to the top of
the capsule and the white particles to the bottom, which now makes the surface appear dark at that spot.
Cholesteric Liquid Crystals Cholesteric liquid crystal materials also have many of the positive attributes of paper, and they have the added advantage of being amenable to full color. The optical and electrical properties of a cholesteric liquid crystal material allow it to form two stable textures when sandwiched between conducting electrodes. The first is a reflective planar texture with a helical twist whose pitch, p, can be tuned to reject a portion of visible light: When the material is placed on a black background, the viewer sees a brilliant color reflection. The second is a focal conic texture that is relatively transparent. The reflection bandwidth (Δλ) in the perfect planar texture is approximately 100 nanometers (100 billionths of a meter). This narrow selected reflection band is different from the broadband white reflection of Gyricon and electronic ink reflective display renditions. Upon the application of a voltage, V1, the planar structure transforms into the focal conic state that is nearly transparent to all wavelengths in the visible-light range. The black background is then visible, and an optical contrast is created between reflecting color pixels and black pixels. In this state, the voltage can be removed and the focal conic state will remain indefinitely, creating a bistable memory between the reflecting planar state and the transparent focal conic state. In order to revert from the focal conic state back to the planar reflecting texture, the molecules must transition through a highly aligned state, which requires the application of voltage V2, which is slightly higher than V1. Abruptly turning off the voltage after the aligned state results in the planar texture. There are ways in which the planar texture can be altered to make it more paperlike in its reflectivity. Gray scale is inherent in cholesteric liquid crystal technology since the focal conic domains can be controlled with different levels of voltage. Since cholesteric liquid crystal materials are transparent, they can be vertically integrated to create a true color addition scheme. Although stacking creates more complicated driving circuitry, it preserves resolution and brightness levels since the pixels are vertically integrated rather than spatially arranged across the substrate plane, as is the case with conventional liquid crystal displays.
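The link between the helical pitch and the reflected color can be stated compactly. The relations below are standard results from cholesteric liquid crystal optics rather than figures given in this article; n_o and n_e denote the ordinary and extraordinary refractive indices of the material.

```latex
% Selective reflection of the planar cholesteric texture
% (standard textbook relations, not taken from the article):
\lambda_0 = \bar{n}\,p, \qquad \Delta\lambda = \Delta n\,p,
\quad \text{where } \bar{n} = \tfrac{1}{2}(n_o + n_e), \;\; \Delta n = n_e - n_o .
```

For illustration, with a typical birefringence of Δn ≈ 0.2 and a pitch of a few hundred nanometers, Δλ comes out on the order of the roughly 100-nanometer bandwidth cited above, which is why the reflection appears as a single brilliant color rather than broadband white.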
The technology was developed at Kent State University and is now being commercialized by Kent Displays. Cholesteric liquid crystal materials are being developed for document viewers, electronic newspapers and books, and information signs. Gregory Philip Crawford See also Cathode Ray Tubes; Liquid Crystal Display FURTHER READING Comiskey, B., Albert, J. D., Yoshizawa, H., & Jacobson, J. (1998). An electrophoretic ink for all-printed reflective electronic displays. Nature, 394(6690), 253–255. Crawford, G. P. (2000). A bright new page in portable displays. IEEE Spectrum, 37(10), 40–46. Sheridon, N. K., Richley, E. A., Mikkelsen, J. C., Tsuda, D., Crowley, J. C., Oraha, K. A., et al. (1999). The gyricon rotating ball display. Journal of the Society for Information Display, 7(2), 141.
ELIZA The computer program Eliza (also known as “Doctor”) was created by the U.S. computer scientist Joseph Weizenbaum (b. 1923) as an artificial intelligence application for natural language conversation. Considered a breakthrough when published, Eliza was named after the character Eliza Doolittle, who learned how to speak proper English in G. B. Shaw's play Pygmalion. Weizenbaum developed this program in the 1960s while a computer scientist at MIT (1963–1988). Eliza is actually only one specialized script running on a general conversational shell program that could have various scripts with different content. The Eliza script presents the computer's conversational role as a mock Rogerian (referring to the U.S. psychologist Carl Rogers) client-centered psychotherapist while the user plays the role of a client. At the time the program was
so convincing that many people believed that they were talking with a human psychotherapist.
Eliza as Psychotherapist In client-centered sessions a psychotherapist reflects back what the client says to invite further responses instead of offering interpretations. If a client reports a dream about “a long boat ride,” Eliza might respond with “Tell me about boats.” Most users would not immediately assume that the program is ignorant of even the basic facts about boats. Weizenbaum designed Eliza to take advantage of the user's projected illusion of understanding as a way of masking the program's profound lack of real-world knowledge. He also carefully noted that the assumption of a program understanding what the user says is one made by the user. In 1966 the popular understanding of mainframe computers as electronic brains superior to human capabilities was so strong that most people did indeed project vast knowledge and understanding onto any computer. So, despite flaws and limitations that later users would immediately notice, the first users attributed so much credibility to Eliza's responses that “some subjects have been very hard to convince that ELIZA is not human” (Weizenbaum 1966, 42). Scientists were also impressed with Eliza’s potentials. Nobel Prize winner Herbert Simon (economist, computer scientist, psychologist, and philosopher of science) (1916–2001) suggested that the U.S. Congress fund the development of Eliza as a substitute for human psychotherapists, and some psychoanalysts attempted to develop more sophisticated scripts for the program. Such efforts inevitably failed. In fact, Weizenbaum critiques Eliza's reception and popularity and asserts that such programs should not be used for sensitive interactions.
Eliza as Conversational Interface Nevertheless, Eliza's computerized techniques for conversational interaction were a significant improvement in human-computer interfaces. Eliza was among the first artificial intelligence programs that
allowed for a conversation between people and computers. Eliza opened up the possibility of immediate user input and computer response in ways that were much closer to natural human interactions. A crucial property of human-computer interfaces disclosed by Weizenbaum's program is that they must maintain the user's “sense of being heard and understood” (Weizenbaum 1966, 42). In that sense Eliza satisfied one aspect of Alan M. Turing's (mathematician and founder of computer science and cryptology) test for artificial intelligence.

Talking with ELIZA
Joseph Weizenbaum reports the following as a “typical conversation” with ELIZA (Weizenbaum 1966).
CLIENT: Men are all alike.
ELIZA: IN WHAT WAY
They're always bugging us about something or other.
CAN YOU THINK OF A SPECIFIC EXAMPLE
Well, my boyfriend made me come here.
YOUR BOYFRIEND MADE YOU COME HERE
He says I'm depressed much of the time.
I AM SORRY TO HEAR YOU ARE DEPRESSED
It's true. I am unhappy.
DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY
I need some help, that much seems certain.
WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP
Perhaps I could learn to get along with my mother.
TELL ME MORE ABOUT YOUR FAMILY
My mother takes care of me.
WHO ELSE IN YOUR FAMILY TAKES CARE OF YOU
My father.
YOUR FATHER
You are like my father in some ways.
WHAT RESEMBLANCE DO YOU SEE
You are not very aggressive but I think you don't want me to notice that.
WHAT MAKES YOU THINK I AM NOT VERY AGGRESSIVE
You don't argue with me.
WHY DO YOU THINK I DON'T ARGUE WITH YOU
You are afraid of me.
DOES IT PLEASE YOU TO BELIEVE I AM AFRAID OF YOU
My father is afraid of everybody.
WHAT ELSE COMES TO MIND WHEN YOU THINK OF YOUR FATHER
Bullies.
DOES THAT HAVE ANYTHING TO DO WITH THE FACT THAT YOUR BOYFRIEND MADE YOU COME HERE
Source: Weizenbaum, Joseph. (1966, January). ELIZA—A computer program for the study of natural language communication between man and machine. Communications of the ACM, 9(1). Retrieved March 22, 2004, from http://i5.nyu.edu/~mm64/x52.9265/january1966.html
Several features of the interactive environment and conversational interface contributed to Eliza's credibility. An environmental feature easily taken for granted today was MIT's MAC (Multi-Access Computer) time-sharing operating system, which allowed multiple users to have quick response times to their individual input. Eliza appeared to speak back to the user the way another person would. A user could generate input spontaneously at the teletype machine and have the program respond to that specific input conversationally at the same teletype—
not unlike today's Internet chat rooms, only with responses generated by a “bot” (robot). Compared to submitting a stack of punched cards and waiting a day for a printout, Eliza's interface was positively friendly.
Interface Problems and How Eliza Solves Them Weizenbaum's program dealt with several specific interface problems: identifying keywords, discovering minimal context, choosing and calculating appropriate responses (transformations), generating responses for input without any keywords, and most importantly, allowing for designing separate, changeable scripts that encode the content, that is, the particular keywords and transformations for a given conversational role. Thus, the shell program that computes responses and a script provide an interface to the content encoded in that script. The program first scans the user's input sentence to see if any of the words are in its dictionary of keywords. If a keyword is found, then the sentence is “decomposed” by matching it to a list of possible templates. The design of the templates is what discovers some minimal context for the user's input. In one of Weizenbaum's examples, the sentence “It seems that you hate me” is matched to a template for the keywords “YOU” and “ME”: (0 YOU 0 ME) The “0” in the template stands for any number of filler words. The template is used to break up the input sentence into four groups: (1) It seems that (2) YOU (3) hate (4) ME. This decomposition is then matched to one of several possible “reassembly” rules that can be used to generate a response. In this case the one chosen is: (WHAT MAKES YOU THINK I : 3 : YOU). The response then substitutes the third part of the input sentence, “hate,” into the response “What makes you think I hate you” (Weizenbaum 1966, 38). That is the basic operation of Eliza, although the program has many more technical nuances. The real ingenuity comes from designing the decomposition and reassembly rules that make up the script. We can
easily see how merely reusing input words by putting them into canned sentences leads to a loss of meaning.
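A rough sense of the decomposition-and-reassembly step can be conveyed in a few lines of code. The sketch below is not Weizenbaum's implementation (his program was written for MIT's mainframes, not in a modern scripting language); it simply mimics the (0 YOU 0 ME) template and its reassembly rule as described above, with the regular expression standing in for the original pattern matcher.

```python
# Toy illustration of ELIZA-style decomposition and reassembly.
# Re-creation of the (0 YOU 0 ME) example discussed in the text,
# not Weizenbaum's original code.
import re

# Four parts, mirroring the text's decomposition:
# (1) filler  (2) "YOU"  (3) filler  (4) "ME"
DECOMPOSITION = re.compile(r"(.*)\b(you)\b(.*)\b(me)\b", re.IGNORECASE)

# The reassembly rule reuses part 3 of the decomposed input.
REASSEMBLY = "What makes you think I{2}you"

def respond(sentence: str) -> str:
    match = DECOMPOSITION.search(sentence)
    if match:
        # match.groups()[2] is the text between "you" and "me", e.g. " hate ".
        return REASSEMBLY.format(*match.groups())
    return "Please go on"   # fallback when no keyword template matches

print(respond("It seems that you hate me"))
# -> What makes you think I hate you
```

Even this toy version shows the point made above: the response is built by slotting the user's own words into a canned sentence, with no representation of what "hate" means.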
Achievements and Continued Influence The program's real achievement was as an example of a conversational interface for some useful content. This kind of interface is successful for a narrow, theoretically well-defined, or foreseeable field of interactions such as solving simple arithmetic problems. Eliza quickly entered into intellectual and popular culture and continues to be discussed and cited forty years later. The program has many variants, including psychiatrist Kenneth Colby's Parry (short for paranoid schizophrenic), the program Racter, described as "artificially insane," and many more sophisticated descendants. William H. Sterner See also Dialog Systems; Natural-Language Processing FURTHER READING Bobrow, D. G. (1965). Natural language input for a computer problem solving system (Doctoral dissertation, MIT, 1965), source number ADD X1965. Colby, K. M., Watt, J. B., & Gilbert, J. P. (1966). A computer method of psychotherapy: Preliminary communication. The Journal of Nervous and Mental Disease, 142(2), 148–152. Lai, J. (Ed.). (2000). Conversational interfaces. Communications of the ACM, 43(9), 24–73. Raskin, J. (2000). The humane interface—New directions for designing interactive systems. New York: Addison-Wesley. Rogers, C. (1951). Client centered therapy: Current practice, implications and theory. Boston: Houghton Mifflin. Turing, A. M. (1981). Computing machinery and intelligence. In D. R. Hofstadter & D. C. Dennett (Eds.), The mind's I—Fantasies and reflections on self and soul (pp. 53–68). New York: Bantam Books. (Reprinted from Mind, 59[236], 433–460) Turkle, S. (1984). The second self—Computers and the human spirit. New York: Simon & Schuster. Weizenbaum, J. (1966). ELIZA—A computer program for the study of natural language communication between man and machine. Communications of the ACM, 9(1), 36–45. Weizenbaum, J. (1967). Contextual understanding by computers. Communications of the ACM, 10(8), 474–480.
Weizenbaum, J. (1976). Computer power and human reason—From judgment to calculation. San Francisco: W. H. Freeman. Winograd, T. (1972). Understanding natural language. New York: Academic Press.
E-MAIL Electronic mail, also called "e-mail" or simply "email," is a system for exchanging text messages between computers. Invented in 1971, e-mail came into widespread use in the 1990s, and is considered by many to be the most important innovation in personal communications since the telephone. E-mail has changed the way businesses, social groups, and many other kinds of groups communicate.
History of E-mail E-mail was invented in 1971 by Ray Tomlinson, who was a scientist at BBN in Cambridge, Massachusetts. (The first-ever e-mail message, probably "QWERTY UIOP", was sent as a test between two computers on Tomlinson's desk. Many, but not all, e-mail messages sent since then have been more informative.) This was not the first text message sent via computer, but the first ever sent between computers using the now-standard addressing scheme. The Internet, or Arpanet as it was then called, had come into existence a few years earlier, and was used by scientists at a few locations. Users of the Arpanet system already used messaging, but one could only send messages to other users at the same location (e.g., user "TomJones" at State U might easily leave a message for "SallySmith" at the same location). Tomlinson was working on a way to send files between mainframes using a file-transfer program called CPYNET. He decided to extend this messaging system so that users could send messages to other users anywhere in the Arpanet system. One of the problems facing Tomlinson was addressing. How would TomJones at State U indicate that he wanted to send a message to SallySmith at TechU, not State U? Tomlinson chose the @ symbol as the centerpoint for his new addressing system. Information on the right of the @ would indicate the
location, and information on the left would indicate the user, so a message for
[email protected] would arrive at the right place. The @ symbol was an obvious choice, according to Tomlinson, because it was a character that never appeared in names, and already had the meaning“at,”so was appropriate for addressing. All e-mail addresses still include this symbol. E-mail has grown exponentially for three decades since. In the 1970s and 1980s it grew until it was a standard throughout American universities. Starting in 1988 it moved out into the nonuniversity population, promoted by private companies such as CompuServe, Prodigy, and America Online. A study of e-mail growth between 1992–1994 showed traffic doubling about every twelve months—279 million messages sent in November of 1992, 508 million the next year, and topping the 1 billion messages/ month mark for the first time in November of 1994 (Lyman and Varian 2004). Not only were more people getting e-mail accounts, but the people who had them were sending more and more messages. For more and more groups, there was enough “critical mass” that e-mail became the preferred way of communicating. By the early twenty-first century e-mail was no longer a novelty, but a standard way of communicating throughout the world between all kinds of people.
Format of E-mail Messages At its most basic, e-mail is simply a text message with a valid address marked by "To:". Imagine that
[email protected] now wants to send an e-mail message to
[email protected] The part of the message after the @ sign refers to an Internet domain name. If the e-mail is to be delivered correctly, this domain must be registered in the Internet's Domain Name System (DNS), just as Web pages must be. Likely, TomJones's university keeps a constantly updated list of DNS entries (a "DNS lookup service") so that it knows where to send Tom's outgoing mail. The computer receiving Tom's message must have an e-mail server or know how to forward to one, and must have an account listed for "joe." If either of these
A Personal Story—The Generation Gap When I first went off to college, e-mail was something your university offered only to those savvy enough to take advantage of it. It was an exclusive club: People who could send mail and have it arrive in seconds, rather than the usual two or three days that the U.S. Postal Service required. And so, a freshman at college, I talked with my parents almost every day for free, via e-mail, while my friends racked up large phone bills calling home. The formality of a written letter, or even a phone call, was a treat saved only for a special occasion. But it took some time for my mother to warm to this interaction; to her, e-mail was only on the computer, not personal like a letter could be. Even today, it is more like a second language to her. By the time I graduated from college, e-mail was commonplace and ubiquitous. Despite the diaspora of my college friends across the country, my phone bill remained small, my e-mail rate high, until suddenly a new technology burst onto the scene. In 1997 I started using Instant Messenger (IM), leaving a small window open on the corner of my screen. As my friends slowly opted in we gravitated toward the peripheral contact of the “buddy list” and away from the more formal interaction of e-mail. Gradually, I realized that long e-mail threads had been replaced by quick, frequent IM interaction: a brief question from a friend, a flurry of activity to plan a night out. But I’ve become a bit of a fuddy-duddy; the technology has passed me by. Recently I added a young acquaintance to my buddy list. He mystified me by sending brief messages: "Hi!" To this I would reply, "What's up? Did you have a question?" This would confuse him—why would he have a question? I finally realized that we used the medium in different ways. To me, IM was a path for getting work done, a substitute for a quick phone call or a short e-mail. To him, the presence of friends on his buddy list was simply the warmth of contact, the quick hello of a friend passing by on the Web. Observing his use is fascinating; he has well over a hundred friends on his list, and generally keeps a dozen or more conversations occurring simultaneously. No wonder I rated no more than a quick hello in his busy world! I tried to keep up once, but found I could not match his style of use of the medium. As new technologies arise, their new users will no doubt take to them with a gusto and facility that we cannot fully comprehend. It is our job as designers to ensure that we offer these users the flexibility and control to make of these new media what they will, and not limit them by the boundaries of our own imagination. Alex Feinman
is not correct, the message will be “bounced” back to the original sender. E-mail can be sent to multiple recipients by putting multiple e-mail addresses in the “To” field separated by commas or by using the “cc” field or “bcc” field. CC stands for “Carbon Copy,” and is a convention taken from office communications long predating e-mail. If you receive an e-mail where you are listed under the CC field, this means that you are not the primary intended recipient of the message, but are being copied as a courtesy. Recipients listed in the CC field are visible to all recipients. BCC in contrast stands for “Blind Carbon Copy,” and contents of this field are not visible to message recipients. If you receive a BCC message, other recipients will not see that you were copied on the message, and you will not see other BCC recipients.
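The difference between header recipients and actual delivery can be seen in code. The short Python sketch below is only an illustration: the addresses and the mail server name are placeholders, not working values. It builds a message with visible "To" and "Cc" headers, then hands the BCC recipient to the mail server only as an envelope address, so that address never appears in the headers the other recipients see.

```python
# Sketch: visible To/Cc headers versus a hidden BCC recipient.
# All addresses and the SMTP host below are placeholders.
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "[email protected]"
msg["To"] = "[email protected]"
msg["Cc"] = "[email protected]"            # visible to every recipient
msg["Subject"] = "Meeting time"
msg.set_content("Can we meet Tuesday at 1?")

bcc = ["[email protected]"]                  # never written into the headers

with smtplib.SMTP("mail.example.edu") as server:    # assumed local mail server
    # Envelope recipients include the BCC address; the headers stay unchanged.
    all_recipients = [msg["To"], msg["Cc"]] + bcc
    server.send_message(msg, to_addrs=all_recipients)
```

The point of the sketch is the split between the message's header fields and the list of addresses the server is told to deliver to; blind copying works by putting an address only in the latter.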
Standard e-mail messages also contain other, nonessential fields, usually including a "From" field identifying the sender and a "Subject" field summarizing the content. Other optional fields are:
■ Mime type: Describes the file format for attachments
■ HTML formatting: Indicates that the message contains formatting, graphics, or other elements described in the standard Web html format
■ Reply-To: Can list a "reply to" address that may be different from the sender. This is useful for lists that want to avoid individual replies being accidentally sent to the entire group.
■ SMS: Indicates that the e-mail can be sent to a device using the Short Message Service (SMS) protocol used by cell phones and other handheld devices
■ Priority: Can be interpreted by some Web browsers to indicate different priority statuses
These are only a few of the more common optional fields that may be included in an e-mail. When an e-mail is sent using these optional features, the sender cannot be sure that the recipient's e-mail software will be able to interpret them properly. No organization enforces these as standards, so it is up to developers of e-mail server software and e-mail client software to include or not include them. Companies such as Microsoft and IBM may also add specialized features that work only within their systems. E-mail with specialized features that is sent outside of the intended system doesn't usually cause undue problems, however—there will just be extra text included in the e-mail header that can be disregarded by the recipient.
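Several of the optional fields listed above can be illustrated with Python's standard email library. The sketch below is one possible rendering rather than a definitive format: the addresses are placeholders, "X-Priority" is a commonly used but nonstandard priority header (the exact name varies by mail client), and the attachment bytes are invented for the example.

```python
# Sketch: adding optional fields to a message with Python's email library.
# Addresses, the priority header name, and the attachment are placeholders.
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "[email protected]"
msg["To"] = "[email protected]"
msg["Subject"] = "Draft agenda"
msg["Reply-To"] = "[email protected]"   # replies go to the list, not the sender
msg["X-Priority"] = "1"                          # nonstandard "high priority" hint

msg.set_content("Plain-text version of the agenda.")
# An HTML alternative; clients that cannot render it fall back to the text part.
msg.add_alternative("<p>HTML version of the <b>agenda</b>.</p>", subtype="html")

# An attachment; the MIME type tells the recipient's software the file format.
msg.add_attachment(b"%PDF-1.4 ...", maintype="application",
                   subtype="pdf", filename="agenda.pdf")

print(msg["Content-Type"])   # the library builds the MIME structure automatically
```

A recipient whose software does not understand one of these fields simply sees an extra header line, which matches the behavior described above.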
E-mail Lists An important technological development in the history of e-mail was the e-mail list. Lists are one-to-many distributions. A message sent to an e-mail list address (e.g., "[email protected]") is sent by an individual and received by everyone subscribed to the list. One popular way of administering lists is using ListServ software, which was first developed in 1986 for use on IBM mainframes, and is currently marketed by Lsoft (www.lsoft.com). ListServ software has the advantage that membership is self-administered—you don't need a moderator's help to subscribe, unsubscribe, or change membership options; these are done by sending messages that are interpreted and carried out automatically by the server. For example, Tom Jones could subscribe himself to an open list by sending the message SUBSCRIBE dogtalk-l to the appropriate listserv address. And, just as important, he could unsubscribe himself later by sending the e-mail UNSUBSCRIBE dogtalk-l. There are also a wide variety of options for list subscriptions, such as receiving daily digests or subscribing anonymously. Another popular way of administering groups is through online services such as Yahoogroups. These
groups are administered through buttons and links on the group web page, not text commands. These groups may also include other features such as online calendars or chatrooms. E-mail lists, like most other groups, have certain group norms that they follow, and newcomers should take note of them. Some of these are options that are set by the list administrator:
■ Is the list moderated or unmoderated? In moderated lists, an administrator screens all incoming messages before they are sent to the group. In unmoderated lists, messages are immediately posted.
■ Does the list by default "Reply to all"? When users hit the "Reply" button to respond to a list message, will they by default be writing to the individual who sent the message, or to the entire group? Not all lists are the same, and many embarrassments have resulted from failure to notice the differences.
Users can always manually override these defaults, simply by changing the recipient of their messages in the "To" line. Lists also have group norms that are not implemented as features of the software, but are important nonetheless. How strictly are list members expected to stick to the topic? Is the purpose of the list social or purely informational? Are commercial posts welcome or not? Listserv software can be configured to send an automatic "Welcome" message to new members explaining the formal and informal rules of the road.
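Because LISTSERV commands travel in the body of ordinary e-mail, subscribing can be scripted the same way any message is sent. The sketch below uses placeholder addresses and an assumed local mail server; the SUBSCRIBE command format follows the convention described above, with the command going to the server's administrative address rather than to the list itself.

```python
# Sketch: sending a LISTSERV subscribe command by e-mail.
# The listserv address, sender, and SMTP host are placeholders.
import smtplib
from email.message import EmailMessage

cmd = EmailMessage()
cmd["From"] = "[email protected]"
cmd["To"] = "[email protected]"          # the server address, not the list itself
cmd.set_content("SUBSCRIBE dogtalk-l Tom Jones")  # command is read from the body

with smtplib.SMTP("mail.example.edu") as server:  # assumed local mail server
    server.send_message(cmd)
```

Sending "UNSUBSCRIBE dogtalk-l" in the same way would remove the subscription, which is what makes such lists self-administered.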
Social Characteristics of E-mail Academic researchers in the fields of communications, psychology, and human-computer interaction were quick to recognize that this radical new communications method could have effects on both individuals and organizations. This research area, which encompasses the study of e-mail and other online media, is referred to as the study of “Computer-Mediated Communications,” abbreviated CMC. Some well-established characteristics of e-mail are:
Casual Style Electronic mail was very quickly recognized to have some unique effects on communication style, and possibly have long-term effects on the groups that use it. Casual style is one common marker of e-mail communication. Many people use the verb "talk" rather than "write," as in "I'll talk to you on email" rather than "I'll write you an e-mail." E-mail never developed the formal salutations and benedictions of letters—few e-mails begin with "Dear Mr. Jones" or end with "Sincerely, Sally Smith." In 1978 one early e-mail user observed: "One could write tersely and type imperfectly, even to an older person in a superior position and even to a person one did not know very well, and the recipient took no offense. The formality and perfection that most people expect in a typed letter did not become associated with network messages, probably because the network was so much faster, so much more like the telephone" (J.C.R. Licklider, quoted in Vezza 1978). The casual style is partly a result of the unique early-Internet "hacker" culture, but also partly a result of the medium itself. E-mail messages are often delivered in a few seconds, lending a feeling of immediacy. The fact that e-mail is easily deleted and not printed on paper lends a feeling of impermanence (although this is illusory, as many legal defendants are now finding!). While in some settings, such as when conducting corporate or legal business, e-mails are now expected to be formal and guarded in the manner of a letter, in general the literary genre of e-mail remains one of casualness and informality. E-mail, along with other means of Computer-Mediated Communications, also lends a feeling of social distance. Individuals feel less close and less inhibited via e-mail compared to being face-to-face with message recipients. The social distance of e-mail has a number of good and bad effects. Self-Disclosure via E-mail Online communication with strangers also leads to a feeling of safety, because the relationship can be more easily controlled. Many support groups for highly personal issues thrive as e-mail lists. Individuals may use an online forum to disclose feelings
and share problems that they would be extremely reluctant to discuss with anyone face-to-face. Online dating services often arrange e-mail exchanges prior to phone or face-to-face meetings. Lack of social cues may sometimes promote an artificial feeling of closeness that Joseph Walther calls a "Hyperpersonal" effect (Walther, 1996). Individuals may imagine that others are much closer to themselves in attitudes than they really are, and this may lead to highly personal revelations being shared online that would rarely be communicated face-to-face. Egalitarianism Text-only communication does not convey status cues, or other information that tends to reinforce social differences between individuals. E-mail is believed to promote egalitarian communication (Dubrovsky, Kiesler, and Sethna 1991). Lower-level employees can easily send e-mails to executives that they would never think to phone or visit, loosening restraints on corporate communication and potentially flattening corporate hierarchies. It has also been observed that students who rarely contribute verbally in classes will contribute more via e-mail or other online discussion, probably because of the increased social distance and reduced inhibition (Harasim 1990). Negative Effects: Flaming and Distrust The social distance and lack of inhibition can have negative effects as well. E-mail writers more easily give in to displays of temper than they would in person. In person, blunt verbal messages are often presented with body language and tone of voice to alleviate anger, but in e-mail these forms of communication are not present. Recipients of rude e-mails may more easily feel insulted, and respond in kind. Insulting, angry, or obscene e-mail is called "flaming." In one early experimental study comparing e-mail and face-to-face discussions, researchers counted 34 instances of swearing, insults, and name-calling, which were behaviors that never occurred in a face-to-face group performing the same task (Siegel et al. 1986). For similar reasons, it is often harder to build trust through e-mail. Rocco (1998) found that groups using e-mail could not solve a social
dilemma that required trust building, but groups working face-to-face could do so easily. Beyond these interpersonal difficulties that can occur online, there are some practical limitations of e-mail as well. The asynchronous nature of e-mail makes it difficult to come to group decisions (see Kiesler and Sproull 1991). Anyone who has tried to use e-mail to set up a meeting time among a large group of busy people has experienced this difficulty.
Culture Adapts to E-mail These observations about the effects of e-mail were made relatively early in its history, before it had become as widespread as it currently is. As with all new technologies, however, culture rapidly adapts. It has not taken long, for example, for high-level business executives to assign assistants to screen e-mails the way they have long done for phone calls. It is probably still the case that employees are more likely to exchange e-mail with top executives than to have a phone or personal meeting with them, but the non-hierarchical utopia envisioned by some has not yet arrived. A simple and entertaining development helps e-mail senders convey emotion a little better than plain text alone. "Emoticons" are sideways drawings made with ASCII symbols (letters, numbers, and punctuation) that punctuate texts. The first emoticon was probably : ) which, when viewed sideways, looks like a smiley face. This emoticon is used to alert a recipient that comments are meant as a joke, or in fun, which can take the edge off of blunt or harsh statements. Most experienced e-mail users also develop personal awareness and practices that aid communication. Writers learn to reread even short messages for material that is overly blunt, overly personal, or otherwise ill conceived. If harsh words are exchanged via e-mail, wise coworkers arrange a time to meet face-to-face or on the phone to work out differences. If a group needs to make a decision over e-mail, such as setting a meeting time, they adopt practices such as having the first sender propose multiple-choice options (should we meet Tuesday at 1 or Wednesday at 3?) or assigning one person to collect all scheduling constraints. Groups also take advantage of e-mail's good characteristics to transform themselves in interesting
ways. Companies are experimenting with more virtual teams, and allowing workers to telecommute more often, because electronic communications make it easier to stay in touch. Universities offer more off-campus class options than ever before for the same reason. Organizations may take on more democratic decision-making practices, perhaps polling employees as to their cafeteria preferences or parking issues, because collecting opinions by e-mail is far easier than previous methods of many-to-many communication.
Future of E-mail Electronic mail has been such a successful medium of communication that it is in danger of being swamped by its own success. People receive more electronic mail than they can keep up with, and struggle to filter out unwanted e-mail and process the relevant information without overlooking important details. Researchers have found that e-mail for many people has become much more than a communication medium (Whittaker and Sidner 1996). For example, many people do not keep a separate address book to manage their personal contacts, but instead search through their old e-mail to find colleagues' addresses when needed. People also use their overcrowded e-mail "inboxes" as makeshift calendars, "to-do" lists, and filing systems. Designers of high-end e-mail client software are trying to accommodate these demands by incorporating new features such as better searching capability, advanced filters, and "threading" to help users manage documents (Rohall and Gruen 2002). E-mail software is often integrated with electronic calendars and address books to make it easy to track appointments and contacts. And e-mail is increasingly integrated with synchronous media such as cell phones, instant messaging, or pagers to facilitate decision-making and other tasks that are difficult to accomplish asynchronously.
The Spam Problem A larger, more insidious threat to e-mail comes in the form of “spam” or “junk” e-mail. Spam refers to unwanted e-mail sent to many recipients. The
term was first used to describe rude but fairly innocuous e-mailings, such as off-topic comments sent to group lists, or personal messages accidentally sent to a group. But spam has taken on a more problematic form, with unscrupulous mass-marketers sending unsolicited messages to thousands or even millions of e-mail addresses. These spammers are often marketing shady products (video spy cameras, pornographic websites) or worse, soliciting funds in e-mail scams. These professional spammers take advantage of two of the characteristics of e-mail that have made it so popular: its flexibility and inexpensiveness. Spammers usually forge the "from" line of the e-mails they send, so that they cannot be easily traced or blocked. (Messages usually include Web addresses hosted in nations where it would be difficult to shut them down.) Spammers also take advantage of the fact that e-mail is essentially free for senders. The only significant cost of e-mail is borne by recipients, who must pay to store e-mail until it can be read or deleted. Low sending cost means that spammers can afford to send out advertisements that get only a minuscule fraction of responses. The effect of this spamming is that users are often inundated with hundreds of unwanted e-mails, storage requirements for service providers are greatly increased, and the marvelously free and open world of international e-mail exchange is threatened. What is the solution to spam? Many different groups are working on solutions, some primarily technical, some legal, and some economic or social. Software companies are working on spam "filters" that can identify and delete spam messages before they appear in a user's inbox. The simplest ones work on the basis of keywords, but spammers quickly developed means around these with clever misspellings. Other filters only let through e-mails from known friends and colleagues. But most users find this idea distasteful—isn't the possibility of finding new and unexpected friends and colleagues one of the great features of the Internet? Research continues on filters that use more sophisticated algorithms, such as Bayesian filtering, to screen out a high percentage of unwanted e-mail.
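The Bayesian filtering mentioned above can be sketched in a few lines. The toy example below is a naive Bayes classifier over word counts, with invented training data; production filters add better smoothing, far larger corpora, and many non-textual features, so this is only an illustration of the idea.

```python
# Toy naive Bayes spam filter: scores a message by comparing word
# probabilities learned from labeled examples. Training data is invented.
import math
from collections import Counter

spam_texts = ["buy spy cameras now", "cheap cameras click now"]
ham_texts = ["can we meet tuesday", "draft agenda for the meeting"]

def train(texts):
    counts = Counter(word for t in texts for word in t.split())
    total = sum(counts.values())
    vocab = len(counts)
    # Smoothed log probability of each word given the class (unseen words get a small value).
    return lambda w: math.log((counts[w] + 1) / (total + vocab + 1))

p_spam, p_ham = train(spam_texts), train(ham_texts)

def looks_like_spam(message: str) -> bool:
    words = message.lower().split()
    spam_score = sum(p_spam(w) for w in words)
    ham_score = sum(p_ham(w) for w in words)
    return spam_score > ham_score   # ignoring class priors for simplicity

print(looks_like_spam("buy cheap cameras"))           # True
print(looks_like_spam("agenda for tuesday meeting"))  # False
```

Because the score is learned from word statistics rather than a fixed keyword list, such a filter is harder to evade with a single clever misspelling, although spammers respond by padding messages with innocent-looking text.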
There are also attempts afoot to outlaw spam. In December 2003 the U.S. Congress passed a bill (CAN-SPAM, 2004) designed to limit spamming. This bill would, among other things, mandate that commercial e-mailers provide “opt-out” options to recipients and prohibit false e-mail return addresses and false subject headings. The bill will not eliminate the problem, because most spam currently originates outside of the United States. Similar multinational efforts may eventually have an effect, however. Individuals can also purchase antispam software or antispam services that will delete some (but not all) unwanted e-mails. The best way to avoid receiving spam is never to list your e-mail address on your website in machine-readable text. Many spam lists are assembled by automatic spider software that combs through webpages looking for the telltale @ sign. If you still want your e-mail address to be available on the Web, two simple ways around this are to replace the @ symbol in your e-mail address with the word “at” or to create a graphic of your e-mail address and use it as a substitute for the text. Despite these challenges, electronic mail has carved itself an essential place in the social world of the twenty-first century and should continue to grow in importance and usefulness for many years to come.
Nathan Bos
See also Internet in Everyday Life; Spamming
FURTHER READING
Bordia, P. (1997). Face-to-face versus computer-mediated communication. Journal of Business Communication, 34, 99–120.
CAN-SPAM legislation. Retrieved March 31, 2004, from http://www.spamlaws.com/federal/108s877.html
Crocker, D. E-mail history. Retrieved March 31, 2004, from www.livinginternet.com
Dubrovsky, V. J., Kiesler, S., & Sethna, B. N. (1991). The equalization phenomenon: Status effects in computer-mediated and face-to-face decision-making groups. Human-Computer Interaction, 6, 119–146.
Garton, L., & Wellman, B. (1995). Social impacts of electronic mail in organizations: A review of the research literature. In B. R. Burleson (Ed.), Communication Yearbook, 18. Thousand Oaks, CA: Sage.
Harasim, L. M. (Ed.). (1990). Online education: Perspectives on a new environment (pp. 39–64). New York: Praeger.
Hardy, I. R. (1996). The evolution of ARPANET e-mail. History thesis, University of California at Berkeley. Retrieved March 31, 2004, from http://www.ifla.org/documents/internet/hari1.txt
Kiesler, S., & Sproull, L. S. (1992). Group decision-making and communication technology. Organizational Behavior and Human Decision Processes, 52, 96–123.
Lyman, P., & Varian, H. R. (2000). How much information. Retrieved March 31, 2004, from http://www.sims.berkeley.edu/how-much-info
Rocco, E. (1998). Trust breaks down in electronic contexts but can be repaired by some initial face-to-face contact. In Proceedings of Human Factors in Computing Systems, CHI 1998 (pp. 496–502).
Rohall, S. L., & Gruen, D. (2002). Re-mail: A reinvented e-mail prototype. In Proceedings of Computer-Supported Cooperative Work 2002. New York: Association for Computing Machinery.
Siegel, J., Dubrovsky, V., Kiesler, S., & McGuire, T. W. (1986). Group processes in computer-mediated communication. Organizational Behavior and Human Decision Processes, 37, 157–186.
Sproull, L., & Kiesler, S. (1991). Connections: New ways of working in the networked organization. Cambridge, MA: The MIT Press.
Vezza, A. (1978). Applications of information networks. In Proceedings of the IEEE, 66(11).
Walther, J. B. (1996). Computer-mediated communication: Impersonal, interpersonal, and hyperpersonal interaction. Communication Research, 23, 3–43.
Whittaker, S., & Sidner, C. (1996). E-mail overload: Exploring personal information management of e-mail. In Proceedings of Computer-Human Interaction. New York: ACM Press.
Zakon, R. H. (1993). Hobbes’ Internet timeline. Retrieved March 31, 2004, from http://www.zakon.org/robert/internet/timeline/
EMBEDDED SYSTEMS
Embedded systems use computers to accomplish specific and relatively invariant tasks as part of a larger system function—as when, for example, a computer in a car controls engine conditions. Computers are embedded in larger systems because of the capability and flexibility that are available only through digital systems. Computers are used to control other elements of the system, to manipulate signals directly and in sophisticated ways, and to take increasing responsibility for the interface between humans and machines. Prior to this embedding of computers in larger systems, any nontrivial system control required the design and implementation of complex mechanisms or analog circuitry. These special-purpose dedicated mechanisms and circuits were often difficult to design, implement, adjust, and maintain. Once implemented, any significant changes to them were impractical. Further, there were severe limits on the types of control that were feasible using this approach.
The embedding of computers in larger systems enables the implementation of almost unlimited approaches to control and signal processing. A computer can implement complex control algorithms that can adapt to the changing operation of a larger system. Once a computer has been embedded in a larger system, it can also be used to provide additional functionality, such as communications with other computers within or outside the larger system that it serves. It can also be used to support improved interfaces between machines and human operators. In addition, an embedded computing system can be updated or altered through the loading of new software, a much simpler process than is required for changes to a dedicated mechanism or analog circuit. People living in modern technological societies come into contact with many embedded systems each day. The modern automobile alone presents several examples of embedded systems. Computer-based engine control has increased fuel efficiency, reduced harmful emissions, and improved automobile starting and running characteristics. Computer-based control of automotive braking systems has enhanced safety through antilock brakes. Embedded computers in cellular telephones control system management and signal processing, and multiple computers in a single handset handle the human interface. Similar control and signal-processing functions are provided by computers in consumer entertainment products such as digital audio and video players and games. Embedded computing is at the core of high-definition television. In health care, many people owe their lives to medical equipment and appliances that could only be implemented using embedded systems.
Defining Constraints
The implementation and operation constraints on embedded systems differentiate these systems from general-purpose computers. Many embedded systems require that results be produced on a strict schedule or in real time. Not all embedded systems face this constraint, but it is imposed much more on embedded systems than on general-purpose computers. Those familiar with personal computers rarely think
of the time required for the computer to accomplish a task because results are typically returned very quickly, from the average user’s point of view. Some personal-computing operations, such as very large spreadsheet calculations or the editing of large, high-resolution photographs, may take the computer a noticeable amount of time, but even these delays are rarely more than an inconvenience. In contrast, embedded systems can operate at extraordinary speeds, but if an embedded system violates a real-time constraint, the results can be catastrophic. For example, an automobile engine controller may need to order the injection of fuel into a cylinder and the firing of a sparkplug at a rate of thousands of injections and sparks each second, and timing that deviates by less than one-thousandth of a second may cause the engine to stall. Systems that involve control or signal processing are equally intolerant of results that come early or late: Both flaws are disruptive. Limited electrical power and the need to remove heat are challenges faced by the designers of many embedded systems because many embedded applications must run in environments where power is scarce and the removal of heat is inconvenient. Devices that operate on batteries must strike a balance between demand for power, battery capacity, and operation time between charges. Heat removal is a related problem because heat production goes up as more power is used. Also, embedded systems must often fit within a small space to improve portability or simply to comply with space constraints imposed by a larger system. Such space constraints exacerbate the problem of heat removal and thus further favor designs that limit power consumption. A cellular telephone, for example, features embedded systems that are hampered by significant power and space constraints. A less obvious example is the avionics package for a general-aviation aircraft. Such a system must not draw excessive power from the aircraft’s electrical system, and there may be little space available for it in the aircraft cockpit. Users of older personal computers learned to expect frequent computer failures requiring that the users restart the computer by pressing a combination of keys or a reset button. Newer personal computers are more robust, but many embedded systems demand even greater robustness and cannot
rely on human intervention to address failures that might arise. Users of personal computers accept that software often includes bugs, but the same users expect that their hardware—in this context, devices such as household appliances, automobiles, and telephones—will operate without problems. Traditionally, such devices have been very robust because they were relatively simple. The embedding of computing into these sorts of devices offers potential for greater functionality and better performance, but the consumer still expects the familiar robustness. Further, embedded computing is often found in systems that are critical to the preservation of human life. Examples include railroad signaling devices and medical diagnostic and assistive technology such as imaging systems and pacemakers. These systems must be robust when first placed in service and must either continue to operate properly or fail only in ways that are unlikely to cause harm. Further, as mentioned above, these systems must operate without human intervention for extended periods of time. Most current embedded systems operate in isolation, but some perform their functions with limited monitoring and direction from other computers. As with general-purpose computing, there appears to be a trend toward increasing the interoperability of embedded systems. While increasing the interaction among embedded systems offers the potential for new functionality, networking of embedded computing devices also increases security concerns.
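To make the real-time constraint discussed above concrete, a periodic control task can be written to release at a fixed interval and to treat any overrun of its deadline as a fault. The sketch below is illustrative only; the one-millisecond period, the jitter margin, and the empty control step are assumptions chosen for the example, and a real engine or signal-processing controller would run on dedicated real-time hardware and software rather than as ordinary application code.

```python
import time

PERIOD_S = 0.001            # hypothetical 1-millisecond control period
DEADLINE_MARGIN_S = 0.0002  # tolerate 0.2 ms of jitter before flagging a fault

def control_step():
    """Placeholder for the real work: read sensors, compute actuator commands."""
    pass

def run_periodic_controller(steps: int) -> int:
    """Run the control step periodically and count missed deadlines."""
    next_release = time.perf_counter()
    missed = 0
    for _ in range(steps):
        control_step()
        next_release += PERIOD_S
        now = time.perf_counter()
        if now > next_release + DEADLINE_MARGIN_S:
            # A hard real-time system would treat this as a serious fault,
            # for example by switching to a safe fallback mode.
            missed += 1
            next_release = now  # resynchronize after the overrun
        else:
            time.sleep(max(0.0, next_release - now))
    return missed

print("missed deadlines:", run_periodic_controller(1000))
```

In practice such checks run on a real-time operating system that can guarantee the task is scheduled on time in the first place; the deadline monitor then serves as a last line of defense.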
An Illustrative Example
People seldom notice the embedded systems in the everyday devices around them; they tend instead to think of embedded systems in conjunction with cutting-edge technology, such as the various spacecraft developed and deployed by NASA. The first embedded computer used by NASA in a manned spacecraft was developed for the Gemini program in the early 1960s. That computer was used for guidance and navigation. (The Mercury program preceding Gemini involved manned space flight, but the flights were simple enough to be controlled from the ground.) The NASA programs following Gemini placed increasing reliance on embedded computers to accomplish a range of tasks required for the successful completion of manned space missions.
Unmanned space flights have needed embedded computers to provide flight control for spacecraft too far from Earth to be controlled from the ground. Closer to Earth, however, the modern automobile may contain a hundred embedded computers, each with greater computational capabilities than the single computer that traveled on the Gemini space flights. Embedded computer engine control was introduced in the late 1970s to satisfy emissions requirements while maintaining good performance. Those who operated automobiles before the days of embedded systems will recall that those automobiles were more difficult to start when the weather was too cold, too hot, or too wet. Automobiles of that era were also less fuel efficient, emitted more pollution, and had performance characteristics that varied with driving and environmental conditions more than is the case today. Embedded computer engine control addresses these variations by adapting the engine control in response to sensed environmental and engine operation data. The next element in the automobile drive train is the transmission. The first cars with automatic transmissions typically suffered from poorer performance and fuel economy than cars with manual transmissions. Modern automatic transmissions controlled by embedded computers, however, compare favorably with manual transmissions in both performance and economy. The computer control supports the selection of different shifting strategies depending on whether the driver prefers sports driving or economy driving. Further, manufacturers can match a single transmission to a wide range of engines by changing the software in the transmission controller. The embedded transmission system can also be configured to communicate with the embedded engine system to generate better performance and economy than either system could achieve operating independently.
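One common form of the adaptive engine control described above is closed-loop fuel control, in which the controller repeatedly nudges fuel delivery until an exhaust-oxygen sensor reports the desired mixture. The sketch below illustrates only the feedback idea; the sensor model, target value, and correction gain are invented for the example and bear no relation to any actual engine controller.

```python
def read_oxygen_sensor(fuel_rate: float) -> float:
    """Stand-in for a real sensor: richer fuel delivery gives a lower reading.
    The linear relationship is invented purely for this sketch."""
    return 1.0 - 0.4 * fuel_rate

TARGET = 0.5   # hypothetical desired sensor reading (ideal mixture)
GAIN = 0.1     # small correction applied each control cycle

fuel_rate = 1.0
for cycle in range(50):
    error = read_oxygen_sensor(fuel_rate) - TARGET
    fuel_rate += GAIN * error   # lean reading -> add fuel; rich reading -> reduce it

print(round(fuel_rate, 3))      # converges toward the rate that meets the target
```

Because the loop responds to what the sensor actually reports, the same controller compensates automatically for changes in temperature, altitude, fuel quality, or engine wear.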
Other familiar automotive capabilities provided through embedded systems include cruise control, control of antilock brakes, traction control, active control of vehicle suspension, and control of steering for variable power assist or four-wheel steering. Automobile interior climate and accessories such as wipers and power windows may be controlled by embedded systems. In some instances late-model automobiles that have been recalled to the factory have had the required repair accomplished entirely through an embedded computer software change. Embedded communication and navigation systems for automobiles are now available, and these systems are more complex than those used in the early space program. In addition, the human interface between the automobile and its driver is now managed by one or more embedded systems. In the early 1980s, several automakers replaced analog human interfaces with computer-based interfaces. Some of those interfaces were not well received. Later-model automobiles retained the computer control of the interfaces but returned to the more familiar analog appearance. For example, many drivers prefer the dial speedometer to a digital display, so even though the speedometer is actually controlled by a computer, auto designers reverted from digital to analog display.
Increasing Dependence
Embedded computers can be used to implement far more sophisticated and adaptive control for complex systems than would be feasible with mechanical devices or analog controllers. Embedded systems permit the human user to interact with technology as the supervisor of the task rather than as the controller of the task. For example, in an automobile, the engine controller frees the driver from having to recall a particular sequence of actions to start a car in cold weather. Similarly, the automatic transmission controller releases the driver from tracking engine speed, load, and gear, leaving the driver free to concentrate on other important driving tasks. The computer’s ability to manage mundane tasks efficiently is one of the great assets of embedded systems. Unfortunately, the increased complexity that embedded systems make possible and the increased separation between the user and the machine also introduce new potential dangers. Embedded systems make previously impractical applications practical. Prior to the late 1970s, mobile telephone service was cumbersome and expensive because of limited capabilities to manage the
available radio communication channels. Embedded computing initially made the modern cellular telephone industry feasible because computers embedded within the cellular telephone base stations provided radio channel management and efficient hand-offs as mobile users moved from cell to cell. Newer digital cell phones include computers embedded within the handsets to improve communication and power efficiency. Without embedded computing, the explosive expansion of the cell phone industry could not have occurred. Implanted pacemakers help maintain the human heart’s proper pumping rhythm and have improved the quality and duration of life for many people, but those people are now dependent upon this embedded system.
The Future
Embedded systems will certainly expand in functionality, influence, and diversity for the foreseeable future. The digital technology required to implement embedded systems continues to improve, and computer hardware is becoming more powerful, less expensive, smaller, and more capable of addressing electrical power considerations. In parallel, techniques for the production of robust real-time software are steadily improving. Digital communication capability and access are also expanding, and thus future embedded systems are more likely to exhibit connectivity outside of their larger systems. Lessons learned from early interfaces between humans and embedded systems, coupled with the improvements in embedded computing, should yield better interfaces for these systems. Embedded systems are likely to become so woven into everyday experience that we will be unaware of their presence.
Ronald D. Williams
See also Fly-by-Wire; Ubiquitous Computing
FURTHER READING
Graybill, R., & Melhem, R. (2002). Power aware computing. New York: Kluwer Academic Press/Plenum.
Hacker, B. (1978). On the shoulders of Titans: A history of Project Gemini. Washington, DC: NASA Scientific and Technical Information Office.
Jeffrey, K. (2001). Machines in our hearts: The cardiac pacemaker, the implantable defibrillator, and American health care. Baltimore: Johns Hopkins University Press.
Jurgen, R. (1995). Automotive electronics handbook. New York: McGraw-Hill.
Leveson, N. (1995). Safeware: System safety and computers. Reading, MA: Addison-Wesley.
Shaw, A. (2001). Real-time systems and software. New York: John Wiley & Sons.
Stajano, F. (2002). Security for ubiquitous computing. West Sussex, UK: John Wiley & Sons.
Vahid, F., & Givargis, T. (2002). Embedded systems design: A unified hardware/software introduction. New York: John Wiley & Sons.
Wolf, W. (2001). Computers as components. San Francisco: Morgan Kaufmann Publishers.
ENIAC
The Electronic Numerical Integrator and Computer (ENIAC), built at the University of Pennsylvania between 1943 and 1946, was the first electronic digital computer that did useful work. Large analog computers had existed since Vannevar Bush and his team built the differential analyzer in 1930. Depending on one's definition, the first digital computer may have been the experimental Atanasoff-Berry machine in 1940. Unlike its predecessors, ENIAC possessed many of the features of later digital computers, with the notable exceptions of a central memory and fully automatic stored programs. In terms of its goals and function, ENIAC was the first digital supercomputer, and subsequent supercomputers have continued in the tradition of human-computer interaction that it established. They tend to be difficult to program, and a technically adept team is required to operate them. Built from state-of-the-art components, they involve a demanding trade-off between performance and reliability. Their chief purpose is to carry out large numbers of repetitious numerical calculations, so they emphasize speed of internal operation and tend to have relatively cumbersome methods of data input and output. Remarkably, ENIAC solved problems of kinds that continue to challenge supercomputers more than a half-century later: calculating ballistic trajectories, simulating nuclear explosions, and predicting the weather.
A technician changes a tube in the ENIAC computer during the mid-1940s. Replacing a faulty tube required checking through some 19,000 possibilities. Photo courtesy of the U.S. Army.
Historians debate the relative importance of various members of the ENIAC team, but the leaders were the physicist John W. Mauchly, who dreamed of a computer to do weather forecasting, and the engineer J. Presper Eckert. After building ENIAC for the U.S. Army, they founded a company to manufacture computers for use in business as well as in government research. Although the company was unprofitable, their UNIVAC computer successfully transferred the ENIAC technology to the civilian sector when they sold out to Remington Rand in 1950. Both development and commercialization of digital computing would have been significantly delayed had it not been for the efforts of Mauchly and Eckert.
The problem that motivated the U.S. Army to invest in ENIAC was the need for accurate firing tables for aiming artillery during World War II. Many new models of guns were being produced, and working out detailed instructions for hitting targets at various distances empirically by actually shooting the guns repeatedly on test firing ranges was costly in time and money. With data from a few test firings, one can predict a vast number of specific trajectories mathematically, varying such parameters as gun angle and initial shell velocity. The friction of air resistance slows the projectile second by second as it flies, but air resistance depends on such factors as the momentary speed of the projectile and its altitude. Thus, the accuracy of calculations is improved by dividing the trajectory into many short intervals of time and figuring the movement of the projectile in each interval on the basis of the output of the preceding intervals and changing parameters.
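The stepwise method just described can be sketched in a few lines of modern code: the flight is divided into short time steps, and in each step the shell's velocity and position are updated from gravity and a drag force that grows with speed. The time step, the simple drag model, and the launch values below are illustrative assumptions, not the equations or data actually used on ENIAC.

```python
import math

DT = 0.01        # time step in seconds; smaller steps give a more accurate path
G = 9.81         # gravitational acceleration, m/s^2
DRAG_K = 0.0001  # invented drag coefficient for the illustration

def trajectory(speed: float, angle_deg: float):
    """Step a shell forward in time until it returns to the ground."""
    angle = math.radians(angle_deg)
    x, y = 0.0, 0.0
    vx, vy = speed * math.cos(angle), speed * math.sin(angle)
    t = 0.0
    while y >= 0.0:
        v = math.hypot(vx, vy)
        # Drag opposes the velocity and grows with speed; gravity pulls downward.
        ax = -DRAG_K * v * vx
        ay = -G - DRAG_K * v * vy
        vx, vy = vx + ax * DT, vy + ay * DT
        x, y = x + vx * DT, y + vy * DT
        t += DT
    return x, t

range_m, flight_time = trajectory(speed=500.0, angle_deg=45.0)
print(f"range {range_m:.0f} m, time of flight {flight_time:.1f} s")
```

Shrinking the time step improves accuracy at the cost of more arithmetic, which is precisely the trade-off that made electronic speed so valuable.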
At the dedication ceremony for ENIAC in 1946, the thirty-second trajectory of an artillery shell was calculated to demonstrate the machine's effectiveness. Using desk calculators, people would take three days to complete the job, compared with thirty minutes on the best ballistics analog computer, the differential analyzer. ENIAC did the calculation accurately in twenty seconds—less than the time the shell would be in the air. Because World War II had ended by the time ENIAC was ready, the first real job it did was evaluating the original design for the thermonuclear (hydrogen) bomb, finding that the design was flawed and causing the atomic scientists to develop a better approach. Filling 167 square meters in a large room, the 27-metric-ton ENIAC was constructed in a “U” shape, with the panels and controls facing inward toward an area where the operators worked. ENIAC was built with about eighteen thousand vacuum tubes, consuming 174 kilowatts of electric power and keeping the room quite hot. Many experts had been skeptical that the machine could work because vacuum tubes frequently burned out, but taking great care in testing the tubes and running them below their specifications kept failures in use to about six hundred a year. ENIAC had both an IBM card reader and an automatic card punch, used chiefly for the output and input of data calculated during one run that would be used later in another run; the cards were not used to enter programs. The computer was programmed largely by plugging in equipment and connecting by means of cables the twenty accumulators (electronic adders) that performed the calculations. Hundreds of flashing lights on the accumulators gave the operators clues about how the work was progressing. The calculations were done in the decimal system, rather than binary, and parameters were input manually by setting rotary switches. Switches also controlled local program-control circuits. To set parameters for a given run, the programmers held
paper instructions in one hand and used the other hand to turn rotary switches on the tall function tables, one 0–9 switch for each digit. Arranged in rows from head to ankle height, these switches had a simple color coding to reduce errors: Every fifth row of knobs was red and the others black; the plates behind the knobs alternated shiny with black, three columns at a time. Multiplication, division, and square-root calculation were handled by specially built components that could be plugged in as needed. A master programmer unit handled conditional (“if-then”) procedures. Programmers might require a month to write a program for ENIAC and from a day to a week to set up and run one, but this was not as inefficient as it seems because after the machine was ready to do a particular job, a large number of runs could be cranked out rapidly with slight changes in the parameters. ENIAC continued to do useful work for the military until October 1955. Parts of this pioneering machine are on display at the National Museum of American History in Washington, D.C., along with videos of Presper Eckert explaining how it was operated.
William Sims Bainbridge
See also Atanasoff-Berry Computer; Supercomputers
FURTHER READING
McCartney, S. (1999). ENIAC: The triumphs and tragedies of the world's first computer. New York: Walker.
Metropolis, N., Howlett, J., & Rota, G.-C. (Eds.). (1980). A history of computing in the twentieth century. New York: Academic Press.
Stern, N. (1981). From ENIAC to UNIVAC: An appraisal of the Eckert-Mauchly computers. Bedford, MA: Digital Press.
Weik, M. H. (1961, January/February). The ENIAC story. Ordnance, 3–7.
ERGONOMICS
The field of human factors and ergonomics plays an important and continuing role in the design of
human-computer interfaces and interaction. Researchers in human factors may specialize in problems of human-computer interaction and system design, or practitioners with human factors credentials may be involved in the design, testing, and implementation of computer-based information displays and systems. Human factors and ergonomics can be defined as the study, analysis, and design of systems in which humans and machines interact. The goal of human factors and ergonomics is safe, efficient, effective, and error-free performance. Human factors researchers and practitioners are trained to create systems that effectively support human performance: Such systems allow work to be performed efficiently, without harm to the worker, and prevent the worker from making errors that could adversely affect productivity or, more importantly, have adverse effects on the worker or others. Research and practice in the field involve the design of workplaces, systems, and tasks to match human capabilities and limitations (cognitive, perceptual, and physical), as well as the empirical and theoretical analysis of humans, tasks, and systems to gain a better understanding of human-system interaction. Methodologies include controlled laboratory experimentation, field and observational studies, and modeling and computer simulation. Human factors and ergonomics traces its roots to the formal work descriptions and requirements of the engineer and inventor Frederick W. Taylor (1856–1915) and the detailed systems of motion analysis created by the engineers Frank and Lillian Gilbreth (1868–1924 and 1878–1972, respectively). As human work has come to require more cognitive than physical activities, and as people have come to rely on increasingly sophisticated computer systems and automated technologies, human factors researchers and practitioners have naturally moved into the design of computerized, as well as mechanized, systems. Human factors engineering in the twenty-first century focuses on the design and evaluation of information displays, on advanced automation systems with which human operators interact, and on the appropriate role of human operators in supervising and controlling computerized systems.
An Ergonomics Approach to Human-Computer Interaction
When it comes to designing computer systems, human factors and ergonomics takes the view that people do not simply use computers; rather, they perform tasks. Those tasks are as various as controlling aircraft, creating documents, and monitoring hospital patients. Thus, when analyzing a computer system, it is best to focus not on how well people interact with it (that is, not on how well they select a menu option, type in a command, and so forth), but on how well the system allows them to accomplish their task-related goals. The quality of the interface affects the usability of the system, and the more congruent the computer system is with the users’ task- and goal-related needs, the more successful it will be. David Woods and Emilie Roth (researchers in cognitive engineering) describe a triad of factors that contribute to the complexity of problem solving: the world to be acted on, the agents (automated or human), and how the task is represented. In human-computer systems, the elements of the triad are, first, aspects of the task and situation for which the system is being employed; second, the human operator or user; and third, the manner in which information relevant to the task is represented or displayed. The role of the computer interface is to serve as a means of representing the world to the human operator. Human factors and ergonomics research has addressed numerous topics related to human-computer interaction, including the design of input devices and interaction styles (for example, menus or graphical interfaces), computer use and training for older adults, characteristics of textual displays, and design for users with perceptual or physical limitations. Areas within human factors and ergonomics that have direct applicability to the design of human-computer systems include those focused on appropriate methodologies and modeling techniques for representing task demands, those that deal with issues of function allocation and the design of human-centered automation, and those concerned with the design of display elements that are relevant to particular tasks.
Methodologies and Modeling Frameworks for Task and Work Analysis
John Gould and Clayton Lewis (1985) claimed that successful computer system design requires an early and continual focus on system users and the tasks that they need to perform. In addition, Donald Norman (1988) has suggested that for a computer system to be successful, users must be given an appropriate model of what the system does, and how; information on system operation must be visible or available; and users must have timely and meaningful feedback regarding the results of their actions. Users should never be unable to identify or interpret the state of the computer system, nor should they be unable to identify or execute desired actions. Therefore, human factors and ergonomics research that focuses on human-computer system design devotes considerable energies to the analysis of system components, operator characteristics, and task requirements, using task and work analysis methods. A hierarc