ISSN 1571-5035
ISBN 978-1-84800-049-0    e-ISBN 978-1-84800-050-6
DOI 10.1007/978-1-84800-050-6
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Control Number: 2008933315
© Springer-Verlag London Limited 2008
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licenses issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.
The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.
The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.
Printed on acid-free paper
Springer Science+Business Media
springer.com
Contributors

Shadi Abou-Zahra World Wide Web Consortium (W3C), Web Accessibility Initiative (WAI), France, [email protected]
Chieko Asakawa IBM Research Division, Tokyo Research Laboratory, Yamato-shi, Kanagawa-ken, Japan, [email protected]
Armando Barreto Electrical & Computer Engineering Department and Biomedical Engineering Department, Florida International University, FL, USA, [email protected]
Sean Bechhofer University of Manchester, Manchester, UK, [email protected]
Anna Cavender Computer Science and Engineering, University of Washington, Seattle, WA, USA, [email protected]
Alistair D. N. Edwards Department of Computer Science, University of York, Heslington, York, UK, [email protected]
Stefano Ferretti Department of Computer Science, University of Bologna, Bologna, Italy, [email protected]
Becky Gibson IBM, Emerging Technologies, Westford, MA, USA, [email protected]
Jon Gunderson University of Illinois at Urbana/Champaign, Disability Resources and Educational Resources, Champaign, IL, USA, [email protected]
Vicki L. Hanson IBM, T J Watson Research Center, Hawthorne, NY, USA, [email protected]
Simon Harper School of Computer Science, University of Manchester, Manchester, UK, [email protected]
Masahiro Hori Faculty of Informatics, Kansai University, Osaka, Japan, [email protected]
Ian Horrocks University of Manchester, Manchester, UK, [email protected]
Sarah Horton Dartmouth College, Academic Computing, Hanover, NH, USA, [email protected]
Caroline Jay Department of Computer Science, University of Manchester, Manchester, England, [email protected]
Takashi Kato Kansai University, Ryozenji-cho, Takatsuki-shi, Osaka, Japan, [email protected]
Andrew Kirkpatrick Adobe Systems Incorporated, USA, [email protected]
Sri H. Kurniawan School of Informatics, University of Manchester, Manchester, UK, [email protected]
Richard E. Ladner Department of Computer Science & Engineering, University of Washington, Seattle, WA, USA, [email protected]
Laura Leventhal Department of Computer Science, Bowling Green State University, Bowling Green, OH, USA, [email protected]
Clayton Lewis Department of Computer Science, University of Colorado, Boulder, CO, USA, [email protected]
Darren Lunn School of Computer Science, University of Manchester, Manchester, UK, [email protected]
Eleni Michailidou School of Computer Science, University of Manchester, Manchester, UK, [email protected]
Silvia Mirri University of Bologna, Bologna, Italy, [email protected]
Ethan V. Munson University of Wisconsin-Milwaukee, Milwaukee, WI, USA, [email protected]
Maria da Graça C. Pimentel Universidade de São Paulo, São Carlos, Brazil, [email protected]
T. V. Raman Google Research, USA, [email protected]
Bob Regan Adobe Systems Incorporated, USA, [email protected]
John T. Richards IBM, T J Watson Research Center, Hawthorne, NY, USA, [email protected]
Marco Roccetti Department of Computer Science, University of Bologna, Bologna, Italy, [email protected]
Paola Salomoni Department of Computer Science, University of Bologna, Bologna, Italy, [email protected]
Cal Swart IBM, T J Watson Research Center, Hawthorne, NY, USA, [email protected]
Hironobu Takagi IBM Research Division, Tokyo Research Laboratory, Kanagawaken, Japan, [email protected]
Jutta Treviranus Adaptive Technology Resource Center, Faculty of Information Studies, University of Toronto, Toronto, Ontario, Canada, [email protected]
Shari Trewin IBM, T J Watson Research Center, Hawthorne, NY, USA, [email protected]
Yeliz Yesilada Human Centred Web, School of Computer Science, University of Manchester, Manchester, UK, [email protected]
Introduction
Web accessibility conjures the vision of designers, technologists, and researchers valiantly making the World Wide Web (Web) open to disabled users. While this may be true in part, the reality is a little different. Indeed, Web accessibility is actually about achieving two complementary objectives: (1) reverse engineering and design rediscovery – correcting our past mistakes by making the current Web fulfil the original Web vision of access for all – and (2) the discovery and understanding of the factors which influence the accessibility of the Web within the context of human interaction. It just so happens that in the process of trying to achieve these objectives, which have for the most part been ignored, we may understand, and even solve, a number of larger-scale usability issues faced by every Web user. Indeed, by understanding disabled users' interaction, we enhance our understanding of all users operating in constrained modalities, where the user is handicapped by both environment and technology. It is for this reason that Web accessibility is a natural preface to wider Web usability and universal accessibility; it is also why mainstream human factors researchers take it so seriously and understand its cross-cutting benefits.

Humans are variously skilled, and part of assuring the accessibility of technology consists of seeing that an individual's skills match up well with the requirements for operating the technology. There are two components to this: training the human to accommodate the needs of the technology and designing the technology to meet the needs of the human. The better we do the latter, the less we need of the former. One of the non-trivial tasks given to a designer of human–machine interfaces is to minimize the need for training. Because computer-based technology is relatively new, we have concentrated primarily on the learnability aspects of interface design, but efficiency of use once learning has occurred and automaticity has been achieved has not received its due attention. In addition, we have focused largely on the ergonomic problems of users, sometimes not asking if the software is causing cognetic problems. In the area of accessibility, efficiency and cognetics can be of primary concern. For example, users who must operate a keyboard with a pointer held in their mouths benefit from specially designed keyboards and well-shaped pointers. However well made the pointer, however refined the keyboard layout, and however comfortable the physical environment we have made for this user, if the software requires more keystrokes than absolutely necessary, we are not delivering an optimal interface for that user. When we study interface design, we usually think in terms of accommodating higher mental activities, the human capabilities of conscious thought and ratiocination. Working with these
areas of thought brings us to questions of culture and learning, and the problems of localizing and customizing interface designs. These efforts are essential, but it is almost paradoxical that most interface designs fail to first assure that the interfaces are compatible with the universal traits of the human nervous system, in particular those traits that are sub-cortical and that we share with other animals. These characteristics are independent of culture and learning, and often are unaffected by disabilities. Most interfaces, whether designed to accommodate accessibility issues or not, fail to satisfy the more general and lower-level needs of the human nervous system. In the future, designers should make sure that an interface satisfies the universal properties of the human brain as a first step to assuring usability at cognitive levels.

Jef Raskin, The Humane Interface
We may imagine that there are many reasons for the Web to be accessible, ranging from moral necessity, through ethical requirement, to legal obligation. However, the two most compelling are solid mainstream considerations: the business case and the über use case.

The business case for Web accessibility is strong on three fronts. First, one in five people over the age of 65 is disabled. Population demographics indicate that our populations are ageing across the board. As the population ages, the financial requirement to work longer increases, but the ability to work longer is reduced because disability becomes a bar to employment. Secondly, an ageing and disabled, but Web-literate, population indicates a large market for online shopping and services, especially when mobility is a problem for the shopper. A final benefit for business, keeping in mind that disability does not equal unskilled, is a highly motivated and skill-rich workforce. With the growth of the knowledge economy through many developed countries, and a move from manual work to more thought- and communication-based activities, there is the very real possibility of disabled Web users being able to find productive, fulfilling, and socially empowering employment; if only technology, and specifically the Web, were available to them. Web accessibility means commercial success.

Web accessibility is really just an über use case because, in the end, we will all be handicapped by the technology or the environment. Work on Web accessibility is helping us address many other domains, including those centred around user mobility. For instance, work on physical disability and the Web is helping solve problems of the usability of mobile technology. By applying the same technology used to counter a physically disabled user's tremors and jerky movements to the mobile Web, the operational problems of mobile interaction in moving environments are being solved. Similarly, mobile Web access suffers from the interoperability and usability problems that make the Web as difficult to interact with for mainstream users as it is for visually impaired users. Again, solutions proposed 3–4 years ago in the Web accessibility community are now being applied to mainstream mobile devices. Indeed, a fact often forgotten is that we are all unique. The disabled user serves as a reminder that Web accessibility is a truly individual experience and that, by understanding the flexibility and personalisation required by disabled users, we can understand that at some point this same
flexibility and personalisation will be required by all. To understand the needs of disabled users is to understand the needs of everyone.

An important route to achieving Web accessibility is to improve our knowledge and understanding of it through research and innovation. Although many books have been published on Web accessibility, unfortunately they have mostly been written from a technical perspective. They do not really tell the whole story – what about research on Web accessibility? How did it all start? How did it evolve? Which sub-areas have been developed? What is the current state of the art? What are the missing pieces? If we want to take Web accessibility to the next level, we need to answer these questions, and this book aims to do that. We have invited experts from different specialised areas of Web accessibility to give us an overview of their area, discuss its limitations and strengths, and present their thoughts on the future directions of that area. As the famous Nobel Prize-winning research scientist Albert Szent-Györgyi said, research is to see what everybody else has seen and to think what nobody else has thought. This book aims to help research scientists who are new to the area to see what everybody else has seen and help them think what nobody else has thought.
Web Science

So how do Web accessibility and the foundational research of this book fit into the new and evolving field of Web Science? We can see that practical aspects such as guidelines and practice-related topics fit more closely into a Web engineering context. However, Web accessibility research investigates the Web from both an experimental and an analytical deployment viewpoint. Indeed, we can see this to be exactly the perspective of the Web Science Research Initiative.

The Web is an engineered space created through formally specified languages and protocols. However, because humans are the creators of Web pages and links between them, their interactions form emergent patterns in the Web at a macroscopic scale. These human interactions are, in turn, governed by social conventions and laws. Web science, therefore, must be inherently interdisciplinary; its goal is to both understand the growth of the Web and to create approaches that allow new powerful and more beneficial patterns to occur.

Tim Berners-Lee, Wendy Hall, James Hendler et al., Creating a Science of the Web (Science, 11 August 2006)
We can see that these two aspects of experimental, let's say 'in vitro',1 and observational, 'in vivo',2 enquiry are closely related and are present within Web accessibility research. Indeed, by understanding the Web as it evolves in vivo, in concert with running experimental studies in vitro, our work is placed squarely at the centre of Web Science. Web accessibility research is, however, missing
components of the Web Science vision at the moment, in that we do not investigate the Web as a first-order organism. We do not yet introduce our in vitro work into the deployed Web and continue to investigate the effects it has on the Web 'organism' over time.

1 (Latin: (with)in the glass).
2 (Latin: (with)in the living).
Keep in Mind

To understand accessibility, the researcher must take account of a number of truths: (1) there is never just one solution; (2) solutions are not simple; (3) a single solution will never work – instead, combinations of solutions are required; (4) you do not know the user or their requirements at the granularity required to make assumptions; and finally, (5) remember that work in the Web accessibility field is not only for disabled people, but for organisations and people without disabilities.3 To build applications and content that allow for heterogeneity, flexibility, and device independence is incredibly difficult, incredibly challenging, and incredibly necessary.

3 See http://www.w3.org/WAI/EO/.
Orientation

With this in mind, we will split this book into four main parts. First, we will examine the intersection between accessibility and disability in an effort to understand the differing needs of users, and the technology provided to fulfil those needs; you could consider this to be a Disability Primer. In Parts II and III, we will investigate the past and current state of play in Web accessibility research. We will define the tools, techniques, and technologies in current use to help design, build, check, and transform Web pages into accessible forms. Finally, we will present an analysis of the future direction of Web accessibility based on an investigation of emergent technologies and techniques. By defining these emergent areas, explaining their effect on the wider Web, and their possible impact on usability, we may be able to predict the future direction of Web accessibility, and by guiding research and development suggest a different and accessible future for the fast-evolving Web.
About this Book

Web accessibility has tended to be considered as a Web design challenge, and therefore existing relevant books are mostly prescriptive tutorials on how to achieve Web accessibility. However, there are hundreds, if not thousands, of research scientists and research and development programmers working in academia and industry advancing the understanding of the current Web, and making developments in the next Web, accessible. As such, this book covers Web accessibility from a broad research perspective. The book is primarily aimed at academics, scientists, engineers, and postgraduate students as the definitive, foundational text on Web accessibility from a deep research perspective. Written by leading experts in the field, it not only provides an overview of existing research but also looks to future developments, and includes expert opinion, with the understanding that this kind of insight cannot be derived purely from existing research publication stores such as Google Scholar.
Part I
Understanding Disabilities
To provide a proper foundation for the upcoming Parts II, III and IV, an initial understanding of each disability is required, and Part I provides a starting point for this. While we cannot comprehensively cover each of our four disabilities (Visual Impairments; Cognitive and Learning Impairments; Hearing Impairments; and Physical Impairments), it is our intention to, at least, cover the main problems as a starting point to facilitate further investigation. Indeed, we also cover a special case (Ageing) because older people can exhibit combinations of disabilities, often of minor severity, which taken together can have a significant impact on quality of life.

Pre-eminent when discussing Web accessibility is visual impairment and profound blindness; however, it should be remembered that visual disability is just one aspect of Web accessibility. Other disabilities also exist which require improvements to the Web to enable accessibility. Before considering these, it is important to consider just what accessibility really means and why it is necessary to think about 'access' when building Web sites. Many disabled users consider the Web to be a primary source for information, employment and entertainment. Indeed, from questioning and contact with many disabled users we have discovered that the importance of the Web cannot be underestimated.

"For me being online is everything. It's my hi-fi, my source of income, my supermarket, my telephone. It's my way in."
This quote, taken from a blind user, sums up the sentiments we experience when talking with many disabled users and drives home the importance of Web accessibility in the context of independent living. The message: by making sites accessible we help people live more productive lives. However, the research scientist should remember that there are no 'catch-all' solutions to providing accessibility to Web-based resources for disabled users. Each user is an individual and each disability is unique; understanding both, in isolation and in combination, is one of the main challenges facing the Web accessibility researcher.
Visual Impairments
Armando Barreto
Abstract This chapter introduces a summary of the physiological processes that support key visual functional capabilities such as visual acuity, contrast sensitivity and field of view. The chapter also considers some of the most common causes of visual dysfunction and their impact on the visual capabilities that are necessary for successful interaction with contemporary computer systems, particularly for access to the World Wide Web. This framework is used to outline the general approaches that have been proposed to facilitate access of individuals with visual impairments to computers, and a few innovative approaches are highlighted as potential directions for further development in this area.
1 Introduction: The Physiological Basis of Visual Perception

In the analysis of the processes at play during the performance of human–computer interactions, it is common to consider that there are at least three types of human subsystems involved. Card et al. (1983) proposed that successful human–computer interaction would require the involvement of a perceptual system, receiving sensory messages from the computer; a motor system, controlling the actions that the user performs to provide input to the computer; and a cognitive system, connecting the other two systems by integrating the sensory input received to determine appropriate user actions. Given the pervasiveness of graphic user interfaces (GUIs) in most contemporary computing systems, and certainly in the majority of World Wide Web (Web) sites, the demands placed on the visual channel of the user's perceptual system have been raised beyond the capabilities of a significant portion of potential Web users.
Fig. 1 Two stages of functionality required for reception of visual information: the object viewed (e.g., the screen) is (1) refracted by the cornea, lens, etc. onto the retina as a retinal projection, which (2) the neural function of the retina converts into electrical signals sent to the brain
Typically, a computer system utilizes the user's visual channel by presenting on a surface (the computer screen) patterns of light point sources (pixels) emitting light made up of the mixture of three basic colors: red, green and blue. If all three colors are fully present, the resulting mixture will be perceived as a white pixel. If none of the colors is present at all, the pixel will be perceived as "black" or "off". Partial mixtures of red, green and blue will yield the perception of a pixel with a specific color (e.g., purple).

Effective human–computer interaction requires that the light patterns formed on the computer screen be successfully processed through a two-stage sequence: the physical reception of the stimulus (e.g., a desktop icon) and then the interpretation of that stimulus by the user (Dix et al. 1998). Unfortunately, even the reception of the stimulus may be hindered if the visual system of the user is not performing its expected functions at its full capacity. One could consider the process needed to receive visual information from a computer interface as involving two necessary stages. First, the refraction system of the eye must create a proper distribution of light on the retina, to represent the graphical element being viewed. In addition to this requirement, the neural function of the retina must be operative, to translate the retinal image into a proper set of neural signals that will be carried to the brain, where they will be ultimately interpreted. This is shown as a simplified diagram in Fig. 1.
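A small editorial illustration of the additive RGB mixing described at the start of this section (the specific 8-bit triples are merely examples and are not taken from the chapter):

```python
# 8-bit RGB triples as a typical display driver represents pixel colors.
white  = (255, 255, 255)   # all three primaries fully present
black  = (0, 0, 0)         # no light emitted ("off")
purple = (128, 0, 128)     # partial red and blue, no green
print(white, black, purple)
```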
1.1 Formation of the Retinal Image

The process of visual perception begins in the eye, where the distribution of point light sources that make up a pattern in the computer display (e.g., an icon) must be faithfully projected onto the retina, the layer at the back of the inside of the eyeball where photoreceptors known as cones and rods will trigger electrical signals called action potentials when stimulated by light. The eye is often compared to a digital camera, and the analogy is indeed attractive since, just as for the camera, the image of an external object must be appropriately formed on the imaging device of the camera (commonly an array of charge-coupled devices, or CCDs), where the images would also be converted to electrical signals. Ideally, each point source of light in the computer display should be projected onto the retina in a single location, to achieve a one-to-one correspondence between the points of light that make up a pattern (e.g., an icon) and a corresponding distribution of illuminated points on the retina (albeit scaled down, upside down and reverted left to right). However, this requires that the
rays of light emanating from each pixel, which naturally diverge, be bent back into a single point of the retina. If this bending of the rays from a single external point source did not take place, each point source in the scene being viewed (e.g., each pixel on the computer screen) would result in the illumination of a relatively widespread area on the retina, which would be perceived as a blur instead of a point. Overall, this would result in the perception of a defocused or distorted image.

The eye performs the necessary bending of the light, or refraction, primarily in two stages. Refraction first occurs when the light passes from the air to the cornea, the transparent portion of the outer layer of the eye, in front of the opening of the eye, known as the pupil. Ideally, the cornea should have a hemispheric shape, which would cause it to introduce a fixed and uniform amount of refraction. One more stage of significant refraction occurs as the light passes from the aqueous humor (filling the space between the cornea and the lens) into the denser lens (Martini et al. 2001). Furthermore, the lens provides a variable level of refraction, dependent on its shape. If the lens is allowed to take its intrinsically spherical shape, it will perform a stronger bending of the rays of light. This is needed to focus the image from a nearby point source, whose rays are clearly divergent when they reach the observer's eye. On the other hand, focusing the light from a distant point source will require less refraction, as the rays in this case arrive at the eye in an almost parallel configuration. In this case, the refraction provided by the lens is decreased by pulling on it radially, which flattens it. This process of re-shaping the lens to achieve the right amount of overall refraction in the eye, so that a point light source (located at different distances from the observer) maps to an illuminated point in the retina, is called accommodation, and is controlled by the ciliary muscle of the eye.

If a computer user is unable to accommodate properly, in order to focus images displayed on the computer screen, each pixel will be perceived as a blur instead of a point, and the pattern displayed may be perceived as blurred or distorted, compromising its role in the use of a GUI. In addition to deficiencies in accommodation, irregularities in the refraction introduced by the cornea (from simple lack of radial symmetry, such as astigmatism, to some more complex irregularities, such as keratoconus) may result in the distortion of the retinal representation of each pixel, and therefore in a distorted retinal image.

The formation of a properly focused and undistorted retinal image will produce a distribution of light on the retina that will be sampled by visual receptors, rods and cones, which cover the posterior inner wall of the eyeball. Each of these receptors triggers electrical signals, known as action potentials, when enough light impinges on it. The rods are very sensitive, triggering action potentials even if very low levels of illumination act on them, and respond to any of the light frequencies or colors. They are somewhat scattered over the whole retinal surface. Because of these characteristics the rods are key to our peripheral vision and support vision under dimly lit conditions, but are not capable of providing perception of details. On the other hand, there are three types of cone receptors, classified according to the color which each perceives: red, green and
blue. Most of the roughly 6 million cones are found densely packed in a central region within the retina, called the fovea (Martini et al. 2001). The high density of cones in the fovea implies that we normally are capable of perceiving higher levels of detail in objects that lie in our central vision, i.e., near the visual axis of the eye, which is an imaginary straight-line crossing through the fovea and the center of the pupil.
1.2 Neurological Function in the Retina

The retina does more than just sample the light distribution defined by the objects we view. Cones and rods transmit their action potentials to a second layer of excitable cells, the bipolar cells, which in turn transmit their impulses to a layer of ganglion cells, whose axons collectively leave the eye globe at a single location (the optic disc) to constitute the optic nerve. However, the transmission of visual information from receptors to bipolar cells to ganglion cells is not a simple relay. Instead, neural processing takes place in the retina, which also includes horizontal cells and amacrine cells interconnecting cells present in the same layer. This retinal neural processing is suspected to be responsible for an additional increase in our net contrast sensitivity. Furthermore, while the transmission of neural information from the cones to their corresponding ganglion cells (P cells) is approximately in a proportion of 1:1, the neural activity in each of the ganglion cells associated with rods (M cells) may be defined by as many as a thousand rods, making our low-illumination vision less capable of perceiving detail.

About half of the fibers of the optic nerve of each eye project to a structure in the brain called the lateral geniculate nucleus (LGN) on the same side as the originating eye, while the remaining fibers cross over to the lateral geniculate nucleus of the opposite side. From each LGN, neural activity associated with visual input is relayed to the occipital area of the corresponding brain hemisphere, to the visual cortex, where the interpretative phase of visual perception commences.

Hopefully, the previous sketch of the process by which we sense visual information will provide some background for the following sections, in which some key functional capabilities of our visual system will be considered, as well as their potential disruption and the impact that it may have on human–computer interaction.
2 Overview: Functional Requirements for Visual Perception

Proper function of the visual system of a computer user would endow the user with visual capabilities that may be considered in terms of specific visual functions. Four critical visual functions that can be considered and assessed
for each given individual are visual acuity, contrast sensitivity, visual field and color perception (Jacko et al. 2002).

Visual acuity refers to the capability of the eye to resolve two point sources of light that may be located close to each other. As mentioned before, a single point source of light should ideally result in a very small illuminated region of the retina. In practice, even without visual dysfunction, the retinal spot corresponding to a distant point source of light may be as wide as 11 μm in diameter (Guyton and Hall 1996). Therefore, two distant point light sources might be perceived as a single source if the physical separation between them is smaller than a certain threshold. The normal visual acuity of the human eye for discriminating between point sources of light is about 25 seconds of arc (Guyton and Hall 1996). Clinically, visual acuity is assessed through the identification of letters of decreasing size, which will therefore have features that are progressively harder to resolve. With the use of an eye chart, such as the Snellen chart, acuity assessments are expressed by comparison to a norm. So, 20/20 acuity indicates that a specific subject can see details at a distance of 20 ft as clearly as would an individual with normal vision, whereas 20/30 indicates a decreased visual acuity, by which the subject must be at 20 ft from an object to discern details that a person with normal vision could distinguish at 30 ft (Martini et al. 2001). Clearly, accommodation impairments or refractive errors due to imperfections of the cornea or the lens will result in decreased visual acuity.

Contrast sensitivity describes the ability of a subject to discern subtle differences in shades of gray present in an image (Ginsburg and Hendee 1993). Clinically, contrast sensitivity may be assessed with the use of the Pelli–Robson chart, in which letters at different levels of contrast are presented to the subject (Pelli et al. 1998).

The field of vision is the visual area seen by an eye at a given instant. The extent of the field of vision may be assessed through a method called perimetry, which requires the subject to look toward a central fixation point, directly in front of the eye. Under those circumstances, a small illumination source is gradually brought into the field of view along a meridian trajectory, and the subject is asked to report when it is first perceived, therefore providing an approximate location for the edge of the field of view along that meridian (Guyton and Hall 1996).

The color perception capability of the human eye is based on the fact that there are three populations of cones in the retina: red, green and blue, which are sensitive only to electromagnetic radiation in the corresponding spectral regions (around 450 nm wavelength for blue cones, around 575 nm wavelength for green cones and around 600 nm wavelength for red cones). Proper color perception can be tested with pseudoisochromatic color plates, which will fail to reveal the expected numerical patterns if the viewer has specific color perception deficiencies. Color perception can also be tested with Farnsworth ordering tests.
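Returning to the Snellen notation above, the fraction can be read as a ratio of viewing distances and converted to a decimal acuity; this worked form is an editorial illustration rather than part of the original chapter:

```latex
\text{decimal acuity} =
\frac{\text{distance at which the subject reads the line}}
     {\text{distance at which a normally sighted person reads the same line}},
\qquad 20/20 = 1.0, \qquad 20/30 \approx 0.67
```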
3 Discussion: Functionality Restrictions in Visual Disorders

The significant human capabilities in terms of visual acuity, contrast sensitivity, field of view and color vision, along with the continuously increasing performance characteristics of computer displays, have encouraged the designers of graphic interfaces and web pages to fully exploit the resolution, size and color available to them. However, this has, indirectly, set the demands on the user's visual system very high. Unfortunately, there is a wide variety of conditions that may result in diminished visual functionality.

It is not uncommon at all to find individuals for whom the refraction implemented by the cornea and the lens is imperfect. This leads to an inability to focus images of objects located far away (myopia) or at close range (hyperopia). The inability to focus objects at a close range is, in fact, expected as the individual becomes older, since aging tends to make the lens less elastic (presbyopia). A young adult can usually focus on objects 15–20 cm away, but as aging proceeds this near point of vision shifts gradually. The near point at age 60 is typically 83 cm (Martini et al. 2001). Furthermore, a corneal shape that departs from a hemisphere implies that the effective refraction implemented by the eye is not the same along different axes of the field of view (e.g., horizontal vs. vertical). This lack of symmetry in the refraction, or astigmatism, will produce distorted retinal projections of the external objects viewed. Similarly, other more severe refraction imperfections, such as keratoconus, produced by an abnormal shaping of the cornea which approximates a conical shape, will produce retinal images that do not faithfully reflect the external objects being viewed. All of these circumstances will deteriorate the effective visual acuity and contrast sensitivity of the individual, reducing them to levels that may be insufficient to meet the high demands imposed by high-resolution graphic interfaces. The formation of a proper retinal image may also be impeded by the abnormal presence of opacities in the lens, or "cataracts", which may result from drug reactions or simply from aging (Martini et al. 2001). The intervening opacities may result in deteriorated visual acuity and restricted field of view.

Beyond the formation of a properly focused retinal image representing the external object being viewed (e.g., an icon on a computer screen), adequate perception of the image requires that all the elements that support the neural function of the retina be present and fully functional. For example, a subject with a congenital lack of red cones (protanopia) will not be able to distinguish red from green. Similarly, there can be a lack of green cones (deuteranopia) or a lack or under-representation of blue cones (tritanopia) (Guyton and Hall 1996). These conditions clearly constrain the typical color vision capabilities of the individual and may compromise the understanding of graphic user interfaces that rely heavily on color to communicate their message to the user. In addition to congenital deficiencies in the neural function of the retina, there are several diseases that may result in deterioration of that neural function. So, for example, in the United States, the most common causes of
decreased vision are diabetic retinopathy, glaucoma and age-related macular degeneration (AMD) (Jacko et al. 2002).

Diabetic retinopathy develops in many individuals with diabetes mellitus, which affects approximately 16 million Americans (Jacko et al. 2002), although the damage to the retina may not be noticeable for years. Diabetic retinopathy develops over a period of years due to the circulatory effects of diabetes, which may include degeneration and rupture of blood vessels in the retina. Visual acuity is lost, and over time the photoreceptors are destroyed due to the lack of proper oxygenation (Martini et al. 2001).

Glaucoma is a relatively common condition, with over 2 million cases reported in the United States alone (Martini et al. 2001), and it is one of the most common causes of blindness. Glaucoma is characterized by a pathological increase of the intraocular pressure (normally varying between 12 and 20 mm Hg), rising acutely as high as 60–70 mm Hg, sometimes due to inappropriate drainage of the aqueous humor (which fills the space between the cornea and the lens). As pressure rises, the axons of the optic nerve are compressed where they leave the eyeball, at the optic disc. This compression is believed to block axonal flow of cytoplasm, resulting in a lack of appropriate nutrition to the fibers, which eventually causes the death of the cells affected (Guyton and Hall 1996). Glaucoma may result in progressive loss of peripheral vision, with the central vision typically being affected only late in the disease. If the condition is not corrected, blindness may result.

Age-related macular degeneration is the leading cause of irreversible visual loss in the Western world, in individuals over 60 years of age. The most common form of the disease is characterized by the deposition of abnormal material beneath the retina and degeneration and atrophy of the central retina in the area known as the "macula lutea", which contains the fovea. The less common AMD variant ("wet") is characterized by the growth of abnormal blood vessels beneath the central retina, which elevate and distort the retina, and may leak fluid and blood beneath or into the retina. AMD can cause profound loss of central vision, but generally does not affect peripheral vision (Jacko et al. 2002). It should be noted that, since AMD commonly affects elderly individuals, it may be accompanied by other forms of visual limitations (e.g., refractive errors) that are also common in that age group.
4 Approaches to Enhance Human–Computer Interaction

This section considers an overview of the approaches that have been proposed to facilitate the access of individuals with different levels of visual impairment to graphic interfaces, such as those used in web pages. Clearly, the most desirable solution for an individual experiencing visual impairment would be a clinical intervention capable of restoring, as much as possible, standard visual functionality, which would therefore allow the individual
to access information presented by a computer in an unimpeded fashion. Such clinical interventions are often available and may range in complexity from the simple use of spectacles or contact lenses to complex surgical procedures, such as the replacement of the eye's lens with an artificial lens to overcome cataracts. The following paragraphs consider the situation in which full restoration of visual function is not possible and alternative solutions are sought to specifically aid an individual in his or her interaction with computer systems. Approaches suggested to aid individuals who are completely blind are different to those suggested for individuals who have "low vision", i.e., individuals who have significantly diminished visual capabilities (such as visual acuity, contrast sensitivity and field of vision) and therefore encounter difficulty in interacting with computers.

Most approaches aimed at facilitating the access of blind users to computers focus on the presentation of information through alternative sensory channels. In particular, significant efforts have been directed to the presentation of output information from the computer through the auditory channel and through tactile devices. A class of alternative computer output systems that use the auditory channel are the "screen readers". One of these systems, which has become very popular, is the Job Access With Speech (JAWS) system. It presents the information that would be displayed to a sighted user as synthesized speech, providing a number of features for speeding and simplifying the search for relevant information in the source being explored. Currently, many blind individuals are able to interact with personal computers using this type of system. However, screen readers provide their users with only the textual contents of the interfaces. This is particularly troublesome for users attempting to interact with web pages, as a screen reader will only be able to substitute the information associated with any picture or graphic element to the extent that the creator of the web page included helpful descriptors (e.g., "alt" tags in the HTML source code for the web page). Furthermore, the (often important) information coded in the layout of web pages is poorly represented by standard screen readers (Donker et al. 2002). Similar limitations apply to refreshable Braille displays, which in addition have a limited character capacity and require the user to be proficient at reading Braille.
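As an editorial sketch of the "alt" descriptors mentioned above (not taken from the chapter), the following uses Python's standard html.parser module to flag img elements that carry no alt attribute at all; an explicitly empty alt="" is left alone, since it conventionally marks a purely decorative image. The sample markup and file names are hypothetical.

```python
from html.parser import HTMLParser

class AltAuditor(HTMLParser):
    """Collect the src of <img> elements that have no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_dict = dict(attrs)
            # alt="" is a deliberate signal for decorative images,
            # so only a completely absent alt attribute is flagged.
            if "alt" not in attr_dict:
                self.missing.append(attr_dict.get("src", "<no src>"))

page = ('<p>Logo: <img src="logo.png"> and '
        '<img src="chart.png" alt="Sales by year"></p>')
auditor = AltAuditor()
auditor.feed(page)
print(auditor.missing)   # ['logo.png']
```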
In contrast, most approaches suggested to facilitate the access of individuals with low vision to graphic interfaces revolve around magnification of the graphical elements. This certainly reduces the functional demand on the user, in terms of visual acuity, as each of the features of the graphical elements will be assigned larger extents in terms of visual angle. On the other hand, for a given finite display surface, magnification of graphical elements must establish a trade-off with the ability to present the complete interface to the user at once. In addition, the limitations in visual field of users with conditions such as glaucoma or AMD may further constrain the usefulness of indiscriminate magnification, forcing the users to inspect different portions of the interface in a sequential manner and adding to the cognitive load involved in interacting with the computer. Jacko et al. (2001) studied the interaction styles of AMD patients and concluded that a solution based on simply enlarging the graphic elements of the interface fails to recognize the different ways in which the visual capability of these users is affected. These authors suggested that multiple factors, such as size of graphic elements, background color, and number and arrangement of the graphical elements, must be considered comprehensively in proposing interface enhancements for AMD users.

Another school of thought, which has been proposed by Peli et al., focuses primarily on the contrast sensitivity losses suffered by many individuals with low vision. As such, these proponents model the visual deficiency as a generic low-pass spatial filter that is implicitly operating in the eye of the low-vision computer user. The associated display enhancement consists of the implementation of a generic high-pass spatial filtering process, termed "pre-emphasis", on the images to be displayed (Peli et al. 1986).

It should be noted that the "accessibility options" of contemporary operating systems for personal computers address both of the trends discussed above, by including screen magnification facilities and the ability to select high-contrast representations for the operating system windows and text messages.
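The following sketch illustrates the general idea of the high-pass "pre-emphasis" described above; it is an editorial example rather than Peli et al.'s published algorithm, and it assumes a grayscale image held in a NumPy array with values in [0, 1].

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def preemphasize(image, sigma=2.0, gain=1.5):
    """Boost the high spatial frequencies of a grayscale image.

    The Gaussian blur stands in for the low-pass filtering attributed to
    the low-vision eye; the difference between the image and its blur is
    the high-frequency detail, which is amplified and added back.
    """
    low = gaussian_filter(image, sigma=sigma)    # low-pass component
    high = image - low                           # high-frequency detail
    return np.clip(low + gain * high, 0.0, 1.0)  # keep a displayable range
```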
5 Emerging Solutions

Recently, the availability of new technologies has brought about new approaches that seek to facilitate computer access for individuals with low vision. For example, Alonso et al. (2005, 2006) have proposed a new form of pre-compensation of images to be displayed by the computer, which is customized for each user. In this case, the distortion that the image will experience in the eye of the user is not considered generic. Instead, it is empirically measured using a wavefront analyzer and is characterized mathematically as a wavefront aberration function, which is, in turn, used to implement in the computer the opposite transformation on the image, even before it is displayed to the user. The goal sought is that the effect of this software pre-compensation and the intrinsic distortion due to the imperfect refraction in the user's eye nearly cancel, to yield (ideally) an undistorted retinal image, as in Fig. 2.
Fig. 2 Software pre-compensation process proposed by Alonso et al. H^-1(x,y) represents the application of a transformation that is opposite to the distortion in the user's eye
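To make the idea in Fig. 2 concrete, here is a minimal frequency-domain sketch of pre-compensation; it is not the implementation of Alonso et al., and it assumes that the eye's distortion can be approximated as convolution with a known point-spread function (psf) derived from the measured wavefront aberration.

```python
import numpy as np

def precompensate(image, psf, eps=1e-2):
    """Toy regularized inverse filtering for display pre-compensation.

    image : 2-D grayscale array with values in [0, 1]
    psf   : 2-D array of the same shape modelling the eye's blur
    eps   : regularization that limits amplification of frequencies
            that the blur almost removes
    """
    H = np.fft.fft2(np.fft.ifftshift(psf))       # blur transfer function
    I = np.fft.fft2(image)
    H_inv = np.conj(H) / (np.abs(H) ** 2 + eps)  # Wiener-like inverse
    pre = np.real(np.fft.ifft2(I * H_inv))
    # The displayed image is chosen so that blurring it with psf returns
    # something close to the original image.
    return np.clip(pre, 0.0, 1.0)
```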
Another innovative approach made possible by technological advances is the use of a virtual retinal display (VRD) as an alternative display device for low-vision computer users. The virtual retinal display system scans modulated, low-power laser light to form bright, high-contrast and high-resolution images directly onto the retina. The narrow beam employed by the VRD may allow an individual with abnormal or damaged optical media (cornea, lens, etc.) to strategically orient his/her eye in such a way as to minimize the glare from specific scattering points (e.g., corneal scars) and so optimize image quality. Likewise, it might be possible for individuals with partial retinal damage to employ a similar alignment strategy to effectively use the remaining functional areas of their retina (Kleweno et al. 2001).
6 Conclusions

The brief summary presented here of the basic physiological processes, the expected functional capabilities and the most common afflictions of the human visual system may prove useful in trying to understand how visual impairments introduce critical barriers in the process of human–computer interaction. Similarly, the linkages presented between the physiological processes involved in visual function and the most important visual capabilities required for efficient interaction with computers may provide a guide for future exploration of mechanisms by which new technologies might be able to adapt the visual output of the computer for better assimilation by individuals whose visual system may not be fully functional. Emerging approaches such as the virtual retinal display and the customized pre-compensation of images to match the visual characteristics of each user certainly raise hopes that continuous technological improvement might bring these early research results to the realm of widespread use and day-to-day practice.
References

Alonso, M., Barreto, A., Cremades, J. G., Jacko, J., and Adjouadi, M., (2005), "Image pre-compensation to facilitate computer access for users with refractive errors", Behaviour & Information Technology, Vol. 24, No. 3, pp. 161–173, Taylor and Francis.
Alonso, M., Barreto, A., Adjouadi, M., and Jacko, J., (2006), "HOWARD: High-Order Wavefront Aberration Regularized Deconvolution for Enhancing Graphic Displays for Visually Impaired Computer Users", Lecture Notes in Computer Science, vol. LNCS 4061, pp. 1163–1170, Springer-Verlag.
Card, S. K., Moran, T. P., and Newell, A., (1983), The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, Hillsdale, New Jersey.
Dix, A., Finlay, J., Abowd, G., and Beale, R., (1998), Human-Computer Interaction, second edition. Prentice Hall Europe.
Donker, H., Klante, P., and Gorny, P., (2002), "The design of auditory interfaces for blind users", Proceedings of the Second Nordic Conference on Human-Computer Interaction, pp. 149–156.
Ginsburg, A. P., and Hendee, W. R., (1993), "Quantification of Visual Capability", in Hendee, W. R., and Wells, P. N. T. (editors), The Perception of Visual Information, Springer-Verlag, NY.
Guyton, A. C., and Hall, J. E., (1996), Textbook of Medical Physiology, 9th Edition, W.B. Saunders Company, Philadelphia, PA.
Jacko, J. A., Scott, I. U., Barreto, A. B., Bautsch, H. S., Chu, J. Y. M., and Fain, W. B., (2001), "Iconic Visual Search Strategies: A Comparison of Computer Users with AMD Versus Computer Users with Normal Vision", Proceedings of the 9th International Conference on Human-Computer Interaction, HCI International 2001, New Orleans, LA.
Jacko, J. A., Vittense, H. S., and Scott, I. U., (2002), "Perceptual Impairments and Computing Technologies", in Jacko, J. A. and Sears, A. (editors), The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications, Lawrence Erlbaum Associates, Inc., Mahwah, NJ, pp. 504–522.
Kleweno, C. P., Seibel, E. J., Viirre, E. S., Kelly, J. P., and Furness III, T. A., (2001), "The virtual retinal display as a low vision computer interface: a pilot study", Journal of Rehabilitation Research and Development, Vol. 38, No. 4, pp. 431–442.
Martini, F. H., Ober, W. C., Garrison, C. W., Welch, K., and Hutchings, R. T., (2001), Fundamentals of Anatomy & Physiology, 5th Edition, Prentice Hall, Upper Saddle River, NJ.
Peli, E., Arend, L. E., and Timberlake, G. T., (1986), "Computerized image enhancement for low vision: New technology, new possibilities", Journal of Visual Impairment & Blindness, vol. 80, pp. 849–854.
Pelli, D. G., Robson, J. G., and Wilkins, A. J., (1998), "The design of a new letter chart for measuring contrast sensitivity", Clinical Vision Science, Vol. 2, No. 3, pp. 187–199.
Cognitive and Learning Impairments
Clayton Lewis
Abstract People with cognitive disabilities are gaining in a long struggle for recognition of their right to control their lives. In the information society access to the Web is essential to this control. Cognitive barriers to this access are diverse, reflecting the complexity of human cognitive faculties. These barriers are not well managed in current accessibility practice and policy, in part because cognitive accessibility, like usability, cannot be reduced to a checklist of simple attributes. Advances in representing the meaning as well as the form of information, and in supporting configurable presentation and interaction methods, will yield progress. Increased inclusion of people with cognitive disabilities in the processes of technology development and policy making will also pay off.
1 Introduction

People with cognitive disabilities are gaining in a long struggle for recognition of their right to control their lives. Not long ago it was common for people with cognitive disabilities to be institutionalized, and restricted to segregated educational and employment programs (Braddock and Parish 2001). Due in significant part to the rise of self-advocacy organizations (Roth 1983; Dybwad and Bersani 1996), in which people with cognitive disabilities speak out in defense of their right to independence, most people with cognitive disabilities now live outside institutions and attend schools for the general public. Access to employment continues to be an issue, with low levels of employment (in common, unfortunately, with people with disabilities of all kinds: the overall employment rate in the USA is less than 56%, according to the US Office of Disability Employment Policy;1 the figure for the European Union is less than 43%,
according to the European Commission Directorate General for Economic and Social Affairs2).

In the information society, access to the Web is essential to full, independent participation. Information of all kinds, addressing such vital concerns as health, employment, and civic participation, as well as entertainment and personal enrichment, is now commonly available on the Web. For example, 10 million people in the USA seek health information online on a typical day (Fox 2006). Access to goods and services increasingly comes via the Web as well. A participant in a focus group of people with traumatic brain injury, when asked about the importance of the Web for him, said, "Well, how else would I buy my health insurance?"

1 http://www.dol.gov/odep/faqs/working.htm
2 Overview

Cognitive and learning impairments are extremely diverse, in both origin and impact. This is because the human cognitive apparatus is extremely complex and multifaceted, so that there are many different cognitive functions whose operation can be impaired, and many possible causes of such impairments. Starting on the functional side, Francik (1999) presents a classification of cognitive functions with the following major headings:

- executive function
- memory
- attention
- visual and spatial perception
- language
- emotions

In her discussion she adds to this list speed of problem solving, fluid intelligence, and crystallized intelligence, and one could also add specific functions in mathematical reasoning. Each of these broad functional categories has many aspects. For example, "memory" includes encoding of new information, as well as retrieval; delayed retrieval is a different function from immediate retrieval; recognition is different from recall; skill learning is different from learning of facts; and so on. The "language" category includes comprehension as well as production, reading as well as speech processing, and issues with vocabulary as well as syntax.

The typical operation of any of these functions can be interfered with in many ways. Chromosomal abnormalities, as in Down syndrome, injuries to the brain from external impact or from stroke, effects of aging, diseases like Alzheimer's or Parkinson's, or severe mental illness can all cause cognitive
impairments. Many people have cognitive impairments for which no cause can be identified.

Historically, much emphasis has been given to classifying cognitive disabilities, rather than cognitive functions. For example, in some jurisdictions in the USA, a "developmental disability" is a chronic impairment, originating before age 22, that results in substantial functional limitations in three or more specified areas of major life activity (Interagency Committee on Disability Research 2003); and (at one time) a "learning disability" was defined to be a specific difficulty in processing language or mathematics not associated with a lower-than-average IQ. But research has cast doubt on the meaningfulness of these classifications. For example, it has become clear that particular reading problems can occur regardless of IQ, and so separating conditions by IQ obscures what is going on (Stanovich 2005). In accessibility work these classifications contribute little if anything. It is often not known what classification a particular user belongs to, and even if it were, the variability of function within the classifications is very large. Furthermore, preoccupation with classification can contribute to a tendency to view people with disabilities as if the disabilities, and not the people, are important. What is important is recognizing that people can have difficulty in any of the many functional areas on Francik's list, and to consider how Web access can be facilitated in the presence of these difficulties. That is, the focus should be on barriers to access, not on impairments (Backenroth 2001; Roth 1983; see also Rapley 2004).

Unfortunately, demographic data, not plentiful to begin with, are organized by disability classifications, not around function. There are also methodological problems in the demographics (see, e.g., Hendershot et al. 2005). Nevertheless these data are useful in establishing that large numbers of people encounter cognitive and learning barriers, certainly enough to justify substantial attention to increasing accessibility, though the data are not useful for prioritizing attention to particular functions.

People who are unfamiliar with cognitive disabilities sometimes assume that people with these impairments cannot use computers, or could not be represented in the professional, administrative, or managerial workforce. While it is true that there are people who cannot use computers, a survey commissioned by Microsoft (n.d.) produced the estimate that 16% of working-age computer users in the USA have a cognitive or learning impairment. Data on administrative employees of the federal government of the USA in 2004 (United States Equal Employment Opportunity Commission 2005) show more than 1,300 people with developmental disabilities, a subcategory of the larger group with cognitive and learning impairments of all kinds. This figure represents 0.1% of federal employees in this employment category.

Another common misconception, one actually associated with earlier research approaches in the field, is that cognitive disabilities can be understood in terms of IQ. IQ is still often used administratively in classifying people or in determining eligibility for support programs (for critical discussion, see "Assessment and
Identification’’ in President’s Commission, 2002; see also discussion of policies in European countries in European Commission, 2002). But, consistent with the functional view presented above, it is increasingly recognized that IQ measures only some aspects of cognitive function. Thus a person with high IQ can have severe cognitive or learning impairments (Sternberg and Grigorenko 2004). It is also true that a person with low IQ can function very effectively in some areas. Research suggests that variation in IQ accounts for only about 10% of objective success in life, assessed by various criteria, with some scientists arguing that even that figure is an overestimate (Stanovich and West 2000).
3 Discussion The history of attitudes toward and treatment of people with cognitive disabilities is a sad one (Braddock and Parish 2001). As mentioned earlier, assumptions about what they could and could not do led to widespread institutionalization (for UK history see Henley 2001) and restrictions on access to education and other opportunities. Hunt (1967) recounts the struggle of a person with Down syndrome, and his family, for literacy education: in the midtwentieth century the authoritative view in England was that a person with Down syndrome could not be literate. Today, while some people with Down syndrome are not literate, most are or could be with appropriate education (Buckley, n.d.); some have completed secondary school, and some have earned postsecondary degrees, illustrating the range of functional impact of the condition. See Neubert et al. (2001) for a review of postsecondary educational programs in the USA for people with cognitive disabilities. The range of functional impact also means that while many people with cognitive impairments live completely independently, some need help with some aspects of daily life (theArc 1997; Prouty et al. 2006). As mentioned earlier, the employment rate for people with disabilities of all kinds is low in the USA and Europe. The shift in the labor market to jobs requiring higher levels of skill and education is a serious challenge. In meeting this and other challenges, a very positive development is the emergence around the world of the self-advocacy movement. Maintaining the principle, ‘‘nothing about us without us,’’ self-advocates play an active role in policy change and development, with notable success in deinstitutionalization in particular. Self-advocates are continuing to press for reforms in the treatment of benefit payments and access to employment. As mentioned earlier, the Web is a key channel of access for information, services, and participation. But people with cognitive and learning impairments are not well supported by current Web accessibility efforts. This difficulty arises in part from the nature of cognitive accessibility problems. While there have not been many studies of Web use by people with cognitive disabilities (Bohman n.d. i), there are data that indicate that the problems
encountered by many people with cognitive disabilities are, broadly, the same usability problems that affect all users, but the impact on people with cognitive disabilities is more severe (Small et al. 2005; Harrysson et al. 2004; Freeman et al. 2005). For example, all users have trouble when the back button fails to work, but a user with a cognitive impairment may have more trouble recovering from the problem. All users have trouble processing large amounts of text, but people who cannot read well have more trouble. Pirolli (2005) presents simulation results showing that a small decrease in how well cues (like link labels) are interpreted can lead to an enormous increase in time needed to search a Web site, suggesting that search problems that are bad for good readers may be terrible for poor readers. This relationship between accessibility for people with cognitive disabilities and usability for a general audience suggests a difference between cognitive accessibility and other aspects of accessibility, at least given current approaches. Current approaches embody the hope that reasonable support for accessibility for people with visual impairment, for example, can be secured by requiring design features that can easily be checked, like inclusion of text descriptions for images (W3C 1999; see also ‘‘Web Accessibility and Guidelines’’). But it has long been argued (Gould and Lewis 1985) that promoting usability requires user testing, not feature checking. Redish (2000) makes this same argument for comprehensibility, perhaps the key component of cognitive accessibility. This need for user testing makes cognitive accessibility a challenge for regulatory frameworks and guidelines with enforcement concerns, settings in which easy compliance checking is wanted. Leaving aside questions of compliance checking, there are approaches to increasing cognitive accessibility that show promise. Guidelines on presentation and organization of text, navigation, and other matters can be found in Bohman (n.d. i, ii), at the LD Web site,3 and in Hudson et al. (2005). While much of this material concentrates on text, some people with cognitive difficulties do better with non-textual presentation of information or with non-textual supplements (Seeman 2002). As Seeman argues, the different role of text for some people with cognitive disabilities is another source of mismatch between the needs of cognitive accessibility and existing accessibility approaches. The Concept Coding Framework4 and Symbered5 projects address the development of information presentations using pictorial symbols; Webwide6 is a browser and associated service that renders Web content using a symbol vocabulary.
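To make the symbol-based approach concrete, the following sketch (in TypeScript, for a browser) shows one way a page script could pair words with pictorial symbols. It is only an illustration: the symbol dictionary, image paths and CSS class are hypothetical placeholders, not the vocabulary or API of the Concept Coding Framework, Symbered or Webwide.

// Minimal sketch: augmenting page text with pictorial symbols.
// The symbol dictionary and image paths are hypothetical placeholders.
const symbolDictionary: Record<string, string> = {
  doctor: "/symbols/doctor.png",
  bus: "/symbols/bus.png",
  money: "/symbols/money.png",
};

// Wrap each dictionary word in a span that pairs the text with its symbol.
function addSymbols(paragraph: HTMLElement): void {
  const words = paragraph.textContent?.split(/\s+/) ?? [];
  paragraph.textContent = "";
  for (const word of words) {
    const key = word.toLowerCase().replace(/[^a-z]/g, "");
    const span = document.createElement("span");
    span.textContent = word + " ";
    const symbolUrl = symbolDictionary[key];
    if (symbolUrl) {
      const img = document.createElement("img");
      img.src = symbolUrl;
      img.alt = "";             // decorative: the word itself carries the meaning
      img.className = "symbol"; // styled to sit above or beside the word
      span.prepend(img);
    }
    paragraph.appendChild(span);
  }
}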
4 Future Directions Self-advocacy has an important role to play in the future development of Web accessibility. While people with visual and auditory impairments have commonly been included in technical advisory bodies on accessibility, people with cognitive impairments have not. Not least among the benefits of increased inclusion will be greater recognition that people with cognitive disabilities have a wide range of capabilities, and that there really is an audience for increased cognitive accessibility. The wide variation in capabilities, and impairments, will lead to greater emphasis on configurability in technology. In this approach, the view a user will have of a Web page will be shaped by a profile that represents his or her information presentation preferences. The Access for All schema (IMS Global Learning Consortium 2004) is an example of a big step in this direction, supporting the interests of people with disabilities of many kinds. The Fluid project7 is developing swappable user interface components to provide tailored user experiences for Web-based higher education applications; this technology will be adaptable to Web applications generally. Rather than requiring the development of different versions of Web content for different audiences, an approach with well-understood drawbacks in unequal access to up-to-date content, the configuration approach separates content from presentation, so that all users receive a view of the same underlying content. Realizing the potential of the configuration approach requires research and development in two directions. First, work is needed on how to capture effectively the many variations in cognitive function that should influence presentation, and hence should be specifiable in user profiles. To take just one example, there do not exist good characterizations of reading ‘‘level’’ that are appropriate to adults, rather than to children in school, or that characterize limitations in vocabulary, or limitations in handling complex syntax (Redish 2000). Second, the technology needed to respond to attributes in profiles is incomplete. To take just one example, again, automatically generating simpler or more complex and complete presentations of a body of content is not a well-understood problem. Progress on representing the meaning of Web content, not just the text or images in which content is expressed, will also be an important strategic development for accessibility. The Semantic Web activity8 of the World Wide Web Consortium may produce representations that can be used for this purpose; UB Access9 is a commercial venture with this goal. Another approach on which work is underway, and will continue, is the development of viewing tools that support people with cognitive and learning
impairments. Rose and Meyer (2002) describe tools that allow readers to hear a spoken presentation of text, or to see a definition of an unfamiliar word. By analogy to screen reader tools used by people with visual impairments, these tools will play an increasing role in allowing people to access Web pages that have not been configured for them.
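As an illustration of the configuration approach and of such viewing tools, the sketch below applies a simplified preference profile to a page: it adjusts spacing, swaps in author-supplied plain-language alternatives, and offers spoken playback of selected text using the speech synthesis interface found in current browsers. The profile fields and the data-simple-text attribute are assumptions for this example; they do not follow the actual AccessForAll schema or the Fluid components.

// Minimal sketch of profile-driven presentation; field names are illustrative only.
interface PresentationProfile {
  simplifiedLanguage: boolean;  // prefer plain-language alternatives where provided
  extraSpacing: boolean;        // increase line spacing
  spokenText: boolean;          // offer spoken playback of selected text
}

function applyProfile(profile: PresentationProfile): void {
  if (profile.extraSpacing) {
    document.body.style.lineHeight = "1.8";
  }
  if (profile.simplifiedLanguage) {
    // Authors supply both versions; the profile selects which one is shown.
    document.querySelectorAll<HTMLElement>("[data-simple-text]").forEach((el) => {
      el.textContent = el.dataset.simpleText ?? el.textContent;
    });
  }
  if (profile.spokenText && "speechSynthesis" in window) {
    document.addEventListener("mouseup", () => {
      const selection = window.getSelection()?.toString().trim();
      if (selection) {
        window.speechSynthesis.speak(new SpeechSynthesisUtterance(selection));
      }
    });
  }
}

Because the same underlying content is served to everyone, a profile like this only changes how it is presented, which is the central point of the configuration approach.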
5 Author’s Opinion of the Field TheArcLink, Incorporated,10 is a nonprofit organization that prepares descriptions of government support services for people with disabilities for presentation on the Web. The descriptions are edited, by a team that includes a self-advocate, to make them easier to understand. While the intended beneficiaries of this editing are people with cognitive impairment, the edited descriptions are much easier for anyone to understand. All too often, we ‘‘communicate’’ using unusual words, jargon, and complex sentence structures and text organization, and everyone suffers. Because of this relationship between problems, design work in support of people with cognitive disabilities will often bring benefits to other users, as well. Thus, a greater focus on cognitive accessibility will help the design community do a better job in its overall mission of creating systems whose operation is clear, and can be mastered without negotiating a painful learning curve. The challenges of this kind of design are different from those associated with other aspects of accessibility. Cognitive barriers can involve a wide variety of cognitive functions, in ways that require changes to the content of presentations, as well as their organization and format, to deal with. The future developments outlined above that will help with these problems will require more focus on cognitive aspects of Web use, including more data gathering on Web use by people with cognitive disabilities, technology development in flexible, customizable user interfaces for presentation and interaction, and advances in representing and transforming meaning rather than form. They will also benefit from increased inclusion of people with cognitive disabilities in the development of technology and of associated policy, including guidelines and regulations, and an approach to policy that recognizes that effective communication, not inclusion of specified features, must be the benchmark for accessibility. Acknowledgments Preparation of this chapter was supported by Rehabilitation Engineering Research Center for Advancing Cognitive Technologies and the Coleman Institute for Cognitive Disabilities.
References Backenroth, G. A. M. (2001) People with disabilities and the changing labor market: Some challenges for counseling practice and research on workplace counseling. International Journal for the Advancement of Counselling, 23, pp. 21 30. Bohman, P. (n.d. i) Cognitive Disabilities Part 1: We still know too little, and we do even less. WebAIM, http://www.webaim.org/articles/cognitive/cognitive_too_little Bohman, P. (n.d. ii) Cognitive Disabilities Part 2: Conceptualizing design considerations. WebAIM, http://www.webaim.org/articles/cognitive/conceptualize Braddock, D. and Parish, S. (2001) An institutional history of disability. In G.L Albrecht, K. D. Seelman, and M. Bury (Eds.) Handbook of Disability Studies. Thousand Oaks, CA: Sage, pp. 11 68. Buckley, S. J. (n.d.) Reading and writing for individuals with Down syndrome An over view. Down Syndrome Information Network, http://information.downsed.org/library/ dsii/07/01/ Dybwad, G., and Bersani, H., Eds. (1996) New Voices: Self advocacy by People with Dis abilities. Cambridge, MA: Brookline Books. European Commission Directorate General for Employment and Social Affairs (2002) Defi nitions of Disability in Europe: A comparative Analysis, ec.europa.eu/employment_ social/index/complete_report_en.pdf Fox, S. (2006) Online health search 2006. Pew Internet & American Life Project, http://www. pewinternet.org/PPF/r/190/report_display.asp Francik, E. (1999) Telecommunications Problems and Design Strategies for People with Cognitive Disabilities. http://www.wid.org/archives/telecom/ Freeman, E., Clare, L., Savitch, N., Royan, L., Litherland, R., and Lindsay, M. (2005) Improving website accessibility for people with early stage dementia: A preliminary investigation. Aging & Mental Health, 9, pp. 442 448. Gould, J. D., and Lewis, C. (1985) Designing for usability: key principles and what designers think. Communication ACM 28, 3, pp. 300 311. Harrysson, B., Svensk, A., and Johansson, G. I. (2004). How People with Developmental Disabilities Navigate the Internet. British Journal of Special Education, 31, pp. 138 142. Hendershot, G. E., Larson, S. A., Lakin, K. C., and Doljanac, R. (2005) Problems in Defining Mental Retardation and Developmental Disability: Using the National Health Interview Survey. DD Data Brief, Volume 7, http://rtc.umn.edu/misc/pubcount.asp? publicationid=131 Henley, C. A. (2001) Good Intentions Unpredictable Consequences. Disability & Society, 16, pp. 933 947. Hudson, R., Weakley, R., and Firminger, P. (2005) An Accessibility Frontier: Cognitive disabilities and learning difficulties. Originally presented at OZeWAI 2004 Conference, La Trobe University, Melbourne, Australia 2 December 2004. http://www.usability.com. au/resources/cognitive.cfm Hunt, N. (1967) The World of Nigel Hunt. New York: Garrett Publications. IMS Global Learning Consortium (2004) IMS AccessForAll Meta data Overview http:// www.imsglobal.org/accessibility/accmdv1p0/imsaccmd_oviewv1p0.html Interagency Committee on Disability Research (2003) Federal Statutory Definitions of Dis ability, http://www.icdr.us/documents/definitions.htm Microsoft (n.d.) The Market for Accessible Technology The Wide Range of Abilities and Its Impact on Computer Use. Online at http://www.microsoft.com/enable/research/phase1.aspx Neubert, D. A., Moon, M. S., Grigal, M., and Redd, V. (2001) Post secondary educational practices for individuals with mental retardation and other significant disabilities: A review of the literature. Journal of Vocational Rehabilitation, 16, pp. 155 169. 
Pirolli, P. (2005 ) Rational Analyses of Information Foraging on the Web. Cognitive Science, 29, pp. 343 373.
President’s Commission on Excellence in Special Education (2002) A New Era: Revitalizing Special Education for Children and Their Families. http://www.ed.gov/inits/ commissionsboards/whspecialeducation/reports/index.html Prouty, R.W., Smith, G., and Lakin, K. C. (2006) Residential Services for Persons with Developmental Disabilities: Status and Trends Through 2005. Research and Training Center on Community Living, http://rtc.umn.edu/risp05 Rapley, M. (2004) The Social Construction of Intellectual Disability. Cambridge: Cambridge University Press. Redish, J. (2000) Readability formulas have even more limitations than Klare discusses. ACM Journal of Computer Documentation, 24, pp. 132 137. Rose, D. H. and Meyer, A. (2002) Teaching Every Student in the Digital Age: Universal Design for Learning. Alexandria, VA: ASCD. Roth, William. ‘‘Handicap as a Social Construct.’’ Society 20, no. 3 (March/April 1983): 56 61. Seeman, L. (2002) Inclusion Of Cognitive Disabilities in the Web Accessibility Movement. WWW2002, Eleventh International World Wide Web Conference (Honolulu, Hawaii, USA, 7 11 May 2002) . http://www2002.org/CDROM/alternate/689/ Small, J., Schallau, P., Brown, K., and Appleyard, R. (2005). Web accessibility for people with cognitive disabilities. In CHI ’05 Extended Abstracts on Human Factors in Computing Systems (Portland, OR, USA, April 02 07, 2005). CHI ’05. ACM Press, New York, NY, pp. 1793 1796. Stanovich, K. E. (2005). The future of a mistake: Will discrepancy measurement continue to make the learning disabilities field a pseudoscience? Learning Disability Quarterly, 28, pp. 103 106. Stanovich, K. E., and West, R. F. (2000). Individual differences in reasoning: Implications for the rationality debate? Behavioral and Brain Sciences, 23, pp. 645 665. Sternberg, R. J., and Grigorenko, E. L. (2004). Learning disabilities, giftedness, and gifted/ LD. In T. M. Newman and R. J. Sternberg (Eds.), Students with Both Gifts and Learning Disabilities. New York: Kluwer pp. 17 31 theARC (1997) Community living. http://www. thearc.org/faqs/comliv.html United States Equal Employment Opportunity Commission (2005) Annual Report on the Federal Work Force: Fiscal Year 2004. Table 6. http://www.eeoc.gov/federal/fsp2004/ aed/table6.html W3C (1999) Web Content Accessibility Guidelines 1.0, http://www.w3.org/TR/WAI WEBCONTENT/
Hearing Impairments Anna Cavender and Richard E. Ladner
Abstract For many people with hearing impairments, the degree of hearing loss is only a small aspect of their disability and does not necessarily determine the types of accessibility solutions or accommodations that may be required. For some people, the ability to adjust the audio volume may be sufficient. For others, translation to a signed language may be more appropriate. For still others, access to text alternatives may be the best solution. Because of these differences, it is important for researchers in Web accessibility to understand that people with hearing impairments may have very different cultural-linguistic traditions and personal backgrounds.
Keywords Hearing impaired · Deaf · Hard of hearing · Web accessibility · American Sign Language · Sign language recognition · Sign language avatars · Captioning
1 Introduction People with hearing impairments form a disability group very different than other disability groups. Because a large segment uses signed languages which are distinct from the spoken language used around them, the accessibility needs of this group involve language translation. In this section, we provide basic information about those who are hearing impaired and the distinct subgroups of deaf and hard of hearing.
1.1 Models of Hearing Impairment The hearing impaired disability group is very diverse and not necessarily of one mind. It helps to think of the three models of disability: (i) medical model, (ii) rehabilitation model, and (iii) social model (Oliver 1990).
In the medical model a person who is hearing impaired is thought to be broken, in need of repair to restore hearing. The ideal is a complete cure, but any step toward better hearing is an achievement. Hearing aids and cochlear implants may partially restore hearing, but are not considered to be cures for deafness. In the rehabilitation model a person who is hearing impaired is viewed to be in need of assistance to carry on a more independent life. In this model, sign language interpreters or real-time captioning are provided. Closed captioned television has become a standard in many parts of the world. There is a focus on lip reading and speech training to help the person interact with hearing people without assistance. In the social model a hearing impaired person is viewed to be part of a community or culture. The group of hearing impaired people who share a common language, such as American Sign Language (ASL), Japanese Sign Language (JSL), or British Sign Language (BSL), appear to be a distinct subculture with their own language and customs. In the United States, members of this group call themselves ‘‘Deaf ’’ with a capital ‘‘D’’ and a certain degree of pride. Indeed, this group is uncomfortable with the term ‘‘hearing impaired’’ as it appears to accentuate something that is lacking and is not a term that they chose for themselves. In a similar way, another group in the United States prefers to call themselves ‘‘Black’’, and rejects terms that were chosen by others. In the United States, among those who are hearing impaired there is a large group who prefer to be called ‘‘hard of hearing’’ rather than deaf or hearing impaired. Again, the term hearing impaired is rejected because it was chosen by others. They recognize that the term ‘‘deaf ’’ does not fit them well because they primarily rely on their residual hearing and speech, rather than on sign language. An individual hearing impaired person may choose at different times to be viewed within any of these models. Those of us who are working in the accessibility field must recognize that we are essentially viewing hearing impaired people in the rehabilitation model. Nonetheless, it is important to recognize that there are other views that must be respected. An elderly person who has lost her hearing is likely not to know sign language and does not identify with Deaf Culture. A hearing impaired young man may have been brought up with hearing parents who tried everything to make him as ‘‘hearing’’ as possible, giving him a cochlear implant and extensive lip reading and speech training at an early age. Later in life, the young man may disable his implant, refuse to speak, and instead choose to be Deaf with a capital ‘‘D’’. He rejects the medical model and the part of the rehabilitation model that tries to define him as hearing. He never felt fully included in the hearing world. However, it is likely that he accepts the part of the rehabilitation model that supports sign language interpreting and captioning, as neither of these has a focus on correcting his hearing. From a strictly audiological point of view there are several ways to quantify hearing loss. The most common metric is the degree of loss in decibels (dB) from
mild loss (25–40 dB) to profound loss (90 dB or greater). There is also a distinction between pre- and post-lingual deafness, meaning the deafness occurred before spoken language acquisition or after, respectively. With postlingual deafness, speech training is much easier and often successful while with pre-lingual deafness, speech training is much more difficult and often unsuccessful. In either case, excellence at lip reading is not common. Interestingly, a person’s identification as either deaf or hard of hearing is not a function of the degree and onset of hearing loss, rather, it is a personal choice of what the person feels comfortable calling him- or herself and with which group the person most identifies.
1.2 Demographics on Hearing Impairments The World Health Organization estimated that in 2005 the number of people in the world with hearing impairments was 278 million, or about 4.3% of the world’s population (WHO 2005). According to the National Center for Health Statistics, in 1994, there were 20,295,000 (8.6%) hearing impaired people in the United States and about 0.5 million of these could not hear or understand speech (Holt et al. 1994). There appear to be no accurate statistics on the number of people in the United States who are fluent in ASL. It would appear that a large majority of those considered hearing impaired are not part of the Deaf Community and do not know sign language. A significant segment of this group are elderly people who have lost some or all of their hearing.
1.3 Legal Perspective In the United States, the Individuals with Disabilities Education Improvement Act of 2004 (IDEA) and the Americans with Disabilities Act of 1990 (ADA) benefit all disabled persons including those who are hearing impaired. There are many regulations and laws specifically related to deaf, hard of hearing, and hearing impaired people. The Television Decoder Circuitry Act of 1990 requires that all televisions, 13 inches or larger, must have built-in closed caption decoders. This law does not apply to computer equipment that is capable of delivering video or television programming. There are federal regulations that require phone companies to provide TTYs to their deaf customers free of charge. More recently those regulations have been expanded to require free access to Video Relay Services. Most states have laws that give a deaf person the right to have a sign language interpreter in certain situations such as legal proceedings. In many countries there are similar laws and regulations to those found in the United States regarding persons with hearing impairments.
2 Deaf People and Sign Language In this section we focus on the subgroup of people with hearing impairments who identify themselves as deaf. This group uses signed languages and have a rich and interesting history and culture (Ladd 2003; Lane 1984; Padden and Humphries 2005).
2.1 Sign Language In the 1960s linguists began the study of signed languages in earnest. They determined that signed languages have essentially all the properties of spoken languages, except that the hands, arms, body, and facial expressions are used instead of speech. Up until that time, it was generally believed that signed languages were just a system of gestures devoid of the linguistic features of spoken languages. Although a sign language may be influenced by the spoken language found in the same region, it is distinct, with its own grammar suitable for the efficient use of the body and eyes, and not the vocal/aural system. Probably the most studied signed language is ASL, with a large body of literature. Individual signs in ASL are composed of hand shapes, location, and motions. In addition, subtle shifts of body positions and facial expressions can also contain information. An important grammatical component of ASL that does not occur in spoken languages is the use of classifiers. A classifier is a specific hand shape that can represent a particular person or object. The classifier is then put into motion in an iconic way to make a sentence or part of a sentence. For example, one can say ‘‘My car raced down the street, careened to the left, then to the right, then turned over’’ by setting up the hand shape classifier for ‘‘vehicle’’ as my car, then putting the classifier in motion, showing spatially what the sentence says. In fact, the ASL version would contain even more information about the timing and severity of the motions. There is a grammatically correct way to use classifiers; the description of a rollover is not simply iconic gesturing. Because of the complexity of the ASL grammar and its essentially infinite way to modulate signs to change their meaning, there is no universally accepted written form of ASL or any other sign language (BakerShenk and Cokely 1991).
2.2 Historical Perspective – Education
Schools for the deaf have played a pivotal role in the history of the deaf (Lane 1984). Perhaps the first significant such school was founded in Paris in the middle of the eighteenth century by Abbé Charles Michel de l’Épée. Although there was likely a signing community in Paris at the time, the bringing together of many deaf signers and the natural human propensity for language allowed
the French Sign Language to flourish at the school. In the early nineteenth century Thomas Hopkins Gallaudet, at the behest of the father of a deaf child, was sent to Europe from the United States to research what was known about educating deaf children. He brought back a French educator of the deaf, Laurent Clerc, who was also deaf. Gallaudet formed what is now called the American School for the Deaf in West Hartford, Connecticut with Clerc as its first teacher. The American School and many others that were founded in the United States accepted sign language both in and out of the classroom. At the time that the American School was founded, schools for the deaf in much of Europe had adopted a very different philosophy, stressing the oral method that promoted lip reading and speech, to the exclusion of sign language. The Congress on the Education of the Deaf held in Milan in 1880 passed a resolution that essentially stated that the oral method was superior to any other method and that sign language should be banned from education of the deaf. Only two countries voted against the resolution, the United States and Great Britain. As a result of the oral movement, many schools for the deaf, even in the United States, were converted to oral schools. In spite of this effort to banish sign language, it did not die. Students would sign with each other in bathrooms and at night when they could not be observed. When they left school as adults they would congregate together at deaf clubs and at social events. Naturally, there were some students who were ‘‘oral successes’’ who never joined the Deaf Community after leaving school. Others who mastered lip reading and speech to some degree would join the Deaf Community for recreation and emotional support, but would join the hearing world for work and other needs. In the United States, and in most parts of the world, there is now a recognition that oralism alone is not a satisfactory solution for deaf education. Educational philosophies such as ‘‘Total Communication’’ encourage the development of oral skills at the same time allowing students and teachers to use sign language in the educational process. The ‘‘Simultaneous Method’’ encourages teachers to both sign and speak at the same time. Some schools promote teaching in ‘‘Signed English’’ where students are taught to speak with their hands in English language structure, borrowing signs from ASL. Some schools for the deaf offer ‘‘Bilingual Bicultural’’ education where ASL is promoted as a first language and English is taught as a second language. In spite of the oral movement and numerous educational approaches, deaf people tend to eventually learn their indigenous sign language and choose to socialize with each other. When outside the Deaf Community, deaf people may choose to use the oral skills they have gained in school or abandon those skills altogether, instead relying on writing and sign language interpreters. Regardless of what educational philosophy a deaf person is exposed to, it is not uncommon for an individual deaf person to struggle with the indigenous spoken language in both spoken and written forms. For example, many deaf people educated in the United States have difficulties with English. Hearing people learn their native spoken language without actually being taught. Language comes naturally. They may have difficulty with the written form of the
language, but not in speaking it. Those deaf people who have deaf parents (which is less than 10%) also learn their signed languages naturally. However, over 90% of deaf children have hearing parents who do not know any signed language. Many of these children are not exposed to any language in a natural way during those early critical years of language acquisition. Oral training is not really a substitute for almost effortless natural language acquisition. This lack of early exposure to any language may be the reason so many deaf people have difficulty with written language.
2.3 Historical Perspective – Technology
There is another driving force in deaf history: technology. From primitive hearing aids in the shape of horns to modern high-tech hearing aids to cochlear implants, there has been a desire to improve hearing. Modern cochlear implants can improve hearing considerably and can be beneficial for many, but they should not be considered a cure. The cochlear implant industry is growing at a rapid pace, especially since 2000 when the Food and Drug Administration (FDA) lowered the age to 12 months for implantation. There is concern within the Deaf Community that a new oral movement will start again, where implanted children are not allowed an opportunity to learn sign language because of the fear by some that oral skills will suffer from such exposure. While medical advances have affected the Deaf Community, so too have advances in entertainment and communication technology. The Deaf Community was sad to see the demise of the silent movies in the late 1920s, but closed captioned television and subtitled movies introduced in the 1970s have opened television and reopened movies to deaf audiences. Until the 1960s the telephone was inaccessible. The invention of the acoustic modem in 1964 by a deaf physicist, Robert Weitbrecht, allowed surplused Western Union Teletypewriters (TTY) to communicate with each other over phone lines. Modern portable TTYs became very popular in the 1980s. TTY relay services flourished in the 1990s, allowing deaf people to communicate with hearing people through intermediaries who voice what is typed and type what is spoken. As the popularity of e-mail, instant messaging, and text messaging has grown in the general population, the Deaf Community has rapidly adopted this ubiquitous technology, making TTYs essentially obsolete. The fastest growing new technology is the Internet-based video phone. Video phones allow deaf people to use sign language instead of text to communicate. This allows for a more natural conversation than can be achieved through a text approach. Video relay services, similar to TTY relay services, are also growing in popularity. The use of vlogs (video web logs) is a relatively recent phenomenon within the Deaf Community that allows for blogging in sign language. Vlogs are so popular that even Robert Davila, the president of Gallaudet,
maintains a weekly vlog (Davila 2007). In Europe, Japan, and other countries with 3G networks, video phone calls can be made from cell phones. The growth and popularity of the Web have further enabled deaf and hard of hearing people to participate mostly on an equal basis with hearing people. Although there is a growing amount of multimedia on the Web, most information on the Web is still visually oriented. Whether or not we are in the ‘‘silent movie era’’ of the Web is yet to be seen. If so, and audio and multimedia become dominant on the Web, then there may be trouble ahead.
3 Current Web Access Technology Research on Web accessibility for deaf people and people with hearing impairments has focused, and will likely continue to focus, on providing alternative or augmented visual information for inaccessible auditory information. This visual information can take the form of captions, transcripts, or sign language synthesis.
3.1 Embedded Video and Captioning Captions provide accessible text versions of video and audio in real time. While this access is essential for people with hearing impairments, it also benefits people who do not have speakers, people in noisy places, and people in noise-minimizing environments such as libraries and cubicle offices. In the case of vlogs discussed in Section 2.3, captions or equivalent text transcripts can ensure accessibility for people who do not know the sign language. Captioning provides an alternative channel of information that may make content more understandable for people with learning disabilities and people learning a new language. Also, adding text to video and audio content makes it more searchable and indexable, which allows more people to discover and access those materials. Common Web accessibility guidelines recommend that captions be both equivalent to the audio and synchronized with the audio. At a minimum, an equivalent transcript of the audio should be provided, even if it cannot be synchronized. Transcripts can also be useful to people who do not have the required video or audio player or who do not want to download the entire video or audio file. Either way, the captioning should be readily accessible through an easy to find link and/or instructions on how to enable the captioning. There are two different types of captions: closed and open. Closed captions give the user the option to display or hide the captions and require a compatible decoder to process and display the caption data. The decoder determines the way in which closed captions are displayed; typically they appear as white text on black background toward the bottom of the screen. Open captions are
incorporated into the video itself and cannot be hidden. But, because they were designed with the video, they can be placed in visually convenient locations on the screen and with appropriate colors and backgrounds. Designing open-captioned video often requires expensive and time-consuming video editing tools. The more common approach is to utilize closed captioning functionality within multimedia players such as Microsoft’s Windows Media Player, Apple’s QuickTime, RealNetworks’ RealPlayer, and Macromedia’s Flash. Each of these media players handles captions differently. Detailed technical instructions for including captions in Web videos can be found on the WebAIM website (WebAIM 2007). Websites that allow users to upload and share personal videos are becoming more and more popular. YouTube and Google Video (both owned by Google) are two examples. Google Video supports closed captioning by allowing users to upload a file containing timestamped text to be played back with the video (Google Video 2007). Several video editing software packages contain features for adding captions to videos. MAGPie, an authoring tool for assisting Web designers in creating captions and audio transcriptions, was developed by the National Center for Access Media (NCAM) at WGBH (MAGPie 2007).
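As an illustration of how timestamped caption text is synchronized with playback, the following TypeScript sketch defines a simplified cue list and a lookup that returns the cue to show at the current playback time. The cue format is a stand-in for real caption file formats, not the format used by any particular player or service.

// Minimal sketch of closed-caption playback: timestamped cues plus a lookup
// that returns the cue to display at the current playback time.
interface CaptionCue {
  start: number; // seconds
  end: number;   // seconds
  text: string;
}

const cues: CaptionCue[] = [
  { start: 0.0, end: 2.5, text: "Welcome to the lecture." },
  { start: 2.5, end: 6.0, text: "Today we discuss Web accessibility." },
];

function activeCue(cueList: CaptionCue[], currentTime: number): string {
  const cue = cueList.find((c) => currentTime >= c.start && currentTime < c.end);
  return cue ? cue.text : "";
}

// A player would call this on each time update and render the result into a
// caption area that the user can show or hide (closed captioning).
function renderCaption(captionArea: HTMLElement, currentTime: number): void {
  captionArea.textContent = activeCue(cues, currentTime);
}

A closed-caption decoder works on this same principle, with the user controlling whether the caption area is displayed at all.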
3.2 Captioning Services Several closed captioning and real-time transcription services such as Automatic Sync, Viable Technologies, and the Media Access Group at WGBH have been established to provide Web video accessibility service. Automatic Sync offers an automated Web-based service for captioning that parses text from voice into appropriate captions, synchronizes them with the audio, and formats the output for Webcasts, DVDs, and/or videotapes (Automatic Sync Technologies 2007). For on-line classrooms, Viable Technologies offers a captioning service using remote voice operators (Viable Technology 2007). The Media Access Group at WGBH can supply closed captions for media players when provided a television video with existing closed captions. They can also provide real-time captions for live Web events and Web conferencing (Media Access Group at WGBH 2007). These are just a few examples of services that can help Web designers to more easily ensure accessibility of both static and streaming video.
3.3 Access Using Sign Language As computer vision and computer graphics techniques have improved in recent years, progress has been made both in sign language recognition and sign language synthesis with the pursuit of automatically translating between written or spoken languages (such as English) and signed languages (such as ASL).
Sign language recognition uses computer vision techniques to convert sign language videos into written or spoken language (Ong and Ranganath 2005). Beyond video, sensors may be placed on the arms, hands, and/or face of the signer, or data gloves may be used to assist in tracking the movements. Even the best recognition systems still use very limited word sets (under 100 words) to increase the probability that the movements detected are correctly matched with a word or sentence. Sign language synthesis systems, or signing avatars, use complex translation systems to convert written or spoken languages to video sign language using human-like graphics. Some projects focus on translation challenges by attempting to formalize a grammar for sign languages (Zhao et al. 2000). Other projects (Toro et al. 2001; Vcom3D 2007) focus on graphics challenges by allowing the user to explicitly select hand shapes, hand positions, and whole words. The TESSA system (Cox et al. 2002) avoids some of the grammatical challenges by constraining the language to a specific domain (common phrases used in the post office) to aid in the communication between a deaf person and a clerk at a post office. Here, voice recognition software matches sentences spoken by the clerk to a limited set of pre-defined phrases that the graphics avatar is then capable of signing. Translation is a difficult task overall for three reasons. First, computer vision techniques continue to struggle with real-world situations such as natural lighting and human skin. Second, producing realistic sign language with synthesized human graphics is still an open problem. And third, automatic translation between languages is already difficult in general, perhaps more so for sign language due to its lack of a written form. For example, computationally expressing the modulation techniques and classifiers discussed in Section 2 is problematic. While sign language recognition may one day contribute to better communication between hearing and hearing impaired people, avatars may be more applicable to Web accessibility. Avatars could help create more accessible Web pages for people who consider ASL their primary language (see Section 2.2).
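The constrained-domain strategy can be illustrated with a small sketch in the spirit of TESSA, though not its actual implementation: recognized speech is matched against a fixed set of phrases, and the best match selects a pre-built sign animation. The phrase list, the token-overlap score and the animation identifiers are all hypothetical.

// Sketch of constrained-domain matching: map recognized speech to the closest
// phrase in a small pre-defined set, then play that phrase's sign sequence.
const signedPhrases = [
  { phrase: "how much does it cost", animationId: "sign_cost" },
  { phrase: "please sign here", animationId: "sign_signature" },
  { phrase: "first class or second class", animationId: "sign_postage_class" },
];

function tokenOverlap(a: string, b: string): number {
  const setA = new Set(a.toLowerCase().split(/\s+/));
  const setB = new Set(b.toLowerCase().split(/\s+/));
  let shared = 0;
  setA.forEach((t) => { if (setB.has(t)) shared++; });
  return shared / Math.max(setA.size, setB.size);
}

// Returns the animation to play, or null if nothing in the domain matches well.
function matchRecognizedSpeech(recognized: string): string | null {
  let best = { score: 0, animationId: null as string | null };
  for (const entry of signedPhrases) {
    const score = tokenOverlap(recognized, entry.phrase);
    if (score > best.score) best = { score, animationId: entry.animationId };
  }
  return best.score >= 0.5 ? best.animationId : null;
}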
4 Future Research Directions With extensive training on a single speaker, voice recognition can be very effective, and such systems are used for real-time, automatic generation of captions. For example, many real-time television broadcasts use trained operators who repeat words voiced by actors and news and sports reporters into a voice recognition system. Voice recognition is not perfect. For example, it lacks punctuation, has poor accuracy, and is less reliable for multiple speakers. Highly accurate, speaker-independent voice recognition is still an open problem. Thus, increasing the accuracy and feasibility of voice recognition technology for many different situations is an important area for future research.
Given the imperfections of voice recognition, better interfaces for allowing voice captioners to quickly modify and correct the output is a high-need research area. Similarly, better interfaces for designers to choose good placement and timing for both real-time and non-real-time captions would also be interesting and useful future research. Translation between written/spoken language and signed language continues to be a hot topic in research as working models still need considerable improvement. Current signing avatars are improving, but are a long way from being satisfactory. A different type of translation problem, yet equally challenging, would be converting linguistically complex language into a more universally accessible form. Such a system would benefit anyone whose primary language is not English (or the language being translated). For example, translation of legal documents to a widely understandable form would be a boon for everyone. As the area of human computer interaction (HCI) incorporates the concept of universal design, there is a growing need to include persons with disabilities on research teams, not just as test subjects. This is complicated somewhat with the deaf and hard of hearing group because of the potential language barrier. Nonetheless, with sign language interpreters and real-time captioning such participation is possible and will enhance the research.
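One simple way such a correction interface might direct the captioner's attention is sketched below: words whose recognition confidence falls below a threshold are flagged for review before the caption is committed. The word-and-confidence structure is a hypothetical stand-in for real recognizer output, not any particular system's API.

// Sketch: flag low-confidence words so a human captioner can fix them quickly.
interface RecognizedWord {
  text: string;
  confidence: number; // 0..1, as reported by the recognizer
}

function flagForReview(words: RecognizedWord[], threshold = 0.8): string {
  // Low-confidence words are bracketed so they stand out in the draft caption.
  return words
    .map((w) => (w.confidence < threshold ? `[${w.text}?]` : w.text))
    .join(" ");
}

// Example: produces "meet [hear?] at noon", drawing the eye to the likely error.
const draft = flagForReview([
  { text: "meet", confidence: 0.97 },
  { text: "hear", confidence: 0.42 },
  { text: "at", confidence: 0.99 },
  { text: "noon", confidence: 0.95 },
]);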
5 Authors’ Opinion of the Field Communication technology developments in the last 40 years have opened up the world to the once sheltered Deaf Community. The technology has enabled deaf people to communicate remotely with each other, thereby keeping the community alive and vibrant. At the same time, it has enabled individual deaf people to communicate with more ease with hearing individuals. Medical advances are pushing the greater society to believe that deafness has been or soon will be cured. Hence, there is a constant or even increasing tension between those who love and cherish the Deaf Community and its unique language and those who believe that it is an anachronism that will soon disappear because of medical advances in technology. Unfortunately, most of those in the latter camp have never bothered to learn sign language and get to know deaf people on their terms. Those persons with hearing impairments, especially those in the aging population, have benefited from advances in technology that improve hearing and support texting of any kind. Automated captioning is improving, while language translation between spoken and signed languages is far from ideal. Including captions and language translation for Web accessibility is still basically up to the Web designers. The amount of multimedia on the Web is growing rapidly. Currently, information on the Web is still visually oriented, but we may be in the ‘‘silent movie era’’ of the
Web. Should multimedia become dominant on the Web, automated ways to achieve accessibility for people with hearing impairments become an imperative. Acknowledgments Thanks to Rob Roth for reading a preliminary version of this chapter and providing valuable suggestions.
References Automatic Sync Technologies. Captionsync. In Automatic Sync Technologies, 2007. http:// www.automaticsync.com/ Baker Shenk, C. and Cokely, D., American Sign Language: A Teacher’s Resource Text on Grammar and Culture. Gallaudet University Press, 1991. Cox, S., Lincoln, M., Tryggvason, J., Nakisa, M., Wells, M., Tutt, M., and Abbott, S., Tessa, a system to aid communication with deaf people. In ACM SIGACCESS Accessibility and Computing, pages 205 212, 2002. Davila, R. R., Bob’s vlogs. In Gallaudet University Web Site, 2007. http://www.gallaudet.edu/ x3603.xml Google Video. Google video help center for captions. In Google, 2007. http://video.google. com/support/bin/answer.py?answer=26577 Holt, J., Hotto, S., and Cole, K., Demographic aspects of hearing impairment: Question and answers, third edition. In Center for Assessment and Demographic Studies, Gallaudet University, 1994. http://gri.gallaudet.edu/Demographics/factsheet.html Ladd, P., Understanding Deaf Culture: In Search of Deafhood. Multilingual Matters, Clevedon, 2003. Lane, H., When the Mind Hears: A History of the Deaf. Random House, 1984. MAGPie. Media access generator. In National Center for Access Media (NCAM) at WGBH, 2007. http://ncam.wgbh.org/webaccess/magpie/ Media Access Group at WGBH. Media access group. In WGBH, 2007. http://main.wgbh. org/wgbh/access/access.html Oliver, M., The Politics of Disablement. Palgrave Macmillan, 1990. Ong, S. C., and Ranganath, S., Automatic sign language analysis: A survey and the future beyond lexical meaning. In IEEE Transactions on Pattern Analysis and Machine Intelli gence, pages 873 891, 2005. Padden, C. and Humphries, T., Inside Deaf Culture. Harvard University Press, 2005. Toro, J., Furst, J., Alkoby, K., Carter, R., Christopher, J., Craft ,B., Davidson, M. J., Hinkle, D., Lancaster, G., Morris, A., McDonald, J., Sedgwick, E., and Wolfe, R., An improved graphical environment for transcription and display of American Sign Language. In Information 4, pages 533 539, 2001. Vcom3D. Vcommunicator signing avatar. In Vcom3D, 2007. http://www.vcom3d.com/ Viable Technology. Remote realtime transcription. In Viable Technologies, Inc., 2007. http:// www.viabletechnologies.com/ WebAIM. Web accessibility in mind. In Center for Persons with Disabilities, Utah State University, 2007. http://www.webaim.org/techniques/captions/ WHO. Deafness and hearing impairment. In World Health Organization, 2005. http://www. who.int/mediacentre/factsheets/fs300/en/index.html Zhao, L., Kipper, K., Schuler, W., Vogler, C., Badler, N., and Palmer, M., A machine translation system from English to American Sign Language. In Proceedings of the Association for Machine Translation in the Americas, pages 54 67, 2000.
Physical Impairment Shari Trewin
Abstract Many health conditions can lead to physical impairments that impact computer and Web access. Musculoskeletal conditions such as arthritis and cumulative trauma disorders can make movement stiff and painful. Movement disorders such as tremor, Parkinsonism and dystonia affect the ability to control movement, or to prevent unwanted movements. Often, the same underlying health condition also has sensory or cognitive effects. People with dexterity impairments may use a standard keyboard and mouse, or any of a wide range of alternative input mechanisms. Examples are given of the diverse ways that specific dexterity impairments and input mechanisms affect the fundamental actions of Web browsing. As the Web becomes increasingly sophisticated, and physically demanding, new access features at the Web browser and page level will be necessary.
1 Introduction People with physical impairments comprise the second largest accessibility group, after those with cognitive impairments. Many rely heavily on the Web to provide access to services and opportunities they would otherwise be unable to use independently. Although they form a very diverse group, many of their Web access requirements are shared. This chapter describes how physical impairment affects movement. It provides an overview of some of the more prevalent physical impairments, with a focus on dexterity impairments, and emphasizes that physical impairment can often occur alongside sensory or cognitive impairments, depending on the underlying health condition. This has implications for the access solutions that individuals choose to adopt. Sometimes, physical ease of use must be sacrificed for cognitive or visual simplicity.
Pointing, clicking and typing are fundamental actions that are essential to Web access, and they can be challenging for people with physical impairments. Some people cannot point and click at all, and use alternative access mechanisms such as keyboard shortcuts. Others point and click, but in a very different way, for example by using eye gaze. Still others use regular devices in very individual ways. This chapter aims to provide a deeper understanding of Web access by people with physical impairments by describing how specific computer input actions are affected by common dexterity impairments. ‘‘Web Accessibility and Guidelines’’ describes the best practice design guidelines that support physical accessibility on the Web, while Part III describes in more detail the assistive technologies and other tools that may be used for Web access by this population. First, consider how people move. Control of movement starts in the brain, most notably the motor cortex, which is responsible for planning and execution of voluntary movement; the cerebellum, which integrates sensory perception and motor coordination; and the basal ganglia, which controls the speed of movement and prevents unwanted movements. From the brain, messages travel through the spinal cord to the nerves, which are attached to muscles. The messages from the nerves tell the muscles when, and how strongly, to contract and relax. Muscles are attached to bones via tendons, and operate in opposing pairs. When one contracts, the other stretches, and the bones move around a joint. The term impairment is used here as defined by the World Health Organization’s International Classification of Functioning, Disability and Health (World Health Organization, 2001) as a loss or abnormality of body structure or of a physiological or psychological function. Impairment is the result of a health condition, such as diabetes or spinal cord injury. This chapter is concerned with impairments of a person’s ability to move – physical impairments. These are often not static – some are associated with progressive diseases and worsen over time. Others are temporary or curable. Still others may follow an unpredictable pattern of progression and remission. The severity of impairment also varies from mild disruption that affects only the most delicate tasks, to severe impairment in which movement is lost entirely. Some people’s position on this spectrum varies on a daily or hourly basis. Dexterity impairments are those that specifically affect the use of the hands and arms. These have the greatest impact on technology and Web access, since the vast majority of computer input mechanisms are manual. The following sections present the more prevalent musculoskeletal and movement disorders that impair dexterity.
1.1 Musculoskeletal Disorders Musculoskeletal impairments are those arising in the muscle or skeletal system, or specifically the interaction between those two systems. They can be caused by deformity, injury or disease.
Impairments that limit an individual’s range of movement may make it difficult to reach a key or grasp a mouse. If both hands are lost or affected, then it may be necessary to control technology with some other body part, such as a foot or elbow, or by typing with a stick held in the mouth. Stiff swollen joints are a primary symptom of arthritis, and can affect dexterity even in early stages of the disease. Hands are particularly prone to development of both osteoarthritis and rheumatoid arthritis, which are the most common forms. Movement can also be restricted by health conditions such as carpal tunnel syndrome or cumulative trauma disorder, in which repetitive motions cause injury to the muscles, nerves and tendons. Further repetition of the overused movements can aggravate the injury, and may be extremely painful, or simply impossible.
1.2 Movement Disorders Damage to the nervous system or neuromuscular system leads to movement disorders, a very different class of impairments. Among the most common movement disorders are:

Ataxia, which is a loss of gross coordination of muscle movements leading to unsteady and clumsy motion. This is often due to atrophy of cells in the central nervous system. It can be caused by many health conditions including stroke, multiple sclerosis, cerebral palsy and tumors.

Chorea is characterized by brief, irregular contractions that occur without conscious effort, caused by over-activity of the neurotransmitter dopamine. It is seen in Huntington’s disease, and as a side effect of certain drugs (e.g., for Parkinson’s treatment), or a complication of rheumatic fever.

Dystonia produces involuntary sustained muscle contractions, due to damage in the basal ganglia. The muscle contractions cause repetitive movements or abnormal postures, and can be very painful.

Myoclonus is involuntary twitching of a muscle or group of muscles, due to lesions of the brain or spinal cord. It is found in multiple sclerosis, Parkinson’s disease and Alzheimer’s disease.

Partial (paresis) or complete (paralysis) loss of muscle function for one or more muscle groups is most often caused by damage to the brain or spinal cord. It is not strictly a disorder of movement, but a loss.

Parkinsonism is a combined term covering the Parkinson’s disease symptoms of tremor, rigidity (increase in muscle tone resulting in resistance to movement), bradykinesia (slow execution of movements) and akinesia (inability to initiate movements). Parkinsonism is produced by Parkinson’s disease, Huntington’s disease and other disorders of the basal ganglia.

Spasm is a sudden involuntary contraction of a muscle, due to imbalance of signals between the nervous system and the muscles. Common causes of spasticity are cerebral palsy, brain or spinal cord injury and stroke. Spasticity may range from slight muscle stiffness to contracture – permanent shortening of the muscle that causes the joint to become misshapen.

Tremors are unintentional, somewhat rhythmic muscle movements involving oscillations, stemming from problems in the brain. Essential tremor is by far the most common form, and the hands are involved in 90% of cases (Lou and Jankovic 1991). Tremor also occurs in multiple sclerosis, Parkinson’s disease, traumatic brain injury and stroke, and can also be triggered by some medications. Stress or fatigue can exacerbate tremor. There are many different types of tremor. Essential tremor generally occurs when trying to maintain a fixed posture or make a movement. Parkinsonian tremor occurs when the muscles are at rest.
1.3 Prevalence of Dexterity Impairments Significant numbers of individuals have health conditions that can lead to dexterity impairment. In the United States, for example, it is estimated that 4% of people (5–10 million) are affected by essential tremor, 2.6% of the non-institutionalized adult population have had a stroke (Lethbridge-Çejku, Rose and Vickerie 2006) and 27% of adults report chronic joint pain or stiffness (Lethbridge-Çejku et al. 2006); 2–2.5 in every 1000 children born are affected by cerebral palsy (Odding, Roebroeck and Stam 2006), with a total of approximately 764,000 people affected in the United States (Krigger 2006). A further 1–1.5 million people in the United States are affected by Parkinson’s disease, 400,000 by multiple sclerosis and 253,000 by spinal cord injury (National Spinal Cord Injury Statistical Center, 2006). Approximately 52% of those with spinal cord injuries have partial or complete paralysis of the arms. Every year in the United States, 80,000–90,000 people are left with permanent disability after traumatic brain injury (Thurman, Alverson, Dunn, Guerrero and Sniezek 1999). Worldwide, studies of musculoskeletal disorders, including carpal tunnel syndrome and cumulative trauma disorder, have reported prevalence rates based on medical examination ranging from 9.3 to 26.9% (Huisstede, Bierma-Zeinstra, Koes and Verhaar 2006). Self-reported prevalence rates are much higher, at 30–53%, with the highest rates being reported by studies of textile workers and students in the United States (Huisstede et al. 2006). One large British survey (Grundy, Ahlburg, Ali, Breeze and Sloggett 1999) estimated that 14.3% of the adult population of the United Kingdom has some (self-reported) mobility impairment, where mobility includes locomotion (the ability to walk), reaching and stretching, and dexterity.
1.4 Co-occurrence of Dexterity and Other Impairments The British survey described in the previous section (Grundy et al. 1999) estimated that while 6.2% of the adult population had mobility as their only
loss of function, 3.9% had both mobility and sensory loss, 1.7% had mobility and cognitive loss and 2.5% had mobility, sensory and cognitive loss. In other words, most adults with a mobility impairment also had a cognitive or sensory impairment. Physical impairments associated with ageing can co-occur with age-related changes in vision and cognition. Perhaps more significantly, many health conditions that cause physical impairment can also cause sensory or cognitive impairment. For example, the damage caused to the brain by stroke or traumatic brain injury can have a wide range of effects, depending on the area affected. A large proportion of individuals with cerebral palsy also have some cognitive impairment with estimates ranging from 25% to two-thirds (Krigger 2006; Odding et al. 2006), and 25–39% of adults with cerebral palsy also have some visual impairment (Krigger 2006). Multiple sclerosis can affect vision, attention and concentration. People with multiple sclerosis may find that controlling their movements is very fatiguing, and fatigue tends to magnify any visual and cognitive difficulties. This co-occurrence is an important observation, because many strategies designed to compensate for physical impairment do so by placing additional cognitive and/or visual load on the user. For example, keyboard shortcuts must be memorized. Word prediction systems require additional reading and decision making by the user. There will always be a need for enormous variety in the assistive solutions available to individuals with physical impairments, to allow them to choose an approach that fits their strengths.
2 Discussion This section discusses the significance of the Web for people with physical impairments, the specialized input mechanisms that some individuals use with their technology for Web access and the effects of dexterity impairments on technology and Web access in general.
2.1 Social and Economic Significance Physical disabilities, both mobility and dexterity, can profoundly limit an individual’s independence and ability to get out of their home and into public places. Through the Internet, an individual can manage their own finances, run a business, do their own shopping, access education and socialize on an equal basis. Technology is also a vital tool for employment. In one study of employment among individuals with cerebral palsy, half of those who were competitively employed relied completely or to a large extent on the use of computers to perform their work (Murphy, Molnar and Lankasky 2000).
2.2 Alternative Input Mechanisms People with dexterity impairments use a variety of creative solutions for controlling technology. These options are discussed in more detail in Part III and include alternative keyboards and pointing devices, speech input, keyboard-based pointing methods and pointing-based typing methods (e.g., use of eye gaze tracking with an on-screen keyboard and selection by dwelling on an item). Some users operate keyboards and pointing devices with their feet, elbows or head. Users with very limited motion can use one or two binary signals generated by a physical switch, tongue switch, EMG sensor or other device, and can control a computer by scanning through the available options (manually or automatically) and selecting the items of interest. Using the right input devices can have an enormous impact on the speed and accuracy of computer input. But there are also many individuals with dexterity impairments who are using standard keyboards and mice. There are a number of reasons for this. Many people are unaware of the alternatives. They may share a computer with others or have become used to a keyboard and mouse before acquiring a disability. They may find the alternatives too expensive, or too cumbersome, or simply find that the mouse is easier to understand than other devices.
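To make the scanning approach concrete, the sketch below cycles a visible highlight through a page’s links and form controls and lets a single key stand in for the user’s physical switch. It is only an illustration: the 1.5-second scan interval, the use of the space bar as the switch and the red outline are assumptions rather than details from this chapter, and a real scanning system would offer far more configuration.

```typescript
// Minimal single-switch scanning: highlight focusable elements in turn and
// let one key (here the space bar) stand in for the user's physical switch.
const SCAN_INTERVAL_MS = 1500; // assumed dwell time between highlights

const targets: HTMLElement[] = Array.from(
  document.querySelectorAll<HTMLElement>('a[href], button, input, select, textarea')
);

let current = -1;

const timer = window.setInterval(() => {
  if (targets.length === 0) return;
  if (current >= 0) targets[current].style.outline = '';
  current = (current + 1) % targets.length;          // advance the scan
  targets[current].style.outline = '4px solid red';  // visible highlight
  targets[current].focus();
}, SCAN_INTERVAL_MS);

document.addEventListener('keydown', (event: KeyboardEvent) => {
  if (event.key === ' ' && current >= 0) {           // the "switch" press
    event.preventDefault();
    window.clearInterval(timer);                     // stop scanning
    targets[current].click();                        // activate the item
  }
});
```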
2.3 Effect of Dexterity Impairments on Technology and Web Access Sears and Young 2003 provide an overview of physical impairments and their effect on access to computing, and review research in this field. Studies have suggested that advanced age and disabilities make keyboard and mouse use and movement slower and less accurate (Riviere and Thakor 1996; Trewin and Pain 1999). Alternative input mechanisms such as speech, eye gaze pointing or EMG can also be inherently error prone or difficult to control. Whatever input mechanisms are used, the fundamental actions of Web browsing can be significantly affected by dexterity impairments. Pointing to a target is perhaps the most fundamental Web action. Targets on the Web vary enormously in size, with scroll bars, check boxes and radio buttons being among the smallest. The reasons for difficulty in pointing to these small targets will depend on the specific impairment. Arthritis may affect an individual’s ability to make the necessary movements. Ataxia or tremor may make it difficult to move a physical input device along the desired path to the target. Akinesia and bradykinesia, in contrast, make it hard to start and stop the movement accurately. Quite different issues are caused by myoclonus, chorea, tremor and spasms, where unwanted diversions or mouse clicks can occur during pointing movements. This may leave the cursor far from the intended target or activate unwanted functions. Spasticity or dystonia can force an individual into an extreme body position that makes it
difficult for them to see the screen and move the cursor at the same time. In compensating for such difficulties, older adults, and people with physical impairments, sometimes use very different movement strategies to those without impairments, with multiple small submovements, especially when close to the target (Keates and Trewin 2005). Clicking on a target is another fundamental Web operation. A click will fail if the cursor moves off the target during the click. On some pages, this may even cause an item to be dragged instead of selected. Tremor can make it very difficult for an individual to keep the mouse on target while clicking. With Parkinsonism, a user also clicks more slowly, giving even greater opportunity for slippage. Numbness in the fingers (for example, in multiple sclerosis) makes it difficult for a user to tell if their finger is on top of a mouse button. If their finger slips, they may press multiple buttons or the wrong button. Again, users may adopt a ‘‘move then check before clicking’’ strategy. Dystonia and contractures can also make it necessary to use a device in an unusual way, for example to move a mouse by pushing it with the wrist of one hand, then click with a knuckle of the other hand. Again, this makes it difficult to keep the device stable while clicking. Some users, who cannot click a button, use software that generates a click when they dwell on a target for a period of time. Users of dwellbased selection have similar challenges in maintaining the cursor over the target until the click is generated. In addition to pointing and clicking, accurate text entry is also a fundamental requirement. Many forms of physical impairment affect typing accuracy for users of standard keyboards. Again, the effects of specific impairments can be very different. For example, ataxia may make it difficult to release a key, causing long key presses and requiring adjustment to the key repeat delay on the keyboard. Tremor may cause a key to be pressed multiple times. In general, movement disorders may cause users to press unwanted keys or miss the keys they intended to press (Trewin and Pain 1999). Typing passwords accurately can be a particular challenge, since there is no feedback indicating what has been typed. Web sites that lock users out of their accounts after several incorrect password attempts can be very frustrating to use. Where an on-screen keyboard is used, target selection issues can cause typing errors. For users of speech input, different forms of error arise due to recognition errors. Indeed, the majority of a speech user’s text entry time is spent correcting such errors (Koester 2004). Other physical demands are made of users. Sustaining an action over a period of time can be difficult, not only for people with movement disorders, but also for those with muscle weakness or paresis. In addition, the longer the action must be sustained, the more opportunity for it to be disrupted by an involuntary movement. Dragging (e.g., on a scroll bar) is a common example, where the mouse button must be held down as the mouse is moved. Another sustained action is following a path through a cascading menu, where any deviation from the correct path will cause the user to lose the menu. Some Web pages, particularly those involving forms and financial transactions, will time out if the user does not complete and submit the form within a
set time period. For switch users, typing rates may be just a few words per minute. Bradykinesia will cause typing rates on a physical keyboard to be greatly reduced. It may be impossible to complete the form in time without assistance. Finally, some users do not have a pointing device, and control the computer through keystrokes alone, either from a physical keyboard or generated by software such as a scanning system. It is essential that these users be able to access all the functions that a mouse user could access. The standard way to accommodate diverse input mechanisms is to ensure that browsers and targets on Web pages can be accessed and controlled via keystrokes. However, even when this is done, significant usability issues often remain (Mankoff, Dey, Batra and Moore 2002; Schrepp 2006). All of the above issues can be exacerbated by fatigue and pain, which are commonly associated with many health conditions, including arthritis, cumulative trauma disorders and cerebral palsy. Furthermore, those with more severe impairments may have had limited access to basic education, and as a result may have low literacy, making technology and Web access even more of a challenge. One major study of Web accessibility found that the two main usability problems reported by people with physical impairments were lack of clarity in site navigation mechanisms and confusing page layout (Disability Rights Commission 2004). This finding probably reflects the physical effort involved in navigating the Web – the cost of taking a wrong path is greater for this group, since every link selection may take considerable time. The third most commonly reported problem was small text and graphics, which would be difficult to select (or may reflect visual impairments co-occurring with the physical impairment).
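As a small illustration of what “controllable via keystrokes” means in practice, the hedged sketch below retrofits keyboard operability onto a custom, mouse-only control; the element id used is hypothetical and not taken from the chapter.

```typescript
// Make a custom, mouse-only control reachable and operable from the keyboard.
// The element id 'buy-button' is a hypothetical example.
const control = document.getElementById('buy-button');

if (control) {
  control.setAttribute('tabindex', '0');   // put it in the Tab order
  control.setAttribute('role', 'button');  // expose it to assistive technology

  control.addEventListener('keydown', (event: KeyboardEvent) => {
    // Activate on Enter or the space bar, mirroring the mouse click handler.
    if (event.key === 'Enter' || event.key === ' ') {
      event.preventDefault();              // stop the page scrolling on space
      control.click();
    }
  });
}
```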
3 Future Directions The demographics of disability are changing, as new medical advances and treatments become available. For example, some Parkinsonian and essential tremors can now be controlled by deep brain stimulation, a procedure in which electrodes are implanted into the brain and deliver a small electric current that blocks the tremors. However, medical advances do not necessarily mean that physical impairment or disability is becoming less prevalent. For example, the prevalence of cerebral palsy is rising. This trend is attributed to improvements in antenatal care that have enabled more low birthweight babies to survive (Odding et al. 2006; Krigger, 2006). More people are surviving stroke or traumatic brain injury, but they may be left with severe physical impairments. Technology is available to extend the lives of individuals with advanced neuromuscular diseases who cannot breathe for themselves. The changing demographic of society in general, with more older citizens, will also lead to a likely increase in the number of people with impairments.
New assistive technologies, such as brain–computer interfaces, hold great promise in enabling people with very severe impairments to access and control technology. People with disabilities are often pioneers of novel user interface technologies like these at a time when they are not robust or reliable enough to become generally popular. On the Web, the landscape is also changing. Web 2.0 technologies are making Web interaction increasingly sophisticated. Complex widgets such as tree views, cascading menus and draggable objects on Web pages are becoming more common. Researchers are actively working to build the necessary accessibility features into these technologies, including keyboard navigation (see Web 2.0 chapter). Also relevant is the movement to improve Web access from mobile devices such as cellphones (see Mobile Web and Accessibility). Many of the mobile Web design guidelines are also beneficial for physical usability of pages (Trewin 2006).
4 Author’s Opinion of the Field People with physical impairments comprise the second largest accessibility group, after those with cognitive impairments. Yet many of the access problems outlined above remain unsolved. One reason for this is the sheer diversity of the group, of the access strategies they use and the specific access barriers encountered. Physical Web access requires solutions at many levels: the input devices used, the configuration of those devices and operating system parameters, the configuration of the browser and the user’s ability to apply their preferences to a specific Web page. Basic research is needed to find interaction techniques that support physical Web access without imposing excessive cognitive or visual burden on users. A mechanism for adjusting the dexterity level demanded by a Web page is one obvious unmet need. Web-based activities are demanding increasingly high levels of dexterity, as the Web becomes more sophisticated. Many physical usability problems could be solved if users were able to specify a minimum target size at the browser level, in a similar way to choosing a preferred font size. This would provide a solution for small target elements such as check boxes that cannot be made larger in today’s browsers, and would help a wide range of individuals. Information technology, and especially the Internet, is transforming the lives of many people with physical impairments. Users of both standard and specialized input devices can find that dexterity impairments adversely impact both speed and accuracy of Web navigation and text entry. As a result, physical accessibility remains an important area in the field of Web accessibility as a whole.
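A browser preference of this kind does not exist today, but the idea can be approximated with a user script or bookmarklet that injects a style sheet, as sketched below. The 44-pixel figure and the set of selectors are assumptions chosen only for illustration, not values drawn from this chapter.

```typescript
// Approximate a user-chosen minimum target size by injecting a style sheet
// from a user script or bookmarklet. The 44px default is an assumption.
function applyMinimumTargetSize(pixels: number = 44): void {
  const style = document.createElement('style');
  style.textContent = `
    a, button, input[type="checkbox"], input[type="radio"], select {
      min-width: ${pixels}px !important;
      min-height: ${pixels}px !important;
    }`;
  document.head.appendChild(style);
}

applyMinimumTargetSize(); // interactive elements now present a larger hit area
```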
References
Disability Rights Commission. (2004) The Web: Access and Inclusion for Disabled People. Disability Rights Commission, UK. ISBN 0-11-703287-5.
Grundy, E., Ahlburg, D., Ali, M., Breeze, E. and Sloggett, A. (1999) Disability in Great Britain: Results from the 1996/97 Disability Follow-up to the Family Resources Survey. Charlesworth Group, Huddersfield, UK.
Huisstede, B., Bierma-Zeinstra, S., Koes, B. and Verhaar, J. (2006) Incidence and prevalence of upper extremity musculoskeletal disorders. A systematic appraisal of the literature. BMC Musculoskeletal Disorders 7(7). doi: 10.1186/1471-2474-7-7.
Keates, S. and Trewin, S. (2005) Effect of age and Parkinson’s disease on cursor positioning using a mouse. Proceedings of ASSETS 2005: 7th International ACM SIGACCESS Conference on Computers and Accessibility, Baltimore, MD, USA, October 2005, pp. 68–75. ACM Press.
Koester, H.H. (2004) Usage, performance and satisfaction outcomes for experienced users of automatic speech recognition. Journal of Rehabilitation Research and Development 41(5), 739–754.
Krigger, K. (2006) Cerebral Palsy: An overview. American Family Physician 73(1), pp. 91–100. American Academy of Family Physicians. Available online at www.aafp.org/afp.
Lethbridge-Çejku, M., Rose, D. and Vickerie, J. (2006) Summary health statistics for U.S. adults: National health interview survey, 2004. National Center for Health Statistics. Vital Health Stat 10(228).
Lou, J. and Jankovic, J. (1991) Essential tremor: Clinical correlates in 350 patients. Neurology 41, 234–238.
Mankoff, J., Dey, A., Batra, B. and Moore, M. (2002) Web accessibility for low bandwidth input. Proceedings of the Fifth International ACM Conference on Assistive Technologies, July 08–10, 2002, Edinburgh, Scotland. ACM Press.
Murphy, K., Molnar, G. and Lankasky, K. (2000) Employment and social issues in adults with cerebral palsy. Archives of Physical Medicine and Rehabilitation 81, June 2000, pp. 807–811.
National Spinal Cord Injury Statistical Center (2006) Spinal cord injury facts and figures at a glance. Information sheet. http://www.spinalcord.uab.edu/show.asp?durki=21446
Odding, E., Roebroeck, M. and Stam, H. (2006) The epidemiology of cerebral palsy: Incidence, impairments and risk factors. Disability and Rehabilitation 28(4), pp. 183–191.
Riviere, C. and Thakor, N. (1996) Effects of age and disability on tracking tasks with a computer mouse: Accuracy and linearity. Journal of Rehabilitation Research and Development, 33, 6–15.
Schrepp, M. (2006) On the efficiency of keyboard navigation in Web sites. Universal Access in the Information Society 5, 180–188.
Sears, A. and Young, M. (2003) Physical disabilities and computing technology: An analysis of impairments. In: J. Jacko and A. Sears (Eds.), The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications. Lawrence Erlbaum, New Jersey, USA, pp. 482–503.
Thurman, D., Alverson, C., Dunn, K., Guerrero, J. and Sniezek, J. (1999) Traumatic brain injury in the United States: A public health perspective. Journal of Head Trauma Rehabilitation, 14(6), 602–615.
Trewin, S. (2006) Physical usability and the mobile Web. Proceedings of the WWW 2006 International Cross-Disciplinary Workshop on Web Accessibility (W4A): Building the Mobile Web: Rediscovering Accessibility? Edinburgh, Scotland, May 2006, pp. 109–112. ACM Press.
Trewin, S. and Pain, H. (1999) Keyboard and mouse errors due to motor disabilities. International Journal of Human-Computer Studies 50(2), 109–144.
World Health Organization (2001) International Classification of Functioning, Disability and Health (ICF). WHO. ISBN: 9789241545426.
Ageing Sri H. Kurniawan
Abstract By 2020, the world’s older population is expected to exceed one billion, and maintaining a high quality of life for these people has become an important issue throughout the world. The Web has been shown to have a positive effect on the quality of life and well-being of older persons, by helping them to maintain independent living. However, many older people seem to shy away from the Web due to various problems they experience when interacting with it. To understand the nature of these problems, this chapter presents the functional impairments and the attitudes that might contribute to older persons’ hesitation in using the Web. The chapter discusses the changes that happen with age, their effects on Web interaction and how they can be mediated through the accessible Web.
1 Introduction As we progress through the natural ageing process, we experience some degenerative effects of ageing, which can include diminished vision, varying degrees of hearing loss, psychomotor impairments, as well as reduced attention, memory and learning abilities. These can heavily affect the accessibility of the Web, which has become an increasingly vital tool in our information-rich society. Before discussing the effect of ageing on Web interaction, there is a need to define what ‘‘older persons’’ means. The term ‘‘older’’ has been defined in numerous ways. Bailey cited a variety of research in which the ‘‘old age’’ categories vary broadly, including studies in which ‘‘older users’’ were defined as ‘‘over 40’’ (Study 2), ‘‘over 50’’ (Study 3) and ‘‘over 58’’ (Study 1) (Bailey 2002). Ageing research shows that sensory changes that are typically associated with old age are really the result of a gradual sensory decline that typically begins between the ages of 40 and 55 – earlier than the age at which most people consider themselves ‘‘old’’ (Straub and Weinschenk 2003). One thing that is apparent, however, is that the
individual variability of sensory, physical and cognitive functioning increases with age (Myatt et al. 2000) and this functioning declines at largely varying rates in older adults (Gregor et al. 2002).
2 Physical Changes 2.1 Vision Declining vision is the most common physiological change associated with ageing (AgeLight 2001), and the one that affects Web interaction the most. As Jakob Nielsen stated, ‘‘The most serious accessibility problems given the current state of the Web probably relate to blind users and users with other visual disabilities since most Web pages are highly visual’’ (Nielsen 1996). One of the most common changes in vision is caused by the yellowing of the lens due to discolouration of the eye’s fluid. This gives the impression of looking through a yellow filter (Sekuler et al. 1982). Along with this, any colour blindness in the eye caused by glaucoma or general genetic colour blindness normally worsens with age due to decreased blood supply to the retina (AgeLight 2001). These changes make it difficult for older people to tell the difference between colours of a similar hue and low contrast. It is therefore advisable to use highly contrasting colours to improve legibility and to present users with their own colour options for fonts and backgrounds to allow them to customise the site to their own needs. Where colours are specified, they should be highly saturated. Primary colours are believed to be the best for older persons (AgeLight 2001). Maximising differences between hue, lightness and saturation and using high-contrasting colours also helps to provide maximum legibility. The pupil of the eye shrinks with age. The lens becomes thicker and flatter and the pupil is less able to change diameter, therefore letting in less light. The retina of an average 60-year-old receives just 33% of the light of the retina of the average 20-year-old (Armstrong et al. 1991). Ageing eyes are also more sensitive to glare, a condition known as ‘‘night blindness,’’ caused by reduced transparency in the lens. To aid this, it is best to use light-coloured text on a dark background and to try to avoid using fluorescent colours or pure white that can appear very bright. Ageing eyes are also very susceptible to fatigue and tend to be dry due to a decrease in the amount of blinking. Some design choices can provide respite to tired eyes. Using sans serif fonts such as Arial at a size of at least 12–14 pt is suggested, as these fonts do not have decorative edges (Ellis and Kurniawan 2000). Allowing bigger gaps between lines and using white space can also produce less eye strain, as can minimising the use of long strings of capital letters; for example, it is much better to put ‘‘This Is A Title’’ instead of ‘‘THIS IS A TITLE,’’ as in the latter there is little differentiation between capital letters, leading to eye strain.
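The advice above on user-selectable colours, sans serif fonts and larger sizes could be surfaced as a simple preference control. The sketch below shows one possible shape for such a control; the concrete colour values, font size and line height are illustrative defaults only, not recommendations taken from the literature cited here.

```typescript
// Let the reader pick their own text/background colours and base font size,
// as suggested above. The concrete values below are illustrative defaults.
interface DisplayPreferences {
  foreground: string;  // text colour
  background: string;  // page background
  fontSizePt: number;  // base font size in points
}

function applyDisplayPreferences(prefs: DisplayPreferences): void {
  const body = document.body;
  body.style.color = prefs.foreground;
  body.style.backgroundColor = prefs.background;
  body.style.fontFamily = 'Arial, Helvetica, sans-serif'; // sans serif, as advised
  body.style.fontSize = `${prefs.fontSizePt}pt`;
  body.style.lineHeight = '1.6';                          // wider line spacing
}

// Example: light text on a dark, non-glaring background, 14 pt base size.
applyDisplayPreferences({ foreground: '#ffffcc', background: '#1a1a1a', fontSizePt: 14 });
```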
A third of people aged over 65 have a disease affecting their vision (Stuart-Hamilton 1999). Some of the most common ones are discussed below.
2.1.1 Age-Related Macular Degeneration (AMD) AMD, sometimes known as ‘‘senile maculopathy’’, is a genetic disease and the most common cause of severe visual impairment amongst older people (Ford 1993). Macular disease refers to the breakdown or thinning of the most sensitive cells of the eye clustered in small area in the centre of the retina known as the macula (Fine et al. 1999). Macular disease affects central vision only; sufferers still can see adequately at the peripherals of their vision, a term commonly described as ‘‘polo mint vision’’ due to the hole in the centre of their vision (Ford 1993). While never resulting in total blindness, AMD is often severe enough for the sufferer to be classed as partially sighted or blind. Symptoms of macular disease usually start around the early mid-50s, typically starting in just one eye. In early stages of macular degeneration, it is difficult to read small or faint print, but as the disease worsens and spreads to both eyes, it becomes difficult even to read large print or to determine any specific details such as pictures. Due to these symptoms, any Web pages should be designed with large fonts (minimum of size 12–14 pt) or the options to increase font size. Any other page elements such as buttons, links and images should be reasonably large. The site should not use bright colours that can cause glare and should avoid using colours in the short-wave spectrum. However, the background of the site should not be too dark as the text will become unreadable due to AMD sufferer’s diminished contrast sensitivity. Web pages should also not link from a bright page to a dark page or vice versa.
2.1.2 Cataracts Cataract refers to the loss of transparency or clouding of the lens of the eye, and it is predominantly an age-related disease (Sekuler et al. 1982). The lens is responsible for focusing light coming into the eye onto the retina to produce clear, sharp images. However, when the lens of the eye becomes clouded, the eye is no longer able to adequately process light coming into the eye. Cataracts are the most common cause of vision loss in people aged over 55 (St. Lukes 2005). Cataracts are caused by an accumulation of dead cells within the lens. As the lens is within a sealed capsule within the eye, dead cells have no way to get out and therefore accumulate over time causing a gradual clouding of the lens. The clouding of the lens means that less violet light enters and reaches the retina making it harder to see colours like blue, green and violet than reds, oranges and yellows (AgeLight 2001).
Due to this, Web pages should be designed to use colours within the red/orange/yellow spectrum and avoid using colours in the blue/green/violet spectrum. Using colours of similar hues should also be avoided, as it is harder for people with cataracts to determine the difference. Fonts should be a minimum of size 12 pt to allow for the lack of detail in the sufferers’ vision. Blinking or flashing icons or animations should be avoided, as they are difficult to see with the user’s diminished peripheral vision. The use of advertisements or ‘‘page cluttering’’ icons or images such as page counters should be omitted from the site, as these tend to draw the user’s attention away from the text, making it harder for them to find the content they are looking for. It has been found that a small block of text surrounded by large areas of white space is easier for the user to read, as they can tell where the text is even with a diminished ability to see detail due to vision clouding; maximising white space around the text is therefore a good way to improve readability for users with cataracts (AgeLight 2001).
2.1.3 Presbyopia Presbyopia is an age-related disorder where the eyes lose the ability to focus on objects or detail at close distances. The onset of presbyopia normally starts in the 40s but is a disorder that happens to all people at some time in their life (Lee and Bailey 2005). Despite its symptoms, presbyopia is not related to nearsightedness, which is due to an abnormality in the shape of the eye. Instead, it is caused by the gradual lack of flexibility in the crystalline lens of the eye due to the natural ageing process (St. Lukes 2005). It is not a disease and cannot be avoided; however, it can easily be treated with lenses or eye surgery. People with presbyopia usually have a diminished visual field and tend to compensate for this by moving their head from side to side when reading, instead of sweeping their eyes from left to right.
2.1.4 Glaucoma Glaucoma is a group of diseases that can damage the optic nerve and cause blindness. While not a direct age-related disorder, it most commonly affects people over 60 or African Americans over 40 years of age. Symptoms include a gradual loss of peripheral vision, starting with fine detail and progressing until the sufferer is left with a form of tunnel vision; if left untreated, this tunnel vision will continue to move inwards until no vision remains. While there are various causes of glaucoma, the most common is open-angle glaucoma, where fluid builds up in the anterior chamber of the eye, causing pressure that damages the optic nerve (National Eye Institute 2004). As with presbyopia, the sufferer has a decreased angle of vision, and so must turn their head to view what a person with unimpaired vision could see in their peripheral vision.
2.2 Hearing Twenty percent of people between 45 and 54 years of age have some degree of hearing impairment. The figure rises to 75% for people between 75 and 79 years of age (Kline and Scialfa 1996). Older people have a reduced ability to detect high-pitched sounds (Scheiber 1992). Interfaces that use sound to get attention will need to use lower frequency sounds for older users. It has been found that a beep that sweeps across 0.5–1.0 kHz is reasonably effective (Zhao 2001). Recorded voice should also use speakers with low-pitched voices. Older people have more problems localizing sound than younger persons, which is more apparent in persons with presbycusis (Kline and Scialfa 1996). They have a reduced ability to follow fast speech (more than 140–180 words per minute) and conversation in noisy surroundings (Hawthorn 2000). Providing text captions for online news audio, especially when journalists report the news over noisy backgrounds (e.g., an onsite natural disaster report), will help older users. Even though one might argue that hearing loss does not severely affect Web interaction, as the Web adopts a visual paradigm, hearing loss has unfortunately been reported to be significantly correlated with the severity of cognitive dysfunction in older persons, and therefore carries problems associated with cognitive impairment (Uhlmann et al. 1989).
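The low-frequency sweep reported as effective by Zhao (2001) can be produced directly in the browser. The sketch below uses the Web Audio API; the 0.4-second duration and the gain level are assumptions made for the example, and in most browsers the code must be triggered by a user action.

```typescript
// Play an attention-getting beep that sweeps from 0.5 kHz to 1.0 kHz,
// the range noted above. Duration and volume are illustrative assumptions.
function playAlertSweep(): void {
  const audio = new AudioContext();
  const oscillator = audio.createOscillator();
  const gain = audio.createGain();

  oscillator.type = 'sine';
  oscillator.frequency.setValueAtTime(500, audio.currentTime);                  // start at 0.5 kHz
  oscillator.frequency.linearRampToValueAtTime(1000, audio.currentTime + 0.4);  // sweep to 1.0 kHz
  gain.gain.setValueAtTime(0.2, audio.currentTime);                             // keep it comfortable

  oscillator.connect(gain);
  gain.connect(audio.destination);
  oscillator.start();
  oscillator.stop(audio.currentTime + 0.4);
}

// Most browsers require audio to start in response to a user gesture.
document.addEventListener('click', playAlertSweep, { once: true });
```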
2.3 Psychomotor In older adults, response times increase significantly with more complex motor tasks (Spiriduso 1995) or in tasks with a larger number of choices (Hawthorn 2000). Older adults perform poorly when tracking a target using a mouse (Jagacinski et al. 1995), make more sub-movements when using a mouse (Walker et al. 1997) and experience an increase in cursor positioning problems if the target size is small such as the size of letters or spaces in text (Charness and Bosman 1990). Siedler and Stelmach (1996) have also reported that older adults have ‘‘less ability to control and modulate the forces they apply’’. Finally, older adults are more cautious in their movement strategies because the likelihood of errors for fast-moving targets increases with age (Hawthorn 2000). Some older people suffer from age-related diseases that affect their psychomotor abilities, such as multiple sclerosis, arthritis, osteoporosis, stroke and Parkinson’s disease. Multiple sclerosis (MS) is a disorder of the central nervous system marked by weakness, numbness, a loss of muscle coordination and problems with vision, speech and bladder control. Arthritis is inflammation of joints causing pain, swelling and stiffness. Osteoporosis is loss of normal bone density, mass and strength, leading to increased porousness and vulnerability to fracture. Stroke refers to damage to the brain caused by interruption to its blood supply or leakage of blood outside of vessel walls. Depending upon
where the brain is affected and the extent of the decreased blood supply to the brain, paralysis, weakness, a speech defect, aphasia or death may occur. Finally, Parkinson’s disease is a progressive disorder of the nervous system marked by muscle tremors, muscle rigidity, decreased mobility, stooped posture, slow voluntary movements and a mask-like facial expression. As the above symptoms indicate, any of these diseases can severely affect older person’s psychomotor abilities. Older people also tend to have reduced grip strength and flexibility, and thus a more limited range to move the mouse. Declines in motor control may result in the inability to hold the mouse still and rapidly push the button at the same time, a movement often required when interacting with GUI. A 1997 study revealed that the most common problem faced by older participants was using the mouse, both for pointing and clicking (21%) and for scrolling (24%). It was noted that because of arthritis or tremors, some older persons were incapable of the fine movements required to manoeuvre a mouse. They had difficulty placing the cursor within a search engine box, placing the mouse in the arrowed boxes, scrolling and coordinating the movement of the mouse and clicking it (IFLANET 1997).
3 Cognitive Changes 3.1 Attention Attention is the ability to focus on and remember items in the face of distracting stimuli, which may have to be processed mentally at the same time (Stuart-Hamilton 2000). Older persons experience more difficulty in trying to focus and maintain attention on activities over long periods of time, on activities that require quick and continuous scanning, which is particularly fatiguing (Vercruyssen 1996), and on activities that require concentration on a specific task in the face of distracting information (Kotary and Hoyer 1995). Selective attention (a type of attention that involves focusing on a specific aspect of an experience while ignoring other aspects) therefore becomes more difficult for older persons. Highlighting important information and using perceptual organization such as grouping would help older adults focus on the necessary information more effectively (Czaja 1997). Divided attention is the ability to attend simultaneously to and process more than one task at the same time (Hawthorn 2000). The ability to sustain divided attention in the performance of tasks declines with age, particularly in complex tasks (Hartley 1992). The ability to form new automated responses, that is, the ability to respond to stimuli automatically without conscious effort or control, particularly in visual searches, becomes more difficult (Hawthorn 2000), and while older adults are able to learn new responses, such responses remain attention demanding and hence contribute to
cognitive load (Rogers et al. 1994). Where automated responses have been learnt in older adults, these can become disruptive when learning new tasks because it is difficult to unlearn responses where the person is unconscious of the response (Rogers et al. 1994). Visual information processing also slows down with ageing (Cerella et al. 1982).
3.2 Memory There is general agreement in the literature on cognitive ageing that memory performance declines with age and that such age-related decrements in performance are much greater in relation to some tasks than in others (Grady and Craik 2000). Memory is a key performance factor in all cognitive tasks, which includes learning, planning, perception, decision making, prioritizing and creativity (Hoisko 2003). Declines occur in intellectual performance (Zajicek 2001) and the ability to process items from long-term memory into short-term memory, which is distinct from simply being able to recall items (Salthouse 1994) and which explains older adults’ problems with text comprehension (Light 1990). With long-term memory, studies have found there is a decline in episodic memory (memory for specific events) and procedural memory (memory for how we carry out tasks) (Hawthorn 2000). Memory is particularly relevant to learning, in that in order to learn, one must acquire the information and retain it in memory. Research shows that older adults retain skill levels in areas of expertise they have learnt, although, it becomes more difficult to learn a new motor skill (Cunningham and Brookbank 1988) and more demanding to learn new complex tasks, particularly where the tasks are not meaningful to the user (Stokes 1992). Older adults also experience a significant decline in capability on performance of memory tasks that require recall of content, however there is little decline on memory tasks involving recognition (Rybash et al. 1995). Research also suggests that older adults tend not to adopt organizing material strategies, unless informed to do so (Ratner et al. 1987), which could also suggest why older adults have poorer learning than younger adults do. Because of the decline in cognitive ability, older users face many difficulties with using Web pages. As people age, there is a general overall slowing of brain processing speed (Czaja and Sharit 1998). The largest impact seems to be with tasks that require the most cognitive processing, such as with working memory, overall attentional capacity and visual search performance. Age effects are smallest for tasks where knowledge is an important aspect of the task and largest for tasks where successful performance is primarily dependent on speed (Czaja and Sharit 1998). Various design suggestions to mediate cognitive decline in older persons have been proposed. For example, the use of a certain style in text writing, i.e., the information must be presented in a clear way using simple language and active
voice, was suggested (National Institute of Aging and the National Library of Medicine 2002). Older adults may have problems recalling things such as a specific WWW page location (i.e., Uniform Resource Locator, or URL), previously followed links or the current location in a particular WWW site (Mead et al. 1997). Recall takes more cognitive effort than recognition does; therefore, well-designed visual cues such as text links, buttons and icons could significantly support older users. Graphical cues are useful in providing users with a sense of current location, and therefore reducing the demand on working memory to remember where they had been and where they are within the Web structure (Ellis and Kurniawan 2000).
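One common form of such a graphical location cue is a breadcrumb trail. The sketch below derives one from the page URL, on the assumption that the site’s URL structure mirrors its navigation hierarchy; the separator character and labelling are illustrative choices.

```typescript
// Build a simple breadcrumb trail from the URL path as a visible cue to
// "where am I?", reducing the load on working memory. Assumes a site whose
// URL structure mirrors its navigation hierarchy.
function renderBreadcrumbs(container: HTMLElement): void {
  const segments = window.location.pathname.split('/').filter(s => s.length > 0);
  const nav = document.createElement('nav');
  nav.setAttribute('aria-label', 'You are here');

  const home = document.createElement('a');
  home.href = '/';
  home.textContent = 'Home';
  nav.appendChild(home);

  let pathSoFar = '';
  for (const segment of segments) {
    pathSoFar += '/' + segment;
    nav.appendChild(document.createTextNode(' > '));
    const link = document.createElement('a');
    link.href = pathSoFar;
    link.textContent = decodeURIComponent(segment).replace(/-/g, ' ');
    nav.appendChild(link);
  }
  container.prepend(nav);
}

renderBreadcrumbs(document.body);
```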
4 Behavioural Changes There are some notable behavioural changes associated with advanced aging. The first notable change is increased cautiousness (hesitancy about making responses that may be incorrect) (Salthouse 1991). One most commonly cited explanation for this change is the decline in speed across a variety of situations (Birren 1970). An older person has longer reaction times, and it has been suggested that this is caused by inefficient central nervous system (CNS) functioning. Indeed, the CNS is also at the root of sensory and perceptual changes that occur with age. To cope with these changes, older adults modify their behaviour and attempt to compensate, resulting in, among others, increased cautiousness and a lack of confidence. Providing assurance to an older user that the user is in the right (or wrong) path to their information target can alleviate the lack of confidence in older persons. Older persons have less confidence about their ability to use computer technology, including the Web, which causes computer phobia, anxiety, resistance and negative attitude towards computers (Christopher 1999). This is partly due to the fact that some older people have never used or been shown how to use computer technology and have never had the opportunity to learn. The same research pointed out that older people are more receptive to using computers when they perceived the technology as being useful and the tasks that they were able to perform with the technology as being valuable and beneficial. One study found that introducing the technology in a highly interactive and understandable manner was one factor that was likely to influence the receptivity of older adults toward computers and the Web (Edwards and Englehardt 1989). One piece of good news is that the influence of learned habits on behaviour is unchanged with age (Grady and Craik 2000). This might mean that whilst problems associated with physical and cognitive changes that come with ageing will still affect how older persons use the Web, the next cohort of older persons might not have problems associated with lack of exposure with computers and the Web.
5 Author’s Opinion of the Field This chapter has discussed the changes that occur with ageing and how these changes might affect older persons’ interaction with the Web. Although it is apparent that most functional abilities decline with ageing, not all are doom and gloom. Some abilities (e.g., those related to semantic memory) do not decline until very late in life. In addition, various studies pointed out that older persons are able to learn new skills as well as their younger counterparts and are able to perform some tasks equally well as younger persons do. Older persons are arguably the fastest growing segment of potential customers of the Web, and as such it would be economically wise for Web designers to consider the impairment that comes with ageing and how to facilitate effective interaction given this limitation. Many issues can actually be addressed through good Web design and guidelines and proper documentation and training. As many ageing studies pointed out, the biggest barrier of technology use by older persons is not ageing-related functional impairment, but rather hesitation of exploration due to fear of the unknown and the consequence of incorrect actions. Nevertheless, one size does not fit all and whilst good practice in Web design can assist older users, some extra consideration and possibly assistive technology might be required in more severe cases of age-related impairments. Another possible solution is through personalising Web interface to reflect older person’s developing needs. However, as noted earlier, older users may be less confident when it comes to using the Web, even when they arrive from a generation that has grown up with computers, and they are likely to be more nervous about personalisation if that involves making changes themselves. An easy way out for this is to ensure that configuration is made simple and applied in such a way that users can see the effect of personalisation immediately. The list of changes that come with ageing and suggestions to accommodate older persons when designing for the Web presented in this chapter is not an exhaustive list. There is always a need to involve the older population when designing for them, and this includes designing for the accessible Web, as only by involving prospective users can we capture their requirements and needs.
6 Future Directions When we discuss older persons, there are two future directions that we can foresee. In the future, we are talking about a different cohort of ‘‘older persons’’, the cohort who grows up with the Internet. Although undoubtedly this cohort would still experience functional ability declines that come with ageing, we can expect a different set of behaviours in regards to acceptance of Web technology and the requirement to learn about the Web when they are older.
In terms of technology, the sort of evaluations, methodologies and applications that we can expect in the future are covered extensively from Part II onwards. Some of these are relevant to people with disabilities in general, and some are particularly useful for older Web users. A very good example is Voice XML. This application will definitely benefit older Web users due to its potential to supplement visually oriented Web with sounds, which will help older persons with reduced vision (when voice is used as output) or motor ability (when voice is used as input).
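VoiceXML itself is a dialogue markup language interpreted by voice platforms. Purely as a browser-side analogue of the same idea of supplementing visual content with speech, the sketch below uses the Web Speech API to read a page’s main content aloud; it is an illustrative substitute, not the technology discussed above, and the speaking rate chosen is an assumption.

```typescript
// A browser-side analogue of voice output: read the main content of the
// page aloud for readers with reduced vision, using the Web Speech API.
function speakMainContent(): void {
  const main = document.querySelector('main') ?? document.body;
  const utterance = new SpeechSynthesisUtterance(main.textContent ?? '');
  utterance.rate = 0.9;  // slightly slower than the default delivery
  window.speechSynthesis.speak(utterance);
}

speakMainContent();
```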
References AgeLight LLC. (2001) Technology & Generational Marketing Strategies: Interface Design Guidelines for users of all ages. http://www.agelight.com/Webdocs/designguide.pdf. Armstrong, D., Marmor, M.F. and Ordy, J.M. (1991) The effects of aging and environment on vision. Plenum Press, New York. Bailey, B. (2002) Age Classifications. UI Design Update Newsletter, July http://www. humanfactors.com/ downloads/jul02.asp. Birren (1970) Cited in Eisdorfer, C. and Lawton, P. (1973) The Psychology of Adult Development and Aging. American Psychological Association, Washington, D.C. Cerella, J., Poon, L.W. and Fozard, J.L. (1982) Age and iconic read out. Journal of Gerontology 37, 197 202. Charness, N. and Bosman, E. (1990) Human Factors in Design. In: J.E. Birren, K.W. Schaie, (Eds.) Handbook of Psychology of Aging, Third Ed., Academic press, San Diego, pp. 446 463. Christopher, P., 1999. Older Adults Special considerations for special people. http://www. gsu.edu/mstswh/courses/it7000/papers/newpage31.htm. Cunningham, W.R. and Brookbank, J.W. (1988) Gerontology: The Psychology, Biology and Sociology of Ageing. Harper and Row, New York. Czaja, S. (1997) Using Technologies to Aid the Performance of Home Tasks. In: Handbook of Human Factors and the Older Adult, Chapter 13, pp. 311 334. Czaja, S. J., and Sharit, J. (1998) Age differences in attitudes toward computers. Journal of Gerontology 53B(5), 329 340. Edwards, R. and Englehardt, K.G. (1989) Microprocessor based innovations and older individuals: AARP survey results and their implications for service robotics. International Journal of Technology and Aging 2, 56 76. Ellis, R.D. and Kurniawan, S.H. (2000) Increasing the usability of online information for older users: A case study in participatory design. International Journal of Human Computer Interaction 2(12), 263 276. Fine, S.L., Berger, J.W. and Maguire, M.G. (1999) Age Related Macular Degeneration. Mosby, Inc., Missouri. Ford, M. (1993) Coping Again. Broadcasting Support Services, pp. 6 28. Grady, C.L. and Craik, F.I.M. (2000) Changes in memory processing with age. Current Opinion in Neurobiology 10, 224 231. Gregor, P., Newell, A.F. and Zajicek, M. (2002) Designing for Dynamic Diversity interfaces for older people. In: Proceedings of the fifth international ACM conference on Assistive technologies. ACM Press, New York, pp. 151 156. Hartley, A.A. (1992) Attention. In: F.I.M. Craik, T.A. and Salthouse (Eds.), The Handbook of Aging and Cognition. Erlbaum, Hillsdale, NJ. Hawthorn, D. (2000) Possible implications of aging for interface designers. Interacting with Computers 12, 507 528.
Hoisko, J. (2003) Early Experiences of Visual Memory Prosthesis for Supporting Episodic Memory. International Journal of Human Computer Interaction, 15(2), 209 230. IFLANET (1997) Older People and the Internet. http://www.ifla.org/IV/ifla63/44pt2.htm. Jagacinski, R.J., Liao, M.J. and Fayyad, E.A. (1995) Generalised slowing in sinusoidal tracking in older adults. Psychology of Aging 9, 103 112. Kline, D.W. and Scialfa, C.T. (1996) Sensory and perceptual functioning: basic research and human factors implications, In: A.D. Fisk, W.A. Rogers (Eds.), Handbook of Human Factors and the Older Adult. Academic Press, San Diego. Kotary, L. and Hoyer, W.J. (1995) Age and the ability to inhibit distractor information in visual selective attention, Experimental Aging Research, 21(2), 159 171. Lee, J, and Baily, G. (2005) Presbyopia: All about sight. http://www.allaboutvision.com/ conditions/presbyopia.html. Light, L.L. (1990) Memory and language in old age. In: J.E., Birren, K.W. Schaie (Eds.), Handbook of the Psychology of Aging, Third Ed., Academic Press, San Diego, pp. 275 290 Mead, S. E., Spaulding, V. A., Sit, R. A., Meyer, B. and Walker, N. (1997) Effects of age and training on world wide Web navigation strategies. In: Proceedings of the Human Factors and Ergonomics Society 41st annual meeting Human Factors and Ergonomics Society, Santa Monica, pp. 152 156. Myatt, E.D., Essa, I. and Rogers, W. (2000) Increasing the opportunities for ageing in place. In: Proceedings of the ACM Conference on Universal Usability, ACM Press, New York/ Washington, DC, pp. 39 44. National Eye Institute (2004) What is Glaucoma? http://www.nei.nih.gov/health/glaucoma/ glaucoma_facts.asp#1. National Institute on Aging and the National Library of Medicine (2002) Making your Website More Senior Friendly: A Checklist. http://www.usability.gov/checklist.pdf Nielsen, J., 1996. Accessible Design for Users with Disabilities. http://www.useit.com/alert box/9610.html Ratner, H.H., Schell, D.A., Crimmins, A., Mittleman, D. and Baldinelli, L. (1987) Changes in adults prose recall: aging or cognitive demands. Developmental Psychology 23, 521 525. Rogers, W.A., Fisk, A.D. and Hertzog, C. (1994) Do ability related performance relationships differentiate age and practice effects in visual search? Journal of Experimental Psychology, Learning, Memory and Cognition 20, 710 738 Rybash, J.M., Roodin, P.A. and Hoyer, W.J. (1995) Adult Development and Aging. Brown and Benchmark, Chicago. Salthouse, T.A. (1991) Theoretical perspectives on cognitive aging. Lawrence Erlbaum Associates, Hillsdale. Salthouse, T.A. (1994) The aging of working memory. Neuropsychology 8, 535 543 Scheiber, F. (1992) Aging and the senses. In: J.E. Birren, R.B. Sloane, G.D. Cohen (Eds.), Handbook of Mental Health and Aging, Second Ed., Academic Press, San Diego. Sekuler, R., Kline, D. and Dismukes, K. (1982) Aging and Human Visual Functions. Alan R. Liss, Inc., New York, pp. 27 43. Siedler, R. and Stelmach, G. (1996) Motor Control. In: J.E. Birren, (Ed.) Encyclopedia of Gerontology. Academic Press, San Diego, pp. 177 185. Spiriduso, W.W. (1995) Aging and motor control. In: D.R. Lamb, C.V. Gisolfi, E. Nadel (Eds.), Perspectives in Exercise Science and Sports Medicine: Exercise in Older Adults. Cooper, Carmel, pp. 53 114. St. Lukes Eye (2005) Cataracts. http://www.stlukeseye.com/Conditions/Cataracts.asp. Stokes, G. (1992) On Being Old. The Psychology of Later Life. The Falmer Press, London. Straub, K. and Weinschenk, S. (2003) The Gradual Graying of the Internet. 
UI Design Update Newsletter, June. http://www.humanfactors.com/downloads/jun03.asp Stuart Hamilton, I. (1999) Intellectual changes in late life. In: Woods, E.R.T. (Ed.) Psychological Problems of Ageing. Wiley, New York.
Uhlmann, R.F., Larson, E.B., Rees, T.S., Koepsell, T.D. and Duckert, L.G. (1989) Relationship of hearing impairment to dementia and cognitive dysfunction in older adults. The Journal of American Medical Association 261(13), 1916 1919. Vercruyssen, M. (1996) Movement control and the speed of behaviour. In: A.D. Fisk, W.A. Rogers (Eds.), Handbook of Human Factors and the Older Adult, Academic Press, San Diego, CA. Walker, N., Philbin, D.A. and Fisk, A.D. (1997) Age related differences in movement control: adjusting sub movement structure to optimize performance. Journal of Gerontology: Psychological Sciences 52B(1), 40 52. Zajicek, M. (2001) Special interface requirements for older adults. Workshop on Universal Accessibility of Ubiquitous Computing: Providing for the Elderly. http://virtual.inesc.pt/ wuauc01/procs/pdfs/zajicek_final.pdf. Zhao, H. (2001) Universal Usability Web Design Guidelines for the Elderly, Age 65 and Older, April. http://www.otal.umd.edu/uupractice/elderly/.
Part II
Evaluation and Methodologies
As a Web accessibility researcher you will need to understand the purpose and failings of guidelines, best practice, and the techniques to evaluate conformance. Only with this basic understanding can any new research be undertaken in this area. While we consider guidelines to be mainly moving into practice, new techniques and observational findings still mean that alterations are required. Indeed, automated evaluation and validation is still only superficial at the present time, and algorithms to test for deeper conformance, beyond shallow lexical and syntactic analysis, have not yet been created. The design and build of a Web document is the starting point of Web accessibility, and as such much effort and attention is focused on guidelines, techniques, best practice, and measuring their conformance via evaluation and validation in an effort to facilitate greater accessibility. This fundamental effort has resulted in the accelerated development of tools and techniques to support guidelines to design against (Best Practice and Guidelines), along with evaluation rules to validate against (Site Conformance, Evaluation Methodologies and Automated Evaluation). From its first beginnings, Web accessibility focused on good design, and this good design split initially into guidelines and techniques, with best practice becoming more important as the field moved from research into development. Initially, Web accessibility was conflated with good design and more generalized HCI usability. However, Web access for disabled users began to develop with the formation of the W3C’s Web Accessibility Initiative (WAI), based on work toward Web accessibility undertaken mainly in university research groups pre-1996. Evaluation and document validation then became necessary for building standardized documents. Once standards had been established, there needed to be some way to decide if a document was designed and created so as to meet them. Without document validation and evaluation tools, and the techniques encoded within them, true accessibility can never be reached. Therefore, validation to standards is seen as a critical factor in Web accessibility. However, automated validation does not currently produce accurate assessments of accessibility, only a clue as to the validity of the testable parts of the Web (without much semantics). To address this failing, the only true and complete way to test both Web sites and your research scenarios is full-scale
experimentation (End User Evaluations). However, authoring tools (Authoring Tools and Document Engineering) and support for the designer have also been targeted by researchers. The rationale here is that if we can build accessibility support directly into page generation mechanisms, our client-side tools will be more effective at understanding the content and the interaction required.
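A minimal sketch of what building accessibility support into page generation can mean is given below: a template helper that refuses to emit an image without a text alternative, so the omission is caught at authoring time rather than by a later validator. The helper and its behaviour are illustrative assumptions, not a description of any existing authoring tool.

```typescript
// Building accessibility support into the page generation mechanism itself:
// this illustrative template helper will not emit an <img> element unless a
// text alternative is supplied.
function imageTag(src: string, alt: string): string {
  if (alt.trim().length === 0) {
    // Note: purely decorative images, which legitimately take an empty alt,
    // would need a separate, explicit helper in a real system.
    throw new Error(`Refusing to generate <img src="${src}"> without alt text`);
  }
  const escape = (value: string) =>
    value.replace(/&/g, '&amp;').replace(/"/g, '&quot;').replace(/</g, '&lt;');
  return `<img src="${escape(src)}" alt="${escape(alt)}">`;
}

// Usage: the second call fails at generation time rather than in the browser.
console.log(imageTag('logo.png', 'University of Manchester logo'));
// imageTag('chart.png', '');  // -> Error: Refusing to generate ...
```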
Web Accessibility and Guidelines Simon Harper and Yeliz Yesilada
Abstract Access to, and movement around, complex online environments, of which the World Wide Web (Web) is the most popular example, has long been considered an important and major issue in the Web design and usability field. The commonly used slang phrase ‘surfing the Web’ implies rapid and free access, pointing to its importance among designers and users alike. It has also been long established that this potentially complex and difficult access is further complicated, and becomes neither rapid nor free, if the user is disabled. There are millions of people who have disabilities that affect their use of the Web. Web accessibility aims to help these people to perceive, understand, navigate, and interact with, as well as contribute to, the Web, and thereby the society in general. This accessibility is, in part, facilitated by the Web Content Accessibility Guidelines (WCAG) currently moving from version one to two. These guidelines are intended to encourage designers to make sure their sites conform to specifications, and in that conformance enable the assistive technologies of disabled users to better interact with the page content. In this way, it was hoped that accessibility could be supported. While this is in part true, guidelines do not solve all problems and the new WCAG version two guidelines are surrounded by controversy and intrigue. This chapter aims to establish the published literature related to Web accessibility and Web accessibility guidelines, and discuss limitations of the current guidelines and future directions.
1 Introduction Disabled people use assistive technologies, a term used to refer to hardware and software designed to facilitate the use of computers by people with disabilities (DRC 2004), to access the Web (see Assistive Technologies). These technologies work satisfactorily as long as the page is designed well. However, this is not the
case for many pages (Takagi et al. 2007). The W3C Web Accessibility Initiative (WAI) recognises this and provides guidelines to promote accessibility on the Web, including the Web Content Accessibility Guidelines (WCAG 1.0) (Chisholm et al. 1999). While many organisations such as the RNIB (Royal National Institute of Blind People) also highlight some of the important accessibility issues, the W3C accessibility guidelines are more complete and cover the key points of all the others. Besides WCAG 1.0, the W3C also provides guidelines for user agents and authoring tools. However, as recent surveys demonstrate (DRC), few designers follow these guidelines. Unfortunately, all attempts which have focused on guidelines have failed to give unilateral accessibility across the board because they are optional, not enforceable, and not accurately testable. While version two of the WCAG guidelines (not yet ratified by the W3C) promises to be more testable, with suggested routes to validation built in at the start, the guidelines still cover over 200 pages with an additional 200-page ‘how-to’ annex. In this case, it would seem that only the most dedicated designer will know enough to design to accessibility standards and produce pages which validate correctly. Indeed, because Web browsers present content on the screen, the visual rendering is often the only ‘validity’ check a designer or author performs. However, this check is flawed as most browsers attempt to correct badly written, and inaccessible, code as it is displayed. Therefore, although Web guidelines direct designers and authors to best practice, currently, most Web sites have accessibility barriers that make it either difficult or near impossible for many people with disabilities to use these sites. There are also evaluation, validation, and repair tools (see Web Accessibility Evaluation) to check Web pages against best practice and guidelines. In brief, validation and repair tools analyse pages against accessibility guidelines and return a report or a rating (Ivory and Hearst 2001). These tools are important for Web accessibility as they provide a medium for designers or authors to validate their pages against published guidelines without actually reading and manually applying them (Paciello 2000). Although there has been extensive work in the design and development of these tools, automation is still limited (Yesilada et al. 2004). While it is likely that there are certain accessibility issues that cannot be fully automated (e.g., checking the quality of alternative text provided for images), these tools still provide incomplete automation and complex outputs. Similarly, there are also tools to transform Web pages into a more accessible form for disabled users (see Transcoding). Client-side rendering and transformation tools try to remould Web pages into user-centric presentations. This may either be in the form of a custom browser built to enhance the interaction of people with specific disabilities or as extensions to mainstream browsers or as proxies which modify Web documents as they are delivered to the user. However, most of these tools lack an understanding of disabled users’ interaction with Web pages and their requirements. In order to address such user-requirement issues,
some effort has been directed towards improving the tool support for designers (see Authoring Tools) building accessibility support in at the source. In this chapter, we will examine the published literature related to Web accessibility and Web accessibility guidelines and then discuss limitations of these guidelines and future directions. We begin by demonstrating why accessing and using Web content is a difficult task for disabled users and what has been accomplished through guidelines to improve this.
2 Overview Web accessibility refers to the practice of making pages on the Web accessible to all users, especially to those with disabilities (Paciello 2000, Thatcher et al. 2002). Although an accessible Web means unprecedented access to information for people with disabilities, recent research suggests that the best practice on accessibility has not yet been achieved. For example, Kelly (2002) found the accessibility of the high street stores, banks, and universities in the UK extremely disappointing. Eva (2002) surveyed 20 ‘Flagship’ governmental Web sites in the UK and concluded that 75% needed immediate attention in one area or another. The Disability Rights Commission (DRC) conducted an extensive user evaluation, whose report (DRC) concludes that most Web sites (81%) fail to satisfy even basic accessibility requirements. The Web plays an increasingly important role in many areas (e.g., education, government), so an accessible Web that allows people with disabilities to actively participate in society is essential for equal opportunities in those areas. Furthermore, Web accessibility is not only a social issue but it is also becoming a legal requirement (Paciello 2000, Thatcher et al. 2002). Nations and continents including the UK, Australia, Canada, and the United States are approving specific legislation to enforce Web accessibility. Web accessibility depends on several different components of Web development and interaction working together, including Web software (tools), Web developers (people) and content (e.g., type, size, complexity, etc.) (Chisholm and Henry 2005). The W3C Web Accessibility Initiative (WAI)1 recognises these difficulties and provides guidelines for each of these interdependent components: (i) Authoring Tool Accessibility Guidelines (ATAG) which address software used to create Web sites (Treviranus et al. 2000); (ii) Web Content Accessibility Guidelines (WCAG) which address the information in a Web site, including text, images, forms, sounds, and so on (Chisholm et al. 1999); (iii) User Agent Accessibility Guidelines (UAAG) which address Web browsers and media players, and relate to assistive technologies (Gunderson and Jacobs 1999). There are also other organisations that provide accessibility guidelines such as RNIB (see Table 1) and also accessibility reports that suggest 1
WAI, http://www.w3.org/WAI/
Table 1 Web accessibility guidelines (organisation and guidelines)
WAI Guidelines
Section 508 Guidelines
RNIB Guidelines
AFB Guidelines
Dive into Accessibility
IBM Guidelines
PAS78
Accessible PDF and Flash
Table 2 Web accessibility evaluation surveys and reports (organisation and reference)
DRC Report (DRC)
Nielsen Norman Group Report (Coyne and Nielsen 2001)
Nova Report (Craven and Brophy 2003)
eAccessibility EU report (rep 2005)
UK Government Web sites (Eva 2002)
guidelines (see Table 2), but the WAI guidelines are more complete and cover the key points of all the others. There is however, no homogeneous set of guidelines that designers can easily follow. Moreover, some guidelines are tailored to address the limitations of existing assistive technologies and devices. For instance, there is a guideline which says that extra white space needs to be added between link menu elements as some screen readers, which are commonly used assistive technologies among visually disabled users to access Web pages in audio, cannot handle link menu items properly. This means that some of these guidelines are not generic and device independent. The Web Content Accessibility Guidelines 1.0 (WCAG 1.0) describe how to make accessible Web content and Web sites (Chisholm et al. 1999). They are presented in two themes: graceful transformation (of content, structure, and presentation) and making content understandable and navigable (see Table 3). The specifications provide 14 guidelines, but unfortunately only three of them are in the second theme; the rest, such as creating tables that transform gracefully, are oriented to support sensory translation of text content to audio (Goble et al. 2000). The Nielson Norman Group has also published guidelines to assist designers to create accessible and usable Web pages (Coyne and Nielsen 2001). Although these guidelines are based on a series of usability tests of several different Web sites, the guidelines themselves are not different from others. Lately, the WAI is working on a new version of WCAG. However, this version has yet to be completed and published2 (see Section 4.1). Furthermore, as the DRC report concludes, although 2
WCAG 2.0, http://www.w3.org/TR/wcag2 req/
Table 3 Summary of the Web Content Accessibility Guidelines (WCAG 1.0)
Theme 1: Ensuring graceful transformation
1. Provide equivalent alternatives to auditory and visual content
2. Do not rely on colour alone
3. Use markup and style sheets and do so properly
4. Clarify natural language usage
5. Create tables that transform gracefully
6. Ensure that pages featuring new technologies transform gracefully
7. Ensure user control of time-sensitive content changes
8. Ensure direct accessibility of embedded user interfaces
9. Design for device independence
10. Use interim solutions
11. Use W3C technologies and guidelines
Theme 2: Making content understandable and navigable
12. Provide context and orientation information
13. Provide clear navigation mechanisms
14. Ensure that documents are clear and simple
compliance with WCAG 1.0 is necessary, it is not a sufficient condition for ensuring that sites are practically accessible and usable by disabled people (DRC). The DRC report also provides a number of recommendations to improve the navigation and orientation issues addressed in WCAG 1.0. The User Agent Accessibility Guidelines 1.0 (UAAG 1.0) describe how to make browsers and media players accessible (Gunderson and Jacobs 1999). The specification emphasises the importance of accessibility of the user interface, enabling the user to have access to the content and helping the user to orientate (see Desktop Browsers). Similarly, the Authoring Tool Accessibility Guidelines 1.0 (ATAG 1.0) provide key issues to assist in designing authoring tools that produce accessible Web content and assist in creating an accessible authoring interface (Treviranus et al. 2000). Most of the guidelines in this specification focus on the creation of standard and accessible markup, but they pay little attention to how authoring tools can assist Web designers to create understandable and navigable Web pages. Besides these guidelines, there are also other best practice efforts (Yesilada et al. 2007), which mainly include developing tools to ensure accessibility, such as validation, transformation, and repair tools (Harper and Bechhofer 2005). Validation and repair tools analyse pages against accessibility guidelines and return a report or a rating (Ivory and Hearst 2001). Various validation tools are available which differ in several ways such as functionalities (e.g., testing, fixing) and method of use (e.g., online service, desktop application integrated in authoring tools). These tools are important for Web accessibility as they provide a medium for designers or authors to validate their pages against
published guidelines without actually reading and manually applying them (Paciello 2000). While these tools encourage markup that conforms to the specifications and guidelines, no one except the Web page designer can really enforce it. While the evaluation and repair tools focus on assisting the authors to modify or correct their pages, transformation or transcoding tools focus on assisting Web users by mainly transforming pages into alternative forms to better meet users’ needs (see Transcoding). Although there has been extensive work in the degree and development of these tools, automation is still limited (Yesilada et al. 2004). While it is likely that there are certain accessibility issues that cannot be fully automated (e.g., checking the quality of alternative text provided for images), these tools still provide incomplete automation and complex outputs. There are a number of related fields to Web accessibility and guidelines which serve as a generic expression of the kinds of requirements needed to make the Web open. Device independence encourages this openness by encouraging inclusively diverse devices. By supporting these devices in a generalised context, we provide de facto support for specialist devices, such as the assistive technologies used in Web accessibility. How users interact, and support for that interaction, is also very important in both the general case and in the specialised case of accessibility. Only by understanding both of these areas can we design Web accessibility guidelines and support open standards. Indeed, the more we reveal by our research, the more changes are required to the guidelines, hence the constant move through versions. In the following sections, we discuss a number of areas where we believe they will impact how Web accessibility guidelines will be developed in the future.
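Before turning to these related areas, it is worth making concrete what the validation and repair tools discussed above actually automate. The TypeScript sketch below is a rough, hypothetical illustration (not the behaviour of any particular tool) of two checks that are easy to automate; the names checkPage and Issue are invented, and the sketch deliberately leaves aside the judgements, such as the quality of alternative text, that still require a human.

```typescript
// Simplified illustration of the kind of machine-checkable test a
// validation tool might run. It only covers checks that can be fully
// automated; judging whether an alt text is *meaningful* still needs
// a human reviewer.

interface Issue {
  element: Element;
  message: string;
}

function checkPage(doc: Document): Issue[] {
  const issues: Issue[] = [];

  // WCAG 1.0 checkpoint 1.1 style check: every image needs a text equivalent.
  doc.querySelectorAll("img:not([alt])").forEach((img) => {
    issues.push({ element: img, message: "Image has no alt attribute." });
  });

  // Links whose content is empty convey nothing to a screen reader.
  doc.querySelectorAll("a[href]").forEach((a) => {
    if (!a.textContent || a.textContent.trim() === "") {
      issues.push({ element: a, message: "Link has no text content." });
    }
  });

  return issues;
}

// Example use in a browser console: report issues for the current page.
checkPage(document).forEach((i) => console.warn(i.message, i.element));
```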
2.1 Device Independence Device independence, the goal of running any Web resource on any compliant device, is as yet unachieved. However, the W3C has had a device independence activity4 since these early noughties, which has now been closed and its work items transferred to the Ubiquitous Web Applications Activity (UWA).5 Both activities, however, aim to develop techniques and recommendations to address challenges (Sullivan and Matson 2000) faced by Web users because of device or network limitations, including small screens, restricted keyboards, and lower bandwidth (Lie and Saarela 1999). This activity mainly focuses on methods by which the characteristics of the device are made available, and methods to assist authors in creating sites and applications that can support device independence. The most important outcome of this activity is the Composite Capabilities/ Preferences Profile (CC/PP) which is a framework for describing device capabilities and user preferences. 4 5
Device Independence Activity, http://www.w3.org/2001/di/
Ubiquitous Web Applications Activity, http://www.w3.org/2007/uwa/
Although the CC/PP framework is based on the Resource Description Framework (RDF),6 which means it provides an extensible vocabulary (descriptions of new devices and different user preferences can easily be represented), there are many limitations. For example, such device descriptions require negotiation between server and client, because the server needs to provide content to meet the needs of the client device. This requires designing pages in such a way that numerous device and profile descriptions can be handled on the server side (Dees 2004). This can be achieved either by applying content selection techniques or content transformation algorithms. However, the method is not as important as the fact that designers need to design pages in such a way that these different requirements can be handled. Furthermore, if designers also want to create accessible pages for disabled users, they need to consider accessibility requirements as well as device independence requirements (Kirda 2001), which means that page design can become extremely complicated. With the transfer of concerns to the new UWA Activity, we may see 'über-device independence' as the working group re-focuses on extending the Web to all kinds of devices including sensors and effectors, with application areas including home monitoring and control, home entertainment, office equipment, mobile and automotive applications.
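To make the negotiation idea concrete, the sketch below shows, in TypeScript, the kind of server-side selection that a CC/PP-style profile enables. It is a deliberately simplified stand-in under our own assumptions: real CC/PP profiles are expressed in RDF, and the DeviceProfile and ContentVariant structures here are invented for illustration only.

```typescript
// Deliberately simplified stand-in for CC/PP-style negotiation: the real
// framework expresses device capabilities in RDF, but the server-side
// decision it enables looks roughly like this.

interface DeviceProfile {
  screenWidthPx: number;
  supportsImages: boolean;
  prefersAudio: boolean; // e.g. a user preference delivered with the profile
}

interface ContentVariant {
  minScreenWidthPx: number;
  usesImages: boolean;
  hasAudioAlternative: boolean;
  url: string;
}

function selectVariant(profile: DeviceProfile, variants: ContentVariant[]): ContentVariant | undefined {
  return variants.find(
    (v) =>
      v.minScreenWidthPx <= profile.screenWidthPx &&
      (!v.usesImages || profile.supportsImages) &&
      (!profile.prefersAudio || v.hasAudioAlternative)
  );
}

// A page author still has to produce (or generate) every variant,
// which is the design burden described in the text above.
const variants: ContentVariant[] = [
  { minScreenWidthPx: 800, usesImages: true, hasAudioAlternative: false, url: "/full" },
  { minScreenWidthPx: 0, usesImages: false, hasAudioAlternative: true, url: "/basic" },
];
console.log(selectVariant({ screenWidthPx: 320, supportsImages: false, prefersAudio: true }, variants));
```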
2.2 Web Interaction

Web interaction focuses on improving technologies that provide communication with the Web. This is led by the W3C's Interaction Domain, which is responsible for developing technologies that shape and adapt the Web's user interface (Sullivan and Matson 2000).7 These technologies mainly include (X)HTML, the markup language that started the Web; Cascading Style Sheets (CSS), which provide a mechanism for adding presentation style to Web pages; and Scalable Vector Graphics (SVG), which can be used to create two-dimensional graphics in XML (Lie and Saarela 1999). Developments in these technologies affect how people browse the Web and how they author Web content. Therefore, in any effort to support Web accessibility, it is crucial that the features and limitations of these technologies are clearly stated. As part of the W3C's Interaction Domain, the Multimodal Interaction Working Group8 seeks to extend the Web to allow users to choose an effective means to interact with Web applications through the modes of interaction best suited to their needs and device (visual, aural, and tactile). This kind of adaptation is key to Web standards and is the rationale behind the activity's focus on providing use cases and requirements analyses, which are important resources for supporting Web accessibility.
6 RDF, http://www.w3.org/RDF/
7 Web Interaction Activity, http://www.w3.org/Interaction/
8 Multimodal Interaction Group, http://www.w3.org/2002/mmi/
2.3 Adaptation and Coping: Key Components of Disabled Interaction
Some recent user studies suggest that disabled users develop strategies to cope with complex and inappropriately designed pages (Yesilada et al. 2007, Takagi et al. 2007), but it is not known how these strategies affect disabled users’ interaction with Web pages and how guidelines can be extended to address these strategies. However, before we consider coping strategies as part of the Web page design paradigm or guidelines, we first need to understand the relationship between coping and adaptation processes. Adaptation is described as routine modes of getting along, and coping is related to those instances of adaptation that are particularly problematic, requiring new responses or special efforts (Zeidner and Endler 1996). Coping is further defined as ‘constantly changing cognitive and behavioural efforts to manage specific external and/or internal demands that are appraised as taxing or exceeding the resources of that person’ (Lazarus and Folkman 1984, Lazarus 1993). Research on coping suggests that it typically involves some sort of stress and can be distinguished from other behaviours by occurring in stress situations (Zeidner and Endler 1996). Disabled Web users are in stress situations when they access complex pages with assistive technology (DRC). To overcome such stress situations, they employ coping strategies which refer to the specific efforts, both behavioural and psychological, that they employ to master, tolerate, reduce, or minimise stressful events (Lazarus 1966). Two general coping strategies have been distinguished: problem-solving strategies are efforts to do something active to alleviate stressful circumstances, whereas emotion-focused coping strategies involve efforts to regulate the emotional consequences of stressful or potentially stressful events. In the design world, problem-solving coping strategies are also known as work-arounds or acts to employ unintended, non-obvious elements of a design in an effort to overcome the constraints of a physical and social environment (Norman 1988, 2004). Some studies show that disabled people develop workarounds regarding everyday technology found in their homes, such as wristwatches and cell phones (Shinohara 2006). It seems evident, then, that Web interaction is highly influenced by the abilities of the users and the technology used to facilitate interaction. This technology, its use along with the guidelines that are often consulted in its development, drive Web accessibility especially at the design and build stage of the life cycle.
3 Discussion

The major problem we find is that there is a lack of scientific rigour in the current sets of guidelines. These guidelines, designed for the most part by well-meaning committees, did not have a wealth of scientific evidence when
designing them. Indeed, most guidelines are still without proper user studies and without scientific ratification. These guidelines were mostly created from anecdotal evidence and hearsay with some minor study results as the driving force. Indeed, guidelines have in some cases been used as a justification for prosecution, but when met, still do not produce an accessible Web site. While some work is being undertaken on this front, in retrospective validation (Watanabe 2007) if you will, there seems very little progress on the whole, with accessibility resources being directed into new areas such as Web 2.0 and the Semantic Web. Web 2.0, is a mesh of enhanced semantics, push application widgets, and embedded scripting languages, and it was developed to pursue the promise of enhanced interactivity. While Web 2.0 will give a more in-depth treatise, we summarise that there is no precise definition of the Web 2.0, and in fact, there is some controversy surrounding definitions (White 2006). Today’s Web is qualitatively different from the Web created a decade ago, and we can say that the term Web 2.0 is used to emphasise this evolution in a software-versioning style. As O’Reilly,9 who coined the term, highlights ‘there is no hard boundary for the definition of this term.’ However, we can discuss Web 2.0 based on the following three aspects (Millard and Ross 2006): content, social (collective intelligence), and technologically. In the Web 2.0, information is broken up into ‘micro-content’ units that can be distributed over different domains. As opposed to static (i.e., single-stream) pages, in Web 2.0 sites, pages aggregate and remix micro-content in different ways (i.e., multi-stream). These pages consume and remix data from multiple sources, a good example of this is Google Portal,10 while providing their own data and services in a form that allows remixing by others (Webster et al. 2006). This creates network effects through an ‘architecture of participation,’ and goes beyond the page metaphor of the Web a decade ago to deliver rich user experiences. Technologies such as Web services and RSS feeds have contributed enormously towards developing such kind of aggregated contents. The Web 2.0 is also seen as a combination of tools and sites that foster collaboration, sharing, and participation (Millen et al. 2005). The idea is that users are treated as co-developers, and the environment allows a harnessing of collective intelligence. Thus network effects from user contributions are key to this idea. Wikipedia, del.icio.us, Flickr, Amazon, Google, and Yahoo are good examples that make use of the collective intelligence to provide a variety of services. The Web 2.0 is also used to refer to a family of technologies used to build dynamic and collaborative features of Web sites. These include technologies such as (i) AJAX that stands for ‘asynchronous JavaScript and XML’ 9
What is Web 2.0: Design Patterns and Business Models for the Next Generation of Software, http://www.oreillynet.com/lpt/a/6228
10 Google Portal, http://www.google.com/ig
(Gibson 2007) and incorporates XHML, CSS, DOM, XML, XSLT, and XMLHttpRequest; (ii) tag clouds or folksonomies,11 and (iii) wikis.12 Based on these three aspects, we can say that the Web 2.0 is not about serving and reading static pages, but it is a platform that allows collaboration, sharing, and usage of applications that used to traditionally run on desktops; these include online calendars (e.g., CalendarHub), productivity application suites (e.g., HyperOffice), e-mail and collaboration (e.g., Gmail), project management and personal organisers (e.g., Stikipad), and multimedia social software (e.g., Flickr, YouTube). In this case, how do guidelines and best practice, conceived when the Web had just one operating modality (i.e., static), relate to new dynamic and highly interactive pages? Do we now need a new set of guidelines to address such changes on the Web? These questions are as yet unanswered. While we can predict that the possible benefits of Web 2.0 are great, it seems, however, that without timely and prompt action, disabled users will be barred from these benefits. Indeed, the use of Web 2.0 sites, as described above, will rapidly become ‘off-limits’ to disabled users. Semantic Web technologies (Semantic Web 19) have already shown themselves to be useful in addressing some issues of Web Accessibility. However, this new technology has not yet started to make its way into mainstream applications. Without change, will the benefits of the Semantic Web be lost? Will the promising enhanced interactivity of Web 2.0 technologies become increasingly inaccessible to disabled users? More importantly, how can we incorporate the requirements of these new technologies to the existing Web accessibility guidelines?
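One concrete flavour of the problem, and of a possible remedy, can be sketched for the AJAX-style updates mentioned above. In the hypothetical TypeScript fragment below, content is fetched with XMLHttpRequest and injected into the page; marking the target as a WAI-ARIA live region (the direction discussed by Gibson 2007) asks assistive technologies to announce the change. The endpoint URL, element id, and function name are invented for illustration.

```typescript
// Hypothetical sketch: updating part of a page asynchronously while keeping
// the change perceivable to screen-reader users. The URL and element id are
// invented; aria-live comes from WAI-ARIA.

function loadNews(targetId: string): void {
  const target = document.getElementById(targetId);
  if (!target) return;

  // Without this, many assistive technologies never announce the new content:
  // the visual page changes but the audio rendering stays silent.
  target.setAttribute("aria-live", "polite");

  const xhr = new XMLHttpRequest();
  xhr.open("GET", "/news/latest.html"); // hypothetical endpoint
  xhr.onload = () => {
    if (xhr.status === 200) {
      target.innerHTML = xhr.responseText;
    }
  };
  xhr.send();
}

loadNews("news-panel");
```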
4 Future Directions

Understanding how a subject area will develop is notoriously risky; however, we can make some fairly sweeping predictions. First, we consider that the Web Content Accessibility Guidelines – Version 2 (WCAG 2.0) will be a major focus of effort in the future, especially with regard to testing algorithms and the addition of scientific-based testing criteria. Secondly, we think there will be a strong case for guideline internationalisation. Thirdly, we believe that, if unchecked, we will encounter 'guideline snow': a proliferation of many guidelines with no real way of checking and validating them all. Finally, we believe that in the future, the luxury of expecting creators to know multiple guidelines and best practice will evaporate as the technologies, applications, and user devices they are used to create expand exponentially.
4.1 Web Content Accessibility Guidelines (Version 2)

The WCAG 2.0 has taken around five years to develop, mainly by a committee of specialists with public consultation under the auspices of the W3C. WCAG 2.0 covers a wide range of recommendations for making Web content more accessible. However, as the recommendations themselves state, the guidelines do not include standard usability recommendations except where they have a significantly greater impact on people with disabilities than on other people. These guidelines are mainly concerned with being testable and validatable, a major criticism of the WCAG 1.0 guidelines. However, the authors acknowledge that even content that completely conforms to WCAG 2.0 may not be fully accessible to every person with a disability. Indeed, some user groups, such as people with cognitive, language, and learning disabilities, are not fully addressed by WCAG 2.0, either directly or through assistive technologies, and there is a need for more research and development in these areas. The guidelines themselves reassert this need for user studies in concert with machine validation (see End User Evaluations):

All WCAG 2.0 success criteria are testable. While some can be tested by computer programs, others must be tested by qualified human testers. Sometimes, a combination of computer programs and qualified human testers may be used. When people who understand WCAG 2.0 test the same content using the same success criteria, the same results should be obtained with high inter-rater reliability.
The testable nature of the guidelines is a marked difference and departure from WCAG 1.0 and permeates through all guidelines. Under each guideline, there are success criteria that describe specifically what must be achieved in order to conform to this standard. Each success criterion is written as a statement that is either true or false when Web content is tested against it. These success criteria are, like the WCAG 1.0 guidelines, divided into three levels of conformance: single-A, double-A, and triple-A. However, the user-testing aspect is also covered, as the guidelines require the same results to be obtained when people who understand how disabled users interact with Web content test the same content. Indeed, WCAG 2.0 rests on four key principles: anyone who wants to use the Web must have content that is (1) Perceivable: information and user interface components must be perceivable by users; (2) Operable: user interface components must be operable by users; (3) Understandable: information and operation of the user interface must be understandable by users; and (4) Robust: content must be robust enough that it can be interpreted reliably by a wide variety of user agents, including assistive technologies.
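The idea of a success criterion as a statement that is simply true or false lends itself to tooling. The TypeScript sketch below is our own, deliberately simplified modelling of that idea and is not a W3C-defined format: the criterion identifiers and checks are illustrative, and the third verdict reflects the point above that some criteria must be tested by qualified human testers.

```typescript
// Rough illustration of the "testable statement" idea: each success
// criterion either passes, fails, or is referred to a human reviewer.
// This is our own modelling for illustration, not a W3C-defined format.

type Verdict = "pass" | "fail" | "needs-human-review";

interface SuccessCriterion {
  id: string;          // a hypothetical label, not an official criterion number
  level: "A" | "AA" | "AAA";
  test: (doc: Document) => Verdict;
}

const criteria: SuccessCriterion[] = [
  {
    id: "images-have-text-alternatives",
    level: "A",
    // Even when every img has an alt attribute, whether the text is an
    // adequate equivalent is a judgement a qualified human must make.
    test: (doc) =>
      doc.querySelectorAll("img:not([alt])").length > 0 ? "fail" : "needs-human-review",
  },
  {
    id: "page-has-title",
    level: "A",
    test: (doc) => (doc.title.trim().length > 0 ? "pass" : "fail"),
  },
];

criteria.forEach((c) => console.log(c.id, c.level, c.test(document)));
```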
While these guidelines (and the fact there are only 12) all look straightforward on initial inspection, they are reasonably complicated and specific 'under the hood'. For instance, there are four supporting documents to consider when looking at a guideline: first, the quick reference; then the technical guideline document; next, the 'Understanding WCAG 2.0' document; and finally the 'Techniques and Failures for WCAG 2.0' text. These complications have led
to a number of designers and practitioners condemning WCAG 2.0 as impractical, and in some cases suggesting their own. The most vocal of these is Joe Clark:

The Web Content Accessibility Guidelines 1.0 were published in 1999 and quickly grew out of date. The proposed new WCAG 2 is the result of five long years' work by a Web Accessibility Initiative (WAI) committee that never quite got its act together. In an effort to be all things to all web content, the fundamentals of WCAG 2 are nearly impossible for a working standards-compliant developer to understand. WCAG 2 backtracks on basics of responsible web development that are well accepted by standardistas. WCAG 2 is not enough of an improvement and was not worth the wait. (Joe Clark, 'To Hell with WCAG 2', A List Apart Magazine, May 23, 2006, http://www.alistapart.com/articles/tohellwithwcag2)
From our standpoint as research scientists, this controversy seems to be focused on the more practice-related areas of Web accessibility; however, with enough support, the W3C Web Accessibility Initiative may be forced to rethink WCAG 2.0.
4.2 Internationalisation and Guideline Snow

Guidelines are meant to dovetail into accessible technology; for instance, WCAG 1.0 checkpoint 10.5 states:

'10.5 Until user agents (including assistive technologies) render adjacent links distinctly, include non-link, printable characters (surrounded by spaces) between adjacent links. [Priority 3]'
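As an illustration of how such a checkpoint translates into an automated test, the hypothetical TypeScript fragment below flags pairs of links with nothing but white space between them. It is a deliberately simplified approximation of the checkpoint, not a W3C-supplied test, and a real checker would also need to consider list markup, separators rendered via CSS, and similar cases.

```typescript
// Simplified, hypothetical approximation of checkpoint 10.5: warn when two
// links are immediate siblings with no printable character between them.

function findAdjacentLinks(doc: Document): HTMLAnchorElement[] {
  const offenders: HTMLAnchorElement[] = [];
  doc.querySelectorAll<HTMLAnchorElement>("a[href]").forEach((link) => {
    let next = link.nextSibling;
    // Skip over text nodes that contain only white space.
    while (
      next !== null &&
      next.nodeType === Node.TEXT_NODE &&
      (next.textContent ?? "").trim() === ""
    ) {
      next = next.nextSibling;
    }
    // If the very next node is another link, there was no printable
    // character between the two.
    if (next instanceof HTMLAnchorElement) {
      offenders.push(link);
    }
  });
  return offenders;
}

console.log(`${findAdjacentLinks(document).length} adjacent link pairs lack a printable separator.`);
```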
In this case, the checkpoint is focused less on the content and more on the conformity of the user agent accessibility technology. Secondly, the guidelines have an embedded cultural bias in that Western North American and Northern European societies drive the development of the guideline effort, and therefore assumptions are made with regard to status, requirement, and preference. Indeed, when looking at guideline 10.5, the implicit assumption is that this guideline will be met when tools developed in the English-speaking world can render adjacent links distinctly. However, what about the capabilities of Japanese or Taiwanese screen readers (Watanabe and Umegaki 2006, Chen and Ho 2007)? What about user agents for different devices, and what about devices used out of context? The WAI understands that this occurs but their solution is standards harmonisation:

'Harmonization of Web accessibility standards is key to making an accessible Web, because it creates a unified market for authoring tools that produce conformant content. This unified market in turn drives more rapid development of improved authoring tools. Improved authoring tools make it easier to create accessible Web sites, and to repair previously inaccessible sites; for instance, by prompting for accessibility information such as alternative text for graphics, captions for audio, or summaries for data tables. Widespread availability of improved authoring tools can enable accessible design to become the prevailing design mode even for Web developers only
minimally aware of the rationale for Web accessibility, or disinclined to learn guidelines and techniques for accessibility.’
In effect, this means conforming country-specific guidelines to the W3C guidelines. While this may seem practical, it does not account for cultural differences. While we can see that 'guideline snow' (having too many guidelines to make a decision as to site compliance or to remain informed about the standards process) should be reduced, we must also build in flexibility for culture-specific guidelines. It is not just country-specific legislatures which contribute to the proliferation of guidelines. Indeed, the W3C also contributes to guideline snow; it seems that with every new technology, a new set of conformance criteria and guidelines is created. Instead of creating specific addenda to a master set of guidelines to keep repetition and overload to a minimum, each W3C domain seems to need to create a new set of guidelines just for its area of concern.
4.3 Other Domains and Generalisation

The work undertaken in the Web accessibility field is not only for disabled people (Harper et al. 2004) but also for organisations and people without disabilities.13 For instance, Mobile Web access suffers from interoperability and usability problems that make the Web difficult to use for most users (Harper and Patel 2005). With the move to small screen size, low bandwidth, and different operating modalities, all mobile device users effectively suffer the sensory and cognitive impairments normally only experienced by disabled users. The W3C's 'Mobile Web Initiative' (MWI)14 proposes to address these issues through a concerted effort of key players in the Mobile production chain, including authoring tool vendors, content providers, handset manufacturers, browser vendors, and Mobile operators. The current work in the MWI focuses on two main areas: (i) developing 'best practices', which includes developing a set of technical best practices and associated materials in support of the development of Web sites that can be easily viewed and interacted with on Mobile devices, and (ii) identifying device information required for content adaptation, which includes the development of services that provide device descriptions in support of Web-enabled applications. Within the focus of the first area, the MWI proposes a new set of guidelines for realising the mobile Web called the Mobile Web Best Practices (MWBP). Although these best practices have been partly derived from WCAG 1.0, they are still presented as separate guidelines. Therefore, if designers want to create a page which is accessible for both mobile and disabled users, they have to follow a number of different guidelines and use a number of different validation tools, which is time-consuming and costly.
13 WAI Education and Outreach, http://www.w3.org/WAI/EO/
14 Mobile Web Initiative, http://www.w3.org/2005/MWI/
We think migrating findings from accessibility research to the mobile Web is timely, and there is an opportunity, which should not be missed, to transfer lessons learnt and experiences gained to the Mobile Web. Indeed, there is great potential for reciprocity and interoperability between the mobile and accessible Webs, and many of the lessons learnt from the accessible Web will be relevant to the mobile Web. Accessibility practitioners have been researching, innovating, and building device-independent resources, in all but name, for many years; their expertise can, and should, be leveraged in efforts to develop the Mobile Web. Conversely, the Mobile Web is important from an accessibility standpoint, because the Mobile Web addresses many of the same issues, and is likely to be the focus of a significant research effort. We also think the mobile Web is just one example of a domain that can benefit from accessible Web research; there are many more, and as the Web evolves there will be more still.
4.4 Everyone Editing

It is clear that the Web is returning to its origins; surfers are not just passive readers but content creators. Wikis allow open editing and access; blogs enable personal expression; and Flickr, YouTube, MySpace, and Facebook encourage social networking by enabling designs to be 'created' and 'wrapped' around content. Indeed, it seems that only the Web infrastructure supporting expression is immutable and invisible to the user. Template-based tools such as iWeb, Google Page Creator, and RapidWeaver enable fast, professional-looking Web site creation using automated placement, with templates for blogging, picture sharing, and social networking; these tools often require publishing to a system-specific server, such as '.mac'. The 'everyman' can now have a say, but how do we prevent the appalling inaccessibility of the early Web without stifling creative freedom? Ordinary users do not (mostly) understand Web technologies, guidelines, or the needs of other user groups. How can we help everyone to understand 'the rules' (guidelines)? Do we need to create a simple set of guidelines, or indeed do we need guidelines to be supported by the authoring tools instead? Maybe guidelines are not the way forward and a path of developing technologies to automatically crawl and fix these pages needs to be followed. In this case, we wonder whether the conjunction of authoring tools and user agents represents an opportunity for automatically generated Web accessibility or yet another problem for Web accessibility. Will form-based and highly graphical interfaces exclude disabled users from creation, expression, and social networking? What problems exist, what are the upcoming problems, and what solutions are required? What about the accessibility of the content designed and created by surfers? Finally, what effect will this have on the wider Web? We pose the question: what happens when surfers become authors and designers?
5 Authors' Opinion of the Field

Although the overall vision of these guidelines is good, the success of these guidelines can be questioned – according to the DRC report (DRC), most of the Web sites in their evaluation (81%) fail to satisfy the most basic WCAG 1.0 categories. Designers usually view these guidelines as irrelevant, too restrictive, or too time-consuming to implement. Moreover, as the DRC report (DRC) highlights, designers have an inadequate understanding of the needs of disabled users and of how to create accessible websites. For example, the Web has offered visually disabled people an unprecedented opportunity to have the same access to information as their sighted counterparts. However, not many designers or authors know or understand how visually disabled people access the Web and what needs to be done to create an accessible page. Furthermore, some studies show that the application of the guidelines is subject to interpretation; two designers applying the same set of guidelines to the same set of pages generate different results (Ivory and Hearst 2001). Disabled people represent around 10%–15% (an estimate that includes both registered and unregistered) of the European population. More than 82% of all people who are disabled are 50 years of age and older, with these figures set to increase as the population ages. This ageing population will find that, if the status quo is maintained, a great deal of the quality of their lives will be reduced as technology (such as the Web) becomes inaccessible to them. We see the problem as twofold: as the population ages, the requirement to work longer increases, but the ability to work longer, as disability increases, is reduced. Apart from the ability to work, most people will also lose a communication lifeline when they can no longer use the Web; no more books from Amazon,15 no more Webmail from the children, no more searching for family genealogy. We suggest that these issues must be addressed so that the major life activities of disabled people can be as unlimited as possible regardless of disability. We suggest that people are disabled not by their impairment but are handicapped by the technology and infrastructure surrounding them and by the environment in which they are working – this is also known as situationally induced impairment, which typically occurs temporarily (Sears and Young 2003). People are also handicapped in their efforts to find employment, to interact more fully with society at large, and to freely use technology without assistance. However, with the growth of the knowledge economy throughout Europe and other countries, and a move from manual work to more thought- and communication-based activities, there is the very real possibility of disabled people finding productive, fulfilling, and socially empowering employment even later in life, if only technology, and specifically the Web, were available to them. Web accessibility not only will benefit those users whose access is currently hampered, but will potentially reduce the associated costs of providing
15 Amazon, http://www.amazon.co.uk/
accessible content for information providers. With the additional introduction of legislation, providing supporting infrastructure to aid Web accessibility becomes increasingly important.
6 Conclusion

The Web plays an important role in many areas of our lives (e.g., education, employment, and government) and, as Thatcher et al. (2002) state, 'an accessible Web that allows people with disabilities to actively participate in society is essential for equal opportunities in many areas'. In this chapter, we have given a broad overview of the Web accessibility field, which aims to provide equal opportunities to everyone; described a number of guidelines that are designed to ensure accessibility on the Web; and discussed a number of areas that we think will become more important in the future. Given the speed at which the Web evolves, it can be anticipated that many other guidelines will be developed, or will need to be developed, and made available to the accessibility community. Therefore, it is our hope that this chapter will help future scientists learn from our mistakes. In conclusion, although the guidelines are useful, they are only part of the overall process of supporting Web accessibility (see Specialized Browsers). Our review of Web accessibility concludes that disabled people have difficulties accessing the Web, either because of inappropriately designed Web pages or because of the insufficiency of currently available technologies. This lack of accessibility leads to poor interaction and reflects a lack of understanding of disabled users, forcing them to cope with interaction methodologies that are inappropriate.
References Andrew Sears and Mark Young. Physical disabilities and computing technologies: An analysis of impairments. pp. 482 503, 2003. A Report Into Key Government Web Sites. Interactive Bureau, UK, 2002. ISBN 978 0954720445. http://www.iablondon.com/. Bebo White. The implications of web 2.0 on web information system. In WEBIST, Portugal, 2006. Becky Gibson. Enabling an accessible web 2.0. In W4A ’07: Proceedings of the 2007 international cross disciplinary conference on Web accessibility (W4A), pp. 1 6. ACM Press, 2007. doi: http://doi.acm.org/10.1145/1243441.1243442. Brian Kelly. Webwatch: An accessibility analysis of UK university entry points. Technical report, The University of Bath Ariadne Issue 33, 2002. http://www.ariadne.ac.uk/issue33/ web watch/. Carole Goble, Simon Harper, and Robert Stevens. The travails of visually impaired web travellers. In HT? 00, pp. 1 10. ACM Press, 2000. David E. Millard and Martin Ross. Web 2.0: hypertext by any other name? In HYPERTEXT ’06: Proceedings of the seventeenth conference on Hypertext and hypermedia, pp. 27 30. ACM Press, 2006. doi: http://doi.acm.org/10.1145/1149941.1149947.
David Millen, Jonathan Feinberg, and Bernard Kerr. Social bookmarking in the enterprise. Queue, 3(9):28 35, 2005. ISSN 1542 7730. doi: http://doi.acm.org/10.1145/1105664. 1105676. Disability Rights Commission (DRC). The web: Access and inclusion for disabled people. Technical report, Disability Rights Commission (DRC), UK, 2004. David Webster, Weihong Huang, Darren Mundy, and Paul Warren. Context orientated news riltering for web 2.0 and beyond. In WWW ’06: Proceedings of the 15th international conference on World Wide Web, pp. 1001 1002, New York, NY, USA, 2006. ACM Press. ISBN 1 59593 323 9. doi: http://doi.acm.org/10.1145/1135777.1135985. Donald A. Norman. The Design of Everyday Things. MIT Press, 1988. Donald A. Norman. Emotional Design Why We Love (or Hate) Everyday Things. Basic Books, 2004. eAccessibility of Public Sector Services in the European Union. European union policy survey, UK Government Cabinet Office, November 2005. URL http://www.cabinetof fice.gov.uk/e government/resources/eaccessibility/index.asp. Engin Kirda. Web engineering device independent web services. In ICSE ’01: Proceedings of the 23rd International Conference on Software Engineering, pp. 795 796, Washington, DC, USA, 2001. IEEE Computer Society. Haakon Wium Lie and Janne Saarela. Multipurpose Web publishing using HTML, XML, and CSS. Commun. ACM, 42(10):95 101, 1999. doi: http://doi.acm.org/10.1145/317665. 317681. Hironobu Takagi, Shin Saito, Kentarou Fukuda, and Chieko Asakawa. Analysis of navigability of web applications for improving blind usability. ACM Trans. Comput. Hum. Interact., 14(3):13, 2007. Jenny Craven and Peter Brophy. Non visual access to the digital library: the use of digital library interfaces by blind and visually impaired people, 2003. Library and Information Commission Research Report 145. Jim Thatcher, Cynthia Waddell, Shawn Henry, Sarah Swierenga, Mark Urban, Michael Burks, Bob Regan, and Paul Bohman. Constructing Accessible Web Sites. Glasshaus, 2002. ISBN 1904151000. Jon Gunderson and Ian Jacobs. User agent accessibility guidelines 1.0. W3C, 1999. http:// www.w3.org/TR/WAI USERAGENT/. Jutta Treviranus, Charles McCathieNevile, Ian Jacobs, and Jan Richards. Authoring tool accessibility guidelines 1.0. W3C, 2000. http://www.w3.org/TR/ATAG10/. Kara Pernice Coyne and Jakob Nielsen. Beyond ALT text: Making the web easy to use for users with disabilities. Nielson Norman Group, 2001. Kristen Shinohara. Designing assistive technology for blind users. In ASSETS’06, 2006. Melody Ivory and Marti Hearst. The state of the art in automating usability evaluation of user interfaces. ACM Computer Survey, 33(4):470 516, 2001. Michael Paciello. Web accessibility for people with disabilities. CMP books, CMP media LLC, 2000. ISBN 1 929629 08 7. Moshe Zeidner and Norman S. Endler, editors. Handbook of coping: theory, research, applications. John Wiley and Sons, 1996. Richard S. Lazarus. Psychological Stress and the Coping Process. McGraw Hill, 1966. Richard S. Lazarus. Coping theory and research: Past, present and future. Psychosomatic Medicine, 55:234 247, 1993. Richard S. Lazarus and Susan Folkman. Stress, Appraisal and Coping. Springer Publishing Company, 1984. Simon Harper and Neha Patel. Gist summaries for visually impaired surfers. In Assets ’05: Proceedings of the 7th international ACM SIGACCESS conference on Computers and accessibility, pp. 90 97. ACM Press, 2005. Simon Harper and Sean Bechhofer. Semantic triage for increased accessibility. IBM Systems Journal, 44(3), 2005.
Simon Harper, Yeliz Yesilada, Carole Goble, and Robert Stevens. How much is too much in a hypertext link?: Investigating context and preview a formative evaluation. In Proceedings of the fifteenth ACM conference on Hypertext & hypermedia, pp. 116 125, 2004. doi: http:// dx.doi.org/10.1145/1012807.1012843. Takayuki Watanabe. Experimental evaluation of usability and accessibility of heading elements. In W4A ’07, pp. 157 164. ACM Press, 2007. doi: http://doi.acm.org/10.1145/ 1243441.1243473. Takayuki Watanabe and Masahiro Umegaki. Capability survey of japanese user agents and its impact on web accessibility. In W4A: Proceedings of the 2006 international cross disciplinary workshop on Web accessibility (W4A), pp. 38 48. ACM Press, 2006. doi: http://doi.acm.org/10.1145/1133219.1133227. Terry Sullivan and Rebecca Matson. Barriers to use: usability and content accessibility on the web’s most popular sites. In Proceedings of the 2000 conference on Universal Usability, pp. 139 144, 2000. ISBN 1 58113 314 6. Walter Dees. Handling device diversity through multi level stylesheets. In IUI ’04: Proceed ings of the 9th international conference on Intelligent user interfaces, pp. 229 231. ACM Press, 2004. doi: http://doi.acm.org/10.1145/964442.964488. Wendy A. Chisholm and Shawn Lawton Henry. Interdependent components of web accessi bility. In W4A ’05: Proceedings of the 2005 International Cross Disciplinary Workshop on Web Accessibility (W4A), pp. 31 37. ACM Press, 2005. Wendy Chisholm, Gregg Vanderheiden, and Ian Jacobs. Web content accessibility guidelines 1.0.W3C, 1999. http://www.w3.org/TR/WAI WEBCONTENT/. Yeliz Yesilada, Robert Stevens, Simon Harper, and Carole Goble. Evaluating DANTE: Semantic transcoding for visually disabled users. ACM Trans. Comput. Hum. Interact., 14(3):14, 2007. doi: http://doi.acm.org/10.1145/1279700.1279704. Yeliz Yesilada, Simon Harper, Carole Goble, and Robert Stevens. Screen readers cannot see (ontology based semantic annotation for visually impaired web travellers). In Proceedings of the International Conference on Web Engineering (ICWE), pp. 445 458. Springer, 2004. doi: http://dx.doi.org/10.1007/b99180. Yuejiao Zhang. Wiki means more: hyperreading in wikipedia. In HYPERTEXT ’06: Proceed ings of the seventeenth conference on Hypertext and hypermedia, pp. 23 26. ACM Press, 2006. doi: http://doi.acm.org/10.1145/1149941.1149946. Yui Liang Chen and Yung Yu Ho. The status of using ‘big eye’ chinese screen reader on ‘wretch’ blog in taiwan. In W4A ’07: Proceedings of the 2007 international cross disciplinary conference on Web accessibility (W4A), pp. 134 135. ACM Press, 2007. doi: http://doi.acm. org/10.1145/1243441.1243447.
Web Accessibility Evaluation Shadi Abou-Zahra
Abstract Web accessibility evaluation is a broad field that combines different disciplines and skills. It encompasses technical aspects such as the assessment of conformance to standards and guidelines, as well as non-technical aspects such as the involvement of end-users during the evaluation process. Since Web accessibility is a qualitative and experiential measure rather than a quantitative and concrete property, the evaluation approaches need to include different techniques and maintain flexibility and adaptability toward different situations. At the same time, evaluation approaches need to be robust and reliable so that they can be effective. This chapter explores some of the techniques and strategies to evaluate the accessibility of Web content for people with disabilities. It highlights some of the common approaches to carry out and manage evaluation processes rather than list out individual steps for evaluating Web content. This chapter also provides an outlook to some of the future directions in which the field seems to be heading, and outlines some opportunities for research and development.
1 What Is Evaluation? Web accessibility evaluation is an assessment of how well the Web can be used by people with disabilities. While this includes the accessibility of the authoring tools, the user agents (such as browsers and media players), assistive technologies, and the underlying Web technologies, this chapter focuses on evaluating the Web content. Depending on the scope of an evaluation, Web content could mean individual Web pages, collections of Web pages such as Web applications or whole Web sites, or just specific
parts of a Web page such as tables or images. Some common situations for evaluating the accessibility of Web content include the following:
A Web developer wants to ensure that the Web application being developed meets the required standard for accessibility
A Web author wants to ensure that the information published in the Web pages is usable by people with disabilities
A Web designer wants to learn about some of the accessibility issues related to the visual design in order to improve it
A Web project manager wants to explore some of the potential accessibility issues on a Web site to estimate its performance
An organization wants to determine if their Web site meets a standard for accessibility or where it fails to meet it In a more technical sense, Web accessibility evaluations can be described as quality assurance measures that use accessibility provisions as the metric. The accessibility provisions are defined by technical standards, guidelines, best practices, or other sets of requirements, some of which have been introduced in the previous chapter of this book. Depending on the nature of the provisions that set the requirements, different inspection and testing techniques can be combined, for example automated testing, expert judgments, or testing with end-users. Also similar to quality assurance, Web accessibility evaluation encompasses inspections as well as processes and workflows that govern the roles and tasks of the different evaluators involved in the assessment.
2 Principles of Evaluation While Web accessibility evaluation is in many aspects similar to software quality assurance, there are some notable differences. Firstly, Web content tends to change frequently while software is released in discrete versions that do not change much over time. Evaluation is therefore not limited to the production of the Web content but is essential for assessing and monitoring the Web content throughout its lifetime; in many cases this on-going evaluation is more important than the initial evaluation. Web content is also often published by non-technical Web authors, for example by employees who are using content management systems or other authoring tools that cascade the source code. On the other hand, software is developed by technical personnel and in more controlled environments. The quality control of Web content is therefore more challenging with respect to roles, responsibilities, and coordination of the involved parties. For example, on many Web sites any end-user can publish or modify Web content using the wikis, blogs, and other interactive communication channels; quality control in such situations becomes an extremely challenging task.
Finally, Web content tends to focus more on informational substance (usually in the form of text but increasingly also multimedia) and its presentation to the user while software tends to be more oriented to the functional features and programming logic. This difference is reflected in the assessment approaches and requirements. For example, the assessment of the informational substance, such as the formulation of text, or the user interface, such as the layout, bears some degree of subjectivity and requires human judgment. However, the assessment of programming logic usually relies on test cases which can often be executed automatically on the source code. Where Web accessibility evaluation differs from software quality assurance, it often resembles usability assessment which primarily addresses the user interaction rather than the technical implementation. In other words, in between the largely technical domain of software quality assurance and the largely functional domain of Web usability evaluation, the field of Web accessibility evaluation has emerged as an independent discipline in quality assurance. It has developed its own (though related) testing techniques, inspection procedures, and support tools. These basic principles of evaluation will be highlighted and discussed throughout the following sections.
2.1 Evaluation and Development Since Web accessibility evaluation is often carried out with the purpose of improving or maintaining the Web content, it is closely related to the development process. At each stage of the development process, different types of evaluations are carried out to assess different aspects. Figure 1 illustrates a Web development life cycle that consists of a ‘‘Requirements’’, ‘‘Design’’, ‘‘Implementation’’, as well as ‘‘Operation’’ stages, each of which is connected in a closed loop. The terminology is based on the standard ‘‘ISO 12207 – Software Life Cycle Processes’’ (ISO 1995) but was slightly adapted for Web development. The following sub-sections will explain these stages in more detail and highlight their relationship to evaluating accessibility.
2.1.1 Requirements Phase During the requirements phase of Web development, the objectives and requirements are identified. Typically, common requirements analysis techniques such as sketches, storyboards, and personas are used to develop the requirements and ensure that they address the needs of the end-users. The accessibility requirements are ideally also considered during this early stage of development as it will save valuable time and effort in addressing these requirements later. For example, a common accessibility provision is to grant the user sufficient time to complete tasks, such as by alerting the user before a timeout is reached. Such provisions can influence the requirements on the overall behavior and
Fig. 1 Stages of the Web development life cycle
characteristics of the Web content. Also the formulation and presentation of text and multimedia are subject to accessibility requirements. The first step in analyzing the accessibility requirements for specific Web content is to learn about how people with disabilities interact with the Web, and understand some of the common issues that can confront specific users. A usercentered design process can be a very helpful method for including people with disabilities from the start, for example by constructing concrete personas or scenarios that involve people with disabilities (Henry 2007). Also Web accessibility standards and guidelines, such as the W3C Web Content Accessibility Guidelines, provide a comprehensive listing of the requirements for people with disabilities, and are therefore an important first step in setting out the objectives, goals, and functional properties of Web content.
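One of the provisions mentioned above, granting users sufficient time and alerting them before a timeout is reached, translates directly into a requirement on page behaviour. The TypeScript sketch below is a hypothetical illustration of that requirement, with invented durations and function names; a production page would use an accessible dialog rather than the built-in confirm() prompt.

```typescript
// Hypothetical sketch of the "sufficient time" requirement: warn the user
// before a session timeout and let them extend it, instead of expiring
// silently. Durations and function names are invented for illustration.

const SESSION_MS = 10 * 60 * 1000; // 10-minute session, for illustration
const WARNING_MS = 2 * 60 * 1000;  // warn 2 minutes before expiry

let expiryTimer: number | undefined;
let warningTimer: number | undefined;

function startSession(onExpire: () => void): void {
  warningTimer = window.setTimeout(() => {
    // A real page would use an accessible dialog rather than confirm(),
    // but the requirement is the same: ask before time runs out.
    if (window.confirm("Your session is about to expire. Do you need more time?")) {
      resetSession(onExpire);
    }
  }, SESSION_MS - WARNING_MS);

  expiryTimer = window.setTimeout(onExpire, SESSION_MS);
}

function resetSession(onExpire: () => void): void {
  window.clearTimeout(expiryTimer);
  window.clearTimeout(warningTimer);
  startSession(onExpire);
}

startSession(() => console.log("Session expired."));
```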
2.1.2 Design Phase After the objectives and requirements are matured, first prototypes and mockups to validate the requirements are created. During this stage of development, formative and functional evaluations should be carried out on the overall design concepts, for example on the document structure, the navigational features, the color schemes, and general presentation. Often the templates that later control the overall presentation and key interaction concepts (the ‘‘look and feel’’) of the Web content are developed during this stage; these are critical assets for evaluation as it is usually much more difficult to change these underlying concepts during later stages of development.
Also during this stage, Web accessibility standards and guidelines provide important guidance on how to design Web content that is usable by people with disabilities, for example by determining the relevant accessibility provisions and assessing how well the Web content conforms to them. Note that accessibility provisions usually tend to also improve the usability for a larger audience (beyond people with disabilities), and are therefore especially worthwhile. However, without understanding some of the basic principles of how people with disabilities interact with the Web, it can often be difficult to examine the accessibility of the Web content. Screening techniques that help explore some of the potential issues can be useful methods for learning about the scope, intention, and implications of the accessibility provisions (Henry 2007).

2.1.3 Implementation Phase

Once the overall design has matured, the realization of the actual Web content starts during the implementation phase. This primarily involves developing the markup code that controls the content structure, as well as the server- and client-side scripts that control the functional behavior of the content. However, this development stage also involves the creation of the informational substance of the Web content, such as the text, video, or sound resources, which often tends to be neglected during the evaluation processes even though it is equally important to the more technical aspects of implementation. It is important to understand that accessibility is not the responsibility of the technical developers alone, but also the responsibility of the content authors, who need to help assess how well the information is formulated and presented to the end-users, for example whether the text meets the readability provisions. Accessibility evaluations carried out during the implementation phase are usually summative, as they are carried out on real content rather than on the design concepts. The scope of the evaluations can be individual parts of the Web content, to ensure that they meet the accessibility requirements, or larger compositions of the parts, to see how well they work with each other as a whole. For example, the implementation of a Web site should involve evaluating the individual Web pages (or representative samples thereof) as they are being developed, as well as evaluating the accessibility requirements that affect the whole Web site, such as the navigation or consistency.

2.1.4 Operation Phase

If the Web content has been developed with consideration for accessibility and meets certain standards for accessibility, then the evaluations that are carried out during the operation phase are primarily intended to maintain that level of quality or possibly to identify additional optimizations that can be made to further improve the quality. However, many Web sites continue to be developed with little or no consideration for accessibility, so that broader-scoped evaluations are necessary to determine the overall level of accessibility and the
potential issues for people with disabilities. The following are some of the different types of evaluations that are typically carried out during the operation phase; ideally these different approaches are combined:
Conformance audits – address whole Web sites or Web site parts to assess how well these meet certain Web accessibility standards and guidelines
On-going monitoring – can be recurring conformance audits or evaluation of individual Web pages or Web site parts as they are being published
Focused assessments – examine specific aspects in more detail to improve or optimize the accessibility solutions provided by the Web content
Exploratory audits – exploration of some of the pertinent issues to assess the overall performance and gather requirements for future repairs.

An important aspect of Web content development is that the majority of the content tends to be published during the operation and maintenance phases, outside the control of the internal development environment. For example, much of the content tends to be published by employees using content management systems and other types of authoring tools, rather than by the technical developers. Although these publications are usually individual Web pages or even smaller segments, they can constitute large amounts of content and quickly affect the overall accessibility of the Web site. It is therefore necessary to develop and install appropriate mechanisms to ensure the accessibility of the content on an on-going basis, for example by means of peer review or publication workflow processes. It is particularly effective when the content authors can evaluate and address the relevant accessibility requirements themselves, rather than compensating for this with a higher quality assurance overhead.
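Such a publication workflow gate can be sketched in a few lines. The example below is only an illustration under assumed conditions: the check functions and the "publish / peer-review" decision are hypothetical placeholders, not features of any particular content management system, and real checks would be far more thorough than these crude substring tests.

```python
# Minimal sketch of a publication workflow gate: each automated check is a
# function that returns a list of warnings; content is only published
# directly when no warnings remain, otherwise it is routed to peer review.
# The individual checks here are deliberately crude placeholders.

def check_page_title(html: str) -> list:
    return [] if "<title>" in html.lower() else ["Document has no TITLE element"]

def check_language_attribute(html: str) -> list:
    return [] if "lang=" in html.lower() else ["Root element carries no LANG attribute"]

AUTOMATED_CHECKS = [check_page_title, check_language_attribute]

def publication_decision(html: str) -> str:
    warnings = [w for check in AUTOMATED_CHECKS for w in check(html)]
    if not warnings:
        return "publish"
    print("Routing to peer review:")
    for w in warnings:
        print(" -", w)
    return "peer-review"

if __name__ == "__main__":
    draft = "<html><head><title>Press release</title></head><body>...</body></html>"
    print(publication_decision(draft))
```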
2.2 Testing Techniques

After having explored the different Web development stages and the different types of evaluations that typically occur during these stages, this section explores some of the testing techniques that are combined depending on the purpose and context of the evaluation. In general there are three basic types of testing techniques: "automated testing" is carried out by software tools, "manual testing" is carried out by human evaluators who may be experts or novices, and "user testing" is carried out by end-users in informal or formal settings (Fig. 2). While these techniques overlap in many of the accessibility issues that they can address, each one is better suited to addressing specific types of issues. Note that the accessibility provisions set out by standards and guidelines can realistically only address a subset of all the possible issues, and that testing techniques can test beyond these provisions. Optimal results are achieved by combining different approaches to benefit from each of their specific advantages.
Fig. 2 Relationship between Web accessibility testing techniques. Note: the areas in the diagram do not represent a scale or measure.
2.2.1 Automated Testing

Automated testing is carried out without the need for human intervention. It is cost-effective and can be executed periodically over large numbers of Web pages. At the same time, automated testing only addresses a subset of the accessibility provisions set out by most standards. This is inherent to the nature of the provisions, which tend to be qualitative and address user interface, interaction, and natural language aspects. For example, a common Web accessibility provision is to ensure that the document markup reflects the semantic structure that is conveyed through its visual presentation, yet it is difficult to develop algorithms that analyze such semantics. Another difficulty of automated testing is simply computational limitations. For example, a common accessibility provision is to ensure sufficient color contrast between the foreground and the background of Web content. For text in HTML and other document formats, this can be calculated using the red–green–blue (RGB) values; for bitmapped content such as images, however, it is generally difficult to differentiate between foreground and background pixels automatically.
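To make the color-contrast point concrete, the following sketch computes the contrast ratio of a text and a background color using the relative-luminance formula defined for WCAG. It is a minimal illustration: it applies only to colors that are declared in the markup or style sheets and, as noted above, says nothing about bitmapped images.

```python
# Contrast ratio between two sRGB colors, following the relative-luminance
# definition used by WCAG. Thresholds such as 4.5:1 would be checked by the
# caller, not here.

def _linearize(channel: int) -> float:
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb) -> float:
    r, g, b = (_linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(foreground, background) -> float:
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)

if __name__ == "__main__":
    # Dark grey text on a white background
    print(round(contrast_ratio((68, 68, 68), (255, 255, 255)), 2))
```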
In general, one could differentiate between the following types of automated testing:

Syntactic checks – analyze the syntactic structure of the Web content, such as checking the existence of ALT-attributes in IMG elements or LANG-attributes in the root HTML elements. While these types of syntax checks are reliable and quite simple to realize, they only address the small subset of the provisions that relate to syntax issues

Heuristic checks – examine some of the semantics in the Web content, such as the layout and markup or the natural language of the information. While these types of checks cover a broader range of provisions, they are considered less reliable and usually only serve as warnings for human evaluators to further validate and confirm potential accessibility issues

Indicative checks – use statistical metrics and profiling techniques to estimate the performance of whole Web sites or large collections of Web content. While
these checks are too imprecise for a detailed assessment of the Web content, they are useful for large-scale surveys, for example to monitor the overall developments in the public sector of a country

2.2.2 Manual Testing

In practice, the majority of the tests need to be carried out by human evaluators, even if they are sometimes guided or supported by software tools. For example, while software tools can quickly determine the existence of ALT-attributes in HTML IMG elements, human evaluators need to judge the adequacy of the text in these attributes. In some cases automated heuristic checks can provide additional assistance for the human evaluators, for example by triggering warnings if the ALT-attribute contains typical default texts such as "image" or "spacer"; however, the primary responsibility for making the final decisions rests with the human evaluators.
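The heuristic ALT-text warnings described above might look roughly like the following sketch. The list of suspicious default texts is illustrative only, and the warnings are merely hints; the final judgement about the adequacy of a text alternative remains with the human evaluator.

```python
# Heuristic check: warn about ALT texts that look like placeholders
# ("image", "spacer", a file name, ...). The warnings are hints for a
# human evaluator, not automatic pass/fail decisions.

from html.parser import HTMLParser

SUSPICIOUS_ALT_TEXTS = {"image", "img", "picture", "spacer", "graphic"}

class AltTextHeuristics(HTMLParser):
    def __init__(self):
        super().__init__()
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        src = attrs.get("src", "<unknown>")
        if "alt" not in attrs:
            self.warnings.append(f"{src}: no ALT attribute at all")
            return
        alt = (attrs["alt"] or "").strip().lower()
        if alt in SUSPICIOUS_ALT_TEXTS:
            self.warnings.append(f"{src}: ALT text looks like a placeholder ('{alt}')")
        elif alt.endswith((".gif", ".jpg", ".jpeg", ".png")):
            self.warnings.append(f"{src}: ALT text appears to be a file name ('{alt}')")

if __name__ == "__main__":
    checker = AltTextHeuristics()
    checker.feed('<img src="logo.png" alt="spacer"><img src="chart.png" alt="Sales by quarter, 2007">')
    for warning in checker.warnings:
        print("WARNING:", warning)
```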
Because manual tests cover a broad range of accessibility provisions and have varying degrees of software tool support, they place varying requirements on the skills and knowledge of the human evaluators. Some tests can be carried out by non-technical evaluators while others may need more technical knowledge; some tests can be carried out by evaluators who only know the basic principles of accessibility while others may require significantly more domain knowledge. While the required skills are mainly determined by the nature of the tests to be carried out, the software tool support provided to the evaluator can also be an important factor. In general, one can differentiate between the following types of manual testing:

Non-technical checks – can be carried out by non-technical evaluators such as content authors, for example to determine whether the ALT-attributes describe the purpose of the images appropriately or whether the captions (or transcripts) for multimedia content are effective and correct

Technical checks – are usually carried out by Web developers who have technical skills and basic knowledge about Web accessibility. Such checks typically address markup code and document structure as well as compatibility with assistive technology and other programming aspects

Expert checks – are carried out by evaluators who have knowledge of how people with disabilities use the Web and who can identify issues that relate to the user interaction. This is comparable to "walkthroughs" and "heuristic evaluations" in the field of usability engineering, as the experts anticipate the issues that end-users may encounter in the content

2.2.3 User Testing

User testing is carried out by real end-users rather than by human evaluators or by software tools. It is a broad field of study and will be discussed further in the next chapter of this book; however, in the context of this chapter it is important
to note that it is a testing technique that complements the others highlighted so far. It focuses on the end-users and how well the technical solutions match their needs in a specific context. For example, it is generally good practice to provide orientation cues and landmarks to help users navigate through the Web content; at the same time, too many cues can be irritating or even become barriers in their own right. While it is the goal of accessibility standards to capture such conflicts and define provisions to avoid them, studies show that even experienced usability evaluators only find about 35% of the problems on average (Nielsen 1993); the same applies to accessibility. Involving people with disabilities during an evaluation helps clarify the accessibility issues and implement more effective accessibility solutions (WAI 2005c).

Probably the biggest caveat with user testing is the difficulty of filtering out personal bias and preferences in order to identify the actual issues. For example, the fact that a user was not able to complete a task does not automatically mean that there is a valid accessibility issue with the Web content; it could equally be an issue with the browser, the assistive technology, or even the user's ability to use these tools, for example if the user is a novice computer or assistive technology user. While these issues are usually related, it is important to separate them in order to identify the underlying causes and address them accordingly. Although the methods for user testing cover a broad spectrum, in general one can differentiate between two modes (Henry 2007):
Informal checks – are simple and can be carried out by non-experts, for example by asking individual persons such as friends or colleagues for their opinions. While these types of quick checks can be effective and useful, they are coarse and thus prone to personal bias and preferences

Formal checks – are usually carried out by professionals who follow well-established usability procedures. It is important that the evaluators can identify a sufficiently diverse user base and appropriate user tasks, as well as have expertise in how people with disabilities use the Web
2.3 Sampling Strategies

For Web accessibility evaluations that have a large scope, such as a whole Web site or larger collections of Web pages, it is usually not economically feasible to evaluate all of the Web content thoroughly. Instead, sampling strategies are used to assess the overall performance of the Web content. For example, consider an organization that wants to introduce an accessibility policy for its Web site and needs to carry out a detailed analysis of how well the current content meets some accessibility standard. Such types of evaluations are commonly referred to as "conformance evaluations" (WAI 2005b) and can be compared to acceptance tests in traditional software quality assurance (see Section 2.1.4 Operation Phase). In such situations, it is often sufficient to evaluate a diverse selection of Web pages to identify a variety of issues and rerun the evaluation on another
selection after some improvement has been made to the Web site. Depending on the nature of the Web site and the thoroughness of the evaluation, different sample sizes may be necessary to make an informed assessment.

Fig. 3 Effects of sample size and sample diversity

Figure 3 illustrates the relationship between the sample size and the identified types of accessibility issues by means of two curves, "A" and "B". The graph shows that as the sample sizes are increased, more types of accessibility issues can be identified; however, the two curves "A" and "B" have different gradients. Assuming that both of the outlined sampling strategies "A" and "B" were carried out on the same Web site, the initial steepness of curve "A" (showing more identified issues for smaller sample sizes than curve "B") indicates that a more effective sampling strategy was used. It is likely that for "A" more attention was paid to selecting diverse types of Web pages, while for "B" a more systematic page-by-page approach was used. While both strategies eventually converge for sufficiently large sample sizes, in the worst case this could mean evaluating the entire Web content. Conversely, for smaller sample sizes the benefits of an effective sampling strategy become more apparent. In practice, experienced evaluators might only evaluate a dozen or fewer Web pages thoroughly, then confirm the results from these evaluations using quick checks, such as the W3C preliminary evaluation (WAI 2005a), to reduce the amount of effort. There are several factors that influence the optimal sample size; however, for a given Web site and evaluation target, the precise number of Web pages required to provide acceptable results is primarily determined by the consistency of the Web content:
Homogeneous Web content – template-driven Web sites, such as ones that are developed using content management systems, usually have a limited number of Web page types that typically demonstrate specific types of accessibility issues. Web content can also be quite consistent if it is developed using effective
style guides and other measures to ensure a common level of quality (a common "look and feel")

Inhomogeneous Web content – many Web sites have grown over the years and contain content that was developed using different tools, by different developers, or using different technologies. In extreme cases, the Web pages are "hand-grown" and developed by several authors without any effective policies for consistency (such as common tools, style guides, etc.). The consistency amongst these types of Web pages tends to be very low, so that larger sample sizes are required.

Note: see also Section 5.1 for more discussion of sampling strategies.
2.4 Evaluation Methodologies

Effort = (Thoroughness × Volume) / Consistency

Equation 1: Estimating the evaluation effort

An evaluation methodology describes a specific procedure to evaluate Web content, using well-defined testing criteria and processes. The objective of an evaluation methodology is to provide a reliable measurement of the level of accessibility while minimizing the effort required to carry out the evaluation. The term reliability in this context refers to the repeatability of the methodology (in other words, evaluations that were carried out separately, such as by different evaluators, will still demonstrate comparable results) and, more importantly, to the validity of the methodology (in other words, how well the evaluation identifies valid accessibility barriers). The amount of effort required to carry out an evaluation depends heavily on the purpose and the context of the evaluation, and the optimizations that can therefore be made. Equation (1) provides a rough estimate of the required effort; the calculation is based on the following three primary factors:
Thoroughness – how detailed an evaluation should be, for example the level of accessibility against which Web content should be evaluated
Volume – the amount of Web content to be evaluated, for example if it is a large Web site, or just a single page, or something in between
Consistency – the similarity of different parts of the Web content, for example whether it was developed using templates or other style guides

According to this calculation, doing a quick and focused check on a small part of a Web page to confirm the validity of an accessibility solution requires the least effort, while a detailed evaluation of a large "hand-grown" Web site requires the most. This seems like an intuitive and logical statement, but it also
underlines the impact of the initial state of the Web content on the effort required to carry out an evaluation. In other words, to leverage the available resources and provide optimal results, each evaluation process is ideally customized for the specific content. As a consequence, it is important that methodologies are adaptable to different types of situations.

While evaluation methodologies that address Web content during its production are useful for proactive quality assurance, most of the commonly known methodologies tend to address the Web content after it has been developed. These are typically conformance evaluations (see Section 2.1.4 Operation Phase) and usually mix different evaluation approaches to optimize efficiency. For example, a methodology may require that the entire Web site be evaluated automatically first (high volume but low thoroughness); then that some pages are evaluated coarsely to get an impression of the potential types of issues (medium volume and medium thoroughness); and finally that a specifically selected sample of pages is evaluated rigorously (low volume but high thoroughness). W3C describes three evaluation methodologies with varying levels of thoroughness and effort:
Preliminary evaluation (WAI 2005a) – a quick and simple check to identify some of the major issues on a Web site. While it is an informal and coarse approach, it can be carried out by non-technical evaluators

Conformance evaluation (WAI 2005b) – a comprehensive approach to determine whether a Web site conforms to an accessibility standard such as the W3C Web Content Accessibility Guidelines (WCAG)

Involving end-users (WAI 2005c) – goes beyond technical evaluation and includes end-users. While this approach is intended to complement conformance evaluations, it can also be used for other purposes
2.4.1 Audit Versus Certification

While the ultimate goal of an evaluation methodology is to identify the potential issues, the educational purpose may vary. In some cases an evaluation is carried out merely to determine whether or not the Web content meets specific accessibility criteria. This is generally known as certification, where an organization seeks to achieve some form of label or certificate, typically from an accredited or otherwise trusted third party. In such cases, the evaluation report does not usually include any detail about the causes of the issues or advice on how to repair them, but only a listing of the criteria that were or were not met by the Web content.

In many other situations, an evaluation is carried out with the intent of informing and educating the developers about the issues, and possibly advising them on how to repair them. This is generally known as an audit and may be carried out internally, although it is also often carried out with the help of external experts. In fact, many specialized organizations tend to provide a variety of evaluation services, including certification, depending on the needs
of the customer. Usually the overall evaluation methodology is the same for all types of assessments, but the thoroughness and the sample size of the evaluation may vary. Most notably, the format of the report will typically contain varying levels of detail depending on the purpose of the assessment.
2.4.2 Example Methodologies

There are numerous evaluation methodologies addressing various purposes, so it is not possible to list them exhaustively. Nor is there a distinct separation between the different types of evaluation methodologies that would allow them to be categorized usefully. Instead, the following three methodologies are exemplary and highlight some of the different usages and contexts for an evaluation methodology. While the commonality between these three methodologies is that they aim to provide reliable and repeatable procedures, each addresses a different scope or employs different evaluation approaches to achieve this common objective:
Unified Web Evaluation Methodology (UWEM) – a comprehensive approach currently being developed by the WAB-Cluster and based on WCAG 1.0. It includes sampling strategies and testing procedures, as well as methods for aggregating and presenting the evaluation results

European Internet Accessibility Observatory (EIAO 2007) – a large-scale and fully automated evaluation system intended to monitor public Web sites in the European Union. It is based on a subset of UWEM and employs statistical metrics and page profiles to improve the results

Barrierefreies Internet Eröffnet Neue Einsichten (BIENE 2006) – a Web award that involves people with disabilities as a key part of its evaluation methodology. The criteria are based on the German national standard for Web accessibility, BITV, which is itself also based on WCAG 1.0
3 Practical Considerations

While the testing techniques and evaluation approaches discussed in the previous section are important assets for carrying out reliable evaluations, they are theoretical concepts rather than practical experiences. In practice, every evaluation presents a unique situation with unique requirements, especially when the evaluation is carried out with the intent of improving the accessibility and providing optimal solutions for the end-users. It is a question of managing the evaluation optimally, for example by assigning responsibilities, training the employees, installing workflow processes, or acquiring tools to help the evaluators carry out their tasks. Some of these considerations and practical experiences are discussed in this section.
3.1 Distributing Responsibilities

In many cases, evaluations are not carried out as solitary processes, such as by a separate quality assurance team or by a third-party organization, but as an integral part of the initial development or on-going maintenance processes. In these situations different developer roles may be involved throughout an evaluation process. Figure 4 illustrates different developer roles, each of which is outlined below. Note that in smaller organizations an individual developer may take on one or more of these roles during an evaluation process, but the roles are nevertheless distinct:

Author – is non-technical and typically uses WYSIWYG tools, such as content management systems, to publish and modify content. A subset of the accessibility provisions can be evaluated by authors, in some cases with the help of software tools; for example, the text descriptions of images can be evaluated by authors before publication

Designer – is responsible for the visual design and overall "look and feel" (such as the corporate design, etc.). The relevant accessibility issues usually relate to the layout, navigation, and colors used; in some cases they can also relate to technical aspects of the design implementation, for example the CSS presentation and the HTML document structure

Developer – is technical and develops much of the underlying markup code and programming logic. Developers often address a majority of the accessibility issues, especially the ones that relate to the templates or the scripts. In many cases developers also compensate for the poor accessibility support in authoring tools, for example when realizing tables or forms

Webmaster – is also technical but usually monitors and maintains the content rather than actively developing new content. Ideally Webmasters have minor roles in evaluation because the development process ensures the production of accessible content; in reality, however, they have to compensate for much of the inaccessible content published by others
Manager – is non-technical and oversees the evaluation process. The manager monitors the level of accessibility and identifies strategies to address issues, for example by setting priorities to improve a content management system that generates inaccessible content, or by training the authors, developers, or webmasters to improve their skills

Accessibility champion – is a central figure who advises and trains the others involved in the evaluation process, including the manager. The champion can help design the evaluation methodology, process, and workflow, as well as advise on their improvement and optimization

Fig. 4 Roles in Web accessibility evaluation
3.2 Training Evaluators

According to Jakob Nielsen, highly qualified usability evaluators can reduce the cost and effort considerably. In his studies, Nielsen found that usability experts are 1.8 times more effective than novice evaluators; furthermore, usability experts with expertise in the application domain can be 2.7 times more effective than novices (Nielsen 1993). While these studies were carried out on usability evaluations, similar characteristics can be expected for accessibility evaluations. In fact, in more recent studies Chevalier and Ivory observed evaluators while they carried out accessibility evaluations. After giving some of them brief training, they found that even such short, introductory training sessions made the novice evaluators up to 31% more effective than the untrained novice evaluators (Chevalier and Ivory 2003).

Studies such as the ones cited above, as well as practical experience, demonstrate the importance of training and skills in making evaluators more efficient and effective. However, despite training and expertise, individual evaluators can only identify a relatively small portion of the issues that may exist on a Web site. It is therefore advisable to carry out evaluations using a team of multiple evaluators in order to achieve broader coverage (Nielsen 1993). Note that not all evaluators in such a "review team" need to have the same level of technical expertise or domain knowledge; the different skills and expertise of the evaluators can often be combined to constitute effective review teams (WAI 2006c).

When evaluations are carried out in the context of initial development or on-going maintenance of the Web content, as highlighted in the previous subsection, different kinds of training may be needed to address the different developer roles. At a basic level, the target of the training is to raise the developers' awareness of the importance of accessibility requirements and to promote an understanding of how people with disabilities use the Web. At a more advanced level, the training could include principles of human–computer interaction and user-centered design (Henry 2007). The level of technical detail is also variable, ranging from functional criteria that can be evaluated by non-technical evaluators, for example switching off the CSS style
sheets to examine the structure of the HTML markup, to more detailed analysis of the best practices and source-code implementation.
3.3 Selecting and Using Evaluation Tools

The W3C defines Web accessibility evaluation tools as "software programs or online services that help determine if a Web site is accessible" (WAI 2006a). This is a very broad definition and could also include software that may not have been explicitly designed for the purpose of evaluating Web content. For example, in many cases the standard functions in Web browsers or assistive technologies can be used to evaluate how the Web content renders using different settings. In fact, the W3C preliminary evaluation approach (WAI 2005a) uses such techniques to evaluate the accessibility of Web content. In a narrower sense, however, the term "evaluation tools" usually refers to tools that provide specific functionality serving the purpose of evaluating Web accessibility; a list of some of these types of tools is provided by W3C (WAI 2006b).

While evaluation tools can reduce the effort required to carry out evaluations, it is important to remember that no single evaluation tool can automatically determine the accessibility of Web content. As described in Section 2.2, only a subset of the accessibility provisions can be tested automatically. Moreover, only the syntactic checks among the automated tests can be considered reliable, while the other types of automated tests need to be verified by human evaluators. In the latter case evaluation tools can be compared to dictionaries in word processors, as they only help identify potential issues rather than determine real issues (Henry 2002). At the same time, the syntactic checks carried out by automated tools can be very useful, especially for monitoring, as they can be executed periodically over large volumes of Web content.

Since the majority of the accessibility provisions (those that are not evaluated by users) need to be evaluated manually, an important property of evaluation tools is how well they address these types of checks. Accordingly, there are many different types of evaluation tools that provide different functionality and approaches to address the breadth of the requirements. Some focus on evaluating one or two specific provisions, such as color contrast or tables, while others are more generic. While the coverage of accessibility provisions can be an important aspect for selecting evaluation tools, it is also important to consider how different tools support the provisions. For example, some tools provide more guidance to help the evaluators carry out the evaluation tasks than others. A variety of considerations for selecting tools is outlined in WAI (2006a).

Besides the functionality, the user interface and usage characteristics are important properties of evaluation tools. For example, evaluation tools that execute automated tests tend to generate long reports with the results of these test executions. These reports can sometimes be customized or provided in machine-readable formats (such as XML or EARL) and are useful listings of
defects ("bugs") for developers to fix. In some cases, the evaluation tools may provide options for evaluators to supplement the results of the automated tests with manual tests and include these in the reports. For example, the tools may guide the evaluators through specific manual checks using a step-by-step wizard or other dialogs and include the results as part of the full report. In other cases, the evaluation tools may display the results using icons placed on the actual Web content, which can often be a useful approach for novice evaluators because it relates the results to their corresponding location in the Web content.

The different user interface and usage characteristics of evaluation tools become especially relevant for manual checks, as the type of assistance required by specific evaluators is typically a matter of preference. For example, step-by-step wizards are usually quite simple to use and provide a high degree of guidance; however, for more experienced evaluators such dialogs may be verbose or even tedious. Such evaluators may instead prefer functions that highlight or outline specific parts of the Web content, for example functions that show the heading structure of a document, the reading order of tables, or the text descriptions of images. Note that the user interfaces of evaluation tools may themselves exhibit accessibility issues.

In practice, evaluators may indeed select or combine different tools depending on their functionality and on the specific context of the evaluation. For example, some tools may provide more appealing functionality for evaluating multimedia content while others may be more appealing for evaluating PDF or other document formats. This is more common when evaluators do not have access to tools that provide broad ranges of functionality to satisfy the different evaluation purposes and situations. It is not uncommon for evaluators to use automated tools to get a coarse idea of some of the potential issues before using more specialized tools to evaluate specific aspects. In general, evaluation tools do not integrate well with each other or with authoring tools. However, some evaluation tools do provide programming interfaces (APIs) through which they can be extended or integrated into specific authoring tools.
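As an illustration of the highlighting functions mentioned above, the following sketch extracts the heading structure of an HTML document so that an evaluator can review the document outline at a glance. It is a minimal example rather than a feature of any particular tool.

```python
# Extract the heading structure (H1-H6) of an HTML document so that a human
# evaluator can review the document outline at a glance.

from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

class HeadingOutline(HTMLParser):
    def __init__(self):
        super().__init__()
        self._current_level = None
        self._buffer = []
        self.outline = []          # list of (level, heading text)

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._current_level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._current_level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._current_level is not None and tag == f"h{self._current_level}":
            self.outline.append((self._current_level, "".join(self._buffer).strip()))
            self._current_level = None

if __name__ == "__main__":
    parser = HeadingOutline()
    parser.feed("<h1>Report</h1><p>...</p><h2>Methods</h2><h4>Sampling</h4>")
    for level, text in parser.outline:
        print("  " * (level - 1) + f"h{level}: {text}")
    # The jump from h2 to h4 is the kind of issue an evaluator would spot here.
```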
4 Future Directions

The Web is continuously evolving, and with it the authoring tools, the browsers, and the assistive technologies have evolved too. Some of these evolutions are beneficial, for example the generally improved standards compliance of Web browsers and assistive technologies. At the same time, other evolutions bring new challenges, for example the emergence of authoring tools in the form of wikis or blogs. Web applications with complex user interfaces and interaction modalities, often referred to as Web 2.0, likewise present new opportunities as well as challenges for people with disabilities. Accordingly, the methods for evaluating Web content must evolve to address the current requirements for accessibility on the Web today.
As the foundation for evaluation is provided through the accessibility provisions, the development of the W3C Web Content Accessibility Guidelines (WCAG) 2.0 will be an important milestone for evaluating Web accessibility. Since WCAG 2.0 is designed to address different Web technologies and interactive applications, it will facilitate the evaluation of these and shift the current focus of evaluation, which is still largely based on static HTML and CSS pages. WCAG 2.0 is also designed to be more testable than its previous version, using two key approaches: first, each provision has been carefully formulated and rigorously reviewed for any remaining ambiguity; secondly, extensive techniques are being developed to provide detailed guidance.

For evaluators and for evaluation tool developers, the WCAG 2.0 Techniques could become even more important than the Guidelines document itself. The aim of the techniques is to provide a collection of different approaches that satisfy the guidelines. For example, Technique "#H37: Using alt attributes on img elements" is one way of satisfying the requirement set forth by Success Criterion "1.1.1 Non-text Content". Within Technique #H37, evaluators and developers can find guidance that includes a detailed description of the solution, code examples, and test procedures. While Technique #H37 is intended for HTML content, there are others intended for other Web technologies such as CSS. In the future, additional techniques for many other technologies such as SVG, SMIL, Flash, or PDF can be added as modules to the WCAG 2.0 framework; this will be a key enabler for evaluating the modern Web.
4.1 Web Applications

The term "Web applications" typically referred to programs that ran primarily on a server and that had a Web-based user interface, for example an online shopping facility where the user is presented with different Web pages at different stages of the transaction. Examples of such Web pages could include a view of the available products, the selected products in a virtual shopping basket, or the payment details. As Web browsers became increasingly sophisticated and powerful, much of the programming logic was shifted onto them, thus changing them from "thin clients" to so-called "rich clients". Browser support for the XMLHttpRequest API in particular led to this change, as it enabled the Web pages loaded in the browser (the clients) to communicate directly with the server and exchange data without reloading the Web page. AJAX – Asynchronous JavaScript And XML – is a widespread programming technique based on this API.

One of the important implications of rich clients is the gradual loss of significance of the term Web page, which typically referred to a single resource such as a document. Rich clients, however, often serve the purpose of multiple Web pages. For example, the online shopping application outlined above can be developed as a single client application that displays the different dialogs and user
interfaces depending on the current task that the user is trying to accomplish. In the background, the rich client will be exchanging information with the server-side application, for example to fetch data about the selected products or to send the credit card details to the server.

What this means is that Web pages that were typically static (even if they were generated dynamically through server-side scripts) now increasingly have a run-time like any normal software application. It is therefore not enough to evaluate individual Web pages without considering their state and the different paths of execution. For example, what were the state and input parameters of the Web content when it was evaluated, and how well does this state represent all possible states of the content? In traditional software quality assurance it is common practice to consider the run-time and input parameters while carrying out the tests, in order to ensure that the different execution paths of the software are covered. A lot of research and development is needed to make such approaches mainstream in Web accessibility evaluation.

The Evaluation And Report Language (EARL) is one potential candidate to support the evaluation of Web applications. EARL is a vocabulary to describe the results of test executions, for example information about who or what carried out the test and how the test was carried out (automatically, manually, semi-automatically, etc.). Since EARL is expressed in the W3C Resource Description Framework (RDF), it can be extended quite easily to better address specific domains such as accessibility (EARL is itself neutral to the type of test results it records). An extension module to EARL called "HTTP Vocabulary in RDF" provides a mechanism to record the HTTP messages that were exchanged between the client and the server at the time the test was carried out. While this is a rather verbose approach, it allows evaluation tools as well as human evaluators (with the help of semi-automated tools) to record extensive information about the Web content that was evaluated. This information can be used by developers to reproduce the exact situation for debugging purposes, or to analyze the different execution paths that were included in (or excluded from) the evaluation.

Another promising approach to effectively evaluate Web applications is testing based on the Document Object Model (DOM) rather than on the raw representation of the Web resource. For example, many evaluation tools execute tests on the HTML or CSS as received directly from the server without fully rendering these resources (in particular, without executing the scripts). Using this approach, evaluation tools can at most evaluate the initial state of the content, but not the actual representation with which the users are confronted. In order to evaluate Web applications, evaluation tools will need to fully render the content and parse the actual DOM, which may have been manipulated by scripts. This applies not only to automated evaluation tools but also to semi-automated tools; for example, even browser-based toolbar tools sometimes trigger the tests on the raw HTML and CSS rather than on the rendered content that is available through the respective browser APIs. Automated tools could also rely on browser engines to render the content rather than developing this functionality anew; however, this potentially means inheriting the browser bugs or other peculiarities.
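The difference between testing the raw markup and testing the rendered DOM can be illustrated with a small sketch. The browser-automation library used here (Selenium) is only one possible choice and is not mentioned in the text above; any engine that exposes the live, script-manipulated DOM would serve the same purpose, and the URL is a placeholder.

```python
# Sketch of the difference between testing the raw markup as delivered by
# the server and testing the DOM after client-side scripts have run.
# Selenium is used only as an illustration of a browser-driven approach.

import urllib.request
from selenium import webdriver

URL = "https://example.org/"   # placeholder address

def raw_markup(url: str) -> str:
    """What a purely server-side evaluation tool would analyze."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")

def rendered_dom(url: str) -> str:
    """The serialized DOM after client-side scripts have executed."""
    driver = webdriver.Firefox()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()

if __name__ == "__main__":
    before = raw_markup(URL)
    after = rendered_dom(URL)
    # On script-heavy pages the two counts can differ considerably.
    print("IMG elements in raw markup:   ", before.lower().count("<img"))
    print("IMG elements in rendered DOM: ", after.lower().count("<img"))
```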
4.2 Web Technologies

From the early inception of the Web, the predominant technologies were HTML and CSS. While HTML and CSS continue to be the core languages of the Web, today many more Web technologies have become available and need to be addressed too. For example, many Web sites use the PDF format to deliver a substantial portion of their information; typically these documents tend to be formal or otherwise significant for the Web site end-user. Multimedia content is now also delivered through a whole spectrum of Web technologies such as SVG, SMIL, Flash, QuickTime, or MPEG. While some of the current Web technologies are open and royalty-free, others are provided under commercial licenses or are proprietary in the first place. This new scenario of multiple Web technologies with varying characteristics poses a number of challenges for evaluating the accessibility of Web content:
Evaluators need to learn and understand many more Web technologies and the specifics that are relevant for evaluating their accessibility.
Researchers and experts may not have easy access to the internal aspects of these technologies in order to develop evaluation approaches.
Licensing terms of some technologies may make it difficult to facilitate broad support in evaluation methodologies and evaluation tools.

As noted in the introduction to this section, the W3C Web Content Accessibility Guidelines (WCAG) 2.0 is designed to provide a comprehensive set of technology-independent provisions that can be applied to any Web technology. The techniques layer provides a framework of modules that describe some of the established solutions for satisfying the provision requirements in different technologies. While the focus during the current stage of WCAG 2.0 is on developing modules for HTML, CSS, and scripting, other modules are expected when the guidelines become more stable (these are currently work in progress, published as a "Working Draft"). While WCAG 2.0 promises to provide a major improvement in harmonizing and aligning the evaluation procedures regardless of the underlying technology, it will not be able to completely neutralize the growing complexity and effort required to evaluate the increasing number of Web technologies. More systematic education and training programs for evaluators will therefore continue to gain importance.
4.3 Web Quality Assurance

While Web accessibility is certainly an important indicator of the quality of Web content, there are many other aspects that may have equal importance for a Web site. For example, security and privacy are becoming increasingly
important to end-users as their usage of the Web grows and the number of their purchasing transactions rises. Likewise, as mobile devices and ubiquitous interfaces continue to spread, other guidelines, such as the W3C Mobile Web Best Practices, are being developed to address these specific requirements. While some of the requirements relate directly to accessibility (such as guidelines for the mobile Web or for the aging community), others such as privacy or security are not as closely related. However, the processes used to evaluate many of these requirements are common among the different domains.

In the light of Web applications, the evaluation of many of these requirements is becoming increasingly challenging. For example, even basic validation of the HTML markup is affected by the run-time of any scripts that manipulate the DOM. In other words, a more comprehensive quality assurance framework in which accessibility is one of the parameters may be more beneficial than replicating different instances of the processes and methods to address the broad spectrum of requirements. Note that this does not mean that accessibility evaluation would become less important, but rather that it benefits from a more integrated approach. For example, automated evaluation tools are increasingly employing approaches that separate the test execution engine from the test descriptions so that the latter can be easily updated or customized. As a consequence, some Web accessibility evaluation tool developers have started to implement related requirements such as the mobile Web guidelines. Web accessibility evaluation methodologies also seem to be undergoing similar repurposing efforts for other domains.

Besides working together to develop improved quality assurance methodologies, the development of (domain-neutral) test description languages may be a key enabler for the convergence of the evaluation approaches on a technical level. For example, the WebAIM Logical Rapid Accessibility Evaluation (LRAE) (WebAIM 2006) is an XML vocabulary that complements EARL by describing the actual test procedures rather than the test results. These vocabularies can also be promoted in other domains, which could potentially increase their visibility and accelerate their development.
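The separation of the test execution engine from the test descriptions mentioned above can be sketched as follows. The rule fields and the two example rules are invented for illustration; in practice the descriptions would come from a maintained vocabulary such as LRAE rather than from hard-coded regular expressions.

```python
# Sketch of separating test descriptions from the test execution engine:
# the engine only knows how to run "rules", while the rules themselves are
# plain data that can be replaced or extended (accessibility checks, mobile
# Web checks, ...) without changing the engine. Field names are illustrative
# and regex-based matching is a deliberate simplification.

import re

RULES = [
    {
        "id": "img-alt",
        "description": "IMG elements should carry an ALT attribute",
        "domain": "accessibility",
        "pattern": re.compile(r"<img(?![^>]*\balt=)[^>]*>", re.IGNORECASE),
    },
    {
        "id": "no-frames",
        "description": "FRAMESET documents are discouraged on small devices",
        "domain": "mobile",
        "pattern": re.compile(r"<frameset", re.IGNORECASE),
    },
]

def run_rules(html: str, rules=RULES):
    """Generic engine: apply every rule and collect its findings."""
    results = []
    for rule in rules:
        matches = rule["pattern"].findall(html)
        results.append({
            "id": rule["id"],
            "domain": rule["domain"],
            "description": rule["description"],
            "occurrences": len(matches),
        })
    return results

if __name__ == "__main__":
    sample = '<img src="a.png"><img src="b.png" alt="Diagram">'
    for result in run_rules(sample):
        print(result)
```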
5 Research and Development

Web accessibility evaluation is a broad domain and offers several opportunities for research and development. In particular, there is a need to continue the development and refinement of publicly available, detailed, and reliably repeatable evaluation methodologies. This is not a simple undertaking, as it encompasses analyzing many different situations and scenarios that sometimes pose competing requirements. It also comprises many related sub-topics such as sampling strategies, aggregation methods, and adequate tool support. Besides these rather technical approaches, there is a whole variety of research topics in the field of user-based accessibility evaluation and how these approaches can be
better integrated into evaluation methodologies. This section will, however, focus on the technical aspects, and leave the usability aspects to be discussed more comprehensively in the next chapter of this book.
5.1 Sampling Strategies

As discussed in Section 2.3, effective sampling strategies can significantly improve the performance and efficiency of evaluations that involve larger amounts of Web content, such as whole Web sites. However, the big question is how many Web pages should be selected or, more importantly, which pages should be selected. There are generally three main factors that directly affect the optimal sample selection:
Size of the Web site – While the types of accessibility issues on a Web site are often repeated more or less uniformly throughout the entire site, one can generally assume that the larger the Web site, the more issues will potentially exist. The sample size may therefore need to be adapted to the size of the base population determined by the evaluation scope.

Diversity of the Web pages – The types of accessibility issues found on a Web site correlate more directly with the diversity of the content than with the number of Web pages. This includes structural features such as the types of markup elements (forms, tables, scripts, etc.) and the page layout, as well as the technologies used to realize the Web pages (PDF, SVG, MathML, etc.); more diversity requires a larger sample size.

Precision of the evaluation – The most significant factor affecting the sample is the intended precision of the evaluation. This includes the thoroughness of the evaluation (how many issues should be identified) as well as the repeatability (how well the assessments carried out by different evaluators should correlate). Given these parameters, some evaluation methodologies may evaluate as few as 3 or 4 pages, while others may evaluate 30 or more pages.

Note that in some cases more than one sample can be selected for evaluation, using different combinations of the testing techniques described in Section 2.2. A common approach in practice is to evaluate the entire Web site automatically, then carry out quick checks on a set of Web pages, and finally evaluate yet another set of Web pages more comprehensively. In other words, a series of evaluations with varying precisions and varying scopes is combined in a cost–effort optimization.

In practice, expert evaluators will have a good sense of which pages to select and how to evaluate them. It is an expertise they acquire over time, and there is therefore generally a high fluctuation among the results that different evaluators provide. Part of this fluctuation is due to different interpretations of the underlying provisions, yet much of it is also due to differences in the
sampling coverage, especially for smaller sample sizes. Developing more systematic guidance and approaches for selecting representative Web pages would contribute to the reliability of the assessments and potentially also optimize the effort required to execute them (Brajnik et al. 2007).

A promising research direction seems to be page profiling: based on Web page metrics that are easy to identify, for example the type and the density of the elements used (such as links, images, text, etc.), different profiles can be deduced. From each cluster of profiles, as few as one or two Web pages may be sufficient to provide acceptable results. As there are many metrics that can be taken from Web pages, a lot of statistical fine-tuning needs to be done in order to provide reliable clusters. The EIAO project (introduced in Section 2.4.2) carried out extensive research on such profiling and clustering approaches. However, the project's focus is on automated evaluation, and its results will need to be further adapted to serve other types of evaluations such as manual or user testing approaches.
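A minimal sketch of the page-profiling idea is shown below: each page is described by a few cheap element-count metrics, and pages with the same coarse profile are grouped so that only one or two pages per group need in-depth evaluation. The metric set and the bucketing are purely illustrative and are not the EIAO approach itself.

```python
# Sketch of page profiling: describe every page by a few cheap metrics
# (element counts), then group pages with similar profiles so that only one
# or two pages per group need to be evaluated in depth.

from collections import defaultdict
from html.parser import HTMLParser

TRACKED = ("a", "img", "form", "table", "script")

class PageProfile(HTMLParser):
    def __init__(self):
        super().__init__()
        self.counts = {tag: 0 for tag in TRACKED}

    def handle_starttag(self, tag, attrs):
        if tag in self.counts:
            self.counts[tag] += 1

def profile(html: str) -> tuple:
    parser = PageProfile()
    parser.feed(html)
    # Bucket each count (none / a few / many) so similar pages share a profile.
    def bucket(n):
        return 0 if n == 0 else 1 if n <= 5 else 2
    return tuple(bucket(parser.counts[tag]) for tag in TRACKED)

def cluster(pages: dict) -> dict:
    clusters = defaultdict(list)
    for url, html in pages.items():
        clusters[profile(html)].append(url)
    return clusters

if __name__ == "__main__":
    pages = {
        "/": "<a href='/a'>a</a><a href='/b'>b</a><img src='x.png'>",
        "/contact": "<form><input></form>",
        "/news/1": "<a href='/'>home</a><a href='/news'>news</a><img src='y.png'>",
    }
    for signature, urls in cluster(pages).items():
        print(signature, "->", urls, "(evaluate one or two pages from this group)")
```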
5.2 Measuring Accessibility

Accessibility is a qualitative property and cannot be easily measured or quantified. Even if technical standards are taken as benchmarks, the actual impact of the barriers on the end-users is not always quantifiable. For example, one missing ALT-attribute is usually not a show-stopper, but how many missing ALT-attributes eventually make the Web content unusable? Intuitively, this threshold does not depend on the number of missing attributes alone but also on the type and importance of the images, as well as on the patience (or desperation) of the end-users. Some people argue against such approaches that implicitly introduce some form of tolerance into the conformance to the technical standards, as they fear it may dilute their effectiveness. In this view, a missing ALT-attribute is a failure in the sense of conformance regardless of the other factors, and the measurement of accessibility is therefore the degree to which the Web content conforms to the technical standard or to the more granular provisions therein.

One use case for developing quantitative metrics for Web accessibility is large-scale evaluations that monitor the level of accessibility. While such evaluations are not precise assessments, they can provide useful indicators for Web site owners. In the past, national or international surveys using such approaches have generated a lot of attention and may therefore be useful methods for accessibility advocates and policy makers to monitor the level of accessibility. A recent study on quantitative metrics concludes that while they provide an acceptable level of reliability, further improvements need to be made (Vigo et al. 2007). While the approach followed by Vigo et al. is based on basic automatic tests, a different approach tries to quantify the impact of failed accessibility tests on different user profiles. For example, the impact of a missing ALT-attribute is
different for a blind user than for a user with low vision. This approach (commonly referred to as barrier walkthrough) can be carried out using automated and large-scale evaluations (Bühler et al. 2006) or using expert evaluations (Brajnik 2006). It is important to note that while these approaches for approximating the level of accessibility are potentially interesting for ranking Web sites, the validity of their results (how well these rankings really work with regard to actual users with disabilities, especially multiple disabilities, etc.) needs to be further researched.
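The weighting idea behind such approaches can be illustrated with a small sketch. The barrier types, impact weights, and counts below are invented for the example and are not taken from the cited studies; they only show how the same raw results can yield different scores for different user profiles.

```python
# Sketch of a barrier-weighting idea: the same raw test results are scored
# differently for different user profiles. All barrier types, weights, and
# counts here are invented for illustration only.

BARRIER_WEIGHTS = {
    # barrier type: {user profile: impact weight between 0 and 1}
    "missing-alt-text":   {"blind": 1.0, "low-vision": 0.4, "motor": 0.0},
    "low-color-contrast": {"blind": 0.0, "low-vision": 0.9, "motor": 0.0},
    "small-click-target": {"blind": 0.0, "low-vision": 0.3, "motor": 0.8},
}

def impact_score(found: dict, potential: dict, profile: str) -> float:
    """Weighted failure rate in [0, 1] for one user profile."""
    weighted_failures = weighted_potential = 0.0
    for barrier, weight_by_profile in BARRIER_WEIGHTS.items():
        weight = weight_by_profile[profile]
        weighted_failures += weight * found.get(barrier, 0)
        weighted_potential += weight * potential.get(barrier, 0)
    return weighted_failures / weighted_potential if weighted_potential else 0.0

if __name__ == "__main__":
    found = {"missing-alt-text": 12, "low-color-contrast": 3, "small-click-target": 1}
    potential = {"missing-alt-text": 40, "low-color-contrast": 25, "small-click-target": 10}
    for profile in ("blind", "low-vision", "motor"):
        print(profile, round(impact_score(found, potential, profile), 2))
```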
5.3 Tool Integration

Web accessibility evaluation tools provide essential support for executing and managing evaluation processes effectively and efficiently. However, tools can only assist their users in carrying out specific tasks; they do not perform these tasks on their own. Tools thus have specific uses and need to be applied with skill and knowledge in order to be useful and effective. Conversely, tools need to fit into the tasks and processes of the users in order to provide optimal support, or to be useful in the first place. With regard to the skills and knowledge of the users, appropriate training is one approach to improve the situation. At the same time, additional research and development is necessary to optimize the tool support during Web accessibility evaluation.

In an empirical study using automated evaluation tools, Melody Ivory found that while the tools identified considerably more problems than Web designers did, the throughput of the designers was not significantly improved by the use of such tools. In fact, some of the designers were confused or overwhelmed by the output that the tools generated; in some cases using automated tools led to a reduction in the performance of the designers rather than an enhancement. While Ivory suggests that the primary cause for this mismatch between the expected and the actual designer throughput could lie in the underlying guidelines that the tools evaluate against, other possible factors are also suggested in the study. For example, the state of the Web site, the expertise of the Web designers, and the time needed to learn to use the tools are potential parameters that could impact the effectiveness of the tools (Ivory 2003).

Similarly, studies carried out by Giorgio Brajnik to investigate the optimal use of automated Web accessibility tools suggest that the context in which the tools are used has a crucial impact on their effectiveness. For example, Brajnik notes that "unless automatic Web testing tools are deployed in the normal processes of design, development and maintenance of Web sites, the quality of Web sites is unlikely to improve" (Brajnik 2004a) and that "processes for monitoring, assessing and ensuring appropriate levels of accessibility and usability have to be adopted by Web development and maintenance teams [. . .] automatic tools for accessibility and usability are a necessary component of these processes" (Brajnik 2004b). In other words, the tools and the processes need to work in concert to be useful and effective.
In line with Brajnik's findings, Paul Englefield and colleagues conclude that the evaluation processes and the tools need to integrate and work in synergy. In fact, Englefield and colleagues outline some of the specific issues they experienced while using automated evaluation tools. For example, the inconsistent user experiences amongst different tools, the diverse reporting formats, and the lack of integration with other development tools are described as severe issues that "mar the effectiveness of existing tools" (Englefield et al. 2005). Englefield and colleagues go on to describe an abstract model for an architecture that integrates different types of automated evaluation tools into a single framework. The proposed framework is designed to be extensible with regard to functionality, flexible with regard to customization, and effective with regard to integration into existing development environments. However, it is important to note that this architecture focuses on automated evaluation tools and does not adequately address other types of tools, for example semi-automated or manual Web accessibility evaluation tools.

Based on several studies and observations such as the ones referenced above, one can conclude that there is generally a severe disconnect between Web accessibility evaluation tools and the Web development processes. It seems that Web developers are occasionally forced to go out of their way, to learn how to use several tools, or to accommodate the peculiarities of evaluation tools, rather than being provided with tools that integrate smoothly into their specific development processes. At the same time, different evaluation tools do not generally integrate well with each other or with Web authoring tools; thus the collaboration between the different developer roles is not adequately supported by tools (including evaluation tools).

Promising techniques to enhance the integration of different types of evaluation and development tools are the W3C Evaluation And Report Language (EARL) and the WebAIM Logical Rapid Accessibility Evaluation (LRAE), as introduced in Sections 4.1 and 4.3, respectively. The benefit of these techniques is that they are designed to be generic quality assurance vocabularies; thus they have a better chance of adoption in a broader community. For example, it is possible to use these vocabularies as a format for test harness tools. There is great potential for evaluation tools, especially the semi-automated tools, that has not yet been fully leveraged; further research and development in this field will be an important improvement to Web accessibility.
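As a rough illustration of how a tool might emit its results in such a vocabulary, the sketch below prints a single test result as an EARL assertion in Turtle syntax. EARL was still a W3C Working Draft at the time of writing, so the class and property names used here follow that draft vocabulary and should be treated as illustrative rather than normative; the URIs in the example are placeholders.

```python
# Sketch: emit a single test result as an EARL assertion in Turtle syntax.
# Class and property names follow the EARL Working Draft vocabulary and are
# illustrative; the assertor, subject, and test URIs below are placeholders.

EARL_TEMPLATE = """@prefix earl: <http://www.w3.org/ns/earl#> .
@prefix dct:  <http://purl.org/dc/terms/> .

[] a earl:Assertion ;
   earl:assertedBy <{assertor}> ;
   earl:subject    <{subject}> ;
   earl:test       <{test}> ;
   earl:mode       earl:{mode} ;
   earl:result [ a earl:TestResult ;
                 earl:outcome earl:{outcome} ;
                 dct:description "{description}" ] .
"""

def earl_assertion(assertor, subject, test, outcome, mode="automatic", description=""):
    return EARL_TEMPLATE.format(assertor=assertor, subject=subject, test=test,
                                outcome=outcome, mode=mode, description=description)

if __name__ == "__main__":
    print(earl_assertion(
        assertor="http://example.org/tools/checker",       # hypothetical tool
        subject="http://example.org/page.html",             # page under test
        test="http://www.w3.org/TR/WCAG20/#text-equiv",      # illustrative test URI
        outcome="failed",
        mode="automatic",
        description="2 IMG elements without text alternatives",
    ))
```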
6 Summary and Conclusions

Accessibility is an experiential measure of quality; it is less a property of the Web content than a result of the interplay between the Web content, the browser, and potentially the assistive technology that some people with disabilities may be using to access the content (Slatin and Rush 2003). Authoring tools, evaluation tools, and, more importantly, the quality assurance
processes that are employed during the production and maintenance of Web content are important factors that significantly impact the level of accessibility. While some of these parameters are not controlled by the Web content providers, for example the accessibility support and standards compliance of the browsers, the media players, or assistive technologies, many other parameters need to be adjusted by the Web content providers to ensure an adequate level of accessibility, for example the Web technologies used and the best practices that are employed to implement and realize accessible Web content.

Web accessibility standards and guidelines, for example the W3C Web Content Accessibility Guidelines (WCAG), are crucial indicators for the level of accessibility of the Web content. However, the accessibility provisions in such standards can only address some of the common requirements for the interaction of people with disabilities on the Web; they cannot exhaustively address all of the situations and combinations of users, user agents, and content implementation. In other words, these standards and guidelines should be used as essential guidance to better understand and avoid accessibility barriers, and to construct robust quality assurance measures that address the production and maintenance of accessible Web content. Evaluating Web content for accessibility is therefore not the final stage in the Web development process but ideally an on-going and integral measure from the requirements analysis phase all the way through to the operation and maintenance phases of Web content.

There are different testing techniques and evaluation approaches that complement each other in a comprehensive accessibility assessment. For example, sampling Web pages and combining automated, expert, and user testing approaches to evaluate the samples can be an effective method to manage the accessibility of large Web sites. It is important to understand the benefits and limitations of each of the approaches in order to leverage their effectiveness and maximize the overall evaluation result.

In practice, the responsibility for evaluating Web content is ideally distributed in a review team. This is especially useful when the evaluation is carried out by content authors and developers that have complementary roles and skills. However, even separate quality assurance teams, for example the quality assurance department in an enterprise or the webmaster in smaller organizations, can be more effective when the responsibility is distributed among the entire development team. Training the developers to raise their awareness of accessibility issues, or the evaluators to improve their skills and expertise, greatly improves their effectiveness in identifying issues; in some cases this gain can be as large as 2.7-fold (Nielsen 1993). More importantly, training developers is a useful approach to avoid the creation of barriers in the first place, and is therefore an important asset for any organization.

Finally, evaluation tools are also a key aspect of Web accessibility evaluation as they can, when they are used appropriately, provide essential support for evaluators and developers, and thus significantly reduce the time and effort required to carry out an evaluation. At the same time, studies show that there is a great potential to further improve and enhance Web accessibility evaluation
tools, especially to help integrate them into the development process and environment of the developers. While there are some potential solutions to facilitate this integration, for example the Evaluation And Report Language (EARL) or the Logical Rapid Accessibility Evaluation (LRAE), additional research and development work is necessary to refine these. Further research and development is also necessary to help adapt the existing evaluation approaches and methodologies to today’s Web of rich Internet clients and dynamic Web applications. There is an on-going shift from the traditional document-centric paradigm of the Web to more application-centric concepts, and this shift needs to be reflected in the evaluation approaches and strategies, for example to address the run-time behaviour and execution paths of the applications. Such considerations are common in the field of traditional software quality assurance; however, it is still largely unclear how these approaches can be adopted and adapted to meet the requirements of the Web. This is an exciting opportunity for research and development, and a growing market for a new generation of products and services.
References

BIENE (2006) Barrierefreies Internet Eröffnet Neue Einsichten. German Web award organized by Aktion Mensch and Stiftung Digital Chancen, May 2006.
Brajnik, G. (2004a) Comparing Accessibility Evaluation Tools. Springer Verlag, Berlin.
Brajnik, G. (2004b) Using Automatic Tools in Accessibility and Usability Assurance Processes. Springer Verlag, Berlin.
Brajnik, G. (2006) Web Accessibility Testing: When the Method is the Culprit. In proceedings of 10th International Conference on Computers Helping People with Special Needs (ICCHP), Linz.
Brajnik, G., Mulas, A., and Pitton, C. (2007) Effects of sampling methods on web accessibility evaluations. In proceedings of ASSETS, Tempe.
Bühler, C., Heck, H., Perlick, O., Nietzio, A., and Ulltveit-Moe, N. (2006) Interpreting Results from Large Scale Automatic Evaluation of Web Accessibility. In proceedings of 10th International Conference on Computers Helping People with Special Needs (ICCHP), Linz.
Chevalier, A. and Ivory, M. Y. (2003) Web site designs: Influences of designer’s experience and design constraints. International Journal of Human Computer Studies, 58(1):57-87.
EIAO (2007) European Internet Accessibility Observatory. http://www.eiao.net.
Englefield, P., Paddison, C., Tibbits, M., and Damani, I. (2005) A proposed architecture for integrating accessibility test tools. IBM Systems Journal, 44(3), August 2005.
ISO (1995) Singh, R. (Ed.) Software Life Cycle Processes. ISO 12207, Washington.
Ivory, M. Y. (2003) Automated Web Site Evaluation. Kluwer, Dordrecht.
Henry, S. L. (2002) Web Accessibility Evaluation Tools Need People. Online resource by UI Access, located at http://www.uiaccess.com/evaltools.html, August 2002.
Henry, S. L. (2007) Just Ask: Integrating Accessibility Throughout Design. Lulu.com.
Nielsen, J. (1993) Usability Engineering. Academic Press, Cambridge MA.
Slatin, J. M. and Rush, S. (2003) Maximum Accessibility. Addison Wesley, Boston.
Vigo, M., Arrue, M., Brajnik, G., Lomuscio, R., and Abascal, J. (2007) Quantitative metrics for measuring web accessibility. In proceedings of The 2007 International Cross Disciplinary Workshop on Web Accessibility (W4A), Banff.
WAI (2005a) Abou Zahra, S. (Ed.) Preliminary Review of Web Sites for Accessibility. Online resource by W3C Education and Outreach Working Group, located at http://www.w3.org/WAI/eval/preliminary.html, October 2005.
WAI (2005b) Abou Zahra, S. (Ed.) Conformance Evaluation of Web Sites for Accessibility. Online resource by W3C Education and Outreach Working Group, located at http://www.w3.org/WAI/eval/conformance.html, October 2005.
WAI (2005c) Henry, S. L. (Ed.) Involving Users in Web Accessibility Evaluation. Online resource by W3C Education and Outreach Working Group, located at http://www.w3.org/WAI/eval/users.html, November 2005.
WAI (2006a) Abou Zahra, S. (Ed.) Selecting Web Accessibility Evaluation Tools. Online resource by W3C Education and Outreach Working Group, located at http://www.w3.org/WAI/eval/selectingtools.html, March 2006.
WAI (2006b) Abou Zahra, S. (Ed.) Web Accessibility Evaluation Tools. Online resource by W3C Education and Outreach Working Group, located at http://www.w3.org/WAI/ER/tools/, March 2006.
WAI (2006c) Brewer, J. (Ed.) Using Combined Expertise to Evaluate Web Accessibility. Online resource by W3C Education and Outreach Working Group, located at http://www.w3.org/WAI/eval/reviewteams.html, March 2006.
WebAIM (2006) WebAIM Accessibility Evaluation Framework. Online resource located at http://eval.webaim.org/, November 2006.
End User Evaluations

Caroline Jay, Darren Lunn, and Eleni Michailidou
Abstract As new technologies emerge, and Web sites become increasingly sophisticated, ensuring they remain accessible to disabled and small-screen users is a major challenge. While guidelines and automated evaluation tools are useful for informing some aspects of Web site design, numerous studies have demonstrated that they provide no guarantee that the site is genuinely accessible. The only reliable way to evaluate the accessibility of a site is to study the intended users interacting with it. This chapter outlines the processes that can be used throughout the design life cycle to ensure Web accessibility, describing their strengths and weaknesses, and discussing the practical and ethical considerations that they entail. The chapter also considers an important emerging trend in user evaluations: combining data from studies of ‘‘standard’’ Web use with data describing existing accessibility issues, to drive accessibility solutions forward.
1 Introduction

Web accessibility guidelines provide a valuable starting point for designers trying to ensure that visually disabled users are able to use their Web sites. Unfortunately, conforming to the guidelines in no way guarantees that a Web site is genuinely accessible (Petrie and Kheir 2007). Even when pages are built according to guidelines that are meant to increase accessibility, there still seem to be ‘‘disabilities’’, which result from an overreliance on the syntactic checking of Web pages (Takagi 2004). The problem is summed up by Hanson (2004), who says, ‘‘specifications for accessibility of Web pages do not necessarily guarantee a usable or satisfying Web experience for persons with disabilities. It is not uncommon to have pages that meet standards but are still difficult to use by persons who have difficulties’’ (p. 1).
In order to ensure a Web site is as ‘‘user friendly’’ and accessible as possible, it should be evaluated by its end users at key points in the development process. In the initial stages, end user studies can be used to capture requirements and document existing accessibility problems; throughout development, user studies provide a means of iteratively evaluating particular site components, feeding useful information back into the design process; at the final stages of development, an end user evaluation can be used to gauge how well the site achieves its intended goal.

Table 1 shows common user evaluation paradigms, alongside the ways in which they can be used to improve Web accessibility. While it may be reasonable in some circumstances to run just one type of evaluation, which may be purely quantitative or qualitative, it is usually far better to run a study that is able to combine various types of measures (e.g. task completion time, questionnaire results, and observational analysis) to provide a broad picture of the user’s performance, experience, and preferences. Constructing a study that achieves this is not a trivial task – every aspect of the process, from the experimental design to the data analysis, is subject to potential problems. Difficulties can arise as a result of both methodological issues (which tasks should users complete? how many conditions should there be?) and practical limitations (how can I obtain a representative sample of users? what is the best way to use limited time resources?).

The next section provides an overview of standard user evaluation paradigms, and the technical, practical, and ethical considerations that affect experimental design. A discussion of how the various factors interplay to determine the type of evaluation that is appropriate or feasible is provided in Section 3. Section 4 looks beyond the current usability evaluation paradigms, to new approaches that combine modelling of ‘‘standard’’ Web experience with knowledge gained from disabled or small-screen user studies, to inform techniques that aim to maximize Web accessibility. Section 5 outlines the authors’ opinion of the field, and Section 6 concludes by summarizing the most important reasons for conducting end user evaluations, and the best ways in which to use them.
Table 1 Common user evaluation paradigms

Paradigm               Data                           Common uses
Performance measures   Quantitative                   Acquiring numerical values to judge Web site success
Logging user actions   Quantitative                   Retrieving low-level user interaction
Questionnaires         Quantitative and qualitative   Understanding user perception of a Web site
Observations           Qualitative                    Obtaining real-time interaction information
Interviews             Qualitative                    Collecting users' implicit knowledge of a Web site
Think aloud            Qualitative                    Gaining thoughts, feelings, and opinions of a user's interaction experience
2 Overview

All user studies require careful planning as it is often difficult to find users willing to participate in an evaluation. When users are recruited, using their time effectively is essential. In this section, we will provide overviews of evaluation paradigms that can be used to acquire information about a Web site. The paradigm used will be dependent on what aspect of the Web site is being evaluated. Having established an appropriate evaluation paradigm, a wider strategy including analysis of the data, ethical treatment of participants, and sampling of participants also needs to be developed. Devising such a strategy will be the topic of Section 2.2.
2.1 Commonly Used Evaluation Paradigms

Evaluation methodologies fall into two broad camps: those that provide quantitative data and those that provide qualitative data. Quantitative evaluations involve the acquisition and analysis of measurable data that are obtained from the user as they interact with a Web site. As the results of quantitative evaluations are numeric, the results can generally be analysed using statistical techniques (Dix et al. 1993), which allow complex usability issues to be expressed as a numerical value that is easy to compare and discuss (Nielsen 2004a). Qualitative evaluations produce non-numeric results that are descriptive and often subjective. Evaluations that use qualitative techniques try to elicit users’ implicit knowledge about, and perception of, the evaluated Web site. When visiting a Web site, each user has a specific goal based on their needs and experience that influences the problems that they have while interacting with a Web page. Qualitative techniques help the evaluator to understand the individual problems, reactions, and expectations of the user. The remainder of this section provides an overview of some commonly used quantitative and qualitative evaluation paradigms, summarized in Table 1.

2.1.1 Performance Measures

Measuring the performance of the user in a collection of tasks forms the basis of traditional research on human factors (Nielsen 2004b). Such experiments are often performed in the initial stages of a project, and throughout iterative development, to guide the design process. Examples of performance measures that can be obtained during user evaluations on Web sites include the following:
- Time required by the user to complete a task
- Time spent navigating the Web site menu
- The number of incorrect link choices
- The number of pages viewed that are incorrect
- The number of observations of user frustration
- The frequency of links that are never traversed
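As a concrete illustration of the first and third measures above, the short sketch below derives a task completion time and a link-choice accuracy from a hypothetical event log. The event format, field names, and values are invented for this example; real logging software will produce different records.

```python
# Hypothetical sketch: deriving two performance measures from a simple event
# log. The log format is invented for illustration only.
from datetime import datetime

events = [
    {"time": "2008-01-15 10:00:05", "type": "task_start"},
    {"time": "2008-01-15 10:00:41", "type": "link_click", "correct": False},
    {"time": "2008-01-15 10:01:10", "type": "link_click", "correct": True},
    {"time": "2008-01-15 10:01:12", "type": "task_end"},
]

def ts(event):
    """Parse the timestamp of an event."""
    return datetime.strptime(event["time"], "%Y-%m-%d %H:%M:%S")

start = next(e for e in events if e["type"] == "task_start")
end = next(e for e in events if e["type"] == "task_end")
clicks = [e for e in events if e["type"] == "link_click"]

completion_time = (ts(end) - ts(start)).total_seconds()
correct = sum(1 for c in clicks if c["correct"])
accuracy = correct / len(clicks) if clicks else 0.0

print(f"Task completion time: {completion_time:.0f} s")
print(f"Link-choice accuracy: {accuracy:.0%} ({correct}/{len(clicks)})")
```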
This list is not exhaustive – any technique that can objectively capture numerical values can be employed. It is interesting to note, however, that although the above list provides quantitative measures, some of those measures are applied to qualitative information. For example, counting the number of occurrences when the user is frustrated provides a quantitative value, yet determining whether a user is frustrated relies upon the qualitative judgement of the observer. In such situations it is recommended that the same observer perform the evaluation on all participants to avoid differences in judgement and to ensure a level of consistency in all quantitative data that are obtained (Dumas and Redish 1999).

When selecting performance measures, it is important for the evaluator to choose a measure that is related to the accessibility issue they are trying to understand. For example, if a Web site evaluator is concerned that cognitively impaired users may have difficulty selecting the correct links on a page, then an appropriate measure would be to determine the proportion of correct link choices compared to the erroneous link choices. Choosing the wrong measure will lead to misleading results when the data are analysed and the final results presented. In this case, the number of incorrect links clicked may not represent accuracy: in one condition a user may inaccurately click three links, but may not click any links accurately; in another a user may inaccurately click five links, but may click four further links accurately.

2.1.2 Logging User Actions

Logging user actions allows the evaluator to capture a continuous stream of quantitative data as tasks are performed. Capturing data logs is particularly useful as it allows for the automatic acquisition of large amounts of data, from multiple users, over a long period of time (Nielsen 2004b). Logging can occur through the user’s own equipment without the need for observers to be present, allowing the capture of realistic use patterns. With many other evaluation strategies, users are required to perform tasks in a laboratory, or have evaluators observe and query their actions. In contrast, the logging process is a background task and is less invasive than other evaluation techniques. While users are initially aware that they are being observed, over time the process becomes invisible and users often forget that logging is taking place. They therefore revert to accomplishing tasks that they would typically perform in their day-to-day activities (Faulkner 2000). However, while users can often behave as though logging is not occurring, the evaluator should always inform users of what actions will be captured and why. Failure to do so raises serious ethical issues, and in some countries covertly capturing user data is illegal.

A major advantage of using logging for Web site evaluations is that Web servers typically have logging facilities built into the server software. This can produce records of server activity that can be analysed to describe user behaviour within the Web site. A typical application of Web server logging is to enhance navigation for the user by establishing common paths through a Web
site. An analysis of the Web logs can reveal how users navigate to certain pages within the site and the length of time users spend viewing the content before moving to other pages. By using such an analysis, links can be arranged to allow users to travel to pages of interest directly (Doerr et al. 2007). This can enhance user mobility, which is key to the accessibility, design, and usability of Web sites (Harper et al. 2001). A minimal sketch of this kind of path analysis is given at the end of Section 2.1.

The logging discussed so far has been at a level of abstraction whereby only the links a user follows and the pages they visit can be captured, although it is possible to install logging tools that capture finer grained activity, such as keystrokes and mouse movements (Hammontree et al. 1992). Regardless of the granularity, all log data analysis faces the same problem of mapping low-level system usage to high-level user tasks (Ivory and Hearst 2001). Evaluators can understand what users did when interacting with a Web site, but cannot necessarily understand why they did it, and if they achieved their goal in a satisfactory manner (Nielsen 2004b).

2.1.3 Questionnaires

Questionnaires are used to determine user perception of a system or an interface. They allow evaluators to obtain the users’ opinion of a Web site, in addition to why they formed such an opinion (Dumas and Redish 1999). For example, a survey could discover if a user finds a Web site inaccessible, and why the user had difficulties accessing the content. There are a wide variety of questions that can be asked during a survey; however, questions typically fall into one of two categories – closed questions and open-ended questions.

Closed questions are questions that are asked of the user but have a predetermined set of responses that are deemed appropriate by the evaluator (Sudman and Bradburn 1992). Response types typically consist of categories from which the user answers a question with a numerical rating, or ranks a series of statements in order. Regardless of the question type used, the user is limited in the responses that they can provide. For example, during an evaluation of the clarity of Web site navigation, a closed question could ask, ‘‘How easy was it to navigate between pages? (1) very easy, (2) somewhat easy, (3) somewhat difficult, (4) very difficult’’. By limiting the responses, answers can be easily compared between users, and therefore analysed using statistical methods (see Section 2.2). However, as with all questions, care should be taken that the question is unambiguous, otherwise such comparisons will prove to be misrepresentative (Sudman and Bradburn 1992).

Open-ended questions allow respondents to answer in a free narrative form. This type of questionnaire is valuable for its ability to elicit responses that the investigator may not have considered. Participants can give precise and subjective answers, specifying problems at a much greater level of detail than they could in a closed questionnaire. Nevertheless, problems can arise during data analysis. Even if they result in considerable descriptive data, open-ended questions may not provide adequate information on context. A participant may write an incomplete comment, fail to answer the question, or the researcher may
misinterpret the user’s comments (Leedy and Ormrod 2005, Nielsen 2004b). Analysing data from open-ended questionnaires is time-consuming, so appropriate procedures must be followed during the design stages of the survey. Evaluators should also be wary of questions that appear open-ended but still restrict the user’s responses. Consider the previous example of asking users about the ease of navigation within a Web site. By removing the fixed responses from the question ‘‘How easy was it to navigate between pages?’’ it appears that an open-ended question has been created. However, the wording of the question narrows the scope for users to provide a wide range of answers and so the responses will be similar to those of the question where the answers were explicitly provided. Instead, adapting the question to ‘‘How do you feel about the navigation between pages within the Web site?’’ invites users to be more expressive with the answers that they provide (Patton 2002).

Before deploying questionnaires to users, a prototype should be drafted, reviewed by colleagues who may not be directly involved with the work, and tested on a small sample of users (Gillham 2000). This ensures that the questionnaires are clear and that the user’s understanding of the question matches that intended by the evaluator. It is also important to ensure that the questionnaires are distributed appropriately to the target user group. For general usability questions, an online survey can be a cost-effective option. The effort of printing, distributing, and collecting questionnaires is reduced and there is potential for a large audience to complete the survey (Shneiderman 1998). On other occasions, a more directed paper-based survey is appropriate. For example, evaluators may wish to survey users who have just performed tasks on an experimental design. In this case, the audience is limited to those users who have used the prototype and who are present in the lab. Under such circumstances, a paper-based questionnaire is more appropriate than a wide-ranging online survey. For further details on questionnaire design, see Patton (2002) and Leedy and Ormrod (2005).

2.1.4 Observation

Observing users while they work with an interface is an important usability method that provides real-time interaction information. By directly observing user behaviour, the evaluator can understand how the design of an interface helps or impedes interaction and reveal problems that cannot be identified through task completion measures alone. The user may also interact with the interface in an unexpected way, which can unmask unpredictable problems (Nielsen 2004b). Observation usually occurs during both pre-design and post-design stages of a project. During pre-design, observations are useful for identifying problems and establishing user requirements. In the post-design phase, observations allow the interface to be evaluated in order to establish whether the new design meets user expectations and whether information within the Web page is more usable and accessible.

Observational evaluations in qualitative studies are unstructured, free-flowing, and occur in a flexible setting where the observer can take advantage of any
unpredictable problems that surface (Leedy and Ormrod 2005). However, an inexperienced observer might influence and affect user reaction and behaviour. A video and audio recording of each session can help identify any problems and also keep track of every reaction and comment. On the other hand, recording the session may make participants uncomfortable and affect their behaviour. Even if observations are flexible, such evaluations need to be designed appropriately, and the observer must be trained. The observer needs to be constantly alert for any events, confusions, and interactions, be able to describe and evaluate the quality of the interaction, and be able to take thorough notes without being judgemental. For more information on the experimental design of observational studies and some of the issues that need to be considered, see Leedy and Ormrod (2005).

2.1.5 Interviews

Interviews form another method of collecting users’ implicit knowledge and opinions of an interface or Web site. They can be structured, with a preselected set of questions, or in-depth, where the interviewer does not follow a specific form. Interviews frequently occur as a follow-up to an observational or task performance evaluation, but are also used in an exploratory capacity at the preliminary stages of an investigation (Nielsen 2004b). Here, the interviewer can adjust the interview to target the area in which they wish to gain more information. In some cases, a researcher may interview several users at the same time in a focus group. This technique uses group interaction to generate ideas and observe the users’ various opinions. Focus groups usually occur in the design stages, but they can also replace individual interviews when time and cost are restricting issues. With all types of interviews, the researcher can gather a set of descriptive data that can reveal users’ emotions, thoughts, experiences, expectations and perceptions (Patton 2002).

Questions asked during an interview must be open-ended and phrased in a way that encourages users to elaborate on their opinions in depth. As with observational analysis, the interviewer must stay as neutral as possible to avoid influencing the user, or interpreting the user’s answers in a biased way. In addition, the interviewer must be careful not to ask leading questions, agree or disagree with the user’s statements, or try to explain to the user why the system behaved in a certain way (Nielsen 2004a). Interview evaluations have the disadvantage of being time-consuming and producing data that are difficult to analyse. However, an experienced interviewer can encourage the user to provide in-depth discussion and clarification, which yield rich data that, when correctly analysed, reveal sound conclusions.

2.1.6 Think Aloud

During a think-aloud evaluation, users are observed using an interface while verbalizing their thoughts, feelings, and opinions about their interaction experience. Participants are asked to explain what they are doing, and what motivated
a particular action. This technique is commonly used for Web page usability evaluations because it gives an insight into what users are thinking and how they feel when facing certain problems. It can reveal unpredicted problems, along with suggestions for addressing them. Thinking aloud provides strong qualitative data from a small number of users; however, it is important to consider the user’s comments in context. In some instances, a user may criticize an interface and place too much emphasis on their own theory of how the interface should be designed. In such circumstances, the observer should treat such comments with caution, as how users believe they should interact with an interface and how they actually interact with it can sometimes be disconnected (Nielsen 2004a). Thinking aloud seems unnatural to most people and can make participants uncomfortable, affecting their performance and comments. In addition, verbalizing every thought can slow users down, making performance measurements less representative (Nielsen 2004a). Observers need to be trained to handle this situation, as well as to avoid the more general types of experimenter bias, such as asking questions that influence user reactions.
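Returning to the server-log analysis described in Section 2.1.2, the sketch below shows one minimal way to count page-to-page transitions per visitor, the first step towards identifying common paths through a site. The simplified log records (visitor identifier, timestamp, URL) are invented for illustration; real server logs require proper parsing, and grouping requests into sessions is considerably more involved.

```python
# Hypothetical sketch: counting page-to-page transitions from a simplified
# access log. The log format is invented; real logs need parsing and
# sessionisation before this kind of analysis is meaningful.
from collections import Counter, defaultdict

log = [
    ("visitor1", "2008-01-15T10:00:01", "/index.html"),
    ("visitor1", "2008-01-15T10:00:20", "/products.html"),
    ("visitor1", "2008-01-15T10:01:02", "/contact.html"),
    ("visitor2", "2008-01-15T10:03:11", "/index.html"),
    ("visitor2", "2008-01-15T10:03:40", "/contact.html"),
]

# Group the pages each visitor requested, in time order.
pages_by_visitor = defaultdict(list)
for visitor, timestamp, url in sorted(log, key=lambda r: (r[0], r[1])):
    pages_by_visitor[visitor].append(url)

# Count consecutive page pairs (transitions) across all visitors.
transitions = Counter()
for pages in pages_by_visitor.values():
    transitions.update(zip(pages, pages[1:]))

for (src, dst), count in transitions.most_common():
    print(f"{src} -> {dst}: {count}")
```

Frequently occurring transitions of this kind could then prompt the designer to link the destination pages more directly, which is the navigation enhancement discussed in Section 2.1.2.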
2.2 Designing an Effective Study

Regardless of the paradigm, robust experimental design is crucial to obtaining reliable results. The main principles for any experimental design are universality, replication, control, and measurement (Leedy and Ormrod 2005). An experiment is universal when it could be carried out by any individual and replicable when one is able to repeat the experiment and achieve comparable results. It is important that the researcher control any confounding factors and collect and analyse both quantitative and qualitative measurements. Consideration should also be given to whether different groups of users should evaluate each variable (a between-subjects design) or whether the same group of users should evaluate all varying factors (a within-subjects design). Furthermore, care should be taken to ensure that data are obtained from a suitable sample of users, that the experiment has both internal and external validity, and that the data are analysed appropriately. In addition, practical and ethical issues must be recognized and taken into account. An overview of these concepts is provided below, but for more in-depth discussion on effective experimental design, see Graziano and Raulin (1997).
2.2.1 Sampling

Sampling is a concern for any researcher who wants to draw inferences about a population. Generally speaking, the larger the sample, the stronger the generalizations that can be made. Statisticians have developed formulas for determining the desired sample size for a given population and the effect size that the
study will produce. For further information on choosing an appropriate sample size, see Cohen (1988). The most vital issue, though, is to know how well a particular sample represents the population or target group and how it reflects the characteristics of the population from which it is drawn. Obtaining a representative user sample, particularly in accessibility research, can be very difficult. To some extent, the size of an adequate sample depends on how homogeneous or heterogeneous the population is – how alike or different its members are with respect to the characteristics of the research interest (Leedy and Ormrod 2005). For example, when evaluating the accessibility of a Web site by visually impaired users, it can be difficult to find a large number of participants. However, a small sample can be used to represent the entire population, as visually impaired users have common strategies, expectations, and needs while interacting with the Web. If the researcher can design an experiment so that confounding variables are controlled, and if the problems can be identified with only a small number of users, the results should be generalizable to at least some subgroup in the population.

2.2.2 Ensuring Validity

Maintaining validity ensures that the research truly measures what it is intended to measure in a reliable and generalizable way. The degree to which confounding variables within the study are eliminated is referred to as internal validity, and the generalizability of the findings is called external validity. The key to having high internal validity is to use a good experimental design; external validity is trickier to ensure, and requires the researcher to use intuition and judgement to some extent. There is an interaction between the two types of validity, with high internal validity often bought at the expense of external validity.

However robust an experimental design, internal validity is very difficult to guarantee. In a tightly controlled experiment, with a fully replicated, within-subjects, randomized design, it is reasonable to assume that variations in the recorded measure (dependent variable) are due to variations in the level of whatever is being tested (independent variable). In a real situation, though, there are a lot of extraneous factors that cannot be controlled and which can lead to behaviour changes or changes that can be confused with the effects of our intended manipulations. In a longitudinal study, threats to internal validity include unexpected events that affect the dependent variable, and participant maturation. In a between-subjects study, unrepresentative sampling of participants in the experimental and control groups can affect results (Field and Hole 2003). In every experiment, participant motivation and experience will vary, and are difficult to reliably control. As such, random assignment of participants to different conditions or condition orders is vital.

Designing a study to observe visually disabled users accessing Web sites with screen readers demonstrates the difficulties of ensuring internal validity. There
are a number of screen readers available, such as Jaws, Hal, and WindowEyes, and each has its own functionality and user customization settings. When inviting users into the laboratory, one must be wary that users may be facing difficulties not because of the task they must perform, but because they are unfamiliar with the equipment. Allowing participants to use their own screen readers and settings avoids this issue, but adds a further variable to the experiment, which, owing to the idiosyncratic nature of an individual’s set-up, is impossible to control. As high internal validity is maintained by holding extraneous variables constant, it risks sacrificing external validity. An internally valid experiment shows what happens when a variable is manipulated in isolation, but how representative is this situation of the real world? Many of the factors necessary to maintain high internal validity, such as time constraints, the laboratory setting, and the type of task, actually threaten external validity (Field and Hole 2003). Sampling and concerns about the failure to randomly select a sample from a population also impact on the external validity of the experiment.
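The random assignment mentioned above can be done very simply in software. The sketch below counterbalances two presentation orders across a small group of participants in a within-subjects design; the condition names and participant identifiers are placeholders, and the fixed random seed is only there so that the allocation can be reported and reproduced.

```python
# Hypothetical sketch: counterbalanced random assignment of participants to
# condition orders in a within-subjects design. All names are placeholders.
import itertools
import random

conditions = ["own screen reader", "lab screen reader"]
participants = [f"P{i:02d}" for i in range(1, 9)]

orders = list(itertools.permutations(conditions))  # every possible order

random.seed(42)              # fixed seed so the allocation can be reproduced
random.shuffle(participants)

# Cycle through the orders so each order is used equally often.
assignment = {p: orders[i % len(orders)] for i, p in enumerate(participants)}

for participant, order in sorted(assignment.items()):
    print(participant, "->", " then ".join(order))
```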
2.2.3 Data Analysis

Effective data analysis is key to accurately interpreting the results of any user evaluation and should be built into the experimental design early on. The appropriate form of analysis depends on factors such as the number of participants, the type of data collected, and the distribution of the population. When the data are quantitative, evaluators can use parametric or non-parametric analysis techniques.

Parametric tests are appropriate when the data are continuous and have a Gaussian distribution. Such tests compare observed data to theoretical distribution models (Yu 2003). The evaluator performs an experiment and then compares the outcome to a known distribution of results to establish whether the observed results differ significantly from it. Examples of parametric tests that can achieve this include t-tests, analyses of variance (ANOVA), and analysis of covariance (ANCOVA). These parametric tests generally produce valid analyses of data, provided some technical assumptions about the data set are met (Todman and Dugard 2001). These assumptions include a suitably large number of participants, a Gaussian distribution, and a random sample of participants that represent the entire population that is of interest to the evaluator.

Non-parametric analysis is used when the data obtained are not Gaussian and there are no assumptions about the underlying model against which to compare the observed results. Such statistical analysis is useful when the data obtained during the evaluation are comparative, such as a ranking of elements
(Leedy and Ormrod 2005). For example, giving a user a series of interface designs and asking them to rank the interface they were most comfortable interacting with will lead to non-parametric data analysis. Examples of non-parametric tests that can achieve this include Chi-square tests, Fisher’s exact test, and Spearman’s rank correlation coefficient.

Randomization tests are useful when the data obtained during an evaluation are continuous, but it is hard to establish whether the distribution is Gaussian. This can occur when only a small number of users participated in the evaluation and data are limited. As with non-parametric analysis, randomization techniques use no assumptions about the data distribution. However, given that the evaluation involved modifying a single variable and measuring its effect, randomization tests can provide statistical results that are on par with parametric results used for large data sets (Todman and Dugard 2001).

In qualitative research, data analysis and interpretation act as one. Observations, interviews, and think-aloud evaluations provide a large body of information. The important thing to consider while analysing qualitative data is to look for patterns and relationships and make general discoveries about the researched phenomena (Leedy and Ormrod 2005). This is usually achieved through iterations of information sorting and categorization. When analysing qualitative data, the researcher should first organize results by breaking large units into smaller ones to get an overall idea of the data. Then, an appropriate classification of the data into categories or themes helps find meaning in the data, which reveals possible conclusions or propositions.
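As an illustration of the randomization tests discussed above, the sketch below runs a simple permutation test on invented task completion times from two small groups: the group labels are repeatedly shuffled and the observed difference in means is compared against the resulting distribution. For the parametric and non-parametric tests mentioned earlier, ready-made implementations are available, for example in the scipy.stats module (ttest_ind, mannwhitneyu, spearmanr, and others).

```python
# Hypothetical sketch: a simple randomization (permutation) test on task
# completion times from two small groups. The data values are invented.
import random

group_a = [41.2, 37.5, 52.0, 44.8, 39.9]   # times with design A (seconds)
group_b = [55.1, 61.3, 48.7, 58.0, 63.4]   # times with design B (seconds)

observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))

pooled = group_a + group_b
random.seed(0)
n_iterations = 10000
extreme = 0
for _ in range(n_iterations):
    random.shuffle(pooled)                  # reassign the group labels at random
    a, b = pooled[:len(group_a)], pooled[len(group_a):]
    diff = abs(sum(a) / len(a) - sum(b) / len(b))
    if diff >= observed:
        extreme += 1

p_value = extreme / n_iterations
print(f"Observed difference: {observed:.1f} s, p is approximately {p_value:.4f}")
```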
2.2.4 Ethical Treatment of Participants

Whenever users are involved in scientific experimentation, care should be taken to safeguard their well-being. User evaluations in Web accessibility, unlike medical trials, are not likely to threaten physical well-being, but there are still concerns that must be addressed before any user evaluation begins. The most important first step in ensuring the ethical treatment of user evaluation participants is informing them about the nature of the evaluation and gaining their consent to proceed. No evaluation should be conducted covertly. Capturing user data without consent is unethical and, in some countries, illegal. As part of the consent process, users should be informed of what they will be doing in the experiment, and how long it is likely to take. Many users in Web evaluations are not taking part because they need to, but because they have an interest in the subject, because they want to ‘‘help out’’, or, in some cases, because there is a token award such as a small stipend or additional course credits. It is therefore unwise to place too much of a burden on the users in terms of the difficulty of the task, or to keep them in the lab for an unreasonable amount of time. Doing so will affect the results of the evaluation: users will become tired, stressed and irritated, and, in the worst case, unwilling to participate.
Care should also be taken to capture only data that are necessary for the evaluation. If you wish to observe Web browsing behaviour on a selection of sites, then select only those sites that do not provide sensitive information. Asking a user to browse a selection of news Web sites is reasonable, but recording information from a user interacting with a personal Web-based email account is not, as such accounts contain private and potentially sensitive data. Where sensitive information is required, such as age, sex, or computer skill level, these data should be kept secure and not passed on to any third party. Participant data should be stored under a code, to ensure that anonymity is maintained. The only time a person’s name should be recorded is when gaining consent.

General well-being of the participants should also be considered, especially when the users are invited into the lab. For visually impaired users, the lab should be clutter-free and have a clear access route. There should be wheelchair access for those users who require it. If the study involves children, care should be taken to minimize access to electrical or other dangerous equipment. Every study has different requirements, and some of these will necessarily place greater demands on participants than others. In most cases, common sense and due care will ensure that users feel comfortable and happy to be taking part. Most universities and research organizations have an ethical approval committee that examines user studies on a case-by-case basis, to ensure that ethical concerns are considered alongside an individual study’s requirements.
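One simple way to follow the advice above about storing participant data under a code is sketched below: the study data refer to participants only by randomly generated codes, while the record linking codes to names (gathered at consent) is written to a separate file that can be stored securely. File names, fields, and values are invented for illustration, and any real study should follow its institution's own data-protection procedures.

```python
# Hypothetical sketch: keeping the consent/identity record separate from the
# study data, which refers to participants only by code.
import csv
import secrets

participants = ["Alice Example", "Bob Example"]          # placeholder names
codes = {name: f"P{secrets.token_hex(3)}" for name in participants}

# Consent record: names and codes, stored securely and separately.
with open("consent_record.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "code"])
    for name, code in codes.items():
        writer.writerow([name, code])

# Study data: no names, only codes.
with open("study_data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["code", "task", "completion_time_s"])
    writer.writerow([codes["Alice Example"], "find timetable", 42.0])
```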
3 Discussion

Research design consists of planning the overall structure of the procedure, including how the data will be collected, what they will represent, and how they will be analysed. Questions such as ‘‘what should I study? how should I study it? what am I expecting to find? can my experimental design be meaningfully analysed?’’ should all be answered during the planning process. The previous section described the standard evaluation paradigms, and the practical and ethical issues that influence the outcome of a study. To ensure that an evaluation achieves its goals, these components should not be considered in isolation: the number and type of users, time, and equipment available all influence the type of evaluation that is appropriate.

A number of questions prompt any study, and the first step is to refine these questions to make the goals realistic and achievable. Defining the research goals helps to formulate the experimental hypothesis, the statement that the study will test, which is essential for any kind of quantitative data analysis. The hypothesis should have an independent variable, which can have a number of levels, manipulated in a controlled manner, and a dependent variable, which will measure the effect. A purely exploratory study may not have a definitive hypothesis as such, but should still have a clearly stated goal, which can serve as a basis for analysis. A survey of existing studies may be appropriate not only to avoid repeating previous research but also to guide the experimental methodology. For
example, when planning an evaluation that addresses the accessibility problems of visually impaired users, existing literature can provide significant information on the previous work in the area, and the standard procedures for running such studies.

When deciding on the methodology, it is important to consider the desired outcome in conjunction with any practical limitations. In an experiment with quantitative measures, a relatively large sample is desirable to ensure that statistical analysis is sufficiently powerful. In practical terms, a large sample of disabled users can be difficult to obtain, and of that population, participants may have differing levels of disability. In this case, quantitative data, such as task completion time, can still provide a useful indication of the extent of accessibility problems, but using additional qualitative procedures, such as observational analysis, will provide a much stronger and more comprehensive result. As qualitative research can be very time-consuming, it may not be appropriate to conduct this type of study with a large sample. If sufficient time and expertise cannot be allocated to data collection and analysis, it may be better to use the whole sample only for quantitative data collection, and a reduced sample for more in-depth, qualitative investigation.

It is important to ensure that the stimuli for the evaluation are chosen, stored, and presented carefully. A live Web site may provide the most ‘‘realistic’’ task environment, but content updates introduce further, potentially uncontrollable variables to the study. Saving Web pages locally avoids this issue, and ensures faster, more consistent access to the site.

A key part of the design process is planning the data analysis. If the data analysis is not thought through at this early stage, a lot of time and effort can be spent collecting data only to discover that they cannot be analysed in a meaningful way. A pilot study is a good way of determining the feasibility of the evaluation, and trying out any procedures or analysis methods. If participants within the target user group are hard to find, performing a pilot study with a more general group of users can still help in identifying any possible design or data analysis problems.

To design a successful experiment, it is important to adhere to the principles of universality, replication, control, and measurement (Leedy and Ormrod 2005). It is important that the researcher control any confounding factors, and that accurate measurements be taken and analysed. Using these principles to guide study design helps to ensure that a user evaluation is robust and reliable and provides informative data.
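When judging whether a planned sample is large enough for the statistical analysis to be sufficiently powerful, a conventional power calculation can give a rough target. The sketch below uses the statsmodels library to estimate the number of participants needed per group for a between-subjects comparison; the chosen effect size (Cohen's d = 0.8) and the conventional alpha = 0.05 and power = 0.8 are assumptions made purely for illustration.

```python
# Hypothetical sketch: estimating the sample size per group needed to detect a
# large effect (Cohen's d = 0.8) in a between-subjects comparison.
# Effect size, alpha, and power are illustrative assumptions.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.8, alpha=0.05,
                                          power=0.8, alternative="two-sided")
print(f"Approximately {n_per_group:.0f} participants per group")
```

In accessibility research such targets are often out of reach for the reasons discussed above, which is one more argument for complementing quantitative measures with qualitative procedures.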
4 Future Directions

The research approach outlined so far has focused on employing user evaluation studies to test the accessibility of Web sites, to either determine existing problems or evaluate a potential solution. Such research is vital for
improving accessibility, but, in isolation, it cannot provide a complete picture of accessibility problems, or how to solve them. An emerging trend in accessibility research considers not only the end users – those with access difficulties – but also the traditional users – those for whom the Web site was originally designed. Generally speaking, traditional users are able to view the Web site on a full-size display and interact with it using a keyboard and mouse. By studying traditional users, and understanding how and why they retrieve information from a site, we are much better placed to make Web sites truly accessible. Without this information, accessibility work is guided to a certain extent by assumptions. For example, an image is assumed to be of little use to users browsing a site with a screen reader. Providing alternative text to describe the image is considered sufficient. However, such an approach fails to take account of how sighted users are really responding to images. Eye tracking research, discussed further below, has shown that sighted users rarely view images at length, but if an image is next to text, they will prioritize reading that content over un-illustrated sections of text (Jay et al. 2007).

Previous research addressing traditional Web interaction has generally aimed to improve site design for those same, standard users. Studies have focused on specific usability goals and have been relatively narrow in their scope, producing data that are difficult to apply directly to accessibility research. More recently, studies have become more ambitious, attempting to model interaction with Web sites (Granka et al. 2004, Jay et al. 2007, McCarthy et al. 2003, Outing and Ruel 2006). Such models and profiles of standard user interaction with Web sites potentially provide valuable information about what is actually missing when access is limited (rather than what is assumed to be missing).

A tool that has enabled a significant proportion of this research is the eye tracker. Tracking eye movements is an extremely valuable way of understanding how people orientate their visual attention. Eye movement data have been used to provide insight into moment-to-moment processing activities such as music reading, typing, visual search, scene perception, mathematics, and, recently, Web browsing. The pattern of fixations on a Web page shows how much attention is allocated to different parts of the page. The duration of fixations indicates the extent of cognitive processing the stimulus requires: fixations are longer during visual search, for example, than during reading (Rayner 1998). The gaze path – the order in which different elements of the page are visited – shows the manner in which people navigate around the page. Figure 1(a) shows a ‘‘heat map’’ of gaze data on a Web page. The warmer the colour of a section of the page, the greater the number of fixations it received. Gaze plots (see Fig. 1(b)) display a static view of the fixations of a single participant. Each spot represents a single fixation, which varies in size according to its duration. The lines between spots represent saccades (eye movements), and each spot is marked with a number to show its order in the gaze path.

Eye tracking has been used to improve the standard design and layout of Web pages, and evaluate their usability (Russell 2005). Studies have also
Fig. 1 Example of eye tracking data taken from http://www.bbc.co.uk/ on 13 February 2006: (a) heat map using colours to represent the number of fixations on any given area and (b) gaze plot of a single user showing the order of fixations and amount of attention
examined the saliency of items on a page under varying conditions (Joachims et al. 2004, Pan et al. 2004) and how eye movements vary according to information scent (Pirolli et al. 2001). Although these studies (and many others) could potentially provide useful input into accessibility solutions, they were not explicitly designed to do so. Recent research has started to address this. An eye tracking study looking at visual search on Web sites found significant differences between browsing behaviour on pages designed with a standard, graphical layout and those with the same content displayed in a text-only format (Jay et al. 2007). When viewing the standard page, participants directed their gaze around the page according to its formatting, making it easier to find the link. On the text-only page, participants simply read through the content until they found the link, a situation analogous to using a screen reader. The qualitative differences in viewing behaviour show the type of implicit cues provided by presentation and layout. To maximize accessibility, such cues should ideally be translated to situations where they are not already available.

Using eye tracking as a means of providing input to Web accessibility research is a new technique. Its ability, though, to document users’ often unconscious visual attention processes provides very useful information about how people really view Web sites, and thus what may be missing from current
attempts to make sites more accessible. Eye tracking is not the only tool that is useful for acquiring this type of information: every user evaluation that investigates traditional Web interaction could indicate what aspects of the standard visual presentation people use to browse the site, and hence what should be replaced to make it more accessible in non-standard presentations. The next section discusses what the authors consider to be the ultimate goal for Web accessibility: combining this type of data with data from other accessibility user studies, in order to develop truly accessible Web sites.
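To make the fixation measures described above more concrete, the sketch below aggregates a handful of fixations into areas of interest (AOIs), producing the kind of per-region counts and total durations that a heat map visualizes. The fixation records, pixel coordinates, and AOI rectangles are all invented; real eye trackers export far richer data and apply their own fixation-detection algorithms.

```python
# Hypothetical sketch: aggregating fixations into areas of interest (AOIs).
# Fixation records and AOI rectangles are invented for illustration.
fixations = [
    {"x": 120, "y": 80,  "duration_ms": 310},
    {"x": 540, "y": 95,  "duration_ms": 180},
    {"x": 130, "y": 420, "duration_ms": 250},
    {"x": 560, "y": 300, "duration_ms": 400},
]

aois = {
    "menu":    (0,   0, 300, 600),   # (left, top, right, bottom) in pixels
    "content": (300, 0, 800, 600),
}

def aoi_for(fixation):
    """Return the name of the AOI containing a fixation, or 'outside'."""
    for name, (left, top, right, bottom) in aois.items():
        if left <= fixation["x"] < right and top <= fixation["y"] < bottom:
            return name
    return "outside"

summary = {}
for fixation in fixations:
    name = aoi_for(fixation)
    count, total = summary.get(name, (0, 0))
    summary[name] = (count + 1, total + fixation["duration_ms"])

for name, (count, total) in summary.items():
    print(f"{name}: {count} fixations, {total} ms total")
```

Summaries of this kind make it possible to compare, for example, how much attention sighted users give to a menu region that a screen-reader user never encounters as a distinct region at all.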
5 Authors’ Opinion of the Field

Maximizing access to Web pages requires not only knowledge of the current accessibility problems but also a model of the desired solution, and a means of mapping between the two. The desired solution, at first glance, seems as simple as ‘‘an accessible site’’, but in reality it is far more complex than this. If a site has a very clear purpose, such as displaying timetable information, then the desired solution could be to ensure the timetable information is accessible via a screen reader, or a mobile device. Further consideration of this simple example quickly reveals many issues, such as determining the level of detail required for the query, and the form of the results, but the overall goal remains clear. In the case of a ‘‘Web 2.0’’ site with many functions on a single page, the desired solution is less apparent. How should one prioritize access to different parts of the page? Are some functions more important than others? What effect does the layout and presentation of the page have on its functionality? To provide a comprehensive accessibility solution, user evaluations are thus required in the following areas:
- Identifying existing accessibility problems
- Defining the desired solution
- Designing an accessible interface
- Mapping the desired solution to the interface
An experiment should always be carefully designed to answer a specific question or set of questions, and as such, most studies will address only a small part of one of the areas listed above. The nature of experimental research means that this will always be the case: carefully controlling variables is necessary to ensure valid results that are representative of at least a subgroup of the target population. Although the goal of an individual study may be narrow, this does not mean that the application of its results needs to be. In theory, each of the areas listed above can be considered in isolation. In practice, it is far better to consider them as overlapping and feeding into one another. For example, it is far easier to define the desired solution (decide what constitutes an accessible Web site) with knowledge of the available interface (the means of accessing the Web site) and the accessibility difficulties that exist. Similarly, interface design will be
more successful if it considers what it should ultimately achieve (the desired solution), as well as the access problems that exist with the current interface.

The research conducted within the Human Centred Web Lab (HCW, http://hcw.cs.manchester.ac.uk/) is based on the ethos of interdependent elements of Web accessibility. The lab is currently working on a series of projects that have a well-defined research focus, yet the outcomes of the projects dovetail into each other to support and strengthen efforts to create a comprehensive Web accessibility solution.

SADIe (http://hcw.cs.manchester.ac.uk/research/sadie/) is a project that investigates accurate and scalable transcoding techniques. Transcoding is a way of transforming Web content so that it can be accessed on a diverse range of devices (Ihde et al. 2001), with SADIe’s primary focus being refactoring Web pages so that they are more suited to screen readers used by visually impaired users. To achieve this, SADIe makes use of Semantic Web technologies (see Semantic Web) to expose implicit information to the user. For example, sighted users can identify that a list of links is a menu due to the way it is rendered on screen, but this rendering is lost to visually impaired users and hence navigation can be hindered. By explicitly providing access to the menu, users can navigate the Web page more easily. For further discussion of the SADIe method and architecture, the reader is directed to Bechhofer (2006).

While current transcoding solutions such as SADIe can improve access to Web content, the functionality that is used to adapt the pages is devised from a top-down approach, whereby developers use common sense and their own intuition as a basis for the transcoding algorithms. Users are typically only involved within the design process during the final evaluation stage. Successful evaluations justify the algorithm design, which we believe limits the ability of tools to improve accessibility. We assert that future research into Web accessibility will also require bottom-up approaches that use theoretical understanding of how users perceive and interact with Web pages as the basis for accessibility solutions.

With this in mind, we have pioneered investigations into understanding user coping strategies. Coping is defined within psychology as a constantly changing cognitive and behavioural effort to manage specific external and/or internal demands that are appraised as taxing or exceeding the resources of the person (Lazarus and Folkman 1984). From the perspective of Web accessibility, this manifests itself as routines that users rely upon to cope with inaccessible pages. For example, a post on the mailing list of the British Computer Association of the Blind (BCAB) describes how a visually impaired user searched for the word ‘‘cached’’ on Google to reach search results. This was due to the frustration caused by listening to the advertisements and other elements that surrounded the results every time a search was performed. Navigating to ‘‘cached’’ allowed advertisements to be avoided as it was situated close to the first result. By understanding what users need to do to deal with the
4 http://hcw.cs.manchester.ac.uk/
5 http://hcw.cs.manchester.ac.uk/research/sadie/
6 British Computer Association of the Blind
inaccessible Web, we can create tools such as SADIe that are more targeted towards the users' needs and therefore further improve access to Web content. In addition to research into how visually impaired users cope with the Web, ViCRAM7 is a project that investigates how sighted users behave with respect to Web page design, structure, and perception using eye movement tracking and knowledge elicitation techniques. Conclusions will be derived about user interaction paradigms and browsing behaviour in correlation with the visual presentation and structure of a page. We assert that this kind of information can be used for giving feedback to users and designers on visual clutter and common interaction patterns, as well as a guide for transcoding tools, such as SADIe. ViCRAM will produce a model of how sighted users, and therefore the intended audience, perceive Web pages. SADIe will produce a model of the strategies visually impaired users employ to cope with inaccessible pages. By combining the two theoretical models, plus our ability to adapt and transform Web content, it will be possible to create Web pages that contain the key information of the Web page but in a format that is readily available to visually impaired users using screen readers to browse the Web. Designers will also be able to use this information to understand both how their targeted audience wants to interact with the Web and how it actually does, to produce more usable and hence accessible sites. End user evaluations are becoming central to the field of Web accessibility. As the research within our own lab has shown, accessibility solutions can no longer be designer-centric and tested on users at the end of the development life cycle. Understanding and modelling users' perceptions, requirements, and interactions are integral to understanding the issues of accessibility and the solutions that need to be derived to create a Web that is accessible to all. Furthermore, lessons learnt from one user group are now being adapted to suit different user groups. RIAM8 investigates ways in which traditional Web accessibility research focused on users with disabilities can be integrated into solutions designed to make the mobile Web more accessible. This cross-pollination and overlapping of theoretical cognitive models, in conjunction with user studies at all stages of the design process, will become central to future evaluations and Web accessibility research as a whole.
6 Conclusions

User evaluations are crucial to ensuring Web accessibility. They provide valuable information at every stage of Web site design, from understanding existing access problems to evaluating solutions. To get accurate information from an evaluation, it must be properly designed and executed. Ideally, obtaining both quantitative and qualitative data provides a comprehensive picture of the issue under investigation.
Accessibility user studies are now moving beyond documenting access difficulties, to studying ‘‘traditional’’ Web use with techniques such as eye tracking and knowledge elicitation, as a means of defining the ultimate goals for accessibility solutions. We believe combining the resulting models of traditional Web use with detailed information about existing access problems, and user preferences, will eventually lead to the development of Web sites genuinely accessible to all.
References

Alan Dix, Janet Finlay, Gregory Abowd, and Russell Beale. Human-Computer Interaction. Prentice Hall, Campus 400 Maryland Avenue, Hemel Hempstead, Hertfordshire, HP2 7EZ, 1st ed., 1993. ISBN: 0134582667.
Andy Field and Graham Hole. How to Design and Report Experiments. SAGE Publications Ltd, London, UK, 2003. ISBN: 0761973834.
Anthony M. Graziano and Michael L. Raulin. Research Methods: A Process of Enquiry. Addison Wesley Longman, 3rd ed., 1997. ISBN: 0673980413.
Ben Shneiderman. Designing the User Interface: Strategies for Effective Human-Computer Interaction. Addison Wesley, 3rd ed., 1998. ISBN: 0201694972.
Bing Pan, Helene A. Hembrooke, Geri K. Gay, Laura A. Granka, Matthew K. Feusner, and Jill K. Newman. The determinants of web page viewing behavior: an eye tracking study. In ETRA '04: Proceedings of the 2004 Symposium on Eye Tracking Research & Applications, pp. 147–154, New York, NY, USA, 2004. ACM Press.
Caroline Jay, Robert Stevens, Mashhuda Glencross, Alan Chalmers, and Cathy Yang. How people use presentation to search for a link: Expanding the understanding of accessibility on the web. Universal Access in the Information Society, 2007.
Chong Ho Yu. Resampling methods: Concepts, applications, and justification. Practical Assessment, Research & Evaluation, 8(19):1–23, September 2003. ISSN: 1531-7714.
Christian Doerr, Daniel von Dincklage, and Amer Diwan. Simplifying web traversals by recognizing behavior patterns. In HT '07: Proceedings of the 18th Conference on Hypertext and Hypermedia, pp. 105–114, New York, NY, USA, 2007. ACM Press.
Helen Petrie and Omar Kheir. The relationship between accessibility and usability of websites. In CHI '07: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 397–406, 2007.
Hironobu Takagi, Chieko Asakawa, Kentarou Fukuda, and Junji Maeda. Accessibility designer: visualizing usability for the blind. In Assets '04: Proceedings of the 6th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 177–184, New York, NY, USA, 2004. ACM Press.
Jacob Cohen. Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates, 365 Broadway, Hillsdale, New Jersey, 07642, USA, 2nd ed., 1988. ISBN: 0805802835.
Jakob Nielsen. Risks of quantitative studies. Alertbox: Current Issues in Web Usability, March 2004a. http://www.useit.com/alertbox/20040301.html.
Jakob Nielsen. Usability Engineering. The Morgan Kaufmann Series in Interactive Technologies. Academic Press Inc, 525 B Street, Suite 1900, San Diego, CA, 92101-4495, USA, 1st ed., November 2004b. ISBN: 0125184069.
John B. Todman and Pat Dugard. Single-case and Small-n Experimental Designs: A Practical Guide to Randomization Tests. Lawrence Erlbaum Associates, Mahwah, New Jersey, 07430, USA, 1st ed., March 2001. ISBN: 0805835547.
John D. McCarthy, Angela M. Sasse, and Jens Riegelsberger. Could I have the menu please? An eye tracking study of design conventions. In Proceedings of HCI 2003, pp. 8–12, 2003.
Joseph S. Dumas and Janice C. Redish. A Practical Guide to Usability Testing. Intellect, 5804 N.E. Hassalo Street, Portland, Oregon, 97213-3644, USA, 1999. ISBN: 1-84150-020-8.
Keith Rayner. Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124(3):372–422, 1998.
Laura A. Granka, Thorsten Joachims, and Geri Gay. Eye-tracking analysis of user behavior in WWW search. In SIGIR '04: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 478–479, 2004.
Mark Russell. Using eye tracking data to understand first impressions of a website. Usability News, 7.2, 2005.
Melody Y. Ivory and Marti A. Hearst. The state of the art in automating usability evaluation of user interfaces. ACM Computing Surveys, 33(4):470–516, 2001.
Michael Quinn Patton. Qualitative Research & Evaluation Methods. Sage Publishing, 6 Bonhill Street, London, EC2A 4PU, UK, 3rd ed., 2002. ISBN: 0761919716.
Monty L. Hammontree, Jeffrey J. Hendrickson, and Billy W. Hensley. Integrated data capture and analysis tools for research and testing on graphical user interfaces. In CHI '92: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 431–432, New York, NY, USA, 1992. ACM Press.
Paul D. Leedy and Jeanne Ellis Ormrod. Practical Research: Planning and Design. Prentice Hall, 8th ed., 2005. ISBN: 0131108956.
Peter Pirolli, Stuart K. Card, and Mija M. Van Der Wege. Visual information foraging in a focus + context visualization. In CHI '01: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 506–513, New York, NY, USA, 2001. ACM Press.
Richard S. Lazarus and Susan Folkman. Stress, Appraisal, and Coping. Springer Publishing Company, 356 Broadway, New York, New York 10012, USA, 1984. ISBN: 0-8261-4190-0.
Sean Bechhofer, Simon Harper, and Darren Lunn. SADIe: Semantic annotation for accessibility. In ISWC '06: Proceedings of the 5th International Semantic Web Conference, pp. 101–115, 2006.
Seymour Sudman and Norman M. Bradburn. Asking Questions: A Practical Guide to Questionnaire Design. Jossey-Bass Series in Social and Behavioural Sciences. Jossey-Bass, 433 California Street, San Francisco, California, 94104, USA, 1st ed., 1992. ISBN: 0875895468.
Simon Harper, Robert Stevens, and Carole Goble. Web mobility guidelines for visually impaired surfers. Journal of Research and Practice in Information Technology, Special Issue on HCI (Australian Computer Society), 33(1):30–41, 2001.
Steven C. Ihde, Paul Maglio, Jörg Meyer, and Rob Barrett. Intermediary-Based Transcoding Framework. IBM Systems Journal, 40(1):179–192, 2001.
Steve Outing and Laura Ruel. Eyetrack III: What We Saw Through Their Eyes, 2006. A project of the Poynter Institute, http://www.poynterextra.org/eyetrack2004/index.htm. Last accessed: 20th January 2006.
Thorsten Joachims, Laura A. Granka, and Geri Gay. Eye-tracking analysis of user behaviour in WWW search. In SIGIR '04, pp. 478–479. ACM Press, 2004.
Vicki L. Hanson. The user experience: designs and adaptations. In Simon Harper, Yeliz Yesilada, and Carole Goble, editors, Proceedings of the International Cross-Disciplinary Workshop on Web Accessibility, pp. 1–11, 2004.
William Gillham. Developing a Questionnaire. Real World Research. Continuum, The Tower Building, 11 York Road, London, SE1 7NX, UK, 2000. ISBN: 0826447953.
Xristine Faulkner. Usability Engineering. Grassroots Series. Macmillan Press, Houndmills, Basingstoke, Hampshire, RG21 6XS, UK, 2000. ISBN: 0333773217.
Authoring Tools Jutta Treviranus
Abstract Authoring tools that are accessible and that enable authors to produce accessible Web content play a critical role in Web accessibility. Widespread use of authoring tools that comply with the W3C Authoring Tool Accessibility Guidelines (ATAG) would ensure that even authors who are neither knowledgeable about nor particularly motivated to produce accessible content do so by default. The principles and techniques of ATAG are discussed. Some examples of accessible authoring tools are described, including authoring components used within content management systems, such as TinyMCE. Considerations for creating an accessible collaborative environment are also covered. As part of providing accessible content, the debate between system-based personal optimization and one universally accessible site configuration is presented. The issues and potential solutions to address the accessibility crisis presented by the advent of rich Internet applications are outlined. This challenge must be met to ensure that a large segment of the population is able to participate in the move toward the Web as a two-way communication mechanism.
J. Treviranus, Adaptive Technology Resource Centre, Faculty of Information Studies, University of Toronto, Toronto, Ontario, Canada. e-mail: [email protected]

1 Introduction

Authoring tools play two very critical roles in Web accessibility: they offer a powerful mechanism for promoting the creation of accessible Web content and they are the key to equal participation in communication via the Web. This chapter will discuss the role authoring tools can play in promoting broader compliance with Web accessibility guidelines and the importance of authoring tools in the equal participation of people with disabilities in the phenomenon that is the Web. Most Web content is authored using an authoring tool; there are very few authors left who code Web pages using raw HTML. These authoring tools greatly influence the Web content created. Some markup is automatically
generated for the author by the tool; authors are presented with choices and advice; authors are offered pre-authored content and templates; and authors are assisted in checking and revising their content. Each of these functions presents an opportunity to promote the creation of accessible Web content. The Web is presently one of the primary loci for communication, information sharing and community building. It has become far more than a supplementary information source. To date the focus of Web accessibility discourse has been on access to information or on people with disabilities as consumers of information. It is just as critical that people with disabilities be producers of information and participants in the global and local conversations occurring on the Web. This is not possible without accessible authoring tools.
2 Overview and History of the Field

A cursory review of publishing and discourse on the topic of Web accessibility shows a preponderance of information, legislation and discussion regarding Web content accessibility and Web content accessibility guidelines with very little focus on authoring by people with disabilities or the use of authoring tools to promote accessibility. The Web Accessibility Initiative of the World Wide Web Consortium was established in April 1997 with three major guideline initiatives: Web content, authoring tools and user agents. Since then 26 jurisdictions around the world have adopted legislation regarding Web content accessibility, the majority based upon the W3C Web Content Accessibility Guidelines 1.0 (WCAG). The Authoring Tool Accessibility Guidelines 1.0 (ATAG) became a W3C recommendation in February 2000. These guidelines describe how to create a Web authoring tool that helps authors to create accessible Web content (that conforms to WCAG) and how to create an authoring tool that can be used by people with disabilities. The guidelines are primarily intended for developers of authoring tools. Authoring tools are very broadly defined to encompass any software application, tool, script or wizard that produces Web content. This includes the following:
‘‘Editing tools specifically designed to produce Web content, for example, what-you-see-is-what-you-get (WYSIWYG) HTML and XML editors
Tools that offer the option of saving content in a Web format, for example, word processors or desktop publishing packages
Tools that transform documents into Web formats, for example, filters to transform desktop publishing formats to HTML
Tools that produce multimedia, especially where it is intended for use on the Web, for example, video production and editing suites, SMIL authoring packages
Tools for site management or site publication, including content management systems (CMS), tools that automatically generate Web sites dynamically from a database, on-the-fly conversion tools, and Web site publishing tools
Tools for management of layout, for example, CSS formatting tools
Web sites that let users add content, such as blogs, wikis, photo sharing sites, and social networking sites'' (Treviranus, McCathieNevile, Jacobs, & Richards 2000)

These can be software applications or tools used on the Web, such as wikis, chat systems or blogs. In the more than 7 years since ATAG 1.0 became a recommendation, there have been very few, if any, applications that have complied with all the priority 1 and priority 2 checkpoints within the guidelines. This is partly due to the volatile authoring tool market. Several applications were almost fully compliant when company mergers caused them to be abandoned or absorbed (e.g., HomeSite, HotDog). There have been some notable research initiatives to develop authoring tools that encourage accessible Web content and are accessible. The first of these was a project initiated in 1996, led by a Canadian company, SoftQuad Inc., in collaboration with the Adaptive Technology Resource Centre of the University of Toronto. SoftQuad were the developers of the first HTML editor, HoTMetaL. HoTMetaL was redesigned to be accessible to users with disabilities and incorporated a number of the principles that were later included in ATAG. For example, HoTMetaL prompted the author for alt-text when an image was inserted, encouraged the use of CSS for styling and steered the author toward the use of appropriate structural markup. HoTMetaL was abandoned as a product when SoftQuad was purchased by a larger company. Another project led by the ATRC in 2005 addressed the challenge by modifying an open source HTML editing component incorporated in many content management systems. TinyMCE was modified to comply with priority 1 and priority 2 checkpoints of the ATAG 2.0 draft. The goal was to encourage wide proliferation of the accessible authoring supports. ATRC worked with Moxiecode Systems to retrofit their open source JavaScript-based ''what you see is what you get'' (WYSIWYG) HTML editor to make it ATAG 2.0 conformant. This open source HTML editor was chosen because TinyMCE is used in many popular Content Management Systems (CMS), used to build Web sites, blogs, wikis and discussion forums, etc.; therefore, by focusing efforts on this one particular editor, it would be possible to quickly propagate accessible authoring practices to a number of other tools. In the past year, since the accessible version of the editor was released, there have been more than 500,000 downloads of the TinyMCE editor. Among others, TinyMCE is currently being used in the following applications: ATutor, Mambo, Joomla, Drupal, Plone, Xaraya, XOOPS, Typo3, b2evolution, QuickelSoft CMS, WordPress, Community Server and Zope (http://culturall.atrc.utoronto.ca/index.php?option=com_content&task=category&sectionid=12&id=15&Itemid=35). Other chapters in this book cover the fundamental importance of Web accessibility not only to the lives of people with disabilities but also to society as a whole. The economic, educational and social impact of lack of equal access to the Web is grave and far reaching. Policy, advocacy and legislation encouraging Web accessibility have focused on the Web Content Accessibility
Guidelines. Despite legislation in many jurisdictions (some with very serious consequences associated with non-compliance) a recent United Nations study shows that the majority of Web sites, including government Web sites, are still inaccessible (Nomensa 2006). It would appear that current strategies to encourage Web accessibility have not been as successful as hoped and efforts should be focused on new or additional strategies. Web accessibility advocacy based solely on WCAG requires knowledge and understanding of the guidelines by all Web authors. Web authors include a large part of the population and a wide cross-section of the population. This cross-section includes such diverse authors as professional Web editors, employees whose occasional task is to author Web content, grandparents, young children and hobbyists. To fully understand and adhere to the accessibility guidelines requires strong motivation and commitment on the part of authors. Authors must also constantly update their knowledge as technologies change. Another support for accessible Web content is the use of checking or evaluation tools (Abou-Zahra 2007). These tools process a Web page or site to detect and report any accessibility issues. The checking tools detect as many problems as possible automatically but leave a number of issues to human judgment. Some tools guide the author through a series of questions to determine whether the content is accessible. The difficulty with this approach is that the checking occurs once the site has already been created. Addressing Web accessibility problems at this stage requires retrofitting existing content and occasionally completely recreating a site. Many authors rely solely on the automatic checking component ignoring the additional manual checking that must occur. Authoring tools that are compliant to ATAG may address these barriers to creating accessible content. Theoretically, using an ATAG-compliant authoring tool to produce accessible content does not require knowledge of the WCAG guidelines or even motivation or commitment to create accessible content on the part of the content author. An authoring tool can encourage accessible practices and accessible authoring choices from the very beginning, thereby precluding costly and onerous retrofitting or reworking of sites. However, before this strategy can be effective, ATAG-compliant authoring tools must be developed and broadly deployed. The advocacy effort to achieve this should not be as difficult as achieving WCAG compliance as the number of developers of authoring tools is far smaller than the number of authors of Web content. What is needed is a concerted effort by policy makers, advocates and companies developing authoring tools.
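To illustrate the split between automatic and manual checking described above, consider a fragment such as the following (a sketch invented for this purpose, with hypothetical file names). The first image would be flagged automatically by almost any checking tool, while judging whether the second image's alt-text is actually meaningful still requires a human reviewer:

<img src="banner.gif">                          <!-- detectable automatically: the alt attribute is missing -->
<img src="banner.gif" alt="banner">             <!-- passes an automated check, but the text is uninformative -->
<img src="banner.gif" alt="Acme Ltd home page"> <!-- acceptable, but only human judgment can confirm this -->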
3 Discussion

3.1 Encouraging the Creation of Accessible Content

Authoring tools influence the design of the Web content created not only in a large number of explicit ways but also in subtle and even hidden ways. The styles of
influence differ according to the type of tool used whether it is a WYSIWYG tool, a tool that supports direct manipulation of the mark-up or a tool that automatically converts content to HTML (or DHTML). Web accessibility is largely based upon the choice of formats or technologies used (e.g., W3C open standards), the appropriate choice and use of markup (e.g., use of headers rather than fixed text styling), the creation of equivalent content in accessible formats (e.g., alt-text, captions, descriptions), appropriate structuring and description of content (e.g., for forms, tables, document structure) and avoidance of certain content or authoring techniques (e.g., blinking, color coding). Authoring tools can generate accessible content, influence the choices made, guide and support good authoring practices, educate in explicit or subtle ways and encourage the adoption of accessible authoring habits and conventions. Little research has been conducted to determine the most effective means of encouraging accessible authoring practices. General user interface design research can be applied, but even here much of the research is anecdotal. Determining the criteria for successful support of accessible authoring within an authoring tool is a rich and worthwhile research agenda that can be informed by user interface design research and research into change management and learning. This section outlines some of the techniques gleaned from informal heuristic evaluations, anecdotal observations and experiences contributed by tool designers in developing the ATAG (Treviranus et al. 2000). Many authoring tools or authoring tool functions make choices for authors by automatically generating markup, structure or file formats. This includes the choice of markup in WYSIWYG tools and conversion-to-HTML functions in Word Processors. These automatic processes can deploy accessible technologies or markup by default. This is a highly reliable and predictable method of creating accessible content. When the author has a choice, given that there are accessible choices and inaccessible choices (or more and less accessible choices), there are many strategies that can be employed to ensure or encourage an accessible choice. These choices may be presented in menus, toolbars, dialog boxes, palettes or other user interface mechanisms. At the most basic level the choices available should include accessible choices. This is not always the case. For novice or less experienced authors the order of choices influences the choice made, the first choices are the most likely to be selected. The prominence of the choice may also influence the decision. For example if the accessible alternative is nested within several layers of menus it is less likely to be chosen than if it is at the top level and obviously displayed. However, for most authors, it is important that the accessible choice not be seen as an add-on or non-integrated alternative. Some accessible practices require more than a set of accessible choices and cannot be performed automatically. This includes the creation of alt-text or other equivalent content such as captions for audio content, labels for form or table elements and other authoring practices. In these cases authoring tools can use various mechanisms to guide and support authors such as dialog boxes,
wizards or intelligent agents. Authoring tools can also provide supportive tools such as alt-text libraries to make the task easier. Wizards, assistants or intelligent agents have had a mixed reception in user interface design. Wizards are more likely to be received positively when the user wishes to accomplish a goal that has several steps, when the steps need to be completed in a set sequence or when users lack necessary domain knowledge. Wizards that attempt to anticipate a user’s choice or intention are frequently dismissed as are wizards that are inflexible or wizards that accomplish tasks that can be accomplished by other means. While the goal is to encourage accessible authoring, an authoring tool cannot be dictatorial or inflexible, authors will usually respond by making perfunctory steps to comply, finding work-arounds that are less than satisfactory from an accessibility perspective, or rejecting the tool. An example of this might be a dialog box that will not let the author proceed unless alt-text is filled into a text field when an image is inserted. The author who wishes to insert the images in a batch will likely fill in any text to proceed rather than taking the time to create an appropriate label. The author should be given sufficient flexibility and leeway regarding the timing, order of steps and choice of authoring options to avoid feeling constrained and at odds with the authoring tool. Similarly, intrusive prompts, pop-up windows or warnings, although they are powerful mechanisms to address accessibility issues, interrupt the workflow and can be seen as annoying by the author. These are more likely to be well received if the author has chosen to activate them and can turn them off. An assistive function that has become expected and has gained user trust is the spell checking function. In standard spell checkers, errors are highlighted in an unobtrusive manner and can be dealt with immediately or in a batch. Similarly Web authors have come to trust and implement HTML or XML checking tools. Accessibility checking and repair functions integrated into an authoring tool can mimic these more familiar tools to encourage greater acceptance. Checking and repair integrated into an authoring tool has the advantage of enabling checking and repair at time of authoring when the cost of revision is minor rather than after the fact when a number of dependent steps may need to be reversed to address accessibility problems. Most authors leave preference settings in the default or ‘‘factory preset’’ state, unless prompted to create a preference profile upon setup. To support the goal of accessible authoring, most accessibility supports, such as accessibility checking and repair, should therefore be ‘‘on’’ by default. Many authors implement templates, style sheets and pre-authored content such as clip art or scripts and applets. This has become even more prevalent with the increased use of dynamic, database-driven Web sites delivered through content management systems. If these templates and pre-authored content elements are WCAG compliant there is a high likelihood that the sites they form the basis of will also be WCAG compliant. However, there are instances when the author should be encouraged to modify the content, for example, when images are to be repurposed, stock alt-text may no longer be appropriate
for the new purpose and authors should be instructed to modify the alt-text in line with the new meaning to be communicated by the image. Pre-authored applets, scripts or online user interface elements that are part of many content management systems including learning management systems should be accessible. With the prevalence of open source content management projects exemplary accessible components can be shared and freely adapted across systems making it easier to include accessible versions of functionality. Ideally, accessible authoring should become a natural, integrated part of the authoring workflow. Accessible authoring practices should become habitual and assumed. Standard conventions for existing content types and for emerging content types should include accessible practices. An authoring tool can encourage this by integrating accessibility features and accessible authoring steps into any multi-step process, as well as including accessible examples in any examples given in help, tutorials, documentation or intelligent assistants. All tutorials, help, documentation or intelligent assistants should integrate accessible authoring practices in the standard authoring practices demonstrated or described.
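To make some of the practices described in this section concrete, the following sketch (invented for illustration, with hypothetical file names, and not taken from any particular tool) contrasts the kind of markup an ATAG-style tool might generate by default, including a prompted text alternative, with purely presentational markup that conveys no structure and no equivalent content:

<!-- generated by a tool that defaults to accessible, structural markup -->
<h2>Opening hours</h2>
<img src="routemap.png" alt="Map showing the walking route from the station to the shop">

<!-- the same content generated without accessibility support -->
<font size="5"><b>Opening hours</b></font>
<img src="routemap.png">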
3.2 Authoring Tools That Are Accessible to People with Disabilities

It is just as important that people with disabilities be able to use authoring tools to produce Web content as it is that content be accessible to people with disabilities. This requires that the authoring tool user interface follow standard user interface accessibility guidelines. Standard accessible user interface techniques are ably covered in a number of resources and will not be addressed in this chapter (Treviranus et al. 2000). In addition to standard accessible user interface principles there are a number of unique accessibility challenges that are presented by the task of authoring that should also be addressed. One unique accessibility challenge associated with authoring is that the default presentation or rendering of the content that the author is creating may not be accessible to the author. The author should therefore be able to configure the tool interface and rendering of the content independent of the final default rendering, while authoring. For example, an author with low vision may require text to be presented in a 46-point size with a dark background and light foreground text. This may not be the desired rendering of the Web site the author is creating. This can be addressed by allowing the author to adjust the presentation of the user interface and content without affecting the styling of the authored content. Standard authoring tasks include cutting, copying, moving and pasting content. Typically this involves mouse-based, visually dependent highlighting, dragging and dropping. When the application is designed accessibly this can be achieved using keyboard equivalents; however, moving to and selecting the desired chunk of content can be a considerable challenge when relying on the keyboard. Enabling navigation using the structure (e.g., from one H1 to the next, through all H2s nested within an H1 and then to the first paragraph)
and selection of structural chunks (e.g., header, body, paragraph, etc.) makes this important task much more efficient and accessible. Authoring is frequently a collaborative task. When authoring a largely textbased document, change tracking commonly relies on color-coding and other purely visual cues. Modally independent alternatives must be developed for these cues (e.g., text-based alternatives or markup that can be interpreted as a change in voice if read by a screen reader). When the collaborative environment or application is used to create or to communicate through graphic information, such as a white board application, more creative solutions are needed to make the information and the collaboration accessible. One approach is a white board that offers a palette of vector graphic shapes in place of free-hand drawing. These shapes can be combined or grouped and new combinations can be added to the palette. For example a triangle on top of a rectangle with smaller rectangles can be combined to be a rudimentary house that can then be added to the palette. If each of these shapes and grouping of shapes has an associated text label, an individual who is blind can decipher the visual model being collaboratively constructed. The facility for a collaborative peer to also add a text description of the graphically presented information will add to the accessibility of the collaboration. The most challenging online authoring environments from an accessibility perspective are communication environments in which the information is created synchronously or in real time and must be responded to in real time. These include text chats, voice over IP and video over IP. These present a particularly difficult challenge because there is little opportunity to create equivalent content for audio or visual information. Surprisingly even text-chat environments continue to present barriers even though the communication medium is text. The primary accessibility barrier in text-chat environments is that screen readers and refreshable Braille displays are unable to logically handle focus. Thus a screen reader will intersperse speaking a message being constructed by the screen reader user with messages coming in from other participants. This has been addressed in applications such as A-Chat (http://achat.atrc.utoronto.ca/). When real-time communication occurs using speech or video, providing equivalent content such as captions or descriptions is much more challenging. Two options include relying on ad hoc peer captioning or description or using a video or audio relay service (i.e., access to professional transcribers or sign interpreters through a remote link). The communication environment or application should provide input supports to enable this peer or relay translation.
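One possible markup-level approach to the change-tracking problem described earlier in this section, offered here as an illustration rather than as the chapter's own solution, is to record revisions with the HTML ins and del elements. Because the change is carried in the structure rather than only in colour, a screen reader can announce it as an insertion or deletion (for example, by a change of voice), and the datetime attribute can carry when the change was made (the date below is invented):

<p>The meeting will be held on
  <del datetime="2008-01-15T10:00:00Z">Tuesday</del>
  <ins datetime="2008-01-15T10:00:00Z">Thursday</ins>
  at 2 pm.</p>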
4 Future Directions

4.1 Rich Internet Applications

An as yet unmet accessibility challenge that has threatened to derail Web accessibility progress and affects authoring tools that are implemented as Web applications is the accessibility of rich Internet Applications, Web 2.0 or technologies such
as Ajax. Current techniques and strategies to make the Web or desktop applications accessible to assistive technologies such as screen readers are confounded by rich Internet or Web 2.0 technologies. Rich Internet Applications have user interfaces that are more varied and more responsive to user actions than traditional, page-based Web sites. Rich Internet Application interfaces have controls and user interactions built from available HTML elements, styled with CSS and given behavior or animation through JavaScript to perform functions not possible with traditional HTML. To provide meaningful access to a person with a disability, an assistive technology must not only communicate or describe a static page, but must also describe a variety of interactions and a constantly changing page or display. Similarly, alternative control devices such as on-screen keyboards must find actionable items or controls on a constantly changing display. This is not possible given the present functioning of assistive technologies and the unpredictable and non-explicit nature of rich Internet interface components. The recent Target lawsuit and other lawsuits of this kind in the United States have attracted a great deal of attention to this issue (as an illustration, type ''Target lawsuit accessibility'' into your Google search engine). Unfortunately, the public view of the issue has become polarized such that Ajax, the most popular Web 2.0 tool, has been framed as anti-accessibility, and disability advocates are working to ban or prevent its use. As the popularity of Ajax and related technologies increases, this is bound to fail and further characterizes accessibility as anti-progress, anti-innovation and constraining technical creativity. Several initiatives of significance have emerged to address this unfortunate dilemma. The Accessible Rich Internet Applications project (ARIA) is organized through the World Wide Web Consortium with support from IBM, Sun and Mozilla. This working group has been mandated to coordinate efforts to ''fix the accessibility of Rich Internet Applications'' (http://www.w3.org/WAI/PF/). The primary tasks have been to create a roadmap for Accessible Rich Internet Applications (see http://www.w3.org/TR/aria-roadmap/), and to create the semantic markup needed to adequately describe the roles, states and properties of widgets, applets and UI components so that alternative access systems can adequately process these elements. To create accessible Ajax or rich Internet controls and widgets (which have functionality outside the capabilities of HTML) there must be a mechanism to communicate the role of a control and the state that it has in the Web browser in a way that is comprehensible to the user who is relying on audio information or Braille. For example, this mechanism might indicate that a part of the page is a progress bar and that it is at 60%. This work is not complete, as new controls are developed almost daily and the common vocabulary has not been implemented in Web 2.0 application toolkits.
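The progress bar example in the preceding paragraph can be expressed with this role, state and property vocabulary. The snippet below uses the attribute names as they were later standardised in WAI-ARIA; the exact syntax was still in flux at the time this chapter was written, so it should be read as an illustrative sketch rather than the working group's final form:

<!-- a scripted widget that HTML alone cannot describe -->
<div role="progressbar" aria-valuemin="0" aria-valuemax="100" aria-valuenow="60">
  Uploading: 60% complete
</div>

A screen reader that understands these attributes can announce the element as a progress bar and report its current value, even though to the browser it is only a styled div whose appearance is updated by JavaScript.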
4.2 Individual Optimization or ''One Size Fits One''

Many Web sites currently offer the opportunity to log in and create a personalized profile that persists on the site or to express personal preferences regarding
the interface or content on the site. This provides the opportunity to optimize the site for each individual user. This can be an effective mechanism for delivering individually optimized accessibility as well. Standards and specifications have been created that provide a common language for expressing accessibility preferences and needs, in very functional terms, that apply to users with and without disabilities (Jackl, Treviranus & Roberts 2004; Norton & Treviranus 2003; http://jtc1sc36.org/). If these are commonly implemented a user with a disability or any user can have a portable preference profile that they can take from application to application. These profiles can also be context specific to accommodate varying needs caused by the device used, the environment or other circumstances that may cause a shift in needs or preferences. For the site author this means that all accessibility guidelines do not need to be addressed in a single instance of the site, and the content and interface can be dynamically transformed or replaced depending on the user.
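One simple, widely supported way for a site to offer this kind of optimization today is through alternate style sheets that a user, or a preference-aware user agent, can switch between. The sketch below uses invented file names and is not the IMS AccessForAll machinery referred to above; it only illustrates the general idea of one site serving several renderings:

<link rel="stylesheet" href="default.css" title="Default">
<link rel="alternate stylesheet" href="high-contrast.css" title="High contrast, large text">
<link rel="alternate stylesheet" href="low-distraction.css" title="Simplified layout">

A portable preference profile would then need to do no more than select the appropriate alternative automatically when the user arrives at the site.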
5 State of the Field

Despite extensive accessibility advocacy, policy and legislation, very few Web authors are aware of accessibility guidelines or knowledgeable in accessible authoring practices. Very few authors see accessibility as a priority when creating Web sites. Most authors of Web content, however, use some form of authoring tool to create Web pages. Education, advocacy or compliance evaluation programs will not effectively address the prevalence of inaccessible Web sites. These approaches demand skills and conventions that do not match the reality of Web authoring; not all authors need to know the technical minutiae of accessible authoring practices, just as not all authors need to know about HTML to author Web content. Evaluation and repair programs or conformance testing occur after a Web site is created (often after it is publicly available), causing the author or evaluator to retrofit or rewrite content. The best and most efficient strategy for ensuring that content is accessible is to broadly implement the use of authoring tools that create accessible content. This strategy would ensure that even authors who are not knowledgeable about or motivated to create accessible content do so, almost unconsciously. In this way, accessible authoring would also be an integrated part of the process rather than an afterthought, reducing the time required to repair accessibility problems. This approach can be accomplished by mandating or promoting – through legislative or policy mechanisms – the use of authoring tools that are compliant with ATAG. The Web Accessibility Initiative was founded in an era when there was a clear distinction between content, authoring tools and browsers or user agents. Today these distinctions are blurring. Many Web environments have become collaborative authoring environments where the distinction between content, authoring and viewing becomes an academic rather than practical distinction. Forums, blogs, wikis and sites such as Flickr, YouTube, MySpace and Facebook can be seen as content, authoring tools and special-purpose user agents. It may be time to
create a new conception of accessibility guidelines to address this convergence. This new conception could be based on more practical classifications of functionality such as professional and amateur authoring, dynamically generated and manually authored content, software development kits and component libraries. This new conception could also take into account accessibility through personal optimization rather than through a single universally accessible resource. One of the key challenges facing the accessibility field at the moment is the reputation of accessibility among Web developers. Accessibility has been characterized as anti-innovation, anti-creativity. Developers are cautioned or prevented from using new technology due to accessibility concerns. Accessibility evaluation is frequently seen as a policing, or punitive function. The sad irony is that the accessibility challenge is more in need of innovation and creativity than many other areas. Fortunately it can be shown that inclusive design spurs creativity and innovation and benefits everyone. To achieve an inclusive Web, accessibility advocates must work to ally accessibility with innovation and creativity. This can be achieved in large part by focusing on integrated accessible authoring rather than compliance testing and by the promotion of more flexible accessibility strategies such as personal optimization which support the use of a variety of strategies and allows experimentation with new technologies that are not necessarily accessibility vetted. With the emergence of the participatory Web (Kelly 2005) it has become even more critical that people with disabilities have equal access to communication over the Web – to both receiving and expressing information. This is true from the perspective of the individual and the community. New technology-enabled social practices such as tag clouds and social bookmarking intensify the effect of nonparticipation. All things popular and current rise to the top and gain additional significance. Taking the example of tag clouds the most popular topics increase in size, while the less popular shrink and eventually disappear. Thus the values of popularity and newness gain prominence. This reinforces the popular view and any perspective in the minority will never win the popularity contest. Perspectives that cannot participate are rendered invisible. If people with disabilities do not have accessible means of contributing, their perspective and needs will disappear. Equal participation may also bring about the promotion of more inclusive alternatives to popularity as influential values in these online communities.
6 Conclusions

Authoring tools are a critical piece of the Web accessibility puzzle: they offer a powerful and effective mechanism for supporting the creation of accessible Web content and they are the key to equal participation on the Web. As more and more important daily functions occur on the Web and as the Web becomes our source for socialization and community, this equal participation becomes even more critical. A principle that has been underemphasized in Web accessibility efforts is that people with disabilities must be producers as well as consumers of
information on the Web. This has become even more important with the emergence of the participatory Web. If this participatory Web is inaccessible to people with disabilities, the contributions, creativity, as well as needs of a large segment of society will become invisible. The research agenda to address accessible authoring is not only of great magnitude but also of great significance.
References

Abou-Zahra, S. (March 2007). Evaluation and Report Language (EARL) 1.0 Schema, W3C Working Draft 23 March 2007. Retrieved August 1, 2007, from http://www.w3.org/TR/EARL10-Schema/
Nomensa (2006). United Nations Global Audit of Web Accessibility. Available from Nomensa at http://www.nomensa.com/resources/research/united-nations-global-audit-of-accessibility.html
Jackl, A., Treviranus, J., & Roberts, A. (2004). IMS AccessForAll Meta-data Overview. Retrieved May 1, 2007, from http://www.imsglobal.org/accessibility/accmdv1p0/imsaccmd_oviewv1p0.html
Kelly, K. (2005). We are the web. Wired, 13, August. Retrieved June 5, 2007, from http://www.wired.com/wired/archive/13.08/tech_pr.html
Norton, M. & Treviranus, J. (2003). IMS Learner Information Package Accessibility for LIP Best Practice and Implementation Guide. Retrieved March 1, 2007, from http://www.imsglobal.org/accessibility/acclipv1p0/imsacclip_infov1p0.html
Treviranus, J., McCathieNevile, C., Jacobs, I., & Richards, J. (2000). Authoring Tool Accessibility Guidelines 1.0. Retrieved May 1, 2007, from http://www.w3.org/TR/2000/REC-ATAG10-20000203
Part III
Applications
Users' experience depends on both how a Web page is designed and the technologies used to access that page. The previous part discusses the former, how a Web page can be designed, created, and evaluated to make it accessible, and this part discusses the latter, how new technologies can be developed to ease access to Web pages. By supporting the user with modifications and strategies derived from experimental in vitro research, it is hoped that in vivo interactions are enhanced. What research exists in this domain? How do we afford access to the Web as a whole? How do we support bespoke interaction mechanisms and reform invalid or difficult-to-use Web pages? Current research is focused on solving these problems and understanding the new issues that will affect client-side engineering. Currently, the main way to render content on the client is to use a standard Web browser (Assistive Technologies and Desktop Browsers) and join the visual rendering specified in the CSS with the DOM specified in the XHTML document. However, conventional renderings are often not appropriate for disabled users and so other browsers such as IBM's Home Page Reader and Oxford Brookes University's BrookesTalk (Specialised Browsers and Browser Augmentation) were created to get around these problems. While these have met with some success, and indeed work still continues on browsers such as HearSay and Emacspeak, bespoke browsing solutions have dropped away and work is now refocusing to use bespoke browsers along with document annotation and transformation (Transcoding) technologies.
Assistive Technologies Alistair D.N. Edwards
Abstract To be excluded from access to the Web is becoming an increasingly severe disadvantage. One potential cause of exclusion is to have a visual impairment. Technologies exist which can make the Web more accessible to such people, but those technologies depend to a large extent on the way the Web content is designed. This chapter gives an introduction to the underlying Web technology and then to the assistive technologies which have been developed to facilitate access, notably specialized non-visual browsers and screen reader/visual browser combinations. Speculation about the future suggests that technologically the picture is quite positive as access and mainstream technologies merge together. The technologies are not the whole answer though; accessible content will only be generated if and when website owners pay greater attention to the need for it.
A.D.N. Edwards, Department of Computer Science, University of York, Heslington, York, UK. e-mail: [email protected]

1 Introduction

To many people, the Web is now their principal source of information. This is a trend which will only continue with time. It is not confined only to those countries with highly developed technology either. The low costs and the range of wireless technologies mean that the same is true even in less developed countries. All this means that increasingly to not have access to the Web for any reason is becoming a serious disadvantage. There are many causes of exclusion. Economics is still going to be the major one, but there are also questions of physical access through the computer. By the time the Web was established, the graphical user interface (GUI) was standard. Thus, the normal means of access was through a keyboard, mouse and a screen. Indeed, the hypermedia paradigm underlying the Web virtually mandated the use of these input and output devices. There were exceptions,
notably Lynx,1 which was a text-based browsing program, but for the most part, to use the Web requires that the user can make selections with a pointing device (usually a mouse), enter text (normally via a keyboard) and perceive the visual output on a screen. Given the ubiquity of the GUI, a variety of assistive technologies have been developed to give access to people who cannot use the conventional technologies for one reason or another. For the most part, technologies which work for GUIs in general will work to allow access to Web browsers. Some people have difficulties using conventional computer interfaces – typing on keyboards or pointing with a mouse – due to physical impairments. However, aids and adaptations have been developed which overcome these problems for many users who have physical impairments. So, if a physically disabled person can use a computer, then to all intents and purposes they can use the Web. The real problem of accessibility specifically to the Web exists for people with visual disabilities. This is because the Web has developed into a largely visual medium. In this chapter, we will therefore concentrate on overcoming the problems of access for those who have difficulty accessing visual material.
Fig. 1 The different components of accessibility. Tools play an important part along both limbs of the path between Authors and Users. (Figure based on http://www.w3.org/WAI/intro/relate.png)
1 http://lynx.isc.org/
There are two approaches to achieving accessibility. The first is to use a specialized adapted browser and the second is to use a standard browser along with additional access technology, particularly the screen reader. Both approaches have their advantages and disadvantages which will be discussed in this chapter, but before we can discuss them in detail, it is necessary to set out the problems that they attempt to address. The requirements to achieve accessibility are summed up in Fig. 1 and discussed in Chisholm and Henry (2005). Notice that content is central, but that achieving accessibility depends on the Web page authors – who can be assisted by tools – and on the technologies which are available to the users. Both of these sets of technologies are discussed in this chapter. In order to understand the technology and how it works, it is necessary that you have a basic understanding of Web technology, and particularly the ‘language’ of web pages, HTML. The next section presents a brief introduction to HTML, but if you are already familiar with this you may skip to the next section.
2 A Brief Introduction to Web Technology: HTML

Most Web pages consist mainly of text, which is marked up using the HTML (hypertext markup language) notation. That is to say that the text to be displayed is marked with tags which describe its structure and sometimes its intended appearance. The details of HTML markup are well beyond the scope of this chapter and this book. Anyone who wants more details might consult a reference such as Musciano and Kennedy (2000). However, it will be helpful to explain some of the basics. The example in Fig. 2 should be sufficient to explain the main points of this chapter.
<html>
<head>
<title>Document title</title>
</head>
<body>
<h1>Top-level heading</h1>
<h2>Sub-heading</h2>
<p>This is a paragraph of text. <em>This is important so it is emphasized.</em> <i>This is important too, so the author has specified it should be in italics.</i> Here is a <a href="otherpage.html">link</a> to another page.</p>
<img src="rabbit.jpg" alt="A picture of a rabbit">
</body>
</html>
Fig. 2 A sample Web page which is marked up with HTML tags. How this page would appear when viewed in a browser is shown in Fig. 3
Notice that the structure of the page is apparent from the markup used. The markup consists of tags which delimit and structure the text and other material to be displayed. Tags are texts contained in diamond brackets. They usually occur in opening and closing pairs, distinguished by the slash character, '/'. The overall structure is marked by the html tags opening with <html> and closed by </html>. These enclose the whole page, marking it as an HTML page. (While most Web pages are in HTML format, other formats are allowed.) The page is then divided into a head part and a body. Within the head there has to be a title. This tells the reader something about the page – and usually is shown as the window title in the browser (see Fig. 3). The text to be displayed on the page is contained in its body. In the example you can see the different levels of heading. Top-level (level 1) headings are marked with h1 tags, level 2 is h2 and so on. Then there is a paragraph (delimited by <p> and </p>). There is some plain text and then a sentence which is to be emphasized, as signalled by the <em> tags. The next sentence is to be rendered in an italic typeface, marked by <i>. The power of the Web comes from the ability to link between pages, that is to be able to click on a piece of text (or a picture) and for a corresponding new page to then be displayed in the browser. Links are referred to in HTML as 'anchors'. The word link in Fig. 3 is a link to another page, in the file otherpage.html, as signified by the <a> tag with the attribute href=''otherpage.html''. Next there is a picture, or 'image', specified with the <img> tag. The picture is a graphic contained in another file and the name of the file is specified by the
Fig. 3 How the HTML page in Fig. 2 might be rendered by a browser
src ('source') attribute of the <img> tag. Another attribute of that tag is the alt text, which is an important accessibility feature. There are a number of points illustrated by this example. Firstly note that the HTML markup specifies the structure of the page. In addition to the compulsory structural tags described above (<html>, <head>, <body>), the different levels of heading are useful landmarks. For instance, ideally every page should have precisely one level-1 (<h1>) heading. This should thus be a clear indication of the page's purpose. All level-2 headings are subsidiary to that and indicate the subsections of the page. Within a level-2 subsection there may be third-level (<h3>) headings and so on. The difference between the <em> and <i> tags is important. The <em> tag is an indication from the author of the page that the enclosed text is important and worthy of emphasis. In using this tag the author is not saying anything about how that emphasis should be expressed. In practice, most browsers use the convention of displaying the text in italics (see Fig. 2 and Fig. 3). This is quite appropriate as it is consistent with typesetting conventions whereby italics signify emphasis. The <i> tag is different, though. By using the <i> tag, the page author is specifically saying that the text must be rendered in an italic typeface. This might be because the typeface is being used to signal something other than emphasis. For instance, italics are often used to signal the use of a foreign language phrase within a document, etc. In that case only italics will do and the author should use the <i> tag, not <em>. It is up to browser manufacturers as to how they will render all of the tags (visually, in the case of a conventional browser). For example, an <h1> heading should be given greater prominence than lower-level headings. One browser might do this by using a large bold font, left-justified (as in Fig. 3), but another might make it large, underlined and centred. On the one hand, this means that these features should be used as a means of structuring the document (which aids accessibility), but on the other hand it means that the page designer forfeits a lot of control over the visual appearance of their page. That can lead to designers not using the features, creating beautiful-looking pages – but ones which are inaccessible. The alt attribute (often referred to as the 'alt text') provides an alternative non-visual representation of an image. In the example above, there is an alt text ('A picture of a rabbit') in Fig. 2. This text does not appear in the visual rendering in Fig. 3 – but a non-visual rendering would not show the picture, but rather display (in speech or text) the alt text. Once again, the degree of accessibility achieved depends very much on the page author. The first question is whether the author takes the trouble to include the alt text at all. If they do (as they should if they wish their HTML to be standard-conforming), then the question is what to put in it. The very richness of visual images is again the fundamental problem. What a picture contains and how it is used depend very much on the viewer. Some images are used purely for decoration while others may contain vast amounts of information. The page author must decide what is appropriate to put into the alt text.
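For the rabbit image in Fig. 2, for instance, the decision might come down to variants such as the following (the file name is invented for illustration; the original figure does not dictate one):

<img src="rabbit.jpg" alt="A picture of a rabbit">
<img src="rabbit.jpg" alt="A brown rabbit crouching in long grass, ears flattened">
<img src="rabbit.jpg" alt="">

The first identifies the image, the second tries to convey its content, and the third treats it as pure decoration; which is appropriate depends entirely on the role the picture plays on the page.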
There are guidelines on the use of alt texts,2 but even these cannot be definitive. There are also differences of opinion as to the appropriate use of alt texts. For instance, one school of thought suggests that a purely decorative image is irrelevant to non-visual access and requires no distracting alt text (i.e. the text will be specified as alt=""), while another attitude is that the blind reader deserves the same information as the sighted one, so that all alt texts should be meaningful. It should be apparent that the accessibility of a page can be heavily influenced by the way that it is composed by the page author. It was suggested above that ‘ideally’ every page should have precisely one level-1 heading. In other words, this is not always the case; not every Web author sticks to this convention. Indeed, they might not use heading tags at all. For instance, if they want control over exactly how their top-level headings will appear on all browsers, they might explicitly mark the heading’s format, as in Fig. 4. This gives the page author full control over the appearance of the text, but it tells the browser nothing about the structural role of the text. Similarly, most Web page authors do not distinguish between the use of italics for emphasis and in other roles. In other words, they rarely use the <em> tags, but instead use <i> whenever they deem emphasis to be appropriate. Any text can be used as a link to other pages, but images can also be used as links. In the simplest form the whole image is a link, in which case its alt text should include an indication of the link. Another form of image, though, is the image map. This is an image that the user can click on, but different parts of the picture are linked to different destinations. Commonly, maps are displayed in this way, so that clicking on the south-east of England will take the viewer to information about London and surrounding counties, while a click on the north-west yields information on that area. There are two ways to implement such image maps: on the client side or the server side. Client-side image maps are much more accessible, so again it is up to the page author to decide whether to implement their page in an accessible form or not. A client-side image map has all the information as to which area of the image links to which destination contained within the page. Thus, the browser (and screen reader) can access that information and render it in a non-visual form. In a server-side image map, that mapping information is held on the server from which the page has been fetched. This is inaccessible to the browser and screen reader.
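A minimal sketch of a client-side image map along the lines just described follows; the file name, coordinates and link targets are invented for illustration. The point is that each region carries its own href and alt text within the page, so a screen reader can list the destinations:

<img src="britain.gif" alt="Map of Britain" usemap="#regions">
<map name="regions">
<area shape="rect" coords="120,200,180,260" href="london.html" alt="London and the south-east">
<area shape="rect" coords="40,60,100,120" href="northwest.html" alt="The north-west">
</map>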
<p align="left"><font size="7"><b>Top-level heading</b></font></p>
Fig. 4 Markup specifying the precise appearance of a piece of text, which is intended as a top-level heading. In this case the text will be aligned to the left, very large (7 is the largest font) and bold
2
http://www.w3.org/TR/WCAG10-HTML-TECHS/
Before moving on, though, it should be pointed out that HTML is no longer the only mechanism for specifying the format of Web pages. Cascading Style Sheets (CSS) are a means of specifying format in a more flexible manner. If we take the earlier example of a top-level heading, the page’s author could control the appearance of the heading by specifying it in a style sheet. That is to say that the author can specify the format of all <h1> headings (left-justified, large and bold), and then wherever an <h1> heading appears in that page (or on the whole site) the same rendering will be used. In other words, the author can override whatever rendering decisions the browser manufacturer has made. Furthermore, the viewer of a Web page can use their own style sheets. This can be a big aid to accessibility. For instance, a user might prefer that all of the text on a page be displayed in white text on a black background. Users can define their own style sheet, containing a new default style for the <body> tag which specifies white-on-black. This style can then be applied to every page the user views. The use of CSS thus can make it easier to achieve accessibility – as well as giving page authors good control over the appearance of their pages.3 HTML in a purer form can then be used to specify the structure of a page and CSS can separately define the appearance. Once again, though, it is up to the Web page designer to make appropriate use of the facility and not all of them do this. Page layout is important, so text in one position may have different significance to that in a different position. Furthermore, the layout on a Web page may imply a structure which is not apparent when the page is rendered in a non-visual form. A simple example is the use of multiple columns. Text laid out in two columns with a gap between them will be read a column at a time by a sighted reader, but a screen reader is likely to read across the columns. For an example, see Fig. 5. Again, the Web page author has some control over how accessible their layout can be. Layout effects such as multiple columns are often implemented by using the table facility in HTML, but these ‘tables’ are more difficult to access when used in this way. Again it is the case that CSS can be used to obtain the same results in a much more accessible way. As explained above, the table tag was intended as a way of presenting tabular information, in rows and columns. It has been subverted into being used to specify layout and this should be avoided, but of course there is still a use for ‘true’ tables. This use should not be precluded for accessibility reasons, but there are more and less accessible ways of designing data tables. For instance, a cell in a table is identified with the <td> tag. There is a separate tag, <th>, intended to identify a table row or column header. Many authors do not use the <th> tag, though. Instead they use a <td> tag, possibly varying its appearance by making the text bold, for instance. However, if the <th> tag is used then the browser and/or screen reader can recognize the structure of the table and present it appropriately.
3 Any reader who is interested in the design potential of CSS should consult Clark (2002) and the CSS Zen Garden, http://www.csszengarden.com/
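To make the distinction concrete, a small data table marked up in the more accessible way might look like the following sketch (the data are invented for illustration); because the first row uses <th> cells, a screen reader can announce each data cell together with its column header:

<table>
<tr><th>Item</th><th>Price</th></tr>
<tr><td>Carrots</td><td>£0.60</td></tr>
<tr><td>Lettuce</td><td>£0.90</td></tr>
</table>

The summary attribute described below could also be added to the <table> tag.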
Fig. 5 Two-column text. The sighted reader would start reading ‘This book is intended to be read mainly by people in secondary schools...’, but a screen reader would read right across, horizontally. Thus it would read out ‘This book is intended this will give you to be read mainly by some idea what the...’. Clearly this would be confusing to the listener
Another table feature which is little used by most page designers, but which can greatly improve accessibility, is the summary attribute of the table tag. As its name implies, the summary attribute is used to provide a brief textual summary of the table’s contents. Although the majority of Web pages consist of HTML, other formats are also available. Interactive animations can be created using Adobe Flash.4 This requires additional player software for the browser, but this is freely downloadable. As with HTML, it is rather easier to produce inaccessible Flash than an accessible version, and unfortunately it is very difficult to make Flash fully accessible. It is beyond the scope of this chapter to go into the details of writing accessible Flash, and the interested reader should consult the WebAIM website.5 JavaScript is the most common form of client-side processing. That is to say that an HTML page can contain program code which is executed as the page is viewed. This requires the browser to have a built-in interpreter, but these are standard on most modern browsers. Once again, writing JavaScript which is accessible is difficult. Sometimes it is easier to ensure that there is an alternative to the JavaScript presentation. WebAIM has guidelines on making JavaScript more accessible.6
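One common way of providing such an alternative is to attach the script to an ordinary link that already points to a real page, so that a browser without JavaScript (or a user who cannot operate the scripted version) still reaches the content. The file names and function below are invented for this sketch:

<script type="text/javascript">
function showDetails() {
  // Reveal the details within the current page; returning false stops
  // the browser from also following the link when scripting is available.
  document.getElementById('details').style.display = 'block';
  return false;
}
</script>
<a href="details.html" onclick="return showDetails();">Product details</a>
<div id="details" style="display: none">Full product details.</div>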
It should be evident that there are ways whereby a Web page author can make their pages more or less accessible. Good practice has been captured in the Web Accessibility Initiative’s Web Content Accessibility Guidelines (WCAG).7 These cover the kinds of features listed above (use of alt texts, client-side image maps, avoidance of the use of tables for formatting, etc.) – and many more, which should be implemented in order to make pages accessible. Guidelines are rated as to their level of priority8:
Priority 1: A Web content developer must satisfy this checkpoint. Otherwise, one or more groups will find it impossible to access information in the document. Satisfying this checkpoint is a basic requirement for some groups to be able to use Web documents.
Priority 2: A Web content developer should satisfy this checkpoint. Otherwise, one or more groups will find it difficult to access information in the document. Satisfying this checkpoint will remove significant barriers to accessing Web documents.
Priority 3: A Web content developer may address this checkpoint. Otherwise, one or more groups will find it somewhat difficult to access information in the document. Satisfying this checkpoint will improve access to Web documents.
There are three corresponding levels of conformance to the guidelines, A, AA and AAA: a page which satisfies at least all the Priority 1 guidelines qualifies for Level A conformance, one which also satisfies all the Priority 2 guidelines qualifies for Level AA, and a page which also implements all the Priority 3 guidelines qualifies for Level AAA. It should be borne in mind, though, that simple conformance to guidelines does not necessarily ensure true accessibility. A report for the Disability Rights Commission (DRC, 2004) shows that there is no substitute for testing with real people. Clark (2002) is a very useful guide to building accessible websites. This section has given a brief introduction to the main underlying technology of the Web with some indications of potential accessibility problems. Now we can extend this into a more detailed treatment of how blind and partially sighted people can achieve access.
3 Tools to Support Accessible Design
The topic of this chapter is access technology, which really refers to the technology on the right-hand side of Fig. 1. It is beyond our scope to go into detail on the technology on the left-hand side, that which supports the Web author, but it will nevertheless be mentioned briefly.
As explained in Section 2, the choices that a Web page author makes can have a great influence on the accessibility of their pages when viewed using access technology. Authoring tools should assist the author in achieving this. The first level is the definition of Web standards. For instance, the alt text is a feature which makes images much more accessible, and in recent versions of the HTML standard the alt text has been made a compulsory attribute. Thus, any author who wishes to make pages which are standards-compliant will have to insert alt texts. A number of tools exist which will check pages for standard conformance, including the Web Design Group HTML Validator.9 As discussed in Section 2, most accessibility requirements cannot be captured in HTML rules but are instead embodied in the WCAG guidelines. There are a number of software tools which can help check for conformance.10 One problem with the guidelines is that they cannot all be checked mechanically; they require human intervention and judgement. For instance, a program can easily check whether an image tag has an alt attribute – but it cannot check whether its text is appropriate for the image. Indeed, an author might insert the attribute alt="": is this just an attempt to circumvent validation checks or a positive choice to use the convention that decorative images should not be described? Another example is that a tool can detect that there has been a change in the colour of a piece of text, but it takes a human viewer to judge whether the colour conveys information. If it does, then that information should also be signalled in a non-visual form. Validation tools work in different ways. Some will simply point out errors – including potential errors which require manual checking. Others will also offer the option of fixing errors, automatically or with some input from the user. Authoring and validation tools can thus be more or less integrated. Ideally they should be, so that authors generate accessible code as they create a page, rather than accessibility being a post hoc revision. The WAI maintains Authoring Tool Accessibility Guidelines.11 As discussed more fully in Section 8, full accessibility will not be achieved until website developers build their sites to be accessible. Authoring tools should assist in this process and ideally should make it as easy – or easier – to build an accessible page as an inaccessible one.
4 Visual Screen Access
It is important to make a distinction between the access problems of people who have impaired vision (partially sighted) and those who cannot see at all. The needs of the two groups are very different. Those who have some sight generally
9 http://www.htmlhelp.com/tools/validator/
10 A list of such tools can be found at http://www.w3.org/WAI/ER/tools/
11 http://www.w3.org/WAI/intro/atag.php
prefer to make as much use of their vision as possible. They will therefore not use the non-visual alternatives that are described below. Rather they will need ways of enhancing the visual output from the Web to accommodate their level of vision. There are many different causes and forms of visual impairment and hence a corresponding variety of accommodations that can be provided. There can certainly be no ‘one-size-fits-all’ approach; indeed a screen enhancement which supports users with one form of impairment may in practice make access worse for others. The simplest example is enlargement. Enlarging the elements on a screen can be essential for some users, but for anyone with tunnel vision (as might be caused by glaucoma12), enlargement only makes matters worse. Tunnel vision means that the person can only see a limited area at any time. It might be, for instance, when reading text, that the person can only see one word at a time. This makes reading much slower than for someone who can look ahead several words at a time. If enlargement is now applied, then it might be that only two or three letters can be seen at any time. This then makes reading very much slower, because the reader cannot see the shape of the words and has to essentially read them letter by letter. Having clarified this, it has to be said that some users do benefit from screen enlargement. Commonly, such enlargement is achieved through software. This may be through facilities built in to the operating system, or through additional software.13 These programs generally work by enlarging all or part of the screen. A problem is that the user can lose contextual information because the physical screen can only display a portion of the virtual screen, as shown in Fig. 6. For instance, in the unenlarged screen in Fig. 6(a) all of the text is visible, but when enlarged (Fig. 6(b)) only parts of the text can be read at a time. A more extreme example is that an event may occur on the portion of the virtual screen which is outside the physical screen. For instance, an alert may pop up which freezes the browser, but because the user cannot see (and dismiss) it, they do not understand why the browser has suddenly become unresponsive. Many screen enlargers have features which try to compensate for this, to allow enlargement but at the same time to maintain an overview. For instance, there may be two separate windows, one displaying an enlargement of a portion of the screen and another showing that portion within the context of the whole screen. As mentioned earlier, the cursor is a vital component of the graphical user interface. A user who cannot locate it is at a disadvantage, and this can be hard for a partially sighted user. Screen enlargers and operating systems often have facilities to help. An obvious remedy is to enlarge the standard cursor. An alternative is to change its form altogether. One useful approach is to make the cursor into a set
12 http://www.tiresias.org/controls/visual_impairments.html
13 See http://www.tiresias.org/equipment/eb19.htm for a current list of such software.
Fig. 6 The left-hand picture (a) shows how a whole Web page would be displayed on the screen. If that page is enlarged, though, as in (b) on the right, then only a small portion of the whole page is visible. The blurred area of the virtual screen outside the physical screen is invisible
of crosshairs, the full width and height of the screen. Then, if lost, the user can visually scan the screen horizontally and vertically to locate the hairlines and hence where they cross. Colour and colour combinations can make a big difference to usability for people with visual impairments. Choices of colours can be very individual and idiosyncratic, but the common preference is generally for combinations with high contrast. Again, the operating system or screen enlarger usually allows for some choice in colour schemes, but these are limited with regard to Web access because the colour schemes of individual Web pages are dictated to a large extent by the page’s author. There is some scope for overriding the colour scheme, but the extent to which this is possible depends on the way that the page author has specified the colours within the page. As discussed in Web Accessibility and Guidelines and Authoring Tools, the appropriate use of Cascading Style Sheets (CSS) can make accessibility easier to achieve – in general and particularly with reference to the choice of colours. Ideally, the viewer of a Web page can specify their own style sheets with whatever colour schemes they find most readable. Other visual enhancements can be quite simple and external to the computer. For instance, a good-quality monitor with a high-resolution, bright screen may make all the difference in terms of visibility for an individual. Some display technologies will be more appropriate for some users. For example, a plasma or CRT display may be preferable to an LCD screen, which may be less bright and can only be viewed at certain angles. Thus, visually impaired people who have some vision tend to use standard software as much as possible. They will use standard browsers, but making use
of whatever visual enhancements are available. If these are not sufficient, then they may use additional screen enhancement software.14
5 Accessible Browsing
As explained in Section 2, a Web page contains not only the text which is displayed by the browser but also the HTML markup describing the structure of the page. A visual browser reads in the marked-up text and renders it to the user. There are thus effectively two points at which the page can be captured and rendered in a non-visual form for a blind user: either as raw HTML (Fig. 2) or as formatted text (Fig. 3). Specialized browsers do the former, whereas screen readers (in conjunction with standard browsers) work in the latter way.
5.1 Specialized Browsers
A specialized browser is one which is designed to render Web pages in a non-visual form. Specifically there are a number of talking browsers. These work like any other browser in that they accept HTML pages from the Web and render them to the user. Having access to the HTML means that the browser has maximum control over its rendering. For instance, knowing that a portion of text is to be emphasized (<em>, see Section 2), a speech browser might switch to a different kind of voice (e.g. louder and/or lower pitched). This is one reason why it is better to use <em> for emphasis. If the browser only knows that the text is in italics (<i>), should it be rendered as emphasized or in some other way? More importantly, with access to the structural elements such as headings, the specialized browser can assist the user to navigate around the pages in a non-visual manner. Studies of (sighted) people reading Web pages have shown that ‘users do not read on the Web; instead they scan the pages, trying to pick out a few sentences or even parts of sentences to get the information they want’ (Morkes and Nielsen, 1997). Blind users need the same facility, that is to say a means of quickly scanning a page to decide whether it is of interest. Then they can decide whether to read the page in detail or to go on somewhere else. (Surely the whole reason that the activity of using the Web has become known as ‘surfing’ is a reflection of this kind of behaviour? The user floats on the surface of a page, hanging on as long as possible to the good ones – but skipping off the uninteresting ones in search of something better.) If the page’s structure is clearly signposted through the use of heading tags, the browser can use this to transform the method of access. In a simple form, the
14 Currently available aids are listed at http://www.rnib.org.uk/xpedio/groups/public/documents/publicwebsite/public_lowvisioninfosheet.hcsp
browser could effectively construct a table of contents at the head of the page, designed in such a way as to be easily browsed in speech. Links are another important component of any Web page. Again, thinking of the surfing analogy, a user will often use a page merely as a stepping stone to the page that is of real interest.15 It may be, therefore, that the links are the only – or at least the most important – components on the page. Again, a specialized browser can pick these out and present them quickly to the user. A number of specialized speech-based browsers have been developed, including:16
PWWebspeak: This was significant as the first commercial speech-based Windows browser available. It demonstrated the feasibility of access. Development and support ceased some time ago, but copies of it are still available for download.17 Of course, the fact that it has not been developed means that it is not necessarily compatible with the latest Web developments.
BrookesTalk: BrookesTalk was a research vehicle for investigating means of making the Web more accessible. Many of its developments influenced subsequent designs of screen reader facilities (Zajicek, Powell et al., 1999).18
Home Page Reader: IBM’s Home Page Reader was a low-cost talking Web browser. It too is no longer available.
It will be quite apparent that what all of the above programs have in common is that they are no longer being developed and supported. The significance is that as screen reader technology has developed it has become possible to build nearly all of the features of the specialized browsers into screen reader/browser combinations. This has the advantage of flexibility, since a screen reader makes a whole range of software accessible – not only a browser. Indeed, for the user of a specialized speech browser, there was always the problem of how they could launch the browser if the rest of the system was inaccessible! HearSay19 is a current project to build a non-visual browser (Ramakrishnan, Stent et al., 2004). It transforms the HTML page into a more accessible non-visual format by performing a structural and semantic analysis on it and manipulating its document object model (DOM; Bates, 2002). The semantic analysis is based on heuristics and an ontology. At the time of writing, the
15 The page returned by a search engine such as Google is an extreme example.
16 A more complete list of such browsers can be found at http://www.w3.org/WAI/References/Browsing
17 http://www.soundlinks.com/pwgen.htm
18 Further papers on the research underpinning BrookesTalk can be found at http://www.brookes.ac.uk/speech
19 http://www.cs.sunysb.edu/hearsay/
software is available in an alpha release. It is self-voicing and cannot be used in parallel with a screen reader. A beta version is planned for release in December 2007.
5.2 Standard Browsers Plus Access Technology
The alternative to using a specialized browser is to use a standard browser in conjunction with access technology, in particular a screen reader. A screen reader is a piece of software which interrogates the contents of the screen on a computer and turns the contents into a non-visual form. This can be presented in synthetic speech or braille – or both. One advantage of this approach is that the blind or partially sighted person uses the same browser as most of his or her colleagues and friends. This is important in terms of self-esteem, not labelling the user as different. It is also of practical value since all of the users can refer to the same interface. A potential disadvantage is that the browser has been designed for optimal visual presentation. It may not therefore present the information in a manner appropriate for non-visual access. For instance, the best way to write text to be read visually is different from how speech should be composed (Pitt and Edwards, 2002). This is in contrast to the specialized browser, which has access to the raw HTML and which can re-present it in an appropriate form, as discussed above. However, it has to be said that with advances in access technology and its interface to applications, it is now possible to provide almost as good access through this route. Popular screen readers include Jaws, Window-Eyes and Hal for Windows.20 The complexity of any computer screen, including Web pages, means that the task of the screen reader is quite difficult; much of the screen contents cannot be rendered as text (speech or braille) in any straightforward way. Clearly, text on a screen can be rendered as speech or braille, but even that may not be simple. Visual design relies on the fact that a sighted reader can choose which portion of the text to (literally) focus on at any time. Sight is very powerful, enabling the reader to concentrate only on the portion of text of interest at any time and to ignore the rest. Ideally a screen reader should give a similar level of control, but naturally it is not going to be as flexible as vision. A sighted reader moves their focus without even thinking about it, while the screen reader user must input commands to a computer. To a sighted user, the visual layout of the page provides a structure to their exploration. Visual formatting and conventions guide the users to the information they require (at least on a well-designed page). The blind user does not have access to these cues directly. Instead the screen reader has to build an off-screen model, a data structure which it maintains and which should support the user’s
20 Jaws: http://www.freedomscientific.com; Window-Eyes: http://www.gwmicro.com; Hal for Windows: http://www.dolphinuk.co.uk/
access to the information. In particular, the screen reader will thus embody the concept of the current point of interest on the screen and will have commands to control the rendering of text around that point. For instance, the screen reader might read out the current line and a simple command from the user will move the focus to the next line. Screen readers can have a very rich set of such commands. The trade-off is always that the more commands there are, the greater the power the user has but the harder the screen reader can be to use. Not least, there will be more commands to remember. At the same time as maintaining the focus of attention, the screen reader must also track and control the screen cursor. Using a mouse or other pointing device is generally impractical for a blind person and therefore the screen reader also takes on this role. That is to say that the screen reader (in collaboration with the operating system) must also have a set of commands to perform the functions of the mouse. These commands are all entered via the keyboard – and must all be remembered by the user (possibly with assistance). Not all text on a Web page is the same. For instance, there are headings. If the page author has used HTML correctly, the headings can be picked out by the screen reader, so that the structure of the page will be rendered to the user. Links are also important. A screen reader can recognize textual links and highlight them in some way to the user. For instance, this may be in speech (‘link’) or as a non-speech sound accompanying the reading of the link text. Screen readers can thus cope with the re-presentation of text in speech or braille – although even that is not as straightforward as it might be. Clearly, though, graphical items are much more difficult to render non-visually. Most graphics on Web pages take the form of graphics files (<img> tags, described in Section 2). The main facility in HTML designed to make these more accessible is the alt text. The screen reader can thus effectively ignore the graphic and read out the alt text. Modern screen readers have largely overcome many of the access problems discussed earlier, in Section 2. Multiple-column layout, as in Fig. 5, may cause problems for some screen readers, but a well-designed one should be able to recognize the formatting and render it appropriately. More complex layouts, such as tables and frames, should also be transformed into simpler (essentially one-dimensional) formats. By far the most popular Web browser is Microsoft’s Internet Explorer (IE) – and this is also true among blind users. It is generally found that IE works well with screen readers, largely because it uses Microsoft’s Active Accessibility features.21 However, it is not the only such browser. Its main rival, Mozilla’s Firefox, now also uses Active Accessibility and hence has a high level of accessibility.22 In the early days of the Web there was a text-only browser called
21 Note, though, that at the time of writing the latest released version of Internet Explorer is version 7 and this is not compatible with existing screen readers. This is a serious and severely retrograde development from Microsoft.
22 http://www.mozilla.com/en-US/firefox/features.html
Lynx. It soon became outmoded for all but blind and partially sighted users, who continued to use it long after sighted users had moved on to graphical browsers. It seems that its use has largely died out even among that population, though. Microsoft Windows is not the only operating system, of course. Linux is increasingly popular on PC platforms. The Gnopernicus Project is currently under way, specifically with the objective of producing an accessible operating system, including the SRCore screen reader.23 Apple has also responded to the need for accessibility. The latest versions of the Mac OS X operating system have the VoiceOver screen reader built in as standard. Opinions suggest that this is not as fully featured as other screen readers, but it has the enormous advantage that it is an integrated, standard component of the operating system. This represents a major shift. Apple was the first manufacturer to introduce the graphical user interface, with the release of the Macintosh in 1984 (Edwards, 1995). At that time (long before the advent of the Web) it was seen as a severe threat to the accessibility of computers for blind users. Hence for a long time – and during the rise of the Web – Windows, Internet Explorer and compatible screen readers became the technology of choice. It will be interesting to see whether this situation becomes inverted, with the Mac becoming the system of choice for blind users.
5.3 Extending the Technology
Specialized talking browsers are powerful but never really caught on, probably because people prefer to use standard software, as similar as possible to that which their friends and colleagues use. Screen readers facilitate the use of standard browsers, but the combination still gives sub-optimal access. There are thus developments in progress which attempt to achieve the best of both worlds. These are based on standard browsers – but ones which are configured in a particular way. Fire Vox24 is a screen reader that is designed especially for Mozilla’s Firefox browser. It turns Firefox into a self-voicing browser, but it can also be used with other screen readers. Firefox is available on different platforms and Fire Vox has been written to be compatible with Microsoft Windows, Mac OS and Linux. It is intended to be used whenever a speech browser may be required, and this includes use by blind people, so all control can be entered via the keyboard. Fire Vox incorporates the basic features that are expected of screen readers, such as navigational assistance and being able to identify headings,
links and images and the like, but in addition it also provides support for MathML and CSS speech module properties. It also succeeds in making Ajax technology accessible, something that other screen readers cannot do (Thiessen and Chen, 2007). Fire Vox is free and open-source and is implemented as an extension to Firefox. As open-source software it is being constantly developed and improved, and seems likely to become a significant asset. While Fire Vox relies on an extension to the browser, it is still essentially a screen reader. A more complex access problem that a screen reader cannot tackle is the fact that different forms of information are displayed on Web pages. There will be navigation, advertising, decoration and content. The role of any piece of information can usually be deduced by the sighted user because certain visual conventions are used in laying out pages. For instance, there is often a column of navigation information (links) down the left-hand side of the page. Deducing that this is navigation and not (say) content is beyond the capabilities of the screen reader. Other software is needed which uses knowledge about screen layout. Two approaches have been tried. Both of them involve ‘transcoding’ pages into annotated forms which are more suitable for speech access. The system developed by Asakawa and Takagi (2000) works in two modes. Automatic transcoding, as its name implies, requires no manual intervention. However, it can only perform a structural analysis of the Web page. Annotation-based transcoding is more powerful and generates highly accessible pages – but it requires manual annotation, preferably by the page author. The transcoding of the page is carried out by a proxy Web server (Tanenbaum, 2003). This means that the user uses a standard browser, but that browser has to be configured to send its Web page requests via the proxy. Dante (Yesilada, Harper et al., 2004) is described as ‘semi-automatic’. It is based on an extension of the concept of Web navigation, referred to as travel (Goble, Harper et al., 2000). This is used to build a travel ontology which is used to analyse and hence annotate the Web pages. In their transcoded form it is possible to get an overview of the page (essentially a table of contents), to navigate easily within the page (‘skip links’) and to navigate between pages (with a structured list of links). Dante is implemented via a browser plug-in (Tanenbaum, 2003). In other words, it too uses a standard browser – but one which is configured in a particular way (plug-in installed) for the blind user. Both of these systems are experimental; they are not yet the basis of available products. However, it seems likely that they represent the way the technology will move. Ideally it should be no more difficult for the Web page designer to produce a page which is accessible than one which is not. Enhancing browsers through existing extension mechanisms can shift some of the responsibility from the page designer while letting the blind user use standard technology.
6 Discussion
History often seems to proceed in cycles. Within the current context, if we go back to the advent of the PC, it presented its information on a screen. This was not accessible to blind people. However, it was realized that most of the information presented was textual – and text could be transformed into non-visual forms, particularly synthetic speech. So there was a period when developers started building specialized talking programs: talking word processors, databases, spreadsheets and so on. This was a boon to many blind people, but it was less than ideal that they should have to use different applications from their sighted colleagues. So arose the concept of the screen reader – assistive technology which could make a whole range of software accessible. The access was perhaps not as simple as with the dedicated talking versions, but the level of equality was great. Then came the graphical user interface (GUI). Suddenly not all the information presented was simple text and the screen reader no longer worked. This was a time when some blind people feared they had completely lost the level of equality that they had achieved and that they would be excluded from future access to computers. That turned out not to be the case, though, partly because certain technologists and researchers were not going to give up that easily, while at the same time increasing awareness of the civil rights of people with disabilities meant that simply giving up was not acceptable. So new technologies were developed; new types of screen reader began to work. Over time they improved and the platforms on which they ran were adapted to make them easier to interface to. In parallel with the rise of the GUI, the Web was invented. It assumed the GUI paradigm and it too began to operate in a very much visually dominated fashion. So again there were the gloom-mongers who thought that here was an insurmountable technological barrier – but again they were wrong. At first there were ad hoc technologies – screen readers working with standard and text-only browsers. Then there was the development of specialized non-visual, speech and/or braille browsers. Then the assistive technologies advanced, as did the accompanying Web technologies, such that a high level of access has now been achieved. The level of access is still not good enough – there is still scope for improvement.
7 Future Directions
There is no one answer to the problem of Web accessibility. Most of the components are summarized in Fig. 1. Content is central. Content can be designed in such a way as to be more – or less – accessible. It is evident that the average Web page author will not create accessible pages unless this can be achieved with minimal additional effort. There is thus a need for authoring tools which will achieve this. For instance, Bigham, Kaminsky et al. (2006) have
investigated the possibility of automatically generating alt texts, using information available on the Web. The second component on the left arm of Fig. 1 is evaluation tools. Again there is scope for further development. Current tools are only semi-automatic; they cannot make judgements, for instance regarding the use of colours on a page. Furthermore, it is apparent that current guidelines only embody a superficial level of accessibility. With a better understanding of accessibility, more intelligent tools may be developed that will further automate the evaluation process. Of course, one stage further is to integrate authoring and evaluation tools. An obvious example would be for a tool to detect the omission of an alt text and for another component to generate and insert one. This kind of integration is also going to happen on the right-hand arm of the figure. Browsers and assistive technologies are already much more closely integrated than they were in the early days of the Web. As a result many of the ‘classic’ accessibility problems are largely solved. Now that these easier problems have been solved, we can tackle the more challenging ones, and approaches such as that adopted in Dante rely on close integration of browser and assistive technologies – as well as a better understanding of what users need. Web accessibility is a moving target, though. As yesterday’s accessibility problems are solved, today’s websites are using new technologies. There will always be the possibility of improving access for blind and partially sighted Web surfers. As each new technology has been developed it has been released to the mainstream market and then some time – weeks, months or even years – later, the corresponding access technology has caught up. It is severely disappointing to observe the current situation regarding Internet Explorer. Even in these times of apparently high awareness of accessibility needs within Microsoft, they have still seen fit to release their newest version of IE (Version 7) despite the fact that it is not compatible with current access technology. This should not be the case in the future (it should not be the case now!), as accessibility should be built in as standard. This depends on an awareness of the need for access and of the techniques by which it can be achieved. ‘Multimedia’ has often been a frightening word in terms of accessibility, but in practice it should be the opposite. How do people with sensory impairments cope in the real world? Mainly they can use the redundancy in real-world objects, which can be seen, heard, felt, smelled and so on. It is only a matter of time before such rich, redundant representations can be synthesized – to the benefit of all users, including those with disabilities.
8 Technology Is Not the Whole Answer
The subject of this chapter is Assistive Technologies, so it is natural that there has been a technological bias to it. It is worth looking at the relationship between technology and the wider world.
It should be apparent that advances in accessibility, and the technology of accessibility, have only been achieved within a social context. This is often the case, and it has to be acknowledged that technological change is often easier to achieve than social change. Over the lifetime of the Web, there has been an increase in the awareness of the need for accessibility, and that has been demonstrated by the enactment of legislation. However, it cannot be said that the awareness has been matched by true improvements in accessibility. A most telling test is the report The Web: Access and Inclusion for Disabled People, from the Disability Rights Commission (DRC, 2004). In a survey carried out for that report, 68% of website owners from large organizations claimed to take accessibility into account. Yet the vast majority of websites surveyed, 81%, ‘failed to satisfy even the most basic Web Accessibility Initiative category’ (p. 37). In other words, awareness and good intentions are not being translated into effective actions. All the technology in the world will not make websites accessible unless people use it. As suggested above, though, technology is not the whole answer. The website owners and maintainers who are unaware of or uncaring about the requirements of accessibility are only likely to change their attitude by becoming more aware – and that is only likely to happen through legislation and its enforcement. Accessibility laws already exist, but it is evident that website owners are unaware of them. There have been some important test cases, notably that brought against the Sydney Organising Committee for the Olympic Games (SOCOG).25 The implication of the ruling is that at least Level A WCAG compliance is the minimum to be expected on a public website; the SOCOG had to pay $20,000 (Australian) in damages. If more penalties like that were exacted, and more publicly, perhaps website owners would pay more attention.
9 Conclusions
Most mainstream technologies are designed to be accessible and usable by the majority of the population. It is a common experience that the need for access for minorities – including those with disabilities – comes as an afterthought. New technologies then have to be developed to adapt the existing technology to the particular needs of different users. This has been the experience with the Web. An important factor is that the Web has become so important (and continues to rise in importance). This chapter has been largely about the technologies which have been developed to make the Web more accessible. The encouraging trend is the merging of the technologies. Authoring tools are becoming more integrated with accessibility checkers so that creating accessible pages should become a matter of course. Similarly, specialist browser technologies have given way to effectively integrated browsers and screen
25 http://www.tomw.net.au/2001/bat2001f.html
readers, and there are moves towards complete integration: mainstream browsers which happen to be accessible. The technology trends are thus encouraging, but it must be remembered that technology alone is not sufficient; it will only be with a broader awareness of the need for access that people will make full use of the technology.
References
Asakawa, C. and Takagi, H. (2000). Annotation-based transcoding for nonvisual web access. Proceedings of the fourth international ACM conference on Assistive technologies, pp. 172-179, Arlington, Virginia.
Bates, C. (2002). Web Programming: Building Internet Applications. Hoboken, New Jersey, John Wiley.
Bigham, J. P., Kaminsky, R. S., Ladner, R. E., Danielsson, O. M. and Hempton, G. L. (2006). WebInSight: making web images accessible. Proceedings of the 8th international ACM SIGACCESS conference on Computers and accessibility, pp. 181-188, Portland, Oregon, USA, ACM (http://doi.acm.org/10.1145/1168987.1169018).
Chisholm, W. A. and Henry, S. L. (2005). Interdependent components of web accessibility. Proceedings of the 2005 International Cross-Disciplinary Workshop on Web Accessibility (W4A), pp. 31-37, Chiba, Japan, ACM Press (http://doi.acm.org/10.1145/1061811.1061818).
Clark, J. (2002). Building Accessible Websites. Indianapolis, New Riders.
Clarke, A. (2007). Transcending CSS: The Fine Art of Web Design. Berkeley, California, New Riders.
DRC (2004). The Web: Access and Inclusion for Disabled People. Disability Rights Commission.
Edwards, A. D. N. (1995). The rise of the graphical user interface. Information Technology and Disabilities 2(4) (http://www.rit.edu/easi/itd/itdv02n4/article3.htm).
Goble, C., Harper, S. and Stevens, R. (2000). The travails of visually impaired web travellers. Proceedings of the eleventh ACM conference on Hypertext and hypermedia, pp. 1-10, San Antonio, Texas, United States (http://doi.acm.org/10.1145/336296.336304).
Morkes, J. and Nielsen, J. (1997). Concise, SCANNABLE, and Objective: How to Write for the Web. Retrieved 20 April, 2007, from http://www.useit.com/papers/webwriting/writing.html
Musciano, C. and Kennedy, B. (2000). HTML and XHTML: The Definitive Guide. Beijing, O'Reilly.
Pitt, I. and Edwards, A. (2002). Design of Speech-based Devices: A Practical Guide. London, Springer.
Ramakrishnan, I. V., Stent, A. and Yang, G. (2004). Hearsay: enabling audio browsing on hypertext content. Proceedings of the 13th international conference on World Wide Web, pp. 80-89, New York, NY, USA (http://doi.acm.org/10.1145/988672.988684).
Tanenbaum, A. (2003). Computer Networks. Upper Saddle River, New Jersey, Prentice Hall.
Thiessen, P. and Chen, C. (2007). Ajax live regions: ReefChat using the Fire Vox screen reader as a case example. Proceedings of the 2007 international cross-disciplinary conference on Web accessibility (W4A), pp. 136-137, Banff, Canada (http://doi.acm.org/10.1145/1243441.1243448).
Yesilada, Y., Harper, S., Goble, C. and Stevens, R. (2004). Screen readers cannot see (ontology-based semantic annotation for visually impaired web travellers). Web Engineering: 4th International Conference, ICWE 2004, Proceedings (LNCS 3140), pp. 445-458 (http://www.people.man.ac.uk/zzalszsh/research/papers/shicwe04.pdf).
Zajicek, M., Powell, C. and Reeves, C. (1999). Web search and orientation with BrookesTalk. Proceedings of CSUN '99, Technology and Persons with Disabilities, Los Angeles.
Desktop Browsers
Jon Gunderson
Abstract
Web browsers play an important role in the accessibility of the Web by people with disabilities. The features that Web browsers provide to control the rendering of content, navigate and orient to document structure, and control automatic behaviors will determine the types of accessibility techniques Web authors can use to make their resources more accessible and the level of usability that will be available to people with disabilities in accessing Web resources. Web browsers also play a critical role in the accessibility of Web 2.0 widgets created out of HTML, CSS, and JavaScript, by supporting new W3C technologies that make Web applications more accessible.
1 Introduction
Browsers are the window to the world of Web resources, and the accessibility features of browsers have a big impact on how people with disabilities can view and access Web content, and on how Web developers design Web resources to be accessible. For example, consider the U.S. Federal Government Section 508 requirement 1194.22 (o): "A method shall be provided that permits users to skip repetitive navigation links." The purpose of this requirement is to allow screen reader users to easily move the reading cursor to the main content of a Web resource without having to listen to the navigation links found at the beginning of many Web resources. Most navigation bars are at the beginning of the document, forcing speech users and other keyboard-only users to tab through navigation, advertising, and secondary sidebar links as they search for the main content of a Web page. Section 508 1194.22 (o) requires authors to provide a means for users to skip directly to the main content, although there is no specific technique required. Most Web
J. Gunderson, University of Illinois at Urbana/Champaign, Disability Resources and Educational Resources, Champaign, IL, USA, e-mail: [email protected]
developers implement this feature by using an internal "skip navigation" link at the beginning of the page, since popular Web browsers like Internet Explorer and Firefox do not provide the one simple feature that would make this approach unnecessary, would increase the support for Web standards and would lead to even higher levels of accessibility for people with disabilities. If Web browsers implemented a keyboard shortcut to navigate headers (h1–h6), a much better Section 508 requirement could have been "use headers to indicate document structure and the location of navigation bars." Everybody, including developers and people with disabilities, would benefit from this requirement. Users with disabilities benefit since they can easily skip over navigation bars to get to the main topics, and they can also easily find all the main topics, subtopics, and navigation bars on the page. Web developers benefit since it encourages them to use a Web standards approach to Web design, which reduces the resources needed to create and maintain Web resources. All users benefit as Web developers use Web standards, since this naturally leads to Web resources with consistent graphical renderings, making it easier for users to find and locate information as they browse the Website. The features of Web browsers that support Web standards, like structured navigation and user style sheets, play an important role in determining how people with disabilities can access the Web and the techniques available to Web developers to implement accessibility standards.
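As a concrete illustration, the common workaround and the more structural alternative might look like the following sketch (the link text, file names, id and heading wording are invented for illustration):

<!-- Workaround widely used today: an internal link that jumps past the navigation bar -->
<a href="#content">Skip to main content</a>
<ul>
<li><a href="home.html">Home</a></li>
<li><a href="products.html">Products</a></li>
</ul>
<h1 id="content">Main content of the page</h1>

<!-- Standards-based alternative: if browsers let users move between headings,
     marking up the navigation bar and the main content with headings is enough -->
<h2>Site navigation</h2>
<ul>
<li><a href="home.html">Home</a></li>
<li><a href="products.html">Products</a></li>
</ul>
<h1>Main content of the page</h1>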
1.1 Testing for Functional Web Accessibility
Authors have a responsibility to create accessible content, but the role browsers play in providing Web developers with the options to make content more accessible is not very well understood. The W3C User Agent Accessibility Guidelines (Jacobs, Gunderson and Hansen 2002) provide a detailed list of requirements, including header navigation, for browsers to support accessibility by people with disabilities. The U.S. Federal Section 508 Software Accessibility requirements apply to Web browsers, but they do not have the specific requirements of the W3C User Agent Accessibility Guidelines for software designed to render resources from the Web. The introductory example in this chapter illustrates the impact browsers have on how people with disabilities can access the Web and how they support Web developers in creating accessible content through the use of Web standards. The following sections of this chapter will explore the ways browsers support accessibility and the implications of these features for people with disabilities, people without disabilities, and Web developers. One of the major roles browsers can play in the development of accessible Web resources is to provide a way for Web developers to functionally test Web resources for accessibility. The following test procedures are based on the iCITA Web Accessibility Best Practices (Gunderson 2007) to implement
Desktop Browsers
165
Section 508 and the W3C Web Content Accessibility Guidelines (Vanderheiden, Jacobs and Chisholm 1999). Numerous authors have published books on accessible Web design, based on the limitations that current browser technology places on Web accessibility techniques; these books include Web Accessibility for People with Disabilities (Pacillo 2000), Accessibility for Everybody (Mueller 2003), and Web Accessibility: Web Standards and Regulatory Compliance (Thatcher et al. 2006). Ideally, if browsers had more built-in support for Web accessibility, there would really be no difference between Web standards-based design and accessible design, which would make Web accessibility principles much more understandable and reasonable to most Web developers. But the lack of built-in accessibility features in popular browsers forces Web developers to use special markup techniques that are unique to disability access and are usually unfamiliar to most Web developers, which leads to accessibility features that are incomplete and ineffective. Special accessibility techniques sustain the myth that accessibility features only benefit people with disabilities and the idea that "text-only" Web pages are the preferred means for people with disabilities to access Web resources. I hope that this chapter will help dispel some of these myths and show how browser extensions and assistive technologies are helping to move accessibility techniques towards the ideal of the use of Web standards.
1.2 Keyboard Testing
Many people with disabilities can only use the keyboard to navigate links, form controls and other elements that respond to user interaction on a Web resource. The ability to use the interface with only the keyboard is becoming even more important as dynamic user interface controls are introduced in Web 2.0 Web applications (van der Vlist et al. 2006).
Links: Check to make sure users can navigate to all links using only keyboard commands. Check to make sure link text is unique and that the link text indicates the target of the link.
Headings: Check to make sure that headers are used to indicate all major and minor topics using only keyboard commands.
Event handlers: Make sure all interactive features implemented through scripting can be achieved using the keyboard alone (a short sketch of the kind of markup to check follows this list).
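For the event-handler check, the underlying question is whether scripted controls can receive keyboard focus at all. A minimal sketch of the kind of markup to look for (the function name is invented) is:

<!-- Reachable with the keyboard: links and form controls are in the tab order -->
<a href="help.html" onclick="return openHelp();">Help</a>
<input type="button" value="Help" onclick="openHelp();">

<!-- Not reachable with the keyboard alone: a plain element with only a mouse handler -->
<span onclick="openHelp();">Help</span>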
1.3 Styling
Interoperability is at the heart of Web standards (Berners-Lee et al. 1994) and is a major reason why Web standards-based designs are more accessible to people with disabilities. Web standards-based designs (Zeldman 2003) provide all users with the capability to restyle Web content to meet their own perceptual needs
or support the rendering capabilities of a wide range of Web browsing technology, including desktop graphical browsers, speech browsers, refreshable Braille renderings, personal digital assistants (PDAs), and cell phone technologies. People with disabilities use a wider range of Web browsing technologies than the able-bodied population, and they have much more specific rendering requirements. For example, a person with a visual impairment may want a high contrast rendering of white characters on a black background. This can be accomplished with assistive technologies such as the screen magnifiers ZoomText from Ai Squared or MAGic from Freedom Scientific, but it can also be done with Web standards when authors use CSS to style their Web resources and browsers support overriding author styles, allowing users to supply their own style sheet. The original premise of the Web was interoperability, providing a means for authors to write a Web resource once and have a wide variety of people and technologies be able to access and use the resource. Using Cascading Style Sheet (CSS) technology (Lie and Bos 2005) to style content is an important part of supporting interoperability. People with disabilities often need to change the font sizes, font families, and/or the foreground and background colors used in rendering text to make information more perceivable to them. In the same way, the resolutions of graphic displays are becoming more diverse and less predictable to Web developers. Web developers commonly design Web pages to particular resolutions (i.e., 800×600, 1024×800, 1280×1024, etc.) so resources will "look the same" to all users. But as the resolutions of monitors change, this design technique results in renderings that are too small on some monitors and too large on others. Using CSS relative size values allows Web resources to automatically adjust to the resolution of a monitor, maximizing the use of the available graphical real estate from 180px-wide PDAs to 4000px high-definition monitors. The techniques that allow Web resources to automatically adapt to varying resolutions of graphical monitors are the same techniques that make it easier for people with disabilities to restyle content to meet their own perceptual needs. People with visual acuity problems like farsightedness may need to increase the font size of a rendered resource to be able to read the text on the screen. Using relative CSS units allows Web pages to reflow when people use browser text zoom features to increase the default font size. Browser features play an important role in allowing developers to test that their Web resources adapt to the styling needs of users and support monitors of different resolutions. If Web resources support interoperability, they should reflow to fill the graphical window and not require the user to use the horizontal scroll bar to view content or leave wide areas of empty white space. Many browsers do not make it easy for authors to make these adjustments or do not support the adjustments at all. When popular browsers do not make it easy for users and developers to adjust the rendering of Web content, it reinforces developers' design model of using fixed-width, pixel-based designs
which are less usable to people with disabilities. The following are browser features which support testing for functional accessibility:

Font size: Increasing and decreasing the font size should result in content reflowing to fit the current window width, without the user having to use horizontal scrolling to view content.

Window width: As the width of the graphical window is increased and decreased, content should reflow to fit the current window width, without the user having to use horizontal scrolling to view content.

Author styling: When author-defined style sheets and inline styling are disabled, content should still be usable. This is the ultimate test for interoperability with speech and Braille technologies, since all author-supplied graphical formatting is removed. If the author was using graphical formatting or spatial layout to indicate relationships, this becomes evident when author style information is removed.

User styling: A little-known part of the W3C CSS specifications (Bos et al. 2007) is the concept of user style sheets, which provide a means for users to apply their own styling preferences to Web resources. Even when browsers support user style sheets, they usually hide this feature by making it difficult for users to define a user style sheet and associate it with the rendering of content. Browsers like Opera have a number of built-in user style sheets, for high-contrast styling and for renderings that emulate text browsers. The built-in user style sheets in Opera make it easy for developers to test the interoperability of their designs with a wide variety of device rendering capabilities and user preferences, including the preferences of people with disabilities. An illustrative user style sheet is sketched after this list.
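As a rough illustration (the file name and specific values are assumptions, not taken from this chapter), a user style sheet that forces large, high-contrast text could look like this:

/* user.css: an illustrative user style sheet */
body, body * {
  color: white !important;             /* high-contrast foreground */
  background-color: black !important;  /* high-contrast background */
}
body {
  font-size: 150% !important;          /* relative size, so text reflows when zoomed */
  font-family: Verdana, Arial, sans-serif !important;
  line-height: 1.5 !important;
}

Because CSS 2.1 gives user declarations marked !important precedence over author declarations, a browser that honors user style sheets will apply these rules even when the author specifies conflicting styles.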
1.4 Text Descriptions and Conditional Content
Images and other embedded objects need text equivalents to allow people with disabilities to access the content of the image or at least to understand the purpose of the image in the Web resource. Text equivalents are just one type of content that is referred to in the W3C User Agent Accessibility Guidelines (Jacobs et al. 2002) as conditional content, since browsers may render this content given a certain set of conditions, including user action and settings (i.e., hover the mouse over an element with a TITLE attribute to create a tooltip) or the configuration of the browser (i.e., substitute ALT text rendering instead of image rendering). In the W3C HTML 4.01 specification, conditional content includes the TITLE attribute, which can be used to provide information on the purpose of frames or additional information about a link or form control. The following are examples of conditional content in HTML 4.01; a short markup fragment illustrating them follows the list.

ALT attribute: Configure the browser to turn off images and enable the rendering of ALT attribute content in place of the image, and check to make sure the ALT content represents the content of the image.

OBJECT element: Configure the browser to turn off embedded objects and enable the rendering of the text content of the OBJECT, and check to make sure the equivalent accurately indicates the purpose of the object.

TITLE attribute: Check the content of the TITLE attributes to see if they provide useful supplemental information about the link, form control, or frame element.

LABEL element: Labels for form controls are important for speech renderings to be able to prompt users when a form control gets keyboard focus. Proper labeling of form controls is difficult for developers to verify, since the use of labels in graphical browsers results in no change in rendering and only subtle changes in behavior for checkbox and radio button form controls.

LANG attribute: HTML 4.01 allows authors to define the default language and changes in language of a Web resource. Using this markup in graphical browsers does little to change the rendering of content, since the language is represented by the character and character set information. But like labels, this information is critical for speech renderings, since speech synthesizers need it to change the spoken language.
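A short HTML 4.01 fragment illustrating several of these kinds of conditional content might look like the following (the file names, labels, and text are illustrative only):

<p lang="en">
  <img src="sales.gif" alt="Chart: sales rose 20% from 2006 to 2007">
  <a href="report.html" title="Full 2007 annual report (PDF, 2 MB)">Annual report</a>
</p>
<form action="/search" method="get">
  <label for="terms">Search terms</label>
  <input type="text" id="terms" name="terms">
</form>
<p lang="fr">Bienvenue sur notre site.</p>

A graphical browser renders little or none of this extra information by default, which is why the configuration and checking steps listed above matter for functional testing.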
1.5 Browser Impact on Accessibility
Browsers have a tremendous impact on the techniques used by Web developers to create Web resources and on the expectations of users about how content will be provided to them. Developers will not use features if browsers do not support the markup by providing access to additional content, styling, or navigation features. A classic example is the LONGDESC attribute in HTML 4.01. Access to the LONGDESC content was not implemented for many years by many browsers and is still not available through the graphical user interface in major browsers like Microsoft Internet Explorer. The lack of support reduced author use, resulting in both a lower awareness of the potential capabilities of the LONGDESC attribute and a lack of interest in including content that cannot be accessed by popular browsers. Another example is font scaling, or zooming: developers will not support, and users will not explore, the possibilities of liquid Web design if browsers do not support scaling of text content.

The view of most Web developers is that current popular accessibility techniques (i.e., skip navigation links, alt text for images, and labels for form controls) only benefit users with disabilities. There are two primary reasons for this. The first is the limitation in browser features to support Web standards, forcing many Web developers to work around and away from a Web standards based approach to Web design. If accessibility techniques seem to only benefit users with disabilities, this reinforces the negative stereotypes of accessibility techniques as obscure, a burden to implement, and not a benefit to all users. The second is that developers often cannot test an accessibility technique, like using headers (h1-h6) to indicate major and minor sections within a Web resource, with the built-in features of browsers, so they often do not use them or apply them incorrectly. Many developers use headings for styling purposes (big or small font) rather than
for indicating the structure of a Web resource. The next section of this chapter will look at the implementation of browser features that support accessibility and how these features can enhance the browsing experience of all users.
2 Built-In Features for Accessibility
The built-in features for accessibility are important to promote accessible design techniques based on Web standards. Web developers, able-bodied users, and users with disabilities need to benefit from using Web standards based approaches to accessible Web design, and in large part, this will only be possible through the functionality provided by the Web browser.
2.1 Keyboard Overview
Keyboard support is the most basic accessibility feature of any software application and is the first checkpoint in the W3C User Agent Accessibility Guidelines 1.0 and part of the Section 508 software requirements. Keyboard support is needed by people with disabilities who cannot use pointing devices like the mouse, and people with disabilities who use alternative input devices (Anson 1997) need keyboard support to map their alternative input devices to the functions of an application. For example, someone using voice recognition can say "Next Link" and the voice recognition program can emulate the TAB key press to move to the next link in a Web resource.
2.2 Keyboard Shortcuts
Table 1 shows keyboard support for selected browser functions that allow users to move between Web pages and support common inter-page navigation in graphical desktop browsers. One of the main features is support for the built-in keyboard accessibility features of the operating system; all the major operating systems have features to support keyboard access. Table 2 shows keyboard support for navigation of content within a Web page, and there are significant differences between desktop browsers. The Opera browser clearly has many more built-in keyboard functions to navigate page content, including header navigation and directional link navigation. Header navigation provides an important feature to navigate the structural content for keyboard-only users and makes it easier for people with visual impairments to find the main content of a Web resource. Header navigation can significantly reduce the number of keystrokes needed for a user to select a link or form control. Consider Web resources with a large number of links: the basic sequential link navigation most browsers support using the TAB key is a tedious and time-consuming process.
Table 1 Selected built-in keyboard shortcuts for browser page navigation. Functions covered: Open location, History back, History forward, History list, Bookmarks/favorites, and Add bookmark/favorite, for Windows (IE 7.0, Firefox 2.0, Opera 9.1), Macintosh (Safari 2.0, Firefox 2.0, Opera 9.1), and Unix (Firefox 2.0, Opera 9.0).

Table 2 Keyboard shortcuts for Web content navigation and styling (recovered values, by function):
Next link: Tab (IE, Firefox, Safari); A (Opera)
Previous link: Shift+Tab (IE, Firefox, Safari); Q (Opera)
Link type ahead: yes (Firefox)
List of links: Ctrl+J (Opera)
Link up / down / left / right: Shift+arrow keys (Opera)
Next form / Previous form: Tab / Shift+Tab (Opera)
Next heading / Previous heading: W / S (Opera)
Find / Find next: Ctrl+F / Ctrl+G; Cmd+F / Cmd+G (Macintosh Safari)
Toggle images: Shift+I (Opera)
User style sheets: Shift+G (Opera)
Zoom in / Zoom out: Ctrl+Plus / Ctrl+Minus; Cmd+Plus / Cmd+Minus (Macintosh Safari); Plus / Minus (Opera)
But if headers are used and the browser supports header navigation, the user can turn a 40-50 key sequence into 4-5 keys. Directional navigation supported by Opera (Shift+arrow keys) provides another means to use the keyboard to move more directly to a link of interest if you can see the spatial relationships between links. Firefox has a type-ahead feature that makes it easier for keyboard users to navigate to links by moving keyboard focus to the links that start with the letters the user is typing. This is very useful for a page the user is familiar with, since the user may know the text of the link they want to select.

The discussion of browser keyboard accessibility would not be complete without a discussion of accesskeys. The HTML 4.01 specification defines an ACCESSKEY attribute for links and form controls, using a character to identify the shortcut key. The purpose is to allow authors to create their own keyboard shortcuts for users to activate links or move focus to a form control. One of the goals of accesskeys is to help people with disabilities by providing them with additional keyboard shortcuts to frequently used links and form controls in a Web resource. For example, ACCESSKEY="S" could be used to move keyboard focus to the text search box within a Web site; ACCESSKEY="M" could move focus to the main content.

There are a number of issues that have impeded the widespread use of accesskeys. Internet Explorer 4.0 was the first browser to implement the accesskey feature; it implemented moving focus to links and form controls using the ALT modifier key + character. Unfortunately, the ALT modifier is also used for shortcuts within the Windows operating system, so this led to conflicts between Web page accesskeys and operating system shortcut keys. For example, ALT+F opens the File menu in most Windows applications, so if an author used ACCESSKEY="F" in a Web resource, the accesskey would either be ignored or override the operating system shortcut. The ALT key is also used by many assistive technologies for keyboard shortcuts, including screen readers, compounding the confusion even more.

Other issues include notifying the user which accesskeys are defined in a Web resource. There is little guidance in the HTML 4.01 specification about how users should learn about the availability of accesskeys or the keyboard combinations that activate them. Therefore, it is up to the author to provide the user with information about the available accesskeys. But even providing this information is problematic, since accesskeys are not implemented the same way on all major browsers (see Table 3). Firefox implements accesskeys but will follow a link rather than just moving focus to the link as Internet Explorer 4.0+ does. Opera requires a two-key sequence to activate accesskeys: first press the key combination Shift+Escape and then the accesskey character. The Opera technique eliminates the keyboard conflicts of using the ALT key, but again it is different from Internet Explorer and Firefox, confusing both developers and users. Opera was late to implement accesskeys on the grounds that the feature did not internationalize very well, since the author could define a character that is not on a user's keyboard. So there are many issues with including accesskeys in Web resources to improve accessibility, and with the current eclectic implementation most Web developers avoid their use.
Table 3 Accesskey implementation on popular browsers
Internet Explorer 4.0+. Activation key: ALT+Character. Link behavior: move keyboard focus to the link, no activation of the link. Form control behavior: move keyboard focus to the form control.
Mozilla/Firefox 1.0+. Activation key: ALT+Character. Link behavior: activate the link associated with the accesskey. Form control behavior: move keyboard focus to the form control.
Opera 7.0+. Activation key: Shift+ESCape, Character. Link behavior: activate the link associated with the accesskey. Form control behavior: move keyboard focus to the form control.
Safari 1.0.
There are some situations where accesskeys can be useful. For example, if someone needs to fill out or modify a Web form on a regular basis, accesskeys on frequently used or strategically placed form fields can be very useful to all users, including people with disabilities.
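For illustration, accesskey markup of the kind described above might look like this (the key choices, IDs, and link target are examples only):

<form action="/search" method="get">
  <label for="q">Search</label>
  <!-- accesskey="S": in Internet Explorer, ALT+S moves focus to this field -->
  <input type="text" id="q" name="q" accesskey="S">
</form>
<!-- accesskey="M": intended to jump to the main content anchor -->
<a href="#main" accesskey="M">Skip to main content</a>

As Table 3 shows, the same markup behaves differently across browsers: Firefox would follow the link immediately, while Opera would require Shift+Escape followed by the character.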
2.3 Styling Content
People with visual impairments and some types of learning disabilities often need to restyle content to meet their visual perception and processing needs. This includes changing the font size, font family, and colors used to render text, and the ability to linearize content (remove table markup) to make it easier for people with visual impairments and visual processing learning disabilities to read the content of Web resources. For users to easily restyle content, authors must use Web standards based relative CSS values for styling content, and browsers must support user style sheets so that users can modify the rendering to meet their own perceptual needs. Users need to be able to configure browsers to ignore author-supplied styling information and apply their own style sheet, or to direct the browser to use operating system settings for text color and font characteristics.

Figure 1 shows Internet Explorer being configured to ignore some author-supplied styling. Internet Explorer requires the user to work through several dialog boxes to set their own style preferences, and then to go through the same set of dialog boxes to restore author styling. Opera, on the other hand, makes it very simple for users to switch between author styling and user styling preferences. Menu options and keyboard shortcuts (Fig. 2) make it easy for users to switch between author and user styling modes. The Opera approach is very useful to Web developers, since they can easily check how their Web resources will render when the author's styling is removed and the styling of the user or device is applied (i.e., PDA, cell phone, and text browser). The techniques that make content adaptable to the needs of people with disabilities are the same techniques needed to make content adaptable to a wide variety of portable and desktop technologies.
Fig. 1 Internet Explorer configured for high contrast through accessibility dialog box settings
Fig. 2 Opera browser in high contrast mode through menu options
Fig. 3 Internet Explorer 7.0 turning off display of images requires going through a series of dialog boxes
With the release of Internet Explorer 7.0, all of the popular desktop graphical browsers support scaling of text (see Table 2; Figs. 3 and 4). Yet even here there are differences. The Opera and Firefox zoom features support content reflow to fit the window width, while the Internet Explorer zoom feature maintains the layout, forcing horizontal scrolling when the layout exceeds the width of the window. Horizontal scrolling to read content is an annoyance to most users and can make reading Web resources a very difficult process for people with disabilities.
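A compact way to see the difference described above is to compare a fixed pixel layout with a liquid one (the selector name and values here are illustrative, not from the chapter):

/* Fixed design: assumes a particular resolution and forces horizontal
   scrolling when the window width or zoom level does not match it */
#content { width: 760px; font-size: 11px; }

/* Liquid design: relative units let content reflow with the window
   width and with browser text zoom */
#content { max-width: 60em; width: 90%; font-size: 100%; }

With the liquid rules, zooming text in Opera or Firefox simply reflows the column, whereas a fixed pixel width reproduces the horizontal scrolling problem described above.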
2.4 Images
The IMG and AREA elements are the primary techniques to include images in Web resources. Both elements include the ALT attribute, and the IMG element includes an additional LONGDESC attribute. There are many issues with browsers providing access to the text equivalents for images, and Table 4 provides a summary of the text description rendering capabilities of major browsers. The typical way browsers provide access to text equivalents for IMG elements is to replace the rendered image with the ALT text content. One of the issues with the rendering of ALT text is the ability of the user to access all of the ALT attribute content and the ability to style the text rendering
Fig. 4 Opera Browser toggling between images on and off through a menu option
of the ALT content. Many browsers clip the text to the size of the original image, which results in most of the ALT content being cut off from rendering. The other issue is the styling of the ALT attribute content when it is rendered. Most browsers offer only limited styling capability, so even if the user is able to view all of the ALT attribute content, they may not be able to make the text large enough to read. The only major browser to support access to the LONGDESC attribute URL for the IMG element is the Mozilla/Firefox browser. Users must move the
Table 4 Summary of text description rendering for major browsers
Internet Explorer 7.0. ALT attribute for IMG element: rendered in place of the image. Styling of ALT text rendering: limited ability to scale size.
Firefox 2.0. ALT attribute for IMG element: rendered in place of the image. LONGDESC attribute: available through the context menu. Styling of ALT text rendering: full.
Opera 9.1. ALT attribute for IMG element: rendered in place of the image. Styling of ALT text rendering: full.
Safari. ALT attribute for IMG element: rendered in place of the image.
None of these browsers renders the ALT attribute for the AREA element.
pointing device to the image and open the context-sensitive menu using the right-click feature of the mouse in Microsoft Windows. None of the major browsers provides access to the ALT attribute content for the AREA element, essentially requiring authors to provide redundant text links for any links associated with server- or client-side image maps.
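The markup under discussion, with illustrative file names and descriptions, might look like this:

<img src="sales-chart.png"
     alt="Bar chart of quarterly sales"
     longdesc="sales-chart-description.html">

<!-- Because browsers do not expose AREA alt text, redundant text links
     are provided after the image map -->
<img src="nav-map.png" alt="Site navigation map" usemap="#nav">
<map name="nav">
  <area shape="rect" coords="0,0,80,30" href="news.html" alt="News">
  <area shape="rect" coords="80,0,160,30" href="contact.html" alt="Contact">
</map>
<p><a href="news.html">News</a> | <a href="contact.html">Contact</a></p>

Only Mozilla/Firefox would expose the longdesc URL here, through the image's context menu.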
2.5 Conditional Content
There are a number of other types of conditional content that are useful for accessibility, like the TITLE attribute, which can be used on almost all HTML elements. The TITLE attribute is used as a "tooltip" in Internet Explorer and Firefox. A tooltip displays the TITLE attribute content in a little window near the mouse pointer when the user hovers the mouse pointer over an element with a TITLE attribute defined. Other types of conditional content are not rendered by graphical browsers at all. For example, the TITLE attribute can be used to indicate the purpose of a FRAME within a FRAMESET, but graphical browsers currently do not provide a mechanism to render the TITLE attribute content of the FRAME. Without this feature, Web authors cannot easily check to see if their FRAME titles are meaningful.
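A frameset using TITLE attributes to label its frames, as described above, could be written as follows (the file and frame names are placeholders):

<frameset cols="20%,80%">
  <!-- The TITLE content is conditional content: current graphical
       browsers provide no way to render it -->
  <frame src="nav.html" name="navigation" title="Site navigation">
  <frame src="content.html" name="main" title="Main content">
</frameset>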
2.6 Scripting
In the early years, Web scripting was difficult for Web developers to integrate into Web resources due to the inconsistencies and changes in the scripting languages and document object models used by Netscape Navigator and Internet Explorer. Scripting was typically used for decorative purposes, like image rollover effects for links that are styled using images. The W3C Web Content Accessibility Guidelines 1.0 reflected the relative unimportance of scripting at the time the recommendations were published in 1999 by requiring Web pages to be functional when scripting is disabled or the browser does not support scripting. The Web is much different today; the standardization of JavaScript, the widespread implementation of the W3C Document Object Model (Champion et al. 2000), and XMLHttpRequest have created an environment where developers are creating Web widgets with the same capabilities as those found in Graphical User Interfaces (GUIs) on desktop operating systems. Scripting accessibility is no longer an option but a necessity, and browsers have a significant role in making dynamic content accessible. The following sections deal with the simpler scripting accessibility issues found on static Web resources; the accessibility issues of Web applications will be discussed in a later section.
2.7 Device Independence and Event Handlers
One of the most important features of script-based interaction within Web pages is the ability to support the keyboard. HTML specifications have separate event handlers for mouse and keyboard events: event handlers triggered by mouse events include onClick, onMouseOver, and onMouseOut, while event handlers triggered by keyboard events include onKeyPress, onFocus, and onBlur.
For developers to support both the keyboard and the mouse, they must use multiple event handlers. Some event handlers, like the onClick event, already support device independence by responding to both mouse pointer events (left click) and keyboard events (an Enter key press when the element has focus). Other types of user event handling require pairing of mouse and keyboard event handlers; for example, onMouseOver and onMouseOut event handlers must also have corresponding onFocus and onBlur event handlers. Many developers do not understand the accessibility requirements of supporting the keyboard and therefore do not include the onFocus and onBlur event handlers. Other types of event handlers cause keyboard accessibility problems just by being used, notably the use of the onChange event handler with the SELECT form input element to move to a new Web page identified by the options in the select box. The onChange event is triggered when the user moves keyboard focus to a select element using the TAB key and then tries to use the Up and Down arrow keys to view the options in the select box. The use of the arrow keys triggers the onChange event handler, and the user is then moved to a new Web page. This abrupt change in Web page is very disorienting to visually impaired users, who thought they were just going to explore the options associated with the SELECT control, and to physically impaired users, who cannot actually get to the other options in the select control. Some browsers have fixed this problem by adding additional keyboard commands like Control+Up and Control+Down arrow to allow keyboard users to view the options without triggering the onChange event, but most users do not know this obscure command.
The more users need to know these arcane keyboard commands, the less usable the browser is for users with disabilities. A change is needed in the next generation of event handlers to be more device independent, both to improve accessibility for people with disabilities and to make it easier for developers to improve the interoperability of the Web resources they create. Generic event handlers would allow browsers to map keyboard, mouse, or other input devices to the generic event handlers, reducing the burden on Web developers to define event handler lists for a wide range of technologies and every conceivable device that might be used to interact with Web content. Consider an example of a Web author wanting to provide a context-sensitive help system for a set of form controls on a Web resource they are developing. The author wants to provide a system that allows users to get additional information on each form control through a "balloon" help system. The information in the balloon changes as the user either moves keyboard focus to each control or hovers the mouse over a control. This currently requires two separate sets of event handlers: onMouseOver/onMouseOut and onFocus/onBlur. If the onFocus and onBlur events were also triggered by mouse hover events, then the developer could just use the onFocus and onBlur events to implement the balloon help system.
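A minimal sketch of the event handler pairing described above, with hypothetical element IDs and helper functions, is shown below:

<script type="text/javascript">
// Show or hide a "balloon" help element associated with a form control.
function showHelp(id) {
  document.getElementById(id + '-help').style.display = 'block';
}
function hideHelp(id) {
  document.getElementById(id + '-help').style.display = 'none';
}
</script>

<label for="email">Email address</label>
<!-- The same balloon is triggered by mouse hover and by keyboard focus -->
<input type="text" id="email" name="email"
       onmouseover="showHelp('email')" onmouseout="hideHelp('email')"
       onfocus="showHelp('email')" onblur="hideHelp('email')">
<div id="email-help" style="display: none">Enter the address you used to register.</div>

If a single generic pair of events covered both devices, the two sets of attributes above would collapse to one.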
2.8 Pop-Up Windows
Pop-up windows have become an annoyance for all users, and recent browsers have added features to block new windows that are automatically opened through JavaScript. The accessibility issues of new windows, though, are more severe than simple annoyance for people with disabilities, especially for speech users. Before pop-up blocking was available, users would select a link and expect to go to the page described by the link, and end up with focus on an advertising pop-up window. This is very disorienting to the speech user, since the content is unexpected and the user needs to explore the page to determine that it is not the target link they were expecting; then they must navigate through the other open windows of the browser to find the window with the primary content, a time-consuming and tedious process. Another issue with pop-up windows is that authors usually remove the menus and tool bars from the window, and many people with disabilities rely on features in the menu and tool bars to help them navigate and access content. Users need the ability to override author preferences to exclude menu bars, just as with author styling information, and to have menu and tool bars available to them if they want them. Currently no browser supports this type of user configuration. Another aspect of pop-up windows is the loss of browsing history. When users end up in a pop-up window and become disoriented, a natural response is to use the back page feature to go to the previous page. But since the pop-up is a new window, it has no browsing history, and the user is stuck on the pop-up resource. The user
should have the ability to include browser history in any newly opened window, overriding author preferences, so that if they get disoriented they can at least go back to a page they are familiar with. Currently no browser provides this configuration option to users.
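Pop-up windows of the kind described above are typically opened with a JavaScript call whose feature string strips the browser chrome; a representative (illustrative) call looks like this:

<script type="text/javascript">
// Opens an advertising window with no menu bar, toolbar, or location bar,
// and with its own empty browsing history.
window.open('ad.html', 'adWindow',
            'width=400,height=300,menubar=no,toolbar=no,location=no,status=no');
</script>

Nothing in current browsers lets the user override the menubar=no and toolbar=no requests or carry their browsing history into the new window.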
2.9 Embedded Objects
The main problem with any embedded media player, Java applet, or other object with an HTML user interface is the support for keyboard focus. Figure 5 shows an image of an embedded media player; currently there is no way to move keyboard focus into and out of the embedded player using the browser keyboard commands. The lack of keyboard support for embedded objects breaks a fundamental accessibility requirement of supporting the keyboard for all operations. There are some techniques that allow keyboard focus to be moved from the HTML content to the embedded object, but once keyboard focus is in the embedded object there is no way for the object to give the focus back to the HTML content. This places the burden on the Web developer to find ways to work around this problem. For media players, one technique is to
Fig. 5 Embedded multimedia player in a popup Web page
place a link on the page to open the media file in an external media player, which gives the user full access to the media player's accessibility features and standard menu controls. The user can move between the media player and the browser using the operating system's keyboard commands for switching between applications; for example, ALT+TAB in Microsoft Windows moves users between running programs. The other aspect of embedded applications is their lack of accessibility features. Many do not support keyboard operation even if keyboard focus is given to them, and users cannot restyle content as they can with the HTML part of the page.
2.10 Configuration
The ability to configure browser rendering and automation options is important for people with disabilities, allowing them to set up the browser to meet their interaction preferences. Figure 6 shows the Presentation Mode options in the Opera browser, which allow the user to choose which styling will be used in author- and
Fig. 6 Opera user and author styling preferences dialog box
user-rendering modes. One of the biggest issues in user preferences is the portability of user browser configuration options, especially to public access computer systems in schools, libraries, and government offices. The W3C User Agent Accessibility Guidelines have a requirement for portable user profiles to make it easy for users to apply their preferences to browsers they have not used before. Currently no browser has fully implemented the profiles feature.
3 Extending Browser Functions
There are many features important for accessibility defined in the W3C User Agent Accessibility Guidelines that are not implemented in popular desktop browsers. Some desktop browsers can be extended to include additional accessibility features. These extensions include the Firefox Accessibility Extension from the University of Illinois (Fig. 7), the AIS Web Accessibility Toolbar for Internet Explorer from Vision Australia (Fig. 8), and Firevox, developed by Charles Chen at the University of Texas at Austin. These
Fig. 7 Firefox Accessibility Extension
Fig. 8 Web Accessibility Toolbar for Internet Explorer
toolbars provide browser enhancements for testing Web sites for accessibility and for providing additional accessibility features that help people with disabilities access Web content.
3.1 Making Conditional Content Visible
One of the most important features of extensions is to make conditional content and structural markup that is invisible in the default browser visible to users with disabilities and developers who want to check the functional accessibility features of their Web resources. Conditional content is information that describes relationships between elements or provides additional information about an element. A common example of conditional content is the TITLE attribute. Internet Explorer displays TITLE content as a "tooltip," but many other browsers simply do not provide access to the TITLE attribute content at all. There are many sources of conditional content that are important for accessibility, including navigational elements like headings (h1-h6), LABELs for form controls, text equivalents for images, embedded objects, frames, and scripting information. Extensions can help developers do functional testing for
accessibility much more efficiently, and users can access content that might otherwise be hidden from them, making it easier to view and navigate Web resources.
3.2 Enhancing Keyboard Support
Browser extensions can provide keyboard shortcut enhancements. For example, heading navigation (h1-h6) is implemented in the Opera browser as a built-in keyboard shortcut, but Internet Explorer, Firefox, and Safari do not implement a built-in keyboard shortcut for header navigation. The Firefox Accessibility Extension (Gunderson and Schwerdtfeger 2006) and Firevox provide keyboard enhancements and add a keyboard shortcut for header navigation to the Firefox browser. Other keyboard enhancements include shortcuts to toggle between user and author styling preferences or to allow users to access and navigate a list of links, similar to the keyboard features built into the Opera browser.
3.3 Styling
The Opera browser provides a number of built-in user style sheets and makes it very easy to switch between user and author styling preferences. This is very important for people with disabilities, who need to be able to use the color and font settings that are most useful to them, and for Web developers, who need to functionally test the ability of their Web resources to be usable without the author's style sheets enabled and with common device and user style sheets applied to the content. Desktop browsers like Internet Explorer, Firefox, and Safari have user restyling features, but they are incomplete and difficult for users and developers to find and use. Extensions can simplify this process by making user style sheets and the disabling of author styling information easier for users and developers to apply and restore.
3.4 Other Features
Browser extensions can provide many other features to help Web developers and people with disabilities to access Web content:
1. Visual impairment simulation
2. Show language markup
3. Show structural content
4. Color contrast tests
5. Speech renderings (Firevox)
4 Compatibility with Assistive Technologies
One of the most important features of desktop graphical browsers is their ability to communicate Web content information and user inputs to assistive technologies. There are many types of assistive technologies: screen enhancements (larger text and graphics and/or color transformations) for people with low vision or learning disabilities that affect visual processing, alternative inputs for people with physical impairments, supplemental speech for people with learning disabilities, and screen readers used by people who are blind. Screen readers provide auditory or refreshable Braille alternatives to the graphical rendering of desktop applications for people who are blind and cannot see the display. Web browsers are just one of the many applications used by screen reader users, although browser access is becoming more important as Web applications displace their desktop counterparts. Microsoft Windows is the dominant operating system used in western countries, and starting with the release of Windows 95, Windows has included a technology called Microsoft Active Accessibility (MSAA), which was designed to provide a means for desktop applications and the operating system to communicate with assistive technologies. Until the release of MSAA, developers had to reverse engineer graphical operating system data structures to figure out what was being drawn on the screen. While MSAA provides information about what is being drawn on the screen, it does not directly provide information about the HTML content that generated the graphical rendering. Information about the Web content is available from the Document Object Model (DOM) of the browser. The W3C User Agent Accessibility Guidelines require that browsers support accessibility APIs and access to the DOM to gather information about the rendered content for use by alternative speech and refreshable Braille renderings.
4.1 Document Object Model
The Document Object Model (DOM) is designed for scripting languages like JavaScript. JavaScript is used to manipulate the content of a Web resource based on user actions and other types of events. The DOM has important information that can be used by assistive technologies to improve navigation, orientation, and usability of the alternative user interface by people with disabilities. The DOM provides a standard way for browsers to improve interoperability and provides a way for assistive technologies to access the content of Web resources. The main problem with assistive technologies relying solely on the DOM for information is that there are often differences between the content in the DOM and the content rendered graphically on the screen. The difference is due to the ability of graphical browsers to repair invalid HTML to provide some type of graphical rendering. Invalid HTML may not have a clean representation in the DOM, making it difficult for assistive technologies to provide an alternative rendering that is consistent with the graphical display. For example, some
information may be displayed graphically, but not read by a screen reader. The parsing within the browser to generate the graphical rendering and for creating the DOM is usually handled separately. This leads to potential differences between what is rendered and what is in the DOM. Sometimes content rendered is not represented in the DOM, and other times, information in the DOM is not rendered. Therefore, assistive technologies need to look at both the DOM and the MSAA information on what is rendered on the screen, further complicating the job of assistive technologies.
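As a small illustration of the kind of structural information available from the DOM (the function name and script here are illustrative, not part of any assistive technology API), a script with DOM access can enumerate the headings of a page independently of how, or whether, they are rendered on screen:

<script type="text/javascript">
// Collects the text of all heading elements, grouped by heading level.
function listHeadings() {
  var levels = ['h1', 'h2', 'h3', 'h4', 'h5', 'h6'];
  var found = [];
  for (var i = 0; i < levels.length; i++) {
    var headings = document.getElementsByTagName(levels[i]);
    for (var j = 0; j < headings.length; j++) {
      found.push(levels[i] + ': ' + headings[j].innerHTML);
    }
  }
  return found;
}
</script>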
4.2 Accessibility APIs
Microsoft Active Accessibility (MSAA) is the accessibility API for Microsoft Windows and is supported by the Internet Explorer and Firefox browsers. MSAA provides the means for browsers (and other Windows applications) to communicate information on the rendering of content, user interface controls, and user events (mouse and keyboard) to assistive technologies like screen readers and magnifiers. MSAA provides information on what is actually rendered by a browser, and in combination with information from the DOM, assistive technologies can provide a more robust and usable alternative user interface to the browser. There are limitations in the current implementation of Microsoft Active Accessibility in representing widgets that commonly appear in Web applications. IAccessible2 (The Linux Foundation) extends the current MSAA implementation provided by Microsoft to include support for Web widgets. Microsoft has developed a much richer and more extendable accessibility API, called UI Automation, as part of its .NET technology to represent widgets, although at the time of writing this chapter neither assistive technology vendors nor Windows application developers have embraced the use of UI Automation to make applications more accessible. Other operating systems and programming environments have their own accessibility APIs, including Sun Java, Apple Macintosh, and Unix GTK/GNOME; see the additional resources for more information on these accessibility APIs.
5 Accessible Rich Internet Applications (ARIA)
Browser support for AJAX technologies (JavaScripting, the Document Object Model (DOM), and XMLHttpRequest for asynchronous communication) to create desktop-style widgets and to update those widgets asynchronously in desktop graphical browsers has raised a new set of accessibility issues for people with disabilities. Web application developers are using these capabilities to create user interface experiences that mimic the Graphical User Interfaces (GUI) found in desktop operating systems like Microsoft Windows, Apple Macintosh, Sun Java JRE, and
Unix GNOME/GTK user interfaces. The accessibility issues raised by Web applications include keyboard support and the ability to provide assistive technologies with information on the function and relationships of Web widgets created out of HTML markup, JavaScript, and CSS. The W3C Web Accessibility Initiative is developing a set of specifications that allow authors to provide keyboard support, to identify the types of widgets, and to expose the states and properties of widgets.
5.1 TABINDEX and Keyboard Focus
One of the fundamental features of making Web 2.0 widgets accessible is keyboard support, and browsers play the key role in supporting keyboard access to Web 2.0 applications. Traditionally, Web browsers have only supported the concept of keyboard focus for anchors and form control elements, as defined in the W3C HTML 4.01 Specification. The TAB key (or other keystrokes in the case of the Opera browser) is used to move keyboard focus sequentially between anchors and form control elements, and this behavior is implemented in browsers like Opera, Internet Explorer, Mozilla Firefox, and Apple Safari. Web widgets are typically built from DIV and SPAN elements with JavaScript event handlers to support user interaction with the widget features. DIV and SPAN elements with event handlers are not included in the default tabbing order like HTML form controls and anchors. When browsers support the ARIA recommendations, Web 2.0 widgets receive keyboard focus when the author defines a TABINDEX attribute for the element. Setting TABINDEX="0" allows an element to become part of the default tabbing order of the resource, and setting TABINDEX="-1" allows an element to receive keyboard focus without being included in the tabbing order of the Web resource. Elements with TABINDEX="-1" receive focus through author-defined keyboard event handlers implemented as part of the widget scripting, and the author can give these elements focus using the DOM focus() method.
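A hedged sketch of this pattern (the widget structure, IDs, and script are assumptions for illustration, not taken from the ARIA drafts):

<!-- The container participates in the tab order; the items are focusable
     only from script, via tabindex="-1" and the DOM focus() method -->
<ul id="menu" tabindex="0" onkeydown="handleMenuKey(event)">
  <li id="item-open" tabindex="-1">Open</li>
  <li id="item-save" tabindex="-1">Save</li>
</ul>
<script type="text/javascript">
function handleMenuKey(event) {
  // Down arrow (keyCode 40) moves focus into the first menu item
  if (event.keyCode == 40) {
    document.getElementById('item-open').focus();
  }
}
</script>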
5.2 Roles, Properties, and States
The ROLE attribute is an important part of the xhtml 2.0 specification and provides a means to describe the types of Web 2.0 widgets that are part of a Web application. The author can communicate information about widget relationships, properties, and status by programmatically setting attributes on the HTML elements. Changes in the values of the attributes trigger accessibility API events which communicate information about the widget to assistive technologies like screen readers and magnifiers. The following example shows the xhtml 2.0 markup for a simple checkbox control made out of a DIV element and JavaScript event handlers. The example uses the "checked" attribute to
change the CSS rendering and provide information to assistive technologies on the current state of the checkbox.

xhtml markup for checkbox example:
My Checkbox Label

JavaScript for checkbox example:
/**
 * toggleCheckbox is called by event handlers to
 * toggle the state of the checkbox
 *
 * @param ( Checkbox object ) checkbox Checkbox to toggle state
 * @return nothing
 */
toggleCheckbox = function( checkbox ) {
  if (checkbox.node.getAttributeNS(NS_STATE, "checked") == "true") {
    // If the checkbox is currently checked set the state to unchecked
    checkbox.node.setAttributeNS(NS_STATE, "checked", "false");
  }
  else {
    // If the checkbox is currently unchecked set the state to checked
    checkbox.node.setAttributeNS(NS_STATE, "checked", "true");
  } // endif
}

/**
 * handleCheckboxKeyDownEvent processes keys associated with
 * the checkbox
 * @param ( event ) event is the event handler for the event
 * @param ( Checkbox object ) checkbox is the Checkbox object that is
 *        the target of the keyboard event
 * @return false if the keyboard event was used by the checkbox, else true
 */
handleCheckboxKeyDownEvent = function(event, checkbox) {
  switch( event.keyCode ) {
    case KEY_SPACE:
      toggleCheckbox( checkbox );
      event.stopPropagation();
      event.preventDefault();
      return false;
      break;
  } // end switch
  return true;
}

/**
 * handleCheckboxClickEvent processes pointer click events within
 * the checkbox
 * @param ( event ) event is the event handler for the event
 * @param ( Checkbox object ) checkbox is the Checkbox object that is
 *        the target of the pointer event
 * @return false if the pointer event was used by the checkbox, else true
 */
function handleCheckboxClickEvent( event, checkbox ) {
  if( checkbox.node == event.target ) {
    toggleCheckbox( checkbox );
  } // endif
}

CSS for checkbox example:
div.checkbox[*|checked="true"] {
  background-repeat: no-repeat;
  background-position: left center;
  background-image: url('checked.gif');
}

div.checkbox, div.checkbox[*|checked="false"] {
  background-repeat: no-repeat;
  background-position: left center;
  background-image: url('unchecked.gif');
}
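A minimal sketch of the kind of element markup such a checkbox could use, assuming a namespaced checked-state attribute (prefixed aaa: here and declared on the document element) consistent with the script and CSS above, and a hypothetical myCheckbox wrapper object:

<div class="checkbox" role="checkbox" tabindex="0"
     aaa:checked="false"
     onkeydown="return handleCheckboxKeyDownEvent(event, myCheckbox);"
     onclick="return handleCheckboxClickEvent(event, myCheckbox);">
  My Checkbox Label
</div>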
Table 5 Mapping ARIA properties to accessibility APIs, from the roadmap for Accessible Rich Internet Applications (States and properties module; user agent mappings via MSAA and ATK)
Disabled. MSAA: STATE_SYSTEM_UNAVAILABLE. ATK: AT_STATE_DISABLED.
Checked. MSAA: STATE_SYSTEM_CHECKED. ATK: ATK_STATE_CHECKED.
Expanded. MSAA: if the hidden property is set to true: STATE_SYSTEM_COLLAPSED; if set to false: STATE_SYSTEM_EXPANDED. ATK: if the hidden property is set to true: ATK_STATE_EXPANDABLE; if set to false: ATK_STATE_EXPANDED.
Haspopup. MSAA: haspopup; this state should be mapped to true on Windows systems when an event handler has a role of pop-up menu. ATK: not necessary in ATK because it has multiple actions with description.
Multiselectable. MSAA: STATE_SYSTEM_EXTSELECTABLE. ATK: ATK_STATE_MULTISELECTABLE.
Pressed. MSAA: STATE_SYSTEM_PRESSED is true when checked. ATK: ATK_STATE_PRESSED is true when checked.
Readonly. MSAA: STATE_SYSTEM_READONLY. ATK: ATK_STATE_READONLY = inverse of readonly.
Required. MSAA: there is no mapping; the user agent must make it available through the [DOM] or a specialized API. Note: while optional could be combined with required, this is kept to be consistent with CSS3 pseudoclasses and [XForms]. ATK: there is no mapping.
Selected. MSAA: STATE_SYSTEM_SELECTED. ATK: ATK_STATE_SELECTED.
Unknown. MSAA: mixed. ATK: indeterminate.
Value. MSAA: should return the value for getValue(). ATK: should return this as part of the AccessibleValue structure.
5.3 Accessibility API Support
One of the limitations placed upon the roles and states is the mapping between them and accessibility APIs. Table 5 is taken from the States and Properties Module for Accessible Rich Internet Applications and shows the relationship between ARIA properties and MSAA/ATK accessibility API event messages. Most of the ARIA properties have a corresponding MSAA or ATK mapping, but some, like the "required" property, do not have a direct mapping. To provide mappings for these properties and roles, the Free Standards Group (FSG) has developed an extension to MSAA called IAccessible2. IAccessible2 extends the features of MSAA without assistive technology and browser developers needing to implement an entirely new accessibility API like UI Automation to provide additional information on widgets.
6 Conclusions
There is no best or perfect browser that meets the accessibility needs of all people with disabilities. Each Web browser has its own built-in accessibility features and compatibility with assistive technology. The more Web browsers that become available, the more choices all users will have in finding a browser that meets their accessibility needs. I hope that this chapter has helped you learn more about the importance of browser features in making Web resources more accessible and about how browser features affect the way authors create accessible content. One of the most important aspects of making the Web more accessible to people with disabilities is the support of W3C recommendations. The heart of the W3C recommendations is support for interoperability, and interoperability makes it easier for Web developers to create resources that can be used on a wide range of technologies, including the technologies used by people with disabilities. Since people with disabilities use a wider range of Web browsing technologies than the general population to access Web resources, the support of Web standards gives them more options and makes it easier for them to use browser features to adapt Web resources to meet their needs. It is important for all of us to ask developers of browsers and multimedia players to support Web standards and implement features that benefit people with disabilities. Through our combined voice, companies building Web browsing technologies will need to pay more attention to improving accessibility features.
7 Additional Resources
The following section provides links to additional resources to further explore the technologies and features needed to make desktop browsers more accessible to people with disabilities.
7.1 W3C Recommendations and Other Standards
Section 508 Information Technology Accessibility Standards
http://www.access-board.gov/508.htm
W3C User Agent Accessibility Guidelines 1.0
http://www.w3.org/TR/UAAG
iCITA HTML Accessibility Best Practices
http://html.cita.uiuc.edu
W3C Web Content Accessibility Guidelines 1.0
http://www.w3.org/TR/WCAG
Cascading Style Sheets, level 2 revision 1
http://www.w3.org/TR/CSS21/
HTML 4.01 Specification
http://www.w3.org/TR/HTML4/
W3C Document Object Model
http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/
Roles for Accessible Rich Internet Applications (ARIA Roles)
http://www.w3.org/TR/aria-role/
States and Properties Module for Accessible Rich Internet Applications (ARIA States and Properties)
http://www.w3.org/TR/aria-state/
7.3 Accessibility APIs
Microsoft Active Accessibility for Microsoft Windows
http://www.microsoft.com/enable
IAccessible2 Accessibility API
http://accessibility.freestandards.org/a11yspecs/ia2/docs/html/
UI Automation for Microsoft Windows
http://www.microsoft.com/enable
http://msdn2.microsoft.com/en-us/library/aa286482.aspx
Java Accessibility API and Resources
http://java.sun.com/products/jfc/accessibility/index.jsp
Apple Accessibility API and Resources
http://developer.apple.com/referencelibrary/GettingStarted/GS_Accessibility/
GNOME Accessibility API
http://developer.gnome.org/projects/gap/
References
Anson, D. (1997) Alternative Computer Access: A Guide to Selection, F. A. Davis Company, Philadelphia, PA.
Berners-Lee, T., Cailliau, R., Luotonen, A., Nielsen, H. F., Secret, A. (1994) "The World Wide Web", Communications of the ACM, August, pp. 76-82.
Bos, B., Celik, T., Hickson, I., Lie, H. W. (2007) W3C Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification, http://www.w3.org/TR/CSS21.
Champion, M., Nicol, G., Byrne, S., Le Hors, A., Le Hegaret, P., Robie, J., Wood, L. (2000) Document Object Model (DOM) Level 2 Core Specification, http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113.
Gunderson, J. (Ed.) (2007) iCITA Web Accessibility Best Practices, http://html.cita.uiuc.edu.
Gunderson, J., Schwerdtfeger, R. (2006) Mozilla/Firefox Accessibility Extension, Proceedings of the 2006 International Technology and Persons with Disabilities Conference.
Jacobs, I., Gunderson, J., Hansen, E. (Eds.) (2002) W3C User Agent Accessibility Guidelines, http://www.w3.org/TR/UAAG.
Lie, H. W., Bos, B. (2005) Cascading Style Sheets: Designing for the Web (3rd Edition), Addison-Wesley Professional, Indianapolis, IN.
Mueller, J. P. (2003) Accessibility for Everybody: Understanding the Section 508 Accessibility Requirements, Springer-Verlag, New York.
Paciello, M. (2000) Web Accessibility for People with Disabilities, CMP Books R&D Developer Series, Lawrence, KS.
Thatcher, J., Burks, M., Heilmann, C., Henry, S., Kirkpatrick, A., Lauke, P., Lawson, B., Regan, B., Rutter, R., Urban, M., Waddell, C. (2006) Web Accessibility: Web Standards and Regulatory Compliance, Apress, Berkeley, CA.
van der Vlist, E., Ayers, D., Bruchez, E., Fawcett, J. (2006) Professional Web 2.0 Programming, Wrox Professional Guides, Wiley, Indianapolis, IN.
Vanderheiden, G., Jacobs, I., Chisholm, W. (Eds.) (1999) W3C Web Content Accessibility Guidelines, http://www.w3.org/TR/WCAG/.
Zeldman, J. (2003) Designing with Web Standards, New Riders, Berkeley, CA.
Specialized Browsers
T.V. Raman
Abstract To most users of the Internet, the Web is epitomized by the Web browser, the program on their machines that they use to "logon to the Web". However, in its essence, the Web is both a lot more than, and a lot less than, the Web browser. The Web is built on:
URLs. A universal means for identifying and addressing content
HTTP. A simple protocol for client/server communication
HTML. A simple markup language for communicating hypertext content
This decentralized architecture was designed from the outset to create an environment where content producers and consumers could come together without the need for everyone to use the same server and client. To participate in the Web revolution, one only needed to subscribe to the basic architecture of a Web of content delivered via HTTP and addressable via URLs. Given this architecture, specialized browsers have always existed to a greater or lesser degree alongside mainstream Web browsers. These range from simple scripts for performing oft-repeated tasks, e.g., looking up the weather forecast for a given location, to specialized Web user-agents that focus on providing an alternative view of Web content. This chapter traces the history of such specialized Web clients and outlines various implementation techniques that have been used over the years. It highlights specialized browsers in the context of accessibility, especially for use by persons with special needs. However, notice that specialized browsers are not necessarily restricted to niche user communities; said differently, all of us have special needs at one time or another. As we evolve from the purely presentational Web to a more data-oriented Web, such specialized tools become center-stage with respect to providing optimal information access to the end-user. The chapter concludes with a
brief overview of where such Web technologies are headed and what this means to the future of making Web content accessible to all users.
1 Introduction
To most users of the Internet, the Web is epitomized by the Web browser, the program on their machines that they use to "logon to the Web". However, in its essence, the Web is both a lot more than, and a lot less than, the Web browser. The Web is built on:
URLs. A universal means for identifying and addressing content
HTTP. A simple protocol for client/server communication
HTML. A simple markup language for communicating hypertext content
This decentralized architecture was designed from the outset to create an environment where content producers and consumers could come together without the need for everyone to use the same server and client. To participate in the Web revolution, one only needed to subscribe to the basic architecture of a Web of content delivered via HTTP and addressable via URLs. Given this architecture, specialized browsers have always existed to a greater or lesser degree alongside mainstream Web browsers. These range from simple scripts for performing oft-repeated tasks, e.g., looking up the weather forecast for a given location, to specialized Web user-agents that focus on providing an alternative view of Web content. This chapter traces the history of such specialized Web clients and outlines various implementation techniques that have been used over the years. It highlights specialized browsers in the context of accessibility, especially for use by persons with special needs. However, notice that specialized browsers are not necessarily restricted to niche user communities; said differently, all of us have special needs at one time or another. As we evolve from the purely presentational Web to a more data-oriented Web, such specialized tools become center-stage with respect to providing optimal information access to the end-user. The chapter concludes with a brief overview of where such Web technologies are headed and what this means to the future of making Web content accessible to all users.
2 Overview
We start this section with a brief overview of the history of talking browsers (2007), commonly referred to as self-voicing browsers. The goal is not to cover every specialized browser that was ever written; rather, this section attempts to broadly classify various solutions that have been built since 1994 as a function of the end-user experience they delivered.
2.1 Talking Browsers: 1994-1998
The mainstreaming of the Web in 1994 coincided with the coming of age of GUI screenreaders. This meant that Web access issues for visually impaired users became intricately tangled up with the broader issue of providing good nonvisual access to the GUI. At the time, generic platform-level accessibility APIs were non-existent, and screenreaders relied on constructing an off-screen model by watching low-level graphics calls. Thus, Web access presented an additional challenge to adaptive technologies of the time. Specialized Web browsers that spoke Web content first emerged in early 1995. These were implemented as browser add-ons that relied on a visual browser to retrieve and display the content; the browser add-on accessed the retrieved HTML to produce spoken content. Notice that this was before the advent of standardized APIs such as the HTML Document Object Model (DOM). Despite this lack of standardized APIs, talking browsers of the time still had an advantage over available screenreader technologies; this was because specialized browsers were able to augment the user interface with additional commands that enabled the user to efficiently navigate the contents of a page. Contrast this with the screenreaders of the time that had to rely on the final visual presentation of the page—users of specialized talking browsers could navigate by paragraphs and sections, whereas screenreader users of the time were limited to line-oriented navigation. Examples of such specialized Web browsers from 1995 include the following: PWWebSpeak. This was implemented as an add-on to the Netscape browser in 1995, and the tool survived until the late 1990 s—see (Results, 2007). The browser was revolutionary for its time in terms of providing direct spoken access to the Web document, rather than forcing the speech user to deal with a purely visual presentation. Home Page Reader. IBM Home Page Reader was released a few months later. Built originally as a Netscape extension, it later evolved to become a plugin to Internet Explorer. Like PWWebSpeak before it, it relied on the mainstream Web browser (Netscape and later IE) to do the bulk of the work with respect to retrieving and displaying content. Home Page Reader hosted the Web browser—contrast this with PWWebSpeak which was hosted inside the browser. This reversal of roles enabled IBM Home Page Reader to provide a better end-user experience over time, since the program had greater flexibility with respect to adding or subtracting user interface elements from the browser’s chrome. Emacs W3. This was one of the early Web browsers that saw active development between 1993 and 1998. In conjunction with Emacspeak (2007), this tool introduced many innovations including:
Support for Aural CSS. Aural style sheets based on Cascading Style Sheets (CSS) could specify the aural properties to be used when speaking Web content.
Structured navigation. Users could quickly skim through an HTML document based on its underlying document structure.
HTML form enhancements, including support for the label and fieldset elements. This enabled Emacspeak to provide contextually meaningful prompts of the form "Press this to change 'Do you accept' from yes to no."
Table navigation with the ability to speak a cell with its row or column header, as well as the ability to focus in on the contents of a given table cell.
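The table navigation idea is straightforward to express against the HTML DOM. The following is a minimal conceptual sketch in JavaScript (not Emacs W3's actual Emacs Lisp implementation) of announcing a table cell together with its column header; it assumes a simple table whose first row contains the header cells.

```javascript
// Conceptual sketch of "speak a cell together with its column header".
// Assumes the table's first row holds the header cells.
function cellWithHeader(cell) {
  var node = cell;
  while (node && node.nodeName.toLowerCase() !== 'table') {
    node = node.parentNode;                  // climb from the cell up to its table
  }
  if (!node) { return cell.textContent; }
  var header = node.rows[0].cells[cell.cellIndex];
  var label = header ? header.textContent : '';
  return label + ': ' + cell.textContent;    // e.g., "Price: 42" for a data cell
}
```

The resulting string would then be handed to whatever text-to-speech layer the specialized browser uses.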
2.2 Spoken Web Access: 1998–2003
By the late 1990s, Windows screenreaders like JAWS for Windows (JFW) and Window-Eyes started looking at the HTML content in addition to using the visual presentation provided by the browser. Around the same time, platform-level accessibility APIs like Microsoft Active Accessibility (MSAA) enabled screenreaders to produce more reliable spoken output. Consequently, the combination of a screenreader and a mainstream browser began to provide the same level of end-user functionality that was seen earlier with specialized browsers like PWWebSpeak and IBM Home Page Reader. As an example, popular Windows screenreaders today implement browser support by placing Web pages in a virtual buffer that the user navigates using specialized commands to listen to the contents. Tools like IBM Home Page Reader therefore evolved into tools for checking the usability of Web sites with spoken output, for use by content developers.
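The virtual buffer approach can be pictured as a linearization of the page into a list of speakable chunks that the screenreader then walks with its own reading commands. The JavaScript below is illustrative only; real screenreaders build their buffers from accessibility APIs such as MSAA rather than from in-page script, and the chunking rule here is deliberately simplistic.

```javascript
// Illustrative only: flatten a page into a "virtual buffer" of speakable
// chunks in document order, one chunk per non-empty text node.
function buildVirtualBuffer(node, buffer) {
  buffer = buffer || [];
  if (node.nodeType === 3) {                          // a text node
    var text = node.nodeValue.replace(/\s+/g, ' ');
    if (text.replace(/ /g, '') !== '') {
      buffer.push(text);
    }
  } else {
    for (var i = 0; i < node.childNodes.length; i++) {
      buildVirtualBuffer(node.childNodes[i], buffer);
    }
  }
  return buffer;
}

// The user then navigates the buffer with reading commands, for example:
var chunks = buildVirtualBuffer(document.body);
var cursor = -1;
function readNext() {
  cursor = Math.min(cursor + 1, chunks.length - 1);
  return chunks[cursor];                              // handed to the speech engine
}
```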
2.3 Spoken Web Access: 2003–Present
Content feeds encoded as Really Simple Syndication (RSS) went mainstream in 2003. RSS feeds were the underpinnings of the blogging revolution. As a side effect, content-rich Web sites started providing data feeds that could be viewed inside specialized tools such as feed aggregators. This marked the coming of the data-oriented Web, where content is designed to be more than just viewed in a Web browser. The coming of such data-oriented access mechanisms has had a significant impact on the role and effectiveness of specialized browsing tools.

In early 2000, Emacspeak (2007) acquired the ability to apply content transformations to Web pages before presenting them to the user. This meant that the content of a Web page could be rearranged and filtered to provide an optimal eyes-free experience. In combination with the availability of content feeds, this enabled the creation of a large number of task-oriented tools. All of these tools leveraged the basic HTML+CSS rendering capabilities of Emacs/W3. Each of these specialized tools exposed a task-oriented interface that prompted the user for relevant input, retrieved and transformed the Web content relevant to the user and finally produced a speech-friendly presentation of the results.
The key difference with such task-oriented tools is that the user does not first launch a Web browser; for the most part, the user does not even think of the output from these tools as Web pages. The framework hosting these tools, Emacspeak, implemented the building blocks of basic Web architecture, and the result was a set of mini-applications that the user could call up with a few keystrokes. Examples of such task-driven tools include the following:

Search. Prompt for a search query and speak the results
Map directions. Prompt for a start and end location and speak the directions
Weather. Prompt for a location and speak the weather forecast

Notice that the resulting collection of speech-enabled tools can each be thought of as an extremely specialized browsing tool; the framework for hosting such tools then becomes an alternative container on par with the traditional Web browser. We will return to the topic of specialized containers that host Web components in Section 4.2, where we cover the rapidly evolving space of Web gadgets.
2.4 Voice Browsers: 2000–2007
In 1999, the W3C launched the Voice Browser activity, which led to the publication of VoiceXML 2.0, an XML-based language for authoring dialog interaction. VoiceXML is designed for authoring interactive applications that use speech as the primary interaction modality; typical implementations consist of a specialized container that processes VoiceXML documents to carry out a spoken dialog with the user. Covering the design and use of VoiceXML is beyond the scope of this chapter; for details, see VoiceXML (2007).

Later, XHTML+Voice (XHTML, 2007) enabled the integration of interactive spoken dialogs in Web pages. In this design, VoiceXML was used to author dialogs that were then attached as event handlers to visual Web controls. When hosted within a browser implementing DOM2 Events, this had the effect of turning visual user interface controls into multimodal dialogs. When a visual user interface control received focus, the VoiceXML dialog attached to that control produced appropriate spoken prompts, activated the specified speech recognition grammar and returned the recognized result to the user interface control. This meant that users could fill in forms either via the keyboard or via spoken input; this technique was implemented in browsers like Opera 9.

The design of VoiceXML applications is significant from the perspective of specialized browsers and Web architecture. VoiceXML applications use URLs as the addressing mechanism for locating and retrieving resources via HTTP. Here, resources include both application data, e.g., a train timetable, as well as application resources needed to carry out an effective spoken dialog with the user, e.g., spoken prompts and speech grammars. Thus, a VoiceXML application consists of the following:
Prompts. Spoken prompts, either authored as declarative markup using the Speech Synthesis Markup Language (SSML) for further processing by a Text-To-Speech (TTS) engine, or created as pre-recorded audio files for playback to the user
Grammars. Speech Recognition Grammar Specification (SRGS) grammars for constraining the recognizer to the set of appropriate utterances
VoiceXML. A sequence of VoiceXML dialogs consisting of form and field elements

A VoiceXML document can be viewed as a sequence of dialog elements that act as event handlers. Each VoiceXML dialog prompts the user, collects one or more values from the user and defines the appropriate event handler to fire based on the result of the recognition task. These event handlers are themselves other VoiceXML dialogs, thereby enabling VoiceXML to define a finite state machine that encapsulates dialog flow (a sketch of this idea appears at the end of this subsection).

VoiceXML applications can be viewed as attaching a purely spoken user interface to data available on the emerging service-oriented Web. Notice that the above design pattern of attaching spoken interaction to data on the service-oriented Web is still an on-going evolutionary process. VoiceXML applications authored for today's Web often end up needing to create a specialized back-end application from scratch, as opposed to merely attaching a spoken dialog interface to an existing data-oriented application. But this is a reflection of the fact that until now most applications have been authored for use via visual interaction.

As we move toward an increasingly diversified Web characterized by users who demand ubiquitous access from a variety of access devices ranging from desktop PCs to mobile devices, the Web is seeing a corresponding refactoring of the programming technologies used to author, deploy and deliver end-user interaction. Over time, such refactoring is beginning to lead to a data-oriented Web where open Web APIs based on URLs and standardized feed structures based on ATOM (2007) and RSS (2007) increasingly enable programmatic access to useful services. As such access increases, specialized browsers that provide alternative access can increasingly focus on the details of user interaction in a given modality, without having to repeatedly program the modality-independent aspects of an application. Content creation guidelines and standards play an increasingly important role in this process of refactoring, as will be seen in the next section.
3 Access Guidelines

The extent to which Web content can be made perceivable to the widest possible audience is a function of the following:

Content. The nature of the content, and the extent to which the encoding of that content permits graceful degradation. As an example, a purely visual image
is of little use to someone who cannot see (given the state of today's automatic image recognition technologies). Notice that graceful degradation of content requires redundancy in the content encoding, and that such redundancy is an essential prerequisite when repurposing content to different modalities via specialized browsers.
User Agent. The software used to access content is primarily responsible for the quality of the user experience.
Adaptive Technology. Users' needs and abilities vary, and where available user agents do not include the necessary augmentations needed by a specific group of users, this ability gap can often be bridged by using add-on adaptive technologies.

As can be seen from the above, the overall user experience, especially when considering users with special needs, is a function of the triple (C, UA, AT). Other chapters in this book focus on access guidelines and adaptive technologies in far greater detail; this section focuses on the relevance of accessibility guidelines as viewed from the goal of designing specialized browsing applications.
3.1 Content Is King

The Web as we know it would not exist without content. For the Web to remain true to its original vision of a Web of content where producers and consumers come together without explicit dependencies on a given set of technologies, content needs to be able to degrade gracefully, e.g., a Web site that has been created assuming color displays needs to be usable when viewed using a monochrome display. Returning to the topic of spoken access and specialized browsers, there is a deep relationship between access guidelines created to further the needs of graceful degradation and creating content that lends itself to being delivered via alternative modalities such as spoken output.
3.2 Separation of Content from Style

Separating content from style on Web pages by using CSS is an example of good content practice that benefits accessibility in the broader sense:
Empowers users to pick a presentation scheme that is best suited to the user’s needs and abilities
User-specific styles are cascaded with author-provided styles to achieve the final effect
Presentation is no longer hard-wired into the content, making the content amenable for a multiplicity of presentations
More specifically, separation of style from content makes the resulting HTML better suited for delivery via alternative modalities such as spoken output. Work on CSS1 started in 1995—CSS1 became a W3C Recommendation in early 1997. Aural CSS was created as an appendix to CSS1 in February 1996; later, it was converted into a CSS module for CSS 2.1. Note that the next version of CSS, CSS 3.0, is being created as a collection of modules—with one module focused on auditory output. Aural CSS is a good example of talking browsers leveraging an underlying design principle—separation of content from presentation—and applying the benefits of such separation to an entirely different output modality. Aural CSS was first implemented in Emacs/W3 in 1996; later, Opera implemented a subset of Aural CSS in Opera 9 in the context of speech-enabling the Opera browser using XHTML+Voice (X+V). Aural CSS specifies a set of additional voice properties that can be used to annotate Web content. As with visual CSS properties, aural properties can originate from a number of sources:
Style sheets provided by the content author
Style sheets provided by the user
Style mappings provided by the browser, e.g., a talking browser can choose to map a given visual style to a corresponding aural style

Most uses of Aural CSS fall into the final category above, i.e., specialized browsers use Aural CSS as a rule-based means of mapping visual style rules to appropriately designed aural styles.
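The third category can be illustrated with a small JavaScript sketch: inspect an element's computed visual style and derive speech parameters from it. The property names on the speech side are illustrative; they mirror Aural CSS properties such as voice-family, pitch and stress rather than any particular speech engine's API, and the mapping rules themselves are invented examples.

```javascript
// Illustrative mapping from computed visual style to speech parameters,
// in the spirit of Aural CSS properties (voice-family, pitch, stress).
function auralStyleFor(element) {
  var visual = window.getComputedStyle(element, null);
  var aural = { voiceFamily: 'paul', pitch: 'medium', stress: 'medium' };

  if (parseInt(visual.fontWeight, 10) >= 700 || visual.fontWeight === 'bold') {
    aural.stress = 'high';                  // speak bold text with added stress
  }
  if (visual.fontStyle === 'italic') {
    aural.pitch = 'high';                   // mark emphasis with a pitch change
  }
  if (/^h[1-6]$/i.test(element.nodeName)) {
    aural.voiceFamily = 'harry';            // headings get a distinct voice
  }
  return aural;                             // handed to whatever TTS layer is in use
}
```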
3.3 Separation of Content and Interaction

Today's Web pages are no longer pure content; they come close to realizing the 30-year-old maxim "the document is the interface!". User interfaces, and interactive documents as found on today's Web, consist of content, style and interaction. Thus, today's HTML pages consist of the following layers:

Content. Declarative HTML markup that represents document content
Style. CSS style rules that are bound to the HTML via appropriate class attributes placed on the content
Scripts. Event handlers implemented in the form of JavaScript functions that are invoked in response to user events

Notice that as we add in the next layer of complexity to Web documents, there is significant value in keeping the interaction layer well separated from the content and style layers. Such separation is important for the broader needs of accessibility to the widest possible audience; but it is crucial with respect to creating Web applications that lend themselves to easy deployment to different end-user interaction scenarios.
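One concrete, widely used way to keep the script layer separate is to attach event handlers from script rather than embedding them in the markup. The snippet below is a generic illustration of that pattern; the element id and the validation rule are invented, and it is not code from any particular system described in this chapter.

```javascript
// Interaction is attached from script, leaving the HTML purely declarative:
// the same markup can be restyled or re-presented without touching behavior.
function init() {
  var search = document.getElementById('searchForm');   // hypothetical form id
  if (!search) { return; }
  search.onsubmit = function () {
    // behavior lives here, in the script layer
    return validate(search);
  };
}

function validate(form) {
  // placeholder check; real validation rules would go here
  return form.elements.length > 0;
}

window.onload = init;
```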
In 1999, the W3C's Forms WG set out to define the next generation of HTML Forms, but in this process quickly discovered that form elements in HTML were not just about fill-out forms. Form elements collect user input and are in fact the basic building blocks for creating user interaction within Web pages. With this realization, XForms evolved into a light-weight Web application framework with a well-defined Model View Controller (MVC) design. Thus, XForms consists of the following:

Model. An XML data model for encapsulating user input, along with validity and dependency constraints
UI. A set of abstract user interface controls that capture the intent, rather than the presentation, underlying the user interface
Binding. A generic binding mechanism for connecting the user interface layer to the underlying data model

The above separation between content, presentation and interaction was introduced to ensure that Web applications created via XForms could be delivered to a multiplicity of end-user interaction contexts. As an example, a given XForms application can be hosted inside a Web page to provide visual interaction; the same XForms application can be processed by a different container to deliver a chat-like interface, where the user is progressively prompted for the requisite information using an instant messaging client. As a case in point, see FormsPlayer (2007), which describes how the various items of abstract metadata encapsulated by an XForms application, e.g., help and hint, can be leveraged to deliver a multimodal experience where the relevant tips are spoken to the user.
4 Future Directions

Based on the trends seen so far, this section sketches future directions for light-weight Web applications and their impact on the area of specialized browsers and accessibility. Notice that most if not all evolution on the Web is incremental; this means that many of the solutions that will become common-place in the future will typically trace their past to early prototypes of today. With this in view, this section sketches some future directions based on prototypes that have been built during the last few years. This is not to say that there will be no revolutionary changes; however, incremental improvements are far easier to predict, and in their aggregate often prove to be revolutionary in their impact.
4.1 Web Wizards and URL Templates

As described in Section 1, URLs play a central role in the architecture of the Web. As the Web evolved to include dynamic server-generated content, URLs
became more than locators of static content; URLs came to include URL parameters that were processed on the server to generate customized content. The formalizing of the Common Gateway Interface (CGI) in 1994 and the advent of HTML forms for collecting user input together led to the idea of RESTful URLs; see Representational State Transfer (REST, 2007). Such RESTful URIs naturally evolved into the forerunner of light-weight Web APIs; in fact these still form the underpinnings of many of the data-oriented APIs deployed on the Web in 2007.

RESTful URIs led to the notion of URL templates and Web search wizards in Emacspeak around 1999. At the time, mainstream Web sites had become visually busy. As a result, useful services such as getting map directions were difficult to use in an eyes-free environment where one needed to listen to the entire Web page. As an example, in 1998, one could get driving directions from Yahoo Maps for anywhere in the United States, a major step forward for the time, since before then one needed to use specialized mapping/atlas programs to obtain such information. The only drawback was that the input controls for providing start and end locations were buried deeply inside a visually busy page. Worse, once one had located the input fields and filled in the requisite information, one suffered the obligatory World Wide Wait before receiving a heavy-weight HTML page with the directions swamped by a mass of additional content.

Fortunately, the underlying Web architecture based on RESTful URLs made building a specialized tool for this task relatively easy. The tool in question was implemented to:

Prompt. Collect the start and end location from the user
Retrieve. Retrieve the content at the URL constructed by filling in the appropriate URL params
Filter. Filter the resulting content to locate and speak the driving directions

(A JavaScript sketch of this prompt-retrieve-filter pattern appears at the end of this subsection.) Eight years later and counting, the Emacspeak tool for accessing driving directions from Yahoo Maps still works. The only piece of this tool that has changed over the intervening period has been the filter step, which needs to keep pace with changes to the layout of the HTML page containing the directions.

The next step in this evolution was to convert the one-off tool above into a mini-application that was hosted in a framework. Notice that there is nothing very specific to map directions about the (prompt, retrieve, filter) sequence outlined above. Thus, within a few weeks of implementing the specialized talking map directions tool, Emacspeak had evolved to contain a framework that allowed easy authoring of talking Web tools. All of these tools have the following in common:

Interaction. A common interaction model that consists of spoken prompts with auto-completion and automatic speaking of the relevant results
Style. Aural CSS is used to consistently style all spoken output, with changes in voice characteristic highlighting key portions of the result being spoken
Code Isolation. Each specialized tool in the framework is specific to a given Web site's idiosyncrasies. This means that at any given time, at least some of
the available tools might be broken and need updating; however, such breakages are isolated to that particular tool
Incremental Evolution. Tools can be added, removed or modified without affecting other tools
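Here is a minimal sketch of the prompt-retrieve-filter pattern, written in browser JavaScript rather than Emacspeak's Emacs Lisp. The URL template, its parameter names and the filtering rule are all invented for illustration; a real tool would target a specific service and parse its actual page layout.

```javascript
// Sketch of a task-oriented "URL template" tool: prompt, retrieve, filter, speak.
// The service URL and its parameters are hypothetical.
var DIRECTIONS_TEMPLATE = 'http://maps.example.com/directions?from={start}&to={end}';

function getDirections(speak) {
  // Prompt: collect the start and end locations.
  var start = window.prompt('Start address?');
  var end = window.prompt('Destination address?');

  // Retrieve: fill in the URL template and fetch the page.
  var url = DIRECTIONS_TEMPLATE
    .replace('{start}', encodeURIComponent(start))
    .replace('{end}', encodeURIComponent(end));
  var request = new XMLHttpRequest();
  request.open('GET', url, true);
  request.onreadystatechange = function () {
    if (request.readyState === 4 && request.status === 200) {
      // Filter: keep only the part of the page that carries the directions.
      // Here we pretend each step is on its own line prefixed with a number.
      var steps = request.responseText
        .split('\n')
        .filter(function (line) { return /^\d+\./.test(line); });
      speak(steps.join(' '));   // hand the result to the tool's TTS layer
    }
  };
  request.send(null);
}
```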
4.2 Portals and Web Gadgets

In the late 1990s the Web browser evolved into a universal client, thanks to the availability of a number of useful services on the Web. This movement started on corporate Intranets, where Web technologies proved far more cost-effective than traditional multi-tier client/server solutions. The trend extended itself to the global Internet as electronic commerce became prevalent on the Web. Thus, the Web browser became the user's porthole onto the world of electronic information.
4.2.1 Portlets and Portal Servers

This evolution led naturally to the advent of Web portals, and consequently to the creation of portal servers. Web sites like Yahoo (2007a) aggregated a number of useful services onto a single large Web page to provide a single point of access; on corporate Intranets, such portal sites were powered by portal servers that enabled the Web administrator to easily deploy new applications on the site and have users configure their user experience by determining what they saw on their customized page.

The above process gave birth to a new form of specialized browser: the Web portlet. A portlet was a small Web application that was deployed on the server to carry out the following steps:

Back-end. Communicate with the back-end application, typically via HTTP, to retrieve, filter and format the requisite information
Front-end. Render the formatted information as HTML for embedding within a larger Web page
Configuration. Provide the user interface affordances to allow users to customize the final experience by configuring the look and feel of the portlet. Such configuration included adding, removing, expanding or collapsing the portlet
Preferences. Manage user preferences across portlets hosted on a page
Single sign-on. Delegate common tasks such as authentication to the portal container, so that users do not need to log in to each portlet application

Portlets as described above can be viewed as specialized browsers optimized for a given task, e.g., working with an employee's financial records on a corporate Intranet. Though hosted within a specialized application container on the server, such portlets are in fact no different than the specialized talking Web tools described in the previous section; in the case of looking up an employee's financial records, the tasks that the user would typically need to perform
Browse. Point the Web browser at the site for managing financial records
Sign In. Sign into the site with the appropriate credentials
Query. Request the relevant information

are performed on behalf of the user by the portlet. Thus, the portal server becomes an application container that provides a framework for portlet authors to create task-specific Web applications. The framework manages details such as single sign-on and creating a uniform look and feel with respect to customization. The resulting portal site provides a single point of entry for the user and obviates many repetitive tasks:

Single Sign-on. Users sign in once to access a number of related applications.
Defaults. Each application can be configured with a useful set of defaults for the current user.
Preferences. Users can manage their personal preferences with respect to look and feel across a set of applications.

In their heyday, portlets were not limited to desktop browsers with a visual interface. Using the underlying Web APIs, portlets were also created for deployment to mobile devices. Finally, a small number of portlets were created for hosting within a voice portal; such voice portlets emitted VoiceXML for aggregation into a larger VoiceXML application. Compared to their visual analog, VoiceXML portlets have not been very successful, primarily because integrating multiple spoken dialog applications into a coherent whole still remains an unsolved research problem.
4.2.2 Web Gadgets

Portal servers and portlets became the rage in 2002. Mapping this concept onto the client led to Web gadgets: task-specific Web applications hosted within the browser. As with the task-specific Web application technologies described so far, gadgets also relied on the underlying Web architecture of URLs and HTTP to bring relevant data closer to the user. As a client-side technology, gadgets naturally chose HTML and JavaScript as the implementation language; early prototype examples include Opera (2007) for the Opera browser, among others. Thus, client-side gadgets consisting of HTML, CSS and JavaScript were initially designed for placing within a Web page in a manner analogous to what was seen earlier with portlets.

The next step in this evolution came with the realization that forcing the end-user to launch a Web browser for every task was not always convenient; there are certain types of information, e.g., the current weather, that are better suited to being available on the user's desktop. This led to Apple's MacDashboard (2007). Thus, the task-specific Web applications created thus far for aggregation into a Web page for viewing within a browser were finally freed from the shackles of having to live inside the browser: Web widgets could now materialize on the desktop.
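To give a feel for how small such a gadget can be, the following is a hedged sketch of a "current weather" style gadget in plain JavaScript. The feed URL and the shape of its JSON response are invented, and real gadget platforms of the period each wrapped code like this in their own packaging and host APIs.

```javascript
// Minimal weather-gadget sketch: fetch a (hypothetical) JSON feed over HTTP
// and render it into whatever container the host page or desktop provides.
function renderWeatherGadget(containerId, location) {
  var url = 'http://feeds.example.com/weather?q=' + encodeURIComponent(location);
  var request = new XMLHttpRequest();
  request.open('GET', url, true);
  request.onreadystatechange = function () {
    if (request.readyState !== 4) { return; }
    var data = JSON.parse(request.responseText);   // e.g., {"temp": 21, "sky": "cloudy"}
    var container = document.getElementById(containerId);
    container.textContent = location + ': ' + data.temp + '\u00B0, ' + data.sky;
  };
  request.send(null);
}

// The same function works whether the container lives in a portal page,
// a personalized home page or a desktop widget host.
renderWeatherGadget('weatherBox', 'Boston');
```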
Web gadgets are still evolving as this chapter is being written. In late 2005, Google introduced IGoogle Modules for adding custom content to a user's Personalized Google page. Conceptually, these are similar to portlets, except that IGoogle Modules can be authored by anyone on the Web and published to a directory of modules that helps users discover and add published modules to their personalized IGoogle page. In an interesting parallel to client-side Web widgets escaping the shackles of the browser to live on the desktop, IGoogle modules can now be hosted within Web pages outside Google; they can also be viewed as Google Gadgets and materialize on the Google Desktop.

Notice that we have now come full circle, with task-specific browsing technologies that started as a niche application becoming a mainstream feature. Notice further that though such widgets inhabit the user's desktop outside the Web browser, they are well-integrated with respect to Web architecture and use all of the Web's basic building blocks to achieve their end. The impact on talking browsers of this progression from specialized Web applications to task-specific Web gadgets for the mainstream is profound, since the very features needed by spoken Web access:
A clean separation of content from presentation and interaction
A data-oriented Web
Light-weight Web APIs

are all prerequisites to building a healthy environment for Web gadgets.
4.3 Web APIs and Mashups

RESTful Web APIs became common by late 2004. The simplicity afforded by parametrized URLs and the bottom-up nature of development of RESTful Web APIs helped them overtake the much vaunted Web Services. (The capitalization of Web Services here is intentional and refers to the large number of complex WS* specifications that make up the Web Services stack.) As a consequence, the number of useful Web services available via light-weight Web APIs reached critical mass by late 2004, and Web 2.0 became a viable platform for building useful solutions.

Google Maps, launched in early 2005, provided the final link in the chain that led to Web mashups: light-weight Web applications that bring together data from different sources on the Web. Maps provide an ideal spatial canvas for visualizing information available on the Web. The availability of location-based information, e.g., available rentals in a given city, data about crime rates in different neighborhoods, etc., when combined with Google Maps enabled the creation of map mashups that allowed one to place location-oriented data on a map.
Notice that Web mashups represent a very special kind of task-oriented browsing; earlier, the user looking for apartments to rent would have had to perform the following discrete tasks:

Find. Browse to the relevant Web site to query for available apartments
Locate. For each available apartment, enter its address into the map to locate it

The Web mashup plays the role of a specialized browser that performs these tasks on the user's behalf to create the final result set (a sketch of this idea in JavaScript follows the list below). Web mashups like the one described here leverage the underlying Web architecture of URL-addressable data that is retrievable via HTTP.

The last 18 months have seen an explosion of useful Web mashups. Mashups have moved from being Web applications that brought together data from different sites to providing alternative views of available data. As an example, the Google Calendar API enables Web sites to embed a user's Google Calendar within a Web page. In doing so, such mashups can customize the look and feel of the calendar; this leads naturally to mashups that provide alternative views of the Google Calendar. The ability to provide alternative views of the same data source is a key consequence of the separation of data from any given view, and was earlier identified as a key requirement for adaptive Web access. With Web APIs and mashups liberating Web developers and users from a one-size-fits-all Web, mashups are evolving to be a flexible platform that:
Provide the ability to build highly optimized custom views for cases where the ‘‘one size fits all’’ solution does not work
Discover innovative access solutions via experimentation for inclusion into the mainstream
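A hedged sketch of the apartment-hunting mashup idea in JavaScript: one feed supplies listings, a second service turns each address into coordinates, and the merged result drives whatever map widget the page embeds. Both URLs and the shapes of their JSON responses are invented; real mashups of the period used provider-specific APIs such as the Google Maps API, which are not reproduced here.

```javascript
// Merge a (hypothetical) apartment-listings feed with a (hypothetical)
// geocoding service; the combined records can then be handed to a map widget.
function fetchJson(url, callback) {
  var request = new XMLHttpRequest();
  request.open('GET', url, true);
  request.onreadystatechange = function () {
    if (request.readyState === 4 && request.status === 200) {
      callback(JSON.parse(request.responseText));
    }
  };
  request.send(null);
}

function buildRentalMap(city, plot) {
  fetchJson('http://listings.example.com/rentals?city=' + encodeURIComponent(city),
    function (listings) {
      listings.forEach(function (apartment) {
        fetchJson('http://geo.example.com/geocode?q=' +
                  encodeURIComponent(apartment.address),
          function (position) {
            // plot() stands in for the embedded map widget's own API.
            plot(position.lat, position.lng, apartment.rent + '/month');
          });
      });
    });
}
```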
4.4 Putting It Together: Ubiquitous Access
The evolution of specialized Web tools, light-weight Web APIs and Web mashups has together led to the emergence of a component framework for the Web. This framework is characterized by:

Data Model. An emerging data model for representing, manipulating and communicating presentation-independent structured data. These manifest themselves in one of the following forms:
ATOM feeds used in the context of Atom Publishing Protocol (APP, 2007).
XML instances backed by appropriate XML Schema type definitions for application-specific data. These are most commonly encountered in the context of XForms.
JavaScript Object Notation (JSON) serializations of structured data records. JSON uses JavaScript serialization to represent structured data as an alternative to XML.
UI. User interface controls authored as a mixture of declarative markup, style specifications and script-based event handlers to implement custom interaction.
Binding. A set of common technologies for binding user interface controls to underlying data. Such binding brings data to life by enabling users to manipulate and view structured data. (A small JavaScript sketch of this data/UI/binding split appears below.)

The separation of data, presentation and interaction that is manifest in this emerging architecture for Web components lends itself well toward making Web gadgets available on a variety of devices and interaction modalities. Thus, specialized browsing, a niche concept that was originally limited to special adaptive aids or software engineers building themselves efficient one-off solutions, has now evolved to become center-stage. Specialized browsers that talk HTTP to retrieve information from the data-oriented Web and deliver a custom presentation that is optimized for the user's special needs and abilities are now a mainstream technology. Such specialized gadgets range in complexity from the simple look-up-the-weather gadget to full-blown custom applications such as mobile-optimized email clients. All of these share the underlying Web fabric of HTTP and URLs, which means that specialized clients like GMail (2007) need only implement the user-facing aspects of a traditional mail client. This space is still evolving rapidly as this chapter is being written. Component technologies such as those described so far will likely evolve to become pervasive, i.e., a Web component once created is likely to be capable of manifesting itself in a variety of end-user environments ranging from the graphical desktop to the speech-enabled mobile phone.
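The data/UI/binding split can be illustrated with a small JavaScript sketch: a JSON data record, an abstract description of the controls, and a binding step that connects the two. Everything here (the record, the control descriptions, the renderer callback) is invented to illustrate the shape of the idea; it is not XForms or any specific gadget framework.

```javascript
// Data model: a presentation-independent record (could equally arrive as Atom or XML).
var model = { departure: 'Boston', arrival: 'New York', travellers: 1 };

// UI: abstract controls described by intent, not by widget type.
var controls = [
  { bind: 'departure',  intent: 'enter-text',    label: 'Leaving from' },
  { bind: 'arrival',    intent: 'enter-text',    label: 'Going to' },
  { bind: 'travellers', intent: 'choose-number', label: 'Travellers' }
];

// Binding: connect each abstract control to the model; the renderer decides
// whether that becomes a text field, a spoken prompt or something else.
function bindControls(model, controls, render) {
  controls.forEach(function (control) {
    render(control, model[control.bind], function (newValue) {
      model[control.bind] = newValue;       // updates flow back into the model
    });
  });
}
```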
4.4.1 Web Command-Line

As the Web platform continues to evolve, we can expect today's environment of Web mashups backed by RESTful APIs to further evolve to enable end-user composability of Web components. Functionally, the service-oriented Web is a collection of Web APIs that can be composed to create higher-level solutions. In this regard, APIs such as Google Maps are Web components analogous to UNIX shell tools such as ls and find; UNIX is exemplified by its command-line shell, where small, task-specific tools are composed to create custom end-user shell scripts to automate common tasks. The next step in this evolution is likely to be the creation of a Web command-line that enables end-users to compose higher-level solutions from existing Web components. In the UNIX shell, components were composed by piping the output of one component to the input of the next component to create logical pipelines; once created, such user-defined components could themselves be used as components. Equivalent concepts for the Web platform are still evolving as this chapter is being written. Below, we enumerate some of the design patterns that have emerged over the last few years to serve as an indicator of what is to come.
Data Feeds. Structured data feeds encoded as RSS, Atom or JSON are used to communicate between Web components.
XML HTTP. XML HTTP is used to make asynchronous requests for data within Web applications, making them more reactive.
Eventing. DOM eventing provides a standardized mechanism for reacting to user interaction events.
Greasemonkey. Content APIs like the DOM enable content transformation on the client, either via JavaScript as implemented by Greasemonkey or via XSLT.
Composability. Web APIs enable composability. Composability can happen on the client, e.g., AJAX APIs coming together in a mashup, or on the server as shown by solutions such as Yahoo (2007b).
Command-line. The address bar of the Web browser has for now turned into a poor man's command-line while we evolve toward a truly programmable Web platform.
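To make the pipe analogy concrete, here is a hedged JavaScript sketch of composing "components" UNIX-style: a feed fetcher piped into a filter piped into a renderer. The feed URL is hypothetical and the composition helper is invented; server-side equivalents of this pattern were offered by tools such as Yahoo Pipes.

```javascript
// A tiny "pipe" helper: each stage receives the previous stage's output
// plus a continuation, mimicking shell pipelines for asynchronous Web steps.
function pipe(stages, input) {
  if (stages.length === 0) { return; }
  stages[0](input, function (output) {
    pipe(stages.slice(1), output);
  });
}

// Stage 1: fetch a (hypothetical) JSON feed of headlines.
function fetchFeed(url, next) {
  var request = new XMLHttpRequest();
  request.open('GET', url, true);
  request.onreadystatechange = function () {
    if (request.readyState === 4 && request.status === 200) {
      next(JSON.parse(request.responseText).items);
    }
  };
  request.send(null);
}

// Stage 2: keep only items that mention a keyword.
function filterByKeyword(items, next) {
  next(items.filter(function (item) { return /accessibility/i.test(item.title); }));
}

// Stage 3: render (or speak) the surviving items.
function render(items, next) {
  items.forEach(function (item) { console.log(item.title); });
}

pipe([fetchFeed, filterByKeyword, render], 'http://feeds.example.com/news.json');
```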
4.5 Web Access: A Personal View
The Web is an information platform, and the question "What is good Web access?" is better answered by rephrasing it as "How does one deliver effective information access?" My own work in this field started with the work on ASTER (2007), a system for producing high-quality aural renderings. The primary insight underlying ASTER was that electronic information is display independent. To produce good aural renderings, one needs to start with the underlying information, as opposed to a specific visual presentation.
ASTER introduced the notion of audio formatting and produced high-quality aural renderings from structured markup by applying rendering rules written in Audio Formatting Language (AFL). ASTER included an interactive browser that allowed the listener to obtain multiple views. Re-ordering and filtering of content is an essential aspect of specialized browsing. Extending these ideas from documents to user interfaces led to Emacspeak (2007), a well-integrated computing environment that provides the auditory equivalent of the graphical desktop. Emacspeak extended the notion of audio formatting to interactive environments. In implementing rich auditory interaction for the Emacspeak audio desktop, it became clear that most of today's user interfaces could be phrased in terms of a small number of abstract conversational gestures; see Table 1. The term conversational gestures was chosen intentionally: conversation implies speech; gestures imply pointing; yet the set of abstract conversational gestures identified by the work on Emacspeak is actually independent of both interaction modalities. Conversational gestures as enumerated in Table 1 enable the authoring of intent-based user interaction that can be mapped to different interaction modalities.
Table 1 Conversational gestures
Exchanging textual information: edit widgets, message widgets
Answering yes or no: toggles, check boxes
Select elements from set: radio groups, list boxes
Traversing complex structures: previous/next, parent/child, left/right, up/down, first/last, root/exit
This notion was further developed and implemented within XForms (2007), XML-powered Web forms, where we defined user interface controls for each of the conversational gestures. User interaction authored via such intent-based vocabularies lends itself well to delivery to different interaction modalities. Notice that a common set of abstract user interface controls gives enormous flexibility when determining how a particular piece of user interaction is delivered to the user. But to deliver such flexible interaction, one needs to have the freedom to experiment at the time the user interface is delivered. An emerging pattern in this space is therefore to:

Author high-level user interaction using declarative markup, e.g., XForms
Deliver specific interaction behavior via event handlers implemented using client-side scripting, e.g., JavaScript

This leads naturally to the next step in this evolution: dynamic Web interaction delivered as a collection of declarative markup, prescriptive style sheets and imperative event handlers. Notice that this packaging of Web interaction once again reflects the oft-mentioned separation of content, presentation and interaction.
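A small JavaScript illustration of this division of labor (not XForms syntax): the interaction is described purely by intent, and modality-specific renderers decide how each gesture is delivered. The gesture names loosely follow Table 1; the renderer and its callbacks are invented.

```javascript
// An interaction described by intent, independent of modality.
var interaction = [
  { gesture: 'exchange-text', name: 'query', label: 'Search for' },
  { gesture: 'select-from-set', name: 'scope', label: 'Search in',
    choices: ['Web', 'News', 'Images'] }
];

// A visual renderer might map these gestures to edit fields and list boxes;
// a spoken renderer maps the same description to prompts and grammars.
function renderSpoken(interaction, speak, listen) {
  interaction.forEach(function (step) {
    if (step.gesture === 'exchange-text') {
      speak(step.label + '?');
      step.value = listen();                       // free-form utterance
    } else if (step.gesture === 'select-from-set') {
      speak(step.label + ': ' + step.choices.join(', ') + '?');
      step.value = listen(step.choices);           // constrained recognition
    }
  });
  return interaction;
}
```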
4.5.1 Speech-Enabling Dynamic User Interfaces

User interfaces created using intent-based authoring as embodied by technologies like XForms enable flexible delivery, and consequently make attaching spoken interaction tractable. However, there is a concomitant need to be able to speech-enable dynamic interaction delivered as a combination of declarative markup and imperative event handlers. Notice that the availability of declarative, intent-based representations for common interaction tasks does not eliminate the need for imperative script-based solutions; scripting will always remain as a means to experiment with new interaction patterns. Thus, there is a need to identify the relevant pieces of information that need to be added to the content layer to make it possible to speech-enable dynamic Web interaction.
For Dynamic HTML (DHTML), such information consists of the following:

Role. A property that reflects the role played by a UI component. As an example, property role might be used to indicate that an interactive element on a Web page is a menu.
State. Dynamic user interfaces are reactive; the state of user interface controls gets updated dynamically based on user interaction. Thus, a set of user interface controls that were originally disabled might become available to the user during the course of interaction. Dynamic property state can be used to encapsulate such changes in the state of user interface controls.
Monitors. In addition, dynamic visual interfaces rely on the eye's ability to track changes in the presentation. To be able to effectively speech-enable such user interfaces, one needs to be able to establish an observer-observable relationship between various interaction elements making up the user interface. The content layer of the application needs to enable the identification of such relationships and clearly mark up those regions of the interface that need to be presented to the user when updated.

Addition of these properties to the content layer of Web applications brings the interaction layer on par with the rest of the Web component framework with respect to empowering alternative modes of interaction. Note that these properties also form the underpinnings of the present work on access-enabling rich Internet applications (ARIA, 2007).
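The following JavaScript fragment is a minimal sketch of how these three kinds of information can be expressed with the WAI-ARIA attributes that grew out of this work. The attribute names (role, aria-disabled, aria-live) are real ARIA vocabulary; the element ids and the specific widget are invented for illustration.

```javascript
// Role: declare what an element is, not how it looks.
var item = document.getElementById('saveItem');       // hypothetical element id
item.setAttribute('role', 'menuitem');

// State: expose dynamic state changes so assistive technology can announce them.
item.setAttribute('aria-disabled', 'true');           // later flipped to 'false'

// Monitors: mark a region whose updates should be reported to the user.
var status = document.getElementById('statusArea');   // hypothetical element id
status.setAttribute('role', 'status');
status.setAttribute('aria-live', 'polite');
status.textContent = 'Document saved';                // a screenreader can now speak this update
```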
5 Summary

This section summarizes the key take-aways from this chapter:

Web arch. The Web based on HTTP and URLs is bigger than any given type of browser. Browsers have evolved from being viewers for static HTML documents into a universal container for light-weight user interaction. Specialized browser technologies are a key component of the Web and are beginning to play a central role in enabling ubiquitous Web access.
Separation of concerns. Refactoring of Web applications to reflect the separation of content, presentation and interaction is an on-going process that progressively enables flexible delivery of content. It is crucial for specialized browsers, and is central to the world of Web components.
Gadgets. Web gadgets capable of manifesting themselves in a variety of access contexts, ranging from the user's personalized Web page to the traditional graphical desktop outside the shackles of a Web browser and mobile devices, are at the leading edge of today's advances in Web interaction.
Web APIs. RESTful Web APIs have begun to deliver the original but unrealized promise of Web Services. With the arrival of mashups, we are finally beginning to see the emergence of a data-oriented Web.
Web platform. The Web environment powered by content feeds and backed by data-oriented Web APIs and dynamic client-side interaction has been tagged with the Web 2.0 moniker. But more significant than the version number is the emergence of the Web as a viable platform for delivering ubiquitous information access backed by flexible user interaction.
Web Command-line. As the Web platform evolves further, we can expect many of the technologies underlying specialized browsers to morph into a Web command-line that allows end-users to compose flexible custom solutions from the various building blocks provided by the service-oriented Web.
References

APP (2007). Available at http://bitworking.org/projects/atom/draft-ietf-atompub-protocol-09.html.
ARIA (2007). Available at http://www.w3.org/TR/aria-roadmap/.
ASTER (2007). Available at http://www.cs.cornell.edu/home/raman/aster/aster-toplevel.html.
ATOM (2007). Available at http://en.wikipedia.org/wiki/Atom.
Browsers (2007). Talking Web Browsers. Available at http://en.wikipedia.org/wiki/Self-voicing.
Emacspeak (2007). Available at http://emacspeak.sf.net.
FormsPlayer (2007). FormsPlayer and Multimodal Applications. Available at http://www.formsplayer.com/node/141.
GMail (2007). GMail Mobile. Available at http://www.google.com/mobile/gmail/.
MacDashboard (2007). Dashboard Widgets for Mac OS. Available at http://www.apple.com/macosx/features/dashboard/.
Opera (2007). Opera Widgets. Available at http://widgets.opera.com/.
REST (2007). Available at http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm.
Results (2007). Google Results. Available at http://www.google.com/search?q=PWWebspeak+productivity+works.
RSS (2007). Available at http://en.wikipedia.org/wiki/RSS.
VoiceXML (2007). Available at http://www.amazon.com/VoiceXML-Introduction-Developing-Speech-Applications/dp/0130092622.
XForms (2007). Available at http://www.w3.org/TR/xforms.
XHTML (2007). XHTML+Voice. Available at http://www.w3.org/TR/xhtml+voice.
Yahoo (2007a). Available at http://www.yahoo.com.
Yahoo (2007b). Yahoo Pipes. Available at http://pipes.yahoo.com.
Browser Augmentation

Vicki L. Hanson, John T. Richards, and Cal Swart
Abstract In this chapter we examine ways of programmatically expanding the functionality of conventional browsers. We consider various mechanisms for creating extensions and add-ons in Internet Explorer and Firefox and briefly review other browsers. We explore some existing browser extensions. The focus is on accessibility extensions, but some prominent other examples of browser extensions are also mentioned. Information for using and creating various types of extensions is presented. Finally, the usefulness of browser extensions compared with other accessibility enablement options for the Web is considered.
1 Introduction

There are many ways to achieve the goal of an accessible Web, as discussed throughout this book. Directly authoring accessible Web pages is one way (see Part 2). Transcoding, changing already created Web pages to make them more accessible, is another (see Transcoding). In addition, browsers themselves have built-in features to adjust some aspects of page presentation to make the content more accessible (see Desktop Browsers). Most browsers also provide extension mechanisms that allow developers to augment browser functionality. Such augmentation allows for the creation of a number of accessibility features. Talking browsers are probably the best known form of browser augmentation for accessibility. These browsers allow blind, dyslexic, or otherwise print-disabled users to hear the content of a Web page read aloud. Early versions of this technology (Asakawa and Itoh 1998; Zajicek, Powell and Reeves 1998) have largely been supplanted by talking browser options included in
conventional browsers (see Fire Vox; Opera) or by all-purpose screen readers such as JAWS or Window-Eyes. In this chapter we will examine the issue of browser augmentation for adding functionality to make Web pages more usable by people with limited sensory, motor, or cognitive abilities. We begin by giving some examples of enhanced accessibility features based on browser augmentation and then look at some of the augmentation techniques available in current browsers, focusing primarily on Internet Explorer (IE) and Mozilla Firefox.
2 Overview

The World Wide Web Consortium (W3C) is committed to an accessible Web. To accomplish this goal, the Web Accessibility Initiative was created. A three-part approach to Web accessibility has been taken by this group. The first, which arguably receives the most attention, is the Web Content Accessibility Guidelines (WCAG) for markup languages to be used for authoring accessible content. The second, the Authoring Tool Accessibility Guidelines, addresses authoring tools for generating markup. The third, of most relevance to the present discussion, is the User Agent Accessibility Guidelines (UAAG) for Web browsers.

WCAG guidelines prescribe how websites can be authored so that the content on the site meets specific accessibility requirements. These guidelines alone, however, often fall short of providing the kind of page renderings needed by many users. Specifically, they may not address the need to unclutter a page or provide for legible fonts, particularly when the definition of 'legible' is very subjective. What the guidelines can do, however, is provide mechanisms for the type of flexibility needed by browsers to implement these individual choices (Hanson, Snow-Weaver and Trewin 2006). The fact is that no single page rendering will be accessible to all users. While WCAG has been able to specify, for example, markup needed to make pages accessible to screen readers, it is not possible to specify a single page rendering that will be easily readable by everyone with vision impairments. Thus, there is the need for individual users to be able to indicate the optimal rendering for
themselves. In this sense the UAAG and browser extensions/add-ons provide for flexible rendering of pages, customized to individual users. Modern browsers have incorporated several useful accessibility features (see Desktop Browsers). These features generally include font enlargement (at least over a small range of sizes), text, link, and background color modifications to enhance contrast or suit the preferences typical of users with some forms of dyslexia, and whole page magnification (zooming). These features, while having the advantage of being built-in, do not include the full complement of adaptations that a disabled user might want or need. To satisfy these more diverse requirements, developers can create new functions and integrate these extensions within the browser. While all major browsers support extensions, they do so in very different ways. In this section, we will examine current browser extension mechanisms and specific accessibility uses of these extensions.
3 Browser Extension Mechanisms

Each of the modern browsers has its own mechanism for extending functionality. We will begin by discussing extension mechanisms provided by Internet Explorer (IE) and will then continue to Firefox, Safari, and Opera.
3.1 Internet Explorer

A number of mechanisms exist for augmenting the functionality of IE (Roberts 1999). They include custom style sheets and various program objects that are bound to the browser. The mechanisms to be discussed here work for IE 5 and above.

3.1.1 Style Sheets

A potentially simple mechanism to modify all Web pages rendered in IE is to use a custom style sheet. Style sheets can become quite complex, modifying many aspects of a browser's rendering. In general, though, style sheets provide a simpler means of making changes than creating new program objects; style sheets can be created with nothing but a text editor and require no compilation or build procedure. Once created, a style sheet can be loaded using the Accessibility dialog, available on the Tools menu, under Internet Options. A more sophisticated use of the IE style sheet mechanism is to create style sheets under program control and inject them into IE. This technique is used by our Web Adaptation Technology software to combine several separately specified preferences into a single style sheet (Richards and Hanson 2004). Style sheets in this software allow for combinations of page magnification, line spacing, and letter spacing.
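As an illustration of "style sheets under program control", the following sketch builds a user style sheet from a handful of preference values and injects it into the current document. It uses document.createStyleSheet(), which is IE-specific, with a generic DOM fallback; the preference names and the particular CSS rules chosen for them are illustrative, not the Web Adaptation Technology's actual rules.

```javascript
// Build a custom style sheet from user preferences and inject it.
function applyPreferences(prefs) {
  var css =
    'body { zoom: ' + prefs.magnification + '; }\n' +           // page magnification (IE zoom)
    'p, li { line-height: ' + prefs.lineSpacing + '; ' +
             'letter-spacing: ' + prefs.letterSpacing + 'em; }';

  if (document.createStyleSheet) {            // IE-specific API
    var sheet = document.createStyleSheet();
    sheet.cssText = css;
  } else {                                    // generic DOM fallback
    var style = document.createElement('style');
    style.type = 'text/css';
    style.appendChild(document.createTextNode(css));
    document.getElementsByTagName('head')[0].appendChild(style);
  }
}

applyPreferences({ magnification: '150%', lineSpacing: 1.8, letterSpacing: 0.1 });
```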
3.1.2 Browser Objects

More sophisticated techniques for adding functionality to IE involve the creation of additional program objects that are bound to the browser, effectively adding new programmed behaviors. Examples of these program objects include Browser Helper Objects (BHOs), Band Objects, and ActiveX controls.

Many BHOs can be found for IE. An example of a commonly available BHO that enhances IE for purposes other than accessibility is provided by Adobe. Their BHO adds the ability to convert a Web page to a PDF document with a single click. This BHO is added to IE when Adobe Acrobat is installed.

In general, bands package a set of new features with an associated user interface. Like BHOs, band objects can listen to browser events if they register themselves to do so (an example of this is given below). Bands generally have a richer user interface than BHOs, although a BHO may occasionally pop up dialogs to interact with the user. Our Web Adaptation Technology software uses a band to provide the control interface for selecting accessibility preferences (Richards and Hanson 2004). The band, shown in Fig. 1, features large buttons allowing for easy viewing and clicking.
Fig. 1 A screenshot from the Web Adaptation Technology showing the band with one of its panels, speak text, visible
Preferences are organized by type and grouped within simple panels. The full functionality of this band is revealed in a series of panels that allow users to control a number of options for page rendering and mouse and keyboard operation. Users can select the panels by clicking the large left and right arrows on the left-hand side. Help for the current panel can be obtained by clicking the large ? button.

This band and a BHO together provide the new accessibility features in our software; the band allows the user to set their preferences and the BHO modifies the content based on those preferences as pages are loaded. In this software, examples of its use include reading text aloud via IBM's ViaVoice, enlarging text, changing colors of page elements, and reformatting the page layout to one column. This software uses a Browser Helper Object (BHO), written in Visual J++, to make various modifications to the browser's Document Object Model (DOM) after it is built by the browser but before the content is displayed on the screen. Unlike the user style sheet mechanism reviewed above, this BHO has the ability to make a wider range of modifications (e.g., modifying element attributes based on a potentially complex analysis of nearby elements). In addition to programmatic access to the DOM, a BHO has the ability to respond to user events (such as user mouse movements over a document). This ability allows the Web Adaptation Technology BHO to respond to mouse hover events to read the underlying text aloud (a hover being defined as the user moving the mouse over a non-empty region of the page and then pausing). A BHO can also respond to general browser events (such as the events related to fetching a new page).

The interactions between the browser and the band can get rather elaborate. Again, consider our Web Adaptation Technology. The help content for this software is provided by a collection of Web pages. An example of the help for speak text is shown in Fig. 2. Links to the other software functions (such as magnify, text size, banner text, etc.) are shown on the help links on the left-hand navigation bar. Clicking on one of these links causes the associated page to be loaded. Because the band has registered an interest in browser navigation events, it gets control each time a page is loaded. It examines the URL for the new page and, if it is for one of the help pages, switches its panel so that it corresponds to that page. In this way we can keep the band's interface and the help content in synchronization, allowing the help to reliably refer to visible band elements. Similarly, if help is being viewed, a click on the left or right band buttons causes the corresponding help page to be loaded by the browser.

ActiveX objects are arguably the most complex of the objects that can be bound to the IE browser. A rather large example of an ActiveX object is the Java runtime support. ActiveX objects in general are COM applications that are registered with the browser and loaded with each instance. They are distributed as Windows Dynamic Link Libraries (DLLs). These DLLs can be created by programming in C++, Visual Basic, ATL, and C# using the Microsoft Visual Studio development environment.
Fig. 2 Help content and band panels synchronized in the Web Adaptation Technology Help system through the handling of browser navigation events
Other frequently used ActiveX objects include Quicktime and Flash objects. For accessibility, Muta, Ohko and Yoshinaga (2005) created an ActiveX component that can be utilized by Web site owners to provide features for visitors to their Web site who have vision limitations. Unfortunately, although many add-ons are harmless, malware writers have used these techniques to spread spyware and monitor browser activity; users need to be aware of the add-ons that are installed. These can be managed from IE by selecting Manage Add-Ons on the Tools menu.

3.2 Mozilla Firefox

The Firefox browser was designed to be easily extended. Indeed, the browser itself is largely written using the same mechanisms an extension developer would use. As an open source browser, this extensibility provides obvious advantages.
In this section we review some of the ways to add new capabilities to Firefox.
3.2.1 Extensions
The Firefox Accessibility Extension (http://firefox.cita.uiuc.edu/index.php) is one well-known extension. It provides a number of features that allow easier navigation of the elements of a Web page, as well as changes to some elements of page presentation for users with vision impairments. It also provides validation tools for developers to check their pages against accessibility best practices.
Consistent with the general extensibility of Firefox, Mozilla (the underlying platform) provides a fairly self-contained application development and deployment platform. Applications written for this platform, including browser extensions, run on all the systems on which Mozilla runs (including Windows, Linux, and Mac). This means that most of the code an extension developer would create only needs to be written once. Extensions are created in the XML User Interface Language (XUL), JavaScript, HTML, and CSS. A simple text editor can be used to create these extensions; alternatively, a sophisticated development environment like Eclipse can be used. There are many tutorials on the Web for learning how to start programming an extension; searching for 'developing Firefox extensions' will return many hits.
The XUL feature of this platform permits the runtime assembly of complex user interfaces. From XUL, one can call JavaScript functions to accomplish other tasks, such as initialization, event handling, and DOM manipulation. The overlay feature of XUL also permits changes to the browser UI, such as menus and toolbars (elements called the chrome in Mozilla terminology). Our Firefox accessibility software, accessibilityWorks (Hanson et al. 2005), uses XUL as the means of building the panels corresponding to the IE band discussed previously. Several features of XUL made this task particularly easy. XUL supports rapid, iterative development of GUIs: since there is no compilation and build step, changes can be tested immediately. And since we can register for Firefox events, we can trigger such things as DOM manipulation at page load time.
The Cross-Platform Component Object Model (XPCOM) is another feature of the Mozilla environment that allows developers to perform lower-level tasks that cannot be done in XUL or JavaScript. Mozilla has a large variety of XPCOM objects that can be instantiated from JavaScript. These objects allow preferences to be retrieved and set, and files to be read and written. Developers can also write their own XPCOM objects. In accessibilityWorks, we created an XPCOM component to connect our software to text-to-speech engines such as IBM's ViaVoice.
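To make the extension mechanics concrete, the following is a minimal sketch, not taken from accessibilityWorks, of the pattern described above: an overlay script registers for the browser's page-load events and then manipulates the DOM of each loaded document. The object name myAccessibilityExt and the sample transformation are illustrative only; a real extension would package such a script with a XUL overlay and an install manifest.

    // overlay.js: loaded from a XUL overlay into the browser window (illustrative).
    // Registers for page loads so the extension can modify each document's DOM.
    var myAccessibilityExt = {
      init: function () {
        // gBrowser is the tabbed-browser element available to overlay scripts.
        gBrowser.addEventListener("DOMContentLoaded", this.onPageLoad, true);
      },
      onPageLoad: function (event) {
        var doc = event.originalTarget;        // the content document that loaded
        if (!doc || !doc.body) { return; }     // ignore non-HTML loads
        doc.body.style.fontSize = "150%";      // example transformation: enlarge text
      }
    };
    window.addEventListener("load", function () { myAccessibilityExt.init(); }, false);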
In general, Firefox extensions have access to all of the Firefox user interfaces and components. Extensions can modify the user interface by adding buttons, toolbars, and menus. They can modify fonts and colors, capture events, and change Web pages before the user sees them. Extensions have access to the Firefox XPCOM library for low-level functionality. Extensions are limited only by the developer's imagination.
3.2.2 Style Sheets
Unlike IE, Firefox has not offered a way for users to add a style sheet via a simple browser option. However, users can edit userContent.css and userChrome.css, which reside in the chrome directory of the user's Firefox profile (see User CSS). Extensions can register style sheet URIs as additional user and UA style sheets using the style sheet service available in Firefox 1.5 and later (see Using the Stylesheet Service). A relatively simple extension could be built with this mechanism to allow users to easily create and inject style sheets (a minimal sketch of the style sheet service call appears at the end of Section 3.2).
3.2.3 Themes
Firefox also provides a mechanism for users to customize the look and feel of the browser, a mechanism not available in IE. Many existing themes, and information about how to develop new themes, are available on the Mozilla Web site (see Firefox Themes; Developing Themes).
3.2.4 Plugins
Firefox also provides a mechanism for embedding content in a page. Typical examples of this are QuickTime and Macromedia Flash. These are developed using the Netscape plugin SDK and an environment that supports C/C++ and Makefiles; Microsoft Visual Studio would be one such environment. See http://www.mozilla.org/projects/plugins/.
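The style sheet service mentioned in Section 3.2.2 can be called from extension code roughly as follows. This is a sketch assuming the legacy XPCOM interfaces available to extensions at the time of writing; the chrome:// URI of the style sheet is a placeholder for a file the extension would ship.

    // Register an additional user style sheet through the style sheet service
    // (Firefox 1.5 and later). The chrome:// URI below is a placeholder.
    var sss = Components.classes["@mozilla.org/content/style-sheet-service;1"]
                        .getService(Components.interfaces.nsIStyleSheetService);
    var ios = Components.classes["@mozilla.org/network/io-service;1"]
                        .getService(Components.interfaces.nsIIOService);
    var uri = ios.newURI("chrome://myextension/content/user-colors.css", null, null);
    if (!sss.sheetRegistered(uri, sss.USER_SHEET)) {
      sss.loadAndRegisterSheet(uri, sss.USER_SHEET);  // applied to every page
    }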
3.3 Other Browsers
3.3.1 Safari
Apple's Safari browser is primarily extended by way of plug-ins, and many plug-ins are currently available. Pimp my Safari is a particularly good source of Safari plug-ins.
Apple provides the WebKit framework for developing plug-ins. This framework supports two plug-in architectures: WebKit-based and Netscape-style. The choice of framework depends on the target platform(s). Netscape-style plug-ins support a standard API, can be supported in multiple browsers, and can be deployed on cross-platform systems such as Windows and Linux. WebKit-based plug-ins can only be deployed in Mac OS X applications that support the WebKit framework. Netscape-style plug-ins are written in a cross-platform C API, while WebKit plug-ins are written using an Objective-C API and can also take advantage of Cocoa classes (see About Plugins). Apple provides two excellent resources for developing Safari plug-ins, available in the Apple developer area (see WebKit Plug-in Programming Topics; WebKit Objective-C Programming Guide). There are samples of both in the WebKit developer section. Developers would most likely use Apple's XCode tool suite to develop extensions. Safari also supports custom style sheets to modify Web pages: simply specify the style sheet to load in the Safari->Preferences->Advanced dialog.
3.3.2 Opera
The Opera browser provides a number of extension mechanisms. Style sheets and/or JavaScript files can be invoked on every page load. A style sheet can be selected for the browser using the Style option available from the Tools menu, Preferences->Advanced->Content->Style. Style sheets operate in a similar manner to the IE style sheets discussed above (see Opera Style Sheets).
User JavaScript allows the full range of JavaScript functionality. For example, events can be handled and cookies can be read. In addition, scripts on a loaded page can be rewritten, overriding a page's event handlers and changing variables and functions used by the page (see User JavaScript). A JavaScript file can be selected for the browser using the JavaScript option available from the Tools menu, Preferences->Advanced->Content->JavaScript.
Opera also supports add-ons through plug-ins, which add additional capability to the browser. Opera has standard plug-ins for objects such as Adobe Acrobat, Flash, QuickTime, and Java applets (see Opera plugins). Opera supports the Netscape 4 plug-in API. This means that Opera may support
any Netscape-compatible plug-in. Developers can use the standard Netscape Software Development Kit; by developing to a standard API, the plug-in can run in multiple browsers. Developers can use a variety of environments to develop Opera plug-ins, such as Microsoft Visual Studio on Windows, or essentially any environment that supports C/C++ and Makefiles.
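As an illustration of the Opera user JavaScript mechanism, the following sketch uses only standard DOM calls; the file name and the particular adaptation (making mouse-only controls keyboard-focusable) are our own example, not taken from Opera's documentation.

    // keyboard-access.js: an example User JavaScript file. Opera runs such files
    // on every page; this one makes elements that only react to mouse clicks
    // reachable with the Tab key as well.
    document.addEventListener("DOMContentLoaded", function () {
      var all = document.getElementsByTagName("*");
      for (var i = 0; i < all.length; i++) {
        var el = all[i];
        if (el.getAttribute("onclick") && !el.hasAttribute("tabindex")) {
          el.setAttribute("tabindex", "0");   // now focusable from the keyboard
        }
      }
    }, false);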
3.4 Impact of Browser Implementation Details Seemingly trivial differences in browser implementations can result in surprisingly large differences in the ease of creating efficient browser extensions. Consider, for example, one of the DOM transformations we have created for both IE and Firefox as part of our Web Adaptation Technology and accessibilityWorks software mentioned earlier. In our page layout transformation, we radically restructure the DOM so that multicolumn content is ‘linearized’ to be a single column. From the user’s point of view this has the advantage of eliminating the need for horizontal scrolling when very large text enlargements are applied to the page. In Fig. 3 we show a Web page with text enlargement
Fig. 3 A page with substantial text enlargement prior to linearization
Fig. 4 The same page as in Figure 3, but with page linearization
using accessibilityWorks. In Fig. 4 we show that same page with the same text enlargement after the one column transformation has been applied. In IE we coded a relatively efficient BHO in Java to perform this DOM-based linearization. In Firefox we created a comparable function using JavaScript. One might guess that the Java implementation would be considerably faster given that Java will typically run much faster than interpreted JavaScript. However, the Java implementation needed to invoke API calls to traverse and modify the DOM and those APIs were invoked by way of a COM interface. The JavaScript implementation on the other hand was interpreted within the same computational context as the rest of the Firefox browser (an attribute that results from the fact that extensions to Firefox are comparable in nearly all respects to native browser functions), leading to faster overall performance for the linearization. Another example of the impact of browser implementation choices on extension feasibility is provided by a comparison of the ease of implementing the event handling underlying text-to-speech triggering. Our user interaction model for speaking text aloud involves, among other things, the ability for the user to simply move the mouse to an area of interest and then pause. This hover starts the playback of text from a reasonable starting point in the underlying block of text. Moving the mouse outside this area of text is a way to stop the speech playback and, perhaps, start playback on another area of text.
In order to implement this interaction model we need to augment the DOM by adding SPANs to break up large blocks of text into sensible chunks for possible playback. Without SPANs we would always start from the top of the current text node (which could be quite far from the user’s point of interest). Having inserted SPANs, we then need to attach event handlers to deal with mouse events of interest. In Firefox we attach an event handler to the top-level BODY element and any FRAME or IFRAME elements. In IE we were forced to attach event handlers to every node that supports them and in which we have an interest, typically most nodes in the DOM. This more costly DOM augmentation is required because the IE event object does not contain a pointer to the DOM element that initially received the event. Firefox’s event object does contain such a pointer so we can deal with the event at the high-level node that has our handler attached. Specifically, we can initiate speech playback for just the text in that node.
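A rough sketch of the Firefox-style delegation just described follows. One pair of handlers on BODY is enough because the event object carries a pointer to the element that originally received the event; the chunk class name, the pause delay, and the speak()/stopSpeaking() stubs (which would call into the text-to-speech component) are all illustrative.

    // Hover-to-speak with a single pair of handlers on BODY (illustrative).
    function speak(text) { /* hand the text to the text-to-speech engine here */ }
    function stopSpeaking() { /* stop any current playback here */ }

    var hoverTimer = null;
    function findChunk(node) {
      // Walk up from the event target to the enclosing SPAN chunk, if any.
      while (node && node.nodeType === 1) {
        if (node.tagName === "SPAN" && node.className === "tts-chunk") { return node; }
        node = node.parentNode;
      }
      return null;
    }
    document.body.addEventListener("mouseover", function (event) {
      var chunk = findChunk(event.target);   // event.target: the original receiver
      if (!chunk) { return; }
      clearTimeout(hoverTimer);              // restart the "pause" timer
      hoverTimer = setTimeout(function () { speak(chunk.textContent); }, 800);
    }, false);
    document.body.addEventListener("mouseout", function (event) {
      clearTimeout(hoverTimer);
      stopSpeaking();                        // leaving the area stops playback
    }, false);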
4 Future Directions
It should be clear by now that browsers are not merely standalone applications but platforms for the creation of new functionality through browser extensions. As noted previously, no single browser, unmodified, provides all the features desired by every user. The types of browser augmentation described in this chapter enable new features to be added to browsers.
While we have focused on browser technology, Web content and applications no longer need to be tied to the browser. Browser engines are being used in standalone applications that can run on the desktop. For example, XULRunner uses the Gecko engine to build standalone desktop applications that have all the richness of the Web. Widgets, such as those of Opera and Yahoo, provide another option. Opera, for example, uses its rendering engine in its widget technology to provide desktop widget applications.
Given the need for an accessible Web, one might think that browsers would evolve over time to incorporate all needed accessibility options into the browser itself, thus obviating the need for accessibility extensions. While it is true that many of the options needed by users are available in modern browsers, the needed options are not completely available in existing browsers. Moreover, the seemingly inevitable complexity of browser interfaces tends to hide features, making accessibility features available only to experienced users or to users supported by sophisticated support personnel. Many older and disabled users, however, are neither experienced nor well supported. They are not likely to discover accessibility features buried in complex browser interfaces. Sa-nga-ngam and Kurniawan (2007) found, for example, that items hidden in menus were extremely difficult for older adults to find. In comparing users performing tasks with IE7 and Firefox 2.0, they found longer times and fewer successful task completions for these users when features were hidden in menus. The streamlined
default IE7 install that hides much functionality, therefore, was more difficult for these older adults to use than Firefox 2.0. Although not tested with younger users, it is likely that the results obtained by Sa-nga-ngam and Kurniawan for older adults would generalize to many other users as well, whether disabled or not.
We suspect that even if found on toolbars or within menus, some accessibility features are simply too complex to either understand or set (e.g., having to manually configure a suitably contrasting combination of text foreground, text background, link, visited link, and hover colors). For users with disabilities, moreover, the consequences of not knowing what to do can be significant, ranging from inefficient browser use to simply giving up on using the resources of the Web.
The point here is that even when browsers include all needed accessibility features, they may not expose them in a manner easily used by those who need them (see Trewin 2000). For that reason, various accessibility browser augmentations necessarily repackage functionality, such as color changes, already included in the browser itself (see, for example, the Firefox Accessibility Extension; Hanson et al. 2005; Muta, Ohko and Yoshinaga 2005).
5 Our Opinion of the Field
We have argued previously that the economics of content development will continue to favor an approach based on automatic content transformation (generally browser-based transformations) rather than an approach which presumes that new, more accessible content will be created (Richards and Hanson 2004). This is based on a few considerations. The cost of creating accessible Web content can be substantial, especially if developers have to go back and re-work legacy content. Moreover, the perception that accessibility addresses only a small segment of the market has suggested to Web site owners that Web site accessibility will not translate into large financial gains. The W3C and others have pointed to the benefits of accessible Web sites, but it is only recently that some Web site owners have reported financial benefits of such development that outweigh the costs (Sloan 2006).
As we have discussed here, however, an 'accessible' Web site, as defined by meeting these guidelines, may not be accessible to the many users whose needs fall outside those encompassed in content guidelines. In these cases, other approaches must be used. Some of this is handled by the User Agent Accessibility Guidelines. In other cases, browser augmentation is important.
Depending on specific requirements, browser extensions can have advantages for accessibility transformations over other technologies. Transcoding using a proxy server is another method of re-writing Web pages to be accessible (see Transcoding). In our earlier work with proxy servers, however, we found that
we were unable to meet our users' requirements using this approach. Three specific areas of difficulty were identified (Hanson and Richards 2004).
First, the proxy server often made transcoding errors or omissions, hardly surprising given the complexity of the task that browsers face (rendering complex and often malformed HTML, JavaScript, cascading style sheets, and a large and growing number of third-party plug-ins). In addition, different browsers render content differently, which was hard to anticipate on the server.
Second, secure sites transmitted content to the browser in an encrypted form. A proxy cannot see and modify the content flowing through a Secure Sockets Layer (SSL) connection unless that connection is mediated by a pair of SSL sockets at the proxy itself. Through this mechanism it is possible to see and modify the content on the proxy. Performing such an operation, however, violates the end-to-end security expected of secure connections and can only occur if the user proceeds after being warned of a serious security violation.
Third, there was often a problem in simply enabling a connection to the proxy server. In order to set a browser to go through a proxy, the HTTP proxy must be set on the browser itself. This is not always possible, since proxies cannot be cascaded. Thus, users who already had a proxy set, such as for some corporate firewalls and some ISPs (e.g., AOL), were prevented from connecting to a remote proxy.
In addition to these three overarching issues, proxies also pose potential copyright problems, since they transform and serve content not owned by the proxy's owners. Transcoding also poses a general problem of scalability in production systems, since so much content needs not only to flow through them but also to be non-trivially transformed as it flows.
Of course, browser extensions are not without their downsides. The largest of these is the requirement that users download and install some software. For experienced computer users, this may not be a problem. For inexperienced users, however, even seemingly simple downloads and installations can be daunting.
6 Conclusions
Browser extensions can provide important added functionality for browsers. Whether this mechanism or another option will provide the best means of accomplishing a task will depend on the project goals. We have covered here some of the accessibility extensions that have been developed. Given the open computing environment currently enjoyed, it can be anticipated that many other such extensions will be developed and made available to the accessibility community.
References
Asakawa, C. & Itoh, T. (1998). User interface of a Home Page Reader. In Proceedings of the Third International ACM Conference on Assistive Technologies (Marina del Rey, California, United States, April 15-17, 1998). Assets '98. ACM Press, New York, 149-156.
Hanson, V. L., Brezin, J., Crayne, S., Keates, S., Kjeldsen, R., Richards, J. T., Swart, C., & Trewin, S. (2005). Improving Web accessibility through an enhanced open source browser. IBM Systems Journal, 44(3), 573-588.
Hanson, V. L. & Richards, J. T. (2004). A Web Accessibility Service: An Update and Findings. Proceedings of the Sixth International ACM Conference on Assistive Technologies, ASSETS 2004. New York: ACM, 169-176.
Hanson, V. L., Snow-Weaver, A., & Trewin, S. (2006). Software personalization to meet the needs of older adults. Gerontechnology, 5(3), 160-169.
Muta, H., Ohko, T., & Yoshinaga, H. (2005). An ActiveX-based accessibility solution for senior citizens. Proceedings of the Center On Disabilities Technology And Persons With Disabilities Conference 2005. Available at http://www.csun.edu/cod/conf/2005/proceedings/2227.htm
Roberts, S. (1999). Exploring Microsoft Internet Explorer 5. Microsoft Press: Redmond, WA.
Richards, J. T. & Hanson, V. L. (2004). Web accessibility: A broader view. In Proceedings of the Thirteenth International ACM World Wide Web Conference, WWW2004, 72-79.
Sa-nga-ngam, P. & Kurniawan, S. (2007). An investigation of older persons' browser usage. Proceedings of HCI International, Universal Access in HCI, Volume 5. Beijing, China, July 22-27, 2007. Springer.
Sloan, D. (2006). The Effectiveness of the Web Accessibility Audit as a Motivational and Educational Tool in Inclusive Web Design. Ph.D. Thesis, University of Dundee, Scotland. June, 2006.
Trewin, S. (2000). Configuration agents, control and privacy. In Proceedings of CUU'00 (Arlington, VA, November 2000), ACM Press, 9-16.
Zajicek, M., Powell, C., & Reeves, C. (1998). A Web navigation tool for the blind. In Proceedings of the Third International ACM Conference on Assistive Technologies (Marina del Rey, California, United States, April 15-17, 1998). Assets '98. ACM Press, New York, 204-206.
Transcoding Chieko Asakawa and Hironobu Takagi
Abstract 'Transcoding for Web accessibility' is a category of technologies for transforming inaccessible Web content into accessible content on the fly. It was invented to help people with disabilities access inaccessible Web pages without asking the content authors to modify their pages. It does this by converting the content on the fly in an intermediary server between the Web server and the Web browser. The technology has matured along with voice browsing technology since circa 1992. In this chapter, we first cover the history of the transcoding technologies and then introduce technical details of these transcoding systems. Finally, we discuss future directions and technical problems.
1 Introduction
Table 1 lists major transcoding systems. This list is not comprehensive, but it covers the major types of historical and current transcoding systems. In this section, we briefly look back at the history of transcoding for Web accessibility by introducing these systems.
In order to look back at the history, we need to follow two technology streams in the 1990s: Web accessibility technologies and transcoding technologies. These two types of technologies yielded a new category of technologies by 2000. We briefly introduce these two streams.
The World Wide Web (aka the Web) was invented in 1992, and it quickly spread all over the world. Over the next decade, Web accessibility technologies appeared and matured. From the beginning of Web accessibility efforts, content transformation was a central topic in making general Web content accessible, especially for blind users.
Lynx, a text-based Web browser developed in 1992, was one of the earliest non-visual Web access systems [Lynx]. It has a function to convert pages written in the HyperText Markup Language (HTML) into text-only presentations on
Table 1 Transcoding systems. Each entry lists, as far as recorded, the year and name; developer/organization; final status; where the transformation occurs; input and output formats; main target users; main methods and metadata; and references.
1992 Web BBS gateway; Asahi Net; commercial; intermediary (BBS server); HTML to terminal text; general users; serialization and link numbering; Asakawa (2005)
1997 IBM Home Page Reader; IBM; commercial; client side; HTML to voice; blind users; table header inference, alt-text inference, etc.; Asakawa (1998a,b, 1999), Laws (1999)
1998 HTML-to-VoiceXML converter; Siemens; intermediary (VoiceXML server); HTML to VoiceXML; general users; segmentation; Goose et al. (1998)
1998 BETSIE; BBC; public (open source); server side; HTML to HTML; blind and low-vision users; content reordering; specialized for the BBC site
1998 Access Gateway; public (open source); intermediary (Web server); HTML to HTML; blind and low-vision users; serialization, etc.
2000 Aurora; IBM (Almaden Research); intermediary (proxy server); HTML to HTML
2000 Accessibility transcoding; IBM (Tokyo Research); public; intermediary (proxy server); HTML to HTML; blind users; combined automatic and annotation-based (metadata) transcoding
2001 ITry/LYCOS transcoder; IBM Japan and LYCOS Japan; commercial; server side; HTML to HTML; blind users and senior citizens
2003 LIFT text transcoder; UsableNet; commercial; intermediary (Web server); HTML to HTML; text only (compliance); automatic text-only conversion and serialization, with XSLT-based external annotation
2004 HearSay; State University of New York at Stony Brook; public (open source); intermediary server; HTML to VoiceXML; blind users; automatic segmentation; Ramakrishnan (2004), Borodin (2006, 2007)
2004 WebAdapt2Me; IBM (T.J. Watson Research); commercial; client side (browser plug-in); HTML to HTML; low-vision users and senior citizens; magnification and device adaptation; Hanson (2001, 2004), Richards (2004), WA2M
2004 Dante; University of Manchester; intermediary; HTML to HTML; blind and low-vision users; table of contents, simplification, etc., with external annotation; Yesilada (2003), Plessers (2005)
2006 WebInSight; University of Washington; intermediary; HTML to HTML; blind and low-vision users; insertion of alternative texts through external annotation and automatic analysis; Bigham (2006, 2007a)
2006 SADIe; University of Manchester; intermediary (proxy server) and client side; HTML to HTML; blind and low-vision users; simplification, reordering, etc., using CSS (class and id) and external annotation; Harper (2005a, 2006a), Bechhofer (2006)
2007 aiBrowser for Multimedia; IBM (Tokyo Research); public (open source); client side; DHTML and Flash to text (tree); blind users; insertion of alternative texts through external annotation; Miyashita (2007), Sato (2007)
the client side, and it allows users to navigate in the content by pressing the cursor keys. For example, users could move to the next or previous link texts with the arrow keys. This allowed blind users to access the Web with a DOS screen reader, using telnet to a UNIX server. For some years, such 'serialization of content' and 'text-only conversion' were the basic functions of transcoding systems.
In the same year (1992), a Japanese text-based Bulletin Board System (BBS) provider started a text-based Web browsing service through their BBS service (Asakawa 2005; Fig. 1). This system was similar to Lynx, but the transformation was done on the server side, acting as an intermediary in their BBS server. Whenever a user accessed a page, the BBS server obtained the target page from the Web server and transformed it into the text format, assigning sequential numbers to each link. Users were required to remember a target's link number and input it into the command line of the BBS system to follow that link. Blind users could use the BBS system with a DOS screen reader, which meant they could access the Web non-visually. This system can be regarded as one of the earliest server-side transcoding systems providing practical non-visual access.
In the mid-1990s, the focus of Web access systems shifted from transformation to screen reading. In those days, the functionality of Web browsers was evolving rapidly because of fierce competition among browsers, later called the 'Browser War'. Tracking these improvements, screen readers were also updated frequently to read text on various Web browsers. Screen reading is an approach that reads information on the screen 'as it is'. This approach is important for giving blind users equal access to the information on the screen, but meanwhile Web content was becoming much more visual, with two-dimensional layouts and embedded rich media. In addition, e-business appeared using HTML forms, and these forms were scarcely supported by screen readers at that time.
Fig. 1 Simulated screen shot of the BBS-based Web access system (authors' re-creation)
In the late 1990s, stand-alone voice browsers were developed to make non-visual Web browsing much easier by integrating content transformation and optimized key operations (Asakawa 1998a, 1998b, 1999, DeWitt 1998). IBM Home Page Reader (HPR) was developed and became an official product in 1997 (Laws 1999, Asakawa 2005). It was a stand-alone browser with its own HTML parser, transformation engine, and custom key combinations, and it included various transformation functions. For example, if an image link did not have a corresponding alternative text, HPR picked some part of the destination Uniform Resource Identifier (URI) to give an idea of the destination. HPR also analyzed complicated tables and automatically inferred which cells were the table headers (Asakawa 1999), and then allowed users to jump dynamically to these headers with table navigation keyboard commands. These transformation functions worked well to improve the usability of Web access in combination with advanced non-visual rendering functions, such as using a female voice for clickable elements, and sound icons and slower reading for headings. These early Web accessibility technologies can be regarded as transcoding systems both on the client side (HPR) and on a server-side intermediary (the BBS).
Meanwhile, transcoding technologies continued to evolve. Transcoding is a general concept of transforming content or a program on the fly in an intermediary server, resulting in other formats. The initial targets in the 1980s were programs and encoded media content; in those cases, the original 'transcoding' stood for 'transformation of machine code' or 'transformation of media encodings'. Along with the rapid spread of the Web, the concept and the term 'transcoding' were soon applied in a broader sense to the transformation of Web content written in HTML. This was at the same time as the Web accessibility technologies were developing (in the 1990s). Initially, the main target was mobile devices (Bickmore 1997, Hori 2000, Buyukkokten 2000). The approach was applied to make Web content 'adaptive' for mobile devices using various methods, such as simplification, fish-eye rendering, optimized navigation, and so on. This was the beginning of transcoding for adaptation, but for diverse devices rather than for diverse users.
In the late 1990s, the 'adaptation of devices' shifted to the concept of 'adaptation to users'. In 1997, Barrett et al. (1997) developed a Web personalization system using an intermediary transcoding approach. Their Web Intermediaries (WBI) was a framework for developing transcoding systems, and they developed personalization functions, such as a personal history, shortcut links, page watching, and Web traffic lights, on top of the framework (Maglio 2000, WTP). The key innovation of this system was the profile repository (the 'user model' in the paper), which supported content adaptation for each user. This idea directly inspired the idea of personalization (adaptation) for people with disabilities.
From that time (circa 1998), Web pages were becoming increasingly visual. Web designers and content owners tended to lay out various kinds of information in one page with various types of visual effects. This trend made Web access more difficult for users of screen readers or voice browsers. The effectiveness of non-visual Web access was declining, and many blind users became
discouraged with Web browsing even though the number of sighted Web users was increasing explosively. Web accessibility transcoding systems were invented to reverse this trend by transforming inaccessible content on the fly to make it more accessible.
BETSIE (Betsie) and the Access Gateway (WAG, Brown 2001) were among the earliest practical transcoding systems for Web accessibility. BETSIE, developed by the BBC in 1998, was a transcoding system used on the BBC site to automatically create a text-only version of its Web site. Surprisingly, the system is still in use (2007), almost 10 years after its introduction. It can create pages optimized for visually impaired users by moving the main content to the top of the page, and it changes colors to high-contrast color sets. The system exploits specific characteristics of the template of the BBC Web site, and therefore it can only handle pages from that site.
The Access Gateway, developed in 1998, is a transcoding server for general Web pages. A user can get a personalized Web page by entering the URI of the target page and then pressing the 'Get page' button. It has a preference page for setting detailed options, such as font sizes, image replacement, color scheme changes, and various user-specific controllable options. This system was the first transcoding server available to the public and capable of handling any page on the Internet.
Based on this research and these developments, Asakawa (2000) and Takagi (2000) developed an accessibility transcoding proxy in 2000. The system had two major features: a user profile repository and a combination of automatic transcoding and annotation-based transcoding. The user profile repository allowed each user's profile to be stored on the server side and used to provide comprehensive adaptations for that user. There is a need to transform content automatically in order to handle arbitrary pages on the Internet. However, automatic transformation has clear limitations. Alternative texts are one example: we still lack any technology that can generate appropriate alternative texts for each image. We developed some heuristics, such as adding the title of the target page to an image, but the heuristics were flawed, both quantitatively and qualitatively.
Annotation-based transcoding is a method to address some of these problems. It transforms content by referring to manually created external metadata, which adds the missing semantic information needed for appropriate transformations of a target page. In order to make transcoding effective, especially for blind users, it is necessary to transform content drastically but accurately, and the external annotation approach supports this level of transcoding. The most important drawback of this approach is the workload of creating annotations for pages. We will discuss this topic in Section 4.1.
At the same time, two other accessibility transcoding systems were developed within IBM: one in the Watson Research Laboratory and one in the Almaden Research Laboratory. Watson's system focused on adaptations for senior citizens and was first developed as a server-side proxy (Hanson 2001), before moving to client-side transcoding (Hanson 2004, Richards 2004). The system
was productized and has been deployed at various sites (WA2M). The system from Almaden Research (Huang 2000a, Huang 2000b) focused on the simplification of e-business applications, such as auctions and search engines. It used precise annotation and first transformed the Web pages into semantically structured XML documents; the pages for users were then generated from the XML documents.
Such research established the area of accessibility transcoding as a part of Web accessibility research, and various types of research started based on these results. Dante (Yesilada 2003, Plessers 2005) is an annotation-based transcoding system characterized by a metaphor for non-visual navigation, called the travel metaphor (Goble 2000). The authors created a taxonomy for non-visual navigation based on this metaphor, covering directions, navigation points, travel assistance, decision points, and reference points. Since it takes time to navigate among the component items in a page using non-visual key navigation, the travel metaphor is a powerful way to establish a well-designed taxonomy.
SADIe (Harper 2005a, Harper 2005b, Bechhofer 2006, Harper 2006a, Harper 2006b) has both automatic (rule-based) and annotation-based (semantic) transcoding functions, and it is characterized by its use of 'inlined' metadata in the form of Cascading Style Sheet (CSS) information. CSS is the standard mechanism for formatting and laying out Web pages by adding styling and layout attributes to HTML elements, referenced through identification information such as 'id' and 'class' attributes. SADIe uses this identification information as 'inlined metadata' for distinguishing the semantics of partial content in a page. The effective utilization of internal metadata will be an important research topic for making transcoding more feasible in real environments (see Section 4.1).
HearSay (Ramakrishnan 2004, Borodin 2006, 2007) is an automatic transcoding system from existing Web pages to VoiceXML (W3CVoice). The system is characterized by its automatic segmentation algorithm, which can avoid the use of annotation. It has functions to analyze the visual structure of a page based on the Document Object Model (DOM [DOM]) tree structure (using the HTML tags) of the page. Goose et al. (1998, 2000) is one of the earliest studies in this category, and Shao and Capra (2003) tried to apply annotation-based transcoding to HTML-to-VoiceXML transformation.
Transcoding technology was also adopted by Web accessibility service businesses, which became active after the U.S. Section 508 (Section 508) took effect. In 2003, UsableNet Inc. started a transcoding service, 'LIFT Text Transcoder'. This is a service to generate 'text-only pages', without modifying the original content, based on transcoding techniques. Their approach is a standard one for transcoding systems, with basic transformations that can automatically create text-only and serialized pages. To compensate for the limits of automatic transformation, an XSLT-based annotation can be used. This covers various functions, such as reordering of content, adding alternative text to images, and adding heading tags to plain texts. They sell services to create annotations and to transform customers' content into accessible text-only pages.
As the years pass, transcoding technologies are becoming more ubiquitous in Web systems, both on the client side and the server side. On the server side, many Web sites and content management systems provide functions to personalize features such as the colors, layouts, and font sizes of pages through a settings panel (e.g., Rainville-pitt 2007). Some browser plug-ins have been developed to enable client-side transcoding. Greasemonkey (Greasemonkey) is a popular plug-in for the Firefox browser that allows people to create transcoding functions on the client side; transcoding functions can be integrated easily by writing simple JavaScript programs (a minimal example of such a user script appears at the end of this section). This plug-in is not used only for accessibility, but various scripts to improve accessibility have been developed on the framework. Accessmonkey (Bigham 2007b) is a similar Firefox plug-in but focuses on accessibility purposes.
Some client-side assistive technologies for senior citizens are available on the market, such as WebAdapt2Me and the EasyWeb Browsers, both from IBM. They have transcoding functions to change color schemes and to magnify fonts, and they also have text-to-speech services tied to mouse operations. Transcoding functions have also been added to assistive technologies for visually impaired people. Jaws (Jaws), the most popular screen reader, has a function to serialize the current Web page; it serializes layout tables and makes non-visual navigation simpler for screen reader users.
The most recent challenge in the Web accessibility field is the accessibility of dynamic Web content, such as DHTML (simulated graphical user interfaces on a Web page using JavaScript) and Flash (a widely used animation format developed by Adobe Systems). At this moment, no general methodology has been widely accepted for making such content accessible, and so accessibility is rarely taken into account for this content. Examples include buttons that are not operable with a keyboard and cases where the reading order of elements does not support understanding the meaning of the content. Sato's Flash transcoding system (Sato 2007) is an example of an attempt to transcode dynamic content for accessibility. The system has functions to associate the most probable text object with a button as its alternative text and to make inaccessible buttons accessible, since some visible buttons are not presented as buttons to screen readers. Miyashita (2007) developed an annotation-based client-side transcoding system for dynamic content, such as DHTML and Flash. In this system, metadata can be regarded as a transformation language for transforming a dynamic XML object model structure into a simple tree structure for non-visual access. This process is done completely on the client side.
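As an example of the kind of user script mentioned above, the following Greasemonkey-style script performs a tiny client-side transcoding step, inserting a skip link to the first heading of a page. The metadata values are placeholders; only the block format (==UserScript== ... ==/UserScript==) comes from Greasemonkey itself.

    // ==UserScript==
    // @name       Skip to main content
    // @namespace  http://example.org/userscripts
    // @include    *
    // ==/UserScript==
    // Inserts a "skip to main content" link at the top of every page that has
    // an <h1>, a small example of client-side accessibility transcoding.
    (function () {
      var h1 = document.getElementsByTagName("h1")[0];
      if (!h1) { return; }
      if (!h1.id) { h1.id = "main-content"; }
      var link = document.createElement("a");
      link.href = "#" + h1.id;
      link.textContent = "Skip to main content";
      document.body.insertBefore(link, document.body.firstChild);
    })();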
2 Methods
Throughout the history of transcoding technologies, various types of transformation functions were developed. In this section, we will give an overview of these transformation methods.
2.1 Text Magnification
Text magnification is the most common adaptation method for Web pages for sighted users, and it has become ubiquitous in modern browsers. It can be applied for a wide variety of users, such as senior citizens with mild vision problems or cataracts, or people with poor eyesight. Even for people with good eyesight, it helps when their screen resolution is too fine for Web pages. (Of course, magnification does not provide any benefit for blind users.)
Currently, this method is popular both for server-side systems and for client-side systems. Many content management systems provide personalization functions for font sizes, and many major Web sites offer such personalization to give users a preferable impression of their sites. Web browsers have also been improved to include this function: Firefox has an unlimited magnification function, so users can magnify pages whenever they want, even if a site does not provide a personalization function. We could say that this function has graduated from being a transcoding feature and become a mandatory function in Web systems.
2.2 Color Scheme Changes
Color scheme changes are beneficial for people with certain eye conditions, such as cataracts, glaucoma, or color-vision deficiency. Since our society has many people with these conditions, this is one of the major methods used by transcoding systems. For HTML content, it can be implemented simply by changing the Cascading Style Sheet (CSS) for the page (Iaccarino 2006a, etc.). CSS properties can be overwritten by external style sheets, so it is easy to change colors from the default to some other color scheme.
Another technique is image-processing-based adaptation (Nam 2005, Iaccarino 2006b, etc.). Using image-processing techniques, it is possible to optimize the visual presentation of bitmap images for each condition, such as cataracts or color-vision deficiencies, by using specific color schemes. For example, Iaccarino (2006b) is a method that shifts a color range to an ideal range for people with color-vision deficiency: sets of confusing colors, such as reds and greens, are shifted on the fly to other colors that can be discriminated. This type of precise color scheme adaptation provides great benefits to users, but many issues have to be addressed to make it practical, especially performance and scalability.
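A minimal client-side sketch of the CSS-override idea from Sections 2.1 and 2.2 follows. The profile object and the color values are illustrative; a real system would read them from a user profile repository.

    // Build and inject an override style sheet from a (hypothetical) user profile.
    function applyProfile(doc, profile) {
      var css =
        "body { font-size: " + profile.fontScale + "% !important; } " +
        "body, p, td, li, div { background: " + profile.background + " !important;" +
        " color: " + profile.foreground + " !important; } " +
        "a { color: " + profile.linkColor + " !important; }";
      var style = doc.createElement("style");
      style.textContent = css;
      doc.getElementsByTagName("head")[0].appendChild(style);
    }
    // Example: large text with a high-contrast color scheme.
    applyProfile(document, { fontScale: 180, background: "#000000",
                             foreground: "#ffff00", linkColor: "#00ffff" });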
2.3 Serialization
Serialization is a method to remove HTML tags that are used only for layout purposes, such as layout tables, and to generate serialized content (e.g., Brown
2001). Serialized content is beneficial both for blind and for low-vision users. Each voice browser has table navigation functions that allow users to navigate two-dimensional tables using the directional cursor keys (HPR). Blind users can understand the structure of a data table by using this navigation function. In contrast, some tables are used merely for layout purposes, such as aligning content horizontally in table cells with an invisible border. These 'layout tables' are generally regarded as a misuse of tables, since layout should properly be controlled by style sheets, but they are very common. Layout tables interfere with table navigation: for example, if a data table is contained within a layout table, the voice browser cannot distinguish between the data table and the layout table, and it tries to verbalize the current, complicated location (including the cell numbers) for the layout table as well. Serialization transcoding addresses this issue by eliminating the layout tables.
For low-vision users, serialization is beneficial because it eliminates troublesome horizontal scrolling when a page is magnified. If the page was designed using CSS functions, it can easily be serialized by simply disabling the style sheets. However, if layout tables are used in the page, it is necessary to scroll both vertically and horizontally to see the full content. Vertical scrolling can be done with the usual page up/down operations, but a need for horizontal scrolling clearly lowers the usability.
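The following sketch illustrates the serialization idea with a deliberately simplistic heuristic: tables containing no TH cells are treated as layout tables and replaced by their cell contents in document order. Real transcoders use more careful layout-table detection; the heuristic and function name here are our own.

    // Replace presumed layout tables (no <th> cells) with their cell contents,
    // serializing the page into a single column. Heuristic and illustrative only.
    function serializeLayoutTables(doc) {
      var tables = Array.prototype.slice.call(doc.getElementsByTagName("table"));
      tables.forEach(function (table) {
        if (table.getElementsByTagName("th").length > 0) { return; }  // keep data tables
        var container = doc.createElement("div");
        var cells = table.getElementsByTagName("td");   // live list, shrinks as cells move
        while (cells.length > 0) {
          var block = doc.createElement("div");
          while (cells[0].firstChild) { block.appendChild(cells[0].firstChild); }
          container.appendChild(block);
          cells[0].parentNode.removeChild(cells[0]);
        }
        table.parentNode.replaceChild(container, table);
      });
    }
    serializeLayoutTables(document);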
2.4 Alternative Text Insertion
'Alternative text' is the concept of adding short descriptions to non-text objects in a page. This is critical to allow blind users to access Web content, since images without alternative text are fundamentally unrecognizable for voice access users. The use of alternative text is one of the most important concepts in making Web content accessible.
It is technically hard to detect appropriate alternative text for arbitrary images automatically, even when using optical character recognition (OCR) techniques. Even when an image contains text, it is usually highly decorated, which is why an image was used for that text in the first place. For other images, such as icons, there is no text and it is difficult to derive a description from the image automatically. Every screen reader has a function to read part of a URI if no other alternative text is assigned to an image with a link. The early versions of HPR read the last two 'words' in the URI (ignoring the file extension). For example, if the URI linked to an image is 'http://www.example.com/news/articles/images/new.gif', then it is read as 'images new'. This is a primitive but general heuristic method to cope with the problem of missing alternative text.
The annotation-based method was invented to augment pages by providing accurate but manually created alternative text (Dardailler). Annotation is a type of metadata used for transcoding (see Section 4). The transcoder has a
repository of alternative text. Alternative text is indexed by URI and image file name in the repository; the transcoding system automatically retrieves an appropriate text by using the URLs and filenames as keys, and then assigns the proper text to each image. The essential drawback of this method is that a human annotation author must write the annotations manually. The workload of annotation authoring has prevented this approach from being widely used in practical environments.
WebInSight (Bigham 2006, 2007a) is a transcoding system focused on inserting alternative texts. It is characterized by combining three different methods: context labeling, OCR image labeling, and human labeling. Context labeling is a method to get appropriate texts from linked pages for image links without alternative texts; there is an empirical rule that the linked page's title or headings are often appropriate for the alternative text, and the system applies this rule in the transcoding. OCR image labeling is a method to derive alternative texts by using OCR techniques; the authors optimized an OCR engine for detecting alternative texts and achieved 65% accuracy. Even though this recognition rate is low, the recovered text is a great help to blind users. At the same time, it is desirable to provide more accurate texts, and therefore the system also provides a method for human labeling, which is an annotation-based transcoding method. Because of the importance of alternative texts, these transcoding methods will continue to be improved.
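A rough client-side sketch of the repository lookup combined with the URI heuristic described above follows. The altRepository dictionary and its single entry are hypothetical stand-ins for a real annotation repository.

    // Assign alternative text to images from a (hypothetical) annotation
    // repository keyed by image URL, falling back to the URI heuristic
    // (last two path segments, file extension dropped).
    var altRepository = {
      "http://www.example.com/news/articles/images/new.gif": "New article"
    };
    function uriHeuristic(src) {
      var parts = src.split("?")[0].split("/").filter(function (p) { return p; });
      var lastTwo = parts.slice(-2).join(" ");
      return lastTwo.replace(/\.[a-z0-9]+$/i, "");   // drop the file extension
    }
    function insertAltTexts(doc) {
      var images = doc.getElementsByTagName("img");
      for (var i = 0; i < images.length; i++) {
        var img = images[i];
        if (img.getAttribute("alt")) { continue; }    // respect authored alt text
        img.alt = altRepository[img.src] || uriHeuristic(img.src);
      }
    }
    insertAltTexts(document);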
2.5 Page Rearrangement
Page rearrangement is a method to change the layout of a page so that it is suitable for voice browsers or for magnification (e.g., Takagi 2002, Harper 2006b). The trend in Web authoring is to present various types of information in one page. For this purpose, various types of visual effects, such as background colors, layout tables, spacing, or horizontal lines, are used to visually separate the components of the content. Each component has a 'role' in the page, such as the header or footer of the page, index list, advertisement, main content, shopping list, and so on. These components and their roles are easily recognized visually at a glance. For example, if a page has a header at the top and an index at the left, then the main content area is usually at the center of the page, logically after the header and the index. This means that non-visual users must skip over that unnecessary information with navigation keys in order to reach the main content area, while sighted users can skip those components merely by moving their eyes to the main content at once.
Page rearrangement solves this issue by making the visual components non-visually distinguishable for voice browser users, or by providing semantically organized serialization for magnifier users. BETSIE (Betsie) was the first system that could rearrange the order of components to be suitable for voice browsers. It moves the main content to the top of the page, before the
header or index, and thus voice browser users and magnifier users can immediately access the main content. BETSIE is an automatic transcoder, but it is specialized for the BBC site; it therefore recognizes the main content area using site-specific hints, such as the size of a table. These hints and rules can be regarded as metadata.
Our transcoding method does full-page rearrangement based on manually created annotation information. Figure 2 shows an example of the transformation. First, the system retrieves the corresponding annotation data from the annotation database (see Section 4).
Fig. 2 Example of page rearrangement, showing the original and the rearranged versions of a page (excerpt from Takagi 2002)
Then it rearranges the page by referring to the roles and the importance values assigned to each component. The order of arrangement is determined by the importance values, so that the main content is moved to the top and non-essential information (e.g., advertisements) is moved to the bottom. In addition, the system inserts delimiter text to show the borders of the components, based on the title information included in the annotation, and it also inserts a page index at the top of each page for jumping directly to each component. This type of annotation-based rearrangement can provide highly accessible Web content, and it can also be applied to generate pages for telephone access. However, the workload of annotation authoring is a problem, which is why many page segmentation algorithms have been proposed. We discuss the issue of the workload for annotation authoring in Section 4.
Various segmentation methods have been invented to realize automatic reordering and segmentation-based navigation support (Goose et al. 1998, Buyukkokten 2001, Fukuda 2003, Ramakrishnan 2004, etc.). The technique is a challenging research topic, since it is necessary to partition pages into sub-contents with semantic accuracy in order to realize meaningful support for non-visual access; if a detected boundary is shifted by even one element, the result can be severely impacted. CSurf (Mahmud 2007a,b) is a tool that supports non-visual navigation by using automatic context detection without fully reordering pages. This approach is not severely impacted by the fine-grained accuracy of the automatic detection algorithm. The system shows that it is important not to apply a reordering method directly on top of an automatic segmentation method, but instead to invent new types of navigation support based on automatic segmentation algorithms.
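The following sketch makes the annotation-driven rearrangement concrete. Here the annotations map a CSS selector (standing in for the XPath-style component pointing used by real systems) to a role and an importance value; the selectors, roles, and values are illustrative only.

    // Rearrange annotated components by importance (illustrative annotations).
    var annotations = [
      { selector: "#main",  role: "main content",  importance: 10 },
      { selector: "#index", role: "index",         importance: 5  },
      { selector: "#ads",   role: "advertisement", importance: 1  }
    ];
    function rearrange(doc) {
      var placed = annotations
        .map(function (a) { return { node: doc.querySelector(a.selector), ann: a }; })
        .filter(function (x) { return x.node; })
        .sort(function (a, b) { return b.ann.importance - a.ann.importance; });
      var container = doc.createElement("div");
      placed.forEach(function (x) {
        var heading = doc.createElement("h2");   // delimiter announcing the role
        heading.textContent = x.ann.role;
        container.appendChild(heading);
        container.appendChild(x.node);           // moves the component
      });
      doc.body.insertBefore(container, doc.body.firstChild);
    }
    rearrange(document);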
2.6 Simplification
Page simplification is a method that presents users with only the important parts of a page by eliminating the nonessential parts. As mentioned in the rearrangement section, each page has various types of components. While rearrangement retains all of the content, simplification removes unnecessary components from the target page and allows users to access only the important or interesting parts. This method is also called page clipping. It is popular in transcoding systems for mobile devices, since these devices have only small screens and the method is especially suitable for small displays.
The most popular method for simplification is annotation-based transformation. For rearrangement, annotation authors must describe all of the components in the page; in contrast, for simplification, the annotations only need to describe the components that are to be preserved or removed. This characteristic lowers the cost of annotation authoring and makes simplification more cost effective than page rearrangement.
In addition to annotation-based transcoding, various types of automatic transcoding methods have been developed. Page segmentation algorithms can be applied to simplify a page if they have a function to detect the important components in a page. Our method (Takagi 2000) is one example (see Section 4). It is based on a differential analysis between two HTML documents, using their DOM structures. The basic assumption is that the most important component is the most unique component, the one that is not included in any other page. Therefore, if all of the duplicated components are removed, the remaining components should be the unique and important parts of the page. For example, the header, footer, and index are also included in neighboring pages, so they can be eliminated.
One drawback of this method is the concern about losing important content. Ideally, the same information available to sighted users should be presented non-visually, even after the transformations. Especially for compliance with accessibility regulations, it is necessary to preserve all of the content in each page. The lost material will also interfere with the business models of many sites by removing their advertisements.
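A very rough sketch of the differential idea, simplified from the method described above, follows: it compares the text of top-level blocks of the target page with the text of a neighboring page and drops blocks whose text also appears in the neighbor. The real method works on DOM sub-trees rather than raw text; the function name is our own.

    // Differential simplification (rough sketch): remove top-level blocks whose
    // text also appears verbatim in a neighbouring page of the same site.
    function simplify(targetDoc, neighbourDoc) {
      var neighbourText = neighbourDoc.body.textContent.replace(/\s+/g, " ");
      var blocks = Array.prototype.slice.call(targetDoc.body.children);
      blocks.forEach(function (block) {
        var text = block.textContent.replace(/\s+/g, " ").trim();
        // Blocks duplicated on the neighbouring page (headers, footers, menus)
        // are assumed to be shared page furniture and removed.
        if (text.length > 0 && neighbourText.indexOf(text) !== -1) {
          block.parentNode.removeChild(block);
        }
      });
    }

In practice the neighboring page would be fetched and parsed separately (for example with an XMLHttpRequest and an HTML parser), and thresholds would be needed so that short, coincidentally shared text does not cause important content to be dropped.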
3 Architectures
Transcoding is a method of transforming content on the fly, so architectures can be classified by the location of the transformation engine, from server-side to client-side transformations. In Table 1, the column 'where transformation occurs' shows these locations, which we classify into three main categories: server side, intermediary (proxy server), and client side.
'Server side' means that the transcoding system is integrated into a Web server and used only for a specific site. Transcoders in this category are usually not visible to end users, since they are just a part of the Web site; they are integrated as a part of the Web server and include any support systems, such as reverse proxies used for security or load balancing. BETSIE is an exception among server-side transcoders: it is actually a kind of Web server-style intermediary transcoder (see below), but it is only capable of supporting the BBC sites.
The 'intermediary' approach transforms content between the browser at the end user's side and the Web server. This is the archetypal architecture for transcoding systems, since originally transcoding referred only to this approach. Figure 3 shows the basic architecture of the intermediary approach. There are two types of implementation, one that uses a proxy setting and one that explicitly uses a Web server. Proxy transcoders work as an HTTP proxy and transform the content on the fly. Each user must set the address of the transcoding server in their browser, but the transcoding process is then transparent. They simply use their browser
Table 2 Comparison of location of transcoding
Server side. Advantage: can be maintained along with the Web server. Drawback: works only for a specific site.
Intermediary (Web server setting). Advantage: does not require users to change their browser settings. Drawbacks: coverage is limited, since all links in a target page must be rewritten by the transcoder, which causes various problems; it is impossible to transcode pages with DHTML or AJAX.
Intermediary (proxy setting). Advantage: the transcoding process is transparent to users. Drawbacks: requires users to change their browser settings; it is impossible to transcode pages with DHTML or AJAX (better than the Web server setting, but worse than client-side transcoding).
Client side. Advantages: more stable than the intermediary setting; additional assistive functions (e.g., TTS) can be provided; DHTML pages can be transcoded (with limitations). Drawback: requires users to install client-side components.
However, the system was just the same as the 'intermediary', and simply added some HTML generation code and link rewriting code to implement the Web server setting approach.
Client-side transcoding transforms the content inside the browser rather than in an intermediary server. Figure 4 shows a general architecture for this approach. There is no intermediary between the Web server and the browser; pages are loaded by the browser as usual. Once a page is loaded, the transformation component detects it, usually by receiving the page loading completion event (the 'onload' event), and then starts the transformations. Modern browsers expose the internal HTML tag structure as a DOM tree, and browser plug-in components can access the in-memory DOM structure through the DOM API, the most popular API not only for transcoding systems but also for JavaScript and other XML processing.
The trend in transcoding architecture is changing along with changes in general Web architectures. Currently, the processing engine is moving from the server to the client, based on client-side scripts, usually written in JavaScript. AJAX and mash-ups provide developers with new paradigms for building Web applications by combining existing services written in JavaScript. This general trend directly affects the architecture of transcoding systems. It is technically
smaller pages. This process was called 're-authoring' of content. In order to implement this function, Hori et al. (2000) applied an annotation-based approach. They defined an annotation language based on RDF [RDF], which could describe alternative content for parts of the target page, give hints for fragmentation, and supply criteria to select appropriate alternative content.
Asakawa (2000) and Takagi (2000) developed an annotation-based transcoding system for Web accessibility in 2000, based on previous work including Hori's system and Nagao's system. They soon noticed differences between transcoding systems for mobile devices and for accessibility, because the required coverage of the annotation systems was different. For mobile access, site owners usually did not want to make their whole site available, but just the most important parts of their site or some specific Web applications, such as the pages for a specific event or a Web form for registration. They also wanted visually 'precise' transcoding, and the resulting pages had to be 'well designed'. Therefore, the annotations did not need to cover many pages, but were intended to be as fine-grained as possible to control the transcoding precisely. In contrast, for accessibility, site owners wanted to make their 'whole site' accessible by using transcoding systems, since users normally want access to an entire site. It was natural that blind users wanted to access the same information as sighted users, but with a usable interface, and users were always frustrated when they found pages without annotations. Another reason was compliance: it is normally necessary to make all of a site compliant to satisfy Web accessibility regulations. Therefore, site owners were attracted to the possibility of reducing the cost of making their entire sites accessible by using transcoding technology.
In order to achieve 'whole-site' transcoding, there are three major technical challenges. In our 2002 paper (Takagi 2002), we introduced the first and second challenges; the third challenge emerged with the development and adoption of DHTML technologies.
1. Reduction of the workload for annotation authoring. One obvious answer is to adapt generalized annotations for similar Web pages.
2. Adaptation of annotations for changing pages. Web pages are not static: they continuously change their content and layout, and new pages are created each day. 'Annotation matching' and 'component pointing' are the key technologies to cope with these problems. Each annotation file should have descriptions for component definitions or alternative texts, and each description should have the location of the target object in the page. This pointing technology can be called 'component pointing', and this pointing mechanism is essential for transcoding systems. Several methods have been proposed (Huang 2000a,b), but XPath [XPath] is the de facto standard method for transcoding systems. XPath is a pointing language for addressing a part of an XML document, based on the
[Figure (A): an original HTML page used as a transcoding example, containing a "Click Here!" link, the heading "Today's topic", and the body text "Here is the main content of this page."]
VoxML Converter (2000), VoxML.
WebAdapt2Me, IBM Corporation; see http://www-03.ibm.com/able/solution_offerings/WebAdapt2Me.html
Web Access Gateway: the Association of C and C++ Users, http://www.accu.org/cgi-bin/access/access
WebSphere Transcoding Publisher, IBM Corporation; see http://www-4.ibm.com/software/webservers/transcoding/
XML Path Language (XPath): World Wide Web Consortium (W3C), http://www.w3.org/TR/xpath
Yesilada, Y., Stevens, R., and Goble, C. (2003) A foundation for tool based mobility support for visually impaired web users. In Proceedings of the 12th International Conference on World Wide Web (Budapest, Hungary, May 20-24, 2003). WWW '03. ACM Press, New York, NY, pp. 422-430.
Part IV
Specialised Areas
The Web is moving fast; sometimes so fast that if accessibility and the requirements of the user are not considered and understood as part of the process, reverse engineering and design rediscovery are made very difficult. Here we have grouped together a number of specialist and fast-developing areas. We can see that Web accessibility is, at present, mainly based around the design and build stage: tools to build pages, guidelines to relate best practice to designers and authors, checkers and validators to warn of errors, and finally repair tools to help change and correct the errors found. In the future, the luxury of expecting creators to know multiple sets of guidelines and best practices will evaporate as the technologies, applications, and user devices for which they create expand exponentially. Designing for one browser or one user group, and expecting to know all technologies and formats, will soon become outdated. So what will we need to consider in the future? What is on the horizon, and how will Web accessibility research cope with it? To help answer this, we examine the top seven areas (Education, Specialist Documents, Multimedia and Graphics, Mobile Web and Accessibility, Semantic Web, Web 2.0, and Universal Accessibility) that we see as new and fundamental to Web research in the future, and consider the open questions that will need to be addressed for each technology if the Web is to maintain and increase its accessibility.
Education Paola Salomoni, Silvia Mirri, Stefano Ferretti, and Marco Roccetti
Abstract This chapter explores the main issues involved in designing accessible e-learning systems. While important steps forward have been taken to provide students with digital, easy-to-manage didactical resources, barriers for learners with disabilities remain. Even if such users are not completely excluded from virtual classrooms, they can often only partially enjoy the didactical experiences on offer. With this in view, the current trend in research is the development of smart e-learning systems that can dynamically customize learning content according to each user's characteristics and needs. We provide a survey of current solutions and present our opinion of the field.
1 Introduction
Education makes the difference. If we wish to give people the opportunity to be successful, to give the best they can, to live the life they want, and to integrate fully into society, it is important to ensure that they can learn. Nowadays, the way education is delivered has changed and e-learning has come into the picture. The term "e-learning" generally refers to the delivery of education or training programs through electronic means (Horton and Horton 2003). Such technologies are now strongly influenced by widespread Internet connectivity and by new Web-based multimedia technologies. Thus, learning can be distributed and exchanged among teachers, tutors and students who are remote from one another in both time and space. However, people with disabilities may encounter several barriers when they try to take part in traditional educational activities. For instance, students with visual impairments have difficulty reading didactical materials based on printed resources, deaf learners have trouble following traditional (spoken) lectures, and people with motion disabilities have problems attending on-site learning
P. Salomoni, Department of Computer Science, University of Bologna, Bologna, Italy, e-mail: [email protected]
programs. As a matter of fact, providing easy access to digital didactical content and learning technologies offers new opportunities for people and communities to develop new skills and improve their knowledge. This is particularly true for several underserved segments of the population that could be effectively reintegrated into educational activities. The case of students with disabilities is an obvious example. For deaf students, adequate strategies can be employed to present e-learning content in a visual format. Barriers for visually impaired students, which are typically introduced by the use of printed materials, can be overcome by resorting to digital content and related assistive technologies, e.g., screen readers and Braille displays. Finally, the need to travel to attend lectures is drastically reduced; obviously, this can be of great help to people with motion disabilities (Driscoll and Carliner 2005; Sloman 2002). However, although e-learning represents a great opportunity for people with disabilities, its full potential has not yet been exploited and these students are sometimes excluded from virtual classrooms. This is due to several factors, mainly related to the complexity of e-learning applications and contents, which are typically designed to improve the efficacy of the general learning experience and make wide use of multimedia and interactive components. All these elements represent potential barriers to e-learning accessibility, but they might also be the foundation of an inclusive e-learning if adequate customization strategies are employed to personalize the delivery of learning content. The aim of this chapter is to survey the main results on accessibility in e-learning systems and to discuss the main open issues and future directions, resolving this apparent paradox. The remainder of this chapter is organized as follows. Section 2 presents an overview of the field. Section 3 discusses e-learning standards and specifications and their relationship to Web accessibility. In Section 4, we identify possible future directions and describe how these may affect Web accessibility. In Section 5, we present our claims on this research field. Finally, Section 6 provides some concluding remarks.
2 Overview
Nowadays, distance education is based on a large variety of delivery methods, including traditional correspondence, books, audio/video tapes, interactive TV, CD-ROM and DVD, as well as services that can be offered through the Internet (Rosenberg 2000). Current e-learning technologies are effectively adopted in a plethora of didactical scenarios, including schools, post-secondary education and professional training. In the past, before the widespread diffusion of the Internet, several solutions were exploited to support teaching and learning activities, generally referred to as "computer-assisted instruction" or "computer-based training". People were able to learn by interacting with their computers. The very first educational
contents were provided in a form similar to "classic" didactical material. Then, new forms of didactical applications were developed, such as simulations or games, without the need for network connectivity. Later, new forms of distance learning were designed to be carried out online. These activities go under the name of "online learning" (which generally emphasizes the presence of the Internet as a means for conducting learning activities) or "Web-based education" (which, more specifically, refers to the use of Web-based applications to support educational activities) and, finally, they contributed to coining the neologism "e-learning". The introduction of such new means of teaching and learning imposes some changes in the way instruction and didactical materials are delivered (Khan 2005). From a methodological point of view, educational and training courses may be conducted: (i) fully in presence (traditional classroom-based forms of education and training, through face-to-face interaction), (ii) fully online (all learning activities pass through the network, including interactions among teachers and learners, assessments and tutoring) or (iii) according to a hybrid approach, a mix of the previous two called blended learning. The main aim of this latter methodology is to combine the positive effects of e-learning with the effectiveness of classroom-based instruction. Blended learning can be implemented in a wide range of ways. For example, electronic services can be used to teach theoretical aspects of a subject, while face-to-face activities are employed for practical aspects. Considering time dependence, e-learning activities can be classified as synchronous or asynchronous (Clark and Mayer 2002). In the former case, teachers and students synchronize their activities in time (e.g., video conferencing, real-time lectures, chats). The latter approach, instead, does not require the presence of teachers and students at the same time (e.g., content delivery, file sharing, forums, e-mail communication). The asynchronous approach is more common since it frees the actors (i.e., teachers and learners) from having to be online at the same time. On the other hand, synchronous activities put students in direct, interactive contact with their teachers. Typical current e-learning systems offer both synchronous and asynchronous communication features, in order to provide learners and lecturers with a wide range of possible didactical activities (Sloman 2002). Needless to say, the new computing technologies play a vital role in supporting all these new forms of educational approaches. In particular, several domains are involved, ranging from communication features to user tracking technologies and content management systems.
3 Discussion
E-learning accessibility is bound to Web accessibility, but some differences exist and several aspects specifically concerned with e-learning must be considered. In particular, a key role is played by the standards developed to ensure the portability of learning content across different contexts and technologies. As
a matter of fact, not all aspects considered in these e-learning standards fully comply with the requirements and guidelines related to Web technologies. Such standards and their main characteristics are presented in Section 3.1. Moreover, not all the principles of Web accessibility are valid in e-learning, due to the wide use of non-Web-based tools in several e-learning applications (e.g., messaging and videoconferencing) and to the typical use of complex rich media (e.g., video lectures). With this in view, a set of guidelines devoted to structuring e-learning contents and applications has been defined by the IMS Global Learning Consortium (IMS); these are discussed in detail in Section 3.2. IMS has also developed a novel set of standards for providing personalized e-learning experiences. This aspect is crucial to fully exploit the potential of e-learning and to address several accessibility issues. All these aspects are described in Section 3.3. Finally, it is worth mentioning that several projects have been developed that aim to cope with the issues outlined above. In Section 3.4, we report a succinct survey of these proposals.
3.1 Learning Objects: Package and Metadata
Pieces of didactical content are typically referred to as "learning objects" (LOs). Different definitions of LO are available; probably the most representative one has been provided by the IEEE Learning Technology Standards Committee, which describes it as "any entity, digital or non-digital, that may be used for learning, education or training" (IEEE 2007). We can then define a content package as a set of LOs coupled with a document which identifies the association among the LOs and possible sequencing rules, which make it possible to organize the whole didactical content. From a technological point of view, a standard description of the content structure is required to guarantee that contents will be interoperable across different e-learning systems. The Shareable Content Object Reference Model (SCORM) is a de facto standard through which it is possible to define learning resources, called Sharable Content Objects (SCOs). SCOs can be presented in any SCORM-compliant system and are composed of one or more "assets" or resources (e.g., digital media, Web pages, etc.) (ADL 2004). An XML document, referred to as the "manifest", is associated with the whole learning content; it contains metadata, navigation or structural descriptions and the locations of each resource. Metadata are defined through IEEE Learning Object Metadata (LOM). LOM is a data model that describes LOs and similar digital resources in order to facilitate their reusability and interoperability and to aid discoverability (IEEE LTSC 2002). The main specifications of SCORM are the following: (i) the Content Aggregation Model (CAM), which defines the content structure and provides
metadata, (ii) the Run-Time Environment (RTE), which delivers real-time information about learner actions and (iii) the Sequencing and Navigation (SN) specification, which describes possible paths through learning contents. It is worth noting that SCORM relies heavily on client-side scripting technologies, which seem to break the Web Content Accessibility Guidelines 1.0 (W3C 1999) and some international laws related to Web accessibility. However, two considerations are in order: first, at the moment it is not practically possible to leave out SCORM, due to its wide use in e-learning systems; second, the main current assistive technologies cope well with the most common client-side scripting technologies. Hence, SCORM compliance does not represent a crucial barrier to e-learning accessibility.
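To make the role of client-side scripting concrete, the following sketch shows, in outline, how a SCO typically communicates with the learning management system through the SCORM 2004 run-time API (API_1484_11). The API-discovery loop and the single data-model element used here are simplified assumptions rather than a complete, conformant implementation.

// Locate the SCORM 2004 run-time API object exposed by the LMS.
// A real SCO also searches opener windows and handles error codes.
function findAPI(win) {
  while (win && !win.API_1484_11 && win.parent && win.parent !== win) {
    win = win.parent;
  }
  return win ? win.API_1484_11 : null;
}

var api = findAPI(window);
if (api) {
  api.Initialize("");                                  // start the session
  api.SetValue("cmi.completion_status", "completed");  // report learner progress
  api.Commit("");                                      // ask the LMS to persist the data
  api.Terminate("");                                   // end the session
}

Because all of this communication happens through ECMAScript, a SCO simply cannot report learner progress when scripting is unavailable, which is why the compatibility of assistive technologies with client-side scripting matters here.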
3.2 Developing Accessible E-Learning Content and Applications
The "IMS Guidelines for Developing Accessible Learning Applications" specification defines a set of guidelines for the e-learning community (IMS 2002a). This specification guides authors and developers in producing accessible e-learning contents and applications. Both Web and non-Web applications are considered. The identified framework highlights existing solutions, discusses their implementation, opportunities and strategies, and defines areas where additional development and new features are needed to effectively ensure accessible education. For example, suggestions are offered for producing accessible learning contents for specific didactical topics (e.g., mathematics, chemistry, music). Specifically, these IMS guidelines are structured into: (i) guidelines for developing accessible multimedia content, (ii) guidelines for developing accessible e-learning tools (communication tools, interactive environments, testing and assessment, authoring tools) and (iii) guidelines related to the accessibility requirements of the specific didactical topics being considered (e.g., mathematics, chemistry, music).
3.3 E-Learning Content Customization
In order to describe a learner's general characteristics, the IMS Learner Information Profile (IMS LIP) specification defines a set of packages to be used to import data into (and extract data from) an IMS-compliant learner information server (IMS 2002b). The main aim of this specification is to address the interoperability of Internet-based learner information systems with other systems that support the Internet learning environment. The IMS Accessibility Learner Profile (IMS ACCLIP) is the part of IMS LIP devoted to describing students' accessibility constraints (IMS 2003). ACCLIP enables the description of user accessibility preferences and needs (visual, aural or device-related) that can be
exploited for tailoring learning contents (e.g., preferred/required input/output devices or preferred content alternatives). In other words, this personal user profile describes how learners interact with an e-learning environment, with specific attention to accessibility requirements. Another IMS standard related to accessibility is the AccessForAll Meta-data (ACCMD) specification (IMS 2004). ACCMD is devoted to associating metadata with each learning resource and to defining alternatives for these resources, in order to improve the accessibility of the didactical materials. In practice, an accessible e-learning system should be able to inspect users' profiles, based on their associated ACCLIP descriptions, and select the resources that match their preferences by resorting to ACCMD. Based on all these efforts to structure e-learning contents and applications, it is possible to conclude that while Web accessibility is based on the idea of a "one for all" solution, e-learning accessibility has to deal with the needs of each individual user. In other words, a "one for each" approach must be preferred. We claim that compliance with all the presented standards, together with all the specifications devoted to personalizing contents and applications, represents the solution to the accessibility issues in e-learning.
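As a purely illustrative sketch of the "one for each" idea (the profile fields, resource description, and selection function below are hypothetical simplifications, not part of the IMS specifications), a delivery system might match ACCLIP-style preferences against ACCMD-style alternatives roughly as follows:

// Hypothetical, simplified learner profile and resource description.
var learnerProfile = { prefersCaptions: true, prefersAudioDescription: false };

var resource = {
  defaultUrl: "lecture-video.mp4",
  alternatives: [
    { url: "lecture-video-captioned.mp4", providesCaptions: true },
    { url: "lecture-audio-described.mp4", providesAudioDescription: true }
  ]
};

// Return the first alternative that satisfies a stated preference,
// falling back to the default resource otherwise.
function selectResource(profile, res) {
  for (var i = 0; i < res.alternatives.length; i++) {
    var alt = res.alternatives[i];
    if ((profile.prefersCaptions && alt.providesCaptions) ||
        (profile.prefersAudioDescription && alt.providesAudioDescription)) {
      return alt.url;
    }
  }
  return res.defaultUrl;
}

var chosen = selectResource(learnerProfile, resource); // "lecture-video-captioned.mp4"

Real systems such as those described in Section 3.4 perform this matching against full ACCLIP and ACCMD records, but the underlying idea is the same: the profile, not a single universal rendering, decides what each learner receives.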
3.4 Projects
This subsection outlines some projects on accessibility in educational contexts that can improve the learning experiences of students with disabilities. Obviously, it is not the aim of this subsection to provide the reader with an exhaustive list of projects related to e-learning accessibility. We only report on the most important ones, which try to address the raised issues with different solutions. The Inclusive Learning Exchange (TILE 2007) is a learning object repository which implements both ACCMD and ACCLIP. Whenever authors use the TILE authoring tool to aggregate and publish learning objects, they are supported in creating and appropriately labeling transformable aggregated lessons (by using ACCMD). Learners define their preferences, which are stored as IMS ACCLIP records. TILE then inspects these preferences and computes the best resource configuration by transforming or re-aggregating the lesson. Another project based on ACCLIP is Web4All (Web4All 2007). It allows learners to automatically configure a public computer by using a learner preferences profile implemented with ACCLIP and stored on a smartcard. Thanks to the information stored within the smartcard, each learner can switch from one workstation to another. Profiling learners (and the devices they use) is also one of the main aims of the Learning Object Transcoding (LOT) system, a distributed system designed and developed to automatically produce device- and user-dependent LOs (Salomoni et al. 2007). LOT exploits a specific profiling mechanism which
is based on the use of both IMS ACCLIP and CC/PP (W3C 2004) to describe learner and device characteristics. Based on these profiles, LOT adapts learning contents to meet user needs.
4 Future Directions
Currently, e-learning is mainly based on Web technologies; as a consequence, its evolution is strictly bound to that of the Web. This consideration also holds when the focus is on accessibility issues. In particular, new forms of e-learning are emerging with the broad adoption and diffusion of mobile terminals (e.g., cell phones, PDAs) and Web 2.0 technologies. Regarding the wide diffusion of mobile technologies, an important new trend is represented by "m-learning" (Metcalf 2006), which stands for "mobile learning". This new form of e-learning refers to the opportunity of browsing didactical materials everywhere, outside places traditionally devoted to education. M-learners are implicitly constrained by the limited capabilities of their mobile devices (e.g., screen dimensions, network bandwidth). M-learning could also produce a "curb-cut" effect, tied to the spreading interest in making the Web accessible from mobile devices. Indeed, the wide use of diverse devices to access e-learning contents may induce developers, practitioners and content producers to invest more resources in the personalization and adaptation of e-learning contents. Obviously, this would have a great impact on the accessibility of e-learning systems. With the diffusion of Web 2.0 technologies, new online tools like wikis and blogs can be utilized by students and teachers. These new cooperative and interactive Web applications enable students to reuse and remix contents according to their needs and interests. E-learning 2.0, the application of Web 2.0 to e-learning, can truly empower learners and create new opportunities for cooperation and collaboration in distributed didactical contexts (Downes 2007). Needless to say, these new Web 2.0 technologies need to cope with issues concerning Web accessibility. With this in view, the W3C is defining "Accessible Rich Internet Applications" (ARIA), a specification which states how the advanced features of dynamic contents and applications produced using these new technologies can be made accessible to people with disabilities (W3C 2007).
5 Authors' Opinion of the Field
The aspects considered in this chapter make it evident that, contrary to what is commonly thought, the issues concerned with e-learning are different from those concerned with the Web, even if e-learning is currently deployed by resorting to Web technologies. This consideration also holds when the focus is on accessibility.
In particular, as to e-learning accessibility, a "one for each" approach should be preferred to the "one for all" approach employed for making the Web accessible. Indeed, the former method makes better use of the specific capacities of each individual learner. In other words, with a "one for each" approach, each student's experience can be adapted to the personal characteristics and preferences of the learner. Thus, if the learning process takes the students' diversity into account, the learner's experience is strongly improved, as is his/her motivation (Gay 2000). At first sight, the common multimedia and communication tools used to provide interactivity among students and teachers can be seen as a potential technological barrier for learners with disabilities, due to a complexity that could limit the level of accessibility of the e-learning system. On the contrary, we claim that the use of several media resources is a valuable asset. In fact, offering many modalities for enjoying the same e-learning service is one of the best ways to provide users with different options, allowing them to choose the modality they prefer and that best suits their own learning style. In this sense, new Web 2.0 technologies may become an important means of supporting more collaborative e-learning experiences, since the adoption of a Web 2.0 philosophy for producing e-learning applications would enable all users to contribute to creating and defining cooperative and participative e-learning communities. In this way, all students would be involved in the e-learning process and would contribute to the production of alternative content and explanations of subjects. Without any doubt, this will have a positive effect on e-learning accessibility.
6 Conclusions
E-learning materials and applications are often designed to be used with a specific technology or configuration. However, this makes them less available to people who have limited access capabilities or who use non-standard computer equipment. By resorting to assistive technologies, learners with disabilities can greatly benefit from the use of e-learning systems, which provide distance and flexible didactical activities. In this chapter, we have surveyed the main issues concerning education and accessibility. Our main claim is that dynamic, self-configuring approaches are needed, able to automatically adapt educational services to the specific needs of each learner. Finally, it is our belief that a smart use of new Web 2.0 technologies would represent an important step forward, toward the creation of fully accessible, cooperative, socially available learning experiences for the masses.
References
Advanced Distributed Learning (2004) Sharable Content Object Reference Model (SCORM) 2004 2nd Edition Document Suite, retrieved October 2006. Available from: http://www.adlnet.org/downloads/70.cfm
Clark R. C. and Mayer R. E. (2002) E-Learning and the Science of Instruction: Proven Guidelines for Consumers and Designers of Multimedia Learning. Pfeiffer.
Downes S. (2007) E-learning 2.0. eLearn Magazine. Available from: http://www.elearnmag.org/subpage.cfm?section=articles&article=29-1
Driscoll M. and Carliner S. (2005) Advanced Web-Based Training Strategies: Unlocking Instructionally Sound Online Learning. Pfeiffer Wiley.
Gay G. R. (2000) Supporting Students with Learning Disabilities: An Introduction to Web-based Process Oriented Instruction. The CSUN Technology for Persons with Disabilities Conference, Los Angeles.
Horton W. and Horton K. (2003) E-learning Tools and Technologies: A consumer's guide for trainers, teachers, educators, and instructional designers. John Wiley & Sons.
IEEE (2007) IEEE Learning Technology Standards Committee. Home page. Available from: http://ieeeltsc.org/
IEEE LTSC (2002) IEEE Standard for Learning Object Metadata. Available from: http://www.ieeeltsc.org/standards/1484-12-1-2002/
IMS Global Learning Consortium (2002a) IMS Guidelines for Developing Accessible Learning Applications. http://www.imsglobal.org/accessibility/
IMS Global Learning Consortium (2002b) IMS Learner Information Profile (LIP). Available from: http://www.imsglobal.org/specificationdownload.cfm
IMS Global Learning Consortium (2003) IMS Learner Information Package Accessibility for LIP. Available from: http://www.imsglobal.org/specificationdownload.cfm
IMS Global Learning Consortium (2004) IMS AccessForAll Meta-data Specification. Available from: http://www.imsglobal.org/specificationdownload.cfm
Khan B. H. (2005) Managing E-Learning Strategies: Design, Delivery, Implementation and Evaluation. Information Science Publishing.
Metcalf D. S. (2006) M-Learning: Mobile E-Learning. HRD Press, Inc.
Rosenberg M. J. (2000) E-Learning: Strategies for Delivering Knowledge in the Digital Age. McGraw-Hill.
Salomoni P., Mirri S., Ferretti S., and Roccetti M. (2007) e-Learning Galore! Providing Quality Educational Experiences Across a Universe of Individuals with Special Needs through Distributed Content Adaptation. The 3rd IEEE International Workshop on Distributed Frameworks for Multimedia Applications, Paris.
Sloman M. (2002) The E-Learning Revolution: How Technology is Driving a New Training Paradigm. American Management Association.
The Inclusive Learning Exchange TILE (2007). Available from: http://www.barrierfree.ca/tile/
Web4All Project (2007) Available from: http://web4all.atrc.utoronto.ca/
World Wide Web Consortium (1999) Web Content Accessibility Guidelines (WCAG 1.0). Available from: http://www.w3.org/TR/WCAG10/
World Wide Web Consortium (2004) Composite Capability/Preference Profiles (CC/PP): Structure and Vocabularies 1.0. Available from: http://www.w3.org/TR/2004/REC-CCPP-struct-vocab-20040115
World Wide Web Consortium (2007) Accessible Rich Internet Applications (ARIA). Available from: http://www.w3.org/WAI/intro/aria.php
Specialized Documents Ethan V. Munson and Maria da Graça C. Pimentel
Abstract HTML is unquestionably the central document language of the Web, but it is by no means the only language of the Web. In fact, several other specialized types of documents are widely used and have considerable importance. In this chapter, we look at how specialized document types affect accessibility. We do not attempt to consider all possible specialized documents, but rather focus on important examples that illustrate the key issues including Adobe’s Portable Document Format (PDF), microformats, and Rich Internet Applications (RIAs). The Accessibility for RIA (ARIA) initiative is presented as an example of an effort to improve the accessibility of specialized documents, while the DAISY initiative is used as an example of how the same technologies can be harnessed to improve accessibility.
1 Types of Specialized Documents
Specialised documents are found on the Web because HTML alone is not sufficient to meet every need. HTML was designed to permit physicists to share scientific documents, and its structure is closely related to the representations used by batch formatting systems such as LaTeX (Lamport 1994) for articles and technical reports. This document model, with its headings, tables, quotations, and paragraphs with sentence-level markup, works well for technical documents, but it has some important limitations that only became apparent as the Web was used for more diverse purposes. These limitations are as follows:
Unpredictable presentation. The appearance of an HTML page can vary between browsers or be affected by the size of the window displaying the page. This behavior is generally acceptable to scientists, but it is not acceptable in commercial settings where maintaining an attractive presentation or adhering to corporate appearance standards has real importance. Casual,
E.V. Munson, University of Wisconsin-Milwaukee, Milwaukee, WI, USA, e-mail: [email protected]
non-commercial users (e.g., amateur poets) can also be frustrated by unanticipated presentation differences.
Inadequate semantics. The semantics of HTML's elements do not map well to many common document types. For example, HTML lacks elements that are natural for representing either the greeting of a business letter or the employment dates for an entry in a resume. So, HTML often comes to be used more as a high-level page description language than as the structured document language it was envisioned to be.
Static content. HTML is designed for non-interactive documents whose content does not change as the user interacts with them.
1.1 Making Presentation Predictable: CSS and PDF
When authors want precise control over the appearance of their Web documents, they are likely to use either Cascading Style Sheets (CSS) or the Portable Document Format (PDF). The W3C has recently approved a related recommendation on XHTML for Printing (W3C 2006), but it is not yet clear whether it will be widely adopted.
1.1.1 Cascading Style Sheets
Cascading Style Sheets (CSS) (Lie and Bos 2005) is the primary style sheet language for HTML documents. CSS is a simple, declarative language that can be used to specify the appearance of HTML elements with considerable precision. CSS was designed to give authors better control over the appearance of their HTML documents and to reduce the inconsistencies in how browsers render HTML pages. All modern graphical Web browsers support CSS, though the level of support is not yet consistent. Most modern Web authoring tools make extensive use of CSS. A CSS style sheet is composed of a series of rules. Each rule has two parts: a selector and a list of declarations. The selector specifies which HTML elements the rule will apply to and the declarations specify how those elements should look. For example, the selector of this rule

h2.sectHeading {
  font-weight: bold;
  margin-top: 4mm;
  text-transform: capitalize;
}

selects all H2 elements that have the word sectHeading as part of the value of their class attribute and declares that these elements will use the bold form of the current typeface, should have at least 4 mm of space above them, and should have their text converted to capitalized form. Version 3 of CSS has nearly one
hundred properties that provide extensive typographic control and a variety of different selectors and ways of combining them. The class selector used above has particular importance because designers can use the class attribute to define new groupings of HTML elements. When thinking about the interaction between CSS and HTML, it is simplistic to refer to a CSS style sheet as a single entity. CSS code can be included from separate files or it can be embedded in HTML documents. When embedded in HTML, CSS rules can be placed in a style element within the document’s HEAD element or the declarations part of a rule can be specified as the value of a style attribute for a single element. We briefly examined a few pages on commercial Web sites and found all three mechanisms being used in each page. CSS can impact accessibility because it can give new semantics to parts of Web pages, in two different ways. First, when the class attribute is used, the classes usually represent semantic categories that are not supported by HTML. An accessible browser will not know the meaning of those classes and probably will not be able to interpret them for the user. In the example above, the class is intended to be for section headings, but an accessible browser may not know this or be able to deduce it, especially since the class name is abbreviated. A second and more serious problem is that CSS can be used to communicate information solely through presentation effects such as color, type size or style, or spacing. These presentation effects will be apparent to fully sighted readers, but partially sighted readers may not see them and thus miss valuable cues to connotation and intensity of content or to relationships between elements.
1.1.2 Portable Document Format
Portable Document Format (PDF) (Adobe Systems Incorporated 2006) is a representation for electronic documents designed by Adobe Systems "to enable users to exchange and view electronic documents easily and reliably, independently of the environment in which they were created" [(Adobe Systems Incorporated 2006), p. 25]. A PDF file describes a set of pages and how marks should be placed on those pages by the graphics engine in a printer or computer. PDF is popular because its documents are rendered identically regardless of the device on which they are produced and because free reader software is available for almost all computer platforms, including as a plug-in for most Web browsers. An author who possesses appropriate software can produce a PDF version of a document and the resulting file can be viewed at zero marginal cost by almost anyone with a computer. PDF presents accessibility problems because, at its core, PDF is a low-level, two-dimensional graphics language with special support for drawing high-quality text. Common problems include the following:
- Text in the PDF file is organized in the order it is drawn on the page, which may not be the order in which the text should be read. Text in footnotes and marginal notes may be interspersed with main body text. In multi-column
documents, the text may be drawn in horizontal line order, alternating between the columns. A voice browser would have no way to reconstruct the correct reading order, making the text unintelligible.
- Because the graphics language can move the drawing position to any location on the page, some applications may handle word breaks by simply moving the drawing position a small amount each time a word ends. This means that there may not be any space characters or other clear indicators of word breaks. A voice browser faced with such data might be forced to treat entire lines of text as a single word.
- Most widely used character fonts were created before the development of Unicode (2006) and use non-standard numeric codes for certain characters, especially accented characters and ligatures (single symbols that are used to represent certain short character sequences like "ffi" in high-quality documents). A voice browser would have to know how each font mapped to Unicode or some other standard in order to correctly pronounce text using such fonts.
Happily, these problems are less severe than they might be because Adobe Systems recognized the problem and designed an improved representation called Tagged PDF as part of PDF 1.4. Tagged PDF [(Adobe Systems Incorporated 2006), Section 10.7] uses a structural tagging system introduced in PDF 1.3 to encode the roles of text fragments (e.g., body text, footnote, etc.), adds explicit word breaks, and maps all fonts to Unicode. In addition, Adobe provides a tool for legacy documents called "Make Accessible" that uses heuristics (Lovegrove and Brailsford 1995) to convert arbitrary PDF to Tagged PDF. Still, there is no guarantee that any particular file uses the Tagged PDF representation, so PDF accessibility may remain a challenge for the foreseeable future.
1.2 Improved Semantics: Microformats
Microformats (Allsopp 2006), documented at http://microformats.org, are a rapidly growing response to the semantic limitations of HTML. A microformat is a structured collection of class names for HTML elements. They are designed to be used as part of a standard Web page, but to convey richer semantics than is possible within standard HTML. Examples of microformats include hCard (for contact information), hReview (for reviews of products and services), and XFN (for social network data). Microformat advocates often assert that they encode information in ways that were already widely used, but not formally described. In general, microformats are designed to be useful both for people and for automated tools. For people, they can be used in combination with CSS code to present specialized information in informative or novel ways. For automated tools, they can add semantics
to the content of Web pages that can be analyzed in order to build databases or form the basis for complex analyses. For example, the hCard microformat can be used to produce presentations that look like business cards (serving people) and can signal to Web crawlers that part of a document is contact information that can be added to a contact database (serving automated tools). An open-source community has grown up around microformats, and this community maintains a Web site and a collection of pages on Wikipedia. These sites serve to document microformats for both naïve users and developers. Microformats have both positive and negative effects on accessibility. On the positive side, they provide enhanced semantics which assistive tools can use to better interpret content on Web pages. On the negative side, there is no automated way for an assistive tool to learn these new semantics. So, once a microformat becomes widely used, it is likely that browsers and other tools will have been updated in order to describe, play, or show the content in a useful way. But lightly used or recently published microformats are likely to be unsupported.
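As a purely illustrative sketch of the "automated tools" side (the class names vcard, fn, org and tel follow the published hCard conventions, but the extraction logic and the function name are assumptions), a crawler or assistive tool might harvest contact information roughly as follows:

// Collect basic contact data from hCard markup on the current page.
function extractHCards(doc) {
  var cards = doc.querySelectorAll(".vcard");
  var results = [];
  for (var i = 0; i < cards.length; i++) {
    var fn = cards[i].querySelector(".fn");
    var org = cards[i].querySelector(".org");
    var tel = cards[i].querySelector(".tel");
    results.push({
      name: fn ? fn.textContent : "",
      organization: org ? org.textContent : "",
      telephone: tel ? tel.textContent : ""
    });
  }
  return results;
}

var contacts = extractHCards(document); // e.g., feed these into a contact database

A full hCard parser is considerably more involved, but even this simple pattern shows why published, widely adopted class names matter: a tool can only interpret semantics it already knows about.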
1.3 Making Documents Interactive: Rich Internet Applications
Many users access the Web not for documents in the strict sense, but for applications (e.g., Web-based email) and services (e.g., Web-based shopping). Such Rich Internet Applications (RIAs) are built by software developers, not by document authors. To build RIAs, developers use many technologies beyond (X)HTML, including ECMAScript (widely known as JavaScript) for client-side interaction, XMLHttpRequest (XHR) (W3C 2007c) for asynchronous client–server communication, SVG (W3C 2007d) or SMIL (W3C 2007e) for graphics and multimedia, Flash (Adobe Systems Incorporated 2007) or Java applets (Sun Microsystems Incorporated 2007a) for running animations or full applications inside Web browsers, or even Java Web Start (Sun Microsystems Incorporated 2007b) for downloading Java applications to run on the user's desktop. All these possibilities allow developers to build complex Internet applications with which users interact, instead of documents that users access for reading. The importance of these applications is such that special attention has been given to them at different scales. On the one hand, requirements for supporting small client-side Web applications known as widgets (or gadgets) are being defined by the W3C so that they can be deployed securely on the Web (W3C 2007a). Widgets display and update remote data, making use of the Web as a communication platform. Example applications are weather forecasts, casual games, news readers, and currency converters. Widgets are typically packaged in a way that allows a single download and installation on a client machine. On the other hand, widespread use of the XMLHttpRequest protocol has led to the specification of the XMLHttpRequest Object (W3C 2007c), which
defines an API that provides scripted client functionality for data transfers between client and server. This object supports any text format (including XML) and allows client–server communication to occur by means of HTTP/HTTPS messages without explicit user interaction. This protocol allows interactions similar to those used in desktop GUI applications, such as auto-completion of forms while the user types, based on information stored on the server. For developers, there are two particularly important advantages to using the Web as an application platform: (1) when users access a service via the Web, there is no need for software installation or updates on the client side because users always access the latest version at the server; and (2) there is no need to implement the application for various operating system platforms – although support for many different browsers can still be quite an effort.
1.3.1 ECMAScript and Dynamic HTML
The simplest RIAs rely on ECMAScript, which is supported by most modern browsers, to add interactive behavior to Web pages. The combination of ECMAScript, HTML, CSS, and the DOM interface to Web page data is usually called Dynamic HTML. ECMAScript code defines functions that are activated by events such as key presses, mouse clicks, and timers. The functions initiated by these events can modify page contents (with or without the user's knowledge), refresh page contents, create new windows (including alert, confirmation, and prompt pop-up windows), and change the keyboard focus while a form is being completed. ECMAScript can also be used to define pull-down menus and selection boxes and to automate user navigation of Web pages. Many familiar features of modern Web pages are implemented with ECMAScript, including
- Changes in the appearance of links or buttons as the mouse rolls over them;
- Copying of text from billing address to shipping address fields;
- Value checking for numeric and date fields;
- Periodic refresh of pages on instant messaging and news sites.
A brief sketch of this style of event-driven scripting is shown below.
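As a minimal illustration of Dynamic HTML scripting (the element IDs and the validation rule are hypothetical), a page might copy a billing address into a shipping-address field and check a numeric field like this:

// Copy the billing address into the shipping address field
// when the user ticks a "same as billing" checkbox.
function copyBillingAddress() {
  var billing = document.getElementById("billingAddress");
  var shipping = document.getElementById("shippingAddress");
  shipping.value = billing.value;
}

// Warn the user if the quantity field does not contain a whole number.
function checkQuantity() {
  var field = document.getElementById("quantity");
  if (!/^\d+$/.test(field.value)) {
    alert("Please enter a whole number for the quantity.");
    field.focus();
  }
}

// Attach the handlers to user events once the page has loaded.
window.onload = function () {
  document.getElementById("sameAsBilling").onclick = copyBillingAddress;
  document.getElementById("quantity").onchange = checkQuantity;
};

Each of these behaviors changes the page in response to an event, which is exactly the kind of dynamic change that an assistive browser must detect and convey to the user.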
1.3.2 Ajax
Asynchronous JavaScript and XML (Ajax) is an approach that combines extensive use of typical Dynamic HTML technologies (HTML, CSS, ECMAScript, etc.) with XMLHttpRequest (XHR) – the latter providing asynchronous client–server communication via text-based messages. As a result, Ajax allows the contents of a page to be automatically updated by the server without an explicit request from the user. A typical use of Ajax is an email service that automatically suggests destination addresses as the user types some initial characters. Although such automatic completion is common in traditional local GUI desktop applications, its
use on the Web requires quick and transparent client–server communication so that the list of possible completions is dynamically updated as the user enters and deletes characters from the start of the address. In general, standard Web browsers cannot present synchronized multimedia or streaming audio or video. Users extend their browsers by downloading specialized applications that "plug in" to the browser via a standard API. For streaming media, these plug-ins resemble RIAs because they use specialized communication pathways between the browser and a server to guarantee good quality of service. The players are written in standard programming languages and provide typical GUI controls for starting, stopping, and otherwise controlling playback. On the Web, Flash and Java applications (or animations) are particularly common. Technically, these applications do not use streaming between the server and the client, but they are mentioned here because they are often used to present interactive animations and other multimedia artifacts and are run by downloadable browser plug-ins. Flash and Java applications can be embedded in (X)HTML pages by means of the object element for general inclusion. The object element requires specifying the location of the code relative to the object's implementation, the instance to be run, and any necessary parameters, among several other options. As a result, the code specific to one application runs in its corresponding container. Although parameters can be passed to the application, the user agent does not have any control over how the user will interact with the code once it is running. RIAs are used to provide some of the services referred to as Web 2.0, including, for instance, blogs and collaborative platforms. These are discussed separately in "Web 2.0".
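Returning to the address auto-completion scenario described above, a minimal Ajax sketch might look like the following. The server URL, the plain-text response format, and the element IDs are assumptions made purely for illustration.

// Ask the server for address completions matching the current prefix
// and show them without reloading the page.
function suggestAddresses(prefix) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/addressbook/complete?prefix=" + encodeURIComponent(prefix), true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      // Assume the server returns one suggestion per line of plain text.
      document.getElementById("suggestions").textContent = xhr.responseText;
    }
  };
  xhr.send(null);
}

// Update the suggestion list as the user types in the "to" field.
document.getElementById("toField").onkeyup = function () {
  suggestAddresses(this.value);
};

The page content changes every few keystrokes without any navigation event, which is precisely what makes such interfaces hard for assistive technologies to track.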
1.3.3 RIAs and Accessibility
RIAs present tremendous accessibility challenges. They are implemented outside the normal declarative framework of HTML, using languages with most or all of the power of general-purpose programming languages. It is not possible to create a user agent that can read their source code and describe its purpose to a user. Furthermore, they support rather complex applications that make rapid changes to the screen. An assistive browser that tried to describe these changes would likely overwhelm the user with irrelevant information, defeating the purpose of the assistive technology.
2 Sources of Accessibility Problems
Having now examined a variety of specialized document representations, what are the key sources of accessibility problems with them? And what are some of the opportunities for research?
2.1 Non-standard semantics
Authors respond to the semantic limitations of HTML either by re-purposing HTML elements without any documentation or by using microformats. In either case, the resulting document deviates in some way from HTML's semantic model, making it difficult for assistive browsers to correctly describe the document's content. The formal registration of microformats (as done at microformats.org) addresses this problem by publicly documenting the new semantics. But whenever a new microformat is published, it is inherently "non-standard" until it becomes widely adopted and is supported by browser implementations. The same problem will be seen with any new structured document format, including any use of XML. Research could address this problem by developing ontologies to describe document semantics and processes and developing best practices for rapid integration of new formats into assistive tools.
2.2 Style and presentation
Authors and page designers use a variety of presentation techniques to enhance the message communicated by their documents. When style is used for purely aesthetic reasons or in order to meet corporate appearance standards, it probably has little impact on accessibility. But style can also be used to communicate substance in subtle ways. Layout, color, and type choice can be used to convey importance. Layout is also useful for conveying relationships between document elements, since items that are aligned or are close to each other will be considered to be related to each other [(Stone et al. 2005), Ch. 5]. In time-based multimedia, synchronization plays the same role as layout, communicating which items belong together. Researchers could focus on techniques for identifying these sorts of uses of presentation effects and for expressing them with assistive technologies.
2.3 Low-level representations
Low-level document formats, like PDF, represent documents at or near the graphical rendering level. They may not carry any semantic information and, as described earlier, may organize text in a drawing order different from the natural reading order of the document. In general, assistive browsers cannot reconstruct the document semantics, though it is possible to reconstruct the document's reading order.
2.4 Interactive document features
Interactive documents, such as RIAs, present tremendous accessibility challenges for several reasons. First, the semantics of the scripts and other
programming technology that drive interactive documents cannot be reliably deduced, so assistive browsers must rely on annotations made by developers. Second, the fact that interactive documents are highly dynamic means that a complete description of every aspect would likely overwhelm both the technology and the user. Finally, some interactive features may have low value, providing decoration and entertainment, rather than substance. The next section of this chapter describes one important effort to make RIAs accessible via semantic annotations. User interface researchers could examine novel assistive interfaces that help users interact efficiently. Information scientists could study usage patterns in order to provide guidance so that RIAs can be deployed judiciously, without compromising accessibility to an unnecessary degree.
3 Improving Accessibility
This section presents the complementary initiatives called Accessibility for RIA (ARIA) and DAISY and discusses how they are intended to improve accessibility.
3.1 Accessibility for RIA (ARIA)
In the context of RIAs in general, the Protocols & Formats Working Group of the Web Accessibility Initiative (Schwerdtfeger 2006) is engaged in several efforts to identify the main technology gaps associated with RIAs with respect to dynamic content and navigation techniques and to provide appropriate accessibility solutions. These efforts include
- the identification of technology gaps via the WAI-ARIA Roadmap (Schwerdtfeger 2006);
- the specification, via the WAI-ARIA Roles (Seeman and Schwerdtfeger 2007), of a standard way to associate behaviors and structure with element types that, although dynamic in terms of presentation and navigation, have contents that do not change with time or user actions; and
- the specification of WAI-ARIA States and Properties (Seeman et al. 2007), which allows XML-based languages to provide information about the behavior of their elements with attributes that associate states and properties whose values are accessible via the DOM interface.
The WAI-ARIA Roles specification (Seeman and Schwerdtfeger 2007) is designed to help authors provide proper type semantics for custom widgets that can be used to support accessibility, usability, and interoperability with
assistive technologies. The specification provides an RDF-based ontology of roles that, when attached to widgets and structures, allows them to be recognized by assistive technologies. The taxonomy defines a hierarchy of roles and is designed to be able to describe the states supported by each role, the role's context, and the relationships between roles. As an example, roletype describes the structural and functional purpose of an element and can be specialized as follows: roletype → role → widget → input → option → checkboxtristate → checkbox → radio → [HTML] input (type: radio). A checkbox is defined as a control that has two possible value states (e.g., a boolean), and a radio type is known to assistive technologies (an option in a single-select list). The WAI-ARIA States and Properties specification (Seeman et al. 2007) declares attributes and their values that can be used to define states and properties of RIA roles. The attributes for states (such as checked and busy) and properties (such as live and haspopup) can be associated with any XML element. The values for these attributes are also defined, such as true, false and error for busy, and off, polite, assertive, and rude for live, with some default values established (e.g., off for live). The example below, adapted from the WAI-ARIA Roles specification (Seeman and Schwerdtfeger 2007), creates a checkbox (wairole:checkbox) affected by onkeydown and onclick events. The checkbox supports true or false states for aaa:checked (from WAI-ARIA States), with default value false. The document's html element includes two namespace attributes that refer to the taxonomy of roles and the set of states and properties, respectively.
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:wairole="..."
      xmlns:aaa="...">
  ...
  <div role="wairole:checkbox"
       aaa:checked="true"
       tabindex="0"
       onkeydown="..."
       onclick="...">
    A checkbox label
  </div>
  ...
</html>
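A handler along the following lines could implement the onkeydown/onclick behaviour referred to in the markup; the function name and the way the state attribute is updated are illustrative assumptions rather than part of the specification.

// Toggle the checked state of the custom checkbox (illustrative sketch).
function toggleCheckbox(event) {
  var box = event.currentTarget;
  // React only to mouse clicks and to the space key.
  if (event.type === "keydown" && event.keyCode !== 32) {
    return true;
  }
  var checked = box.getAttribute("aaa:checked") === "true";
  box.setAttribute("aaa:checked", checked ? "false" : "true");
  return false; // suppress default handling, e.g., page scrolling on space
}

An assistive technology that understands the WAI-ARIA vocabulary can observe the changed aaa:checked value through the DOM and report the new state to the user, which is the point of attaching machine-readable roles and states to otherwise opaque script-driven widgets.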
The markup example above also illustrates the use of the tabindex attribute defined by the WAI-ARIA States and Properties specification. Although tabindex is not a state that can be associated with any XML element, it can be used with the extended XHTML elements div, span, p, td, th, and li to indicate the tab order of elements. With respect to XHTML in particular, a way to provide semantic information for any XHTML element is being discussed by the W3C through the definition of the XHTML Role Attribute specification,2 which supports the integration of a "role" attribute into any markup language based on XHTML.3
3.2 DAISY
DAISY is the acronym for the Digital Accessible Information System. Coordinated by the DAISY Consortium,4 the original effort sought to create audio materials with enhanced navigation features derived from structured textual documents (e.g., textbooks). As a natural evolution, DAISY now aims to provide accessible and navigation-enabled multimedia documents derived from any form with accessibility limitations (from textbooks to video). The effort's overall goal is to give users with special needs (mainly visual or motor impairment) access to structured digital content that provides a user experience that is essentially equivalent to the original text or video-based content from which it is derived. The DAISY Consortium's focus has been on providing specifications that allow for a broad range of book structures – from non-structured (e.g., quasi-sequential novels) to very structured books (e.g., manuals, recipe books, or encyclopedias). Since the mid-1990s, the consortium has defined a number of specifications based on existing Web standards for representing and rendering accessible content. The current specification uses XML and SMIL, and work is underway to exploit other languages such as MathML,5 SVG, and Braille. While DAISY is a good example of leveraging existing standards in the specification of authoring and rendering tools for a special user population, it also shows how efforts geared to users with special needs can bring benefits to a broader user population – a case in point is the fact that mobile users can directly benefit from the availability of the multimedia-enhanced documents produced by DAISY tools.
2 XHTML Role Attribute Module. A module to support role classification of elements, W3C Working Draft, 25 July 2006.
3 XHTML 1.0: The Extensible HyperText Markup Language (Second Edition). A Reformulation of HTML 4 in XML 1.0, W3C Recommendation, 26 January 2000, revised 1 August 2002.
4 http://www.daisy.org
5 http://www.daisy.org/projects/mathml/mathml-in-daisy-spec.html
4 Summary
In this chapter, we have discussed the impact on accessibility of various Web technologies that go beyond HTML. Authors and application designers are driven to go beyond HTML in order to better control presentation, to provide enriched semantics, and to build dynamic documents and interactive applications. CSS and PDF were presented as examples of representations that give authors more presentation control, while microformats deliver richer semantics, and a variety of technologies including ECMAScript and Ajax are used to create Rich Internet Applications. Accessibility problems can arise with each of these technologies. We have identified four principal sources of accessibility problems: non-standard semantics, style and presentation, low-level representations, and interactive document features. There are significant opportunities for research addressing each of these problems. The ARIA and DAISY initiatives have been presented because they show two useful responses to specialized documents. ARIA is setting forth standards and guidelines that make Rich Internet Applications more accessible. DAISY is exploiting specialized document standards to create better digital documents for users who need assistive technology. It is important to observe that specialized documents and RIAs can be made accessible regardless of whether they are used over the Web or on a standalone computer. But this goal can be difficult to achieve because, as discussed in the context of the Web [(Schwerdtfeger 2006), Section 2], "accessibility depends on abstracting semantics from both content and presentation information." In practice, good accessibility requires effort from the application designer from the earliest stages of design so that the resulting document or application takes into account impaired users and their needs. Current efforts toward improving the accessibility of specialized documents include improvements to their specifications, ranging from SMIL to RIAs. One important benefit of these efforts to add metadata to markup, content, and navigation structures is that they have helped foster the Semantic Web: accessibility is improved at the same time as it boosts the development of customized applications such as intelligent user agents. Finally, it is interesting to note that as more attention is given to accessibility, the benefits of following an accessible approach can have an impact on the design of user agents and documents subject to other limitations. This has been the case with XHTML Basic 1.1 (W3C 2007b), which defines the language supported by limited Web clients such as mobile phones, PDAs, pagers, and set-top boxes.
Acknowledgments Steve Bagley and David Brailsford contributed greatly to our understanding of PDF and accessibility. Lynn Leith helped us understand the DAISY initiative.
References

Adobe Systems Incorporated. Flash Developer Center. Available at http://www.adobe.com/devnet/flash/, accessed July 2007.
Adobe Systems Incorporated. PDF Reference, sixth edition, November 2006.
John Allsopp. Microformats. Friends of Ed, see also http://microformats.org/, 2006.
The Unicode Consortium. The Unicode Standard, Version 5.0. Addison-Wesley Professional, 2006.
Leslie Lamport. LaTeX: A Document Preparation System. Addison-Wesley, 2nd edition, 1994.
Håkon Wium Lie and Bert Bos. Cascading Style Sheets: Designing for the Web. Addison-Wesley Professional, 3rd edition, 2005.
William S. Lovegrove and David F. Brailsford. Document analysis of PDF files: methods, results and implications. Electronic Publishing: Origination, Dissemination, and Design, 8(2):207–220, 1995.
Richard Schwerdtfeger. Roadmap for Accessible Rich Internet Applications (WAI-ARIA Roadmap). W3C Working Draft 20 December 2006. Available at http://www.w3.org/TR/aria-roadmap/
Lisa Seeman and Rich Schwerdtfeger. Roles for Accessible Rich Internet Applications (WAI-ARIA Roles): An RDF Role Taxonomy with Qname Support for Accessible Adaptable XML Applications. W3C Working Draft 1 June 2007. World Wide Web Consortium (W3C). Available at http://www.w3.org/TR/aria-role/
Lisa Seeman, Rich Schwerdtfeger, and Aaron Leventhal. States and Properties Module for Accessible Rich Internet Applications (WAI-ARIA States and Properties): Syntax for adding accessible state information and author-settable properties for XML. W3C Working Draft 1 June 2007. World Wide Web Consortium (W3C). Available at http://www.w3.org/TR/aria-state/
Debbie Stone, Caroline Jarrett, Mark Woodroffe, and Shailey Minocha. User Interface Design and Evaluation. The Morgan Kaufmann Series in Interactive Technologies. Morgan Kaufmann, 2005.
Sun Microsystems Incorporated. Code Samples and Apps: Applets, 2007a. Available at http://java.sun.com/applets/
Sun Microsystems Incorporated. Desktop Java: Java Web Start Technology, 2007b. Available at http://java.sun.com/products/javawebstart/
World Wide Web Consortium (W3C). XHTML-Print. W3C Recommendation 20 September 2006. Available at http://www.w3.org/TR/xhtml-print/
World Wide Web Consortium (W3C). Widgets 1.0 Requirements. W3C Working Draft 05 July 2007a. Available at http://www.w3.org/TR/widgets-reqs/
World Wide Web Consortium (W3C). XHTML Basic 1.1. W3C Candidate Recommendation 13 July 2007b. Available at http://www.w3.org/TR/xhtml-basic
World Wide Web Consortium (W3C). The XMLHttpRequest Object. W3C Working Draft 18 June 2007c. Available at http://www.w3.org/TR/XMLHttpRequest/
World Wide Web Consortium (W3C). Scalable Vector Graphics (SVG): XML Graphics for the Web, 2007d. Available at http://www.w3.org/Graphics/SVG/
World Wide Web Consortium (W3C). The Synchronized Multimedia Integration Language (SMIL), 2007e. Available at http://www.w3.org/AudioVideo/
Multimedia and Graphics

Bob Regan and Andrew Kirkpatrick
Abstract The accessibility of graphics and multimedia should be considered from at least two distinct perspectives. First, graphics and multimedia must provide sufficient information for people whose disabilities limit access to specific elements of the content. Second, graphics and multimedia can be thought of as equivalent alternatives to text for those with cognitive disabilities. This chapter looks at issues in the delivery of equivalent information and the preservation of essential content so as to serve the largest possible audience.
1 Introduction

The accessibility of multimedia and graphics is based on a fairly straightforward set of premises.
Provide non-visual equivalents of visual content for those who have difficulty seeing.
Provide non-audio equivalents of audio content for those who have difficulty hearing.
Ensure keyboard access to interactive elements for those who have difficulty using their hands.

The challenge of graphics and media is to design in a way that serves the needs of all users at the same time, regardless of disability. The techniques for addressing accessibility in multimedia increase in complexity as the media becomes more dynamic in nature. For a simple graphic element, a short text equivalent coded into the HTML is all that is needed. For more complex, interactive content, multiple equivalents will often be required, with keyboard equivalents for mouse-driven events and instructions for users of assistive technology.
Designers and developers have long sought ways of simplifying the process of creating accessible content. To that end, numerous standards, tools, and techniques have emerged to support developers in that process. However, so too have strategies that oversimplify rich content for the sake of the designer's convenience. This chapter will examine issues in the theory and practice of developing accessible multimedia. The specific techniques of developing accessible multimedia are too numerous to cover in this relatively brief chapter; a list of resources is provided at the end of the chapter. This chapter will look at how graphics and multimedia are used to enhance content and how equivalents for that content should and should not be provided. Special attention will be paid to the topic of Rich Internet Applications, as this is a key challenge in the present and near future. A brief overview of changes in technology will be followed by a look at the rapidly changing standards for accessibility. In the concluding section, some challenging questions for further research will be posed by the authors.
2 Overview

The accessibility of images and multimedia covers the widest possible range of topics. At one end of the spectrum lie graphics and images. At a purely technical level, providing alternative information for images in HTML is a trivial practice. Take the image shown in Fig. 1. The image shows a picture of two small
Fig. 1 Girls sharing a soda
girls sharing a soda through straws. The text equivalent for this image might be descriptive of the content and read, "Girls sharing a soda." Alternatively, the text equivalent might be descriptive of the purpose of the image on the page and read, "Our photo album." In this way, we are able to provide users of assistive technologies, such as screen readers and screen magnifiers, a means of accessing the content of a Web site or other digital content.

At the level of content, the use of images becomes a more complex and more interesting practice. Images and multimedia do much more than add interesting elements to a Web site or application; they add content. It is often much easier to understand data and the corresponding relationships when the data is visualized. In his seminal work, The Visual Display of Quantitative Information, Edward Tufte (2001) points out that "Graphics reveal data. Indeed graphics can be more precise and revealing than conventional statistical computations." It is precisely for this reason that graphics and multimedia are so often used in digital content. Images and multimedia are powerful in their ability to reinforce or convey information.

Taking this concept even further, for a person with a cognitive disability, the image may be seen as an equivalent for the text. For a person with difficulty reading or a cognitive disability, text may be very difficult to understand. The use of images can greatly enhance the experience of content for people with cognitive disabilities. The challenge is to use text and images in a way that serves both needs at the same time, not to mention the needs of many others not presented thus far. For a person who is blind, text is often the most accessible form of content. For a person with a cognitive disability, text is often the least accessible form of content. These two perspectives represent two ends of a continuum. The task of the designer is to accommodate as many of these perspectives as possible within their work.
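To make the two choices of text equivalent concrete, the markup for the photograph might look like one of the following sketches. The file name is invented for illustration; only the alt text differs between the two options.

<!-- Option 1: describe the content of the image -->
<img src="photo-girls-soda.jpg" alt="Girls sharing a soda" />

<!-- Option 2: describe the purpose of the image on the page -->
<img src="photo-girls-soda.jpg" alt="Our photo album" />

Which option is appropriate depends on the role the image plays on the page: a photograph in a gallery is usually best described by its content, while an image that acts as a link or label is better described by its function.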
3 Discussion

In terms of both research and practice, it is helpful to consider accessibility of images and multimedia in terms of three distinct levels.
3.1 Standards and Specifications

The first is the level of standards and specifications. These are the concrete recommendations reflected in standards documents such as the Web Content Accessibility Guidelines (WCAG) from the W3C. These standards are explicit in their recommendations for images and animation, but show their age a bit when applied to interactive multimedia and Rich Internet Applications. The first and most basic guideline of the WCAG is checkpoint 1.1:
Provide a text equivalent for every non-text element.

For the image of the two girls shown in the previous section, we might code that image by giving the img element an alt attribute, as sketched in the previous section. Adding a text equivalent to images has become almost a trivial process. Many authoring tools, such as Adobe's Dreamweaver, have incorporated prompts that ask designers to provide a text equivalent whenever an image is placed on the page.

With a slightly more dynamic example, such as a simple Flash animation, more options are available for delivering equivalents to the end user. As with QuickTime, Real Media, and other media formats, Flash is coded into a page using the object element. By adding a title attribute, we can deliver an equivalent for the content of the animation. Say Fig. 2 is taken from an animation of a moon orbiting a planet. Using HTML, this equivalent might be coded by giving the object element a title attribute whose value reads "Moon orbiting planet." However, in the case of Flash content, the equivalents may also be coded into the Flash object itself. Using ActionScript, the same equivalent for the movie might be added to the root level of the movie as follows:

// Attach an accessible name to the root of the Flash movie
_root.accessibilityProperties = new AccessibilityProperties();
_root.accessibilityProperties.name = 'Moon orbiting planet';
Accessibility.updateProperties();

For simple animations, where the content is conveyed using this simple text equivalent, there is little advantage to coding the equivalent into the Flash movie itself. However, as Flash animations and other multimedia content
Fig. 2 Animation of a moon orbiting a planet
become more complex, with multiple elements inside the object that require equivalents, coding equivalents into the movie directly is not only helpful, it is required.

Interactive multimedia and RIA technologies like Flex, and increasingly AJAX, rely on operating-system-level specifications such as Microsoft Active Accessibility (MSAA). MSAA serves as a list of the objects on screen, with a brief description of each object. Increasingly, assistive technologies such as screen readers, screen magnifiers, and speech recognition tools rely on accessibility APIs, like MSAA, to understand what is happening on screen. At the level of RIA development, it is critical that developers expose the controls of an application at the level of the operating system. Like user interface libraries for desktop applications, Flex components and AJAX controls must follow a specific set of conventions in order for them to work correctly with assistive technology and for the user to understand their operation.

One of the most significant trends in accessibility is the rapid pace of change of these OS-level specifications. With the release of Vista, Microsoft has introduced a new accessibility API called UI Automation (UIA). At the same time, the Mac OS, which has long had an accessibility API (Apple, 2007), has seen the development of a screen reader technology known as VoiceOver. On the UNIX front, the Orca and Gnopernicus screen readers and the ATK API have also increasingly generated interest in the accessibility community. Taken all together, there is a complex array of APIs that must be addressed at the same time. Notably, an effort driven by IBM, known as IAccessible2, has sought to provide an open API that includes features of multiple platform APIs.
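For RIA content delivered as HTML and script, the WAI-ARIA work referenced elsewhere in this book illustrates the kind of convention involved: role and state information is attached to otherwise generic markup so that the browser can map it to the platform accessibility API. The fragment below is a minimal sketch of a custom checkbox built from a div; the id, label text, and script hooks are invented, and the attribute names follow the later, non-namespaced form of the ARIA syntax rather than the namespaced form used in the earliest drafts.

<!-- A scripted checkbox exposed to assistive technology via ARIA
     role and state attributes (handler names are invented) -->
<div id="subscribe"
     role="checkbox"
     aria-checked="false"
     tabindex="0"
     onclick="toggleCheckbox(this)"
     onkeydown="handleCheckboxKey(event, this)">
  Subscribe to the newsletter
</div>

When the script toggles the control it must also update aria-checked, so that the new state reaches the accessibility API and, through it, the screen reader; the keyboard handler is needed because a div, unlike a native input element, receives no keyboard behavior for free.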
3.2 Assistive Technology

The second level is that of assistive technology interoperability. At this level, the designer must determine whether following the specification results in the predicted outcome when an individual control or application is used with assistive technology, such as a screen reader. In short, the designer must answer the question, "does this work as it should?" In many cases, simply following a specification does not imply that the resulting application or control will "just work." With simple content such as images, animations, audio, and video, problems are almost certain to be the result of an error on the part of the developer: screen readers have become very good at delivering equivalents for most media formats. In more dynamic forms of multimedia, particularly at the level of component frameworks in AJAX and Flex, determining where the issue resides is more of a challenge. Building and maintaining assistive technology is complex work. As new features and new uses for a specification emerge, it can be a struggle to add support. As a result, creating new accessible controls and applications can sometimes mean coordination with assistive technology manufacturers or, at a minimum, documenting issues for users of assistive technologies.
With the significant change associated with the release of Vista, this challenge becomes particularly acute. While UIA and IAccessible2 offer a rich range of benefits over MSAA, implementing support for these APIs in the screen readers is an immense amount of work. This is further complicated by the work required in user agents such as browsers and the Flash Player.
3.3 User Testing

The third level is that of user testing. With multimedia, particularly interactive multimedia, it is important to consider what makes a site or an application usable for a person with a disability. This is probably the most challenging set of issues for an individual developer. In this case, a developer must not only determine what works, but how well an individual control or application works for people with disabilities. In so doing, a host of other challenges are presented, including:
Which users should be considered in making such determinations?
What constitutes a usable experience for different users?
How is usability for people with disabilities distinct from mainstream usability practices?
How does a developer without a disability come to develop instincts for what makes a good experience for someone with a specific disability?

At the end of the day, any site or application can be deemed inaccessible if the majority of users cannot successfully complete the tasks associated with it. Please refer to the chapter on end user testing for recommendations on conducting tests and available resources. Suffice it to say in this context that as the complexity of the multimedia increases, so does the need for user testing.
3.4 Text-Only Sites

In the early days of the Web, designers often relied on a technique known as creating "text-only versions" of sites. In these cases, the main site used images and other forms of multimedia, and a second version did not. The "text-only" version was often used as a means of addressing accessibility: rather than spending time and effort to make the primary version of the site accessible, effort was devoted to the text-only version.

In many cases, this technique left people with disabilities with a second-class experience of the content. Often, the text-only sites were not as deep as, or updated as often as, the main site. People with mobility impairments who could see were not able to view the media content, since the text-only version of the site was the only keyboard-accessible version. While the text-only site may indeed have been accessible, the important question is whether it provides the same experience
of the content as the full site. In most cases, the answer was no. Some have referred to this technique as the "ghettoization" of the Web.

While text-only sites have long since fallen out of favor, other techniques are starting to grow in popularity that rest on a similarly limited view of people with disabilities' needs and their right to the same experience and content. Techniques that rely on "graceful degradation" and "progressive enhancement" fall into this category. Rather than providing an accessible version of the primary multimedia content, these techniques provide only text equivalents for media. While it is certainly a valid concern that some users might not have all the necessary plugins to view specific kinds of content, this is a separate issue from that of accessibility. If the means exists for multimedia content to be made accessible, then a reasonable effort should be made to do so. Simply providing text equivalents in place of richer media is not an equivalent experience. Text-only sites and sites relying on "graceful degradation" and "progressive enhancement" treat people with disabilities as second-class citizens when it comes to accessing digital content.
3.5 Separating Presentation from Structure

In the design of images and multimedia, there is an inevitable tension between control and flexibility. Given the amount of thought and effort devoted to layout, composition, and color, there is a natural desire on the part of designers to keep those designs intact. Even slight modifications may have significant impact on the overall presentation. By contrast, users require flexibility. The context in which content is viewed is never the same as that in which it was created. The capabilities, preferences, and personalities of the viewer are never the same as those of the designer.

The Web served to exacerbate this tension. Browsers provided users with the ability to change text size, change style sheets, or even turn off images and multimedia altogether. While designers initially resisted this change, over time the practice of design was forced to change. The old rules of print design eventually gave way to a practice of design more consistent with the Web.

Consider the use of images on the Web. Initially, images were used sparingly to accommodate limited bandwidth. However, as this constraint began to lessen, some designers sought to use images and image maps as a means of creating entire sites. While this provided tremendous control over the design, it provided no flexibility to the end user. The text size, color, and contrast could not be changed. The text was not searchable, and it was not easy to perform dynamic updates. This practice quickly fell out of favor, but designers were restless for an alternative.

In the end, designers came to rely increasingly on CSS. This allowed designers to make a distinction between images used as part of the content of the page and images used to help make the page look a certain way. While CSS had been around for a long time, many designers had been resistant to using it
as the exemplars of CSS were not sites that designers aspired to mimic. In 2003, Dave Shea launched the CSS Zen Garden—a site composed entirely of multiple versions of the same page, each with a different style sheet. The explicit emphasis on design drove incredible interest in CSS’s capabilities. Now, the use of CSS, and more importantly, the separation of presentation from structure, is a practice that is much more common in mainstream design and authoring tools.
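The distinction between content images and purely presentational images can be illustrated with a small sketch (the file and class names are invented): an image that carries meaning stays in the markup with a text equivalent, while a decorative image is moved into the style sheet.

<!-- Content image: part of the page's meaning, so it gets a text equivalent -->
<img src="sales-chart-2007.png" alt="Bar chart of 2007 sales by quarter" />

<!-- Decorative image: supplied from the style sheet, not the markup -->
<div class="promo-banner">Summer sale now on</div>

/* In the site's style sheet */
.promo-banner {
  background-image: url(banner-texture.png);
  background-repeat: repeat-x;
  padding: 1em;
}

Because the background image lives in the style sheet, a screen reader simply reads the text of the div, and users who disable images or apply their own style sheet lose nothing essential.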
3.6 Separating Presentation from Structure in Video

When applied to other forms of media, the concept of separating presentation from structure raises some important questions. Consider a video example. The WCAG has two key requirements with respect to video content: first, that there is closed captioning for audio content; second, that there is audio description of on-screen actions. Some argue that separating presentation from structure in this situation would leave one with a text transcript that includes the caption data and the text of the audio descriptions. This is a fundamental misunderstanding of the use of media. In effect, it creates a "text-only" version of the video. While such a transcript is required for some users, those who are deaf-blind for example, it removes important content that is potentially useful to other users.

It is critical that designers and developers understand that the video itself is content. While the equivalents are necessary and extremely valuable, important detail that is present in the video is missing from the transcript data. While it is important to provide equivalents, it is equally important to understand that each equivalent enhances the original content; a set of equivalents should not be viewed as a replacement for the original content.
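One way to keep the captions and the audio description attached to the video, rather than replacing it, is to declare them as parallel tracks in a synchronization language such as SMIL, which is discussed elsewhere in this book. The fragment below is only a sketch: the file names are invented, and the exact elements and test attributes available depend on the SMIL profile and the player in use.

<!-- Hypothetical SMIL presentation: the video remains the primary content,
     with captions and audio description as switchable parallel tracks -->
<par>
  <video src="lecture.mpg"/>
  <textstream src="lecture-captions.rt" systemCaptions="on"/>
  <audio src="lecture-description.mp3" systemAudioDesc="on"/>
</par>

A player that honors these settings shows the captions or plays the description only for users who have asked for them, while every user receives the same underlying video.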
4 Future Directions

Graphics and multimedia represent a wide range of content in terms of complexity and dynamic presentation. Still images sit at one end of the spectrum, while Rich Internet Applications (RIAs) occupy the other. The term, coined by Macromedia in 2002, originally described a new, emerging class of Flash application in which desktop software functionality began to appear in the browser. Now RIAs are built in Flash and AJAX and have become some of the most popular tools on the Web today (Duhl, 2003); Google Mail and Yahoo! Maps are just two well-known examples.

RIA accessibility presents a variety of new challenges to the developer. For the purposes of this discussion, it is helpful to segment the topic of RIA accessibility into two distinct areas:
The accessibility of individual controls.
The accessibility of interactions between controls within an RIA.
In the case of the former, a fair amount is already understood. There are a number of documents available detailing how an individual control should behave. Assistive technology vendors have already examined and addressed (or attempted to address) a wide range of control types declared within the framework of the operating system itself. Usability represents the area in need of the greatest research: while we can rely on mainstream accessibility practice to inform the development of an individual control, there is little that specifically speaks to usability for people with disabilities. At the same time, this is an area where great breakthroughs are not likely to happen.

Take a simple example, such as the dial shown in Fig. 3. Visually, the dial provides a significant amount of detail. The knob tells you what it does (control the volume), what the current setting is (8 on the dial), and the maximum setting (11 on the dial). Providing equivalents for these elements of the control cannot be accomplished using simple, static text equivalents; the equivalents must be updated as changes are made to the dial. The easiest way to convey this level of detail is to pass information about the control to the user through an OS-level API such as MSAA. The same mechanism is used to expose functionality within a desktop application. It requires adherence to the OS-level accessibility API specification, testing for interoperability with assistive technology, and some degree of user testing to ensure the control behaves as other, similar controls do in different contexts.

With RIAs, it is not sufficient to contemplate the accessibility of each individual object alone. A key element of RIAs is that one control may impact the display of one or more different controls, all on the same screen, with no page refresh. Hence, the accessibility of an RIA is also determined by the interaction of individual controls with one another. While much is known about how an individual control behaves, little is documented in terms of how one control should affect other controls and how the resulting state should be communicated to the user.
Fig. 3 Graphic of a dial that goes up to 11
In contemplating RIA accessibility, the interaction between controls within an RIA needs to be examined in the context of standards, AT interoperability, and usability for people with disabilities. The practice of RIA development today refers to these interactions in terms of "design patterns." Efforts such as the Yahoo! Design Pattern Library or the Cairngorm Framework (Adobe Systems Incorporated, 2007) represent attempts to establish categories of interactions in this respect. To date, little work has been done to document accessibility in the context of these design patterns. However, the same approach that has been applied to the accessibility of individual controls may be applied here.

Figure 4 presents a very simple RIA consisting of a slider and a datagrid. Part of a fictional online shopping application, this RIA lets the user move the slider to set the amount of money they are hoping to spend; as the value on the slider increases, the number of items listed in the datagrid increases, and moving the slider in the opposite direction decreases the number of items. In this example, there are only three pieces of content: the title, the slider, and the datagrid. We can style this content, just as can be done in HTML. We can make the interface pink, we can brand it for our online store, or we can leave the application in simple black and white.

Many misinterpret the concept of the separation of presentation from structure in this context by focusing on the application technology itself, rather than the elements that make up the application. In creating an accessible RIA, we may ask what a user will do if Flash is not available or if JavaScript is turned off. However, the more important questions are these: if a user does come to the simple RIA shown above, will the slider and the list be exposed in a predictable way? Will the functionality of these controls be available to the user? Will updates to the list, based on changes in the slider, be exposed in real time?

The most common context in which we might observe a separation of the presentation from the content of an RIA is when using a screen reader. When interacting with the very simple RIA in Fig. 4, the screen reader will note the controls, provide a means to make changes to the slider, and note updates made to the application. In its simplest form, the application above can be understood
Fig. 4 A very simple RIA
as a simple list of working controls. If the controls do not actually work, or do not work in a fashion similar to other, familiar controls, then the content of the application itself has effectively been changed.
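For an HTML/AJAX rendering of the slider-and-datagrid example, the ARIA work discussed earlier suggests what "exposed in a predictable way" might look like in markup. The sketch below is illustrative only: the ids, labels, and values are invented, the attribute names follow the later, non-namespaced ARIA syntax, and the script that moves the slider and refreshes the grid is omitted.

<!-- Slider exposed with a role and value state that the script must
     keep up to date as the user changes the price limit -->
<div id="price-slider"
     role="slider"
     tabindex="0"
     aria-valuemin="0"
     aria-valuemax="500"
     aria-valuenow="120"
     aria-label="Maximum price in dollars">
</div>

<!-- Results region marked as live, so screen readers announce that the
     list of items changed without the user having to hunt for it -->
<div id="results" role="grid" aria-live="polite" aria-label="Matching items">
  <!-- rows are rewritten by script whenever the slider value changes -->
</div>

The two halves answer the two questions raised above: the role and value attributes make the slider itself predictable, while the live region makes the interaction between the controls observable to assistive technology.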
5 Authors' Opinion of the Field

One of the greatest challenges facing the accessibility community today is that of establishing a set of best practices for the accessibility of interactive multimedia, and in particular RIAs. With the growing popularity of RIAs developed with Flash and AJAX, it is critical that strategies be developed quickly, or there is a very real risk that people with disabilities will be left behind. Designers and developers will not wait to build and deploy RIAs until solutions to these questions are found. Instead, they will press forward, creating a generation of legacy content that will require retrofitting at a later time.

Ensuring the accessibility of RIAs must happen at numerous levels. First, support for the various OS accessibility API sets must be added to commonly used development frameworks. Adobe has added support for MSAA to its Flex framework. However, Flex does not include support for the Mac accessibility API, Gnome's ATK, or UIA in Microsoft Vista. The authors can speak from personal experience on the challenges of supporting multiple accessibility APIs within the same framework. Whatever the challenge faced by a single corporation, the challenge is even greater when faced by individual developers working on open source efforts. Adding cross-platform support to numerous AJAX frameworks poses a significant challenge in the near future. This is one of the driving rationales behind the IAccessible2 and ARIA efforts and underlines their importance.

Second, the accessibility of RIAs relies on interoperability with assistive technology. For makers of existing PC screen readers and screen magnifiers, there is the immediate challenge of incorporating support for UIA and Vista without losing existing features and functionality. This is a non-trivial task for these very small companies to handle. However, doing so should expose a tremendous amount of new capability in applications and the assistive technology itself. More generally, screen reader makers across all platforms must decide how best to expose the new, richer functionality of RIAs to the end user. This is in part a question of usability: a more solid understanding of how people with disabilities interact with RIAs is needed before one can develop feature sets that account for these workflows.

Third, in addition to impacting the specific feature sets of assistive technology, a more complete understanding of how users with disabilities interact with RIAs is needed to arrive at a specific set of best practices in the design of RIAs. Design standards and best practices must be based on a firm and rigorous understanding of user needs and requirements. While the latest efforts at
developing such standards are able to extrapolate issues from previous generations of standards and apply them to the challenge of RIAs, such extrapolation remains a hypothesis until more serious research is conducted.

In the end, the designer is responsible for the accessibility of their content. However, in order for designers to have a reasonable chance of developing accessible RIAs, a number of conditions must first be met. Without frameworks to build upon, a set of tools to evaluate their content, or a set of best practices to guide them, it is not likely that anyone will achieve the goal of accessible RIAs.
6 Conclusion

The accessibility of graphics and multimedia is perhaps the most challenging area in accessibility today. It is important to remember that images, animations, video, audio, and interactive elements are not merely decorative; they enhance content, and in many cases serve as the primary means of delivering content. Further, it is important to keep in mind that, used correctly, multimedia is often helpful for people with learning disabilities in accessing concepts within the content. Thus, it is often not adequate to simply provide text equivalents for graphics and multimedia. Instead, best practices for each form of media should be followed. When possible, a single, accessible version of the content is far preferable to providing a separate, text-heavy, accessible version of content. While newer techniques, like "progressive enhancement," have sought to make the management of multiple versions of content easier, they do not address the fundamental accessibility of the media itself.

As multimedia becomes increasingly sophisticated, new techniques are needed. With the emerging popularity of Rich Internet Applications, it is increasingly clear that new research is needed in order to understand the usability of RIAs for people with disabilities. In the absence of such information, it will be hard to develop adequate standards or informal best practice guidelines.
Resources

Adobe – Best Practices for Accessible Flash Design: http://www.adobe.com/resources/accessibility/best_practices/best_practices_acc_flash.pdf
WebAIM – Appropriate Use of Alternative Text: http://www.webaim.org/techniques/alttext/
WebAIM – Captioning with MAGpie 2.0: http://www.webaim.org/techniques/captions/magpie/version2/
W3C – Web Content Accessibility Guidelines: http://www.w3.org/TR/WCAG10/wai-pageauth.html
References

Adobe Systems Incorporated. Cairngorm. Retrieved February 1, 2007, from the World Wide Web: http://labs.adobe.com/wiki/index.php/Cairngorm
Apple, Incorporated. Introduction to Accessibility Programming Guidelines for Cocoa. Retrieved February 1, 2007, from the World Wide Web: http://developer.apple.com/documentation/Cocoa/Conceptual/Accessibility/index.html
Duhl, J. (2003) The Business Impact of Rich Internet Applications. Retrieved February 1, 2007, from the World Wide Web: http://download.macromedia.com/pub/solutions/downloads/business/idc_impact_of_rias.pdf
Tufte, E. (2001) The Visual Display of Quantitative Information. Graphics Press, Cheshire, CT.
Mobile Web and Accessibility

Masahiro Hori and Takashi Kato
Abstract While focusing on the human–computer interaction side of Web content delivery, this article discusses problems and prospects of the mobile Web and Web accessibility in terms of what lessons and experiences we have gained from Web accessibility and what they can say about the mobile Web. One aim is to draw particular attention to the importance of explicitly distinguishing between perceptual and cognitive aspects of the users' interactions with the Web. Another is to emphasize the increased importance of scenario-based evaluation and remote testing for the mobile Web, where the limited screen space and a variety of environmental factors of mobile use are critical design issues. A newly devised inspection type of evaluation method that focuses on the perceptual–cognitive distinction of accessibility and usability issues is presented as a viable means of scenario-based, remote testing for the Web.
1 Introduction

The Web is becoming increasingly ubiquitous as a platform of content delivery taking place in a variety of devices such as desktop personal computers (PCs), personal digital assistants (PDAs), and consumer electronics. Moreover, with the proliferation of wireless networks, many more people have access to mobile devices than to desktop PCs. However, Web access from mobile devices often suffers from usability and interoperability problems, which hinder the full potential of the Web. In order to realize unified Web access from any device by anyone in any environment, it is necessary to consider essential characteristics of content delivery. The concept of delivery context, whose significance was noted by the W3C Device Independence Working Group (Gimson 2003), is characterized by a broad range of attributes related to device capability,
taken in devising an inspection type of evaluation method that can be instrumental in ensuring Web accessibility and usability at both perceptual and cognitive levels of human–computer interaction. Finally, we discuss problems and prospects of the mobile Web and Web accessibility in terms of what lessons and experiences we have gained from Web accessibility and what they can say about the mobile Web.
2 Going Beyond Perceivability

Many have noted that Web access from mobile devices is similar in many ways to Web access by persons with disabilities (Harper 2004). For instance, colorful, graphical, visual representations of content information on Web pages are not available to visually impaired users, whereas limited bandwidth and screen space prevent the mobile Web from utilizing similar visual presentations and effects. There is a shared belief that lessons learnt and experiences gained from Web accessibility can be effectively applied to designing the mobile Web. One particular lesson we shall address here is the importance of going beyond physical/perceptual levels of accessibility and usability in designing the Web.

The guidelines and regulations for Web accessibility undoubtedly help designers and developers to become aware of various needs of persons with disabilities. However, simply following those specifications does not guarantee that the resulting Web pages are indeed accessible and usable for persons with disabilities. One reason among others is that it is almost impossible to exhaustively consider the varied needs of persons with disabilities in the accessibility guidelines. Another is that designers and developers tend to focus on rather literal or superficial compliance with the regulations and guidelines without a deep understanding of what they are meant for. Consequently, as Hanson (2004) puts it, "It is not uncommon to have Web pages that meet standards for technical accessibility, but are still difficult to use by persons who have disabilities." This situation led Asakawa (2005) to suggest that Web designers and developers should have opportunities to experience real problems faced by persons with disabilities (e.g., accessing the Web using a screen reader) so they can go beyond compliance and consider the real usability.

The question remains, however, as to what is actually needed for the Web to be really accessible and usable. Web accessibility is typically discussed in the context of making information available to persons with disabilities, and as such the discussion tends to be focused on physical/perceptual accessibility. There seems to be an implicit, underlying assumption that information is accessible so long as it is perceivable in one way or another (e.g., visible, audible). The place of information may be judged to be accessible when it appears to be easily reachable and perceivable by the users. However, its content should not be judged to be actually accessed unless it is cognitively internalized or understood by the users with or without disabilities. The mere fact that information is perceivable does not
guarantee that it is also understandable. Accessibility of information, therefore, should be evaluated not only for its perceivability but also for its understandability. Needless to say, information must first be made physically available to the users so they can perceive it and understand what it is. This requires that navigation, as a means to get to the location of the information, should also be designed to be easy to use and understand.

We have previously proposed (Kato 2005) that the extended cognitive walkthrough (ECW) can be an effective means of going beyond perceivability in designing and evaluating Web accessibility and usability. We suggest that the ECW can also be instrumental and useful in ensuring cognitive accessibility and usability of the mobile Web. In the next section, we first describe a cognitive model of human–computer interaction on which the ECW is based, and then explain the design perspectives of the ECW in relation to the guidelines for Web access from mobile and other portable small-screen devices.
3 Design Perspectives for Cognitive Accessibility and Usability

The ECW is a variant of the cognitive walkthrough (CW), which is a usability inspection method aimed at evaluating the ease of learning user interfaces. Analysts attempt to give yes/no answers to evaluation questions and to indicate why they think the intended user can or cannot be assumed to successfully perform the required action. The CW has been revised from the first (Lewis 1990) to the second (Polson 1992), and to the third version (Wharton 1994), where the number of evaluation questions was reduced to four. Though apparently much simplified from the previous two versions, the third version seems to increase the difficulty of interpreting the CW questions.

In our attempt (Kato 2005) to make CW questions easier to deal with and more effective in identifying usability problems, we first extended Norman's Seven Stages of Action model (Norman 1986) by incorporating two distinctions, "specifying object vs. action" and "perceiving vs. understanding" (see Fig. 2). Namely, the stage of specifying the action sequence, which is represented as a single process in Norman's model, is divided into four sub-processes in the extended human–computer interaction (HCI) model. Based on the HCI processes identified in the extended model, we then generated nine CW questions (Table 1) whose objective is to examine whether it is safe to assume that the intended user will successfully exit each of the HCI processes.

This list of nine evaluation questions can be a valuable design aid to bridge the gaps between users' goals and physical states of computing systems. Any negative answer to any evaluation question is taken to be an indication of a potential design problem under the evaluated context. The questions Q1 through Q6 are concerned with the processes of bridging "the gulf of execution" (Norman 1986). These six questions are expected to help analysts to detect usability problems that may stem from the gaps between what users intend to do
perceptual–cognitive distinction of accessibility and usability issues. In order to illuminate the importance and relevance of those perspectives to the mobile Web, we try to relate the ECW questions given in Table 1 to the relevant guidelines in the Mobile Web Best Practices (MWBP) (Rabin 2006).
3.1 Distinction Between Perceiving and Understanding

3.1.1 Specifying the Object

Users may fail to perceive or notice the existence of the correct object (e.g., a link text, a menu label) due, for example, to insufficient contrast, such as the case when the foreground and background colors are too close, or when the page is viewed with mobile devices under weak lighting conditions. This is the issue related to the ease of perceiving the correct object (Q2), and it is mentioned not only in MWBP (5.3.6 and 5.3.7 in [Rabin 2006]) but also in checkpoint 2.2 of the Web Content Accessibility Guidelines (WCAG) 1.0 (Chisholm 1999). The utility of the extended HCI model is that the significance of the guidelines from different perspectives (e.g., mobile Web access, Web content accessibility) can be understood on the basis of the common HCI processes.

The extended HCI model distinguishes between perceiving (Q2) and understanding (Q3) the correct object. Guideline 2 in WCAG 1.0 (Chisholm 1999) requires content developers to "ensure that text and graphics are understandable when viewed without color." However, the meaning of "understandable" in this statement is somewhat ambiguous and should be elaborated by distinguishing between perceiving and understanding. Version 2.0 of WCAG (Caldwell 2006) appears to be elaborating on this distinction, as its guidelines are organized around four principles that distinguish between perceiving, operating, and understanding Web content. In the extended HCI model, the delivery context is further elaborated from an HCI perspective in such a way that perceiving and understanding the object are dealt with separately in the processes of both executing actions and assessing resulting outcomes.

3.1.2 Specifying the Action

Besides the ease of perceiving the correct object, it is necessary to consider the ease of perceiving the action (e.g., a key press, a mouse click) to be applied to the correct object. In the case of widely used, well-learned input devices such as a keyboard and a mouse, the availability of key pressing and/or mouse clicking is obvious to the users, and Q4 is always answered affirmatively. However, the input device may be equipped with a novel mechanism, as in the case of Web access in a ubiquitous computing environment using wearable devices, or by people with disabilities using a wide variety of assistive tools. In such cases, there can be no guarantee that the availability of the correct action is spontaneously noticed by the user, which attests to the importance of Q4.
The intelligibility of the perceived action is addressed by Q5. Perceiving the availability of the correct action and understanding the meaning of the perceived action are quite different matters. Consider, for example, the case of invoking the edit mode for a spreadsheet cell. Unless the users know that in this particular situation invoking the edit mode requires double-clicking, they would not attempt to apply a double-click action even if double-clicking itself is in their repertoire of actions. Such is a problem stemming from the difficulty of understanding the correct action (Q5) rather than perceiving its availability (Q4).

The default delivery context of MWBP assumes that mobile devices are equipped with only limited-size keypads with small keys but without any pointing devices (Rabin 2006). MWBP suggests that access keys should be assigned to frequently used functions such as links in navigational menus (5.2.5 in [Rabin 2006]). This situation is related to the ease of interpreting the action in addition to perceiving its availability. First, the availability of access keys should be made easily perceivable to the users. Second, it should also be made easy for the users to understand which access key is assigned to which function. In the current version of MWBP, however, there seems to be no explicit statement regarding the ease of perceiving the correct action (Q4). This should not be taken as a limitation of MWBP, because the device requirements are intended to be illustrative rather than exhaustive in the best practices.
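In markup terms, the access-key assignment that MWBP recommends is simply the accesskey attribute on frequently used links; the sketch below is illustrative, with the link targets and key choices invented. Surfacing the assignment in the link text is one simple way to address the two points above, so that the user can both perceive that a shortcut exists and understand which key triggers which function.

<!-- Navigational menu with access keys surfaced in the link text -->
<a href="index.html" accesskey="1">[1] Home</a>
<a href="search.html" accesskey="2">[2] Search</a>
<a href="basket.html" accesskey="3">[3] Shopping basket</a>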
3.1.3 Assessing the Outcome

Even if the user executes the correct action (i.e., applies the correct action to the correct object), trouble may still occur if the user is not sure what has actually happened. The question Q7 examines whether the user can perceive any physical change on the part of the system (e.g., a page transition on the screen). In the case of mobile Web access, the user may not notice any physical change immediately after the action due to longer page retrieval for a large page size (5.3.2 in [Rabin 2006]) or externally linked large resources (5.2.9 in [Rabin 2006]).

In addition to checking for feedback indicating that something happened (Q7), feedback indicating what happened is checked by Q8. The question Q8 attempts to confirm that the intended users understand the meaning of the system's state change. Even if the users can perceive page redrawing on a small-screen device, they may still have difficulty in understanding the meaning of the state change if there is no content change within the region displayed on the screen. This is the point mentioned in 5.2.2 of MWBP (Rabin 2006). In addition, in error cases the user may perceive that something has gone wrong because of an unexpected result. However, if the interface does not provide any helpful feedback (e.g., error messages), the user may not understand what exactly has gone wrong (5.4.13 in [Rabin 2006]).
3.2 Forming the Intention

If the Web page does not provide any prompt or guidance for what needs to be done, the user may have difficulty in forming a correct intention. The question Q1 is to check whether the user understands the necessity for the action that is to be taken in the current step of the task scenario. MWBP suggests that the main idea of the page content should be provided in the initial view (5.3.4 in [Rabin 2006]). This is crucial for mobile Web access because the information presented to the user in a given view would be quite limited due to the small screen size. If scrolling is needed to get an idea of what the page is for, the user may fail to recognize relevant prompts to form an appropriate intention for the right effect. In this sense, limiting scrolling to one direction (5.3.3 in [Rabin 2006]) is more broadly relevant to avoiding the user's difficulty in forming a correct intention.

In the last step of the HCI loop (Fig. 2), the question Q9 attempts to confirm whether the users can see any progress being made toward their goals. Mobile users are more likely to have goal-directed intentions and seek specific pieces of information than users in the context of desktop Web access (Rabin 2006). Therefore, the guidelines for mobile Web access need to consider issues related to users' goals, although design guidelines usually provide descriptions of good practices without any regard to particular task scenarios. MWBP recommends limiting Web content to what the user has requested, so that information distinguishing the subject matter can be placed first (5.3.1 in [Rabin 2006]). Since the questions Q1 and Q9 are both concerned with users' goal-directed behavior, the issues relevant to Q1 are also of relevance to Q9. Moreover, it is important to note that the success or failure of goal-directed behavior needs to be evaluated on the basis of a concrete sequence of actions, as in scenario-based evaluation. This point is discussed further in Section 4.2.
3.3 Action Slip

Human errors are divided into two major categories: mistakes and slips (Norman 1986). Mistakes are made when inappropriate intentions are formed, which is examined by Q1 and Q9, whereas action slips occur when correct actions are incorrectly executed. The objective of Q6 is to examine the possibility that the user may commit action slips at the time of execution. Input capabilities of mobile devices are usually more restricted than those of desktop terminals equipped with keyboards. Data input on a mobile device tends to be relatively slower and more susceptible to slip errors, even if the user knows the correct sequence of characters to be typed. There are some guidelines in MWBP that can be effective in avoiding slip errors: keep the number of keystrokes to a minimum (5.5.1 in [Rabin 2006]), keep site entry URIs short (5.2.1 in [Rabin 2006]), and provide access keys or keyboard shortcuts for frequently used functionality (5.2.5 in [Rabin 2006]).
4 Lessons Gained from Web Accessibility and Their Implications for Mobile Web

In this section, we discuss some of the reasons why we think the design perspectives exemplified in the extended cognitive walkthrough (ECW) can be an effective means of ensuring accessibility and usability for the mobile Web. Our aim is to shed some light on the design considerations that are of particular importance for the mobile Web.
4.1 Explicit Perceptual–Cognitive Distinction

One fundamental assumption of the ECW is that design remedies for accessibility and usability problems ought to be different depending on whether the problems are essentially perceptual or cognitive in nature, and that accessibility and/or usability evaluations should employ design perspectives that explicitly distinguish between perceptual and cognitive factors of human–computer interaction. This perceptual–cognitive distinction may be a foreign concept to non-professional analysts, but that does not mean that the distinction is difficult for them to incorporate in their evaluations. Once they had experienced analyzing a task scenario or two, the undergraduate students in our previous studies showed a rather natural tendency to view and identify design problems from the viewpoint of the perceptual–cognitive distinction.

As far as design problems are concerned, the perceptual–cognitive distinction is in reality rather blurred, not only because accessibility and/or usability problems can be due partly to perceptual and partly to cognitive factors, but also because there can be "reciprocity" between perceptual and cognitive factors, such that perceptual clarity could help avoid problems of a cognitive nature and vice versa. It is for these reasons that we emphasize the importance of disambiguating the causes of design problems from the viewpoint of the perceptual–cognitive distinction. For example, in the case of Web pages designed for desktop PCs, it is possible that a link with an ambiguous label is successfully chosen because the link is perceptually so emphasized that the user is induced to select the link without actually knowing what it does or where it leads. The shortcoming of such label naming, however, is likely to emerge on other pages where the same sort of perceptual effect may not be used for one reason or other (e.g., no priority difference). Likewise, a poorly visible link may nevertheless be easily noticed because its label name is so familiar to the user or so cognitively distinctive from other names. Also, for Web pages for desktop PCs, since a variety of interface elements of a variety of design options can be made available on a single Web page, they may collectively and synergetically contribute to the ease of perceptual and cognitive processing of individual items.

We believe that the perceptual–cognitive distinction is of more importance for the evaluation of the mobile Web, as there seems to be much less room for
possible reciprocity between perceptual and cognitive factors. For example, the screens of mobile devices typically have such limited space and luminance that the adoption of link designs of high perceptual clarity, possibly high enough to conceal shortcomings of the cognitive aspects of the navigation design, may not be a viable option. Also, the limited screen space prevents the adoption of screen layouts that might facilitate perceptual and cognitive processing of individual interface elements. The simple truth is that the limitations of mobile devices, whether functional, spatial, or otherwise, do not leave designers much room for devising ingenious perceptual and cognitive designs that can ease the shortcomings of the other aspects of the Web design.
4.2 Scenario-Based Evaluation

One critical problem with automated evaluation of Web accessibility is that Web sites are usually analyzed on a page-by-page basis and that scenarios and paths are not taken into account (Bohman 2005). Such page-based analysis lacks efficiency because Web sites are often constructed with consistent templates across pages, which makes it likely that the same errors are repeatedly reported for every page with the same template. Also, page-based analysis tends to present a false impression of the adequacy of the Web site design. In any design evaluation, what ultimately counts should be whether or not the user will be able to complete an intended task. If a given path contains even a single step that is inaccessible, the entire path should be judged to be inaccessible, because that single inaccessible step will prevent the user from completing the task. From the user's point of view, it does not make any sense to judge whether the path is adequately or inadequately designed on the basis of the relative percentage of accessible steps so long as the remaining steps are inaccessible.

As the word "walkthrough" suggests, the essential unit of analysis for the ECW is a scenario or a path. In the preparation of the ECW, a desirable sequence of the user's actions (i.e., a scenario) is defined along with the interface elements available to the user at each step of the action sequence. Then, the interaction design is evaluated in terms of whether or not the intended user will be able to proceed, without much difficulty, from the first through the last step of the sequence. The ECW is a scenario-based evaluation in the very sense that there can be no proper walkthrough analysis without a scenario. The actual unit of analysis (i.e., a single cycle of applying the nine questions) may be smaller than an individual Web page, as it is possible, or even likely, that more than one step may need to be completed on a single (interactive) Web page. However, this should not be regarded as a shortcoming of the ECW. The Web design, like other interaction designs, should be evaluated in terms of whether or not each individual action or step needed to achieve an intended goal can be executed easily and comfortably all the way from the beginning to the end. The ECW, focusing on both individual steps and an entire scenario, should help
designers and developers to attend to the adequacy of both individual Web pages and their linking toward the successful completion of an intended goal. This focus on individual actions and their sequence is of rather critical importance for the evaluation of the mobile Web where functional and spatial limitations of devices and screens are likely to severely constrain not only how the users may execute individual actions but also how they may monitor and evaluate their progress.
4.3 Remote Testing

As mentioned before, it is difficult for Web accessibility guidelines, standards, and regulations to cover all kinds of needs of persons with disabilities. Also, it is difficult for designers and developers to understand, from just reading them, why those specifications are really needed. To compensate for such limitations, it is suggested that Web sites be tested with persons with different disabilities using different assistive technologies and adaptive strategies (Brewer 2004). Although testing with users is widely acknowledged as an effective way of identifying design problems that may otherwise be overlooked even by evaluation specialists, it is also recognized that recruiting appropriate participants often becomes problematic for user testing. A particular challenge for testing with persons with disabilities is that, depending on the types and degrees of their disabilities, it can be extremely difficult, if not impossible, for them to travel away from home to participate in user testing.

The ECW can be used in a way similar to user testing in that the analysts are users themselves. Our previous study showed that university students without prior experience in design evaluation were still able to produce reasonable results by themselves and that they collectively succeeded in identifying as many design problems as experienced analysts did. Further, we confirmed that, with the software tool developed for online ECW evaluation, persons with disabilities were able to conduct the ECW for Web sites remotely from their own homes. The ECW results were insightful and valuable as they were first-hand judgments of persons with disabilities, reflecting the particular needs arising from their individual disabilities.

Remote testing should also be valuable to the evaluation of the mobile Web in a seemingly different but essentially the same sense. By definition, people are expected to access the mobile Web on the move, away from their own office and home. Where and how they may access the mobile Web are likely to vary given a variety of possible contexts of use and mobile devices, which makes it difficult to come up with reasonable task scenarios for design evaluation. What is more difficult, however, is to predict and incorporate the effects of environmental factors in the evaluation. The levels of lighting and noise, for example, may influence how visible and audible a given piece of Web content can be and, in turn, how easily its meaning can be extracted. Yet the levels of such environmental factors may vary, and their representative levels are difficult to determine for the purpose of laboratory testing.
Remote testing actually serves a sort of field testing that often reveals design problems that may have gone undetected in laboratory testing. The value of remote testing lies in its ability to gain more direct and probably more valid evaluation results whether it is due to the inclusion of otherwise inaccessible users and/or of such contexts of use that are difficult to simulate.
5 Conclusions

We agree that the lessons and experiences gained from designing for Web accessibility can be effectively applied to designing the mobile Web. A particular experience addressed here is our own, which we gained from applying the extended cognitive walkthrough (ECW) to the inspection of Web accessibility and usability. Our aim is to draw particular attention to the importance of explicitly distinguishing between perceptual and cognitive aspects of the users' interactions with the Web. The reason is simply that design solutions ought to be different depending on whether the users may experience accessibility and/or usability difficulties at perceptual or cognitive levels of their interactions. Also, we point out that scenario-based evaluation and remote testing may have greater importance for accessibility and usability evaluations of the mobile Web, where the limited screen space of mobile devices and a variety of environmental factors of mobile use become critical design issues. The ECW is presented as a viable means of analyzing accessibility and usability problems from the viewpoints of the explicit perceptual–cognitive distinction and scenario-based evaluation. The ECW, previously shown to be effective in the remote testing of Web pages designed for desktop PCs, seems to be promising for the remote testing of the mobile Web, though obviously some modifications in testing arrangements are needed. Although this chapter does not address accessibility issues of mobile Web users who may have disabilities, such issues are important and should be carefully considered so that the future mobile Web can help persons with disabilities to further expand their sphere of information access.

Acknowledgment This research was supported by the Grant-in-Aid for Scientific Research (17500175) of the Japan Society for the Promotion of Science and by the Academic Frontier Promotion Program (2003–2007) granted to Kansai University by the Japanese Ministry of Education, Culture, Sports, Science and Technology.
References Asakawa, C. What’s the Web like if you can’t see it? In Proceedings of the International Cross Disciplinary Workshop on Web Accessibility, Chiba, Japan, May 10, pp. 1 8 (2005). Brewer, J. Web accessibility highlights and trends. In Proceedings of the International Cross Disciplinary Workshop on Web Accessibility, New York, NY, USA, May 18, pp. 51 55 (2004).
Bohman, P. R. and Anderson, S. A conceptual framework for accessibility tools to benefit users with cognitive disabilities. In Proceedings of the International Cross Disciplinary Workshop on Web Accessibility, Chiba, Japan, May 10, pp. 85 89 (2005). Caldwell, B., Chisholm, W., Slatin, J., and Vanderheiden, G. Web Content Accessibility Guidelines 2.0. W3C Working Draft 27 April 2006, http://www.w3.org/TR/2006/WD WCAG20 20060427/ (for latest version: http://www.w3.org/TR/WCAG20/) (2006). Chisholm, W., Vanderheiden, G., and Jacobs, I. Web Content Accessibility Guidelines 1.0. W3C Recommendation 5 May 1999, http://www.w3.org/TR/WAI WEBCONTENT (1999). Gimson, R. Device Independence Principles. W3C Working Group Note 01 September 2003, http://www.w3.org/TR/2003/NOTE di princ 20030901/ (for latest version see: http:// www.w3.org/TR/di princ/) (2003). Gimson, R., Lewis, R., and Sathish, S. Delivery Context Overview for Device Independence. W3C Working Group Note 20 March 2006, http://www.w3.org/TR/2006/NOTE di dco 20060320/ (for latest version see: http://www.w3.org/TR/di dco/) (2006). Hanson, V. L. The user experience designs and adaptations. In Proceedings of the Interna tional Cross Disciplinary Workshop on Web Accessibility, New York, NY, USA, May 18, pp. 1 11 (2004). Harper, S., Yesilada, Y., and Goble, C. Building the mobile Web: Rediscovering accessibility? W4A International cross disciplinary workshop on Web accessibility workshop report 2006. Accessibility and Computing, No. 85, pp. 21 23 (2006). Jay, C., Stevens, R., Glencross, M., and Chalmers, A. How people use presentation to search for a link: Expanding the understanding of accessibility on the Web. In Proceedings of the International Cross Disciplinary Workshop on Web Accessibility, Edinburgh, Scotland, May 22 23, pp. 21 23 (2006). Kato, T. and Hori, M. Articulating the cognitive walkthrough based on an extended model of HCI. In Proceedings of HCI International, Las Vegas, Nevada (2005). Kato, T. and Hori, M.: ‘‘Beyond perceivability’’: Critical requirements for universal design of information. The Eighth International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS 2006), pp. 287 288, Portland, Oregon (2006). Lewis, R. Glossary of Terms for Device Independence. W3C Working Draft 18 January 2005, http://www.w3.org/TR/2005/WD di gloss 20050118/ (for latest version see: http://www. w3.org/TR/di gloss/) (2005). Lewis, C., Polson, P., Wharton, C., and Rieman, J. Testing a walkthrough methodology for theory based design of walk up and use interfaces. In Proceedings of CHI ’90, Seattle, WA, USA, April 01 05, pp. 235 242 (1990). Norman, D. A. Cognitive engineering. In D. A. Norman & S. W. Draper (Eds.), User centered systems design: New perspectives in human computer interaction, pp. 31 61, Lawrence Erlbaum Associates, Hillsdale, NJ (1986). Polson, P. G., Lewis, C., Rieman, J., and Wharton, C. Cognitive walkthroughs: A method for theory based evaluation of user interfaces. International Journal of Man Machine Studies, Vol. 36, No. 5, pp. 741 773 (1992). Rabin, J. and McCathieNevile, C. Mobile Web Best Practices 1.0: Basic Guidelines. W3C Proposed Recommendation 2 November 2006, http://www.w3.org/TR/2006/PR mobile bp 20061102/ (for latest version see http://www.w3.org/TR/mobile bp/) (2006). Wharton, C., Rieman, J., Lewis, C., and Polson, P. The cognitive walkthrough method: A practitioner’s guide. In J. Nielsen & R. L. Mack (Eds.), Usability inspection methods, pp. 105 140, John Wiley & Sons, New York, NY (1994).
Semantic Web
Ian Horrocks and Sean Bechhofer
Abstract The Semantic Web aims to explicate the meaning of Web content by adding semantic annotations that describe the content and function of resources. Providing shareable annotations requires the use of ontologies that describe a common model of a domain. The Web Ontology Language OWL has been defined in order to support representation of ontologies, and their manipulation through the use of reasoning. We provide a brief overview of OWL and the underlying theory, describe applications of ontologies, and give pointers to areas of overlap between Semantic Web and Accessibility research.
1 Introduction

While phenomenally successful in terms of size and number of users, today's World Wide Web is fundamentally a relatively simple artefact. Web content consists mainly of distributed hypertext, and it is accessed via a combination of keyword-based search and link navigation. This simplicity has been one of the great strengths of the Web, and has been an important factor in its popularity and growth: naive users are able to use it and can even create their own content.

The explosion in both the range and quantity of Web content has, however, highlighted some serious shortcomings in the hypertext paradigm. In the first place, the required content becomes increasingly difficult to locate using the search and browse paradigm. Finding information about people with very common names (or with famous namesakes) can, for example, be a frustrating experience. More complex queries can be even more problematical: a query for ‘‘animals that use sonar but are neither bats nor dolphins’’ may either return many irrelevant results related to bats and dolphins (because the search engine failed to understand the negation), or may fail to return many relevant results (because most relevant Web pages also mention bats or dolphins). More complex tasks may be extremely difficult, or even impossible. Examples of such
tasks include locating information in data repositories that are not directly accessible to search engines (Volz et al. 2004), or finding and using so-called Web services (McIlraith et al. 2001). If human users have difficulty accessing Web content, the problem is even more severe for automated processes. This is because Web content is primarily intended for presentation to and consumption by human users: HTML markup is mainly concerned with layout, size, colour and other presentational issues. Moreover, Web pages increasingly use images, often including active links, to present information. Human users are able to interpret the significance of such features, and thus understand the information being presented, but this may not be so easy for an automated process or ‘‘software agent’’. The Semantic Web aims to overcome some of the above-mentioned problems by making Web content more accessible to automated processes; the ultimate goal is to transform the existing Web into ‘‘. . . a set of connected applications . . . forming a consistent logical web of data . . .’’ (Berners-Lee 1998). This is to be achieved by adding semantic annotations to Web content, i.e., annotations that describe the meaning of the content. In the remainder of this chapter, we will examine in a little more detail what semantic annotations will look like, how they describe meaning, and how automated processes can exploit such descriptions. We will also discuss the impact of the Semantic Web and Semantic Web technology on accessibility.
2 Background

As we mentioned above, the key idea behind the Semantic Web is to explicate the meaning of Web content by adding semantic annotations. If we assume for the sake of simplicity that such annotations take the form of XML-style tags, we could imagine a fragment of a Web page being annotated as follows:

<Wizard>Harry Potter</Wizard> has a pet called <SnowyOwl>Hedwig</SnowyOwl>.
Taken in isolation, however, such annotations are of only limited value: the problem of understanding the terms used in the text has simply been transformed into the problem of understanding the terms used in the labels. A query for information about raptors, for example, may not retrieve this text, even though owls are raptors. This is where ontologies come into play: they provide a mechanism for introducing a vocabulary and giving precise meanings to the terms in the vocabulary. A suitable ontology might, for example, introduce the term SnowyOwl, and include the information that a SnowyOwl is a kind of Owl, and that an Owl is a kind of Raptor. Moreover, if this information is represented in a way that is accessible to our query engine, then it would be able to recognise that the above text is relevant to our query about raptors. Ontology, in its original philosophical sense, is a fundamental branch of metaphysics focussing on the study of existence; its objective is to determine
Fig. 1 Tree of Porphyry
what entities and types of entities actually exist, and thus to study the structure of the world. The study of ontology can be traced back to the work of Plato and Aristotle, and from the very beginning included the development of hierarchical categorisations of different kinds of entity and the features that distinguish them: the well-known ‘‘tree of Porphyry’’, for example, identifies animals and plants as sub-categories of living things distinguished by animals being sensitive, and plants being insensitive (see Fig. 1).

In computer science, an ontology is usually taken to be a model of (some aspect of) the world; it introduces vocabulary describing various aspects of the domain being modelled and provides an explicit specification of the intended meaning of the vocabulary. This specification often includes classification-based information not unlike that in Porphyry's famous tree. For example, Fig. 2 shows a screenshot of a Pizza ontology as displayed by the Protégé ontology design tool (Knublauch et al. 2004). The ontology introduces various pizza-related vocabulary (some of which can be seen in the left-hand panel), such as ‘‘NamedPizza’’ and ‘‘RealItalianPizza’’, and arranges it hierarchically: RealItalianPizza is, for example, a sub-category of NamedPizza. The other panels display information about the currently selected category, RealItalianPizza in this case, describing its meaning: a RealItalianPizza is a Pizza whose country of origin is Italy; moreover, a RealItalianPizza always has a ThinAndCrispyBase. Ontologies can be used to annotate and to organise data from the domain: if our data includes instances of RealItalianPizza, then we can return them in response to a query for instances of NamedPizza.
Fig. 2 Example pizza ontology
3 The Web Ontology Language OWL

The architecture of the Web depends on agreed standards such as HTTP that allow information to be shared and exchanged. A standard ontology language is, therefore, a prerequisite if ontologies are to be used in order to share and exchange meaning. Recognising this fact, the World Wide Web Consortium (W3C) set up a standardisation working group to develop such a language. The result of this activity was the OWL Web Ontology Language standard (Patel-Schneider et al. 2004). OWL exploited existing work on languages such as OIL (Fensel et al. 2001) and DAML+OIL (Horrocks et al. 2002) and, like them, was based on a Description Logic (DL). In the following, we will briefly introduce DLs and OWL. For more complete information, the reader should consult The Description Logic Handbook (Baader et al. 2003) and the OWL specification (Patel-Schneider et al. 2004).
3.1 Description Logic

Description logics (DLs) are a family of logic-based knowledge representation formalisms; they are descendants of Semantic Networks (Woods 1985) and KL-ONE (Brachman and Schmolze 1985). These formalisms all adopt an object-oriented model, similar to the one used by Plato and Aristotle, in
which the domain is described in terms of individuals, concepts (usually called classes in ontology languages) and roles (usually called relationships or properties in ontology languages). Individuals, e.g., ‘‘Socrates’’, are the basic elements of the domain; concepts, e.g., ‘‘Human’’, describe sets of individuals having similar characteristics; and roles, e.g., ‘‘hasPupil’’, describe relationships between pairs of individuals, such as ‘‘Socrates hasPupil Plato’’.

As well as atomic concept names such as Human, DLs also allow for concept descriptions to be composed from atomic concepts and roles. Moreover, it is possible to assert that one concept (or concept description) is subsumed by (is a sub-concept of), or is exactly equivalent to, another. This allows for easy extension of the vocabulary by introducing new names as abbreviations for descriptions. For example, using standard DL notation, we might write

HappyParent ≡ Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic)

This introduces the concept name HappyParent and asserts that its instances are just those individuals that are instances of Parent, and all of whose children are instances of either Intelligent or Athletic.

Another distinguishing feature of DLs is that they are logics, and so have a formal semantics. DLs can, in fact, be seen as decidable subsets of first-order predicate logic, with individuals being equivalent to constants, concepts to unary predicates, and roles to binary predicates. As well as giving a precise and unambiguous meaning to descriptions of the domain, this also allows for the development of reasoning algorithms that can be used to answer complex questions about the domain. An important aspect of DL research has been the design of such algorithms, and their implementation in (highly optimised) reasoning systems that can be used by applications to help them ‘‘understand’’ the knowledge captured in a DL-based ontology. We will return to this point in Section 4.

A given DL is characterised by the set of constructors provided for building concept descriptions. These typically include at least intersection (⊓), union (⊔), and complement (¬), as well as restricted forms of existential (∃) and universal (∀) quantification, which in OWL are called, respectively, someValuesFrom and allValuesFrom restrictions. OWL is based on a very expressive DL called SHOIN that also provides cardinality restrictions (≥, ≤) and enumerated classes (called oneOf in OWL) (Horrocks et al. 2003, Horrocks and Sattler 2005). Cardinality restrictions allow, e.g., for the description of a concept such as people who have at least two children, while enumerated classes allow for classes to be described by simply enumerating their instances, e.g.,

EUcountries ≡ {Austria, . . . , UK}

SHOIN also provides for transitive roles, allowing us to state, e.g., that if x has an ancestor y and y has an ancestor z, then z is also an ancestor of x, and for inverse roles, allowing us to state, e.g., that if z is an ancestor of x, then x is also a descendant of z. The constructors provided by OWL, and the equivalent DL syntax, are summarised in Fig. 3.
Fig. 3 OWL constructors
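To give a flavour of that correspondence (an informal, partial sketch rather than the complete table), some OWL constructors and their DL counterparts are:

    intersectionOf    C ⊓ D
    unionOf           C ⊔ D
    complementOf      ¬C
    oneOf             {a, b, ...}
    someValuesFrom    ∃R.C
    allValuesFrom     ∀R.C
    minCardinality    ≥ n R
    maxCardinality    ≤ n R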
In DLs, it is usual to separate the set of statements that establish the vocabulary to be used in describing the domain (what we might think of as the schema) from the set of statements that describe some particular situation that instantiates the schema (what we might think of as data); the former is called the TBox (Terminology Box), and the latter the ABox (Assertion Box). An OWL ontology is simply equivalent to a set of SHOIN TBox and ABox statements. This mixing of schema and data is quite unusual (in fact, ontologies are usually thought of as consisting only of the schema part), but does not affect the meaning – from a logical perspective, SHOIN KBs and OWL ontologies are just sets of axioms. The main difference between OWL and SHOIN is that OWL ontologies use an RDF-based syntax intended to facilitate its use in the context of the Semantic Web. This syntax is rather verbose, and not well suited for presentation to human beings. For example, the description of HappyParent given above would be written in OWL's RDF syntax as follows:
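(The fragment below is an illustrative sketch rather than a verbatim rendering: namespace declarations are omitted, and the class and property names are simply those of the running example.)

<!-- illustrative sketch of HappyParent in OWL's RDF/XML syntax; namespaces omitted -->
<owl:Class rdf:ID="HappyParent">
  <owl:equivalentClass>
    <owl:Class>
      <owl:intersectionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Parent"/>
        <owl:Restriction>
          <owl:onProperty rdf:resource="#hasChild"/>
          <owl:allValuesFrom>
            <owl:Class>
              <owl:unionOf rdf:parseType="Collection">
                <owl:Class rdf:about="#Intelligent"/>
                <owl:Class rdf:about="#Athletic"/>
              </owl:unionOf>
            </owl:Class>
          </owl:allValuesFrom>
        </owl:Restriction>
      </owl:intersectionOf>
    </owl:Class>
  </owl:equivalentClass>
</owl:Class>

Even this small description runs to a dozen or so lines of nested markup, which is precisely why the RDF syntax is rarely shown to human readers directly.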
4 Ontology Reasoning

We mentioned in Section 3.1 that the design and implementation of reasoning systems is an important aspect of DL research. The availability of such reasoning systems was one of the motivations for basing OWL on a DL. This is because reasoning is essential in supporting both the design of high-quality ontologies and the deployment of ontologies in applications.
4.1 Reasoning at Design Time

Ontologies may be very large and complex: the well-known SNOMED clinical terms ontology includes, for example, more than 200,000 class names (Spackman 2000). Building and maintaining such ontologies is very costly and time-consuming, and providing tools and services to support this ‘‘ontology engineering’’ process is of crucial importance to both the cost and the quality of the resulting ontology. State-of-the-art ontology development tools, such as SWOOP (Kalyanpur et al. 2005a) and Protégé (Knublauch et al. 2004), therefore use a DL reasoner, such as FaCT++ (Tsarkov and Horrocks 2006), Racer (Haarslev and Möller 2001), or Pellet (Sirin et al. 2005), to provide feedback to the user about the logical implications of their design. This typically includes (at least) warnings about inconsistencies and redundancies.

An inconsistent (sometimes called unsatisfiable) class is one whose description is ‘‘over-constrained’’, with the result that it can never have any instances. This is typically an unintended feature of the design – why introduce a name for a class that can never have any instances? – and may be due to subtle interactions between descriptions. The ability to detect such classes and bring them to the attention of the ontology engineer is, therefore, a very useful feature. It is also possible that the descriptions in the ontology mean that two classes necessarily have exactly the same set of instances, i.e., they are alternative names for the same class. This may be desirable in some situations, e.g., to capture the fact that ‘‘myocardial infarction’’ and ‘‘heart attack’’ mean the same thing. It could, however, also be the inadvertent result of interactions between descriptions, and so it is also useful to be able to alert users to the presence of such ‘‘synonyms’’.

In addition to checking for inconsistencies and synonyms, ontology development tools usually also check for implicit subsumption relationships, and amend the class hierarchy accordingly. This is also a very useful design aid: it allows the ontology developer to focus on class descriptions, leaving the computation of the class hierarchy to the reasoner, and it can also be used by the developer to check if the hierarchy induced by the class descriptions is consistent with their intuition.

Recent work has also shown how reasoning can be used to support modular design (Cuenca Grau et al. 2007b) and module extraction (Cuenca Grau et al.
2007a), important techniques for working with large ontologies. When developing a large ontology such as SNOMED, it is useful if not essential to divide the ontology into modules, e.g., to facilitate parallel work by a team of ontology developers. Reasoning techniques can be used to alert the developers to unanticipated and/or undesirable interactions between the various modules. Similarly, it may be desirable to extract from a large ontology a smaller module containing all the information relevant to some subset of the domain, e.g., heart disease – the resulting small(er) ontology will be easier for humans to understand and easier for applications to use. Reasoning can be used to compute a module that is as small as possible while still containing all the necessary information. Finally, in order to maximise the benefit of all these services, a modern system should also be able to explain its inferences: without this facility, users may find it difficult to repair errors in the ontology and may even start to doubt the correctness of the reasoning system. Explanation typically involves computing a (hopefully small) subset of the ontology that still entails the inference in question, and if necessary presenting the user with a chain of reasoning steps (Kalyanpur et al. 2005b).
4.2 Reasoning in Deployment

Reasoning is also important when ontologies are deployed in applications – it is needed, e.g., in order to answer structural queries about the domain and to retrieve data. If we assume, for example, an ontology that includes the above description of HappyParent, and we know that John is a HappyParent, that John has a child Mary (i.e., John hasChild Mary), and that Mary is not Athletic, then we would like to be able to infer that Mary is Intelligent.

The above example may seem quite trivial, but it is easy to imagine that with large ontologies, query answering may be a very complex task. The use of DL reasoners allows OWL ontology applications to answer complex queries and to provide guarantees about the correctness of the result. This is particularly important if ontology-based systems are to be used as components in larger applications, such as the Semantic Web, where the correct functioning of automated processes may depend on their being able to (correctly) answer such queries.
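Written out in DL notation, the John and Mary example amounts to a small knowledge base of the following form (an illustrative sketch using the names from the text, not an excerpt from any particular ontology):

    HappyParent ≡ Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic)
    HappyParent(John)
    hasChild(John, Mary)
    ¬Athletic(Mary)

Because John is a HappyParent, every individual related to John by hasChild must be Intelligent or Athletic; Mary is such an individual and is asserted not to be Athletic, so a reasoner can conclude Intelligent(Mary).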
5 Ontology Applications

The availability of tools and reasoning systems such as those mentioned in Section 4 has contributed to the increasingly widespread use of OWL, not only in the Semantic Web per se, but as a popular language for ontology development in fields as diverse as biology (Sidhu et al. 2005), medicine (Golbreich et al.
2006), geography (Goodwin 2005), geology (SWEET 2006), astronomy (Derriere et al. 2006), agriculture (Soergel et al. 2004), and defence (Lacy et al. 2005). Applications of OWL are particularly prevalent in the life sciences, where it has been used by the developers of several large biomedical ontologies, including the Biological Pathways Exchange (BioPAX) ontology (Ruttenberg et al. 2005), the GALEN ontology (Rector and Rogers 2006), the Foundational Model of Anatomy (FMA) (Golbreich et al. 2006), and the National Cancer Institute Thesaurus (Hartel et al. 2005). The importance of reasoning support in such applications was highlighted by Kershenbaum et al. (2006), who describe a project in which the Medical Entities Dictionary (MED), a large ontology (100,210 classes and 261 properties) that is used at the Columbia Presbyterian Medical Center, was converted into OWL and checked using an OWL reasoner. This check revealed ‘‘systematic modelling errors’’ and a significant number of missed subClass relationships which, if not corrected, ‘‘could have cost the hospital many missing results in various decision support and infection control systems that routinely use MED to screen patients’’.
6 Semantics and Accessibility

The development of Semantic Web languages and technology was primarily driven by a desire to overcome the problems encountered by ‘‘software agents’’ in using information available on the Web. However, the chief characteristic of the Semantic Web approach, namely the annotation of resources with machine-readable descriptions, also offers a promise for accessibility. Principled separation of content from presentational information (e.g., through the use of CSS) helps to alleviate some problems – for example, ensuring that presentational aspects are not used to convey additional meaning. Rich, semantic annotations of content can push this further – explicitly publishing content in machine-readable forms opens up the possibilities for end-user applications to transform the annotated information and present it to a user in an appropriate fashion.

We can identify at least three areas where accessibility issues relate to the Semantic Web. First of all, Semantic Web end-user applications targeted at consumers must be sympathetic to the needs of users. There is also an increasing number of tools aimed at producers of semantic information – again, we must be careful in the design and execution of these applications. Finally, there is the possibility of using Semantic Web technologies and approaches in supporting access to content. We briefly discuss each of these issues.
6.1 End-User Applications

A number of applications (e.g., Magpie (Dzbor et al. 2003) or COHSE (Carr et al. 2001, Yesilada et al. 2006)) provide what we might call Semantic Web
Browsers. These provide enhanced navigational possibilities for users, based on additional semantic information which is either embedded in pages, added through annotations, or gleaned at run time through the use of natural language processing techniques. While browsing a Web resource, the applications give additional context or links to related resources. These applications tend to use client-side processing (e.g., relying on dynamic HTML or AJAX-style interactions) in order to provide an enhanced user experience. Clearly this raises questions as to the accessibility of the presentations generated. Similarly, applications and tools that support browsing of RDF repositories tend to be graphical in nature. To date, the issue of accessibility has not been explicitly tackled within such applications.
6.2 Support Tools

Tools are also available to support producers of semantic information. As discussed in Section 3.1, the normative presentation syntax for OWL (RDF/XML) is a rather verbose format which is not particularly human readable. Tooling is thus required to support editing and manipulation of ontologies. In particular, ontology editors such as Protégé and SWOOP allow the development, construction, and maintenance of ontologies. However, these tools are largely graphical in nature, and present potential difficulties for visually impaired users. As with end-user tools, little exploration of accessible ontology development interfaces has been done to date – ontology editors are still perhaps something of a niche market. Most modern ontology development tools have been developed using Java, so the possibility exists for enhancement of the interfaces (e.g., through the Java Accessibility API), but additional care is likely to be required in the interface design.
6.3 Annotation for Accessibility

Finally, we turn our attention to the use of Semantic Web technology to support better access to information. The Semantic Web is built on the notion of annotation or decoration of resources with additional information describing the content or function of those resources. This explicit representation of information allows applications or software agents to perform actions on behalf of users. Improving sharing and interoperability between applications is seen as a key benefit of the use of ontologies. EARL (Abou-Zahra 2005) uses vocabularies in order to facilitate the exchange of information between tools. Although most examples of Semantic Web applications focus on tasks such as searching or information integration, semantic annotation has been applied in order to support Web content transcoding. Annotations for Web content transcoding aim to provide better support either for audio rendering, and thus for visually impaired users, or for visual rendering in small-screen devices.
Proxy-based systems to transcode Web pages based on external annotations for visually impaired users have been proposed (Takagi and Asakawa 2000, Asakawa and Takagi 2000). The main focus is on extracting visually fragmented groupings, their roles and importance. There is no particular attempt to provide a deep understanding or analysis of the page. Approaches such as SWAP (Seeman 2004) use semantic annotations to support accessibility and device independence. The SeEBrowser (Kouroupetroglou et al. 2006) consumes annotations in order to support visually impaired users’ navigation around pages. Interestingly, the ontology editor reported in Kouroupetroglou et al. (2006) is very much a visual tool (as discussed above). DANTE (Yesilada et al. 2004) used an ontology known as WAfA to provide terms relating to mobility of visually impaired users. Annotations made on pages describe the roles that particular elements may play. These annotations can then drive a transformation process. The use of an ontology helps to guarantee a consistency across annotations and their interpretation. DANTE relies on the annotation of individual pages, however, which can be costly. An alternative approach is adopted by SADIe (Harper and Bechhofer 2005, Bechhofer et al. 2006), which relies on annotations of style sheet information attached to pages. The rationale behind SADIe’s approach is that the classes that appear in Cascading Style Sheet definitions often have some implicit semantics associated with them. For example, a CSS class menu may be used to define the presentational attributes associated with a menu appearing on a page. Such an object is likely to be important in supporting navigation around a site, and so should be given a prominent rendering in a transcoded page. The ontology provides an abstraction over the roles of elements appearing in pages, and it allows applications to apply general transformations over different sites. The structure of the ontology also allows the description of general rules – e.g., menu items should be promoted to a prominent position – along with specialisations of those items – e.g., items in a navigation bar are menu items. These annotations differ slightly from mainstream Semantic Web annotation approaches (Handschuh and Staab 2003) which tend to focus on annotation of content rather than structure. This is still, however, an example of the explicit exposure of information in machine readable forms. The approach of treating annotations as first-class citizens, separate from the resources they annotate, is of benefit here, however, allowing third parties’ potential opportunities to improve access to resources where the original provider will not, or cannot alter existing content.
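To make the style sheet approach concrete (a hypothetical fragment: the class name and link targets below are invented for illustration, not taken from SADIe itself), a site might attach a CSS class menu to its navigation block:

<!-- hypothetical fragment; class name and targets are illustrative -->
<div class="menu">
  <a href="news.html">News</a>
  <a href="reviews.html">Reviews</a>
  <a href="contact.html">Contact</a>
</div>

An annotation that maps the class menu to a navigation role in the ontology then lets a transcoder promote (or de-emphasise) this block consistently on every page that uses the same style sheet, without each page having to be annotated individually.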
7 Future Directions

As we have seen in Section 5, OWL is already being successfully used in many applications. This success brings with it, however, many challenges for the future development of both the OWL language and OWL tool support. Central
to these is the familiar tension between the requirements for advanced features, in particular increased expressive power, and raw performance, in particular the ability to deal with very large ontologies and data sets. Use of OWL in the life sciences domain has brought to the fore examples of both of the above-mentioned requirements.

On the one hand, ontologies describing complex systems in medicine and biology often require expressive power beyond what is currently supported in OWL. Two particular features that are very often requested are the ability to ‘‘qualify’’ cardinality constraints, e.g., to describe the hand as having four parts that are fingers and one part that is a thumb, and the ability to have some characteristics be transferred across transitive part-whole relations, e.g., to capture the fact that a disease affecting a part of an organ affects the organ as a whole. The former feature (so-called qualified cardinality restrictions) has long been well understood, and has been available for some time in DL reasoners; the latter feature is now also well understood, thanks to recent theoretical work in the DL community (Horrocks and Sattler 2004, Horrocks et al. 2006), and has recently been implemented in DL reasoners. This happy coincidence of user requirements and extensions in the underlying DLs and reasoning systems has led to a proposal to extend OWL with these and other useful features that have been requested by users, for which effective reasoning algorithms are now available, and that OWL tool developers are willing to support. In addition to those mentioned above, the new features include extra syntactic sugar, extended datatype support, simple metamodelling, and extended annotations. The extended language, called OWL 1.1, is now a W3C member submission, and it is already supported by tools such as Swoop, Protégé, and TopBraid Composer.

As well as increased expressive power, applications may also bring with them requirements for scalability that are a challenge to current systems. This may include the ability to reason with very large ontologies, perhaps containing tens or even hundreds of thousands of classes, and the ability to use an ontology with very large data sets, perhaps containing tens or even hundreds of millions of individuals – in fact, data sets much larger than this will certainly be a requirement in some applications. Researchers are rising to these challenges by developing new reasoning systems such as the OWL Instance Store (Bechhofer et al. 2005), which uses a combination of DL reasoning and relational database systems to deal with large volumes of instance data, HermiT, which uses a hypertableau-based technique to deal more effectively with large and complex ontologies, and Kaon2 (Hustadt et al. 2004), which reduces OWL ontologies to disjunctive datalog programs and uses deductive database techniques to enable it to deal with very large data sets.
8 Conclusions

As we have seen, the goal of Semantic Web research is to transform the Web from a linked document repository into a distributed knowledge base and application platform, thus allowing the vast range of available information and services to be more effectively exploited. As a first step in this transformation, languages such as OWL have been developed; these languages are designed to capture the knowledge that will enable applications to better understand Web-accessible resources and to use them more intelligently. As we have seen in Section 6, the annotation of resources with machine-readable descriptions also offers a promise for accessibility.

Although the full realisation of the Semantic Web still seems some way off, OWL has already been very successful and has rapidly become a de facto standard for ontology development in fields as diverse as geography, geology, astronomy, agriculture, defence, and the life sciences. An important factor in this success has been the availability of sophisticated tools with built-in reasoning support. The use of OWL in large-scale applications has brought with it new challenges, both with respect to expressive power and scalability, but recent research has also shown how the OWL language and OWL tools can be extended and adapted to meet these challenges.
References Abou Zahra Shadi. Semanticweb enabled web accessibility evaluation tools. In W4A ’05: Proc. of the 2005 Int. Cross Disciplinary Workshop on Web Accessibility (W4A), pages 99 101, New York, USA, 2005. ACM Press. ISBN 1 59593 219 4. doi: http://doi.acm.org/ 10.1145/1061811.1061830. Asakawa Chieko and Takagi Hironobu. Annotation based transcoding for nonvisual web access. In Proc. of the Fourth Int. ACM Conf. on Assistive Technologies, pages 172 179. ACM Press, 2000. Baader Franz, Calvanese Diego, McGuinness Deborah, Nardi Daniele, and Patel Schneider Peter F., editors. The Description Logic Handbook: Theory, Implementation and Applica tions. Cambridge University Press, 2003. Bechhofer Sean, Horrocks Ian, and Turi Daniele. The OWL instance store: System descrip tion. In Proc. of the 20th Int. Conf. on Automated Deduction (CADE 20), Lecture Notes in Artificial Intelligence, pages 177 181. Springer, 2005. Bechhofer Sean, Harper Simon, and Lunn Darren. SADIe: Semantic Annotation for Acces sibility. In I. Cruz, S. Decker, D. Allemang, C. Preist, D. Schwabe, P. Mika, M. Uschold, and L. Aroyo, editors, Proc. of ISWC2006, the Fifth Int. Semantic Web Conf., volume 4273 of Lecture Notes in Computer Science, pages 101 115. Springer, November 2006. Berners Lee Tim. Semantic web road map, September 1998. Available at http://www.w3.org/ DesignIssues/Semantic.html Brachman Ronald J. and Schmolze James G. An overview of the Kl ONE knowledge repre sentation system. Cognitive Science, 9(2):171 216, April 1985. Carr Leslie, Bechhofer Sean, Goble Carole, and Hall Wendy. Conceptual Linking: Ontology based Open Hypermedia. In WWW10, Tenth World Wide Web Conference, May 2001.
Cuenca Grau Bernardo, Horrocks Ian, Kazakov Yevgeny, and Sattler Ulrike. Just the right amount: Extracting modules from ontologies. In Proc. of the Sixteenth Inter. World Wide Web Conf. (WWW 2007), 2007a. URL download/2007/CHKS07a.pdf Cuenca Grau Bernardo, Kazakov Yevgeny, Horrocks Ian, and Sattler Ulrike. A logical framework for modular integration of ontologies. In Proc. of the 20th Int. Joint Conf. on Artificial Intelligence (IJCAI 2007), 2007b. URL download/2007/CKHS07a.pdf Derriere Sebastian, Richard Andre, and Preite Martinez Andrea. An ontology of astronom ical object types for the virtual observatory. Proc. of Special Session 3 of the 26th meeting of the IAU: Virtual Observatory in Action: New Science, New Technology, and Next Genera tion Facilities, 2006. Dzbor Martin, Domingue John, and Motta Enrico. Magpie Towards a Semantic Web Browser. In Dieter Fensel, Katia Sycara, and John Mylopoulos, editors, 2nd Int. Semantic Web Conf., ISWC, volume 2870 of Lecture Notes in Computer Science. Springer, October 2003. Fensel Dieter, van Harmelen Frank, Horrocks Ian, McGuinness Deborah, and Patel Schneider Peter F. OIL: An ontology infrastructure for the semantic web. IEEE Intelligent Systems, 16(2):38 45, 2001. URL download/2001/IEEE IS01.pdf Golbreich Christine, Zhang Songmao, and Bodenreider Olivier. The foundational model of anatomy in OWL: Experience and perspectives. Journal of Web Semantics, 4(3), 2006. Goodwin John. Experiences of using OWL at the ordnance survey. In Proc. of the First OWL Experiences and Directions Workshop, volume 188 of CEUR Workshop Proceedings, CEUR. 2005 (http://ceur ws.org/) Haarslev Volker and Moller Ralf. RACER system description. In Proc. of the Int. Joint Conf. ¨ on Automated Reasoning (IJCAR 2001), volume 2083 of Lecture Notes in Artificial Intelligence, pages 701 705. Springer, 2001. Handschuh Siegfried and Staab Steffen, editors. Annotation for the Semantic Web, volume 96 of Frontiers in Artifical Intelligence and Applications. IOS Press, 2003. Harper Simon and Bechhofer Sean. Semantic Triage for Accessibility. IBM Systems Journal, 44(3):637 648, 2005. Hartel Frank W., de Coronado Sherri, Dionne Robert, Fragoso Gilberto, and Golbeck Jennifer. Modeling a description logic vocabulary for cancer research. Journal of Biome dical Informatics, 38(2):114 129, 2005. Horrocks Ian and Sattler Ulrike. Decidability of SHI Q with complex role inclusion axioms. Artificial Intelligence, 160(1 2):79 104, December 2004. Horrocks Ian and Sattler Ulrike. A tableaux decision procedure for SHOI Q. In Proc. of the 19th Int. Joint Conf. on Artificial Intelligence (IJCAI 2005), pages 448 453, 2005. URL download/2005/HoSa05a.pdf Horrocks Ian, Patel Schneider Peter F., and van Harmelen Frank. Reviewing the design of DAML+OIL: An ontology language for the semantic web. In Proc. of the 18th Nat. Conf. on Artificial Intelligence (AAAI 2002), pages 792 797. AAAI Press, 2002. ISBN 0 26251 129 0. URL download/2002/AAAI02IHorrocks.pdf Horrocks Ian, Patel Schneider Peter F., and van Harmelen Frank. From SHI Q and RDF to OWL: The making of a web ontology language. Journal of Web Semantics, 1(1):7 26, 2003. ISSN 1570 8268. URL download/2003/HoPH03a.pdf Horrocks Ian, Kutz Oliver, and Sattler Ulrike. The even more irresistible SROI Q. In Proc. of the 10th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR 2006), pages 57 67. AAAI Press, 2006. Hustadt Ullrich, Motik Boris, and Sattler Ulrike. Reducing SHIQ description logic to dis junctive datalog programs. In Proc. 
of the 9th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR 2004), pages 152 162, 2004. Kalyanpur Aditya, Parsia Bijan, Sirin Evren, Cuenca Grau Bernardo, and Hendler James. SWOOP: a web ontology editing browser. Journal of Web Semantics, 4(2), 2005a.
Kalyanpur Aditya, Parsia Bijan, Sirin Evren, and Hendler James. Debugging unsatisfiable classes in owl ontologies. Journal of Web Semantics, 3(4):243 366, 2005b. URL http:// www.mindswap.org/papers/debugging jws.pdf Kent A. Spackman. Managing clinical terminology hierarchies using algorithmic calculation of subsumption: Experience with SNOMED RT. Journal of the American Med. Infor matics Ass., 2000. Fall Symposium Special Issue. Kershenbaum Aaron, Fokoue Achille, Patel Chintan, Welty Christopher, Schonberg Edith, Cimino James, Ma Li, Srinivas Kavitha, Schloss Robert, and Murdock J William. A view of OWL from the field: Use cases and experiences. In Proc. of the Second OWL Experi ences and Directions Workshop, volume 216 of 2006 CEUR. http://ceur ws.org/) Knublauch Holger, Fergerson Ray, Noy Natalya, and Musen Mark. The Prote´ge´ OWL Plugin: An open development environment for semantic web applications. In Sheila A. McIlraith, Dimitris Plexousakis, and Frank van Harmelen, editors, Proc. of the 2004 Int. Semantic Web Conf. (ISWC 2004), number 3298 in Lecture Notes in Computer Science, pages 229 243. Springer, 2004. ISBN 3 540 23798 4. Kouroupetroglou Christos, Salampasis Michail, and Manitsaris Athanasios. A semantic web based framework for developing applications to improve accessibility in the www. In W4A: Proc. of the 2006 Int. Cross Disciplinary Workshop on Web Accessibility (W4A), pages 98 108, New York, NY, USA, 2006. ACM Press. ISBN 1 59593 281 X. doi: http:// doi.acm.org/10.1145/1133219.1133238 Lacy Lee, Aviles Gabriel, Fraser Karen, Gerber William, Mulvehill Alice, and Gaskill Robert. Experiences using OWL in military applications. In Proc. of the First OWL Experiences and Directions Workshop, volume 188 of CEUR Workshop Proceedings, CEUR. 2005. (http://ceur ws.org/) McIlraith Sheila, Son Tran Cao, and Zeng Honglei. Semantic web services. IEEE Intelligent Systems, 16:46 53, 2001. Patel Schneider Peter F., Hayes Patrick, and Horrocks Ian. OWL Web Ontology Language semantics and abstract syntax. W3C Recommendation, 10 February 2004. Available at http://www.w3.org/TR/owl semantics/ Rector Alan and Rogers Jeremy. Ontological and practical issues in using a description logic to represent medical concept systems: Experience from GALEN. In Reasoning Web, Second International Summer School, Tutorial Lectures, volume 4126 of LNCS, pages 197 231. SV, 2006. Ruttenberg Alan, Rees Jonathan, and Luciano Joanne. Experience using OWL DL for the exchange of biological pathway information. In Proc. of the First OWL Experiences and Directions Workshop, volume 188 of CEUR Workshop Proceedings, CEUR. 2005 (http:// ceur ws.org/) Seeman Lisa. The semantic web, web accessibility, and device independence. In W4A ’04: Proc. of the 2004 Int. Cross Disciplinary Workshop on Web Accessibility (W4A), pages 67 73, New York, NY, USA, 2004. ACM Press. ISBN 1 58113 903 9. doi: http://doi.acm. org/10.1145/990657.990669. Sidhu Amandeep, Dillon Tharam, Chang Elisabeth, and Sidhu Baldev Singh. Protein ontol ogy development using OWL. In Proc. of the First OWL Experiences and Directions Workshop, volume 188 of CEUR Workshop Proceedings, CEUR. 2005 (http://ceur ws. org/) Sirin Evren, Parsia Bijan, Cuenca Graua Bernardo, Kalyanpura Aditya, and Katz Yarden. Pellet: A practical OWL DL reasoner. Web Semantics: Science, Services and Agents on the World Wide Web, 5(2):51 53, June 2007. Soergel Dagobert, Lauser Boris, Liang Anita, Fisseha Frehiwot, Keizer Johannes, and Stephen Katz. 
Reengineering thesauri for new applications: The AGROVOC example. Journal of Digital Information, 4(4), 2004. SWEET. Semantic web for earth and environmental terminology (SWEET). Jet Propulsion Laboratory, California Institute of Technology, 2006. http://sweet.jpl.nasa.gov/
Takagi Hironobu and Asakawa Chieko. Transcoding proxy for nonvisual web access. In Proc. of the Fourth Int. ACM Conf. on Assistive Technologies, pages 164 171. ACM Press, 2000. Tsarkov Dmitry and Horrocks Ian. FaCTþþ description logic reasoner: System description. In Proc. of the Int. Joint Conf. on Automated Reasoning (IJCAR 2006), volume 4130 of Lecture Notes in Artificial Intelligence, pages 292 297. Springer, 2006. URL download/ 2006/TsHo06a.pdf Volz Raphael, Handschuh Siegfried, Staab Steffen, Stojanovic Ljiljana, and Stojanovic Nenad. Unveiling the hidden bride: Deep Annotation for Mapping and Migrating Legacy Data to the Semantic Web. Journal of Web Semantics, 2004. Woods William A. What’s in a link: Foundations for semantic networks. In Ronald J. Brachman and Hector J. Levesque, editors, Readings in Knowledge Representation, pages 217 241. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1985. ISBN 093461301X. Previously published in D. G. Bobrow and A. M. Collins, editors, Representation and Understanding: Studies in Cognitive Science, pages 35 82. New York Academic Press, 1975. Yesilada Yeliz, Harper Simon, Goble Carole, and Stevens Robert. Dante annotation and transformation of web pages for visually impaired users. In The Thirteenth Int. World Wide Web Conf., 2004. Yesilada Yeliz, Bechhofer Sean, and Horan Bernard. Personalised Dynamic Links on the Web. In SMAP2006: 1st Int. Workshop on Semantic Media Adaptation and Personaliza tion, 2006.
Web 2.0
Becky Gibson
Abstract The Web is growing and changing from a paradigm of static publishing to one of participation and interaction. This change has implications for people with disabilities who rely on access to the Web for employment, information, entertainment, and increased independence. The interactive and collaborative nature of Web 2.0 can present access problems for some users. There are some best practices which can be put in place today to improve access. New specifications such as Accessible Rich Internet Applications (ARIA) and IAccessible2 are opening the doors to increasing the accessibility of Web 2.0 and beyond.
1 Introduction

Web 2.0 means many things to many people. The term was first coined by O'Reilly Media and MediaLive International at a joint conference in 2004 (O'Reilly 2004) and has come to refer to the next generation of the Web, which is characterized by awareness, participation, real-time interaction, collective intelligence, and access to and presentation of data. The Web has moved from publishing to participation. No longer is it sufficient to provide pages of static information; Web 2.0 allows for and encourages interactivity and inclusion.

This has consequences for the accessibility of the Web. Many people with disabilities have come to rely on the Web for access to information and increased independence. In the initial days of the Web, access for people with disabilities was limited. There has been significant improvement in assistive technologies as well as increased understanding of how to create accessible Web content. However, there is concern that once again the disabled community will be left behind as the next wave of innovation sweeps the Web.
There are basic techniques such as proper use of semantic markup, clear instructions, and proper management of focus that can be applied to Web 2.0 content to make it usable today. In order to go beyond basic usability, new technologies are needed. The development of the Accessible Rich Internet Applications (ARIA) specification (see Section 4.2) within the World Wide Web Consortium (W3C) augments HTML and XHTML with additional metadata to allow for notification and interaction. The new IAccessible2 interface (see Section 4.3) extends existing Microsoft accessibility application programming interfaces (APIs) to provide more information and increased accessibility. Adoption of these new specifications requires changes from the browser and assistive technology manufacturers and implementation by Web authors. For Web 2.0 to be accessible by all participants it is critical that innovative new languages and technologies be developed and incorporated into mainstream Web development. This can be achieved by encouraging authoring tools and toolkits used to create content for the Web to incorporate new standards and innovations. The goal is to make it easy to create Web 2.0 content that is usable by all.
2 Overview

2.1 Web 2.0 Becomes Mainstream

Google (http://www.google.com/) spearheaded Web 2.0 technologies with the creation of applications such as Google Suggest, Google Maps, and Gmail (a Web 2.0-enabled email application) which relied upon incremental updates via XmlHttpRequest (XHR). XmlHttpRequest is an application programming interface that can be used via JavaScript to transfer data over the standard Web HTTP protocol to update portions of the page (van der Vlist, Vernet, Bruchez, Fawcett, Ayers 2007). The initial implementation provided for transferring data via the eXtensible Markup Language (XML), but other forms of data are now common. The use of XmlHttpRequest to update the page gained even more popularity when Jesse James Garrett coined the acronym Ajax for Asynchronous JavaScript and XML (Garrett 2005). Today all Web companies are embracing Ajax and building Web applications which incrementally update different parts of the page in real time – sometimes in response to user action but also via polling for updated data.

Web 2.0 is about more than the use of Ajax. It is all about collaboration, social networking, and realizing the opportunities of collective intelligence – the average minds of the many working together are greater than the expert minds of the few. Consider the popularity and accuracy of Wikipedia (http://www.wikipedia.org/), the online encyclopedia collectively created by users. Wikis, applications that allow real-time
creating and editing on the Web, are popping up everywhere to allow online collaboration. The collaboration possibilities go even further when wikis are combined with instant messaging and other interactive tools to create applications for working on projects collectively or organizing your friends for an impromptu night out. These applications are delivered, developed, and updated in real time. User installation is no longer required and new features can be quickly developed and integrated into existing applications. This allows for user testing, feedback, and design changes at ‘‘Web-speed’’ resulting in a collaborative development process and morphing of applications and content. As Web applications become more complex, more sophisticated user interfaces are required to manage the data and interactions.
2.2 Web 2.0 and Accessibility Concerns

When the Web was composed of static pages, it was relatively easy for a user with disabilities to gain access to information via assistive technologies. The default HTML elements and form components operated via the mouse and keyboard; information changed only by request via the loading of a completely new page; and there was little multimedia or interaction. There are standard guidelines and techniques (Thatcher, Burks, Heilmann, Henry, Kirkpatrick, Lauke, Lawson, Regan, Rutter, Urban, Waddell 2006) that provide information about how to easily make static content accessible. The Web Content Accessibility Guidelines 1.0 (Chisholm, Vanderheiden, Jacobs 1999), published in 1999, is the current, common standard for making static content accessible. Unfortunately, Web content authors do not always take the necessary steps to ensure accessibility for all users. Technology continues to march forward, and more guidance is needed to ensure accessibility and usability.
2.2.1 Scripting and CSS Recent reviews of client side technologies indicate that JavaScript is used on about 60% of pages and Cascading Style Sheets (CSS) is used on over 50% (E-Soft, Inc. 2007). The increased use of scripting, which in many respects started the movement toward Web 2.0, creates problems for assistive technologies. The Document Object Model (DOM) is a set of application programming interfaces for accessing and updating the contents and behavior of a structured Web document. The use of scripting to add new components into the DOM of the Web page or to add new behaviors to existing components was not always recognized by assistive technologies, leaving the user confused or unaware of additional behaviors. Assistive technologies such as screen readers have become more capable of dealing with changes to the DOM, but Web 2.0 is creating new problems.
2.2.2 Updating Data Consider a Web page which makes incremental updates. Even if these updates are made in response to user input, an assistive technology user may not be able to find the updated information. One example is a user interacting via a screen magnifier which is displaying only the upper portion of the Web page. When the user activates a button or link that loads new information into the lower portion she may not realize that additional content has been added. The same problems exist for a screen reader user – updates may occur and the screen reader software can detect and interact with the updated data but the user does not know how to find the new information or must excessively navigate to reach the new content. Similar problems exist for information which is updated automatically such as stock quotes, weather data, and news tickers. In this case, users of assistive technologies may have no idea that an update is occurring. Users with cognitive disabilities who can visually see and notice the data changes may be distracted by the updates. Sites which rely on online collaboration can provide the same difficulties – an over abundance of updates and action on the screen vying for the user’s attention.
2.2.3 Organizing Data Just the increase in data provides problems. The construct of the Web is the link – providing direct connection to other pages and additional information. As the amount of data has increased, so has the amount of links and the organization of those links into groupings. By default, links are keyboard accessible using the tab key. A user can press tab to move focus through each of the links and form components on the page. Imagine a page that is composed of hundreds of links which not only link to additional information but may have behavior or action provided via scripting. This is very common for news or shopping portals. Such pages have links organized into different topic areas, but there is no mechanism to move from topic to topic. A keyboard user may have to tab excessively from the top of the page to the bottom to reach the desired entry. The screen reader user who is also sequentially navigating the page may be overwhelmed by the navigation. Dynamic interaction and increased access to data have required the development of more sophisticated user interface controls which mimic those found in rich client applications. Authors combine scripting and CSS to create interactive components which display real-time data and allow modification via drag and drop technologies. A tree control presents a hierarchical mechanism for navigating through data such as a table of contents, a portfolio of artists and music available for purchase and download, or the list of file folders in an email application. Menus are created to organize, find, and modify information. Toolbars provide quick access to common functions and controls. And what collaborative site exists today without a rich text editor for contributing content?
The HTML specification does not provide elements for these rich user interface controls. Nor does it provide the necessary semantic constructs to identify these types of controls when created within the markup. The div and span elements which are often scripted to create these graphical user interface controls have no semantic information and no built-in keyboard support. Responsible sites have been creating these new Web user interface components by scripting the existing HTML form components and link elements, which have built-in keyboard support. While the sophisticated controls created in this manner are keyboard navigable, the problem of excessive tab key navigation is not solved. In addition, users of assistive technology have no information about the type and behavior of these new user interface elements.
2.2.4 Abundance of Data The increase in bandwidth and the ability to access the Internet at high speeds have increased the use of multimedia on the Web. Sites such as flickr (http://www.flickr.com/), a photo sharing site, and YouTube (http://www.youtube.com/), a video sharing site, are possible because users are no longer limited by download speeds. In addition to providing online user manuals, manufacturer sites can now include video demonstrations of proper product usage and maintenance. The average user can upload vacation photos or post family pictures for instant sharing. This addition of rich media creates additional accessibility barriers on the Web. In addition to expectations of visual acuity and manual dexterity, hearing clarity is expected as well. This might lead to the dismal conclusion that the Web is becoming an exclusive club only for those 100% enabled in all senses. While it is true that there may be some types of applications, such as games or time-based activities, that are not available to users with particular disabilities, the goal is to make Web 2.0 and beyond accessible to all.
3 Discussion There are steps that can be taken today to make Web 2.0 applications accessible. The first step is to enable basic access and use of the site; future steps must address usability, not just for persons with disabilities but for all users to make the experience more productive and enjoyable.
3.1 Provide Simple Navigation A site that is easy to navigate via the mouse and the keyboard is a productivity bonus for all users. When information will be updated incrementally on the
page, notify the users where to expect those changes. Set expectations so that when a user activates a link he or she understands the consequences. The old paradigm of pressing a link to load a new page is gone. In today's world, activating a link may update only a portion of the page or launch a new window or floating pane. Consider a page to search a variety of periodicals for various topics in science or technology. The page will likely contain a standard search input text field and button to initiate the search. Below this search box may be a link to the home pages of the various periodicals in the search database or an advertisement or subscription offer. Further down on the page is the area where the search results are returned. The page is not reloaded when the search is performed; the results are added onto the page. A sighted user will see the results, but someone accessing the page via a screen reader, screen magnifier, or using large fonts will need to be instructed where to find the results. A simple link to an instructions page or text next to the search box that indicates ‘‘Search results are provided under a heading labeled with your search term'' can ease the navigation burden. The skip link, a link at the top of the page which ‘‘skips'' the navigation section of the page and takes the user to the main content of a page, is a standard technique in existing Web applications (Slatin, Rush 2003). Such links are even more applicable in a Web 2.0 world that squeezes even more information onto a single page at a higher density. A Web portal provides a set of applications organized on one page with a common user interface look and feel. Providing a list of links at the top of a portal that identifies and provides quick navigation to each section of the page can ease navigation burdens for all users. A link that invokes a pop-up pane containing navigation instructions is easily accomplished using Web 2.0 technologies. The instructions do not require extra real estate since they can be provided in a pop-up pane that displays above the content and is invoked only as necessary. Proper management of focus makes such a pane available to assistive technology.
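A skip link of the kind described above takes only a few lines of markup; the id values and page structure below are placeholders for illustration.

```html
<!-- The first focusable element on the page jumps past the navigation -->
<body>
  <a href="#content" class="skip">Skip to main content</a>
  <div id="navigation">
    <!-- site navigation links -->
  </div>
  <div id="content">
    <h1>Search periodicals</h1>
    <!-- search form; results are added under a heading labeled with the search term -->
  </div>
</body>
```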
3.2 Manage Focus Appropriately In the Web 2.0 world, information may be added to a page at any time. Often information is added in response to user action, such as submitting a search but it can also be updated automatically in real time. Updates are not always noticed by a user, especially data which is updated automatically. One option is to set focus to the new data. This will cause assistive technologies to also shift focus to the new item, and in the case of screen readers, speak the information to the user. However, moving focus should be implemented with care. Consider a user with a cognitive disability who has increased her font size, causing the entire page to no longer fit on the screen. She is slowly reading the top paragraphs of the page when focus suddenly shifts to the updated weather forecast embedded at the bottom of the page. This can be incredibly disorienting, not to mention
that the user has to navigate back to her previous point of focus and begin reading again. If the weather update is in a continuous loop that she cannot modify, she may never be able to complete her perusal of the page. She is an unhappy user and potentially a lost customer. Give the user the ability to modify or disable automatic updates, prevent them from changing focus, or offer to provide them in a different window or pane. There are some updates that the user may care about. Someone closely monitoring a stock price in order to time a purchase order may want focus to shift each time the stock price falls by a certain percentage. Or, if a severe weather alert or emergency notice is broadcast it is essential for all users to receive the information. The key is to think about different user situations and interaction environments and to provide options. After all, the scripting to offer the choice of automatic or manual updates is just as easy as the code which is performing the incremental updates!
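A minimal sketch of offering that choice, assuming a checkbox (invented here) that lets the user decide whether focus should follow updates: tabindex="-1" allows the script to move focus to the updated region without adding the region to the tab order.

```html
<label><input type="checkbox" id="followUpdates"> Move focus to updated content</label>
<div id="results" tabindex="-1"></div>
<script>
  // Called after new data has been written into the results region.
  function announceUpdate() {
    var region = document.getElementById("results");
    var follow = document.getElementById("followUpdates");
    // Move focus only when the user has asked for it; otherwise the
    // reader's current position on the page is left undisturbed.
    if (follow.checked) {
      region.focus();
    }
  }
</script>
```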
3.3 Use Semantic Markup The battle cry of the W3C for years has been to follow the specifications! CSS has provided the ability to separate presentation from structure. A developer no longer has to embed font sizes and colors within the markup. But CSS has also created the ability to forgo good document structure. Using CSS an HTML div element can be styled to visually look like a heading – this is a bad practice. While the experience may be the same to a visual user, it can cause navigation headaches for the vision impaired or keyboard user. Most assistive technologies and many user agents allow the user to navigate by headings. Providing instructions that information under particular headings will be updated allows the user to quickly navigate to those areas as long as the proper heading markup is used. Another common error is to add scripting to a div or span element to make it behave like a link or a button. If an element exists which has a particular behavior or semantic meaning, it is best to use it for that function wherever possible. When an advanced Web 2.0 feature requires scripting of non-semantic elements, provide as many behavioral clues and navigational structures on the page via traditional markup as possible. Provide documentation for navigating to and interacting with the advanced features.
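The contrast can be shown in a few lines (the class name and handler are invented for illustration): the first fragment only looks like a heading and a link, while the second exposes real structure and behavior to assistive technologies and the keyboard.

```html
<script>
  // Hypothetical handler; a real page would load the forecast here.
  function showForecast() { alert("Forecast loaded"); }
</script>

<!-- Discouraged: a styled div and a scripted span stand in for structure -->
<div class="looks-like-heading">Weather</div>
<span onclick="showForecast()">Five-day forecast</span>

<!-- Preferred: native elements carry the semantics and keyboard support -->
<h2>Weather</h2>
<a href="forecast.html" onclick="showForecast(); return false;">Five-day forecast</a>
```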
3.4 Provide Clear Instructions Remember that WWW stands for World Wide Web and design appropriately for different cultures, age groups, learning styles, and access methods. While many sites are designed for a specific age group or culture, all people have different experiences and education levels. A teenager may easily grasp a stylized portable music player-like user interface implemented on the Web, but an older
individual might not have the same level of comfort with the paradigm. Simple instructions can improve the accessibility immensely. Adding an additional page that explains the page layout and operation of user interface controls is a simple way to enhance the usability of a site. Identify the different pieces of the site and provide navigation to each one. Offer different layouts and font choices. These should be relatively simple additions for the Web developer gurus creating the innovative Web 2.0 sites. Allow the site developers to show off their talents by requiring alternative layouts, font size selections, and keyboard methods of interaction. Document the use of these features in simple, plain text language.
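One low-cost way to offer layout and font choices, sketched here with invented file names, is to ship titled alternate style sheets; some browsers expose them in a page-style menu, and a small script can offer the same switch on the page itself.

```html
<head>
  <link rel="stylesheet" href="default.css" title="Standard layout">
  <link rel="alternate stylesheet" href="large-text.css" title="Large text">
  <link rel="alternate stylesheet" href="single-column.css" title="Single column">
</head>
```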
4 Future Directions While some traditional techniques can help make Web 2.0 more accessible, there are still significant gaps. The new user interface controls for navigating and interacting with Web applications can be cumbersome or even impossible for some assistive technology users. Changes are needed. Changes to the regulations and accessibility guidelines are necessary to ensure that Web developers take the appropriate steps to make sites accessible. New specifications and programming models are needed to provide the additional semantics and techniques for enabling the Web. Updated accessibility application programming interfaces (APIs) are required in order to communicate the new semantics to assistive technologies. The user agents and assistive technologies must embrace and support the new technologies and paradigms. The Web never stands still and is always evolving – the tools and technologies to access the Web must continue to evolve as well.
4.1 Accessibility Guidelines The Web Content Accessibility Guidelines (WCAG) 1.0 (Chisholm et al. 1999) were released in 1999 when the Web was a more static environment. Nearly all content was created using straight HTML markup with little scripting and CSS. Web 2.0 has moved beyond static HTML and updated guidelines are needed. Active development of WCAG 2.0 (Caldwell, Chisholm, Slatin, Vanderheiden 2007) has been underway for the last several years (see Web Accessibility and Guidelines). The 2.0 guidelines are being rewritten to be more technology neutral rather than depending on current technologies. For example, rather than requiring alt attributes on img elements to identify the purpose of an image, the new guidelines require ‘‘text alternatives for non-text content’’. While this is less specific and may be a bit more confusing for an HTML developer, it can be applied to other technologies such as Adobe’s PDF (portable document format), Adobe Flash, SVG (scalable vector graphics), scripted
dynamic HTML, or any other technology that becomes prevalent after the guidelines have been released. Governments are looking to model accessibility laws and requirements after WCAG 2.0. The US Government is also looking to update Section 508 of the Rehabilitation Act, which was created in 1998. This law requires federal agencies to make electronic and information technologies accessible to people with disabilities. Section 508 needs to be updated to accommodate changes in Web technologies and improvements in assistive technologies, and to harmonize the Web and standalone software requirements.
4.2 Accessible Rich Internet Applications (ARIA) ARIA (Schwerdtfeger, Gunderson 2007) is a new specification being developed in the Protocols and Formats working group in the Web Accessibility Initiative (WAI) area of the W3C. ARIA describes how graphical user interface components built using JavaScript can be made fully accessible to assistive technology (Gibson, Schwerdtfeger 2005). It is designed to work across technologies so that it can be applied to several types of markup such as HTML, scalable vector graphics (SVG), XML User-Interface Language (XUL), or others.
4.2.1 Providing Additional Semantics ARIA adds semantic data to the interactive components of a Web page. The user agent interprets and translates this additional information into a format that an assistive technology can understand and present to the user. ARIA is filling in the semantic gaps of today's markup by defining the additional role and state semantics needed to create rich interface controls. Developers can now create graphical user interface elements using generic HTML div and span elements and identify them with the appropriate role, for example, a tree, treeitem, menu, menuitem, dialog, alert message, live region. In addition to the role, the appropriate state is identified as well: expanded or collapsed, checked or unchecked, required, invalid, labelledby, etc. The additional role and state information can be translated by the browsers into a format usable by assistive technologies to provide more information about the function and state of a component. This specification is still under development, but portions of it have already been implemented in the Firefox browser versions 1.5 and 2 and supported by the Window-Eyes and JAWS screen readers. Firefox 3 will support the completed specification. Expectations are high that other browser vendors will support this specification as well.
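A brief sketch of the idea, written with the attribute names that the specification eventually standardized (early Firefox builds used namespaced equivalents); the folder names are invented. Generic list markup is given a role and the states that describe it, so an assistive technology can present it like a desktop tree control.

```html
<!-- A small tree built from generic markup; role and state attributes give
     assistive technologies the information a desktop tree control exposes -->
<ul role="tree" aria-label="Mail folders">
  <li role="treeitem" aria-expanded="true" tabindex="0">Inbox
    <ul role="group">
      <li role="treeitem" tabindex="-1">Travel</li>
      <li role="treeitem" tabindex="-1">Receipts</li>
    </ul>
  </li>
  <li role="treeitem" aria-expanded="false" tabindex="-1">Archive</li>
</ul>
```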
4.2.2 Full Keyboard Support The next step toward full accessibility is for the Web developer to provide the necessary keyboard behavior for created components. Navigating to elements via the tab key is the common behavior of the Web. Pressing the tab key moves focus from one link or form component to the next in the order they are created in the page markup. The tabindex attribute can be used to change the order, but for increased usability and to make components work like their desktop counterparts, navigation via arrow keys is required. To accomplish arrow key navigation, a script function responds to the onkeydown and onkeypress events which are generated by the browser when the user presses a key. The function interprets these keystrokes and moves focus to the appropriate element. Setting focus to an element causes an assistive technology such as a screen reader to speak the relevant information about the component, such as the role, state, and description. To create components using generic div and span elements with the appropriate roles and states, these elements must be able to receive focus. Internet Explorer and Firefox support adding the tabindex attribute to an element to allow it to receive keyboard focus in the tab order of the page, in a specific tab order, or via scripting. Most visible HTML elements can receive focus when a tabindex attribute is added to them. The value of the tabindex attribute specifies whether the element will receive focus via the tab key or whether focus can be set programmatically via scripting. This fills an important accessibility gap in the HTML specification, where elements like div and span can respond to the mouse but not the keyboard. Using the tabindex attribute and scripting, Web developers can now create user interface controls that respond to both mouse and keyboard input and can model the behavior to match operating system user interface controls.
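A compressed sketch of that pattern (the identifiers are invented): the container sits in the tab order, the items use tabindex="-1" so script can focus them, and a keydown handler moves focus with the arrow keys.

```html
<div id="menu" role="menu" tabindex="0">
  <div role="menuitem" tabindex="-1">Open</div>
  <div role="menuitem" tabindex="-1">Save</div>
  <div role="menuitem" tabindex="-1">Print</div>
</div>
<script>
  var menu = document.getElementById("menu");
  var items = menu.getElementsByTagName("div");
  var current = 0;

  // When the container receives keyboard focus, pass it to the first item.
  menu.onfocus = function () {
    current = 0;
    items[current].focus();
  };

  // Down arrow (40) moves to the next item, Up arrow (38) to the previous.
  menu.onkeydown = function (event) {
    event = event || window.event;
    if (event.keyCode === 40) {
      current = (current + 1) % items.length;
    } else if (event.keyCode === 38) {
      current = (current - 1 + items.length) % items.length;
    } else {
      return;                      // let every other key behave normally
    }
    items[current].focus();        // screen readers announce the focused item
    if (event.preventDefault) { event.preventDefault(); }
  };
</script>
```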
4.2.3 Incremental Updates and Live Regions As mentioned previously, incremental updates via Ajax technologies can create accessibility issues. The Web developer needs to indicate the nature of the updates and communicate this to the browser and assistive technologies. ARIA is introducing the concept of live regions, which can be used to mark areas of the page which will receive updates. This will improve the accessibility of Ajax-enabled applications and make it easier for assistive technology users to interact with them. The Web author can indicate the nature of these updates via additional semantics such as off, polite, assertive, or rude. This gives the user agent and assistive technology information about the nature of the updates. The assistive technology may immediately set focus to or speak rude interruptions but may wait to interrupt the current process for regions marked as polite. The assistive technology also needs to give the user information about the live regions on the page and provide the opportunity to control the interruption types. Version 3 of the open source Firefox browser will support ARIA live regions.
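A sketch of a live region, again written with the attribute form that the specification later standardized; the ticker content and timing are invented for illustration. The politeness value tells the assistive technology whether to interrupt the user or to wait for a pause.

```html
<!-- Automatic updates to this region are announced when the user is idle -->
<div id="ticker" aria-live="polite" aria-atomic="true">IBM: 117.45</div>
<script>
  // Illustrative update; real data would come from the server via Ajax.
  function updateQuote(text) {
    document.getElementById("ticker").innerHTML = text;
  }
  setInterval(function () { updateQuote("IBM: 117.52"); }, 60000);
</script>
```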
4.3 IAccessible2 Interface Each operating system has an accessibility API to communicate information to assistive technologies. Microsoft created MSAA (Microsoft Active Accessibility) for earlier versions of Windows and UI Automation for Vista, ATK (the Accessibility Toolkit) exists for the GNOME desktop on UNIX, and Apple has the Mac OS X Accessibility Protocol. The MSAA interface was created many years ago and does not have the same level of support for more complicated user interface elements that the accessibility APIs of other platforms provide. The goal of IAccessible2, which is an engineered, open interface standard available via The Linux Foundation, is to harmonize the standards between operating systems and to provide the additional accessibility information required by Web 2.0 and rich software applications to the Windows platform. IAccessible2 is necessary to communicate ARIA live region information from a browser running on Windows to the assistive technology. In addition to live updates, it also provides the needed support for rich text, interactive tables, and spreadsheets, as well as other Web 2.0 and client application interactions.
5 Author's Opinion of the Field Someday I envision a ubiquitous Web. It will be available on demand at any time in any place. It will direct our lives and provide instant access to information, entertainment, government, and commerce (Emiliani, Stephanidis 2005). I will access this Web via a unique biometric and not be overwhelmed by security and fraud detection. Am I a dreamer? To some extent – unfortunately there will always be people who use creative talents for taking advantage of the system rather than embracing and improving it. But I do believe the Web will become pervasive and that will make it more accessible to all users. Consider the future car manufacturer that wants to include Web access in its vehicles. Until cars can drive themselves and drivers can remove focus from the road ahead, an alternative ‘‘eyes-off'' interface will be necessary. Perhaps a car manufacturer will have the resources and clout to demand and develop a voice-only interface. In order to have content available for the vehicle system, it must be easy to create and work with various presentation formats – visual, audible, and textual. Who will be the innovator to create the language, tools, and user agents to make this possible? Once the ‘‘mainstream'' user understands the benefits, they will demand information presented in different modalities. People with disabilities will no longer be required to purchase specialized assistive technologies – those technologies will be demanded by the public at large and
become commonplace. Assistive technology vendors will become mainstream and the technologies freely available or available at minimal cost. Technology still needs further development, but I envision a time when captioning video will not be a special process; voice recognition will capture and transcribe the information. A user will call up an avatar to automatically translate video content into another audible spoken language, textual transcript, or the appropriate sign language. The next big innovation will be an entirely new language and access method. Using the computer and the Web should not be so hard – I keep waiting for computer usage and Web interaction to become as easy as driving a car. When I get into a new car, the controls may be laid out a bit differently, but I understand the function and can begin driving almost immediately. Sharing information on the Web should be as easy as driving a car – the user interface methods may vary slightly but the functionality is easily accessible to the operator. Until my dream is realized, we all need to take steps to enable Web 2.0 and beyond for access by all individuals. Encourage vendors to adopt and implement IAccessible2 and ARIA. Help users create accessible content by creating authoring tools that incorporate accessibility into Web development. Web authors need to embrace toolkits which are incorporating accessibility into the environment and user interface components. The Web is built on inclusion and adoption of the latest trends and technologies; let us embrace the use of these new technologies and encourage the development of new and better ones in the future.
6 Conclusions Web 2.0 is today's technology; what will the next generation of the Web bring? New technologies are only useful if they are implemented and embraced by the worldwide community. As an open source browser, Firefox is able to take immediate action to implement the ARIA standards and incorporate IAccessible2. Assistive technology vendors realize the benefit to their customers when providing support for these specifications and technologies. Large corporations which sell to the United States and other governments must take action to create accessible applications in order to meet regulations. Such corporations have the ability to force accessibility compliance within their development organizations. IBM, SAP, and Sun Microsystems are leading by example with contributions to the ARIA and IAccessible2 specifications. Other corporations are contributing through implementation of these new specifications. Web companies such as Google, Yahoo!, and AOL must take the initiative to create fully accessible content and be the standard-bearers for full accessibility. The best practices and standards presented here can lead the way to making Web 2.0 accessible. Who is going to provide the next innovation to make the future, ubiquitous Web accessible and usable by all in all situations and venues?
References
Caldwell, B., Chisholm, W., Slatin, J. and Vanderheiden, G. (2007) Web Content Accessibility Guidelines 2.0, http://www.w3.org/TR/WCAG20/
Chisholm, W., Vanderheiden, G. and Jacobs, I. (1999) Web Content Accessibility Guidelines 1.0, http://www.w3.org/TR/WCAG10/
Emiliani, P. L., Stephanidis, C. (2005) Universal Access to Ambient Intelligence Environments: Opportunities and Challenges for People with Disabilities. IBM Systems Journal, Vol. 44, No. 3, pp. 605-619.
E-Soft, Inc. (2007) Security Space Site Penetration Report, http://www.securityspace.com/s_survey/data/man.200701/techpen.html
Garrett, J. J. (2005) Ajax: A New Approach to Web Applications, http://www.adaptivepath.com/publications/essays/archives/000385.php
Gibson, B., Schwerdtfeger, R. (2005) DHTML Accessibility: Solving the JavaScript Accessibility Problem. In Proceedings of the Seventh International ACM SIGACCESS Conference on Computers and Accessibility, Association of Computing Machinery (ACM) Press, New York, NY, pp. 202-203.
The Linux Foundation (2006) IAccessible2: Enhancing Accessibility and Multi-Platform Development, http://www.linux-foundation.org/en/Accessibility/IAccessible2/Overview
O'Reilly, T. (2005) What is Web 2.0?, http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
Schwerdtfeger, R. and Gunderson, J. (2007) Roadmap for Accessible Rich Internet Applications (WAI-ARIA Roadmap), http://www.w3.org/TR/aria-roadmap/
Slatin, J., Rush, S. (2003) Maximum Accessibility: Making Your Web Site More Usable for Everyone. Addison-Wesley, Pearson Education, Boston, MA, pp. 183-184.
Thatcher, J., Burks, M., Heilmann, C., Henry, S. L., Kirkpatrick, A., Lauke, P., Lawson, B., Regan, B., Rutter, R., Urban, M., Waddell, C. (2006) Web Accessibility: Web Standards and Regulatory Compliance. Friends of ED, an Apress Company, Berkeley, CA. Chapter 6.
van der Vlist, E., Vernet, A., Bruchez, E., Fawcett, J., Ayers, D. (2007) Professional Web 2.0 Programming. Wiley Publishing, Inc., Indianapolis, Indiana, pp. 71-74.
Universal Usability
Sarah Horton and Laura Leventhal
Abstract Universal usability of World Wide Web (Web) environments—that is, having 90% of households as successful users—requires universal access, usability, and universal design. Factors such as Web technology and user-centered design contribute to universal access and usability, but key to universal usability is a universal design methodology. Universal design principles for the Web follow from universal design principles for the built environment, and emphasize perceptibility, self-explanation, and tailorability for the user. Universally usable Web environments offer the benefit of expanded participation, as well as the unanticipated benefits that generally follow from innovative design initiatives. However, to achieve Web universal usability, Web designers need tools that facilitate the design of intuitive interfaces without sacrificing universal access. The purpose of this chapter is to promote a universal design approach to meet Web accessibility requirements and to establish a research agenda for the development of standards and tools that support both universal access and advanced interfaces, which can be used by Web designers to design for universal usability.
1 Introduction: The Elements of Universal Usability Functional limitations arise when people are in some way unable to negotiate and navigate their environment (Vanderheiden 1990). Functional limitations can be minimized through design. For example, if standard text were larger, the physical limitations of aging vision would be less disabling. As well articulated by Covington and Hannah (1996), ‘‘. . .disability becomes a handicap only when we encounter . . . barriers.'' As designers of Web environments, our task in reducing functional limitations and barriers is to ‘‘support a wide range of technologies, to accommodate diverse users, and to help users bridge the gap
between what they know and what they need to know’’ (Shneiderman 2003). This is accomplished through universal usability. Shneiderman (2000) defines universal usability as ‘‘. . .having more than 90% of all households as successful users of information and communications services at least once a week.’’ Breaking this definition down suggests that universal usability of Web environments requires at least three elements to achieve:
- Universal access: Content of pages must be accessible
- Usability: Pages need to be functional in their usage context
- Universal design: Accessibility and functionality are integral to design.
In the rest of this introduction, we address these elements more fully.
1.1 Universal Access The U.S. Communications Act of 1934 specifies the following:
For the purpose of regulating interstate and foreign commerce in communication by wire and radio so as to make available, so far as possible, to all the people of the United States, without discrimination on the basis of race, color, religion, national origin, or sex, a rapid, efficient, Nation-wide, and world-wide wire and radio communication service with adequate facilities at reasonable charges, for the purpose of the national defense, for the purpose of promoting safety of life and property through the use of wire and radio communication (Federal Communications Commission 1996).
This statement is a definition of universal access. As the statement suggests, universal access implies that all individuals should be able to avail themselves of contact with a variety of communication services. In 1934, those services included telephone, telegraph, and radio services. Updating this interpretation suggests that universal access means that network and computing services are available at a reasonable cost; availability at a reasonable cost is an outcome of the combination of public and commercial policy. Is availability at a reasonable cost enough to facilitate the Shneiderman definition of universal usability? The answer is no, because ‘‘the greater complexity of computing services means that access is not sufficient to ensure successful usage’’ (Shneiderman 2000). Consider a study, described in Kraut, Scherlis, Mukhopadhyay, Manning, and Kiesler (1996). In this study, households were given connected computers and training. Often the persons in the homes could not use the equipment to connect to the Internet. In this case, the users had access but they were unable to use the equipment to its fullest extent; that is, within the Shneiderman definition, they were not successful users of technology. What then is the next requirement after access for universal usability? The answer is usability; to be a successful user of a technology, the technology must be usable.
1.2 Usability Usability has been variously defined by many authors. One compelling model of usability is due to Eason (1984). In his model, usability is the causal outcome of at least three factors:
- User characteristics
- Task characteristics
- System (user interface) characteristics.
In other words, in the Eason model, the usability outcome of a particular user interface is determined by the combination of features of the setting, that is, features of the user, what it is that they are trying to do, and how the user accomplishes their task within the system that they are using. So, for example, a Web design that works well for novice users doing a repetitive data entry task may not work well for highly expert users doing a more open-ended searching task. Reconsidering the definition for universal usability from Shneiderman in the context of the Eason model, it is clear that to be universally usable, information technologies need to support the relevant user characteristics of at least 90% of the population doing a wide variety of tasks. Systems need to incorporate a number of usability characteristics, such as ease of use, ease of learning, and task match to the target tasks. Information technologies need to support the characteristics of diverse audiences doing a wide variety of activities on a broad range of platforms. In addition, information technologies need to be accessible. So if we acknowledge that to achieve universal usability, we must have access and usability, is that enough? We would argue that the answer is no. The last element needed to achieve universal usability is a strategy to achieve this goal; that is, we need a design process that is focused on universal usability. Without a process in place with the goal of universal usability built in, universal usability can become at best an add-on feature or a goal to evaluate after a Web environment has been built.
1.3 Universal Design The third piece of the universal usability puzzle, in addition to access and usability, is a design process that can achieve such an end. Following from the previous discussion, such a process would anticipate a diversity of users, tasks, and systems and create a design whose content is accessible and whose functionality is easy to use. Here we turn to the concept of universal design, which seeks to create designs that are ‘‘usable to the greatest extent possible by people of all ages and abilities.'' Universal design arose from a need to meet public access requirements for people with disabilities and a desire to implement standards without sacrificing design (Story 1998).
The fundamental difference between accommodating diverse abilities and designing for diverse abilities is that rather than creating separate designs to meet different requirements, one design is fashioned to meet the needs of all. For example, ramped building access, as opposed to staired access, eliminates the need for a separate ‘‘handicapped’’ entrance. Universal design has been found to yield several benefits across a number of design fields, including
- Save money
- Produce better designs
- Benefit everyone (Vanderheiden 1990).
These benefits and more may result when designers commit to universal usability for the Web. The Trace Research & Development Center of the College of Engineering, University of Wisconsin-Madison states the following:
People who could benefit from more universal designs include many both with and without disabilities. In some cases, people may experience difficulty in using products purely as a result of the environment or an unusual circumstance.
In addition, Shneiderman and Hochheiser (2001) suggest that the challenge of universal usability, like other attempts to broaden user populations in the past, may lead to the development of advanced user interfaces and interface technologies. They cite historical changes from assembly language programming to high-level language programming, changes from command-line interfaces to visual interfaces, and changes in searching techniques on the Web as examples.
2 Factors Supporting Universal Usability In our previous section, we pointed to three elements that can encourage Web universal usability: universal access, usability, and universal design. In this section, we explore elements of the current Web environment that contribute to these three factors.
2.1 Web Technology The Web provides an environment well suited to universal usability. Two attributes in particular favor universal usability: namely, flexibility and user control. Here we discuss those attributes in more detail. Designers often strive for designs that work best for the ‘‘average’’ person. Take, for instance, text formatting. Numerous studies have been done to determine the optimal text size and typeface for readability (Tinker 1963). These studies are important and necessary when text is set to print as readers have little recourse when faced with small text, short of reading glasses and magnification.
A conscientious designer adopts the findings of such research in the hope of creating a design that will reach the most people possible, understanding that some will be unable to read the text. Web technology provides a flexible environment, where attributes such as text size are largely governed by the user. Browser settings determine which size and typeface are applied to Web documents, and users control their browser settings. Additionally, the Web is largely text-based, and text can be read by software. This allows different tools and technologies to access Web content and render it as appropriate for different use contexts. This pairing of flexibility and user control means that designers concerned with universal usability are not responsible for finding one design that works for everyone. Instead, designers are responsible for creating designs that respond well to flexibility and user control; thus promoting universal usability.
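A small illustration of leaving text size to the reader (the selectors and values are only an example): sizes are expressed in relative units so that browser font-size settings and user style sheets still take effect.

```html
<style>
  /* Relative units respect the reader's preferred font size; fixed pixel or
     point sizes defeated the text-size controls in some browsers of the time. */
  body  { font-size: 100%; }            /* inherit the user's setting */
  p     { font-size: 1em; line-height: 1.5; }
  h1    { font-size: 1.6em; }
  .note { font-size: 0.9em; }           /* proportional, never absolute */
</style>
```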
2.2 User-Centered Design In user-centered design, user needs and preferences form the basis for design decisions. A user-centered approach is strongly indicated by the Eason usability model as a key contributor to usability in general, and many practicing designers see this strategy as a key element to ensure usability. Does user-centered design mean that each user gets a different specialized user interface, tuned to his or her narrow band of needs? User-centered design might seem on the surface to promote a multitude of design solutions when the audience is diverse. In fact, a wide variety of user needs, taken in the context of a user-centered design process, can lead to universally usable designs as long as universal design principles are followed and the elements of flexibility and user control are leveraged to satisfy diverse user needs.
2.3 Policies Several pieces of recent U.S. legislation provide for the enforcement of accessibility guidelines, including Web accessibility guidelines: the Workforce Investment Act of 1998, the Rehabilitation Act Amendments of 1998, and Section 508 of the Rehabilitation Act. These acts stipulate that information technology that is developed or obtained by the U.S. Government must be accessible to Federal employees and the public regardless of disability or background (U.S. Access Board). These and similar policies are primarily focused on universal access, rather than universal usability per se. However, policies promote universal usability as an approach to meeting accessibility requirements, much as public access requirements promote a universal design approach.
3 Universal Design and the Web Web Accessibility is focused on promoting best practices and tools that make the Web accessible to people with disabilities (Brewer 2004; Chisholm 1999) (see Web Accessibility and Guidelines). This focus has been instrumental in raising awareness of the universal access requirements for Web sites. All people have the right to successfully access and operate Web sites, and all Web developers are responsible for making their sites accessible. As with building requirements, where architects have clear-cut specifications for designing, for example, sidewalks and entryways to accommodate a diversity of travelers, Web accessibility guidelines serve as an important measure against which to design Web sites. Universal usability uses these guidelines as a foundation for creating accessible designs from the start, as opposed to measuring against the guidelines after the fact or retrofitting existing designs to meet minimum access requirements. Universal usability also employs good design practices, such as user-centered design, in order to achieve and evaluate usability.
Universal design arose out of a desire to implement access standards in the built environment without sacrificing design integrity. As architects began to wrestle with the implementation of standards, it became apparent that segregated accessible features were ‘‘special,'' more expensive, and usually ugly. It also became apparent that many of the environmental changes needed to accommodate people with disabilities actually benefited everyone. Recognition that many such features could be commonly provided and thus less expensive, unlabeled, attractive, and even marketable, laid the foundation for the universal design movement (Story 1998).
The Web is at a similar juncture, with designers and application developers seeking to comply with Web accessibility standards and mandates while keeping pace with current trends in visual design and interface functionality. We propose that universal usability is the means to that end, and that much can be learned from studying and applying the principles and guidelines that resulted from the universal design movement to the business of Web design. In this section, we apply several of the principles and guidelines from the Universal Design Principles from the Center for Universal Design at North Carolina State University’s College of Design (Connell et al. 1997) to the task of creating universally usable Web sites.
3.1 Universal Design Principles Since the Universal Design Principles were devised to address challenges having to do with designing the built environment, not all the principles and guidelines translate to the Web. Here we examine the first four principles and their relevant guidelines and discuss how they apply to Web design.
Principle One: Equitable Use The design is useful and marketable to people with diverse abilities. Provide the same means of use for all users: identical whenever possible; equivalent when not (Connell et al. 1997). In the Web context, one thing is certain: an ‘‘accessible'' or ‘‘text-only'' version is not an option for universal usability. Instead, we must strive for ‘‘same means of use'' by creating a single, appealing design that accommodates different access methods, conveys the same information, and provides the same functionality. Let us see how this can be accomplished.
Principle Two: Flexibility in Use The design accommodates a wide range of individual preferences and abilities. Provide choice in methods of use (Connell et al. 1997). Here the Web comes into its own, with flexible layouts that adapt to different screen dimensions, text that can be scaled by the user to a size that is comfortable for reading, pages that are usable with custom formatting, and functional elements that can be operated using a pointing device or the keyboard. Key to supporting this principle are flexibility and user control.
Principle Three: Simple and Intuitive Use Use of the design is easy to understand, regardless of the user's experience, knowledge, language skills, or current concentration level. Eliminate unnecessary complexity and arrange information consistent with its importance (Connell et al. 1997). Web technology allows us to encode importance using structural markup, and careful attention to markup and code design address issues around information sequencing. However, complexity in design is widespread on the Web. Most Web pages have a low signal-to-noise ratio, with much of the display occupied by elements of the browser interface, branding elements such as logos and taglines, advertising, and extensive navigation options (Horton 2006b). In addition, as demand grows for the Web to be a platform that supports an ever-increasing set of tasks and applications, simplicity becomes even more difficult to achieve. Supporting simplicity requires a fundamental change in design approach that promotes essential content and functionality regardless of the complexity of the task, and that holds the elements of universal usability above all other considerations.
Principle Four: Perceptible Information The design communicates necessary information effectively to the user, regardless of ambient conditions or the user's sensory abilities. Use different modes (pictorial, verbal, tactile) for redundant presentation of essential information and provide compatibility with a variety of techniques or devices used by people with sensory limitations (Connell et al. 1997). Text is the fundamental modality for conveying information on the Web, and text can be read by software, making it possible to support a variety of modalities and devices. Also, many non-text
elements can be marked up with equivalent text, which means content that is not perceptible in its native format, such as audio, can have an equivalent text-based fallback (e.g., captions and transcripts) that is perceivable and carries the same information signature. The above exercise illustrates that the spirit and much of the substance of the Universal Design Principles can be used as a basis for universal usability, in particular the first two elements proposed by Shneiderman (2000): support for technical variety and support for user diversity. Using these principles, supported by the methods and techniques outlined in Web accessibility guidelines, as a basis for making design decisions will yield gains in universal usability. While the third element of Shneiderman's model, spanning the gap between what people know and what they need to know, is also addressed in the principles, actualizing this element is more difficult in today's environment. Indeed, it is the ‘‘usability'' in universal usability that presents the greatest challenge.
4 Authors' Opinion of the Field Compared with other developer tools, the Web and its associated tools and technologies form a relatively primitive environment for authoring graphical user interfaces. Native Web technology supports limited interaction, with forms and links allowing for navigation among and within pages and for the submission of information. Fundamentally, the Web is a client–server application and does not support page-level interaction, meaning many of the tools and techniques for designing intuitive and usable applications (Shneiderman and Plaisant 2004) are simply not available. Interestingly, the simplicity of the technology and its limited options help support technical and user diversity. Text-based content and keyboard-based interaction enable two of the ‘‘Basic Components of Universal Usability'' proposed by Vanderheiden (2000): ‘‘Ensuring that all information presented . . . can be perceived'' and ‘‘Ensuring that the device is operable by the user.'' However, these same constraints limit the ability to design interfaces that help ‘‘span the gap'' in user knowledge. Much of the functionality applied toward building intuitive interfaces in other contexts cannot be accomplished using basic Web technologies. Since its inception, graphic and interface designers have been pushing against the boundaries of the Web, many with the goal of creating better designs and more usable interfaces. The fruits of this struggle can be found under the surface of the Web, in the invisible table layouts and spacer graphics that allow for controlled, complex layouts, and on the surface layer, in the heavy use of graphics and the trend toward enhanced client-side interaction using add-on technologies such as Flash, Java, and JavaScript. And Web designers are caught between adhering to Web accessibility standards and providing expected features and functionality. Before long, a site without Ajax widgets (see Web 2.0) will earn the same scorn that ‘‘first-generation'' sites did in the mid-1990s.
And herein lies the challenge. The universal elements of Web technology are not keeping pace with Web development trends. Web sites support an ever-increasing diversity of tasks and usage contexts, using sophisticated user interface tools. Because of the complexity of technologies and designs required to support this enhanced functionality, providing universal access is more difficult now than it was when the Web was simply text, links, and forms. Designers who insist on simplicity of design in order to provide universal access risk imposing non-intuitive elements on the design. Designers who insist on enhanced functionality and intuitive interfaces risk imposing barriers to access. In this fundamental struggle between simplicity and functionality, simplicity is losing ground.
5 Future Directions This challenge calls for a research agenda to help bridge another gap: the gap between functionality and universal access. Web technology needs to be deliberately broadened and enhanced to allow for advanced designs that support universal access. Hypertext Markup Language (HTML) provides a limited toolset that did the job on those ‘‘first-generation’’ sites, but does not allow for designs that satisfy current expectations. Updated standards, protocols, and technologies that both support universal access and address real-world requirements must be developed, and quickly adopted and put into widespread use. Web authoring tools must promote universal usability, both in their usability by diverse users and technologies, and in the quality and integrity of the Web sites they produce. Client software must offer consistent and full support for standards. Society must commit to the value of universal usability for the Web and all that the commitment entails, including the development of the aforementioned authoring tools, standards, and client software. And finally, designers must recognize that their role in a flexible, user-controlled, text-based environment is to put every available tool to use in creating solid, stable, and universally usable designs, and then quietly step back and allow users to shape the Web into whatever form they find suitable.
6 Conclusions As noted by Dardailler et al. (2001), while the Web has ‘‘made it possible for individuals with appropriate computer and telecommunications equipment to interact as never before,’’ its graphical information and direct manipulation interface has displaced text-based interaction, a modality that ‘‘enabled all but the most severely disabled to use computers.’’ The Worldwide Web Consortium, and in particular the Web Accessibility Initiative, has done much to ensure that the Web is accessible to people with disabilities, as evidenced by
354
S. Horton, L. Leventhal
technologies such as Cascading Style Sheets (CSS) and Synchronized Multimedia Integration Language (SMIL) and the ongoing enhancements to HTML, and the accessibility guidelines for Web content, authoring tools, and user agents (WAI, W3C) (see Best Practice and Guidelines and Applications). These technologies and guidelines are tools we can use to support a range of technologies and diverse users. To fully support universal usability, we need to adopt a universal design approach that integrates these tools and specifications into designs that are accessible and easy to use.
Universal design is the best way to integrate access for everyone into any effort to serve people well in any field. Although it will never be easy to design for diverse populations, concern for people should become an expected component of the process of designing any environment, product, service, or policy (Story 1998).
References Brewer, J., ed. 2004. How People with Disabilities Use the Web. http://www.w3.org/WAI/EO/ Drafts/PWD Use Web Chisholm, W., G.C. Vanderheiden, and I. Jacobs, eds. 1999. Web Content Accessibi lity Guidelines 1.0. http://www.w3.org/TR/WAI WEBCONTENT Connell, B.R., M. Jones, R. Mace, J. Mueller, A. Mullick, E. Ostroff, J. Sanford, E. Steinfeld, M. Story, and G.C. Vanderheiden. 1997. Universal Design Principles. North Carolina State University, The Center for Universal Design. http://www.design.ncsu.edu/cud/ about_ud/udprinciples.htm Covington, G.A., and Hannah, B. 1996. Access by Design. Van Nostrand Reinhold, New York. Dardailler, D., J. Brewer, and I. Jacobs. 2001. Making the Web Accessible. In User Interfaces for All: Concepts, Methods, and Tools. Stephanidis, C., ed. Lawrence Erlbaum Associates, Mahwah, NJ. Eason, K.D. 1984. Towards the Experimental Study of Usability. Behaviour and Information Technology. 3(2) 133 143 Federal Communications Commission. 1996. Communications Act of 1934, as amended by the Telecommunications Act of 1996. http://www.fcc.gov/Reports/1934new.pdf Horton, S. 2006b. Designing beneath the surface of the Web. In Proceedings of the 2006 international Cross Disciplinary Workshop on Web Accessibility (W4A): Building the Mobile Web: Rediscovering Accessibility? (Edinburgh, U.K., May 22 22, 2006). W4A, vol. 134. ACM Press, New York, 1 5. Kraut, R. E., W. Scherlis, T. Mukhopadhyay, J. Manning, and S. Kiesler. 1996. HomeNet: A field trial of residential Internet services. Communications of the ACM. Section 508 Homepage: Electronic and Information Technology. http://www.access board. gov/508.htm Shneiderman, B. 2003. Leonardo’s Laptop: Human Needs and the New Computing Technolo gies. Cambridge, MA: MIT Press. Shneiderman, B. 2000. Universal Usability. Communications of the ACM. 43 (5) 84 91. Shneiderman, B. and H. Hochheiser. 2001. Universal usability as a stimulus to advanced interface design. Behaviour and Information Technology. 20 (5). 367 376. Shneiderman, B. and C. Plaisant. 2004. Designing the User Interface: Strategies for Effective Human Computer Interaction (4th Edition). Reading, MA: Addison Wesley.
Story, M., R. Mace, and J. Mueller. 1998. The Universal Design File: Designing for People of All Ages and Abilities. Raleigh, NC: North Carolina State University, Center for Universal Design.
Tinker, M. 1963. Legibility of Print. Ames, IA: Iowa State University Press.
Trace Research and Development Center. General Concepts, Universal Design Principles and Guidelines. http://trace.wisc.edu/world/gen_ud.html
U.S. Access Board. Section 508 Homepage: Electronic and Information Technology. http://www.access-board.gov/508.htm
Vanderheiden, G.C. 1990. Thirty-something million: should they be exceptions? Human Factors, 32(4) 383-396.
Vanderheiden, G.C. 2000. Fundamental principles and priority setting for universal usability. In Proceedings on the 2000 Conference on Universal Usability (Arlington, Virginia, United States, November 16-17, 2000). CUU '00. ACM Press, New York, NY, 32-37. DOI=http://doi.acm.org/10.1145/355460.355469
W3C Web Accessibility Initiative (WAI). http://www.w3.org/WAI
Worldwide Web Consortium (W3C). http://www.w3.org