Reading and Writing the Electronic Book
Synthesis Lectures on Information Concepts, Retrieval, and Services
Editor: Gary Marchionini, University of North Carolina, Chapel Hill

Reading and Writing the Electronic Book
Catherine C. Marshall, 2010

Understanding User–Web Interactions via Web Analytics
Bernard J. (Jim) Jansen, 2009

XML Retrieval
Mounia Lalmas, 2009

Faceted Search
Daniel Tunkelang, 2009

Introduction to Webometrics: Quantitative web research for the social sciences
Michael Thelwall, 2009

Automated Metadata in Multimedia Information Systems: Creation, Refinement, Use in Surrogates, and Evaluation
Michael G. Christel, 2009

Exploratory Search: Beyond the Query-Response Paradigm
Ryen W. White and Resa A. Roth, 2009

New Concepts in Digital Reference
R. David Lankes, 2009
Copyright © 2010 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other) except for brief quotations in printed reviews, without the prior permission of the publisher.

Reading and Writing the Electronic Book
Catherine C. Marshall
www.morganclaypool.com

ISBN: 9781598299052 paperback
ISBN: 9781598299069 ebook
DOI: 10.2200/S00215ED1V01Y200907ICR009

A Publication in the Morgan & Claypool Publishers series
Synthesis Lectures on Information Concepts, Retrieval, and Services
Lecture #9
Series Editor: Gary Marchionini, University of North Carolina, Chapel Hill

Series ISSN
ISSN 1947-945X print
ISSN 1947-9468 electronic
Reading and Writing the Electronic Book Catherine C. Marshall Microsoft Research
SYNTHESIS LECTURES ON INFORMATION CONCEPTS, RETRIEVAL, AND SERVICES #9
ABSTRACT Developments over the last 20 years have fueled considerable speculation about the future of the book and of reading itself. This book begins with a gloss over the history of electronic books, including the social and technical forces that have shaped their development. The focus then shifts to reading and how we interact with what we read: basic issues such as legibility, annotation, and navigation are examined as aspects of reading that eBooks inherit from their print legacy. Because reading is fundamentally communicative, I also take a closer look at the sociality of reading: how we read in a group and how we share what we read. Studies of reading and eBook use are integrated throughout the book, but Chapter 5 “goes meta” to explore how a researcher might go about designing his or her own reading-related studies. No book about eBooks is complete without an explicit discussion of content preparation, i.e., how the electronic book is written. Hence, Chapter 6 delves into the underlying representation of eBooks and efforts to create and apply markup standards to them. This chapter also examines how print genres have made the journey to digital and how some emerging digital genres might be realized as eBooks. Finally, Chapter 7 discusses some beyond-the-book functionality: how can eBook platforms be transformed into portable personal libraries? In the end, my hope is that by the time the reader reaches the end of this book, he or she will feel equipped to perform the next set of studies, write the next set of articles, invent new eBook functionality, or simply engage in a heated argument with the stranger in seat 17C about the future of reading.
Keywords eBooks, e-book, electronic publications, field studies, ethnography, reading, personal digital libraries, personal information management, annotation, bookmarking, navigation, study design, e-book design.
Preface

Questions about the future of reading have fueled considerable speculation over the past 20 years. To frame the topics covered in this book, the book begins with a gloss over the history of electronic books, including the social and technical forces that have shaped their development. From this background, Chapters 2 and 3 then move into reading itself and how we interact with what we read. Topics covered in Chapter 2 include layout, legibility, and eBook display hardware. Chapter 3 moves on to annotation, navigation, linking, clipping, and bookmarking: in short, the interactions people have with ordinary print books. Reading is above all a social activity, so Chapter 4 takes a close look at the sociality of reading: how we read in a group and how we share what we read.

Because qualitative and quantitative study results are integrated throughout the book, studies of reading or eBook use are not relegated to their own chapter. Instead, Chapter 5 talks about how a researcher might go about designing some types of studies that are particularly appropriate for exploring reading and for evaluating the results of eBook field deployments.

No book about eBooks is complete without an explicit discussion of content, i.e., how the electronic book is written. Hence, Chapter 6 delves into the underlying representation of content and efforts to create and apply markup standards to texts. This chapter also examines how print genres have made the journey to digital and how emerging digital genres including blogs, wikis, and hypertext fiction might be realized as eBooks.

Finally, Chapter 7 discusses how we can do more with electronic books than just read them cover to cover. Support for different kinds of reading that are part and parcel of different disciplines (law, education, analysis, and so on) may be implemented within eBooks. We devote the concluding sections of the chapter to managing and using collections, such as tools for gathering, triage, and analysis. We focus particularly on what it will take to make eBook platforms into more than just books rendered electronically: how can eBook platforms be transformed into portable personal digital libraries?

In the end, some basic philosophical questions remain. My hope is that by the time the reader reaches the end of this book, he or she will feel equipped to perform the next set of studies, write the next set of articles, invent new beyond-print functionality, vet the next generation of eBook business plans, or engage in a heated argument with the stranger in seat 17C about the future of reading.
Contents

Preface
Figure Credits

1. Introduction
    1.1 Generation 1: A New World of Hypermedia
    1.2 Generation 2: EBook Hardware Arrives
    1.3 Generation 3: ePaper and the Quiet Revolution
    1.4 This Book

2. Reading
    2.1 Reading
        2.1.1 Assumptions About Reading
        2.1.2 Purposes of Reading
        2.1.3 Types of Reading
    2.2 Layout, Typography, and Legibility
    2.3 Studies of the Effect of Layout on Readers’ Performance
    2.4 Reading Hardware and Display Technologies

3. Interaction
    3.1 Annotation
        3.1.1 Representing Annotations
        3.1.2 Anatomy of an Annotation
        3.1.3 Linking
        3.1.4 Functions of Annotation
        3.1.5 Status and Value of Annotations
    3.2 Navigation
        3.2.1 Three Navigation Scenarios
        3.2.2 Moving
        3.2.3 Orienting
    3.3 Clipping
    3.4 Bookmarking
    3.5 Hardware for Interacting with EBooks
        3.5.1 Hardware That Supports Navigation
        3.5.2 Pen-Based Interaction
    3.6 Essential but Insufficient

4. Reading as a Social Activity
    4.1 Reading Together
        4.1.1 Shared Focus
        4.1.2 Collaborative Search and Reference Following
        4.1.3 Reading Together as an Informal Act
        4.1.4 Peer-to-Peer Sharing
    4.2 Sharing the Artifacts of Reading
        4.2.1 Reading to Know What Other People Know
        4.2.2 Sharing Annotations
        4.2.3 Aggregating Annotations: The Wisdom of Crowds
        4.2.4 Sharing Encountered Information
        4.2.5 Information Brokering
        4.2.6 Sharing and Recommending Books

5. Studying Reading
    5.1 Types of Studies
    5.2 Quantitative/Laboratory Studies
        5.2.1 Performance Metrics for Reading
        5.2.2 Eye-Tracking
    5.3 Field Studies
        5.3.1 Interview Studies
        5.3.2 Diary Studies
        5.3.3 Observational Studies
        5.3.4 Surveys and Questionnaires
        5.3.5 Instrumenting Software
    5.4 Performing a Field Study of Reading
        5.4.1 Research Questions and Study Design
        5.4.2 Finding Participants
        5.4.3 Developing an Interview Script
        5.4.4 Preparing Materials
        5.4.5 In the Field
        5.4.6 Data Analysis

6. Content: Markup and Genres
    6.1 Content Representation
        6.1.1 Page Description Languages
        6.1.2 Markup Languages
        6.1.3 Packaging Files
        6.1.4 Accessibility
        6.1.5 Digital Rights Management
        6.1.6 DRM Technologies
        6.1.7 DRM in Use
        6.1.8 Standards Efforts
    6.2 Content Preparation
    6.3 Paper Genres Reborn
        6.3.1 eNewspapers
        6.3.2 eMagazines
        6.3.3 eTextbooks and Course Packs
        6.3.4 Electronic Journals
    6.4 New Digital Genres
    6.5 EBooks and Libraries
        6.5.1 A Pilot EBook Program in a Public Library
        6.5.2 EBook Experiences in Other Libraries
    6.6 Sustainability and Digital Preservation

7. Beyond the Book
    7.1 Beyond Paper Capabilities
        7.1.1 Domain- and Practice-Specific Capabilities
        7.1.2 Within-Book Search
    7.2 Portable Personal Libraries and Collection-Level Functionality
        7.2.1 Search at the Collection Level
        7.2.2 Re-encountering
        7.2.3 Gathering and Triage
        7.2.4 Supporting Browsing with Computed Visualizations
        7.2.5 Metadata for Personal Digital Libraries
    7.3 Conclusion

References
Author Biography
Figure Credits

Figure 1.2: Zippy “Paper Trail” comic strip. Copyright © 2001 Bill Griffith. Used with permission.

Figure 2.3: from Marshall, C.C. and Ruotolo. Reading-in-the-Small: a study of reading on small form factor devices. Proceedings of JCDL’02, New York, pp. 56–64. Copyright © 2002, Association for Computing Machinery. Reprinted by permission.

Figure 3.2: from Marshall, C.C. and Bly, S. 2005b. Turning the Page on Navigation. Proceedings of JCDL’05, ACM Press, New York, pp. 225–234. Copyright © 2005, Association for Computing Machinery. Reprinted by permission.

Figure 3.3: Courtesy of the British Library.

Figure 4.2a: from Marshall, C.C., Price, M.N., Golovchinsky, G., and Schilit, B.N. Introducing a Digital Library Reading Appliance into a Reading Group. Proceedings of Digital Libraries 99, ACM Press, New York, pp. 77–84. Copyright © 1999, Association for Computing Machinery. Reprinted by permission.

Figure 4.6: based on Marshall, C.C. and Brush, A.J. 2004. Exploring the Relationship between Personal and Public Annotations. Proceedings of JCDL’04, ACM Press, New York, pp. 349–357.

Figure 4.7: based on Marshall, C.C., Price, M.N., Golovchinsky, G., and Schilit, B.N. 1999. Introducing a Digital Library Reading Appliance into a Reading Group. Proceedings of Digital Libraries 99, ACM Press, New York, pp. 77–84.

Figure 7.1: from Marshall, C.C., Price, M., Golovchinsky, G., and Schilit, B.N. Designing e-Books for Legal Research. Proceedings of JCDL’01, ACM Press, New York, pp. 41–48. Copyright © 2001, Association for Computing Machinery. Reprinted by permission.
Figure 7.2: from Marshall, C.C., Price, M., Golovchinsky, G., and Schilit, B.N. Designing e-Books for Legal Research. Proceedings of JCDL’01, ACM Press, New York, pp. 41–48. Copyright © 2001, Association for Computing Machinery. Reprinted by permission.

Figure 7.3: based on Marshall, C.C. Collection-level Analysis Tools for Books Online, 2008, Books Online Workshop, 30 October 2008, Napa, California.
chapter 1
Introduction

Publishers and technologists have been promising us electronic books of one sort or another for well over two decades. Indeed, many of them delivered products in the marketplace in what amounted to at least three distinct waves of technology and content. Each wave was heralded by hyperbolic claims about the disappearance of the print book, the death of text, and other ways in which literacy would be forever changed; these claims excited critics’ wrath, and sometimes their derision, in the wake of the products’ disappointing showings in the marketplace. Not only did the products fall short of readers’ expectations, but the changes were also slower to arrive than the more optimistic forecasts predicted, and the products themselves demonstrated limited economic viability.

For those of us whose lives have been inextricably intertwined with electronic texts all along, the immediate source of the disappointment has varied: reading on the screen just was not engaging enough; there were not enough titles in enough genres to satisfy our most immediate and fundamental textual desires; and beyond-the-book functionality was inadequate. More importantly, the hardware was too clunky and battery life was too short to make electronic books any more than a novelty purchased by early adopters, then stored unused in a drawer or forgotten in a hotel room. Yet, like Charlie Brown running at the football one last time, hoping to pull off one solid kick at the goalposts, we kept on trying to read and write the electronic book.

Each wave of electronic books was spurred by somewhat different social and technical forces. Each anticipated different changes and each foundered on a new set of challenges that the products quickly revealed. It is instructive to look at the waves one by one if we are to understand the separate (but not separable) facets of the new medium.
1.1 GENERATION 1: A NEW WORLD OF HYPERMEDIA
The first wave of electronic publishing focused on what we could do with the new medium using pixels and fonts instead of ink and moveable type, screens in lieu of paper, and hypertext links in place of page flips. Companies like Voyager and Eastgate Systems (Voyager is now defunct; Eastgate can be found at http://www.eastgate.com/) not only developed electronic book software and figured out a way to distribute titles and sell them like most publishers would;
they also worked through an initial economic model for the endeavor and nurtured a stable of writers and content preparation technologies. A small number of first-wave titles emerged. Writers learned how to create in the new medium. Dissemination depended on (relatively) inexpensive digital media such as laser disks (for high-end multimedia titles) or floppy disks (for experimental electronic literature, hypertext fiction, or documentary projects). Multimedia was Voyager’s strong suit as an early publisher of electronic content; the company initially adopted the laser disk as a primary means of distributing content and developed a series of titles that were well suited to multimedia. Consumer-friendly titles included history books such as Who Built America?; the work of more accessible performance artists such as Laurie Anderson; an electronic version of Art Spiegelman’s graphic novel about the Holocaust, Maus; and A Hard Day’s Night, which drew on the Beatles’ music. Laser disks were quickly replaced by CD-ROMs, but the focus of these companies remained on multimedia. The ability to transcend other intellectual limitations of linear text—limitations initially explored in Gedanken experiments by writers like Jorge Luis Borges (Garden of Forking Paths) and Julio Cortázar (Hopscotch)—appealed to a small set of literary pioneers, and hypertext fiction publisher Eastgate Systems was born. Research prototypes like Bellcore’s SuperBook System transformed marked-up files into hypertext that included a dynamically generated table of contents and index to facilitate what were then new kinds of reader navigation (Remde, Gomez, & Landauer 1987). Thus, the initial wave of electronic books produced a highly diverse (but sparse) selection of titles and a handful of research systems developed to explore issues associated with reading on the screen. In some sense, if you wanted to read an electronic book, you would have to adjust your reading preferences to what was available. 
Availability was further constrained by hardware platforms (some electronic titles were only available for Macs; others were only available for PCs), by limitations in how the titles were delivered (just how much content fits on a floppy?), and by primitive reading hardware (color monitors were new, heavy, and expensive, and their resolution seems so low by today’s standards as to be utterly unusable). Furthermore, readers were tethered to their desktop computers; laptops, although available, were far from ubiquitous and were only portable if your arms were strong.

In a 1992 New York Times article, Robert Coover, a Brown University professor and himself a novelist, forecast the end of books (Coover 1992), thus setting off what amounted to the first electronic book backlash; the best-known examples include The Gutenberg Elegies (Birkerts 1994) and several of the essays included in The Size of Thoughts (Baker 1997). The page versus pixel debate spurred further interest in the nature of documents themselves: Scrolling Forward (Levy 2001) is a thoughtful and reflective exploration of the document and the emergence of modern document technologies; The Myth of the Paperless Office (Sellen & Harper 2001) takes a more practical look at
why paper persists in an electronic world. A set of influential essays, The Future of the Book (Nunberg 1996), puts the emergence of the information age into a historical context. Finally, The Order of Books (Chartier 1994) provides a cultural backdrop for understanding the role of books and how we read. Taken together, these works form a scaffolding for our subsequent thinking about electronic books and reading.

By the end of the first wave of what we see now as eBooks, there were flagship electronic publications in many areas: the Journal of Postmodern Culture paved the road for a new era of scholarly publishing; titles like Afternoon (Joyce 1990), Victory Garden (Moulthrop 1993), and Its Name Was Penelope (Malloy 1991) pioneered electronic literature; corpora like Perseus demonstrated that there may be more fruitful avenues for teaching and learning than traditional textbooks; and many early digital libraries like the University of Virginia’s eText Center showed that it was possible to offer the traditional canon of Western literature and special collections online. Although their coverage was by no means complete, these projects developed processes for the transition from print resources to digital.

At the same time, the World Wide Web was emerging as a new mode of distribution for the burgeoning born-digital content, starting with high energy physics preprints, but moving quickly into an unimaginable sea of electronic genres. Although individual efforts maintained a measure of internal consistency, at this early date, there was little thought given to the standardization and sustainability of electronic content.
Note: Naturally I have forgotten or have otherwise omitted volumes that have had a tremendous influence on the thinking that has taken place as the computer screen has become our primary venue for reading. The adventurous reader is urged to pay close attention to the bibliographies each of these authors has painstakingly assembled; there is much interesting reading to be pursued.

Note: Perseus: http://www.perseus.tufts.edu/hopper/.

Note: University of Virginia eText Center: http://www2.lib.virginia.edu/etext/index.html.

Note: The Web had a set of standards with a momentum of its own; native Web documents were marked up with HTML, which was standardized to some extent. But the early multimedia titles I have referred to so far conformed to the storage formats required by their own presentation software.

1.2 GENERATION 2: EBOOK HARDWARE ARRIVES

The World Wide Web and the early electronic texts that it delivered were not called eBooks. eBook was a word coined when companies like Nuvomedia introduced the Rocket eBook and SoftBook Press offered its comparable SoftBook platform; both eBook readers hit the market in 1998 amid greater fanfare than adoption. The eBook platforms only displayed eBooks that were prepared specially for them. Figure 1.1 shows these competing readers.

Both the Rocket eBook and the SoftBook Reader were noble forays into purpose-built hardware, but they were clunky: the Rocket eBook weighed in at a bulky 22 ounces and the SoftBook was not exactly svelte either at 2.9 pounds. Furthermore, neither reader stored enough content to meet the promise of a portable personal digital library: the Rocket eBook held about 10 titles and the SoftBook Reader held about eight. The emphasis of these products was to provide the consumer with reading hardware that was book-like in its size and shape. Both companies nurtured partnerships with traditional publishers and developed digital rights management (DRM) mechanisms; eventually, they were compliant with the emerging Open eBook (OEB) standard (discussed in Chapter 6).

Immediately there were complaints about battery life (which in the case of the SoftBook Reader was insufficient to handle a coast-to-coast airplane flight); consumers balked at carrying an extra device that, while not as heavy as a laptop of the time, was still significantly bulky. Consumers and public libraries (some of which were enticed into sponsoring ambitious eBook programs) were not broadly satisfied with the range of content and the ease with which they could prepare and transfer their own documents onto the platforms. Some libraries, both public and academic, reported on their experiences with these early devices (e.g., see McKnight & Dearnley 2003); we report on a California public library’s pilot eBook program in Chapter 6.

Note: The Rocket eBook and its parent company Nuvomedia were acquired by Gemstar TV Guide International for what was then considered a shockingly high price tag of 200 million dollars. The Softbook Reader and its parent company, Softbook Press, were also acquired that same year. The hardware was consolidated and offered as the REB1200 product, which limped along in the marketplace for several more years.
FIGURE 1.1: The second generation of reading platforms. (a) Nuvomedia’s Rocket eBook. (b) SoftBook Press’ SoftBook Reader.
The second generation of eBooks also saw the introduction of reading software, generally aimed at providing functionality that would allow the user to annotate, bookmark, look up words, search within the text, and interact with the material in basic ways. Reading software ran on conventional PCs and laptops, but it was also designed to run on handhelds such as Pocket PCs, Palm Pilots, and other small mobile devices. This software included Microsoft Reader, which in 2000 rolled out an extensive advertising campaign that once again heralded the end of reading as we know it. The ads crowed about the coming demise of “the pBook”, which prompted another round of backlash from scholars and critics. Figure 1.2 shows a Bill Griffith comic strip that appeared not long after the first Microsoft Reader ad was published in The New Yorker.

FIGURE 1.2: Backlash prompted by the launch of a second generation of eBook products (ca. 2001). Copyright © 2001 Bill Griffith. Used with permission.

At the second generation’s onset, in 1998, the Research Libraries Group’s Walt Crawford published an essay in Online Magazine called “Paper Persists: Why Physical Library Collections Still Matter,” in which he said:

“Reading from digital devices . . . suffers in several areas—among them light, resolution, speed, and impact on the reader—and there has been essentially no improvement in any of these areas in the last five years . . . It’s just too hard to read from a computer, and it doesn’t seem likely to get a lot easier” (Crawford 1998).

Four years later, the new hardware and software had not made a significant dent in critics’ skepticism. For example, Jimmy Guterman wrote a short critical piece on CNN.com in late 2002 that claimed readers were deriving insufficient benefit from eBooks, that publishers just did not get it:

“Publishers need to add value. Current onscreen magazine systems . . . simply load a bulky software program and copy-protected PDFs of the magazines onto your system. There’s some rudimentary searching, zooming, and annotating, but that’s all . . . There’s . . . no desire to do anything but replicate a flat print publication” (Guterman 2002).

Research projects, such as Fuji Xerox Palo Alto Laboratory’s (FXPAL) active reading-oriented XLibris (Schilit et al. 1999) and Xerox Palo Alto Research Center’s (PARC) conversion-oriented UpLib (Bier et al. 2004), made serious attempts to address this question of adding value to onscreen reading applications. XLibris investigated pen-based functionality that would tackle the difficulty of annotating and otherwise engaging with serious reading material such as technical articles, textbooks, legal documents, and intelligence sources. UpLib took on the problems associated with gathering and preparing the documents the user wanted to read. These projects were successful in their research communities but had scant influence on the eBook products that were constrained by hardware costs and limitations, publishers’ invariable concerns about piracy, and other market forces.

Thus the second generation of electronic books rolled in with considerable fanfare and receded much more quietly, all but disappearing from the record. All the while, a revolution was taking place unnoticed on the sidelines as many people simply started reading newspapers, magazines, blogs, and other born-digital content on the screen, ignoring eBooks, and making do with whatever hardware they had in hand. During field interviews, I began to hear statements like this: “My dad and I like to read newspapers online. It’s faster and free.” Faster and free: the screen was winning the battle with little attendant anxiety or ado. Mobility, convenience, and portability were making a compelling argument for a turn away from paper.
In a 2001 interview, a college student told me: If I’m going home to Colorado, I have to really be sure I’m going to read something if I’m going to bring it. Otherwise, why should I bring it? This thing [a handheld device with eBook software and all his course materials], I was like, “I’ll bring it, and if I read it, I read it; if I don’t, I don’t.” It doesn’t matter. It’s small; it’s handy (Marshall & Ruotolo 2002).
1.3 GENERATION 3: ePAPER AND THE QUIET REVOLUTION
The third generation of electronic book platforms (Figure 1.3) began to take advantage of a convergence of enabling technologies (low-power bistable displays, capacious storage, lighter hardware, the availability of digital content, and ubiquitous wireless) with warming consumer attitudes toward reading on the screen and purchasing unbundled content (using an iPod-like model of what a reading device might be like). Several different reading platforms (most notably, Amazon’s Kindle and Sony’s Reader) were introduced to a public that found them less alien and less of a novelty item than they had the previous generation of eBook devices.
Interestingly, the third generation of portable eReaders, while considerably more appealing from a form factor and battery life standpoint, has yet to unleash the real power of electronic texts. The ability to carry a lifetime’s worth of reading in a mobile device is a compelling idea, and yet Clifford Lynch’s vision of a portable personal library has yet to be realized. Not so far, anyway, although the obstacles to such a thing have diminished substantially over the last decade-and-a-half of eBook history:
Given the historic price–performance trajectories for storage, in a few years at least some high-end appliances will house hundreds, if not thousands, of books simultaneously, and certainly laptops with software book readers will house thousands or tens of thousands of books at once. Think of portable personal digital libraries, not portable electronic books, as the future role of these appliances (Lynch 2001, emphasis added).
Then too, we have yet to unlock the potential collective effects of a world of electronic readers and writers. Functionality that involves computation over large collections of digital texts, along with wisdom-of-crowds approaches to analyzing our interactions with these texts, has yet to be fully investigated and implemented.
FIGURE 1.3: The third generation of eBooks: low-power ePaper-based bistable displays. (a) Amazon’s original 6-inch diagonal Kindle. (b) Sony’s Reader with touch-screen display. (c) Kindle DX with 9.7-inch diagonal display.
It is notable that Wikipedia’s entry for the RocketBook is very brief and the corresponding entry for SoftBook is altogether absent. Nor are the earlier products linked to Amazon’s Kindle as any hint of historical precedent. In fact, a search for SoftBook in Wikipedia asks the user, “Did you mean: songbook?”
1.4 THIS BOOK
We are still hung up on the physical form of the book and its place in our culture. Perhaps this is rightfully so. Why, then, is it important to recount the short and somewhat disappointing history of reading and writing the electronic book? What is there to know? At first blush, the answer would seem to be, not much. An electronic book is like a print book, only with pixels instead of toner. Indeed, there have been many research projects and consumer products devoted to making electronic books seem familiar. Yet, if we look a little closer, if we get beyond cycles of hyperbole and disappointment, it would seem that we have learned quite a bit. We have gotten somewhere. There have been spectacular efforts documenting and analyzing the larger social forces at work on the book, on text, and on media. Series of post-McLuhan discussions have taken place, as well as countervailing efforts to discredit electronic reading and writing and to hold steady against the forces of change. Rather than joining the cacophony of voices from the literary and social science worlds, some of which we’ve already mentioned, this book will keep its nose down (in the book, so to speak) and examine instead a rather more pragmatic set of issues and developments that have arisen over the last few decades. This book is intended to reflect on its counterparts in a quotidian way, drawing on sources from information science, computer science, and human–computer interaction, but especially on the results of studies I have conducted with colleagues and by myself over the last decade-and-a-half. In this book, we will steer clear of the turbid waters of critical theory and the more abstruse studies of human cognition; instead, we’ll stay focused on how the next generation will read and write the electronic book.
We will start with a look at what an electronic book will need to do, beginning with a closer read of reading itself: the assumptions that underlie reading; the different ways that people read; the different reasons they have for picking up books or basking for hours in the screen’s warm glow. Because eBooks have an inescapable need to render words on the screen, this chapter first turns its sights to layout and legibility, developments that draw on the mature design disciplines of typography and text layout. Hand in hand with these techniques go studies of legibility. The second chapter closes with high-level coverage of eBook hardware requirements and recent advances in display technology. Once these basics are out of the way, Chapter 3 discusses reading-related functionality: annotation, navigation, linking, clipping, and bookmarking—in short, all of the things people can
do with an ordinary print book as they interact with their reading material. The emphasis in this chapter is on practice and on an analysis of what this practice means for the form and function of the electronic book. Like Chapter 2, Chapter 3 closes with a discussion of specialized hardware, this time to support interaction, support that goes beyond the traditional mice and menus.
Reading is, above all, a social activity, so Chapter 4 takes a close look at the sociality of reading: how we read in a group, how we share our reading materials, and how we share the artifacts of our reading including annotations, bookmarks, clippings, and the books themselves. As is true of any discussion of sharing, we must pay attention to the associated issue of privacy; in the digital world, everything from page views to reading lists has become fodder to fuel our social networks. What we read (and what we make of it) is now potentially available to one’s online friends and to a broader community of readers.
Because study results are integrated throughout the book, individual studies of reading and eBook use are not relegated to their own chapter. Instead, Chapter 5 talks about how a researcher might go about designing some types of studies that are particularly appropriate for exploring reading and for evaluating the results of eBook field deployments. Studies of reading and reading-related activities are difficult for specific reasons; we will go through these reasons and discuss ways of compensating for the difficulties.
No book about electronic books would be complete without an explicit discussion of content, i.e., how the electronic book is written. Hence, Chapter 6 delves into the underlying representation of content and efforts to create and apply markup standards to texts.
This chapter also takes a practical look at text preparation, coupled with an examination of how print genres have made the journey to digital and how some emerging digital genres including blogs, wikis, and hypertext fiction may be realized as eBooks. As we discuss content, we also hit on the high (or some would say low) points of economic and legal underpinnings for the electronic book marketplace, most notably DRM and other mechanisms for protecting content from piracy and unrestricted copying. Finally, because we are on a computer, we can do things with digital texts that we were never able to do with print books. For example, computational power allows us to apply existing search and way-finding mechanisms inside the book; translation facilities and online references can be ready-to-hand; support for different kinds of reading that are part and parcel of different disciplines (such as law, education, and analysis) may be implemented within an eBook architecture that supports such extensions and adaptations. Chapter 7 discusses how we can do more with electronic books than just read them cover to cover. I devote the concluding sections of the chapter to collection-level functionality for digital texts, such as tools for gathering, triage, and analysis.
For example, just studying reading as it naturally occurs is difficult and is even considered a bit creepy: watching people read when and where they normally read is seldom comfortable for the observer or the observed.
Earlier, I quoted Clifford Lynch as saying, “Think of portable personal digital libraries, not portable electronic books, as the future role of these appliances.” Thus, the chapter as a whole focuses on what it will take to make eBook platforms into more than simply books rendered electronically and what it will take to transform eBook platforms into portable personal digital libraries.
In the end, some basic philosophical questions remain: Just how faithful should electronic books remain to print books? Should we say, the book is a known form, and document genres have coevolved with human practice; hence, the print book is the apotheosis of a particular mode of communication? Or should we take the radical position that our colleagues took during the 1990s when they heralded the end of books and the death of text? How should we even set about answering these questions about the future of the book and reading? My hope is that by the time the reader reaches the end of this book, the reader will feel equipped to venture opinions and write the next set of op-ed pieces, perform the next set of studies, build the next iteration of eBook prototypes, vet the next generation of eBook business plans, or engage in a heated argument with the stranger in seat 17C about the future of reading.
chapter 2
Reading
When we talk about electronic books (eBooks), what we’re talking about is reading. Reading is the fundamental way that we engage with books, isn’t it? It seems self-evident and hardly worth talking about. And that’s exactly why it is worth talking about. EBooks have evolved amid an atmosphere of anxiety and skepticism. Anxiety has been provoked by the literati’s numerous predictions of the end of books, of changes in reading as we know it (Bolter 1991); this anxiety has provoked a tremendous backlash (Gass 1999; Birkerts 1994). Skepticism has followed in the wake of the collapse of the paperless office (Sellen & Harper 2001): why would anyone read on a computer screen when paper is so malleable and affords such natural interaction? As evidence of this anxiety, Gass (1999) wrote a predictably nostalgic article about the pleasures of reading print books. It did not go unnoticed. To introduce an editorial by humanities scholar and writer Diane Greco, Eastgate Systems’ chief scientist Mark Bernstein wrote:
Pundits nostalgically continue to disparage new media forms. William Gass issued the latest sortie in a recent issue of Harper’s Magazine, arguing that “words on the screen” (his phrase) cannot compete with the pleasures of paper-and-ink textuality, because digital media do not record the serendipitous events that can occur in individual readings. To support his claim, Gass points to his discovery of certain jam-stained pages in his copy of Treasure Island, which, to him, evinced praiseworthy engagement with the book. Although Gass seems to have outgrown the juvenile excitement to which the pages attest, nevertheless he values his copy, with its sticky leaves, as a relic, as tangible evidence of better days before everything became, in Gass’s words, “data day and night.”
But surely Gass is talking about a certain kind of reading and a certain kind of interaction. And certainly we all have deeply held stereotypes about the nature of reading.
This sidebar was taken from http://www.eastgate.com/HypertextNow/archives/Gass.html. (Re-retrieved June 10, 2009.)
What’s worse, we’re looking at a moving target: new genres emerge; readers’ purposes vary; new technologies are just over the horizon. Ten years ago, Web logs (i.e., blogs) were so new that they were worthy of comment in The New York Times. Fifteen years ago, John Seabrook, writing in The New Yorker, felt it necessary to explain what a home page was. Yet these genres are so much a part of our lives by now, we wouldn’t question them or why we read them: emerging genres are sneaky. Even the stability of ludic engagement, reading in its most fundamental form, is being questioned. Ludic engagement is that form of reading that we think of when we say someone is “lost in a book.” EBook design has been prone to focus on this sort of immersive reading (deep involvement with a single work) despite the thought that it is fast disappearing, given the availability of so many content choices and our social tendency toward fragmented attention. In his thoughtful book about documents and reading in the digital age, Scrolling Forward, David Levy discusses some of the factors that militate against immersive reading:
Changes in the technologies and the character of modern life may be putting an end to reading in depth. That’s the fear, at any rate, in some quarters . . . It isn’t that the book has gone away, but rather that the cultural conditions for [deep] reading . . . are fast disappearing. (Levy 2001, pp. 108–109)
Levy differentiates among three types of engagement with documents: intensive reading, the traditional picture of reading that is the basis for much electronic book technology; extensive reading, which acknowledges that a reader may be using many different books at once; and hyperextensive reading, a distinction that brings the fragmented quality of human attention into sharp focus. Not only are readers using many different books at once, they are also reading parts of books and they may be reading these fragments out of order.
Reading is aligning with our fragmented lives. Although at last count we’re printing more than ever, we may be printing not because we’re reading more, but rather because we’re reading less. In a recent internal survey, the most common reason that people printed documents was to read them later. In other words, we print so we don’t have to read. Although it’s tempting to think that this is a harsh conclusion and that we print to defer reading, interviews reveal that this deferred reading may never take place: once a document has been printed, it can be consulted as needed and is thus never read in the way that we conceive of as reading.
Katie Hafner, I Link, Therefore I Am: a Web Intellectual’s Diary, New York Times, July 22, 1999.
John Seabrook, Home on the Net, New Yorker, October 16, 1995.
New reading technologies are also on the horizon: bistable displays give us the ability to furnish eBook platforms with low-power screens. But it is dangerous to fall victim to technological determinism; technology, by itself, says little definitive about where we are going. In 1945, President Roosevelt’s science advisor, Vannevar Bush, wrote an article for the Atlantic Monthly that many people regard as prescient. In As We May Think, Bush foresaw technologies like hypertext, realized in a device he called the “memex”: The owner of the memex, let us say, is interested in the origin and properties of the bow and arrow . . . He has dozens of possibly pertinent books and articles in his memex. First he runs through an encyclopedia, finds an interesting but sketchy article, leaves it projected. Next, in a history, he finds another pertinent item, and ties the two together. Thus he goes, building a trail of many items. Occasionally he inserts a comment of his own . . . Thus he builds a trail of his interest through the maze of materials available to him (Bush 1945). But he also quietly missed the mark on one important social change. A second quote, also from As We May Think, illustrates it the best: The advanced arithmetic machines of the future . . . will have enormous appetites. One of them will take instructions from a whole roomful of girls armed with simple keyboard punches and will deliver sheets of computed results every few minutes. There will always be plenty of things to compute in the detailed affairs of millions of people doing complicated things (Bush 1945). If we take this quote, and not the other, as the linchpin of change foreseen, we come away with a different picture. Computers were not going to be personal machines; they were going to stay put in their role as stodgy number crunchers. And women were not going to be writing blogs on computers; they were going to be relegated to the keypunch room. 
In Vannevar Bush’s time, who would have believed that men would learn to type? Certainly not Xerox executives, who, upon seeing the first personal computer at Xerox PARC, reminded the researchers that men would never do their own typing. Technological change can mix with social change in ways that no one really foresees.
I have heard this story from several different sources. Unfortunately, I don’t have the citation necessary to cement the story’s authenticity, so it will have to remain in the status of urban myth.
Hence, in this chapter, we’ll start at the most basic level of eBooks: reading. We’ll examine assumptions about reading, explore some of the purposes of reading, and set forth a reading typology. Once we’ve sketched out these basic elements of how a reader engages with a book, we can examine the flip side of the question from the book’s perspective: why a print page looks like it does and why an onscreen page looks like it does. Then, because reading from computer screens has always stirred up a great deal of controversy, we’ll look into some studies of the effect of layout on readers’ performance and check our intuitions about why it’s been difficult to shift readers from the print page to the screen and why this difficulty is easing.
2.1 READING
The subject of this chapter’s first section is perhaps the most self-evident of all: what do we assume about reading, are these assumptions true, and what distinctions do we want to make among the kinds of reading that people do at work and at home? Looking at these assumptions and distinctions is important. They ultimately arise when we question the veracity of reading research (“Sure it works that way in the lab. But what happens when people are reading what they want to read?”) and when we question the reach of various products (“Sure, you might want to read a Star Trek novel on a Kindle, but will you like it when you’re reading Proust?”). Critiques of eBook technologies are deeply rooted in the types of reading that we picture: of course, you’ll need to annotate if you’re reading to learn; of course, you’ll need different modes of navigation if you’re reading for reference; of course, you’ll need a backlit device if you’re reading in the dark, in bed. And so we’ll start at the beginning.
2.1.1 Assumptions About Reading
Let’s look at some of the assumptions we make about reading. First of all, we think of reading as a stationary activity, something people do seated at their desks in their offices, at their carrels in a university library, and in their easy chairs in their living rooms. We think of reading as passive, an activity in which at best we move our eyes across the screen, from word to word, from page to page, until we’re done. And we keep going in that unstoppable march forward from word to word until we’re done reading and go on to the next thing. And as far as being a solitary pursuit? Most certainly. Reading is an intimate act: we do it alone. We read in private. It’s just us and our cognitive processes. Together. Alone. And when we read for meaning (as opposed to reading purely for pleasure), we assume that reading is fundamentally about the information contained in the documents. This is not to say that we just absorb the material at face value (we might skim quickly, we might analyze the material deeply, or we might synthesize it with the other things we know), but reading is about the content of the book, not the container.
But if we look at these assumptions, one by one, they are not straightforward at all.
Mobile. The first assumption we can call into question is that reading is stationary. It is stationary only from the perspective that people can get into trouble if they walk and read at the same time: they can fall down a manhole or walk into a pole. Often, people carry their reading material with them to read when they have time or when they find a place conducive to reading. It doesn’t matter whether they’re reading on a mobile device or on paper. For example, a researcher said this of a technical paper she was reading:
I took it [a technical paper] home a couple of times, but it never got anywhere there . . . You can see that it’s totally trashed. It’s been to the pool. It’s been just about everywhere with me (Marshall et al. 1999).
If a mobile device is successful, it presents that same opportunity to change locations and situations until the right time and place arises for reading. Recalling the interview segment quoted in the first chapter, the Pocket PC’s success as a reading device hinged substantially on its portability:
If I’m going home to Colorado, I have to really be sure I’m going to read something if I’m going to bring it. Otherwise, why should I bring it? This thing [a handheld with eBook software], I was like, “I’ll bring it, and if I read it, I read it; if I don’t, I don’t.” It doesn’t matter. It’s small, it’s handy (Marshall & Ruotolo 2002).
Mobility doesn’t always mean that a dramatic change of venue must occur. Micromobility is a common phenomenon; a reader may move to a more comfortable chair or to a place where the light is better:
I usually read in one of the chairs in [the living room]. That’s partly because I don’t have a desk lamp in here . . . [The chairs are] very comfortable (Marshall & Ruotolo 2002).
The mobility of reading is underscored by taking a comparable look at writing. In interview studies, when I have asked “where do you work?” to set up places to conduct interviews with highly mobile students, researchers, and other professionals, most of them have answered more in terms of where they wrote than where they read; they see writing as their fixed, visible work, while reading is a highly mobile, highly fluid activity. Is this finding still true in an age of light laptops and increasingly capable small devices? Is reading still noticeably more mobile than writing? In a 2008 study of scholarly writing, I found that
despite the quest for seamless replication of content (thus making both reading and writing possible everywhere), writing continued to be more place-constrained than reading (Marshall 2008b). Thus, all evidence continues to point to the fact that reading is mobile. And because it is mobile, when we contemplate electronic books, we have to think of them as something that will be toted along with everything else we carry around. Early generations of eBooks were heavy and bulky: it was natural to react to them as just another mobile technology that would need to be carried (and charged) along with all of the other special purpose gadgets we carry and charge. It is also necessary to consider how material will get on the device (and how the device will synchronize with other mobile devices). Will reading material accumulate and become a personal library? Or will ad hoc collections grow on each device according to the device’s use? We will look at these questions later in the book.
Interactive. It is all too easy to stereotype reading as a relatively passive activity: videogames are interactive, reading is passive. Mortimer Adler famously suggested that critical engagement with a text is necessarily active (Adler and van Doren 1972); hence he (and others of his ilk) uses the term active reading to promote a higher degree of critical interaction with one’s reading and to characterize the thoughtful reader as being in perpetual analytic dialog with the writer. But interactivity need not be so high-minded as that. Grab a used textbook and you will find all kinds of annotations, worthwhile and otherwise. Magazines at the laundromat will have articles torn from them. Even the way books and papers are organized on our physical or digital desktops will remind us that reading begets a great deal of interaction. We discuss basic types of interaction in Chapter 3, how these interactions play into the social life of eBooks in Chapter 4, and more advanced interactivity in Chapter 7.
Social. Is reading really a solitary pursuit? In Scrolling Forward, David Levy tells us that reading is inherently social: It is also worth noting that solitary reading always was, and still is, inherently social: how we read is ultimately determined by social conventions and community membership (Levy 2001). But beyond that abstract notion of sociality—that reading and writing are fundamentally communicative acts, taking place within a cultural context—there are ways in which we see evidence of reading together all around us. Students read together in a classroom; drivers read billboards together as they speed by the landscape; and people browse the Web together, possibly with one person looking over the other’s shoulder or in a multiway Skype call. If you think about it, you’ll be able to come up with many ways that people read together. People also share the artifacts of their reading, either inadvertently (they loan someone a book that they’ve annotated) or purposefully. They may send each other clippings or recommend books to one another.
In the end, the sociality and sociability of reading is uncontroversial.
Material. It is easy to treat the idea of information extravagantly: as something that explodes, as something there is too much of, or as something that has no material form. I have used this quotation from Geoff Nunberg many times in the last decade and a half, in many circumstances, but my overuse has rendered it no less evocative:
Reading what people have had to say about the future of knowledge in an electronic world, you sometimes have the picture of somebody holding all the books in the library by their spines and shaking them until the sentences fall out loose in space . . . (Nunberg 1993).
A book has an easy materiality, a materiality which can be purposefully examined, seen, and even smelled. In an article about Arion Press, an apprentice printer tells the journalist, “this work is so tactile: it’s a sensory experience,” noting the pleasure of working with ink, paper, cloth, and leather. Can you envision an intern, marking up an electronic text according to TEI dicta, saying that the keyboard has an enticing tactile quality? The materiality of books is appreciated implicitly. Online used book seller Alibris had a long-running ad campaign in which they showed the tattered covers of memorable paperbacks (Coming of Age in Samoa, On the Road, and Catcher in the Rye) and offered the ability to reclaim the book that you’d lost “in a bus somewhere near Kathmandu.” The tattered cover is, in and of itself, a reminder of the book and all of the experiences surrounding it. EBooks inhabit an ambiguous place in the materiality of reading. The eBook has a form, but what does it mean? The heft of an eBook tells the reader nothing: The Elements of Style has the same heft as Fowler. The eBook page is malleable and may be reformatted on the fly. But reading remains essentially physical.
One of the early objections to eBooks was the plaintive, “I could never curl up with an eBook.” A graduate student who participated in the Pocket PC study countered that objection, saying: And I heard things like, “Oh I could never curl up with a computer.” And I always test that for myself. And in fact, I would lie back on my couch with my feet up on this end, my head propped against the pillow there (Marshall & Ruotolo 2002).
In Heidi Benson’s “The Power of the Press” (San Francisco Chronicle, April 20, 2003).
Fowler’s Modern English Usage is a comprehensive style guide of substantial heft; The Elements of Style is a slender and idiosyncratic volume.
And with that, she demonstrated her ability to sit comfortably on her sofa, reading textbooks on her laptop. Thus, despite our tendency to think of anything on the screen as virtual, reading itself is material, and many eBook designs make an effort to reclaim that material, physical quality.
2.1.2 Purposes of Reading
One of the early studies that was performed with the explicit purpose of informing the design of what the paper referred to as “digital reading devices” was a diary study (Adler et al. 1998); what is important about this study is that it reminds us of the multiple purposes people have for reading, beyond simply reading for information. Thus, the categories include reading to support discussion, reading to edit or critically review text, reading to learn, and reading to answer a question. Schilit et al. (1999) break down some of these purposes for reading according to the level of interaction they require and whether they cross boundaries among multiple texts. They point out that reading a novel or browsing the newspaper requires relatively little interaction with the text and that studying a textbook, reviewing a proposal, or keeping up-to-date professionally requires substantially more interaction. Single text activities include reading a poem aloud and using a diagnostic manual; multiple text activities include surfing the Web or researching a topic. We can extend this contrast between active reading and immersive reading to form a space of reading types. This space is populated with examples in Figure 2.1. Marshall (2007) characterizes the reasons for reading a newspaper as including reading primarily for relaxation and as a diversion; reading to follow the narrative of specific breaking stories; reading in response to particular recommendations; and reading broadly to stay informed or to keep up with the events of the day. What is clear from these varying discussions of reading’s purposes is that reading is seldom one thing, even given a single reader engaged with a single text. Furthermore, readers may not be able to fully articulate their reasons for reading.
In the Times News Reader study (Marshall 2007), participants reported a variety of reasons for reading the publication (e.g., no stories about Paris Hilton on the entry screen or that the stories are well thought out and fact-checked); yet these reasons were somewhat dissonant with their actual reading practices (they are primarily seeking the latest news and may not look too far beyond the headlines). Reading to stay informed might include a check of the advertising inserts: What I did miss, funnily enough are the ads. It’s weird, but the Saturday and Sunday editions, especially the Sunday edition, there are a lot of ads. [You can find] specials on Broadway shows. I do use that a lot to book my tickets online (Marshall 2007).
FIGURE 2.1: Active reading versus immersive reading: a space of examples. Labels: active reading (purpose in mind; requirements: interaction, manipulation) and immersive reading (focused attention; requirements: transparency, legibility). Examples shown: an executive skimming email to respond; a professional looking through business magazines to see “what’s new”; a coffee drinker paging through a newspaper; a lawyer reading key cases and viewing a video deposition; a music student preparing for a final in music theory; an analyst reading “hot” cables; an airplane passenger reading a novel.
Reading a newspaper to stay informed may be virtually indistinguishable from reading a compelling novel or from tracking down information from multiple sources to answer a question: For example, there was this news about Wesley Snipes being arrested. Which was flashing on the banner stuff. And I wanted to know. . . . I came back home late, so I missed the evening news. It was that in-between couple of hours. It was 6:35, actually. So I missed the 5:00, the 5:30 news, and I wasn’t listening to the anchor. In any case, I went online—actually, the TV was on. And the TV was talking about this. And I went on the Times Reader to try and see if it was there (Marshall 2007). It is easy to see why categorizing a reader’s purpose is so problematic. Nonetheless, it is important to acknowledge that reading may be motivated by a multiplicity of purposes.
2.1.3 Types of Reading
Just as there are multiple purposes for reading, there are distinctions among the different ways that people read. At the very least, we should distinguish among five rough categories that are easy to identify on video recordings. We use three of these categories for analytic purposes in Marshall and Bly (2005a), but we also discuss the other two categories (skimming and seeking). These categories are evident in other discussions of reading but are not set out independently from reading purpose. Table 2.1 identifies five categories; a sixth orthogonal category, rereading, is also introduced.

TABLE 2.1: Reading types.
Type: Characterization
Reading: Canonical careful reading. The reader traverses the text linearly. The aim is comprehension.
Skimming: Faster than canonical reading. Traversal is still linear, but comprehension is sacrificed for speed. The aim is to get the gist of the text.
Scanning: Faster than skimming. Traversal becomes non-linear; the reader looks ahead and back in the story. The aim is often triage or to decide on further action.
Glancing: Pages are turned very quickly; the reader spends almost as much time turning pages as looking at them. The aim is to detect important page elements (e.g., beginnings and endings of articles, photos or figures, headings) until something holds sufficient interest to transition to another type of reading.
Seeking: Reader scans quickly for a particular page element (e.g., proper nouns) with an aim orthogonal to full comprehension. A study participant described looking for names in a magazine article: “I don’t know anything about pop music. I should because we’re always stumped in The New York Times crosswords by the pop music characters. I do know Beyoncé is an important character who appears in the crosswords. So I may get a few names out of it. But that’s about it” (Marshall and Bly 2005b).
Rereading: Rereading is a meta-type that is included in the table as a reminder that any type of reading may occur multiple times.

No discussion of reading types would be complete without mention of rereading. It is important to remember that some skimming and scanning is performed either before or after deep reading takes place. Thus, a text can be approached multiple times (sometimes with multiple purposes in mind). Rereading may occur in a different venue (i.e., on a different computer, or in print if the
skimming was done on paper) than it did originally; it may serve as an introduction (if it is done before) or a reminder (if it is done afterward). Unless a text is specified explicitly as new, we should not assume that it has not been seen before.

The Pocket PC study (Marshall & Ruotolo 2002) reminds us that an eBook might be a complementary reading venue, existing in an ecosystem of personal devices that also may be used to read. In that study, most of the students used the electronic texts on the handhelds to become familiar with shorter texts and excerpts, not to read deeply. They used words like skim and glance to describe how they read on the Pocket PCs. In general, the students characterized on-screen reading this way to contrast it with how they read print materials. Thus, in this situation, on-screen reading was not sustained deep reading but rather quick reads of course texts, possibly in situations in which the student’s attention was divided (such as in-class reading), with the intention of reading the text again. The students skipped around in the electronic texts, focusing on some parts more closely than others. The ability to search and focus on short segments of a longer text and the ability to navigate through an extensive set of familiar materials seem to be universally cited strengths of reading on the screen. Thus, rereading across multiple venues should hold a prominent position in our taxonomy of reading types.
2.2 LAYOUT, TYPOGRAPHY, AND LEGIBILITY
What do we mean when we talk about layout? Usually, layout refers to various elements of page design: how a linear flow of text is represented on the print page or in a window on the computer screen. In the print world, page layout is largely the job of the publisher. That is, neither the reader nor the writer has much to say about how text appears on the page. The writer supplies the publisher with content (probably in digital form), and the reader buys the finished book; only through indirect feedback (the writer grumbles to her publisher; the reader buys an edition with a print style she believes to be more legible) does either have any control over how the book looks. On the other hand, a designer working for the print publisher may put a great deal of thought into choosing a typeface and size; specifying the margins, column widths, and paragraph leading; placing figures and charts on the page; opting for text justification and hyphenation; and generally attending to the book’s aesthetics and legibility (Schriver 1997).

Design matters become less straightforward when we start talking about reading on the screen. Digital publications with significant resources may draw on layout techniques from the print world, with publishers and designers controlling the page layout to a significant extent. At the
The online edition of The New York Times (http://nytimes.com) is a good example of this trend. Many publications have put extensive resources into their visual brands and are anxious to maintain these recognizable brands in their online sites.
other extreme, on-screen layout may be left in the reader’s hands to a much greater extent, as it was originally in Web browsers, with the reader choosing the typeface, the font size, and the general look of the text, and the writer/publisher specifying only a few functional elements of the document, such as heading levels and where paragraphs begin and end.

Unlike print documents, digital documents offer the opportunity for the page layout to be computed at display time, guided by algorithms and heuristics (rules of thumb). Thus, the application (and therefore the software developer) and the reader may play a larger role relative to the publisher in determining the final look of a document. Text size, style, position, and rendering techniques may be optimized to take advantage of the display hardware, in conjunction with the characteristics of the material being displayed and the reader’s own preferences. Hence, in electronic books, we often find that the final layout is produced through an interaction among three factors: (1) the content representation (i.e., how the content has been marked up, and how styles have been defined by the publisher); (2) the eBook software (which interprets the markup and applies any additional constraints imposed by a particular platform); and (3) reader-controlled settings (e.g., font size).

In this chapter, basic elements of layout will be introduced and defined, along with a discussion of some common heuristics for computing a readable page display. We will expose only the tip of the iceberg in this chapter; the interested reader can find many published resources and ongoing discussions of designing typefaces and page layouts for the screen. Does layout matter, beyond the basic aesthetic concerns that are intrinsic to design?
That is, most of us are aware that some documents and books just look better than others; we are also aware, perhaps acutely, that an ugly book is uninviting to read and that it is possible for text to be presented on the ragged edge of readability, with type that’s too small or too tightly spaced, or with lines of running text that are obviously too long for comfort. And certainly, we are aware of a look that is part of a publisher’s brand. But does layout have any bearing on actual reading performance? Does the way that the words are presented on the page have any effect on the speed with which a reader may make her way through the text or on her comprehension of its content? Furthermore, do these design principles change a reader’s immediate reaction to the content? In other words, is an argument more persuasive if the page is well laid out? Certainly, we might expect to have a different reaction to the same book read aloud by Paris Hilton and by James Earl Jones, but in the case of page design, are looks that important?
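The three-way interaction described earlier in this section (publisher-defined styles, constraints applied by the eBook software, and reader-controlled settings) can be illustrated with a small sketch. Everything here is hypothetical: the `TextStyle` type, the `resolve_style` function, the size limits, and the order in which the three factors are applied are assumptions made for illustration, not a description of any particular eBook platform.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextStyle:
    """A tiny stand-in for a style the publisher defines in the markup."""
    font_family: str
    font_size_pt: float
    leading_pt: float


def resolve_style(publisher_style: TextStyle,
                  reader_scale: float = 1.0,
                  reader_font: Optional[str] = None,
                  min_size_pt: float = 8.0,
                  max_size_pt: float = 24.0) -> TextStyle:
    """Combine the three factors into the style actually displayed.

    Assumed order: start from the publisher's style (factor 1), apply the
    reader's settings (factor 3), then clamp to the limits imposed by the
    platform (factor 2).
    """
    # Reader setting: scale the publisher's font size up or down.
    size = publisher_style.font_size_pt * reader_scale
    # Platform constraint: keep the size within what the device supports.
    size = max(min_size_pt, min(max_size_pt, size))
    # Keep leading proportional to the final font size.
    ratio = size / publisher_style.font_size_pt
    return TextStyle(
        font_family=reader_font or publisher_style.font_family,
        font_size_pt=size,
        leading_pt=publisher_style.leading_pt * ratio,
    )


publisher = TextStyle(font_family="Georgia", font_size_pt=12.0, leading_pt=3.0)
displayed = resolve_style(publisher, reader_scale=1.5, max_size_pt=16.0)
print(displayed)
```

In this example the reader asks for text 1.5 times larger than the publisher specified, but the hypothetical platform caps the size at 16 points, so neither party alone determines the final look.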
There are a great many long-running forums, blogs, and online resources on document design and typography; the enthusiastic reader is encouraged to go off and discover them.
For example, see O’Reilly’s technical book series.
In general, the overarching importance of document layout is an area of some controversy; even empirical results may not be conclusive. Although we’d all agree that it’s more pleasant to read a well laid-out book, we might not say that a beautiful argument is more persuasive or that an aesthetic layout allows a reader to assimilate material more quickly and accurately. Thus, after discussing basic aspects of text layout, this chapter will cover some of the studies that have been performed in an effort to resolve these performance questions and quantify the differences they introduce. By now, almost all of us have had the experience of preparing a document in a text editor (or in a full-on desktop publishing application) on a personal computer, although we have varying degrees of experience with adjusting the word processing application’s default settings. Application developers make an effort to use default settings that work well for most people; thus, the layout functions of text editors can be fairly invisible to typical users. In fact, it’s not until we prepare a document that consists of more than running text—when we start fussing with tables, graphics, or embedded media—that we have to worry about these settings at all. Even then, we might be tempted to ignore the arcane terminology that appears to be a relic of the printing press (and the typewriter) and is today more germane to the work of graphic designers than it is to eBook readers and writers.10 The purpose of this chapter is to provide enough background so that the reader is conversant in all of the design elements that are discussed by studies of eBooks in use and in the more general studies of reading from the screen. This chapter will also provide us with a common vocabulary to refer to aspects of eBook layout; there is little sense in reinventing language that has evolved with the technologies of reading and writing. Legibility. 
Legibility (in this usage) is the extent to which running text presented on a display appears to be readable. Many different elements contribute to legibility. Figure 2.2 shows two contrasting eBook pages to give you a sense of what we mean when we use the word legibility: the same words may be on a page, but slight adjustments in the page design may give the reader a very different impression of its legibility. Use plays into legibility too. Our 2001 study of reading eBooks on a small form-factor device concluded that students were more likely to skim the material provided to them on the device; their reading was time-constrained (“I have 10 minutes before class”), interrupted and fragmented (“I read at the bus stop”), and nonlinear (i.e., students skipped from place to place in the text). In this situation, students are more likely to judge layout efficacy on how much they can see of the text
10 Document design is a complex topic; this book doesn’t have room to do it justice. The interested reader might refer to an authoritative design guide such as Schriver (1997) or to discussions of the use of graphics (Tufte 1990 and Bertin 1983 are classics in this area). Typography itself is practically a cult, and there are many worthwhile discussions of the art and science of type design.
FIGURE 2.2: Two different layouts that illustrate legibility trade-offs. (a) Open layout that has been produced using legibility heuristics; the last line is selected for comparison purposes. (b) Tighter layout, optimizing the amount of content displayed on the page (for maximum context). Note the relative position of the selected line.
at once. That is, they’re willing to sacrifice reading comfort (e.g., ample leading, wide margins) for more context (literally, more words on the page).

Genre also makes legibility a less straightforward matter. Certain kinds of material are very sensitive to layout; changing the layout changes the reader’s perception and understanding of the content. Poetry, particularly structured verse forms like rhyming couplets, provides us with compelling examples of this phenomenon. Participants in the 2001 study cited above described verse with rhyming couplets where the last words of most (print) lines fell on new lines (on their Pocket PC displays). When the rhyming words drew attention to themselves, the verse gave the reader an unanticipated sense that the poetry was doggerel. Rereading the poetry properly laid out was necessary to correct this misperception and restore the verse’s intended meaning and sense. Figure 2.3 shows an example of this layout phenomenon.

Thus, we can see that legibility is a complex interplay of factors. What kind of reading will the reader be doing? What constraints are introduced by the genre? Does the display impose any other constraints by virtue of its size or properties? Some type-rendering techniques
FIGURE 2.3: Layout and genre: the perception of a poem is altered by its layout on the screen.
such as ClearType take advantage of LCD striping; hence, if the display is not LCD-based, it is possible the technique will not work properly. Once we have teased out these mitigating factors, we can look at page layout in terms of more obvious physical elements such as type size, font family, font types, leading, spacing, and line lengths. There are other more subtle parts of type design, such as the letterforms (the actual shapes and characteristics of the letters) and counterforms (the enclosed spaces in letters like q, o, or g), that are the concern of typographers; we will not worry about these design elements here.

Type size. Unfortunately, type size is computed in less uniform terms than we might initially imagine. Most of us have encountered the standard type size metric of points. Points are used to measure type size both on the screen and in print in most English-speaking countries. There are approximately 72 points to an inch; in other words, we might expect a 12-point type to be 1/6 of an inch tall at its tallest point (i.e., from the bottom of its lowest descender, the part of the letter that extends beneath the line, e.g., the tail in a “y,” to the top of its highest ascender, the tallest part of the letter). But there are fascinating fudge factors built into this measurement, due in part to its history and relationship to movable type (e.g., see the brief account given by Schriver). The same story that makes points such an interesting measure also makes them difficult to standardize. You have probably noticed that there is a great deal of size difference among comparable letters in different font families (which is why you often must redo document layout when you change fonts even if you change nothing else). These differences compound over the course of a book or lengthy document, and they play strongly into our discussion of legibility: some fonts may be easier to read on the screen than others are.
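The point arithmetic above can be checked with a few lines of code. The following is a minimal sketch, assuming exactly 72 points per inch (the text says "approximately") and a hypothetical display density of 96 pixels per inch; the function names are illustrative only. The measured heights are the ones reported for an uppercase C in Figure 2.4.

```python
POINTS_PER_INCH = 72.0  # assumption: treat "approximately 72" as exactly 72


def points_to_inches(points: float) -> float:
    """Nominal height in inches of type set at the given point size."""
    return points / POINTS_PER_INCH


def points_to_pixels(points: float, pixels_per_inch: float = 96.0) -> float:
    """Nominal height in pixels; 96 pixels per inch is an assumed display density."""
    return points * pixels_per_inch / POINTS_PER_INCH


# Heights (in inches) of an uppercase C set at 72 points, as reported in Figure 2.4.
MEASURED_C_HEIGHT_INCHES = {
    "Times New Roman": 0.69,
    "Georgia": 0.72,
    "Arial Narrow": 0.73,
    "Lucida Sans Typewriter": 0.71,
    "Verdana": 0.73,
}


def fraction_of_nominal(measured_inches: float, nominal_points: float) -> float:
    """How much of the nominal point size a measured letter actually occupies."""
    return measured_inches / points_to_inches(nominal_points)


print(points_to_inches(12))   # nominal height of 12-point type, in inches
print(points_to_pixels(12))   # the same height in pixels at the assumed density
for font, height in MEASURED_C_HEIGHT_INCHES.items():
    print(font, round(fraction_of_nominal(height, 72), 2))
```

Each measured C occupies noticeably less than the nominal one inch. That is to be expected, since the nominal size spans the lowest descender to the highest ascender rather than a single capital letter, but the fraction also differs from font to font, which is the variability the text describes.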
Figure 2.4 illustrates the size difference between five uppercase Cs, all presented in 72-point type in Microsoft Word. Letter sizes were measured directly from the screen using a machinist’s caliper. The character sizes vary substantially in width (measured to be as much as 0.10 inch) and by a visually conspicuous amount in height (measured as 0.04 inch).

FIGURE 2.4: Comparing the measured width and height of a letter (an uppercase C) rendered on the screen in different fonts (in Microsoft Word). Letter spacing and kerning may also be affected by font choice.
Sample  Font                     Size        Height      Width
(a)     Times New Roman          72 points   0.69 inch   0.55 inch
(b)     Georgia                  72 points   0.72 inch   0.55 inch
(c)     Arial Narrow             72 points   0.73 inch   0.48 inch
(d)     Lucida Sans Typewriter   72 points   0.71 inch   0.49 inch
(e)     Verdana                  72 points   0.73 inch   0.59 inch

Letter spacing and kerning. It is obvious from looking at most text displayed on the screen (or in print for that matter) that the letters are not simply placed an equal distance from one another; most fonts include embedded corrections that take into account the way the letters fit together into words. For example, an “o” next to an “x” (say, in “box”) looks very different from an “l” immediately adjacent to another “l” (as you would see in “llama”); because the outward bulge of the o coincides with the inward shape of the x, they can be placed closer together. Kerning is the term used to describe how the distance between specific letter pairs is adjusted for the sake of aesthetics and legibility. Kerning may be performed automatically, as part of the type design process, or both. Tracking or variable letter spacing also refers to text density that creates word shape. Beyond aesthetics or genre conventions, letter spacing that is either too tight or too loose can have an adverse effect on legibility.11

Not all fonts use such corrections, however. So-called proportional fonts, which include most of the fonts you see on the screen when you are reading, use kerning and variable letter spacing. Monospace or fixed-pitch fonts are used when it is necessary to predictably align letters; in a fixed-pitch font, each letter occupies the same amount of horizontal space. For example, fixed-pitch fonts are often used in software development tools, where indentation and alignment are used to reflect

11 Tight letter spacing can create an emotional reaction on par with listening to someone speak very fast, as in a salesman’s frantic high-pressure pitch.
syntactic structures in the programming language. Courier and Lucida are examples of common monospace typefaces.

Font hinting is another technique that is used for on-screen text rendering. Hints are created during the typographical design process and embedded in the font files so that different-sized fonts can be rendered clearly. In essence, hints specify which pixels are crucial definers of a letter’s shape. Font hints and their resulting shapes often can be calculated according to mathematical algorithms, but high-quality fonts demand designer intervention to make the text visually appealing and maximally legible.

Families, fonts, and faces. Three decades of desktop publishing have had their effect on typographic terms, and many words that used to have distinct meanings are now used interchangeably. Instead of being a stickler, it seems better to be flexible and to sort out some of these things through common usage. Historically, a font meant a specific typeface (e.g., Arial) in a specific size (e.g., 12 point) rendered in a particular style (e.g., narrow). Today, font means something more general, e.g., Arial in any size or rendering style. A family is a set of fonts that use a particular typeface design; a font family includes all of the stylistic variations (narrow, thin, etc.), all of the size variations (8 point, 10 point, 11 point, 12 point, 14 point, etc.), and all of the rendering variations (bold, italic, etc.).

Serif and Sans Serif. Although type designers have created an extensive palette of fonts that may be used to great effect in electronic books, there are two principal styles of typography: serif and sans serif. In other words, fonts either have serifs (think Times New Roman) or they don’t (think Helvetica). A serif is a short line that is used to decorate or emphasize the end of the basic strokes of a letterform.
Serifs are felt to enhance legibility because they help the reader recognize the letterform and distinguish it from similar-looking letters; they also may provide the visual continuity that supports the perceptual grouping of letters into words. Thus, it is a common belief that a serif font is the most legible choice for rendering running text. Later in this chapter, we will investigate this question more fully: Does empirical data provide any reason, beyond aesthetic judgment, to use serif fonts for running text and sans serif text for headings? Are serif fonts really more legible than sans serif fonts? Practically speaking, sans serif fonts may strike the reader as cleaner and more modern. The bolded form of sans serif letterforms is easier to distinguish from the plain versions of the letterforms; hence, this makes them an appropriate choice for headings because it is easier for the reader to interpret heading hierarchies that use variation in type styles. Electronic publications often use serif fonts for the main body of the text and sans serif fonts for the headings, captions, and marginalia. Leading and line length. The word leading is derived from the strips of lead used in typesetting to form vertical separation between horizontal lines of type.12 Like font size, leading is measured in points and, for what we think of as single-spaced text, is typically between 1 and 6 points, 12
Hence leading is pronounced “ledding” not “leeding.”
TABLE 2.2: Rough guidelines for leading by font size.
Font size (points)   Leading (points)
9–11                 1–3
12                   2–4
14                   3–6
16                   4–6
18                   5–6
depending on the type size; as you would guess, the bigger the font, the more generous the leading should be. Of course, text may be rendered with no leading between the lines without one line interfering with the next; we will discuss whether this affects legibility later in the chapter. Table 2.2, adapted from Fenton (1996), offers rough guidelines for setting leading for running text.13 Rendering techniques. Anti-aliasing is a computer graphics technique used to reduce the appearance of jaggedness when lines are drawn diagonally on the screen.14 Jagged lines are a natural side effect of drawing lines using a rectangular grid of pixels. In general, the higher resolution a computer display is (the more pixels per inch and the less distance between the individual pixels), the less this effect matters; however, anti-aliasing can compensate for the actual size and relative placement of pixels as they form diagonal and curved lines. Anti-aliasing does its magic by using subtle changes of color in the areas where the jagged edges are the most obvious. These slight changes in color cause a blending (almost a blurring) of the sharp edges to give the appearance of a straight line. This technique can be important for rendering fonts, because they generally involve many diagonal lines and curves, and the jaggedness can be quite distracting and may interfere with letterform recognition. Because this technique is at the pixel level, it is difficult to detect anti-aliasing unless the letterform is blown up. Figure 2.5 shows an example of an anti-aliasing technique used to smooth an “a” letterform in the Calibri typeface. The second generation of electronic book technologies saw the introduction of sub-pixel rendering techniques that build on the idea of anti-aliasing by using the properties of modern color 13 14
These guidelines are presented as just that; they are not intended to be hard-and-fast rules.
Anti-aliasing is not necessary for lines that run vertically or horizontally, since the pixels will line up with one another.
FIGURE 2.5: Anti-aliasing technique applied to smooth the “a” letterform in the Calibri typeface. Calibri is a new face, designed specifically for on-screen reading. (a) An “a” letterform without anti-aliasing applied. (b) An “a” letterform with sub-pixel anti-aliasing.
displays, coupled with careful hinting of the fonts designed for on-screen reading. The best known of these techniques is Microsoft’s ClearType.15 Specifically, sub-pixel rendering techniques take advantage of the “stripes” in LCDs. That is, each pixel of an LCD consists of three narrow vertical stripes of color that the eye blends together; the pixel’s color is determined by the intensity with which each sub-pixel element is illuminated. By addressing each stripe of color separately—that is, lighting it only if the letterform actually passes through that sub-pixel, the screen resolution can be effectively tripled (at the expense of a minor perception of blurriness). Fonts designed for on-screen reading. Because the screen has different properties than the printed page—in particular, reduced resolution and different pixel shape—some recent typefaces have been designed to meet the needs of on-screen readers. Common examples of these typefaces include Verdana, Georgia, and Trebuchet; the first two were designed by Matthew Carter and the last by Vincent Connare. Typefaces designed for on-screen reading have slightly different properties than print typefaces. These characteristics include larger lowercase letters (so there is more open space in letters like “e”), more open letter spacing (so the individual letters don’t touch one another), and greater ubiquity (ensuring that most users will have them installed, thus making the designer’s job easier) (Quinn 2009).
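The difference between whole-pixel anti-aliasing and sub-pixel rendering, as described earlier in this section, can be shown with a toy example. The sketch below is a deliberate simplification and not ClearType itself: it assumes black text on a white background, a red-green-blue stripe order within each pixel, and a hypothetical list of coverage samples (0 for background, 1 for letterform) taken at three times the horizontal pixel resolution. Real implementations typically also filter the result to limit color fringing, which is omitted here.

```python
def _check(coverage):
    """Coverage must supply exactly three samples (one per stripe) per pixel."""
    if len(coverage) % 3 != 0:
        raise ValueError("coverage must contain three samples per pixel")


def grayscale_antialias(coverage):
    """Whole-pixel anti-aliasing: average the three samples into one gray level."""
    _check(coverage)
    pixels = []
    for i in range(0, len(coverage), 3):
        average = sum(coverage[i:i + 3]) / 3
        level = round(255 * (1 - average))  # 255 = white, 0 = black
        pixels.append((level, level, level))
    return pixels


def subpixel_render(coverage):
    """Sub-pixel rendering: drive each color stripe from its own sample."""
    _check(coverage)
    pixels = []
    for i in range(0, len(coverage), 3):
        red, green, blue = (round(255 * (1 - c)) for c in coverage[i:i + 3])
        pixels.append((red, green, blue))
    return pixels


# A vertical stem two stripes wide that straddles the boundary between two pixels.
stem = [0, 0, 1, 1, 0, 0]
print(grayscale_antialias(stem))  # both pixels become the same light gray
print(subpixel_render(stem))      # only the two stripes under the stem are darkened
```

With whole-pixel anti-aliasing, the stem is smeared evenly across two full pixels. With sub-pixel rendering, its position is preserved to within one stripe, which is the sense in which horizontal resolution is effectively tripled, at the cost of introducing color into the affected pixels.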
2.3 STUDIES OF THE EFFECT OF LAYOUT ON READERS’ PERFORMANCE
The first studies of reading on the screen were motivated (whether explicitly or not) by the question of whether reading on the screen had a deleterious effect on reading performance when it was 15
Another color sub-pixel anti-aliasing technique is Adobe’s CoolType.
compared to reading on paper. Although at the time of the initial studies, it was readily apparent that people did not like to read longer documents on the screen, and would instead print them out if given the choice, it was less clear what it would take to convince them to read on the screen. Would reading on the screen always be intrinsically inferior to reading on paper? Would people read less carefully and, consequently, with poorer comprehension when they shifted from print to display? Would they read more slowly and experience greater eyestrain and fatigue? Many of the earliest studies of reading on the screen were performed as a reaction to futurists’ claims that print forms would one day disappear. Certainly, given the state of display technology at the time of the studies, the claims were, at best, premature. In 1992, Dillon prefaced his comprehensive survey of studies of reading on paper versus reading on the screen by saying: Even so, paper is an information carrier par excellence and possesses an intimacy of interaction that can never be obtained in a medium that by definition imposes a microchip interface between the reader and the text. Furthermore, the millions of books that exist now will not all find their way into electronic form, thus ensuring the existence of paper documentation for many years yet (Dillon 1992). Needless to say, this prefatory remark—one that seemed like motherhood at the time—may now be subject to greater scrutiny. Millions of books are finding their way into electronic form and we are achieving a far greater intimacy of interaction with the screen, although we are still some distance away from where we need to be to make viable predictions about the end of books.16 It is still wise to wonder what kind of effects reading on the screen will have on performance as well as on reader satisfaction. 
The first conclusion of Dillon’s review of reading research to date (as of the early 1990s) was that it is dangerous to draw definitive conclusions from these studies because it is so difficult to control the numerous factors that affect both the outcomes of reading (using the familiar performance metrics of speed, accuracy, fatigue, comprehension, and preference) and the actual process of reading (eye movements, interaction with the medium, and navigation, which includes elements like link following, a type of navigation that was still fairly novel at the time the review was written, and page turning). In other words, the two delivery vehicles—the screen and the printed page— were so radically different that it was difficult for psychologists to compare them without stacking the deck right from the start.
16 As Robert Coover did in his so-named 1992 article that appeared in The New York Review of Books.
Indeed, some of the experimental conditions ascribed to past research call ominous attention to themselves: in the on-screen condition, subjects read white text on a blue background or green on a black background; subjects reading from CRTs were hampered by possible glare from overhead lights; or subjects read on-screen text that was unnaturally large. It is easy to see why Dillon was reluctant to find these results conclusive, although he cautiously allowed that “reading speeds are reduced on typical VDUs17 and accuracy may be lessened for cognitively demanding tasks.” That readers preferred high-quality print documents over the display technology of the day—both for their legibility and for the interactions they afford—hardly seems surprising from the vantage point of 1992 when the article first appeared. The key to the early findings is that they are tied to the display technology of the day. Thus, the comparisons will continue to be meaningless until the quality of electronic text presentation reaches a paper-like level. At that point, we can revisit the legacy results evaluating the effect of image quality (i.e., typography and layout as they are realized on paper, e.g., Tinker 1963). It is also critical to consider the performance metrics: are speed and comprehension how we want to measure reading effectiveness? Finally, we must question the typical tasks for these performance measures: Is proofreading really a typical reading task? Is it the best gauge of accuracy? Dillon also points out that once the document in question becomes more than a single page/ single screen in length, other factors beyond image quality come into play. Indeed, the scrolling versus paging question is still a fundamental controversy that drives eBook design; we discuss it further from an empirical/design perspective in Chapter 3.2. 
Page turning, its advocates argue, allows the reader to continue to take advantage of the properties of the print book such as its intrinsic support of spatial memory. But scrolling has proponents too; scrolling advocates cite continuity and navigation speed as major advantages. In fact, studies as recent as Liesaputra and Witten (2008) continue to find no definitive resolution to the scrolling versus page turning controversy; the answer is contingent on so many different factors that it will continue to be difficult to resolve the question “once and for all.”

One hypothesis that has been floated among members of the eBook community is that a weakness of digital publications is layout and that improving layout would greatly enhance people’s satisfaction with reading on the screen and might even improve their performance using conventional metrics. Indeed, it was postulated that one of the reasons that people prefer to read print books over eBooks hinges on print books’ superior layout. What would happen if we applied the known layout heuristics to on-screen displays of text and graphics?

17 Video Display Units.

Chaparro et al. (2005) compared subjects’ reading performance for online documents prepared under four conditions: with adequate margins, without adequate margins, with optimal (open)
leading, and without optimal (open) leading. In this study, the optimal use of white space affected both reading speed and text comprehension: subjects read the text with margins more slowly but comprehended it better; subjects also preferred the text that was prepared so that it had adequate margins. We might surmise that if a document is more pleasant to read, subjects will spend longer reading it and will score better on measures of comprehension. Optimal leading, on the other hand, had no effect on performance, but the subjects preferred it over text prepared with suboptimal (i.e., no) leading. A second unpublished study by Chaparro’s group18 demonstrated that applying on-screen layout heuristics to produce a better looking page (using typographically optimal headers, paragraph indentation, proper figure placement, visually appropriate quoting, etc.) does not translate into comparable improvements in reading performance. In an experiment that compared the two conditions (good and poor layout), no difference in reading speed or comprehension was found. Readers preferred the nice layout, but it did not change their ability to take in what was on the page. The application of good type design principles had even less effect: subjects did not notice them to appreciate them. But can rendering itself improve performance? In other words, if the letterforms are easier to recognize, can this translate into overall improvements in reading speed and comprehension? Gugerty et al. (2004) found that subjects’ word recognition improved slightly when the words were rendered with the Cleartype sub-pixel anti-aliasing technique. This improvement translated into small performance improvements in reading speed and comprehension for a sentence-level reading task; the improvements did not carry over into long-duration pleasure reading, however. Sheedy et al. 
(2005) performed another experiment on legibility—the accurate recognition of letterforms and words—to tease out the factors that make text legible on displays. Pixel height (which corresponds to font size), font type, and stroke width all influenced legibility, with optimal legibility attained at a 10-point font size. Four fonts were evaluated in this study: Times New Roman and Franklin were the least legible fonts and Verdana and Arial were the most legible. A recent study by Larson and Picard (2005) attempted to quantify the effect of the aesthetic differences that cause readers to prefer one page design over another. It has been found that mood influences task performance: a good mood will cause people to perform better on creative tasks than they would if they were in a bad mood. So in this study, armed with the knowledge that better rendering improves performance and that subjects prefer to read on-screen documents that look good, Larson and Picard showed that high-quality typography can put subjects in a good mood, and they demonstrated that you can feasibly ask “how good?” using established methods for measuring affect. 18
An account of this study may be found in the work of Larson and Picard (2005).
reading 33
In the end, when we contemplate the results of these studies, we find that lab subjects prefer nicely laid-out text, although the legibility enhancements provided by rendering techniques and layout do not result in corresponding increases in reading performance. In the field, what we see is more complex. There are trade-offs introduced by good layout: fewer words fit on the screen; hence, the reader may need to spend more time navigating and may put forth more effort retaining context (because there isn’t as much of the text available to quickly refer back to or to glance forward at). If we return to Figures 2.2 and 2.3, it is easy to get an intuitive sense for some of these trade-offs. Nothing is simple, but as the page on the screen becomes more comparable to the print page, and as the materiality of the eBook becomes more comparable to that of the print book, the objections to reading on the screen begin to disappear.
2.4
READING HARDWARE AND DISPLAY TECHNOLOGIES
The word eBook can refer to hardware, software, content prepared to be read on the screen, or a combination of all three. In much of this book, when we talk about eBooks, we’re by and large referring to the software—the reader—used to present the content. However, in this section, we’ll take a brief look at the actual hardware platform, the devices on which we read. While readers might choose to read on any computer—from a desktop with a large stationary display, to a laptop, to a smart phone—we will look at the elements of purpose-built hardware especially designed for reading. Let’s start with the distinction between active reading and immersive reading that we made earlier in the chapter. In reality, this distinction is artificial. Readers can change midstream from reading immersively to interacting with their reading or to skimming and scanning. However, these distinctions give us a place to start and a way to organize the discussion. Reading the newspaper on a screen suggests different hardware requirements than reading technical articles as you prepare to write your doctoral dissertation; likewise, reading office documents raises a different set of concerns than reading the latest best seller and making it part of your personal library. Thus, instead of discussing the existing electronic book products or research prototypes, we will discuss various elements of a capable hardware platform, no matter what the emphasis. If we revisit Figure 2.1, it gives us a structure for thinking about reading platforms and for eliciting requirements to specify different hardware components. Figure 2.6 shows this matrix. Immersive reading suggests a focus on legibility and ultimately on portability, since this might be the kind of reading material you would carry with you to a variety of places. Because we are suggesting this might be purpose-built hardware, it will have to be very light. 
Immersive reading (and, to some extent, active reading as well) suggests that the user interface should disappear, and the modes of user interaction should be very straightforward and not require instructions to learn; after all, no one needs instructions on how to read a book, assuming they are literate. Active reading suggests a focus on interactivity; hence, we should be thinking about different modes of user input. Depending on the type of interaction we want to emphasize (say, annotation as opposed to rapid page flipping), we might come up with a variety of different mechanisms that support the style of interaction we have in mind.
FIGURE 2.6: Hardware platform requirements as driven by different reading needs. [Matrix labeled “Active reading (purpose in mind). Requirements: interaction, manipulation” at one end and “Immersive reading (focused attention). Requirements: transparency, legibility” at the other. Its four cells read: laptop, notebook, or other mobile platform that is capable of running a range of standard applications; tablet or device that supports direct interaction with material and is fully functional; mobile reading platform with large display (for skimming/scanning); mobile reading platform, emphasis on storage and portability.]
As we begin to think about the overarching activities associated with reading, storage enters the picture. Of course, we’ll always have to store the content somewhere, but perhaps it might be wise to get beyond the transience of the newspaper or novel-based immersive reading experience that dominates the lower quadrants of the matrix. Although active reading suggests that we’ll either need to rely on cloud-based applications that run in a browser or we’ll need to specify a general-purpose computer platform, the common element is high-bandwidth connectivity. Unlike the second generation of eBook devices that we discussed in the introductory chapter, we will need to think about higher bandwidth connectivity and, more than likely, synchronization among copies of our personal libraries as we use them on different devices.¹⁹
¹⁹ Products such as Microsoft’s LiveMesh address the problem of synchronizing files among personal devices. More complex replication problems are the focus of current research (Terry 2008).
Finally, one significant overarching concern of mobile technologies enters the picture: battery life. People find little more frustrating than being forced to stop reading because they need to plug in their eBooks; long-lived batteries, coupled with efficient use and management of available power, are essential to successful eBook platforms. One argument for the use of bistable displays is that they consume less power than LCD displays of comparable resolution.
Reading relies crucially on on-screen legibility; at the same time, the mobility we associate with reading relies on reducing the weight of a conventional computing platform. While people can read on less-than-optimal displays without a significant performance sacrifice, they don’t like it (refer to the discussion earlier in the chapter). The third generation of eBooks aptly illustrates this: these new eBook products rely on the bistable display technology that has matured recently. It is well suited to text display, is light, and uses relatively little power (partly because once an image has been rendered, it is stable). Hence, we will focus on bistable displays, rather than the ubiquitous LCD displays.
A bistable display is one that remains in a stable state without requiring power; this stability is in contrast to an LCD display, which requires continuous power to display an image. The most common type of bistable display is known variously as ePaper or E Ink (both of which refer to specific products). Electronic paper consists of tiny electrostatically charged spheres or charged particles suspended (in oil) in a planar plastic matrix; electrodes surround the somewhat flexible plane. Because of the way the display image is produced, the screen is reflective like paper, rather than backlit like a traditional computer screen. These properties (image stability, reflectivity, and the consequent wider viewing angle) make electronic paper more comfortable to read in most situations than other types of displays.²⁰ An early version of this technology used charged spheres that were black on one side and white on the other.
Electric power was applied to change the state of the individual balls in such a way that they rotated to form an image (Sheridon et al. 1997). In newer versions of the technology, electrophoretic displays, the images are created using charged pigment particles rather than the discrete bicolored balls. In other words, the individual pixels can be thought of as positively charged on one side and negatively charged on the other, but many particles form each pixel; to produce the bicolored effect, the particles all migrate to one side of the ball or the other. Figure 2.7 shows schematics of the early electronic paper technology, Gyricon (Sheridon et al. 1997), and the comparable electrophoretic display technology used by the third-generation eBook hardware like Amazon’s Kindle and the Sony Reader.
²⁰ It should be noted that bistable displays cannot be read in the dark because they are not backlit.
FIGURE 2.7: Producing pixels using electronic paper display technologies. (a) Gyricon display. (b) Electrophoretic display.
Although its weight and reduced power consumption make electronic paper a good display technology to support immersive reading, ePaper is somewhat more problematic when it comes to the requirements introduced by active reading. It has a low refresh rate, so it is more difficult to implement the facilities necessary for skimming, for scrolling rapidly, or for flipping through the pages. Currently, the technology is still limited to black-and-white images, although various experiments to extend the technique to color displays are underway. Predictions variously put low-power color displays on the scene as early as the end of the year and as late as the end of the next decade.
Recent research has examined the feasibility of two-screen readers, which will give the reader a more natural view onto multiple pages. Chen et al. (2008) at the University of Maryland and the University of California have been building a dual-display prototype that supports interactions like folding, flipping, and fanning. The displays also detach from one another to provide more paper-like modes of interaction; if they are attached, the two displays are like two consecutive pages of a document, and if they are detached, they become windows onto pages of separate documents. We will discuss the interactive aspects of this prototype in the next section; the point here is that the price, weight, and power consumption of electronic paper have finally made a dual-display reader feasible. Past attempts have shown it is logistically possible, but not particularly practical (e.g., see the description of the Everybook in an article by Schilit et al. 1999).²¹
²¹ It should be noted that the prototype gets around the display refresh issue we discussed earlier in the section by using screen real estate to provide a reduced representation of the document’s pages that is used for visual navigation. See Space Filling Thumbnails (Cockburn et al. 2006) for a detailed description of such a technique.
CHAPTER 3
Interaction
EBook products to date have focused on particular types of reading (reading for entertainment, as one would read a novel, or reading to become more informed, as one would read a business book) and have more or less ignored others (e.g., reading to learn, as one would read a textbook, or reading for reference, as one would consult an encyclopedia). These assumptions about types of reading have dictated general eBook functionality and the specific facilities for interacting with eBooks. For example, the platforms assume that navigation will primarily consist of page turning: the reader will approach a book in a linear fashion, reading straight from beginning to end. Similarly, much of the interaction is confined to simply marking one’s place in the text with a bookmark or highlighting a few choice selections in the text through a clunky menu-based option. Yet some types of reading demand more sophisticated functionality. Electronic textbooks have been less successful than electronic novels partially for this reason. Students need to be able to annotate, work problems in the margins, follow along in class, and take notes. Often, this kind of reading is referred to as active reading, and various pedants have offered prescriptive approaches to how readers should approach this activity. While active reading has been at the center of eBook research projects such as XLibris (Schilit, Golovchinsky, & Price 1998), it has yet to find its way into eBook products. Likewise, successful electronic reference books have taken advantage of the power of the computer to provide their users with the ability to, for example, translate words from an unfamiliar language or to navigate through the content in ways that are well aligned with the books’ expected use (Marchionini 1995; Egan et al. 1989). They may also include enhanced multimedia presentations and the ability to work across different computers.
New modes of interaction and added functionality are not yet commonplace in eBook products, and when they are available, they seem like an afterthought rather than a fundamental part of the product design.
The complex economics of textbooks are another important factor in the marketplace response to electronic textbooks.
In this chapter, we will examine some important types of functionality that are tightly interwoven with the act of reading, such as annotation, navigation, linking, clipping, and bookmarking, and the hardware needed to support this functionality. EBook-based collaboration is also a source of advanced functionality; Chapter 4 will extend the functionality covered in Chapter 3 into the social realm and will address aspects of reading and interaction such as sharing annotations, clippings, and entire books. Although fluid interaction with eBooks is necessary to give them the so-called affordances of paper, it is important to remember that reading may be part of a more complicated system of activities. In the final chapter of this book, we’ll build on Chapters 3 and 4 to examine additional functionality that supports the move from reading to writing. This type of functionality often involves working across electronic texts to gather, triage, and analyze different information sources. Taken together, all of these different mechanisms, functions, and related applications will give eBooks the ability to take on the challenges of active reading: reading to inform and instruct.
3.1
ANNOTATION
Annotation is a basic and often unselfconscious way in which readers interact with texts. A student struggling with a difficult philosophy text might trace his progress through the dense prose by highlighting line after line. An amateur cook might cross out an unused ingredient in a favorite recipe to reflect how she usually makes the dish. An engineer might pencil in the value of a constant in the margin next to a formula or might draw a rough graph to better visualize an equation. Each of these examples shows annotation used in a different way: to focus the student’s attention; to augment the cook’s memory; to actively engage with an engineering concept. Just as the uses vary, so does the long-term value of the annotation: the student studying for final exams might want his highlighting to disappear when he re-reads his textbook; the cook might want her annotation to become an indelible enhancement to her cookbook; and the engineer might want the notations to stay in place for the duration of a project, but not when she lends the book to a colleague.
In this chapter, I take annotation to mean the kind of informal markings readers make on a page while they are reading rather than using the word to refer to published (and unpublished) scholarly annotations as one would find in a critical edition of an important literary work.
These examples show different kinds of people annotating paper texts for a variety of reasons. That’s because annotation on paper is a seamless, flexible, and well-developed practice (if sometimes taboo). Annotations on electronic texts have generally been more problematic. On some reading platforms, annotation is clunky, interrupting reading as the reader pulls up menus, makes selections, and switches to a keyboard to type in text: the reader’s attention is refocused on the user interface rather than on the book’s content. Electronic annotation tools may also limit a reader’s expressive intent (e.g., forcing a highlight to be continuous when the reader wants to fragment it, or imposing neatness on a reader when she wants to scrawl). Sometimes the electronic annotations are stored in infelicitous ways, so that they are either gone when the reader returns to the eBook on a different computer or recoverable when the reader believes them to be deleted and loans the book or document to a colleague.
In fact, some early objections to electronic books are based on their immutability, because reading might not change them the way it changes print books. The French post-structuralist cultural theorist Baudrillard put it this way:
The compact disc. It doesn’t wear out, even if you use it. Terrifying. It’s as though you’d never used it. So it’s as though you didn’t exist. If things don’t get old anymore, then that’s because it’s you who are dead (Baudrillard 1996, pp. 32–33).
Indeed, the digital ideal is a book that doesn’t show wear or become shabby and dog-eared with use; functionality must be added so that the eBook’s pages reflect wear. On paper, annotation is one crucial record of a reader’s interaction, along with other telltale signs of previous engagement; there are compelling reasons to duplicate this effect in electronic books. If they had paid attention to the medium’s history, critics wouldn’t have harbored these fears about the sterility and immutability of electronic books; even in the earliest electronic books and hypertext systems, annotation was seen as a key form of interaction.
In an influential keynote that Brown University professor Andreas van Dam delivered to the ACM Hypertext conference in 1987, he said: The reason I encouraged such annotations [in FRESS] was that I remembered that when I was in college with Ted [Nelson], I would always grab the dirtiest copy of a book from the library, rather than the cleanest one, because the dirtiest ones had the most marginalia, which I found helpful (van Dam 1988).
There have been legal cases based on the ability of an adversary to recover a document’s deleted annotations. Needless to say, this vulnerability has been the cause for some concern to developers implementing electronic annotations.
For example, the read wear user interface (Hill et al. 1992) uses various intuitive visualizations to show the reader how often he or she has accessed a section or passage.
Indeed, there has long been considerable optimism about the power of annotations and the reader’s role in the intellectual ecology of the digital age. In this section, we’ll look at several different aspects of annotating electronic books with an eye toward how annotations are represented and stored and how they may form the basis of advanced functionality. In recent years, different communities have shown a great deal of research interest in annotations. Researchers have generally agreed that annotations can be a valuable artifact that reflects a reader’s engagement with and understanding of a text, an artifact that may persist beyond the immediate reading. There has been general recognition that reader annotations should not be tied to the particular eBook software that was used to produce them. Hence, it is important to develop uniform terminology for annotations and to represent them consistently.
3.1.1 Representing Annotations
Readers don’t always annotate. If you browse novels in a used bookstore, you would be surprised to find many annotations on the books’ pages. Usually wear is shown in more subtle ways: damage to the book’s spine, old bookmarks, or dog-ears (folded-down page corners). On the other hand, a novel used as a textbook might have notes in its margins. In fact, these notes might be a strong indicator that the novel was used as a textbook. This example is a harbinger of a more specific claim about readers’ markings in their books: Readers annotate in a way that depends crucially not only on the book’s genre but also on their reason for reading it (Figure 3.1). Annotations are thus as diverse as readers’ motivations for reading.
Let’s look at some annotations in print books (in this case, in textbooks since being a student offers ample reasons for marking in books). Figure 3.1a shows a link connecting two separate fragments of text; Figure 3.1b shows a page with extensive highlighting; Figure 3.1c shows asterisks in the margin (we might assume that the reader thought this passage was important); Figure 3.1d shows a longish bit of marginalia that refers to a sentence that the reader has underlined; and Figure 3.1e shows evidence of a coding system that a reader has developed to categorize his annotations. What is important here is the diversity of the marks. The annotators have found ways of writing on the print page that do not obscure the original text; they have managed to connect their notes with the words the notes refer to; and they have used personalized symbols if they thought it appropriate. In some less common cases, they have even found imaginative ways to code their markings.
Let’s look further at the characteristics of these marks: what makes an annotation tick? Some of them make an explicit comment on the text.
For example, Figure 3.1d shows a student’s interpretive note, “Eros is the force that binds things together and keeps the structure of things intact.”
FIGURE 3.1: Examples of five kinds of annotations that demonstrate a broad range of reader intentions. (a) Linking. (b) Highlighting. (c) Margin marks. (d) Extended note anchored in the source text. (e) Coding system.
We don’t know whether the student wrote that note in class, echoing the professor’s lecture, or whether the note represents the student’s own reflection on the material. But that doesn’t matter for what we are trying to do here. What we do know is that there is an explicit comment (with an apparent meaning) and it pertains to a well-delimited region of the source text. We can also find annotations that are less explicit. Figure 3.1b shows a highlighted region. We don’t know what the student has to say about this text, only that the student took particular note of this passage while he or she was reading. The passage might have been important, or it might have been just difficult to understand. In other words, the body of the annotation is implicit or telegraphic; we have to assume that the meaning of the highlight will be apparent to the reader when the marks are re-encountered.
TABLE 3.1: A space of annotation forms.
Characteristics   Explicit anchor                                              Implicit anchor
Explicit body     Example: An interlinear translation of an underlined word.   Example: A note in the margin without any corresponding marks in the text.
Implicit body     Example: An underlined sentence.                             Example: An asterisk in the margin that does not explicitly specify extent.
It is easy to imagine an example that is the other way around: the reader has written a longish note, but we can only guess what portion of the text that the note refers to. In other words, the anchor of the annotation is implicit. Finally, there are marks like the ones shown in Figure 3.1c that are vague in both how much text they refer to and why they have been added in the first place. Table 3.1 summarizes these variations and provides an example of each. Table 3.1 does more than lay out a space of annotation forms: it also suggests an anatomy for annotations and a universal way to represent them. Work on integrating annotations into digital library systems (and other user interfaces that incorporate annotation functionality) has thus found it convenient to adopt a consistent terminology for parts of an annotation (e.g., see Agosti et al. 2004). A consistent terminology and representation of annotations makes it possible to develop interoperable systems and mechanisms for storing, reading, and rendering annotations in electronic books and digital libraries.
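The two dimensions of Table 3.1 can be written down as a small lookup, which is one way to see why a consistent terminology helps software as much as it helps discussion. The sketch below is only an illustration: the names `ANNOTATION_FORMS` and `describe_form` are hypothetical, and the example strings are paraphrased from the table.

```python
# Table 3.1 as a lookup keyed by (body is explicit, anchor is explicit).
# The names here are illustrative assumptions, not part of any eBook API.
ANNOTATION_FORMS = {
    (True, True): "an interlinear translation of an underlined word",
    (True, False): "a note in the margin without any corresponding marks in the text",
    (False, True): "an underlined sentence",
    (False, False): "an asterisk in the margin that does not explicitly specify extent",
}


def describe_form(explicit_body: bool, explicit_anchor: bool) -> str:
    """Name the cell of Table 3.1 that an annotation falls into."""
    body = "explicit" if explicit_body else "implicit"
    anchor = "explicit" if explicit_anchor else "implicit"
    example = ANNOTATION_FORMS[(explicit_body, explicit_anchor)]
    return f"{body} body, {anchor} anchor (e.g., {example})"


# A bare highlight has no added content but a well-delimited extent.
print(describe_form(explicit_body=False, explicit_anchor=True))
```

Two booleans are enough here only because the table is a two-by-two; a real system would likely need finer distinctions.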
3.1.2 Anatomy of an Annotation
From our table and the previous discussion, we can see that there are at least three elements to consider when we look at annotations: a body, an anchor, and a marker. The body of the annotation is any content that a reader has added to the work (e.g., a note in the margin or an asterisk). Many of the annotations that readers add to books are highlights or underlines with no explicit body. Any representation of an annotation must handle this common case of a null-content annotation. The second element of an annotation is an anchor, which delimits the scope of an annotation, even if the scope is implicit. Anchors are the heart of an annotation, because they tell us how any added content is related to the source material, the portion of the eBook that the annotation actually refers to. Anchors may not only be implicit; they may also be broad in scope: a review or
rating might refer to the entire document rather than to a specific passage. Similarly, an annotation’s scope may be narrow: an interlinear translation may refer to a particular word and proofreading annotations may even be anchored to one or two letters (e.g., to indicate where a letter should be inserted). The third part of an annotation is a marker ; a marker tells us how the anchor should be rendered when it is displayed. In other words, we don’t just need to know what the annotation refers to in the source material; we also need to know how to display it. Let’s disentangle these abstractions using an example (Figure 3.1d) to see what these terms mean in practice. It looks like there are three distinct annotations on this paper page. The first one’s body is the marginalia “Eros . . . keeps the structure of things intact,” a comment that appears in a particular position in the margin. Its anchor (the portion of the anchor that is visible) spans two lines of source text, starting with the word “it” and ending with the word “the.” Finally, its marker tells us the anchor should be displayed as a black underline. The second annotation, the yellow highlight starting at “something” and ending at “change,” has a null body (i.e., there is no added content); its anchor spans the entire paragraph of source, and the marker is a yellow highlight. Finally, there’s a third annotation, an asterisk at a specified location, which forms the annotation’s body. It has no explicit anchor, but depending on what we’re going to do with the annotations later on, we might want to assume it is anchored to the same full paragraph as the second one. The null marker would tell us that the anchor is not displayed. In most implementations, annotations also have an author (i.e., it is important to record who created the annotation) and a time stamp that records when the annotation was created. 
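As a rough illustration of this anatomy, the three annotations read off Figure 3.1d can be written as records with a body, an anchor, a marker, an author, and a time stamp. This is a minimal sketch under stated assumptions: the class name `Annotation`, its field names, the free-text anchor descriptions, and the placeholder author and time stamp are all hypothetical, since the paper example records none of these details.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Annotation:
    body: Optional[str]    # content the reader added; None for a null-content annotation
    anchor: Optional[str]  # the source span referred to (free text here); None if implicit
    marker: Optional[str]  # how the anchor is drawn; None if the anchor is not displayed
    author: str            # who made the annotation
    created: datetime      # when it was made

    def is_anchor_only(self) -> bool:
        """True for a bare underline or highlight: an anchor with no added content."""
        return self.body is None and self.anchor is not None


# Placeholder author and time stamp; neither is known for the paper example.
WHO, WHEN = "unknown reader", datetime(2010, 1, 1)

figure_3_1d: List[Annotation] = [
    Annotation(
        body="Eros is the force that binds things together and keeps the structure of things intact.",
        anchor='two lines of source text, from "it" to "the"',
        marker="black underline", author=WHO, created=WHEN),
    Annotation(
        body=None,
        anchor="the full paragraph of source text",
        marker="yellow highlight", author=WHO, created=WHEN),
    Annotation(
        body="*",
        anchor=None,
        marker=None, author=WHO, created=WHEN),
]

# Collect the markers of annotations that carry no added content.
anchor_only = [a.marker for a in figure_3_1d if a.is_anchor_only()]
print(anchor_only)
```

Making the body, anchor, and marker optional is the design choice that matters: it lets one record type cover a bare highlight, a free-floating asterisk, and an anchored note.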
The anatomy of an annotation may be arbitrarily complex, but these basic elements capture essential qualities of many different types of annotations created in a broad range of circumstances. While none of these interpretations of our on-paper examples are absolute, it is important to have a set of terms that will enable us to discuss annotations in a relatively consistent way, independent of how they are realized in software implementations on various platforms; a set of common terms makes it reasonably easy to compare different ways of anchoring annotations and displaying their markers. How common are the different types of annotations? The answer obviously depends on the material (its genre, level of complexity, and its physical form) coupled with the reader’s purpose (no matter how inviting a page is, a reader who is relaxing with a novel on the beach will make fewer marks—probably no marks beyond a drip of ice cream—than the same reader reading the same novel for a graduate course on Twentieth Century American Literature) and the particular
Specifying the position of an annotation’s body is tricky and highly variable; this portion of the representation is left as an exercise to the creative developer.
circumstances (with any luck, a library book will have fewer annotations than a used paperback). But the data we collected for the annotations study reported by Marshall and Brush (2004) gives us a representative idea of the relative frequencies of different types of annotations. Table 3.2 summarizes the data. These data confirm qualitative observations I have made in other studies: by far, the most common kinds of annotations are anchor-only annotations such as underlines and highlights. Specifying anchors is a difficult and important aspect of handling annotations programmatically. Anchor specification relies crucially on the type of content being annotated and the ultimate functionality that a developer expects to offer. Is the content changing, as it would be in text editor? Is content moving in the electronic document relative to the page, but staying relatively static, as would be the case in a blog where new entries are being added, pushing the older content further down the page? Are we implementing annotation functionality for published eBooks? Will the page layout change (say, when larger fonts are used), or are page images fixed, as they would be in a PDF format document? Are there unique identifiers for book elements (such as chapters or paragraphs)? It is easy to see that specifying anchor positions will take some creativity and insight into the type of content and applications we intend to support. First, let’s look at most straightforward situation, a published electronic book stored as a PDF document. Because we’re talking about a published book, we know that the underlying content is not apt to change and the page layout has been pre-calculated when the PDF was produced. This stability allows us to anchor annotations geometrically, as an overlay to the electronic equivalent of a printed page. Depending on what we’re planning to do with the annotations, we might not even
TABLE 3.2: Typical relative frequencies of annotation types (data from Marshall & Brush 2004).
Annotation Type | Total Number of Annotations | Relative Frequency (%)
Anchor-only (including underlines, highlights, and combinations) | 1,276 | 83.1
Body only (including marginalia and symbols) | 105 | 6.8
Compound (anchor+body) | 138 | 9.0
Other (e.g., doodles) | 16 | 1.1
Total | 1,535 | 100
need to know what the content is that we’re anchoring to, just where it is in the document. Thus, in our example, we might store the anchor as an x, y position on page z. But perhaps there are accessibility concerns or we’re rendering the eBook on a mobile device. We might want to increase the size of the text to accommodate visual impairment. Or we might want to reduce the text size so more can fit on a small screen. In this case, we might change the anchor specification to consist of a starting point (a particular character in the stream, beginning at a uniquely identified book element) and an extent or span (the number of characters beyond the initial position that the annotation refers to). For example, we might specify that an annotation is anchored in an eBook’s Introduction (where the Introduction is a structural element of the eBook file), starts 234 characters into the chapter, and extends 47 characters beyond that. Our two examples thus far are simple representations for the annotation’s anchor and may be sufficiently robust for our needs. But wait! We might want to specify annotations in such a way that the underlying text can change (as it would in a document under review and revision by a standards body), and the annotations on the text will still stay in sensible places. In this case, we may want to pick certain keywords from the underlying text and use them to represent the anchor (Brush et al. 2001). Thus, for our previous example (see Figure 2.1d), this might mean that the first annotation is anchored to the sentence containing absolute, beautiful, and things. Yet another method of specifying an anchor allows us to handle annotations on text that moves but does not change (say a blog entry that gradually moves down the page as new entries are added). 
In this case, we’ll want to uniquely identify text in a way that is position-independent, e.g., as a fingerprint, or some other method that allows us to calculate and maintain a compact representation of the underlying content so that the annotation can be anchored to specific content, rather than a specific document (Hong et al. 2008).
There are further complexities that, for the sake of brevity, we will not pursue here. For example, spatially proximate annotations may be combined to form a single unit, a compound annotation. In other cases, we might want to split an annotation into two or more parts because we want to handle each part differently. We may also want to handle standard metadata as a special type of annotation that refers to the entire document. It is easy to see why the representation of annotations is an important problem for us to tackle if we are going to develop interactive electronic books.
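The anchoring schemes described above can be sketched as small data structures. This is only an illustration under stated assumptions: the class and function names (GeometricAnchor, SpanAnchor, KeywordAnchor, resolve_span, resolve_keywords, fingerprint) are hypothetical, the keyword matcher is a naive stand-in for the robust anchoring of Brush et al. (2001), and the hash-based fingerprint is one simple possibility rather than the method of Hong et al. (2008).

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GeometricAnchor:
    """Fixed-layout anchor: a point (x, y) on page z, e.g., in a PDF."""
    page: int
    x: float
    y: float


@dataclass(frozen=True)
class SpanAnchor:
    """Reflowable anchor: a start offset and extent within a named element."""
    element_id: str
    offset: int
    length: int


@dataclass(frozen=True)
class KeywordAnchor:
    """Content-based anchor: keywords picked from the annotated sentence."""
    keywords: tuple


def resolve_span(anchor, elements):
    """Return the text a SpanAnchor refers to, given {element_id: text}."""
    text = elements[anchor.element_id]
    return text[anchor.offset:anchor.offset + anchor.length]


def resolve_keywords(anchor, text):
    """Return the first sentence containing every keyword, or None."""
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = set(re.findall(r"\w+", sentence.lower()))
        if all(k.lower() in words for k in anchor.keywords):
            return sentence
    return None


def fingerprint(passage):
    """Position-independent id: a hash of case- and whitespace-normalized text."""
    normalized = " ".join(passage.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]
```

The trade-off mirrors the text: the geometric anchor is the cheapest but breaks when the layout changes, the span anchor survives reflow but not edits, and the keyword anchor tolerates small edits at the cost of a search at display time. The fingerprint stays the same wherever the passage moves, as long as its words do not change.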
3.1.3 Linking
What is a link? We can approach this question philosophically to reach a powerful and very general formal definition (in fact, more than 15 years ago, the hypertext community did just that; see Halasz & Schwarz 1994), but we will approach links in this section in a more operational way. From an eBook perspective, links connect nonadjacent spans of text, possibly in different books.
Links actually have two functions. One is to specify a relationship: “Paragraph 13 is related to Paragraph 25 even though they are not adjacent.” Another is to support nonlinear navigation: “By clicking on Paragraph 13’s anchor (via its marker), you will traverse to Paragraph 25.” We will focus primarily on the first (representational) function in this section, since we are in the middle of discussing annotations, but the reader should keep the second function in mind when we go on to talk about navigation.
If you are familiar with HTML, it should be clear that annotations share terminology with hypertext links. In HTML, a set of tags specifies a link’s anchor. Anchor tags enclose a span of text (or another object) that acts as the source end of a link. In fact, we can think of an annotation as being a particular case of a link, one in which the link destination is either a new entity (when there is a note or marginalia) or null (when the anchor and marker combine to create a simple highlight, underline, or some other anchor-only form).
Most links in electronic books are span-to-page links, because that’s how they’ve been implemented. That is, link traversal takes you to the page that’s on the other end of the link; no span is usually indicated on the destination end of the link. From a representational point of view, it is easy to see how eBook links could just as well be span-to-span links; the user interface and underlying data structures would only need to provide the reader or author with the opportunity to specify this span when the link is created. Alas, this is an awkward dialog and it is difficult to indicate the destination span during link inspection, traversal, or visualization. Thus, many developers have felt that it is simply better to specify links as span-to-page. This representation is entirely consistent with our notion of annotations. Should readers be able to create links beyond the implicit links they are creating when they annotate?
Most eBook products have answered “no” to this question; they may feel that it is too complicated or confusing for readers and that it is unnecessary to offer such a capability. However, in the original pre-Web vision of hypertext, link-making was felt to be an important advance offered by electronic books, because it helped readers become writers. As it stands, all links are defined during writing and publication, often as part of the markup process. It remains to be seen whether reader linking will ever become an extension of annotations.
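The idea that an annotation is a particular case of a link can be made concrete with a minimal sketch. The types below (Span, Page, Note, Link) and the describe function are hypothetical names chosen for illustration; no eBook format is claimed to represent links this way.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Span:
    """A run of characters inside a book: the source end of a link."""
    book_id: str
    offset: int
    length: int


@dataclass(frozen=True)
class Page:
    """A whole page, the usual destination of an eBook link."""
    book_id: str
    page: int


@dataclass(frozen=True)
class Note:
    """A new entity created by the reader (a note or marginalia)."""
    text: str


@dataclass(frozen=True)
class Link:
    """A link from an anchored span to a destination, or to nothing at all."""
    anchor: Span
    marker: str  # how the anchor is shown, e.g., "underline"
    destination: Optional[Union[Span, Page, Note]] = None


def describe(link):
    """Classify a link according to what sits at its destination end."""
    dest = link.destination
    if dest is None:
        return "anchor-only annotation"
    if isinstance(dest, Note):
        return "annotation with a body"
    if isinstance(dest, Page):
        return "span-to-page link"
    return "span-to-span link"
```

Under this sketch, a highlight is simply a Link whose destination is None, and moving from span-to-page to span-to-span links changes only the type of the destination. The hard part, as noted above, is the user interface for choosing and displaying that destination span.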
3.1.4 Functions of Annotation
The function of annotations is tied very closely to the underlying motivation for reading in the first place. A high school student who writes “man v. nature” on the first page of Moby Dick because
Web links may be “weak” span-to-span links, since HTML allows the author to designate anchors that may act as link destinations by using the NAME property.
his English teacher has explained the work’s motifs in class is doing something very different than an editor who puts an exclamation point in a manuscript’s margin to remind her to talk to the author about something she’d like changed before the work is published. A copy editor’s proofreading caret stands in further contrast to either of these annotation examples. In the end, it is clear that annotations fill many roles. Table 3.3 shows some examples that contrast an annotation’s form with its function.
Why do annotations’ varied functions matter when we’re discussing electronic books? Isn’t it enough to say that the ability to annotate is a welcome capability for eBooks, regardless of their genre or the reader’s purpose? Even if lawyers make different kinds of markings than students (or even law students) and use these markings in different ways, doesn’t it just underscore the importance of a generalized annotation feature for any reading platform? The simple answer to these questions is that understanding the function of annotations may be enormously consequential when the time comes to design any advanced eBook capabilities. The annotations’ intended uses have far-reaching implications for design, from storage to data structures to advanced functionality to user interface.
Let’s look at a simple example: how should we represent and store the annotations that a reader has added to an eBook? Our understanding of the annotations’ function assumes a crucial role in how we formulate an answer to this question. Let’s say we’ve been looking at a group of users who go wild with highlighting pens (“happy highlighters,” one of my colleagues has called them), and the pages of their books look something like the page shown in Figure 3.1b. Conversations with these users reveal that the only function of these highlights is immediate: the readers trace along the difficult text as they read to focus their attention. These annotations don’t have significant long-term value.
Nonetheless, we’d want their creation to be seamless (i.e., we don’t want to interrupt the reader). But we’d also want the marks to be easily removed; the only reason they should stay with the book is if the reader picked up the book on another device to continue reading (or perhaps the reader wants a way of knowing that she’d read the book at some time in the past). In this case, perhaps we’d just store the annotations in the same file with the book and give the reader the ability to quickly remove them. They might be anchored and marked in a similarly straightforward way, say as the triple (offset, span length, marker type), where offset specifies where the highlight starts, span length specifies how many characters beyond the offset have been highlighted, and marker type specifies which pen the reader has used, say the yellow highlighter.
This example is perhaps over-simple, since it glosses over many of the nuances involved in properly designing storage; take it with a grain of salt.
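The (offset, span length, marker type) triple for the happy highlighters can be sketched as follows. The sketch assumes, purely for illustration, that a book is a dictionary with a "text" entry and that annotations are kept in the same record under an "annotations" key; the function names are hypothetical.

```python
import json
from typing import NamedTuple


class Highlight(NamedTuple):
    offset: int       # where the highlight starts, in characters
    span_length: int  # how many characters beyond the offset are marked
    marker_type: str  # which pen was used, e.g., "yellow-highlighter"


def add_highlight(book, offset, span_length, marker_type):
    """Append a highlight to the annotations kept in the book record itself."""
    book.setdefault("annotations", []).append(
        list(Highlight(offset, span_length, marker_type)))


def highlighted_text(book):
    """Return the passage covered by each stored highlight."""
    return [book["text"][o:o + n] for o, n, _ in book.get("annotations", [])]


def clear_highlights(book):
    """Remove every highlight in one step, leaving the text untouched."""
    book["annotations"] = []


def save(book):
    """Serialize the book and its annotations together as one JSON string."""
    return json.dumps(book)
```

Because the marks travel inside the same record as the text, they follow the book to another device, and because they are a flat list, removing them all is a single cheap operation. Both properties match the short-lived role these highlights play.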
Table 3.3: Different functions of annotations.
Form Example | Example in Field Materials | Apparent Function
Underlined structural elements (e.g., headings), asterisks in the margins | [not reproduced] | Signal for future attention
Highlighted phrases, circled words, other several-word anchors | [not reproduced] | Aid memory
Notation in margins (near figures or equations) | [not reproduced] | Help work a problem
Marginalia, longer notes on frontispiece or above chapter head | [not reproduced] | Interpretation
Extended highlighting or underlining | [not reproduced] | Focus attention through difficult narrative
Notes, doodles, drawings, other markings unrelated to the substance of the text | [not reproduced] | Reflection of the material circumstance of reading
Now, let’s say we’re not talking about a published book, but rather about a standards document that’s under evaluation by an international standards body. Everyone on the standards committee has his or her own copy of the current version of the document, but the annotations that they make on these documents are public, are created on an ongoing basis, and need to be readily shared. A robust anchoring scheme would enable the annotations to survive across versions; storing the annotations in a database on a centralized server would allow them to be shared. This strategy is very similar to the one described in Cadiz, Gupta, and Grudin (2000). Such a design enables the eBook software to overlay the current set of annotations on the document when the committee member opens it; there may even be a polling scheme by which the software looks for new annotations at set intervals. Thus, in our second case, the annotations are not stored with the document; rather they are stored in a centralized database. Furthermore, they are not represented as a span and an offset, but instead in a more robust, content-based representation that allows them to survive minor changes in the text. It’s easy to come up with new annotation scenarios, each of which suggests slightly different requirements; the main message is simply that the annotations’ function matters, both in terms of how we design the immediate feature and in terms of the enhanced capabilities we add to the eBook system later on.
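The standards-committee scenario can be sketched as a central store that clients poll. This is a minimal in-memory illustration, not the system described by Cadiz, Gupta, and Grudin (2000). The class names, the sequence-number polling scheme, and the use of keywords as the content-based anchor are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SharedAnnotation:
    seq: int         # sequence number assigned by the server
    author: str
    keywords: tuple  # content-based anchor meant to survive small edits
    note: str


@dataclass
class AnnotationServer:
    """Stand-in for a centralized annotation database."""
    items: list = field(default_factory=list)

    def post(self, author, keywords, note):
        """Store a new public annotation and return it."""
        ann = SharedAnnotation(len(self.items) + 1, author, tuple(keywords), note)
        self.items.append(ann)
        return ann

    def since(self, last_seq):
        """Return annotations newer than the caller's last-seen sequence number."""
        return [a for a in self.items if a.seq > last_seq]


@dataclass
class ReaderClient:
    """A committee member's copy of the document plus a view of shared notes."""
    server: AnnotationServer
    last_seq: int = 0
    overlay: list = field(default_factory=list)

    def poll(self):
        """Fetch new annotations; a real client would call this at set intervals."""
        fresh = self.server.since(self.last_seq)
        if fresh:
            self.overlay.extend(fresh)
            self.last_seq = fresh[-1].seq
        return fresh
```

Each client keeps only its own document copy and a cursor into the shared store, so annotations never need to be written into the document file. Overlaying them on the current version of the text is then a matter of resolving each content-based anchor when the document is opened.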
3.1.5 Status and Value of Annotations
Just how important are annotations? If we reflect on reading in its many quotidian manifestations (scanning the newspaper while we’re riding the bus, reading a novel on the beach, or skimming a research paper quickly for the sake of citing it appropriately), we quickly realize that the vast majority of reading situations do not involve annotation. Much of the time that we read, we neither do it with a pen in hand, nor do we need to.
That said, there are many types of active reading in which the ability to annotate the material is important, if not vital. In these cases, if annotation capabilities aren’t offered by the eBook, then the reader usually prints the material on paper just for the ease with which paper documents can be marked on (O’Hara & Sellen 1997). Even the most environmentally conscientious reader turns to the printer when asked to review a journal submission, to proofread a document, or to refer to the document in situations where a laptop (or even a portable reading device) would be awkward.
It’s easy to see what becomes of the marks on print books. Whether the reader likes it or not, if the marks have been written in pen, they will become part of her library. If the reader is a well-known public figure, these marks may even have intrinsic value as a window onto the reader’s thoughts. Even if the reader is not a cultural icon, if she thinks the marks are private, she’ll have
to go to some effort to get rid of them: how many of us have erased lightly penciled notes from a margin before we let someone borrow one of our books or before we pass an article we have read to one of our colleagues?
What then becomes of annotations on electronic books? Their disposition has become a choice, not an absolute. Are they indelible, as they may be on paper? Are they propagated to other copies of the book? Are they shared when the book is shared? Can they be made public as part of a rating scheme? Just how public is public? Suddenly, something as trivial and quotidian as a reader’s annotations can become a burden on the reader and a puzzle to the developer.
First, let’s continue to be clear about what we mean by the word annotation. All through this chapter, we have taken annotations to mean “the unselfconscious marks people make when they are reading” and NOT the published scholarly commentary that might look the same as those unselfconscious marks. This distinction is important. Even when people casually share their personal annotations, if they want others to make sense of what they’ve produced, they will need to treat annotation as an explicit authoring activity; they will have to go out of their way to make their annotations intelligible to others (Marshall & Brush 2004). Chapter 4 plunges more deeply into how the artifacts of reading may be shared with others.
What of anonymous marks that a reader might happen upon inadvertently, say, by buying a used textbook? Any assessment of value in this case is likely to be tied to the reader’s perception of who the mystery annotator was: if the annotations are perceived to have been made by an expert, the reader may very well be influenced by them, but if they were made by a peer or someone who is new to an area, they are less well regarded and may even be annoying. As such, the reader looks for clues to indicate the annotator’s role.
In a series of experiments to determine the influence annotations have on future readers, Wolfe (2000) discovered that annotations do affect the persuasive qualities of a text and that the students are most positively influenced if they think the annotator is a disciplinary expert. What about a person’s own annotations? It is important to consider whether they were made last week, last year, or another time in the reader’s life, and under what circumstances (i.e., why was the reader reading the book in the first place and why was she marking in it?). Unsurprisingly, students in the Pocket PC study ventured that the annotations they had made in class were more valuable and more authoritative than the annotations they had made when they were reading on their own (Marshall & Ruotolo 2002). Old annotations—even annotations that had seemed valuable at creation time—did not weather the passage of time well. One student, a graduate student in English Literature, talked about the annotations that she had recently encountered in the texts she had used as an undergraduate: Some of them [the annotations] are absolutely ridiculous and I can’t believe that I actually wrote this in pen in this book. Some of them are—I have no idea what I’m talking about.
Some of them are really interesting, and it’s something I’d forgotten. It just depends on the notes. . . . When I did Milton, we were doing the epithets about Satan or something, so I underlined all of them. And when I was going back through it, I’m like “what on earth!” (Marshall & Ruotolo 2002) Note that she hasn’t roundly condemned all of her old notes—she admits “some of them are really interesting, and it’s something I’d forgotten”—but she does have misgivings about the value of her old annotations as a whole.
3.2 NAVIGATION
Navigation in electronic books has long been a topic of inquiry. Is full hypertext linking an efficient means of navigating through electronic books? Or are links disruptive? Is page turning better than scrolling or is it the other way around? Are book-like navigational structures such as the table of contents or an authored index sufficient to support reading on the screen? Because there is no way of avoiding navigation (short of publishing material that is sufficiently brief that it fits on a single screen so that readers never need to go anywhere), it is important to give navigation its due. Just how vital is navigation? Are readers mostly navigating while they are reading linearly (in which case, navigation is straightforward and occupies a relatively minor proportion of the readers’ overall attention) or are they jumping around in the publication—glancing at a paragraph here, skimming another one there, and reading deeply less frequently—a situation in which navigation plays a relatively more prominent role? Genre, activity, and context must all be considered to answer this question. Suppose the reader is holding a Kindle-like device in one hand, reading the newspaper while he is standing up on the subway; in this case, not only is navigation important, it’s also important to be able to carry it out one-handed. Or suppose our reader is lost in a novel; it would be best for navigation to simply disappear. In a study that focused specifically on navigation, we found that while our magazine-reading participants spent most of their time actually reading, they also glanced at and scanned far more pages than they read; when they were not reading carefully, it was important for them to be able to move along quickly (Marshall & Bly 2005b). In other words, navigation is intertwined with the act of reading; at best, it is both essential and invisible. 
Consider what Barthes said about the rhythm of reading in The Pleasure of the Text: We do not read everything with the same intensity . . . a rhythm is established, casual, unconcerned with the integrity of the text; our very avidity for knowledge impels us to skim or to skip certain passages (anticipated as “boring”) in order to get more quickly to the warmer parts of the anecdote . . . we boldly skip (no one is watching) descriptions, explanations,
analyses, conversations; doing so, we resemble a spectator in a nightclub who climbs up onto the stage and speeds up the dancer’s striptease, tearing off her clothing, but in the same order, that is: on the one hand respecting and on the other hastening the episodes of the ritual (like a priest gulping down his Mass). . . And yet, it is the very rhythm of what is read and what is not read that creates the pleasure of great narratives: has anyone ever read Proust, Balzac, War and Peace, word for word? (Proust’s good fortune: from one reading to the next, we never skip the same passages.) (Barthes 1975) Many researchers’ questions about navigation have probed whether the behavior of digital media should be modeled after that of print forms. What is gained and what is lost by adopting the familiar tropes of pages and page turning? Are the standard navigational shortcuts like indices and tables of content, coupled with basic keyword search capabilities, enough to get the reader to her destination? Are the constraints and metaphors inherited from print media powerful or limiting? Research results answer these questions with a resounding “it depends!” As with other electronic book functionality, we need to ask ourselves what the reader is doing. Is the reader engrossed in a novel on an airplane, researching legal cases at her desk, or skimming a magazine in the laundromat, looking for the next story to read? If we assume that the reader is “lost in a book,” we wouldn’t need much convincing that page turning is the essential mode of navigation; indeed, there is a reason such books are referred to as “page turners.” But if a reader is researching a legal case, she will be looking first for a string to pull, a place to begin her research; once that key case is located, link-following—jumping from case to case, forward- and backward-chaining from a precedent to newer decisions—is a primary way that the reader goes through the materials. 
By contrast, skimming a magazine may be very different from either of the other situations; it may involve thumbing through many pages very quickly, glancing and moving on, progressing at varying speeds according to expected interest. Navigation is so fundamental to the success of electronic books that it is worthwhile to give it focused attention. It may be addressed by clever hardware solutions, by a variety of user interface techniques, or by using a combination of the two. Navigation hardware or software can easily become the sticking point of an eBook product. For example, one reviewer criticized Amazon’s Kindle saying, “The placement of the Kindle’s buttons on the side means that the first time you hand it to someone, they almost always turn a page accidentally.” Microsoft’s Reader (which is implemented as a software application that runs on a variety of mobile hardware) was criticized for the feedback it gave readers about where they were in the book. What must this navigational hardware or software support? At the very least, the reader must be able to move from one point of interest to the next, whether it is the next page, the next chapter,
the next story, or some other narrative element. But that’s not all: the reader must also be able to answer questions that allow her to feel oriented. How long is this book? Where am I in it? What else is in this book? How much have I read so far? How much is left to read? Thus, we can think of navigation as having two different components: the ability to move around in the material, and the ability to stay oriented. Each of these components is more complex than it looks at first blush.
3.2.1 Three Navigation Scenarios
As we explore different ways of moving through a book (or other sort of document), let’s keep three scenarios in mind. These are three ways participants in a recent study were observed (and described themselves) reading the New Yorker, a magazine that contains some long articles as well as some shorter features, cartoons, and advertising. The scenarios are taken from a paper that describes a study I performed with my colleague Sara Bly (Marshall & Bly 2005b).
Our first reader, Jay, is a planner. He either looks through the table of contents for familiar authors or interesting topics or leafs quickly through the magazine before he starts reading in earnest. Because Jay is a movie buff, he often starts by reading the Current Cinema, a feature he locates by its position in the magazine (a few pages from the end). Subsequent navigation is guided by the table of contents; he dog-ears that page so he can get back to it easily after he finishes an article. Jay reads the cartoons serendipitously as he encounters them, but he doesn’t go looking for them. He doesn’t read the whole magazine; however, he does do one final flip through the pages to determine whether he’s “done” with an issue. Jay often reads himself to sleep, nodding off as he concentrates on the long articles.
Our second reader, Gene, told us that he progresses straight through the magazine (in a manner similar to Barthes’ rhythms of reading), from cover to cover, reading, skimming, or glancing as interest and circumstance dictate. Although he takes in the table of contents as part of his linear perusal of the magazine, he referred to it as a preview, and said it doesn’t necessarily change his course. He interrupts himself as he reads a longer article to look at the cartoons; they are his first priority.
Gene is conscious of the rhythms of his reading, describing the different types of skimming and glancing that he does (e.g., when he looks for names in an article that is otherwise not of interest) in addition to more focused reading. Some navigational nuances are apparent as Gene talks through how he selects what to read: As he reads the Table of Contents, Gene refers to a profile of sculptor Maurizio Cattelan: “I don’t care about ‘Profiles’ probably. . . . I’ll just look at the photographs by Richard Avedon, because they’re nice photographs. When I get there.”
The names in these scenarios have been changed to protect the participants’ privacy.
When he reaches the start of the article, he re-affirms his earlier stance, “I’ve never heard of this artist. And the title is ‘The Prankster.’ And this face is completely distorted. This is not a subject that would attract me.” But when he finally reaches the Avedon photos in the article, he tells us: “This [the photos] got my attention. . . . So now, by virtue of [the photos], I might read the first paragraph. Otherwise I wouldn’t.” He backtracks and begins to read the long article. (Marshall & Bly 2005b) There are several things of note in this interview segment. The table of contents is used as an orientation mechanism, rather than being directly involved in navigational movement. Article metadata and elements like photos are another important aspect of Gene’s navigation—they confirm that he’s where he thinks he is. Serendipitous encounter with article content (in this case, a photo) causes Gene to recant his decision to skip the article; thus, he needs to find its logical beginning so he can read it. Note that a standard hypertextual jump would have had entirely different results: Gene never would have encountered the photo and thus would never have read the article. Skipping an article by turning its pages quickly has a different overall effect than skipping to the next article using a link. Finally, our third reader, Constance, flips randomly when she gets the magazine, looking for something to catch her eye. But her flipping is not entirely random; the magazine has a predictable structure, which Constance uses so she can be certain to check the poetry and other features she likes. She too uses the table of contents as a preview, not as a navigation mechanism. Constance actively seeks ads for new books and is easily distracted as she reads. Like the others, she relies on the magazine’s predictable structure for navigation and to maintain her sense of orientation. 
Naturally, these three scenarios not only reflect the type of reading and the genre of the publication; they also illustrate some important characteristics of navigation, not the least of which are individual differences. Neither the material (a literary magazine) nor the activity (reading for entertainment or information) predicts how individuals will approach navigation. Close observation of these three readers (via videotaped sessions where and when the individuals normally read) revealed some other general patterns: the overwhelming majority of navigation acts were forward page turns; when people read print publications, they use a variety of lightweight navigation techniques to manage their attention; and readers rely on serendipitous encounter (via flips and incidental reading) to expose them to material they might otherwise skip if they relied on metadata (such as titles and authors) alone.
What do we mean by lightweight navigation techniques? These are subtle interactions with the material to support the nonlinearity of reading: narrowing or broadening the focus by manipulating the physical magazine (folding the page or opening the magazine to look at both pages at once); letting one’s eyes stray to a page element out of the textual flow; looking ahead in the text to preview or anticipate; and looking back to re-read for context (e.g., to find the place in an article where a person was introduced or a word was defined). It is not surprising that people have developed navigation techniques to manage their attention; if we look closely at annotation, many unselfconscious (and unremembered) underlines and highlights are for that same purpose, focusing one’s attention on specific passages.
It is instructive to look carefully at a sequence of actions when someone reads a print article. Figure 3.2 breaks down Constance navigating through an apparently linear story frame by frame.
FIGURE 3.2: Detail of navigation while reading a print magazine. (a) Constance reads the initial page of a book review. (b) She finishes the first page and reaches for the binding in preparation to turn the page. Note that she can still see what she is reading. (c) She turns over the magazine and encounters a full-page graphic. She glances at the graphic. (d) Then she reaches to turn to the continuation of the review’s text. (e) But she lingers over the entertaining caricature before she completes the page turn. (f) Constance opens to the next two pages of the review’s content, which gives her a glimpse of what’s to come and quickly informs her that there is much more to the review. (g) She folds the left page (the next page in the article) under. (h) She then flips the magazine to continue reading. But first, she glances at the advertising column, as is her habit, to see if there are ads for new fiction. (i) Then she successfully completes the flip of the magazine so the left page (next to read) is on top. (j) She changes the position of her hands so she’s holding the magazine comfortably, ready to continue reading the second page of the review. Readers frequently move their hands to their faces or hair while they read.
How can this detailed look at reading on paper inform navigation facilities for an electronic book? First of all, we can see that the reader performed a series of useful lightweight navigational acts without extra thought or effort. It is also evident that the reader is always looking at something meaningful when she is reading (which is not always true of electronic book interfaces). But it is more provocative to realize that this typical series of maneuvers took time—6 seconds to be specific—and that there is a certain awkwardness to them: there’s a need to smooth and refold pages and a very real potential for accidentally losing one’s place when the magazine’s pages stick together. Having taken this close look at real navigation as part of reading, let’s return to the problem of moving around and orienting oneself in an electronic book.
3.2.2 Moving
Navigation in electronic books can be book-like; it can also take advantage of the fact that we’re reading on a computer. Book-like navigation sometimes means that we are going to rely on the physicality of the book to suggest digital functionality. In other words, we may keep the page turning metaphor. Indeed, user interfaces based on page turning may regard the physical act of turning the page as so central to the experience of reading that they simulate the physics of the paper page (Chu et al. 2004). In such interfaces, the speed and fluidity of page replacement trump other possible functionality.
Since their inception, electronic books have raised the pagination controversy: is it better to turn pages than it is to scroll? In 1988, Frank Halasz dubbed this controversy “the card sharks versus the holy scrollers” (Halasz 1991). He was referring to the argument between proponents of HyperCard, an early Macintosh-based delivery vehicle for digital content that used a metaphor of screen-sized cards, and systems like Bell Labs’ SuperBook or Jef Raskin’s Canon Cat, which relied on searching, scrolling, and other means of navigation through long unpaginated documents. Although many eBook products have implicitly taken sides in this pitched battle, there is little evidence that one technique is reliably better than the other. Studies may uncover performance differences between these two ways of implementing linear navigation (e.g., Liesaputra & Witten 2008); however, they do not offer incontrovertible evidence, and user preference seems tied to many factors (what the user is accustomed to; how good the hardware or software scrolling mechanism is; the type of material that is used in the test; and other characteristics of how the system, content, and user interact with one another). Furthermore, pagination can imply several different things.
Pagination is sometimes implemented as a fixed page size to emulate a print page; the PDF format has this characteristic. That way, both readers and writers can rely on the fixity of the page; the artifact is stable regardless of the rendering hardware. For instance, the instruction to “look at the first paragraph on page 8” will always mean the same thing, thus supporting simple collaborative use of the book or document without any further functionality. On the other hand, displays have varying properties, and commitment to a paginated presentation may not be the same thing as commitment to a fixed page. Microsoft’s Reader uses a paginated layout but recalculates the page size based on changing properties of the target display (e.g., the size of the display and the desired font size).
Book-like navigation does not preclude using the structure of the book to navigate. Print books have developed various conventions for structural navigation such as tables of contents and manually constructed indices. If these conventions are realized through functional markup (e.g., by naming the section divisions) and are coupled with links, electronic books can have easy-to-use navigation mechanisms based on structure. Some print books also provide a physical means for accessing the book’s structure. Think of a large dictionary with tabs on the side that allow you to open the book to the desired letter. Such physical tabs are sometimes emulated in electronic books by means of a visualization or by some other generalized jump function that does not rely on the exact structure of the document.
Reading on a computer means search can be used to navigate within a document. Searching is especially effective if the reader is trying to refind a place she knows to be in the book. A simple search facility can be implemented using a standard text-indexing capability. That is, for collections like personal libraries, search is usually divided into two kinds of processing.
First, an index is created beforehand (and extended when new content is added) by parsing the text in the items and building a list of terms, in which each term is associated with all locations in the text that the term occurs. At retrieval time, the index allows the search function to quickly locate and display passages that contain instances of the term (or, by union or intersection, terms); the reader may use this capability to navigate to the desired place (or places). Naturally, this is an oversimplified explanation of information retrieval (IR); concepts such as stop words (words that are too common in the collection to use as indexing terms) and stemming (reducing terms to a normalized or canonical form) are discussed in many basic IR references (e.g., see Marchionini 1995 for a discussion of information seeking from a user perspective; Salton & McGill 1986 for a basic primer that explains modern IR). This type of term search is familiar to most readers, since it is used in many applications and in most curated online collections.
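The two kinds of processing described above (building the index beforehand, then consulting it at retrieval time) can be sketched in a few lines of Python. This is a minimal illustration rather than a description of any particular eBook product: the function names (`build_index`, `search_all`, `search_any`), the tiny stop-word list, and the sample passages are all assumptions introduced here, and stemming is left out.

```python
import re
from collections import defaultdict

# Illustrative stop-word list (an assumption; real systems use longer ones).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic terms."""
    return re.findall(r"[a-z']+", text.lower())

def build_index(passages):
    """Indexing phase: map each term to the set of locations containing it."""
    index = defaultdict(set)
    for location, text in passages.items():
        for term in tokenize(text):
            if term not in STOP_WORDS:
                index[term].add(location)
    return index

def search_all(index, terms):
    """Retrieval by intersection: locations containing every term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()

def search_any(index, terms):
    """Retrieval by union: locations containing at least one term."""
    result = set()
    for t in terms:
        result |= index.get(t.lower(), set())
    return result

# Hypothetical passages keyed by a (chapter, paragraph) location.
passages = {
    (1, 1): "The reader turns the page of the magazine.",
    (1, 2): "Scrolling is an alternative to turning a page.",
    (2, 1): "An index speeds up search in a personal library.",
}
index = build_index(passages)
print(sorted(search_all(index, ["page", "magazine"])))
print(sorted(search_any(index, ["scrolling", "index"])))
```

Because the index stores locations rather than whole documents, the result of a query can be handed directly to a "jump to" function, which is what makes search usable as navigation.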
Web search is more difficult because it requires the use of adversarial techniques. The simple content-based search we describe here is most useful when the items are well represented by the terms they contain; on the Web, the pursuit of viewer attention has provoked content creators to misrepresent what is in their documents. Web search more commonly uses inbound link-based metrics of popularity, as first suggested by Kleinberg (1998).
But searching based on words and phrases in the text may not get the reader where he or she wants to go. Generally, unless the reader has read the book before or knows a great deal about a subject area, terminology is going to be an issue. Either a given term is too general to take the reader to a useful point in the book, even after repeated searches, or the author uses different language than the reader expects. For example, suppose a student is writing a paper about Shakespeare’s use of bird imagery. She remembers a passage in The Merchant of Venice rich with birds. But she won’t be able to find the passage if she searches on “bird”; she’ll have to guess the name of a particular bird to navigate to the passage. In other words, if she thinks to search for wren, crow, lark, or nightingale, she’s in luck (unless she spells nightingale incorrectly!); if she searches on finch, she won’t find the passage. Further discussion of search functionality, and what it may be used for in eBooks, may be found in Chapter 7.
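One way to soften the terminology problem is to expand a general query term into more specific terms before searching. The sketch below is only illustrative: the `HYPONYMS` table is a hand-made assumption (a real system might draw on a thesaurus), the function names are invented here, and the sample text is a stand-in sentence that mentions the four birds named above, not a quotation from the play.

```python
import re

# Hand-made expansion table (hypothetical): general term -> specific terms.
HYPONYMS = {
    "bird": ["wren", "crow", "lark", "nightingale", "finch"],
}

def find_term(text, term):
    """Return character offsets where `term` occurs as a whole word."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return [m.start() for m in re.finditer(pattern, text, flags=re.IGNORECASE)]

def expanded_search(text, query):
    """Search for the query itself and for any specific terms it expands to."""
    hits = {}
    for term in [query] + HYPONYMS.get(query.lower(), []):
        offsets = find_term(text, term)
        if offsets:
            hits[term] = offsets
    return hits

# Stand-in passage (not Shakespeare's wording).
passage = "A crow, a lark, a nightingale, and a wren are compared as singers."

print(find_term(passage, "bird"))        # the plain search finds nothing
print(expanded_search(passage, "bird"))  # expansion finds the specific birds
```

Expansion of this kind trades precision for recall: the student reaches the passage, but a larger table would also return places she was not looking for.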
3.2.3 Orienting
Orientation is the flip side of navigation. Navigation asks the question, “How do I get there?” while orientation asks the question, “Where am I now?” This question may refer to the reader’s relative location in her personal library (“Which book am I reading? Which part of the collection is it from?”), in a particular book (“Where am I in the book?”), or in a subpart of the book (“How much of the chapter is left? What else is in this book?”). Thus, orientation need not refer to the entirety of a book or collection; it can also refer to attempts to discover one’s relative location.
Print books offer the reader many clues about where she is. The reader can glance at the book’s pages and compare the heft of the pages that have been read with that of the pages left to read. Page numbers are a simple orientation mechanism: if a book has 435 pages and the reader sees that she is reading page 200, she knows she is almost halfway through the work. A reader may also flip through a work quickly with the intent of orienting herself as opposed to navigating to the next place; in other words, the flipping will not result in a new focus of attention, but merely a renewed sense of where the reader is right now.
There are a variety of ways of allowing the reader to orient herself; these techniques may involve some combination of abstract representations of place (e.g., page numbers), visualizations (e.g., maps of the book’s structure), and actions that will only temporarily disrupt the reader’s current focus of attention (e.g., flipping through a book or glancing ahead or back; several of these actions are illustrated earlier in this chapter in Figure 3.2). In practice, eBooks usually take advantage of some combination of these techniques to give the reader this vital sense of where she is. We will explore each in turn.
First, one can use compact representations of relative location such as page numbers or headers and footers.
An eBook reading application can number the pages so the reader can tell that she is on page 200/435. Because pagination in eBooks can vary according to display properties (a small screen as opposed to a large one) and reader preferences (a large font as opposed to a small one), readers sometimes find these traditional measures of progress and orientation to be frustratingly difficult to interpret. For example, if our reader is tackling Charles Dickens’ substantial Victorian novel Bleak House on an iPod, the work may seem to have an endless number of pages. We saw this effect in our studies as students (and other Pocket PC users) read long works on small screens; an English Literature graduate student described reading on her HP Jornada this way:
You get this little screen, so you get no sense of even how long the work is . . . You have 600 pages, which means what? No-one knows. And so . . . I definitely don’t see it as a literary experience (Marshall & Ruotolo 2002).
Second, because of the orientation difficulties introduced by this lack of material cues, most on-screen reading software couples representations like page numbers with visualizations of reader progress through the work; these visualizations may be interactive so they can be used for navigation as well as orientation. Figure 3.3 shows an example of a visualization that reclaims the lost material cues by simulating the physical properties of the book (e.g., thickness) to help the reader regain a sense of orientation. Others (e.g., Liesaputra & Witten 2008 and the Open Content Alliance) have constructed similar interfaces with varying degrees of attention to the literal physics of turning the paper page.
A third means of supporting orientation builds on standard navigation functionality: orientation is a matter of hanging onto one’s current position while one looks backward or forward to gain a sense of what is to come or to remind oneself of where one has been.
Figure 3.2f shows how judicious placement of a “finger” as a place marker allows one to flip forward rapidly and come right back and resume reading. Why is orientation vital? At first blush, orientation looks to be an insignificant subset of what might be implemented by way of navigation. Yet our studies have shown that if a reader lacks a good means of discovering where he or she is in the text, he or she becomes frustrated with the on-screen reading experience. Even in a linear reading trope (e.g., reading a novel with a conventional plot), readers interpret what they are reading by using their sense of orientation; a murder on page 25 of a mystery will be apprehended very differently than a murder on page 375. Returning to the navigation study (Marshall & Bly 2005b), let’s look at two examples of a reader orienting himself. The first involves a quick look forward, probably to see how much is left of a long article and to preview where the article is going, followed by an oscillating return to the spot the reader left off to regain purchase on the stream of text. Figure 3.4 shows the log of this orientation process; it is evident how important—yet how seamless—this combination of actions is to the reader’s progress through the long magazine article.
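The difficulty of interpreting page counts on small screens, noted earlier in this section, can be illustrated with a little arithmetic. The sketch below assumes a hypothetical model in which the reader’s position is stored as a character offset and pages are derived from a characters-per-page figure that depends on the display and font size; the function names and the sample numbers (other than page 200 of 435) are assumptions made for illustration.

```python
import math

def progress_fraction(offset, total_chars):
    """Fraction of the work that lies before the current position."""
    return offset / total_chars

def page_of(offset, total_chars, chars_per_page):
    """Return (current_page, page_count) for a given display capacity."""
    page_count = math.ceil(total_chars / chars_per_page)
    current_page = min(offset // chars_per_page + 1, page_count)
    return current_page, page_count

# The print example from the text: page 200 of 435 is a bit under halfway.
print(round(200 / 435, 2))  # 0.46

# Hypothetical work of 870,000 characters, reader at offset 400,000.
total, offset = 870_000, 400_000
print(page_of(offset, total, 2_000))  # large display: (201, 435)
print(page_of(offset, total, 400))    # small display: (1001, 2175)
print(round(progress_fraction(offset, total), 2))  # 0.46 on either display
```

The page numbers change with the display, while the fraction does not, which is one reason a progress visualization can be easier to interpret than a raw page count.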
FIGURE 3.3: The British Library’s page-turning visualization to aid the reader in navigation and orientation.
Let’s look more closely at the participant’s actions when he is returning to page 104 and page 105 to continue reading (the portion of the log below the double line) after his excursion to the end of the article on page 111. When asked to comment on what he was doing, the participant said: So something has made me re-read something that I just read. Now that happens to me. And I’ll think, “oh, oh, they’ve just referred back to something.” And so I’ll go back and re-read it. What often happens to me is an article will have a lot of names in it, and . . . it’ll keep mentioning somebody, and I think, “Oh, who the hell was that?” And so I’ll go back, scan for their name, until I don’t see their name any more, and then figure, “Okay, they’ve got to introduce this person” (Marshall & Bly 2005b). From this example, it is easy to see both the importance and the invisibility of navigation and orientation. If the reader is unable to accomplish this basic part of reading easily, he or she may reject eBook hardware and software out-of-hand. Although Figure 3.4 reminds us what a small part
Time Code   Seconds   Action
00:00       122       Picks up magazine (12/3) to continue reading David Kelly article.
02:02       178       Turns to p. 101; continues reading.
05:00       121       Turns to p. 102; continues reading.
07:01       13        Turns to p. 103, then to p. 104, where article continues. Glances at p. 105.
07:14       126       Resumes reading on p. 104.
08:20       4         Turns to p. 105.
08:24       6         Turns to pp. 106–107
08:30       5         Turns to pp. 108–109
08:35       5         Turns to pp. 110–111, where the article ends.
=========   =======   ======
08:40       4         Turns back to p. 104.
08:44       14        Turns to p. 105; resumes reading.
08:58       62        Turns back to p. 104; resumes reading.
10:00                 Turns to p. 105; continues reading.
FIGURE 3.4: Navigation log for participant “Jay.”
of our time is spent actually navigating and orienting (out of over 10 minutes of reading time, the participant spent less than half a minute looking forward to the end of the article and looking back to reorient himself ), it also speaks to the significance of this time. Would the participant have finished the article if he couldn’t have found out how much was left, or would he have given up? Would it have made a difference to his interpretation of the article’s meaning if he discovered he was on the sixth page of a 100-page article, one that took up the entire issue of the magazine? Would he have been able to recover context gracefully had he been unable to look back and see who someone was? One danger introduced by precise navigation is the potential loss of serendipity (Marshall & Bly 2005a, 2005b; Marshall 2007). Because flipping through a print publication is, by its very nature, imprecise and nonlinear, it affords a great deal of serendipity. We should not underestimate
the importance of this phenomenon. Consider the way a reader describes his interactions with the New Yorker: An article will go on and on . . . [and] at some point I’ll think, “okay. I’ve gotten this whole point. I’m tired of this. When is this going to end?” And if it’s like three more pages, then I may just either give up. Or just go into a scan mode, where I just flip, you know, see what grabs my attention (Marshall & Bly 2005b). Navigation has been an underappreciated aspect of reading. Successful eBook interaction will include inventive hardware and software mechanisms that will more fully support a reader’s range of navigation and orientation activities. Because on-screen navigation has the potential to be more comfortable and less awkward than reading print books (there are no pages to smooth and refold), and the capacity to be more imaginative (consider the many kinds of interactive visualizations that are possible and the types of haptic feedback hardware that have yet to be put in readers’ hands), we can expect considerable development in this area as the next generations of eBook platforms unfold.
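As a small check on the earlier observation that the look-ahead excursion in Figure 3.4 took less than half a minute out of about ten, the time codes from the log can be converted to seconds and subtracted. The helper name `to_seconds` and the choice of 08:20 and 08:44 as the boundaries of the excursion are assumptions made for this sketch; only the time codes themselves come from the figure.

```python
def to_seconds(time_code):
    """Convert an 'mm:ss' time code into a number of seconds."""
    minutes, seconds = time_code.split(":")
    return int(minutes) * 60 + int(seconds)

# Time codes copied from Figure 3.4.
excursion_start = "08:20"  # turns to p. 105 and keeps flipping forward
excursion_end = "08:44"    # back on p. 105, resumes reading
session_end = "10:00"      # last time code in the log

excursion = to_seconds(excursion_end) - to_seconds(excursion_start)
session = to_seconds(session_end) - to_seconds("00:00")

print(excursion)                         # 24 seconds
print(round(100 * excursion / session))  # 4 (percent of the session)
```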
3.3 CLIPPING
We have all had the experience of opening our email in the morning and finding a newspaper article sent by a friend, colleague, or family member. For the sake of consistency, we will refer to this mode of interaction with reading material as clipping. Clipping is a way people extract, save, and share portions of a longer work: a chapter of a book, for example, or an article out of a magazine, or even a memorable paragraph hidden deep within a long novel.
Will people still want to clip if they can simply search and refind the item of interest later? We can argue that they will, because people clip for many reasons (Marshall & Bly 2005a). They sometimes clip because they have encountered something of interest that’s off the immediate topic, something they don’t expect to remember or have reason to look for again. They may also clip an item to share it with someone (many of our most memorable experiences with clippings may well be in this category).
In a study Sara Bly and I performed of how demographically diverse people clip material out of print and electronic periodicals (including magazines, newsletters, and newspapers) (Marshall & Bly 2005a), two findings were readily apparent: (1) each participant performed some variation of clipping, and more notably, each participant shared clippings and received them from others; and (2) at the outset of each interview, most participants denied that they did so themselves, preferring to ascribe this habit to friends, colleagues, and family members. Some emphatically denied saving clippings, although by the end of the interview, we had turned up many examples they had scattered through their physical files, stashed in their computer folders, slipped into print books, saved in their email, and stored in any number of other physical and virtual places.
Thus, like annotation, clipping is an unselfconscious practice. We will examine clippings as we did annotations (by looking at as many examples as possible to understand their form, function, status, and value, and when they are used and how) with an eye toward how clippings can become part of the functional substrate of electronic books.
We will start with clippings that people save for themselves. Because clipping is unselfconscious, it is to some extent invisible. It is also surprisingly ubiquitous; all the participants in our study showed some evidence that they had clipped material themselves or they had saved material someone else had clipped on their behalf. What were they doing with these clippings? Although our study focused on clippings taken from periodicals, because the periodicals represented so many genres, the results are likely to apply to material one would find in equally diverse genres of electronic books.
When people clip material for themselves at home and at work, the most straightforward plan they have for the clipping is to use the information for future reference; in these cases, the information itself was regarded as valuable. For example, the owner of a commercial nursery showed us a large box that contained both whole magazines and clippings from other periodicals that are about interior decorating. She described the cache as “pictures of ideas” that she doesn’t intend to look at again until she has settled into her new house, which wasn’t yet built at the time of the study. Similarly, a financial consultant saved an article from The New York Times about a client in Hawaii. He planned to revisit the article before he took his next trip to Hawaii so he could “remind [himself] of the situation that’s going on over there.
There is some tension between some of these institutions and the locals there. So I wanted to make sure . . . I’d be aware of the situation.”
Substantial collections of clippings may be used at the anticipated time, but our findings established that other clippings were forgotten and not brought to bear when they would have been the most useful. Time-sensitive clippings often expired before the clippings were re-encountered. Because clippings are hard to organize (they may not fit within any preexisting category and are thus prone to misfiling), they are difficult to find again. If the clippings are saved by reference [e.g., as uniform resource locators (URLs)], the actual clipping may no longer exist by the time it is rediscovered.
It isn’t unusual for people to realize that they are prone to forget their clippings. They thus devise strategies wherein they don’t even try to manage them. For example, a study participant who was a partner in a design firm told us:
I rarely keep anything . . . Mainly because I’ve just proven to myself that they just go sit in a folder and I never look at them again. So I try to grab as much as I can out of it and continue on . . . I’m opportunistic. If I have the time, and the article comes that makes sense (whether it’s sent via link, I find it online, it comes in a magazine) then I’m done with it at that point . . . (Marshall & Bly 2005a).
Nonetheless, this participant still had many clippings scattered throughout his paper files, on his desktop, and in his email.
People also keep clippings as a reminder for intended future action. In other words, they’ll keep an article, an announcement, an advertisement, or some other clipped material as a visible cue that prods them to do something. The content may only indirectly suggest the action, but it is sufficiently evocative for the clipper to remember what it meant. For example, a study participant who was a professional environmental activist had photocopied a picture from a newspaper story about salmon; he intended to use this fish picture to remind himself to follow up with a fellow activist in Chicago: “I have a fellow in Chicago, an activist with [an interest in this topic]. And I photocopied this because I wanted to mail it to him.” He told us that he might even follow up by phone, using the article as an excuse. This finding (that clippings act as reminders) echoes the results of others in the personal information management (PIM) community who have studied the functions of other types of office documents (Boardman & Sasse 2004; Jones, Dumais, & Bruce 2002). Of course, there’s an element of uncertainty in using clippings this way: the reminder might not be in the right place at the right time, and there are limits to available space and attention. Furthermore, it is difficult to leave digital clippings out on one’s desktop; an unexpected reboot can remove a clipping from the person’s immediate view, taking with it the reminder function.
Not all clippings that people save for themselves have such pragmatic functions (as immediate sources of information or as reminders of things to do); they also may be emotionally evocative, calling an event back to mind. We have found that people intend to keep this type of clipping “forever” and thus become interested in aspects of sustainability.
For example, a study participant who was a woman in her late 20s had kept a copy of Highlife magazine and a program for a cannabis festival in Amsterdam (“where everything’s allowed”) to remind her of a European trip she took by herself in 1995. During an interview, she told us that even though “that’s a long time ago. I’ll probably keep [the magazine and a program for a festival she attended] because it was an experience I had and I want to remember it.” It is also not uncommon for people to save published material that evokes the person’s place in a global event (e.g., a local front page from September 11, 2001 or a magazine published at the turn of this century). EBook platforms (in their role as venues for personal digital libraries) can facilitate this type of interaction (clipping and keeping) because there is less intrinsic danger to keeping too much.10 However, it is difficult to anticipate the long-term value of such a clipping, and we have more than once observed that upon re-encountering items such as these, people find their desire to keep them inexplicable as the years roll on (Marshall 2008a).
10 Although we must remind ourselves that storage is only one part of the cost of keeping an ever-growing digital collection; human attention must be factored into the total cost associated with sustainability (Jones 2004).
The most obvious and ubiquitous purpose of clipping is to share the excerpted material with others; in our study, all of the participants both shared clippings and received shared clippings. Remember that these clippings might not be an article literally ripped from a print newspaper; rather they might be a URL sent in an email message that points to an article in an online magazine. When we think of clippings in this inclusive way, it’s easy to grasp their ubiquity. A full 40% of the examples we saw during the course of the interviews were shared clippings. This dominance suggests we should pay close attention to the function and value of the material that is shared. What is the social role of this sort of sharing? After all, varying interpretations of clippings imply different functionality. In Chapter 4, we delve into the social function of clippings, but we will continue to focus on clipping as a type of eBook interactivity in this chapter. Figure 3.5 shows the relative frequency of different reasons people give for clipping material out of periodicals. This chart is not intended as an authoritative reference for how often people do one thing instead of another with their reading, but to show the scope of motivations people have for doing this kind of thing. Most importantly, the figure underscores that readers aren’t just saving this material for its information content. Thus, like annotating, clipping is a fundamental way that we should expect people to interact with electronic publications including eBooks. It is important to translate clippings’ functions and roles into requirements for eBook applications. Some of these requirements (e.g., refinding a desired item) have been explored at some length in the human–computer interaction (HCI) and IR communities (Dumais et al. 2003); others place more subtle constraints on applications. 
It is clear that like annotation tools, clipping tools need to be very well integrated with reading so that this kind of unselfconscious secondary interaction— clipping—can be performed without seriously interrupting the primary one. How often do people
FIGURE 3.5: The relative prominence of various reasons for clipping material out of publications.
refuse to use tools on a computer because they are simply too intrusive at the time they are needed? Sometimes, it’s just a matter of a few extra clicks on menu items, just enough to divert one’s attention from reading and interacting with the material to wholly interacting with the computer, and not reading at all. Or the tools are ignored because the user worries that her social intent may not be reflected appropriately. Or the physical characteristics of the clipping are missing (e.g., their ability to visually remind), thus rendering the clipping ineffective in its anticipated role. Clippings remind us that context and provenance are necessary to keep the products of our interactions with eBooks useful. If a clipping’s source has been forgotten and cannot be inferred from a property of its form (e.g., a recognizable font and writing style), it’s difficult to reestablish its authority: it’s different to save an article from The New York Times than from the South Bay Daily Breeze,11 yet there may be ample reasons to do either. Clippings also remind us of the need to re-encounter the excerpts saved from our reading. Sometimes, clipping is part of a larger activity that is driving reading (e.g., the reader is a student who is writing a term paper and is collecting material for subsequent analysis and synthesis). But as we learned earlier in this section, clipping is sometimes driven by serendipitous encounter; in this case, the clipping context tells us little about what provoked the reader to save some part of what he or she is reading. Hence, clipping underscores the need to somehow re-encounter or remember what has been saved at the appropriate place and time. Implicit query (IQ ) capabilities may address part of this problem; IQ uses the current context, the other documents in use, to run queries against the user’s personal collection (Teevan, Dumais, & Horvitz 2005). 
But just as there is a need for serendipitous encounter, there is also a need for serendipitous re-encounter: emotionally evocative material (the issue of Highlife we used as an example earlier in the section) is not necessarily a welcome interruption if the user is focused on an intellectually demanding task.
That clippings are used collaboratively implies a need to extend collaborative tools to work in the preferred reading environment. If I see a quotation in an eBook that reminds me of my neighbor Evert, I don’t want to interrupt my reading, bring up a mailer, create a new message, find Evert’s email address (or his IM identity), copy and paste the quotation into the new message, send it, and get back to my reading. Rather, I want this secondary interaction to take place in a manner that doesn’t interrupt the primary one.12 Because collaborative reading is such an important topic, and has so much impact on eBooks, it merits its own separate discussion in Chapter 4. This discussion gives some thought to how sharing clippings and other forms of encountered information contributes to the notion of social capital (à la Putnam 1995; Wellman et al. 2001).
11 The local paper that served the southwestern part of Los Angeles when I was growing up.
12 Many designers have already learned this lesson; I can send the document I am working on right now to another device directly from Word’s main menu. Unfortunately, many eBook reader applications exist in splendid isolation and assume that the user will be doing nothing else besides reading.
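The requirements discussed in this section (keeping context and provenance with a clipping, and not depending on a source that was saved only by reference) suggest what a stored clipping might carry with it. The following is a minimal sketch under stated assumptions: the `Clipping` class, its field names, and the sample values are hypothetical and do not describe any existing eBook application.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class Clipping:
    """A saved excerpt together with its provenance (hypothetical design)."""
    excerpt: str                      # text saved by value, not only by reference
    source_title: str                 # publication the excerpt came from
    location: str                     # where in the source, e.g. a page or chapter
    clipped_on: date                  # when the reader clipped it
    source_url: Optional[str] = None  # optional reference; may stop resolving
    purpose: Optional[str] = None     # e.g. "reference", "reminder", "share"
    shared_with: List[str] = field(default_factory=list)

    def has_provenance(self):
        """True if the source can be identified without relying on the URL."""
        return bool(self.source_title and self.location)

    def share(self, recipient):
        """Record a recipient; actual delivery is outside this sketch."""
        if recipient not in self.shared_with:
            self.shared_with.append(recipient)

clip = Clipping(
    excerpt="(text of the clipped passage)",
    source_title="Example Magazine",
    location="p. 104",
    clipped_on=date(2005, 1, 15),
    purpose="share",
)
clip.share("Evert")
print(clip.has_provenance(), clip.shared_with)
```

Storing the excerpt by value alongside an optional URL is a deliberate choice in this sketch: the clipping stays readable even if the referenced source disappears.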
3.4 BOOKMARKING
What is bookmarking? Is it functionally necessary to distinguish bookmarks—specified places in an electronic publication13—from reader annotations? After all, highlights, underlines, and other marks in texts provide a way to return to a particular place. The most visible difference between annotations and bookmarks in print books is that bookmarks may be seen from outside the book rather than just being visible from within. Bookmarks may serve a number of functions and may either be a permanent or transient part of the reading landscape. At times, a bookmark may be entirely transient and simply hold the reader’s place across sessions: it is the place where the reader left off reading just before she was interrupted or just before she put the book down and went to sleep. Bookmarks of this sort may have no lasting value. A bookmark may also be useful in a limited scope, within the context of the task at hand, such as the bookmarks that signal the quotes to be used in composing a legal brief (Marshall et al. 2001). At the other end of the spectrum, a bookmark can be a persistent resource, a place the reader returns to repeatedly as a reference or shares with others. Let’s first look at transient bookmarks. These are often acknowledged as a just-in-time resource, one that is not supposed to have continued meaning after the fact. A student confronted with one of her own bookmarks during an interview said: Yeah, I have a dog-ear in here . . . I can’t remember what that signified. But at the time when I folded the page over and probably at the appropriate instance after that, I remembered what that meant, what I was supposed to go back to (Marshall & Ruotolo 2002). In cases like this, bookmarks and annotations serve very similar abstract functions over time: they form a personal geography over one’s reading material, reminders of where one has been, if not substantive comments on what one’s reading contains. 
Another student in the same study discussed her annotations as having the same geographic function as her bookmarks: So of this I’m starting to skim more. As it talks about things that are less relevant. I can’t remember—here’s another highlight, so I definitely read this far. Uh. I think I read to the end of the chapter. Yeah, there’s another highlight.
13 Here we are careful to distinguish eBook bookmarks from the Web browser function that sometimes uses the same term (but may be also referred to as a Favorite). Although these two types of bookmarks may have similar placeholding functions, they may also be quite different (finding one’s way back to a familiar place on the Web is quite different from opening a book to where you have left off reading). As such, they may be stored in a different manner (e.g., in a local directory) than the storage we might design for the equivalent eBook feature.
Our observations of the roles of bookmarks echo the findings reported in Abrams, Baecker, and Chignell (1998). They describe Web bookmarks as used to reduce cognitive load (i.e., placeholding across sessions), to facilitate access (i.e., a persistent landmark),14 and as a collaborative resource (i.e., to share material within a book, either as a place or as a quotation). We add to that taxonomy the more general function of bookmarks as a part of the reader’s personal geography.
Bookmarks need not be simply an anchoring point in the text. Sometimes, a reader purposefully marks her place in one document using a second document. Consider, for example, the case of a photo of gray clouds used as a bookmark in a meteorology textbook or a looseleaf sheet of class notes used to mark the chapter that the test is on. Although bookmarks like this (which are arguably very similar to annotations in that they have a body in addition to an anchor and marker) are less common than the other kinds of place markers (gum wrappers, playing cards, a ripped corner of a magazine cover, and the like), they occupy an interesting point on the continuum.
What are the status and value of these bookmarks? The type of bookmark that is most often shared is the kind that serves as a persistent resource; bookmarks used for other functions are rarely shareable because they are either transient (only valuable within a limited scope) or tacit (the reader gives no clue about why the bookmark is there) or both. I have observed bookmarks as important shared resources in places like law offices, where a group of attorneys will share a hand-tabbed copy of a book of statutes. Sometimes, this practice creates a profound problem when a new edition of the reference is published; the people who use the reference are reluctant to give up the added value of the bookmarks in favor of a book that’s more up-to-date.
For the sake of representational consistency, it might be best to consider bookmarks as a kind of annotation (or vice versa) and to treat them in a similar fashion because they introduce the same design and implementation questions: are they stored with the book, in a separate local store, or in a central database? Because we can envision bookmarks that are themselves content (the cloud photo bookmarking a meteorology textbook), they may be seen as having the same general anatomy as an annotation: a body, which is usually null, but which may be bookmark content; an anchor, which is likely to refer to a physical boundary (a page) or to a major structural element (a chapter heading); and a marker, which refers to the visualization used to indicate the presence of a bookmark (say, a stripe along the edge of a page).
In spite of the distinction we posed earlier in this section, our observations of the Web in use (by readers using the most common browsers of the day) bear out the generalizations we have made about how people bookmark print documents. While most people do use bookmarks (or favorites), they use them in very different ways. For some users, their bookmarks are almost wholly transient.
14 While we might readily tolerate a 404 error when our Favorites fail to turn up the proper Web page, we would feel quite differently if we reopened the book at a bookmark, and found a blank page.
They mark a page they’d like to get back to in the near future, and—having bookmarked it—only use it a few times, possibly in the course of performing a specific task. Sometimes, these users have unruly lists of bookmarks, and only refer to the ones at the bottom of the list. Other times, they will organize these lists, but not use them (e.g., they are aware that few of their links work anymore). Still other users have bookmarks that are actually references—an online dictionary, a favorite porn site, a magazine they read regularly, an email portal, and so on. How do people organize the bookmarks that they keep? Abrams, Baecker, and Chignell (1998) describe a number of different methods for saving Web bookmarks (some of which seem more pertinent to electronic books than others). Users might not organize them at all (so they remain in the order they were saved); they might keep them in an ordered list; they might put them in folders; or they might export them to a separate application for the purpose. Thus, Web bookmarks are managed within a continuum of structure, from a time-based order which requires no user action, to managed structures. These results are similar to the findings presented by Shipman et al. (2004). Because this discussion of bookmarks is edging into the territory of personal libraries (i.e., managing multiple electronic publications) and PIM, we will relegate it to Chapter 7.
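The anatomy proposed earlier in this section (a body that is usually null, an anchor, and a marker) and the continuum of organization just described can be sketched as a small data structure. This is only an illustration: the names Anchor, Bookmark, time_ordered, and by_folder are hypothetical, as are the "page" and "chapter" anchor kinds and the "edge-stripe" marker; none of them comes from an actual eBook platform.

```python
from dataclasses import dataclass, field
from typing import Optional
import time


@dataclass
class Anchor:
    """Where the bookmark attaches (hypothetical vocabulary)."""
    kind: str    # "page" for a physical boundary, "chapter" for a structural element
    target: str  # e.g. "68" or "3.5"


@dataclass
class Bookmark:
    anchor: Anchor
    marker: str = "edge-stripe"   # how the bookmark's presence is visualized
    body: Optional[str] = None    # usually null; may hold bookmark content
    created: float = field(default_factory=time.time)


def time_ordered(bookmarks):
    """The no-effort end of the continuum: the order in which bookmarks were saved."""
    return sorted(bookmarks, key=lambda b: b.created)


def by_folder(folders):
    """A managed structure: the reader has filed bookmarks into named folders."""
    return {name: time_ordered(items) for name, items in sorted(folders.items())}
```

A cloud photo marking a meteorology chapter would be a Bookmark whose body is the photo; a gum wrapper would be one whose body is None. Treating both as the same record is what lets bookmarks and annotations share storage, wherever that storage lives.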
3.5 HARDWARE FOR INTERACTING WITH EBOOKS
Active reading and other forms of non-immersive reading rely on the reader’s ability to interact with the material. We can divide readers’ interactions into three essential types: pen-based interactions (annotation and possibly clipping); navigational interactions (page turning, scrolling, flipping, and so on); and the interactions necessary to move from reading into related activities such as writing or communicating. Right away, we’ll relegate the third category to general purpose mobile devices: there’s no sense in reinventing the laptop, the smart phone, or any number of other multipurpose devices that enable readers to communicate, share material, or write. Hence, in this section, we’ll focus on two types of hardware: pen-like input devices and innovative techniques that support navigation.
3.5.1 Hardware That Supports Navigation
As we learned earlier in this chapter, navigation is a fundamental interaction for reading. Although it occupies only a small portion of the reader’s time, it is an activity that represents a major fraction of the reader’s interactions. Thus, eBook hardware designers have endeavored to make basic navigation a prominent feature of their devices, realized as hardware buttons, sensors, or haptic feedback mechanisms.
Buttons. Hardware buttons are probably the most ubiquitous means for turning pages in eBook hardware. Click a button; turn a page forward. Click a separate button; turn the page back.15 Page turning via a hardware button or rocker switch characterized most of the second generation of eBook devices. Figures 1.1a and 1.1b show the page-turning buttons on the Rocket eBook and the SoftBook Reader. The first version of Amazon’s Kindle apparently got page turning wrong: no reviewer would let them hear the end of how easy it was to accidentally turn an eBook’s page by gripping the edge of the device to hold it. Even in the platform’s second incarnation, page turning receives considerable attention. Usability guru Jakob Nielsen wrote:
Kindle shines in one area of interaction design: turning the page is extremely easy and convenient. This one command has two buttons (on either side of the device). Paging backwards is a less common action, but it’s also nicely supported with a separate, smaller button. (Nielsen 2009)
Unfortunately, other types of navigation didn’t fare as well in Nielsen’s evaluation of the device; moving around in the page and following links are relegated to a less successful joystick-like input mechanism.
Sensors. Sensors are a means of detecting certain use conditions; they are mechanisms that measure an environmental characteristic and convert it into a form that may either be read from a display (as you would find in a thermometer), or (in our case) used as an input parameter to software that controls interaction. For reading software, a sensor might measure pressure or tilt, and convert it to input for a navigation technique. Ideally a sensor will not need calibration very frequently. Fishkin and his colleagues at Xerox PARC added sensors to a small reading device to detect user handedness and shift the page’s position in the appropriate direction so the margin would be large enough to accommodate annotations (Fishkin et al. 2000).
They fitted another small device with tilt sensors (i.e., accelerometers); tilt may be used to represent desired scroll speed for skimming, so the lines literally fall off the page faster as the device is held at a steeper angle.16 Finally, they used sensors on either side of the top of a display case to support page turning forward and back; the reader “flicks” the page in the desired direction by applying pressure to the pads. Chen et al. (2008) extended the idea of sensor-based navigation in their dual-display prototype. Flipping the display (the whole device, more accurately) advances or rewinds the page, depending on the orientation of the flip (clockwise versus counterclockwise). A hysteresis mechanism saves the last page the reader has viewed on the second display. A similar gesture implements fanning through pages very quickly; a sensor means there is no need to interrupt the activity to look for a button or a software control.17 The Kindle uses an accelerometer to detect the orientation of the device. Tilt it one way and the page is displayed in the normal portrait orientation; tilt it the other and the display becomes landscape. It even detects when the device is upside down and orients the page accordingly. If the accelerometer is oversensitive, the display sometimes reorients when the reader does not want it to.
15 Even when page turning is implemented in software, the focus is on one-button forward page turning. The Times News Reader’s basic scenario features a commuter reading one-handed, pushing a button repeatedly to page through the morning paper.
16 Apple’s iPhone has made this type of interaction commonplace via its on-board accelerometers; the iPod’s scroll wheel is a second example of how sensors can be incorporated to control scroll speed.
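Two of the sensor techniques described above lend themselves to a short sketch: mapping tilt to scroll speed, and switching display orientation without overreacting to small movements. Everything below is an assumption made for illustration. The function scroll_speed, the class OrientationDetector, and all of the threshold values are hypothetical and are not taken from Fishkin et al.’s prototypes or from the Kindle; the use of separate enter and exit thresholds is simply one common way to keep an oversensitive sensor from causing unwanted flips.

```python
def scroll_speed(tilt_deg, dead_zone=5.0, max_tilt=45.0, max_speed=20.0):
    """Map a tilt angle in degrees to a scroll speed in lines per second.

    All thresholds are illustrative assumptions, not measured values.
    Positive tilt scrolls forward; negative tilt scrolls backward.
    """
    magnitude = abs(tilt_deg)
    if magnitude <= dead_zone:
        return 0.0  # ignore small, unintentional tilts
    magnitude = min(magnitude, max_tilt)  # steeper than max_tilt adds nothing
    fraction = (magnitude - dead_zone) / (max_tilt - dead_zone)
    speed = fraction * max_speed
    return speed if tilt_deg > 0 else -speed


class OrientationDetector:
    """Switch between portrait and landscape using two thresholds.

    The display turns to landscape only past to_landscape degrees of roll
    and returns to portrait only below to_portrait degrees, so a reading
    that wobbles between the two does not flip the page back and forth.
    """

    def __init__(self, to_landscape=60.0, to_portrait=30.0):
        self.to_landscape = to_landscape
        self.to_portrait = to_portrait
        self.orientation = "portrait"

    def update(self, roll_deg):
        roll = abs(roll_deg)
        if self.orientation == "portrait" and roll >= self.to_landscape:
            self.orientation = "landscape"
        elif self.orientation == "landscape" and roll <= self.to_portrait:
            self.orientation = "portrait"
        return self.orientation
```

The dead zone in scroll_speed plays the same protective role as the gap between the two orientation thresholds: both trade a little responsiveness for fewer accidental actions, which is the trade-off that the accidental page turns on the first Kindle suggest designers need to weigh.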
3.5.2 Pen-Based Interaction
Early eBook prototypes, such as FXPAL’s XLibris, used tablet hardware under the assumption that annotating with a pen required far less cognitive overhead than producing a comparable annotation using mice, menus, and a keyboard; freeform digital ink was seen as a way of providing relatively natural interaction with books and other documents displayed on the screen (Schilit, Price, & Golovchinsky 1998b). Field studies of XLibris used in real situations (e.g., a reading group) demonstrated that this was indeed true; people read and marked in much the same manner as they did using printed reading materials and a pen or highlighter (Marshall et al. 1999). In fact, the interaction was so natural that readers didn’t bother changing pen style and wrote instead with whatever pen they had in hand (as they would with a real pen). For example, a reader would write brief marginalia with the highlighter she was already using, rather than switching the stylus to write like a ballpoint-style pen (a change easily accomplished in the user interface).
Although pen-based interaction is very natural for annotation, unfortunately it may require further thought about the rest of the user interface. What is easy to do with a keyboard, mouse, and standard menus may be difficult to accomplish with a pen input device. Adaptations like marking menus and gestural user interfaces have attempted to address some of the user interface problems that are introduced by pen input. Furthermore, studies have found that producing longer segments of text (notes and long annotations) is more comfortable using a keyboard. Even with a handheld device like the Pocket PC, several students wanted plug-in keyboards; students in the Pocket PC study said that they could type a great deal faster and more legibly than they could write, so when they were writing anything beyond the briefest marginalia, they wanted to type (Marshall & Ruotolo 2002).18 Much research has been performed to determine whether handwriting recognition is actually necessary and whether other implicit structures (e.g., lists) can form the basis for interaction instead (Moran et al. 1995). Systems that do use pen interaction must be designed to work around the inaccuracies that are common in recognition.
17 This functionality is duplicated by interacting with a trackball so the eBook can be flipped when it is set down on a table or other surface.
18 It should be noted that in this study, several students who had been exposed to Palm Pilots had found Palm’s Graffiti to be a more effective input method than trying to write normally on the screen.
3.6 ESSENTIAL BUT INSUFFICIENT
Thus far, I have strived to keep this discussion general and to describe the kinds of functionality and interaction that transcend specific intellectual work and eBook genres, and that are faithful to the interactivity offered by the physical form of the book (annotation, navigation, clipping, and bookmarking). Yet some of the most powerful functions developers can add to eBook platforms are specific in their intent: translation, inter- and intra-corpus linking, skimming tools, search capabilities, and analytic tools. This type of beyond-the-book functionality is so important that we will give it its own chapter, Chapter 7. The basic interactions and functions we explored in this chapter are the live-or-die aspects of the electronic book. Doing them right will not guarantee that readers will see the virtues of reading on the screen; but doing them wrong ensures that readers will reject eBooks out of hand. The ability to annotate, navigate, clip, and bookmark will not cause our hypothetical reader to say “Wow!,” but these are necessary components of interaction with electronic books.
chapter 4
Reading as a Social Activity
I began Chapter 2, the chapter about reading, by calling into question the assumption that reading is a solitary activity. It isn’t: I invoked document scholar David Levy to assert that reading is inherently social. This is not to say that people don’t read alone; of course they do. They may even go to some lengths to separate themselves from other people in the name of quiet and to prevent unwanted interruptions. But reading is social in a way that crosses several dimensions beyond the immediate stereotype of a scholar deep in thought in a library carrel or a child curled up with A Wrinkle in Time in a picture window.
To tease out these social dimensions, let’s call to mind the old but serviceable computer-supported cooperative work (CSCW) two-by-two matrix that divides up the world (albeit somewhat artificially) according to place and time. These dimensions will give us four quadrants characterized by events happening in the same place versus events happening in distant places and events happening at the same time versus events happening at different times. Figure 4.1 sketches out this matrix, styled after Johansen (1988). This matrix gives us a simple framework to examine the social side of eBook use. Use that occurs in the same place at the same time implies that people are reading together. Reading together may mean that the situation has been deliberately organized as an opportunity to read together: reading groups or classroom discussions are both examples of such a situation. It may also mean that people have contrived informally to read together: two students push their chairs together to share a computer display or two friends talking on the phone browse the same Web pages together. The second variant, reading together over the phone, represents use that occurs at the same time, but in a different place.
Activity in this quadrant of the collaboration matrix is usually intentional because it involves a communications technology; that is, the distance must be bridged by some kind of connection. Once people are reading at different times, some sort of persistent artifact of their reading activity must be involved: annotations, clippings, bookmarks, recommendations, page views, or the books themselves may be used as a record of the activity; otherwise, collaboration or some other kind of reading-related social interaction would be difficult. Hence, this aspect of sociality involves sharing records of reading. These records can be explicit, such as annotations, or implicit, such as page views recorded in a log. Asynchronous sharing can likewise be something intentional and planned, such as a discussion of assigned reading conducted in an online annotation system (Brush et al. 2002) or the recommendations of books or authors that we might see in a social cataloging Web application like LibraryThing, or the sharing may be serendipitous and may take advantage of large-scale collective effects such as an online newspaper’s most emailed story.
In the end, it seems that we have posed two basic distinctions that make very different demands on eBooks and their users. First, there are synchronous activities that involve reading together, i.e., either in a way that is co-located (e.g., reading together in a classroom) or remote (e.g., browsing the Web together over the phone). Then there are asynchronous activities that involve sharing records of reading. These asynchronous activities can be further divided into those in which the records of reading are intentionally shared (e.g., clipping a story from the local newspaper and mailing it to a friend) versus those in which the records of reading are aggregated and shared in a way that represents collective intelligence or the wisdom of crowds (e.g., assembling a bestseller list or a citation index). Taken together, these variations give us an interesting way to explore the diverse social side of eBooks.
FIGURE 4.1: Collaboration place/time matrix.
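The distinctions drawn in this introduction can be written down as a small decision procedure, which may be useful when coding observations from a reading study. The names Sharing and classify are hypothetical, and the way the asynchronous half is split (by whether the sharing was deliberate rather than by place) simply follows the wording above; it is a sketch, not an established coding scheme.

```python
from enum import Enum


class Sharing(Enum):
    READING_TOGETHER_COLOCATED = "same time, same place"
    READING_TOGETHER_REMOTE = "same time, different place"
    INTENTIONAL_RECORDS = "different time, records shared on purpose"
    AGGREGATED_RECORDS = "different time, records aggregated across readers"


def classify(same_time, same_place=False, intentional=True):
    """Place a reading activity in the taxonomy sketched in this chapter."""
    if same_time:
        if same_place:
            return Sharing.READING_TOGETHER_COLOCATED
        return Sharing.READING_TOGETHER_REMOTE
    # Asynchronous activity depends on a persistent record of reading;
    # what matters here is whether the sharing was deliberate.
    if intentional:
        return Sharing.INTENTIONAL_RECORDS
    return Sharing.AGGREGATED_RECORDS
```

Under this sketch, two friends browsing the Web together over the phone would be classify(True, False), a clipping mailed to a friend would be classify(False, intentional=True), and a most-emailed list would be classify(False, intentional=False).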
4.1 READING TOGETHER
If we take the idea of reading as a social activity as literally as possible, then we will conjure up the setting of a classroom, meeting, or reading group: a place where people gather to read and to discuss what they are reading. Figure 4.2 shows two examples of this kind of setting: Figure 4.2a is a video frame of the reading group that was the subject of the study reported by Marshall et al. (1999). Figure 4.2b is a video frame of an undergraduate class that participated in the study reported by Marshall and Ruotolo (2002). Although we can envision either group working from a projected display that everyone shares, in practice, it is more common (and for reasons we will discuss, in these cases more desirable) for people to participate in such a discussion referring to their own copies of the work in question. There are two relevant phenomena that we observed in these settings: the first concerns “getting everyone on the same page”; the second involves the transition from face-to-face discussions over reading materials to exploratory search (or to simple question-answering IR). These phenomena may also arise in less structured situations.
FIGURE 4.2: Reading together in a structured situation. (a) Reading in a reading group. (b) Reading in the classroom.
LibraryThing (http://www.librarything.com) was developed by Tim Spalding; it is discussed later in this chapter in the “Sharing and recommending books” section.
4.1.1 Shared Focus
When people read together in a structured situation of the sort portrayed in Figure 4.2, they frequently refer to the materials they have read (and possibly annotated). This creates an immediate and pressing problem of shared reference: how does everyone in the discussion know that they’re looking at the same thing? Shared reference is a persistent problem, even given stable print editions, although the problem is exacerbated when the copies differ for some reason: a different edition of the work has been read, or an eBook has been reformatted according to display constraints. Thus, in the case of the reading group (a situation in which pagination is fixed and everyone is reading a copy of the same document), “sync” conversations were still required. In other words, reading group members still needed to know which part of the article the others were looking at, despite the apparent linear progression of the group through the document.
Why didn’t they just all refer to the same display? It would have been easy enough to project the article so everyone could focus on the same text as the discussion’s emphasis shifted. Interviews with members of the reading group revealed that they wanted to see their own annotations in full context while they were paying attention to the shared topics. One reader remarked:
They [his annotations] were all there in the physical document, and we’d gone from the beginning to the end anyway. And I wanted to get the context, so if somebody was saying something about something else, then I would have had it there.
Readers rely on complete context rather than disaggregated parts (Bishop 1998). Although there are shortcuts to sharing a view of a single document, these shortcuts would have removed the reading group members’ ability to see their own annotations and notes in context. Observation of the group while they were discussing the article also revealed that one person or another might briefly and deliberately shift out of sync with the rest of the group to check something on his or her own. Figure 4.3 revisits the details of Figure 4.2. Notice how one member of the reading group is looking ahead in the paper; other members of the group are looking at the first page, while he is looking further. After such an excursion, the group member seamlessly regains purchase on the conversation without disrupting it to ask, “where is everybody?” In this case, he need only glance at where the other group members are in the article because all copies look essentially the same.
In the classroom, matters were even more complicated. Rather than each student using his or her own copy of the same book, students (especially the English Literature graduate students) were each using editions of the text that they already owned (possibly from previous courses) or different editions that they had purchased secondhand (to save money). Sometimes, a shorter work used in the course was a small part of a thick anthology. Add to this mix several different electronic versions of the text, each reflowing according to screen or window size. Furthermore, some versions of the electronic text had been prepared in eBook format for Microsoft Reader and others in HTML for Web browser display. Obviously, page numbers (or in some cases, line numbers) were not a stable point of reference.
For the sake of argument, let us erase the colocated/remote distinction we made earlier and lump together distance learning with face-to-face classroom situations. What we really care about here is whether all of the attendees can see the same shared screen/page and whether they have one of their own.
FIGURE 4.3: One member of the reading group skips ahead while still following the discussion.
Thus, if we want eBooks to support reading together while still allowing collaborators to look at their own annotations or on-screen context (e.g., notes in a separate window), they need to have easy ways of communicating a shared sense of place in the electronic work. People need to be able to co-navigate to the same place in the text without losing their own sense of context. It is telling that the undergraduate class in the study made extensive use of the materials on the Pocket PC in class, while the graduate class did not. This use (and nonuse) may have been influenced strongly by the way the course materials were structured: the undergraduate class navigated the course materials together in class using the preset hypertext links available in the table of contents; the materials the graduate class used were not structured this way. Instead, they would have needed to search for key terms to co-navigate to the right place in longer texts; search was reported as too slow and too clumsy to use in this way.
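One way to give readers a shared sense of place when page numbers are unstable is to exchange a structural location (chapter and paragraph) and let each copy translate it into its own pagination. The sketch below assumes that every rendering of the work knows which paragraph begins each of its pages; Locator, Rendering, and page_of are hypothetical names, and nothing here describes how the Pocket PC study materials or any particular eBook format actually worked.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Locator:
    """A place in the work that does not depend on pagination."""
    chapter: int
    paragraph: int  # paragraph number within the chapter, starting at 1


class Rendering:
    """One reader's copy: records which paragraph begins each page."""

    def __init__(self, page_starts):
        # After sorting, entry i describes the first paragraph on page i + 1.
        self.page_starts = sorted((loc.chapter, loc.paragraph) for loc in page_starts)

    def page_of(self, loc):
        """Return the page in this copy on which the given place falls."""
        index = bisect_right(self.page_starts, (loc.chapter, loc.paragraph))
        return max(index, 1)
```

For example, if a print edition starts its pages at paragraphs 1, 4, and 9 of the first chapter while a handheld copy starts a new page at every paragraph, then Locator(1, 5) falls on page 2 of the former and page 5 of the latter. The group can announce one place and each member arrives at it in a personal copy, with annotations and surrounding context intact.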
4.1.2 Collaborative Search and Reference Following
One clear advantage of mobile devices like eBooks is that they offer the reader an opportunity to consult external references on the spot (Jones et al. 2000). Furthermore, recent study results have revealed that exploratory search effectiveness is enhanced by collaboration (Pickens et al. 2008).
For example, if the class chose to navigate to a place in the text using a key term, it would have to be the first use of the key term in the work for the strategy to be effective.
Taken together, these results might lead us to expect people who are reading together to launch into effective collaborative search sessions together, either for the sake of answering questions that have come up during the discussion or to explore new topics. Indeed, in our Pocket PC field study, the students (and faculty members) did just that. In this study, they were confined to using materials local to their devices for this type of search (the study predated ubiquitous wireless), but it is easy to envision broader on-the-spot research today. The undergraduate class used an extensive corpus of primary materials about the Salem witch trials, which had been preloaded onto each student’s Pocket PC at the beginning of the term; participants used the full course materials in classroom conversation to look up answers as questions arose. In a classroom session that covered Arthur Miller’s The Crucible, the discussion turned to a remark in the preface that referred to the hanging of two dogs at the Salem witch trials. The students immediately sought documentary evidence for the event; freed from the chronology mapped out in the syllabus, exploratory research evolved organically from classroom interest. Students were already anticipating the use of wireless connections in the classroom; they thought the kind of research that wireless access to the Internet would engender would make an interesting class even more compelling. As a related practice, we might expect eBook readers to follow references in this type of collaborative situation; reference following is discussed in Chapter 7.
4.1.3 Reading Together as an Informal Act
When people read in public places, it may be the occasion for social interaction. Figure 4.4b shows two junior high school students reading together in a computer lab (the observation was part of the Walden’s Paths project reported in Furuta et al. 1997). Each student began the class session in the computer lab at his or her own PC (as shown in Figure 4.4a); by the end of the session, most of the students had clustered together in dyads (or more occasionally, in small groups). This is an extension of the phenomenon Michael Twidale reports in his discussion of over-the-shoulder learning (Twidale 2000): solitary reading turned into a productive kind of reading together as the students helped each other out and shifted their joint attention from the tools to the content.
Over-the-shoulder learning may be essential to the adoption of electronic textbooks and other kinds of eBooks. In the Pocket PC study (Marshall & Ruotolo 2002), the students tended to teach each other strategies for using the new devices effectively. That is, rather than relying on the formal training resources offered to them by the Electronic Text Center, the students showed one another new ways to use the functionality that the Pocket PCs offered. Some of the learning contributed directly to the handheld’s use as a reading platform; other learning focused on the handhelds themselves (for instance, how to beam files to another handheld or how to download games). The students might actively seek advice, but they were also opportunistic, capitalizing on opportunities to learn better ways to do common operations. One student said that he’d learned to navigate directly to a later page in an eBook by interacting with a classmate a few weeks after he’d received his Pocket PC: “Until then, I had to flip through.” Another student reported that she’d learned how to do things incidentally by socializing with other people who used the device. She said:
It was things like closing applications. No-one knew how to do that, and my boyfriend was playing with it. And he said, “You have seven different things open. You might want to close them.” And so then I shared that with the class (Marshall & Ruotolo 2002).
Likewise, Figure 4.4c shows an informal session in which two colleagues spontaneously began reading together in an open area in our workplace. This photo, taken in the mid-1990s, is of WebTV when the product was still a novelty. First, one researcher set it up and began browsing; not long afterward, another researcher joined him. It was readily apparent that the addition of a second person changed the dynamic of how the first person read. The two colleagues continued to read together, negotiating where to go next when both were ready to move on.
Students, in general, viewed wireless access with great enthusiasm and anticipation, even when it was still relatively rare. For supporting evidence, see the work of Jones et al. (2000) on wireless tools in the stacks.
FIGURE 4.4: Examples of how people read together informally. (a) Students in classroom, one per computer. (b) Students began reading together. (c) A spontaneous gathering to read together.
4.1.4 Peer-to-Peer Sharing
Mobile devices seem to evoke a desire to reclaim the hand-to-hand physicality of sharing print publications. After all, it seems natural to share things by simply passing them from hand to hand, from one person to another. Then too, when documents are viewed as objects, meaning is conveyed by the way they are passed from person to person (Bowker & Star 1999). We also found this to be the case when the artifacts of reading are shared. In the Pocket PC study, students found the idea of beaming class materials to each other (via the infrared port) to be very compelling. Yet, in practice, peer-to-peer networking of this sort has been slow to materialize; instead, people have been using servers (i.e., “the cloud”) as an intermediary (Dearman & Pierce 2008). Still, peer-to-peer networking may be an interesting way to share material when people are physically proximate. In principle, it is easy to imagine co-workers syncing mobile devices so that they can share bibliographic updates to a local digital library. However, we should note that many DRM implementations preclude sharing eBooks at all; this vision of a shared local library will be difficult to realize under current restrictive policies.
Peer-to-peer networking also comes into play as a personal digital library infrastructure; syncing copies of collections among a set of personal devices allows people to read and access the same material on different devices (without relying on a central server). We can think of sharing with oneself as differing from the type of peer-to-peer sharing described above. For example, you may want to share your own progress through an eBook across several reading devices so you can leave off reading on your Kindle and pick up the same eBook on your laptop.
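Sharing progress with oneself across devices comes down to reconciling several records of where the reader left off. A minimal sketch follows, assuming each device keeps a timestamped position for each book and that the most recent record should win. Progress and merge_progress are hypothetical names, and this is not a description of how any commercial platform synchronizes reading position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Progress:
    book_id: str
    position: float  # fraction of the book read, from 0.0 to 1.0
    updated: float   # time of the last change on that device
    device: str


def merge_progress(records):
    """Return, for each book, the most recently updated record (last write wins)."""
    latest = {}
    for rec in records:
        current = latest.get(rec.book_id)
        if current is None or rec.updated > current.updated:
            latest[rec.book_id] = rec
    return latest
```

Choosing the most recent record rather than the furthest position is a deliberate assumption: a reader who flips back to reread an earlier chapter on the laptop probably wants the other device to open there too. A design that always kept the furthest point would suit linear, immersive reading but would work against the back-and-forth navigation of active reading.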
4.2 SHARING THE ARTIFACTS OF READING
Once we move into the two lower quadrants of the matrix, we are talking about asynchronous reading, i.e., reading the same material at a different time; hence, something must be shared beyond the view of the same page or the discussion that focuses people on the same place in the text. To be inclusive, we will refer to that something (or those somethings) as artifacts of reading, tangible records that persist across time and space. These artifacts include intentional records, such as annotations, clippings, bookmarks, notes, and other purposeful things the reader has created while she was reading, and implicit records of reading that have been recorded by the eBook software that include logged events such as page turns, scrolls, opening and closing books, mouse clicks, and so on. Often, implicit records are referred to as telemetry because they allow someone to measure or observe the reading activity at a distance or at another time (or both). Let’s examine these shared artifacts more closely because sharing them is by no means straightforward. People often regard their personal annotations as just that, personal, and they may go to some lengths to remove them before they share their reading materials. Often, the records of reading are considerably more private than the materials themselves. In the United States, libraries have a long tradition of keeping the records of reading—for example, the list of books a patron has checked out—private; this privacy policy is so important that librarians have fought for it.
The American Library Association has developed guidelines for library privacy policies: http://web1.ala.org/ala/aboutala/offices/oif/iftoolkits/toolkitsprivacy/libraryprivacy.cfm.
Yet—on the flip side—people sometimes want to know what other people are reading, especially if they hold the other reader in some regard (e.g., as an expert). Likewise, others’ annotations may be valued. What makes these artifacts private? Why would people want to share them? What makes these artifacts valuable? Are there ways of sharing artifacts of reading that may preserve privacy while giving others the virtue of their insight?
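One possible answer to the last question is to share only aggregates of the implicit records and to suppress any aggregate that rests on too few readers. The sketch below assumes telemetry arrives as (reader, book, page) events; popular_passages and the min_readers threshold are hypothetical, and a simple threshold like this is an illustration of the idea rather than a privacy guarantee.

```python
from collections import defaultdict


def popular_passages(events, min_readers=3):
    """Count distinct readers per (book, page), hiding thinly read pages.

    events is an iterable of (reader_id, book_id, page) tuples drawn from
    reading logs. Reader identities are used only for counting and never
    appear in the result.
    """
    readers = defaultdict(set)
    for reader_id, book_id, page in events:
        readers[(book_id, page)].add(reader_id)  # repeat visits count once
    return {
        key: len(ids)
        for key, ids in readers.items()
        if len(ids) >= min_readers
    }
```

The result tells a newcomer which pages many others have read without revealing who read them, which is closer in spirit to a bestseller list than to a library circulation record.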
4.2.1 Reading to Know What Other People Know One common theme in many information use studies is that people want to know what other people know: in other words, they want to be in sync socially. In our study of news reading (Marshall & Bly 2005a), we saw that people are often motivated by a social purpose to read newspapers; this desire manifests itself in a desire to find out what other people are talking about: A lot of times what will happen is we’ll start talking about something in the office and everybody’ll be like: “oh. Well I saw it on blah-blah-blah a few days ago.” So it would be nice to be able to go back and find that article (Marshall & Bly 2005a). In the workplace, reaching social sync is particularly important in maintaining relationships with customers and clients. We have seen this effect in several studies. For example, a study participant who worked in a law office and characterized himself as a frequent reader of NYTimes.com said: What I really do with the NYTimes.com is I scan the headlines, because we have different practices, we have different clients. I like to know when someone is in the news so we can respond. So I can get my attorneys prepared—so we can be knowledgeable . . . I know where to look for Markets. I know where to look for Opinion, because I always read that. And then I scan down—I have really demanding health care guys—I’m always looking in those sections to see who’s there and who’s not. Or [looking for] related articles on the industry. It’s really about seeing it, finding it, and moving on. And I see it all at once (Marshall & Bly 2005b). 
Awareness is the key here: the participant is reading neither for comprehension nor for details; rather, he is reading so that he knows “who’s there and who’s not.” In our study of the social use of clippings (Marshall & Bly 2004), we found that many clippings that are shared in the workplace (especially in customer-focused jobs) are passed around among co-workers for exactly this reason—to establish mutual awareness. Participants in the study frequently read periodicals with the idea that their customers read these same periodicals. The articles in these periodicals
might, for example, help them understand the customer’s situation; a senior sales manager described putting an article on his colleagues’ desks if the article was about a significant event—layoffs or promotions—in the client’s company. He felt no need to explain a clipping like this because the practice is so common and ingrained. Awareness may extend to making sure that the form of the articles is reproduced: when study participants shared clippings for this purpose, they made an effort to duplicate the way that customers would be expected to encounter the information. So if the client would see the article in a print newspaper, the clipping would be distributed as it appeared in print even if the online form of the article was more readily available.
It should be noted that they weren’t just worried about the form; online and print sources may have the same headline, but one or the other goes into more detail. If they clipped the article from different sources, they ran the risk of the content differing in subtle, but possibly important, ways.
4.2.2 Sharing Annotations
There has long been a tendency to idealize the annotations that people make while they are reading and to conceive of them in literary terms. In her “reading memoir” Ex Libris: Confessions of a Common Reader, Anne Fadiman begins with a rosy picture of shared annotations, “I have come to view margins as a literary commons with grazing room for everyone—the more, the merrier.” Some pages later, it is almost as if she has come to her senses and has peered inside one of these literary commons to see just who is grazing there and what they have left behind. Once inside, she succumbs to the harsh reality of the actual annotations:
Not everyone likes used books. The smears, smudges, underlinings, and ossified toast scintillae left by their previous owners may strike daintier readers as a little icky, like secondhand underwear (Fadiman 1998).
The shift from a literary commons to secondhand underwear may be warranted, if a trifle extreme. If we can share personal annotations, will we want to? What does it take to turn a thicket of personal annotations into a fabled literary commons worth visiting? Annotations can be less lucid than their author imagines; unfortunately, it is not uncommon for readers to make far more annotations—underlines and highlights, especially—than they remember.
EBooks (and electronic documents, in general) make it much easier from an implementation standpoint to share annotations and to move annotations onto different copies (or different versions or editions) of a document (Brush et al. 2001; Golovchinsky & Denoue 2002; Hong et al. 2008). Scholarly annotations and commentary may be shared intentionally; in fact, this layer of scholarship that exists in the space between reading and writing is thought to be an important step forward in promoting the dialog between reader and writer (Bolter 1991).
Digital documents also support the ready removal of annotations; as Chapter 3 discussed, some implementations maintain annotations separately from the base publication so they can be saved and not shared. Annotations may also be saved, stored, and not rendered; in other words, the annotations are both removed by virtue of being invisible and not removed by virtue of remaining in the file. This makes it very easy to share the annotations unintentionally. There was a legal case not too many years ago in which an attorney marked up a document for his colleagues. He turned off the rendering of the annotations to see what the final form of the document looked like, and without intending to, sent the document to the counsel for the other side with his firm’s annotations still intact. The sender was beyond embarrassment; he sought to assign blame and reprisal. This story is an apt illustration of the importance of the question of whether annotations should be shared: the answer is that they may be extremely powerful and useful when they are shared, and they may be extremely private too.
A look at the typical annotation on a print book or document reveals that many individual annotations are hardly worth sharing or further thought: they are a persistent record of the reader’s own engagement with the material and not much else. Sometimes the reader finds them useful down the line, but more often not. Figure 4.5 shows a familiar type of annotation: a reader’s spontaneous reaction to the material.
That the reader finds the content “Confusing to me” is probably of little use to other readers (although if every other reader annotated this paragraph similarly, we might think it was a difficult story; we address this case in Section 4.2.3). On the other hand, readers may use their personal annotations as a basis for what they share, modifying the annotation to make it intelligible to others (Marshall & Brush 2004). Figure 4.6 illustrates the relationship between the annotation a reader made for herself and the one she shared with her classmates. The annotation on the printed page (on the left) was very likely made in anticipation of how it would be used in the online discussion tool WebAnn (Brush et al. 2002), shown on the right. The example shown in Figure 4.6 makes annotation reuse seem straightforward. In practice, reuse and sharing are complicated by readers’ normal annotation habits. As Table 3.2 reminds us, most annotations are not immediately intelligible to other readers. Our study data reveal that only 9% of students’ personal annotations are complete and fully specified—i.e., they have a well-defined anchor (so the reader knows what the annotation pertains to) and an explicit body (Marshall &
FIGURE 4.5: A reader’s personal annotation. Does it make sense to share an annotation that says, “Confusing to me”? Perhaps it does in a situation of high trust. Perhaps the teacher should see the annotation and the student’s peers shouldn’t. Perhaps the reverse should be true (the student’s peers see the annotation and the teacher doesn’t). One can imagine many scenarios under which this annotation should be shared and many others in which it shouldn’t.
Brush 2004). It is relatively rare to encounter an annotation that is unambiguously meaningful to someone else without further explanation. Let’s refer to this study data to see what else we can learn about the relationship between personal and shared annotations. Remember that the students’ personal annotations might well reflect their particular circumstances. In other words, they marked on the papers because they knew they’d be responsible for summarizing and discussing them online. If we examine the annotations that were in the proper form (anchor+body) to use in WebAnn, we see that only about one-third were used in the online discussions (fewer than 20% of the unanchored marginalia and fewer than 5% of the anchor-only annotations found their way into the online discussions). Reuse of the anchor-only annotations was higher (but not much higher) in the summary-writing task, still under 20%. What we see isn’t surprising: both the body of the annotations and their anchors need to be revised before the annotations are suitable for use in a public discussion. These results cast some doubt on the idea that readers will simply make their personal annotations public; rather, they will need to revise them before they share them. This is especially true of handwritten annotations one would make with a pen–tablet user interface. If we assume that annotation habits carry over from paper to freeform digital ink (Marshall et al. 1999), then we can also assume that more than a handwriting recognition capability and “publish this now” button will be required to make personal annotations publicly intelligible, even among immediate members of a community or work group.
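The categories used in this analysis (anchor+body, anchor-only, and unanchored marginalia) can be made concrete with a small sketch. The `Annotation` record and the `classify` function below are hypothetical names introduced for illustration only; they are not taken from WebAnn or from the study's coding scheme, and the check is deliberately crude: it tests only whether an anchor or body is present, not whether the anchor is well defined or the body intelligible to others.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    """Hypothetical record of one mark a reader made on a document."""
    anchor: Optional[str]  # the text the mark is attached to, if any
    body: Optional[str]    # the reader's written note, if any


def classify(annotation: Annotation) -> str:
    """Sort an annotation into the rough categories discussed in the text."""
    has_anchor = bool(annotation.anchor and annotation.anchor.strip())
    has_body = bool(annotation.body and annotation.body.strip())
    if has_anchor and has_body:
        return "anchor+body"            # candidate for sharing as a comment
    if has_anchor:
        return "anchor-only"            # e.g., a bare underline or highlight
    if has_body:
        return "unanchored marginalia"  # a note with no clear referent
    return "empty"


# Illustrative data, not drawn from the study.
marks = [
    Annotation(anchor="many of the early systems", body="That may be true, but..."),
    Annotation(anchor="time-constrained clustering", body=None),
    Annotation(anchor=None, body="Confusing to me"),
]
labels = [classify(m) for m in marks]
```

Even an annotation that lands in the first category would, on the evidence above, usually need its body revised before it is shared; the classification only identifies which marks have the right shape to be candidates.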
FIGURE 4.6: A personal annotation is transformed into a sharable comment. (a) A student’s personal annotation on a published article. The annotation, which is anchored to a sentence in the article, reacts saying, “That may be true, but many of the early systems weren’t highly usable and usability is really taking hold.” (b) The corresponding annotation the student has shared with her classmates. It is anchored to the same sentence, and it says, “many of the early systems weren’t highly usable, yet we’re correcting that problem. VSD doesn’t examine how to fix past wrongs.”
That said, there may be other ways to share annotations. People may value experts’ annotations (Marshall 1997; Wolfe 2000), and there are systems that have been developed to support the sharing of annotations among the members of a workgroup or small team. Can we help people shop for annotations? Can we rifle through dozens and dozens of online annotations that represent the perspectives of many readers? Perhaps. But it may also be possible to aggregate annotations and use them that way.
4.2.3 Aggregating Annotations: The Wisdom of Crowds
Imagine that a class uses a particular textbook. Each student marks it in his or her own way. One student uses a pink highlighter. Another student uses four different colors: pink is for term definition, yellow means important, green means “I don’t understand this,” and blue means “this is something I picked out when I re-read the book.” A third student underlines while he rides the bus to campus; his underlines are messy and approximate. Yet another student writes careful notes in the margin, some while she’s reading the textbook the first time, and others while she’s listening in class. A fifth student rarely annotates at all. A sixth marks in only a few places: where he stopped reading and at the beginning and end of the assignment. This variation is fairly typical (Shipman et al. 2003).
What can we say about all these annotations? It seems safe to say that they represent a blip in the reader’s interest, or perhaps even that the reader thinks the text under his pen will be useful someday—on the test, in class—if not today.
Readers seldom remember making most of these marks, so it seems sensible to say that even if the marks are important, they are neither very important nor very reliable. Do readers’ annotations overlap more frequently than they would by chance? If one reader underlines 10% of the text, and another reader highlights 20% of the text, and the two mark independently of one another, we would expect them to select the same text about 2% of the time (0.10 × 0.20 = 0.02). Our initial calculations demonstrated that the text selections coincided a significantly greater proportion of the time than probability would predict (Marshall 1998). Furthermore, they overlap in interesting ways. We might surmise that the overlap occurs at topic sentences and pull-quotes, the text the writer, editor, or publisher thought was important. But this is not the case: the annotators converge on the text they think is important. The text that is marked the most frequently may be hidden in a long paragraph or toward the end of a section. These overlapping annotations may call attention to important points that would be otherwise buried within dense material. It seems as though there is a legitimate wisdom-of-crowds effect (Surowiecki 2004).
Let’s look at a specific example, the annotations that the six reading group members made on a page of the technical article that they were discussing. Because we observed their meeting and interviewed them about their reading practices, we know that they were reading with the same purpose in mind. Their backgrounds are similar, although their specialties are different (that is, several specialized in signal-processing; several others specialized in HCI). Hence, the overlap in their annotations will probably be interesting. Figure 4.7 shows an example of a place on the page where their marks converged. Reader 1 did not annotate anything in this region. In fact, he seldom annotates when he reads; his entire reprint is devoid of marks. Reader 5 marks sparsely and only made one mark very early on in the paper.
Hence, we wouldn’t expect either reading group member to have marked in the target region. Readers 2, 3, 4, and 6 made varying marks in the region. Without interpreting the specific marks, all we can say is that the phrase time-constrained clustering (and its immediate context) is of interest to the group. The set of equations themselves is of slightly less interest, perhaps because they are, to some extent, obvious to the readers. Observation of the group’s meeting demonstrates that this is the case. Different techniques may be used to identify overlapping anchors. Fingerprinting (Hong et al. 2008) and robust anchor specification (Brush et al. 2001) are two such techniques; other methods of identifying overlapping annotation anchors might rely on identical underlying texts (Schilit, Golovchinsky, & Price 1998a). There are many things we might do once we have identified these points of apparent communal interest: we can use them to bolster our characterization of what the paper is about (after
One member of the group led the discussion; this may mean that his or her marks reflected this extra responsibility.
FIGURE 4.7: A wisdom of crowds approach to aggregating reader annotations. (a) Reader 1’s annotations (none). (b) Reader 2’s annotations. (c) Reader 3’s annotations. (d) Reader 4’s annotations. (e) Reader 5’s annotations (none). (f ) Reader 6’s annotations.
all, we have identified terms that contribute substantially to the article’s relevance); we might use the identified text as a skimming aid (we can make this portion of the paper stand out to facilitate skimming); we can use the text we have identified in summarization or in visualizing the contents of the paper. In other words, that everyone singled out this text should be of consequence. Figure 4.8 shows how the results of this analysis might be used to support skimming (Marshall et al. 2001). It is easy to imagine how a researcher who joined the group late might look at the selected text to quickly catch up on past readings. This wisdom-of-crowds effect may be more pronounced as the number of annotators is scaled to larger communities. As a greater number of eBooks become available and as more people read and annotate on the screen, it becomes practical to explore whether this technique yields any concrete benefits. It is presented here as an example of how the artifacts of reading may be used in new ways that were not possible in the print world.
FIGURE 4.8: Showing the consensus of many readers’ annotations to facilitate skimming.
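The overlap analysis in this section can be sketched in a few lines. The code below is a minimal illustration, not the method used in the cited studies: it assumes the text has already been divided into numbered units (say, sentences) and that each reader's marks have already been resolved to unit indices, which is exactly the hard part that fingerprinting and robust anchoring address. The function names and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def expected_chance_overlap(fraction_a: float, fraction_b: float) -> float:
    """Fraction of text two readers would both mark if they marked independently."""
    return fraction_a * fraction_b


def mark_counts(marks_by_reader: Dict[str, Set[int]], n_units: int) -> List[int]:
    """Count how many readers marked each text unit (e.g., each sentence)."""
    counts: Counter = Counter()
    for marks in marks_by_reader.values():
        for unit in marks:
            if 0 <= unit < n_units:
                counts[unit] += 1
    return [counts[i] for i in range(n_units)]


def consensus_units(marks_by_reader: Dict[str, Set[int]],
                    n_units: int, min_readers: int) -> List[int]:
    """Return the units marked by at least `min_readers` readers."""
    counts = mark_counts(marks_by_reader, n_units)
    return [i for i, count in enumerate(counts) if count >= min_readers]


# Hypothetical marks for an eight-sentence region; two readers marked nothing.
region_marks = {
    "Reader 1": set(),
    "Reader 2": {3, 4, 5},
    "Reader 3": {4, 5},
    "Reader 4": {4},
    "Reader 5": set(),
    "Reader 6": {4, 5, 6},
}
chance = expected_chance_overlap(0.10, 0.20)    # about 0.02
hot_spots = consensus_units(region_marks, 8, 3)  # sentences 4 and 5
```

The units returned by `consensus_units` are the ones a skimming aid like the one in Figure 4.8 might emphasize. Choosing the threshold (here, three of six readers) is a design decision that would likely need to change as the number of annotators grows.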
4.2.4 Sharing Encountered Information
Over 40% of the examples we saw in our clipping study had been clipped with the explicit purpose of sharing them with someone else (Marshall & Bly 2004); every participant in the study had on hand at least one shared clipping. For our purposes in this chapter, we can think of this practice as sharing encountered information. But sharing encountered information is something of a misnomer. In the study, it became apparent that sometimes the act of sharing was more important than the specific content that had been shared (although the content was always at least somewhat relevant to an interest or a shared value, e.g., a shared sense of humor). The content itself was more likely to be important if the clipping was saved for oneself rather than shared with someone else.
Why, then, do people share encountered information if not for the information itself? This kind of material was shared for many reasons: to keep in touch or develop rapport; for mutual awareness; to educate the recipient; to strengthen social ties (usually by demonstrating knowledge of the recipient’s interests); or for some combination of these reasons. Of course, it is important to remember that the recipient may give an entirely different account of why they received something than the sender would give for why they sent it. The recipient may say, “he sent this to
me to show me that he’s thinking about me,” while the sender is thinking, “this might change her mind about the war in Afghanistan.” Certainly the perceived utility of the shared material differs: frequently the sender thinks he or she is sending something of greater value than the recipient finds it. Sharing a clipping to keep in touch is not unlike sending a greeting card. For example, a father sent his high-school-aged daughter newspaper clippings that covered current events (sometimes as often as several times a day); of this practice, his daughter said, “Sending me the article is like a little note . . . And it’s nice to feel like someone’s thinking about you. It’s his way of saying ‘hello’ during the day.” Needless to say, she usually read these articles on the screen and then deleted them from her email. In general, the content of this type of material, while it necessarily fell within the scope of both parties’ interests, was not regarded as valuable and therefore was seldom intentionally kept. Sharing for mutual awareness is a common practice in the workplace. Sometimes a company’s management will distribute an article that mentions the company to all of its employees. A financial consultant in the clipping study reported that he reads the Wall Street Journal and The New York Times several times a week to see if his company is mentioned; he described how the articles are distributed among employees: “Those are two papers our company shows up in a lot, so usually we get an email that says, ‘hey. We’re listed today. Check out the article.’ ” Again, these articles may not be regarded as having any persistent value, unless they are further shared with clients (usually by sales people who have frequent contact with customers); in these cases, the clippings may also be shared to educate the recipient. Sharing clippings for this reason—to educate the recipient—is also common. 
One study participant had a preadolescent son who had been diagnosed with Asperger’s syndrome, a form of high-functioning autism. She had photocopied a “really good” print article from Time magazine and kept copies in a file for exactly this purpose. She told us, “I mailed this to so many people. Because it was very, very good.” She described attaching a note to one of the copies—“You need to read this”—and sending it to her son’s teacher. Material may be clipped and shared with the intent of strengthening social ties. Content of this sort usually demonstrates not only a familiarity with the recipient’s interests but also the fact that the recipient is on the sender’s mind. In such cases, the sender may be well aware that the actual information in the clipping won’t be that useful to the recipient; it just has to be sufficiently on the mark to accomplish its goal. Sharing clippings does not seem to be a practice limited to a distinct category of person or distinct social type—Pettigrew, Durrance, and Unruh’s (2002) information brokers—but rather
As we have noted elsewhere, it is often easier to keep these things than it is to methodically and consistently cull them.
fulfills the somewhat overlapping functions discussed above. All of our participants not only shared information they’d encountered but also moved fluidly from acting as a giver to being a recipient. Findings suggest that the practice transcends age and specific context: it occurs both at home and in the workplace; it occurs when people read on the screen and when they read material in print. As such, it seems to suggest some specific eBook functionality. In a similar study, Rioux (2000) discovered that when people shared information they encountered on the Web, they had specific concerns. It is instructive to examine these concerns before we rush to develop new functionality for our eBook platforms. First, people often did not have the intended recipient’s contact information at hand. If the material included advertising, they wondered if the advertising would be sent along with the clipping. What would the clipping look like—would it mirror what they had seen on the screen themselves? Would there be long-term threats to anyone’s privacy as publishers tracked their interests and their friends’ interests? Would the clipping look like spam? Would it be perceived as impersonal? These findings underscore the importance of materiality: when people share published material, they want to know what it looks like and how it is excerpted. Consider, for example, clipping a photo from a newspaper article and sending it to a friend. Is there a difference between sending a photo and sending the entire article the photo is from? Have the photo credits been properly sent? Will the photo look as good on the recipient’s screen as it does on your own? Likewise, it seems to be important to have control over the mode of sharing. 
Consider the subtle differences among three modes of sharing a paragraph from an eBook: (1) embedding a clickable link to the paragraph in an email message; (2) excerpting the paragraph and putting it into the body of the message; or (3) sending the entire eBook as an attachment and using a bookmark to take the recipient to the desired page. In the first case, the recipient has to seek out the eBook, but has a message explaining the sender’s intent; in the second case, the recipient has the excerpt at hand, but no further context—the eBook’s authority is not conveyed; in the third case, the recipient has the entire eBook—and thus a substantial amount of context—but the extra material might be overwhelming. Naturally, it is up to the sender to determine her own intent and how to best convey that intent. In practice, people do consider these questions. A museum designer gave her own account of how she makes this sort of decision: My plan is to actually give it [a hardcopy of an article from Nature Online] to him [a project manager] and talk to him about it, rather than just put it in his in-basket because he’d kind
For the sake of convenience, let us say that the eBook is part of an online collection, for example, a shared library or an online resource like Google books.
of wonder where it came from or why he was getting it. And I’d rather say, “hey, I saw this online and it’s pretty interesting. Check it out” (Marshall & Bly 2004).
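To make the three modes of sharing a paragraph concrete, here is a small sketch of how an eBook platform might represent them. Everything in it is an assumption made for illustration: the `ShareMode`, `Message`, and `compose` names, the idea that a paragraph has its own link, and the use of a bookmark string to point into an attached file are not features of any particular platform.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ShareMode(Enum):
    LINK = "link"              # (1) clickable link to the paragraph
    EXCERPT = "excerpt"        # (2) paragraph pasted into the message body
    ATTACHMENT = "attachment"  # (3) whole eBook attached, with a bookmark


@dataclass
class Message:
    body: str
    attachment: Optional[str] = None  # hypothetical file name of the eBook
    bookmark: Optional[str] = None    # hypothetical locator within the eBook


def compose(mode: ShareMode, note: str, paragraph: str,
            link: str, ebook_file: str, bookmark: str) -> Message:
    """Build a message whose shape reflects the sender's chosen mode."""
    if mode is ShareMode.LINK:
        # Recipient gets the sender's note but must go fetch the eBook.
        return Message(body=f"{note}\n{link}")
    if mode is ShareMode.EXCERPT:
        # Recipient has the text at hand but none of its surrounding context.
        return Message(body=f"{note}\n\n\"{paragraph}\"")
    if mode is ShareMode.ATTACHMENT:
        # Recipient has full context, possibly more than they want.
        return Message(body=note, attachment=ebook_file, bookmark=bookmark)
    raise ValueError(f"unknown share mode: {mode}")
```

In every mode the sender's own note travels with the material, which matches the museum designer's concern that the recipient not be left wondering where a clipping came from or why it was sent.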
4.2.5 Information Brokering
Studies of how people share information have led to the idea of information brokering. Information brokers are people who routinely mediate access to online information. Current research on information brokering in a community network setting shows that the practice “fosters social cohesion” as people search for materials on “behalf of another person (e.g., relative, friend) and not always at that person’s behest” (Pettigrew, Durrance, & Unruh 2002). Research on Internet use has demonstrated that people find ways to share their expertise on topics and help more junior users as a way of building social capital and strengthening social ties within a community (but that very heavy Internet use may also weaken community commitment; Wellman et al. 2001).10 Hence, the jury is still out on the overall effect of finding materials on the Internet relevant to the interests of others. Our own past work has found that it may play an important—if supplemental—part in building social capital and strengthening social ties within a community (Marshall & Bly 2004). It is important to note that in our study, we focused on encountered information which may not be driven by information needs; that is, the information that is shared does not meet any stated need on the part of the recipient.
Information brokering has two possible implications for eBooks: (1) eBooks, or portions of eBooks, may be shared in response to a stated need; and (2) information that is encountered may be shared for any of the various purposes we discussed earlier in the chapter. In the latter case, an information needs framework is not the appropriate one for thinking about sharing; rather, we need to think about the effect the sharing has on social capital independent from the information the material contains.
10. This hypothesis that Internet-based interaction can strengthen, as well as weaken, a community’s social ties is one that developed as a response to Putnam’s famous “bowling alone” finding that public community participation has been on the decline in recent decades (Putnam 1995).
4.2.6 Sharing and Recommending Books
Recommendation is an important part of the social life of eBooks; it can take many forms and can rely on different sources of evidence. Books may be recommended explicitly, either from person to person, in much the same way as we have seen clippings recommended—that is, one person tells another, “Here’s a book you might like”—or to a wide audience, as is true on a blog or Web site. Books also may be recommended implicitly, via citation or quotation; that is, by citing a book or
quoting its author, we are saying, “go take a look at this work; it’s important.” Although not all citations are positive (i.e., you might cite an author’s work because you disagree with her), in some ways, just telling someone to look at a book for whatever reason amounts to a recommendation. It is much like the old show business adage that bad publicity is better than no publicity at all. Some recommendation is wholly content-based. That is, a recommender system might take some evidence of your interest—say, the text you’ve annotated or the last book you’ve read—and find you a book that has comparable content, either through text analysis or through a combination of text analysis and explicit citation (Schilit, Golovchinsky, & Price 1998a; Woodruff et al. 2000). Although this type of recommendation is not without a social component (both texts were written by authors interested in the same topic, and by interacting with one of the books, you were also showing interest in this topic), the recommendation is, by and large, implicit. Similarly, books might be connected with one another through shared quotations, because we can assume that sharing quotations means that works share key concepts (Schilit et al. 2008). Although the social component here is stronger (i.e., we are certain that the two authors looked at the same sources), the link between the two books is still inferred. Content-based recommendation has not been as successful as explicit recommendation, because social recommendation is often a more nuanced thing: a recommendation on a site like Amazon won’t just tell you that another reader liked a particular book; it will also tell you why the reader liked it and why you might like it too, beyond the topic simply being of interest. The reviewer might comment on less tangible aspects of the work such as mood, style, political leanings, and so on. 
And you might take the reviewer’s characteristics into account, including the reviewer’s reputation and how well the review is written. Recommendation may also be indirect—a topic is on the tip of everyone’s tongues for a day and a reader consults his favorite news sources to find out about breaking news. For example, a Times News Reader study participant told us:
It was a few days ago. Michael J. Fox was accused of [faking]. So I saw that, I was on AOL for a moment. So I just got the quick headline and I got the photo. And so I thought this was a perfect use for [the TNR]. So I came here . . . (Marshall 2007).
The reader knows what he wants to read about, how much he wants to read, and where he wants to go to read about it. It is no wonder that recommenders based solely on content have not been overwhelmingly popular.11
11. Note, however, that news aggregators have been very popular; it is telling that their popularity relies on their timeliness, not on the availability of specific topical content.
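For readers who want a concrete picture of what wholly content-based recommendation involves, the sketch below compares a piece of evidence (for instance, text a reader has annotated) against the text of each candidate book using word counts and cosine similarity. This is a generic textbook technique chosen for brevity, not the method of the systems cited above; the function names and the tiny catalog are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict


def term_counts(text: str) -> Counter:
    """Lowercase the text and count its alphabetic words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(evidence: str, catalog: Dict[str, str]) -> str:
    """Return the title in a non-empty catalog whose text best matches the evidence."""
    evidence_counts = term_counts(evidence)
    return max(catalog,
               key=lambda title: cosine(evidence_counts, term_counts(catalog[title])))


# Illustrative catalog: title -> a snippet standing in for the book's text.
catalog = {
    "Book A": "digital annotation anchors and documents",
    "Book B": "golf swing tips for beginners",
}
pick = recommend("annotation anchors in digital documents", catalog)
```

The sketch also shows the limitation discussed in this section: it can tell that two texts share vocabulary, but it knows nothing about mood, style, the reviewer's reputation, or why one reader thought another might like the book.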
Recommenders have historically relied on assigned ratings, coupled with each user’s reading behavior (Goldberg et al. 1992; Konstan et al. 1997). When readers read an article, they assign a numerical rating; their own interests are defined by what they read. This partitioning of users into communities of interest allows the recommendations to be tailored, to some degree, to individual interests and tastes. Other recommenders are based on reader actions; for example, most newspapers track which articles have been sent to other readers, which again takes a “wisdom of crowds” approach to recommending: surely the most popular articles are the ones you also want to read.12
The last few years have seen the rise of several important social networking sites for recommending books, eBooks, and other types of electronic publications. LibraryThing13 allows its users to catalog their own collections and, through these personal catalogs, meet other people with similar libraries or reading interests; the Web application bootstraps the cataloguing process by providing its users with records from multiple external sources such as the Library of Congress and Amazon. In essence, people are connected with other people through what they read. Zotero14 began by providing users with online bibliographic tools, not unlike EndNote and other reference management systems. Like LibraryThing, Zotero has grown in a collaborative direction, again pursuing the idea of connecting scholars through the references they use (but also allowing them to share reference materials through identifying common interests). Bookmarking services like Delicious15 (formerly del.icio.us) are also oriented to tagging, sharing, and managing Web-based material. In addition to supporting recommendation, Delicious uses bookmark popularity to recommend sites associated with specific topics.
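The rating-based approach described above can be illustrated with a very small sketch: predict how much a reader will like an unread item by averaging other readers' ratings of it, weighting each by how closely that reader's past ratings agree with the target reader's. The similarity measure used here (one over one plus the mean absolute rating difference) is a simplification chosen for readability; the cited systems may weight neighbors differently, and the names and data are hypothetical.

```python
from typing import Dict, Optional

Ratings = Dict[str, Dict[str, float]]  # reader -> {item: numerical rating}


def similarity(mine: Dict[str, float], theirs: Dict[str, float]) -> float:
    """Agreement between two readers over the items both have rated."""
    shared = mine.keys() & theirs.keys()
    if not shared:
        return 0.0
    mean_diff = sum(abs(mine[i] - theirs[i]) for i in shared) / len(shared)
    return 1.0 / (1.0 + mean_diff)


def predict(ratings: Ratings, reader: str, item: str) -> Optional[float]:
    """Similarity-weighted average of other readers' ratings for `item`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, theirs in ratings.items():
        if other == reader or item not in theirs:
            continue
        weight = similarity(ratings[reader], theirs)
        weighted_sum += weight * theirs[item]
        weight_total += weight
    if weight_total == 0:
        return None  # nobody comparable has rated this item
    return weighted_sum / weight_total


# Illustrative ratings on a 1-5 scale.
ratings = {
    "ann": {"a": 5, "b": 1},
    "bob": {"a": 5, "b": 1, "c": 4},
    "cy": {"a": 1, "b": 5, "c": 2},
}
guess = predict(ratings, "ann", "c")  # pulled toward bob's rating of 4
```

Because ann's past ratings match bob's exactly and disagree with cy's, the prediction for item "c" lands much closer to bob's rating than to cy's, which is the sense in which readers are grouped into communities of interest.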
Over the past decade, recommenders’ strategies have grown increasingly sophisticated, leapfrogging beyond the explicit ratings and reviews that they originally relied on. Recommenders now use multiple sources of evidence—both implicit and explicit—to characterize the popularity and topicality of material. Readers are not only connected with books; they are also connected with other readers through what they choose to read and what they cite as authoritative.
12. In the most general case of citation implying interest, we enter into the familiar territory of Web search, in which inbound links are taken as evidence of topical quality, a technique first suggested by Kleinberg (1998), but then popularized in the Google search engine. Obviously, taking on this broader topic is biting off more than we can chew.
13. http://www.librarything.com.
14. http://www.zotero.org/.
15. http://delicious.com/.
Chapter 5
Studying Reading
This chapter is about studying reading and studying electronic books in use. Why devote a whole chapter to studies? There are many excellent research methods resources that describe how to design and conduct studies, including texts on quantitative research design (Creswell 2009), qualitative methods (Marshall & Rossman 2006), qualitative studies in service of technology design (Sharp, Rogers, & Preece 2007), basic ethnography (Fetterman 1998), involving users in design (Brun-Cottan & Wall 1995)—in short, for every method imaginable for learning about human activity and human interaction with technology. Yet studying reading poses particular challenges. Reading is so essentially invisible and so commonplace that it is difficult to actually “see” it and to untangle it from the many assumptions that we make about it. Tzvetan Todorov, quoted by Nicholas Howe in Jonathan Boyarin’s compilation, The Ethnography of Reading, said of reading:
Nothing is more commonplace than the reading experience, and yet nothing is more unknown. Reading is such a matter of course that at first glance it seems there is nothing to say about it (Howe 1993).
Although there is much to be learned in the lab—how our eyes move across the page when we’re reading for comprehension, how typography affects our mood, how spatial memory works, and many other details of how we read—it is important to acknowledge all that we leave behind when we move into the lab. Thus, the main focus of this chapter will be on field studies, on finding out how people read in the wild. Watching people read isn’t easy and, in fact, has made for some of the most difficult field studies I’ve undertaken. Observing people read is by nature creepy. Once when I was involved in
I must hasten to add that this is the other Catherine Marshall, Catherine R. Marshall. Our identities are sometimes confused.
The reader should note that I have just provided examples of each kind of methods reference. There are many references in each area written in different styles with varying goals (practical how-to guides, more theoretical discussions of the rationale behind methods, and so on). Here’s where you will need to make some meta-decisions about what is important to you.
the design of an electronic magazine product, I decided I’d watch people read on the airplane as a precursor to a more formal observational study. I just wanted to get the lay of the land, so to speak. So, during a short flight from San Francisco to Seattle, I watched a businessman across the aisle browse the golfing magazine he’d brought on board with him. I thought I was being subtle, allowing myself to cast sidelong glances at him across the aisle. He left his seat only once toward the end of the flight to use the lavatory, so I had over an hour to watch him and take notes. For some reason, he didn’t continue reading when he returned to his seat after his trip to the lavatory, so I focused my attention elsewhere.
After the flight landed, when we were gathering up our belongings to get off the plane, he confronted me. “You stole my magazine,” he said. He had an accent.
I didn’t know what to say. “No, I didn’t,” I finally said.
“Open your briefcase and let me look,” he said.
I was embarrassed and offended. How dare he! But after all, I’d been surreptitiously observing him for the duration of the flight. He must have sensed something was amiss. I sheepishly opened my briefcase so he could see for himself that I did not have his magazine. Even after he looked, I could tell he was unconvinced.
Hence the hazards of watching someone read: there’s something creepy about it. Creepy and important.
Simply asking people how they read and interact with their reading materials is not only hard—these are unselfconscious activities and people don’t always know what they do—but also misleading. People try their best to answer the interview questions and, in so doing, may lead the interviewer astray. When Sara Bly and I studied clipping, if we had asked people how they clipped at the outset of the study, we would’ve missed all of our participants, because they each denied that they did it at all.
Furthermore, if we assume that the rationale for these activities is self-evident, or that people have good insight into why they do some of the things that they do, we might miss valuable results. For example, if you ask a researcher why he writes in the margins of the articles he is reading, he might tell you, “Some people just scribble whatever in the margins or use highlighters, but I actually
I assumed he was a businessman because of the way he was dressed. He could’ve been something else. There is an intrinsic danger to allowing stereotypes be one’s guide, but further conversation with him also seemed unwise. You’ll see why.
write useful things there. Important things. New insights.” If you glance at his books, you might indeed see that he has written in the margins and seldom uses a highlighter. However, if you look more carefully at what he’s written, you might see that he has simply echoed the words that are in the text. In other words, what he’s done is similar to highlighting and not at all the novel insights he believes them to be (although perhaps the process of writing this marginalia is cognitively useful to our reader). It’s important to use multiple sources of evidence. Of course, we usually have more to our agenda than simply learning about reading. Many of us plan to bring our insights to bear when we’re designing eBook hardware and software. We want to get the paperlike and social functionality correct (as we’ve described in Chapters 3 and 4) as we preserve the essence of the activity. But we also want to use this knowledge about reading when we design advanced functionality that will make people want to read on a computer. Certainly we can say that you’ll be able to fit your entire library on a portable device, and that’s a powerful idea, but we also want to be able to do things readers and scholars haven’t been able to do before. For these new capabilities to make sense, we have to know as much as we can about the reading-related activities we are trying to support. This chapter discusses several common types of lab and field studies that are used to examine reading. Lab studies are often (but certainly not always) quantitative and concerned with applying formal methods in controlled situations to arrive at statistical descriptions of the phenomena under investigation. Field studies, on the other hand, seek to describe the phenomena in the wild, and usually produce qualitative results. Sometimes, field studies are motivated by relatively narrow research questions, e.g., how law students annotate in their casebooks. 
More often, they start with fairly open-ended questions: How do people read (or perhaps even use) magazines and newspapers? Needless to say, a study may combine a variety of methods and may result in a combination of qualitative and quantitative data. But for the sake of clarity, this chapter pigeonholes the contrasting types of studies to give you a starting point for learning how to conduct them. I focus primarily on field studies for the pragmatic reason that they reflect my own methodological commitments. Thus, this chapter should be regarded as a starting point; general references should be consulted before you try to undertake a study yourself. In the end, it takes time—and a series of minor mistakes—to really learn how to do effective studies.
This hypothetical story closely parallels an experience I had while interviewing an epidemiological researcher in the field.
Of course, you should try to avoid beginners’ mistakes. But you shouldn’t be overly harsh on yourself should you make one: everyone does. Everyone screws up a recording, for example, or accidentally asks a leading question.
5.1
TYPES OF STUDIES
The most important thing to do before you embark on any study (whether it is about reading or anything else under the sun) is to make sure your research questions are clearly articulated. Some methods also suggest that you state your going-in hypotheses, but research questions are the bare minimum. In general, it’s always wise to know why you’re doing a study. Having a research question or research questions in mind allows you to decide what type of study you want to do and what it will take to produce convincing results.
In practice, most of us have favorite methods of data gathering and analysis, but even if we do, there is often some latitude in defining the study’s particulars. For example, you might be inclined toward field studies, but the kind of activity you’re studying and the lens through which you want to look at it (suppose you’re interested in building a taxonomy of types of reading people engage in throughout the day) suggest that you choose a diary study rather than an interview study. Often, methods are combined to good effect, and a diary study is accompanied by an end-of-the-day debriefing interview (e.g., see Adler et al. 1998).
We will start this section with a brief look at lab studies; these are usually characterized by controlled experiments that take place in a lab and result in a series of measurements. In studies of reading, this often means that the participants all read the same material, prepared in advance, and that conditions are carefully chosen so that most aspects of the situation are controlled while one thing is varied at a time (to ensure meaningful results). Because there are so many good reference books on this topic, this description will emphasize a few aspects of quantitative studies of reading in the lab, just for the purpose of contrasting them with field studies.
We go on to discuss several different methods for conducting field studies, including a specific example of how you might perform a field study to examine a particular reading-related activity or an instance of eBooks in use. Field studies are over-represented in this chapter because I believe they can play a central role in designing eBook hardware and software. Field studies usually involve people reading what they would normally read where and when they would normally read it. Often, these studies yield qualitative data—lots of qualitative data—that is useful for guiding design. These studies should not be confused with market data or the anecdotal field reports that are used in marketing; data analysis
Certainly, some methodologies, such as ethnomethodology, prefer that the researcher go in with as few preconceptions as possible so as to approach the research in a data-driven, bottom–up way, but at this stage, articulating your research questions clearly is not a bad idea.
and methodological rigor should allow you to have as much confidence in the results of qualitative studies as in those of any lab study. Each type of study is valuable for revealing different sorts of insights about reading-related phenomena. The purpose of this chapter, then, is to help you formulate research questions, to sort out what type of study will give you the data necessary to answer your questions, and to understand the ways in which each type of study might go wrong: the pitfalls in each method.
5.2
QUANTITATIVE/LABORATORY STUDIES
Generally, when we talk about quantitative studies in reading, we are talking about laboratory experiments, and the disciplinary standards we appeal to in presenting results come from psychology (although they can also come from physiology or neuroscience, since reading is a physical and cognitive activity). We can envision situations where quantitative techniques may also be derived from social psychology or sociology, e.g., when we start branching out and exploring how reading materials are shared socially. For the sake of brevity, this section will focus on two common reading-specific aspects of quantitative studies: performance metrics for reading and eye-tracking.
5.2.1 Performance Metrics for Reading
Quantitative studies require careful thought about metrics. In the case of reading, we need to contemplate reproducible measurements that may be used to characterize reading performance. Most psychologists who study reading rely on measures of speed and accuracy, but there are others that are fairly common. Accuracy metrics vary: they may involve comprehension, memory, or recognition and may be couched in tasks such as proofreading. Choosing common measurements and designing controls carefully makes it possible for researchers to compare the results of different studies.
Dillon (1992) differentiates between outcome metrics, such as speed and accuracy, and what he calls process metrics, which measure the efficacy of navigation or manipulation of the material. Process metrics tend to involve recording data while the subject is reading; these metrics thus focus on recognition in terms that are measurable with today’s equipment (e.g., fixation duration).
Not all measures are objective. Subjective satisfaction with a reading experience or some part of the reading experience (e.g., page layout, typeface aesthetics, or anti-aliasing effects) is a fairly common metric. Sometimes, this subjective measure is approached from a negative perspective in terms of reported fatigue, tension, or eyestrain. If the evaluation instrument is a questionnaire, a Likert scale is often used to elicit subjective satisfaction data on a specific issue. Naturally, when
you are designing subjective measures like this, there is a trade-off between how cognitively taxing they are for respondents (choosing among five levels on a Likert scale is much easier than choosing among 10) and how much information they yield for you (will a 10-level Likert scale give you meaningful distinctions?).
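As a rough illustration of the outcome and subjective measures described above, the following Python sketch computes a reading speed, a comprehension accuracy, and a summary of five-level Likert responses. The function names and all of the numbers are hypothetical; they are assumptions made for the example and do not come from any study cited here.

```python
from collections import Counter
from statistics import median


def words_per_minute(word_count, seconds):
    """Reading speed: words read divided by elapsed minutes."""
    return word_count / (seconds / 60.0)


def accuracy(correct, total):
    """Fraction of comprehension (or proofreading) items answered correctly."""
    return correct / total


def summarize_likert(responses, levels=5):
    """Return the median rating and a count for every level from 1 to `levels`."""
    counts = Counter(responses)
    distribution = {level: counts.get(level, 0) for level in range(1, levels + 1)}
    return median(responses), distribution


# Hypothetical data: one timed passage, one quiz, one questionnaire item.
speed = words_per_minute(word_count=600, seconds=180)
score = accuracy(correct=8, total=10)
middle, spread = summarize_likert([4, 5, 3, 4, 2, 4, 5])

print(speed, score, middle, spread)
```

Reporting the full distribution alongside the median is one way to see whether a finer-grained scale is actually producing distinctions or whether respondents are clustering on a few levels.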
5.2.2 Eye-Tracking
For some researchers, the first thing that comes to mind when they think about reading is eye-tracking. Eye-tracking results in measurements that record what the subjects were looking at, how long they were fixated on a given point, and how quickly their eyes were moving. In eye-tracking parlance, reading consists not of a continuous smooth movement along the lines of text, but rather of a series of brief fixations, interspersed with saccades, which are directed and coordinated movements of the eyes. This sequence of repeated fixation and saccade may be interpreted to determine how letterforms and words are recognized.
Eye-tracking has come a long way from the awkward and bulky apparatus first introduced at the beginning of the 20th century and more or less continuously refined since then. Because eye-tracking predates the personal computer (indeed, it was famously used by Fitts and his team for cockpit design and analysis in the 1950s), researchers have used it to study HCI and user interface design—with varying degrees of success—for the entire short history of the field. Tinker’s landmark work on reading, beginning in the 1930s, was predicated on the ability to track and study eye movements under a variety of typographic and layout conditions (Tinker 1963). Rayner (1983), a researcher at the University of California, San Diego, subsequently performed an extensive series of eye-tracking experiments to study reading and other reading-related phenomena. See Jacob and Karn’s (2003) fascinating history of the use of eye-tracking in the HCI discipline for additional details about how this technique and apparatus evolved and has been used over the last century.
Fans of eye-tracking point out that the particular technical, analytic, and physical issues that have rendered the technique less than optimal for studying reading are slowly being addressed.
For one thing, eye-trackers are now commercially available and inexpensive; there is a large cadre of researchers with expertise in using them. They interfere less with reading than they used to since the head-mounted units are small and no longer interfere with free head movement; there are also eye-trackers that may be installed in front of the reader (thus, adjacent to the computer screen he or she is looking at) instead of on the reader’s head. Today’s eye-trackers operate by recording reflections of infrared light as they return from the reader’s cornea and retina. Analytic techniques have progressed as well, and now there are good software tools for extracting the relevant measurements. Useful tracking metrics have been developed, and a community of researchers who use the technique has grown up over time so the work can be reviewed and critiqued.
The determined researcher can build his or her own eye tracker for as little as a few hundred dollars. See Babcock and Pelz (2000) for practical plans for constructing an eye tracker out of readily available materials.
However, using eye-tracking to make sense of reading remains problematic for several reasons. Interpretation of eye-tracking data is still something of an art; correlating what is on the screen with the eye-tracking data can be tedious; mapping low-level eye-tracking data onto higher-level cognition is by no means straightforward; and certainly, laboratory conditions for eye-tracking require some important constraints on how people read.
Most importantly, eye-tracking is but one window onto reading. While it provides important data about some aspects of reading—word and letter recognition, most importantly—it has not shed as much light on how people read in the wild, where they often make no effort to read carefully, or understand all parts of the material equally well. Instead, eye-tracking has been used to validate phenomena identified through other data sources. For example, in his often-cited 1997 Alertbox column titled How Users Read on the Web, usability expert Jakob Nielsen declared that users don’t read on the Web, adding:
People rarely read Web pages word by word; instead, they scan the page, picking out individual words and sentences. In research on how people read websites we found that 79 percent of our test users always scanned any new page they came across; only 16 percent read word-by-word (Nielsen 1997).
The research Nielsen cites used activity logs in which Web browsers of 25 users were instrumented to collect data about everyday activities in the wild. Even here, it is not clear how these findings relate to electronic books: does it matter if the material in question is a textbook? What if the instructor has been careful to emphasize that the material will be on a test? Does it matter that the textbook was borrowed, not purchased?
Does it matter that the student has a “B” average, and this test won’t change his grade? It is easy to see that many factors influence reading when it is taken out of the lab. Thus, it is important to integrate our very fine-grained understanding of how people read letters and words—and, generally, to look at elements of the page—with a more nuanced understanding of the social and material circumstances of reading. Triangulation among multiple methods is a good way of achieving this integration.
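To make the fixation-and-saccade description in this section more concrete, here is a minimal Python sketch that groups raw gaze samples into fixations using a simple velocity threshold, one common way of separating the two. This section does not specify an algorithm, so the sample format (time in milliseconds, position in pixels), the threshold value, and the function name are all assumptions made for illustration.

```python
import math


def detect_fixations(samples, velocity_threshold=1.0):
    """Group gaze samples into fixations with a velocity threshold.

    samples: list of (t_ms, x_px, y_px) tuples sorted by time (assumed format).
    velocity_threshold: pixels per millisecond; movement at or above this
        speed is treated as a saccade (an illustrative value, not a standard).
    Returns a list of (start_ms, end_ms) pairs, one per fixation.
    """
    fixations = []
    start = None
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        speed = math.hypot(x1 - x0, y1 - y0) / (t1 - t0)
        if speed < velocity_threshold:
            if start is None:
                start = t0          # a new fixation begins
        elif start is not None:
            fixations.append((start, t0))  # a saccade ends the fixation
            start = None
    if start is not None:
        fixations.append((start, samples[-1][0]))
    return fixations


# Hypothetical samples: the eye rests near x=100, jumps, then rests near x=200.
gaze = [(0, 100, 100), (10, 101, 100), (20, 102, 101),
        (30, 200, 100), (40, 201, 100), (50, 201, 101)]
fixations = detect_fixations(gaze)
durations = [end - start for start, end in fixations]
print(fixations, durations)
```

Fixation duration, mentioned earlier as an example of a process metric, falls directly out of this grouping. Interpreting what a long or short fixation means for comprehension is the harder problem that the text describes.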
5.3
FIELD STUDIES
Field studies are a valuable way of learning about how people go about their work (and their play) in the wild. In other words, instead of bringing users (or potential users) into the lab, you go out and study them in their own settings. For reading, this can mean interviewing students in a university classroom or in their dorm rooms; it can mean setting up a video camera so that a reader can record
herself thumbing through The New Yorker on her living room couch; it can mean observing people reading magazines, novels, and business documents in airport lounges. A field study does not prescribe a particular data collection regimen or analytic framework, but rather provides you with a set of methods for getting at what people really do. The results of field studies can give you valuable insights into human activities; these insights can, in turn, inform design, give you new ideas or shape inventions, or just help you refine your understanding of what people do.
The exploratory nature of some field studies enables us to identify aspects of a technology design situation that we might not have conceived of if we’d jumped right into a laboratory study. In the absence of field studies, technology designers and developers often rely on introspection (thinking about what they do themselves); exposure to the full demographic range of people you expect to use a technology is often an eye-opener. In fact, at one time, many everyday applications—such as e-mail, document preparation systems, text editors, and the like—were developed with the idea that software professionals could easily reflect on their own practices and make design decisions accordingly.
The assumption that introspection will tell you everything you need to know (especially if the activity is something you engage in regularly yourself) is simply not true, even for activities as basic as reading. First, it is unlikely that you are representative. Many of us who have spent a considerable amount of time in the field have learned that the participants in our studies have different practices than our own and have developed surprising work-arounds for the technology they use routinely.
Going out into the field means that we don’t just learn about the isolated activity we’ve set out to study; we also learn about the fragility and robustness of infrastructures as well as the priorities of different workplaces, different home environments, different professions, and different avocations. In the case of eBooks, this might mean that we learn something about what else students want to read if we give them their course materials on an eBook platform. Or we might learn about what other applications they want to run on the device we’ve given them. We may even learn that they aren’t willing to carry both an eBook reader and a laptop computer.
The other important drawback with introspection is that it is unlikely that you know what you really do; this is one of the main justifications for looking closely at the practices of others. Quick: What’s the last thing you read? Where were you when you did it? Did you read all of it? In order? Chances are you won’t remember the fine points of your last reading experience. In fact, you’ll probably find yourself wondering, “what counts as reading?” Working out these questions is part of the process of developing a good study.
I’m going to talk about field studies in this section without using the word ethnography. When used properly, ethnography refers to the study of human activities in a way such that a full account of their context is provided in the form of “thick description.” Recently, the term ethnography has been appropriated by HCI practitioners and equated very generally with any kind of interview study. To me, this feels a bit sloppy, but others may be comfortable with this expansive use of the term.
As I said early in this chapter, reading and reading-related activities (e.g., annotation or clipping) are difficult subjects for field studies. Reading is essentially invisible and unselfconscious (i.e., people usually don’t remember the details of the last time they read a book or magazine). Reading involves varying degrees of mobility: it’s not just that people read on buses and planes; they also might move to a different place in their offices to read (e.g., a more comfortable chair or a place where there is better lighting). Although reading is social—after all, communication is social—much of what we want to study remains implicit in a social situation.
Do field studies limit innovation? One of the more pernicious beliefs about field studies—besides dismissing them as unscientific—is that field studies hamper creative development. There is a misconception that going out into the field invariably involves asking people what they do and what kind of technology they want. This is simply not true: a good user researcher seldom just asks people what they do—as we’ve already learned, they often don’t know what they actually do—or even what they want (although this is sometimes a helpful elicitation vehicle for learning about them). Instead of limiting innovation and invention, we have found that field studies actually serve to guide innovation and make any intellectual property that is developed more valuable as a result.
Innovation that solves a real problem and technology that is designed mindful of the entire use context are bound to be more useful and relevant than technology that is designed by simply asking “what if” and pushing invention the next step forward. One of my colleagues once told me that automated teller machines (ATMs) would have never been invented if field studies had been used to guide the innovation process. To which I’d say, if field studies had been used, perhaps ATMs would have been developed years earlier because of unsatisfactory work-arounds for getting cash outside of normal banking hours.
One of the most common (and most difficult) push-backs that a fieldworker can encounter is that field work is simply too time-consuming, that too much data is gathered, and that it’s too intractable. Of course, one easy way to counter this argument is to say that a field study can be designed to match the depth of the focal questions, and questions that are relatively easy to answer will not require as much time and resources in the field, nor will they require as extensive an analysis. On the other hand, the data we gather in the field may be mined multiple times as we learn more about a domain or delve more deeply into a technology. Observational data, in particular, may yield surprises when you look at it again. Furthermore, a good fieldworker will notice all kinds of things beyond the focal questions, which may suggest additional areas of inquiry further on in the design process or in the design of a future, related technology.
With all these things in mind, let’s look at a few common types of field studies. I have selected types of studies that reveal things about reading rather than trying to enumerate the entire bag of tricks most field researchers develop.
5.3.1 Interview Studies
Interview studies are the most common kind of field studies performed by field researchers. Interviews may act as a supplement (and complement) to other data-gathering methods. Section 5.4 will lead you through two examples of eBook-related studies—a reading-related practice (clipping) and an eBook technology deployment (Microsoft Reader on Pocket PCs)—that are based for the most part on interviews.
Interview studies are very different from surveys even though they seem related. To conduct an interview study, the researcher goes into the field to talk with the informant in the place where the activity normally occurs; while this is straightforward for some kinds of office work, it might be less so when studying a reading-related activity because reading is so intensely mobile. Hence, interviews might be conducted in the place where the participant does other kinds of reading-related work (e.g., writing) or where the participant uses other resources (e.g., a favorite couch or a home office).
Interviews are generally semistructured (i.e., they follow a general prescriptive structure) and open-ended (i.e., the interview is shaped to some extent by what the interviewee says). How much the interview can veer off course into uncharted territory depends on the interviewer’s preferences and various study constraints (e.g., how long the interview can take). Because reading generally involves particular artifacts—print books, computers, notebooks, and so on—we say that the interviews are artifact-centered and are often conducted using the artifact to elicit certain elements of practice. For example, if you are studying annotations, it is important to have some of the participant’s recent annotations at hand and not just go on the participant’s (probably inaccurate) memory of what the annotations are like.
Interviews can be tailored to conform to a number of methods and are usually coupled with an analytic framework.
In “Data Analysis,” we discuss these methods briefly. Regardless of the method, interviews are usually recorded (using either an audio or a video recorder) and transcribed for further analysis. A field researcher may videotape interviews to capture elements of the context, especially if the researcher feels she can record aspects of interaction. Often, audio-only interviews are supplemented with photographs of the settings or artifacts under discussion; photos can be very useful reminders of what you looked at during an interview and may be essential for resolving cryptic references during transcription and analysis. Because digital photos can be taken at high resolution and zoomed to recover details, they can add a whole new dimension to your data.
5.3.2 Diary Studies
A diary study is a study in which the researcher asks the participant to record critical events that the researcher is interested in. Often, the researcher supplies the participant with a structured form so the participant can readily record all of the relevant aspects of the incident. Diary studies are particularly useful when the activity of interest occurs at unpredictable times (and possibly in unpredictable places), making them a good candidate for studying something as ubiquitous as reading.
For example, in a diary study of reading, such as the one performed by Adler et al. (1998), participants were asked to record their interactions with any sort of document during the course of their working day. Instead of making their participants judge what should be included as reading (which might cause them to exclude events of interest to the researchers), they were asked to log all kinds of document activity and the activity’s duration. At the end of each day, a researcher called each participant to elicit more details of the interactions through structured interviews.
Sometimes researchers avoid using structured forms if they feel that writing something down is apt to interrupt what the person is doing or is simply too burdensome for the kind of people they have recruited for the study. Instead, the researcher may leave a camera, audio recorder, or some other kind of appropriate recording device (e.g., a video camera) to capture the critical events. Often, a follow-up interview is necessary to complete a full description of the events and to fill in missing details.
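One way to picture the structured form described above is as a small record with one field per thing the participant is asked to write down. The Python sketch below is only an illustration: the field names, the sample entries, and the wording of the debriefing prompt are hypothetical and are not taken from Adler et al. (1998) or from any other study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DiaryEntry:
    participant: str
    start: str      # time of day as the participant wrote it, e.g. "09:15"
    document: str   # what was being used, in the participant's own words
    activity: str   # what was done with it, e.g. "skim", "annotate"
    minutes: int    # self-reported duration


def minutes_by_activity(entries):
    """Total the self-reported minutes for each kind of document activity."""
    totals = defaultdict(int)
    for entry in entries:
        totals[entry.activity] += entry.minutes
    return dict(totals)


def debrief_prompts(entries, participant):
    """Build one end-of-day follow-up question per logged event."""
    return [
        f"At {e.start} you logged '{e.activity}' with a {e.document}. "
        "What were you trying to do?"
        for e in entries
        if e.participant == participant
    ]


# Hypothetical entries for a single working day.
entries = [
    DiaryEntry("P1", "09:15", "printed memo", "skim", 5),
    DiaryEntry("P1", "11:40", "journal article", "annotate", 35),
    DiaryEntry("P1", "15:05", "reference manual", "skim", 10),
]
totals = minutes_by_activity(entries)
prompts = debrief_prompts(entries, "P1")
print(totals)
print(prompts[0])
```

Keeping the activity field open rather than restricting it to "reading" mirrors the design choice described above, in which participants log all document activity and the researchers decide later what counts.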
5.3.3 Observational Studies
It is helpful to observe study participants in the field. This allows you to become familiar with the participants, their environment, the artifacts they use, and all kinds of communicative and collaborative practices that are largely implicit (and possibly invisible) to the practitioners. Like most methods, observational studies yield results proportional to the effort you expend. In other words, if you spend a lot of time in the field, you can become very familiar with the practices of a community. Modern recording technology allows you to capture what you are seeing for subsequent analysis (Suchman & Trigg 1991) and for use in cooperative design cycles, as well as to communicate aspects of use to developers (Brun-Cottan & Wall 1995).
At first blush, this sounds ideal, but reading poses a particularly problematic set of observational circumstances. Reading is in some ways very intimate: people read in bed, in the bathtub, when they are alone, and when they can’t sleep. Reading is also hard to capture: people read on
Indeed, one important problem with diary studies is that you must give participants bright-line guidelines on what you want them to log. It can be very difficult for participants to decide what counts as the activity under investigation, especially if the activity is something invisible like reading.
buses, while they are waiting, intermittently. The materiality of reading is important: whether someone is on a couch with a laptop on her lap or on the floor with the Sunday Times spread out in front of him is of consequence. And as we pointed out earlier in this chapter, watching someone read can be interpreted in a variety of ways. Although technologies like eye-tracking and fMRI have become less cumbersome, there is still a point to doing observational studies in which people read as they normally do.
The study described in Marshall and Bly (2005b) is an example of how you might apply this method to answer a particular set of questions about navigation. In this study, participants videotaped themselves reading a current issue of a weekly magazine, The New Yorker, when and where they normally would. We subsequently viewed the videotapes to log different kinds of reading-related activities including navigation, manipulating the medium (e.g., folding back a page so only two columns of text are visible), and lapses of attention (e.g., talking to a roommate or dozing off). We also captured peripheral activities like reaching for a drink, shifting position, and face or head scratching, as well as the way the participant held the magazine (e.g., one- or two-handed). In the logs, we noted whether the study participant was scanning or merely glancing at a page, what page he or she was looking at, and what article, feature, cartoon, or advertisement was the apparent focus of the participant’s attention. In short, we made every effort to fully describe the reading sessions so we could return to the video segments that exemplified different kinds of navigation or different types of physical interaction with the magazine.
Unfortunately (or perhaps this is exactly what we should expect if someone is reading as they normally would), participants’ reactions to recent video snippets of themselves reading demonstrate that they are not confident they remember what they were doing. Figure 5.1 contains a snippet of an interview transcript; there are three people in the conversation, the participant (P) and the two field researchers (R1 and R2). The three of us have just watched a video excerpt of the participant reading The New Yorker. The participant begins by trying to describe what she was doing; by the time we are reviewing the snippet of her reading session with her, it is fairly clear that she does not remember it very well and cannot fully reconstruct the motivations for her actions. It is also apparent from the transcript that her self-stereotype about how she reads the magazine is contradicted by her actions, thus underscoring the value of close observation.
P: I often see, you know, like where the poems are, and go to them directly. Or the fiction directly. Or, like I said, if there’s a poem I want to read a review of, I might go to that. You know, it’s just something that catches my eye.
R1: In the taping, you did not do that. You didn’t go from here to any particular place. Can you tell us why—do you remember why?
P: I don’t remember why. [laughs] I don’t know if it was just—I think I was being more thorough . . . I think I may have gone to that little independent bookseller section.
R1: You haven’t come to that yet at this point.
P: But what’s that then?
R1: That’s this page. So you’re sitting here . . . And then you flip to here. So you’re reading here.
P: Oh, okay.
R1: You appear to be reading here.
P: Right.
R1: Then you flip to here. Then you turn the page to here. And then you turn the page to here.
P: Oh, no wonder. [yawns] That caught my eye.
R1: Okay, so watch it again, would you? And see if you can recollect why you ever went backwards. From here. Because you had already come through it forwards.
P: Okay. I had?
R2: Yeah, you’d been on that page before. [watching]
P: Oh, god. I probably wanted to read that—I probably must have not have noticed that, and I probably wanted to read that. I’m not sure though.
R1: So if you look at this, you don’t remember what might’ve prompted you to go back a few pages. Yeah. Go ahead and do it again. [watching]
P: [laughs a little] I don’t know. [pause] I really don’t remember. [pause] Maybe I thought I’d missed something.
FIGURE 5.1: Interview to reconstruct an observed reading session.
5.3.4 Surveys and Questionnaires
A questionnaire or survey is a research instrument that consists of a series of fairly well-defined questions; usually a questionnaire is self-administered, while a survey may be administered by the researcher or someone acting in the researcher’s stead. Often, the responses are limited to a range of predetermined selections (the canonical A, B, C, or D), although most questionnaires and surveys include a few open-ended questions that allow the respondent to communicate about topics that the researcher may have missed or to give an answer the researcher may not have anticipated.
Designing a good questionnaire is something of an art. Too long and you lose your respondent (the respondent becomes fatigued and does not finish the survey); too short and you don’t collect enough information to reach meaningful conclusions. It has long been acknowledged that how a question is asked is of enormous consequence; it is important to avoid leading questions and to ask important questions in several different ways so the response can be cross-checked.
Questionnaires and surveys can be administered to a large number of respondents and may thus support statistical data analysis—and they may be used to good advantage if the researcher is not a native speaker of the language in which the questionnaire is administered—but they have a number of well-known drawbacks, including producing misleading results. Because they can be administered on the Web, questionnaires can be useful in resource-limited situations. Questionnaires may also serve as screeners for a smaller set of interviews; the questionnaires allow you to gather preliminary data, screen for desired characteristics, and establish a participant’s willingness to be interviewed and further questioned about his or her practices. Inventive methods of distributing questionnaires and surveys have been developed with the advent of social networking software, which can be used as a method of contacting a broad group of people with particular interests and commonalities. Amazon’s Mechanical Turk can also be used as a method of quickly gathering paid respondents with desired characteristics. Survey and questionnaire data can be usefully triangulated with and augmented by richer data sources such as interviews, observational data, telemetry, or activity logs.
9 My colleagues Frank McCown, Michael Nelson, and I have used SurveyMonkey, a Web-based tool (http://www.surveymonkey.com/), to collect data from distant respondents—some in Europe and Asia—who have lost websites. These surveys were coupled with follow-up phone interviews of a subset of the participants to tease out details not covered in the questionnaire (McCown, Marshall, & Nelson, to appear).
5.3.5 Instrumenting Software
Another way of gathering a detailed record of interactions with technology (especially if that is the main thing you are investigating) is to instrument the software so that it appends the desired information to a log of events. For example, you might instrument an eBook so that it records event details each time a user turns a page or scrolls, taps on a link to follow it, or accesses a translation facility. Interactions may be recorded at a very fine-grained level. This kind of detailed log or record of interactions is often called telemetry because it is literally measurement at a distance.10 Like satellite telemetry, this interaction record may be transmitted to another computer (your own computer or a server you’ve set up for this purpose) for analysis. These logs can grow to be quite large; thus, researchers often harvest them multiple times before they are complete so the data is not lost in the event of an unanticipated crash or virus.
10 Telemetry originally referred to data streamed from a satellite or other spacecraft.
Once the data is complete (i.e., the study period—as you’ve defined it—is over), you will need to use some kind of consolidation and preprocessing software to aggregate the records appropriately (e.g., to filter out what you want or to identify sessions). This kind of software is easy to write and may be developed using a scripting language (which is usually high-level and easy to debug). The preprocessed data may then be processed using any one of a number of data analysis packages such as SPSS, Matlab, or even Excel.
There are also applications that will produce a continuous real-time record of what is happening on the user’s screen.11 In other words, you can literally replay a use session. This kind of recording can be used as a complementary view of a session that is recorded some other way (either through the collection of telemetry or by capturing it on video). Such screen recording may also enhance the researcher’s understanding of another form of data (e.g., interview results, a diary entry, or observations).
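The logging and preprocessing steps described in this section can be sketched in a few lines of a scripting language. The Python sketch below is only an illustration under stated assumptions: the event names (page_turn, follow_link), the one-JSON-record-per-line layout, and the 30-minute inactivity threshold that separates one session from the next are all hypothetical choices, not features of any particular eBook reader.

```python
import json
from datetime import datetime, timedelta

# Assumption: 30 minutes of inactivity ends a reading session (arbitrary threshold).
SESSION_GAP = timedelta(minutes=30)


def log_event(log_lines, user, action, detail, when):
    """Append one interaction record to the log as a line of JSON."""
    record = {"time": when.isoformat(), "user": user,
              "action": action, "detail": detail}
    log_lines.append(json.dumps(record))


def split_sessions(log_lines, gap=SESSION_GAP):
    """Group each user's events into sessions separated by more than `gap`."""
    events = [json.loads(line) for line in log_lines]
    for event in events:
        event["time"] = datetime.fromisoformat(event["time"])
    events.sort(key=lambda e: (e["user"], e["time"]))

    sessions = []
    for event in events:
        last = sessions[-1] if sessions else None
        same_session = (last is not None
                        and last["user"] == event["user"]
                        and event["time"] - last["end"] <= gap)
        if same_session:
            last["events"].append(event)
            last["end"] = event["time"]
        else:
            sessions.append({"user": event["user"], "start": event["time"],
                             "end": event["time"], "events": [event]})
    return sessions


# Hypothetical log: one participant reads, pauses for hours, then returns.
log = []
start = datetime(2010, 1, 4, 9, 0)
log_event(log, "p1", "page_turn", {"page": 2}, start)
log_event(log, "p1", "page_turn", {"page": 3}, start + timedelta(minutes=2))
log_event(log, "p1", "follow_link", {"target": "note-4"}, start + timedelta(hours=3))

for session in split_sessions(log):
    print(session["user"], session["start"], len(session["events"]), "events")
```

In a real deployment the records would be appended to a file or sent to a server rather than kept in a list, and the choice of session threshold would itself be a study design decision worth piloting.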
5.4 PERFORMING A FIELD STUDY OF READING
This section will walk you through a typical field study. Of course, no study is typical, and sometimes study tasks are discussed as if they are sequential, but they are actually interleaved. I’m assuming that the pilot portion of the study won’t convince you that you don’t want to embark on the study at all, although it might. Several times I have been convinced by the pilot study that I need to go all the way back and reformulate my research questions; you should always be open to this possibility. In an effort to be as concrete as possible, this section draws examples from two different studies. One is a study that looks at a particular activity, clipping (Marshall & Bly 2004, 2005a), and the other is a study of a technology in the field, a deployment of Microsoft Reader eBook software on a Pocket PC (Marshall & Ruotolo 2002).
5.4.1 Research Questions and Study Design
I’ve brought up this step before, but I don’t think I can emphasize it enough: always begin with a crisp formulation of what it is you want to learn, even if it is general (“how do students interact with their textbooks”), and then think through the best way to answer these questions. This can be the most creative part of your study if you don’t give it short shrift. Let’s examine the research questions that drove the two example studies. In the clipping study, we were interested in learning generally how people clipped material from print and electronic publications. We wanted to know how people saved, shared, stored, and managed clippings in their homes and offices and on their computers. Were clippings useful? How did people encounter the things that they clipped? Did some people do more with clippings than others? What roles did the clippings fill? Were some people givers—information brokers—and did others more commonly act as recipients?
11 One such application is Camtasia (http://www.techsmith.com/camtasia.asp).
In the Pocket PC eBook deployment, we wanted to understand the role these small form factor devices played in a larger ecology of reading, including both print and digital resources. Were the Pocket PCs an effective vehicle for reading, and if so, what kinds of reading did they support? Which material would students choose to read on the devices? How did faculty members use the devices in class? What functionality was useful, and what functionality was missing? What role might the Pocket PCs play in collaboration? In both studies, we were interested in the bigger picture. In each case, we were surprised by what we found. Sometimes, it is useful to record what you believe going in. It’s not that you’re trying to be right; articulating what you think you’ll find will help you identify blind spots and surprises—for often, at the conclusion of a study, you’ll read your results and think to yourself “I knew that all along” because your findings will seem obvious. But if you can go back to your going-in assumptions, you may find that you’ve learned far more than you thought. Recording these beliefs will also help you present your findings and convince your audience of what you’ve learned. Once you’ve articulated the questions, you can design a study that will answer them. You can choose and refine the methods you’ll use and devise a reasonable schedule. In the case of the clipping study, we were interested in diverse demographics (people often ascribe clipping to their mothers—is that a fair stereotype?) and genres of reading material. We were also interested in both home and work settings. We knew we wanted to conduct interviews that centered on examples of clippings—artifact-centered interviews—rather than performing an observational study. 
We also decided to recruit people from two different West Coast cities (West Coast simply because both interviewers live on the West Coast, but two cities because some industries might be overrepresented if we just recruited people in the San Francisco Bay Area). The Pocket PC eBook deployment relied on a partnership between Microsoft and the eText Center at the University of Virginia. Microsoft had supplied the Pocket PCs and the Electronic Text (eText) Center had prepared the materials and had recruited faculty members and students to be involved in the project. Thus, answering the research questions required that the students and faculty members actually had sufficient opportunity to use the Pocket PCs over the course of a school term and that the Pocket PCs come equipped with the appropriate software (the eBook reader, in particular) and course material; we did not want to introduce obstacles that would have arisen from the participants’ need to set up the devices themselves. Hence, a significant amount of planning and pre-work was necessary to support our study. Pilot the study. It is helpful to rough out some of the elements of a study before you go much further. How will you answer your research questions? What will you try to observe? What will you ask about (and look for) during your interviews? Will the data that you gather be sufficient to answer your research questions? At this point, because you’re just determining the feasibility of your
approach, you might ask a friend, relative, or close colleague to help you pilot the study. That way you’ll be able to find out how long it will take to do whatever you plan to do in the field (how long a typical interview will take, for example, if you keep it on track and ask all of your questions), and you’ll discover whether it is possible to answer the questions you plan to ask. It is all too easy to design an interview with unanswerable questions or obscure terminology. Pilot studies are a useful way of finding out that you’ve chosen overly broad questions or that some of your questions are difficult to answer. It is easy to assume too much. For example, if you were performing the clipping study, you might discover that you don’t want to cover both print and digital clippings; you just want to pursue digital practices. Sometimes it is necessary to first pilot a study on yourself (or yourselves) even before you try it on friends and family members. You may discover that what you want to do is very difficult or too invasive (e.g., you want to go through all the files on someone’s hard drive). Pilot studies are also a good time to listen to yourself asking your interview questions; are you asking leading questions that will bias your results?
5.4.2 Finding Participants Depending on the nature of the study, it may be more or less obvious whom you want to participate. How many people should you recruit? The answer to this question is frustratingly vague: it depends. There is no absolute upper or lower limit, but it will depend on the nature of your research questions and what you intend to do with the data you gather (and how you’re planning to convince your audience of the validity of your findings). You may need more participants if you plan to perform statistical analyses that rely on per-person data. Exploratory questions about collaboration that require extensive observation may involve a fixed group of people. You may want to use saturation as a guideline for deciding how many participants to recruit; that is, you may keep adding participants until you sense that you are hearing familiar stories from the new participants. No matter how tempting it is to stay close to home, make sure your participant base adequately represents your eventual users (or that it represents the population that you want to talk about). If you’re designing an electronic textbook, recruiting students is fine (although you may want to make sure you span several disciplines or both undergraduate and graduate-level students). If you’re designing general-purpose software, you might want to cover a much broader demographic range; although students are close at hand, limiting your participant base to students may give you misleading results. Sometimes it is helpful to develop a screener. A screener is a series of questions that prospective participants must answer to qualify for your study; the questions should correspond to the qualifications you expect. For example, in the case of the clipping study, you may require that participants regularly read magazines or newspapers. If you’re interested in digital clippings, you may require that participants have a home computer that they use regularly and that they spend a certain
number of hours per week (e.g., at least 4 hours) surfing the Internet. It’s surprising how quickly the requirements mount when you stop to think them through. Remember that the quality of your findings depends on your ability to recruit enough participants for the study who meet your qualifications. Often, it is necessary (and indeed appropriate) to offer some kind of small honorarium as a reward for participating and to acknowledge that the participants’ time is valuable. Sometimes, people will participate in studies out of interest or curiosity or out of desire to improve a technology that they already value, but usually the honorarium seals the deal.12 Allow enough time in your schedule to conduct open-ended interviews. Otherwise, you may overextend yourself and find yourself rushing to finish an interview that is giving you great data. As a rule of thumb, four interviews per day is a very full schedule. You have to get from one interview to the next, and you should leave yourself enough time to write up your field notes at the end of each day and to jot down some preliminary debriefing notes between interviews. Otherwise, your valuable notes will become muddled; it’s surprising how confused you can get about who said what—and what you saw where—when you are performing four interviews per day.
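The screener described earlier in this section can be thought of as a simple filter over prospective participants' answers. The Python sketch below is a hedged illustration only: the field names and the sample responses are invented, and the four-hour cutoff simply reuses the example given in the text rather than a recommended value.

```python
from dataclasses import dataclass

# Assumption: the "at least 4 hours" per week example from the text is the cutoff.
MIN_WEEKLY_INTERNET_HOURS = 4


@dataclass
class ScreenerResponse:
    """One prospective participant's answers (field names are hypothetical)."""
    name: str
    reads_periodicals_regularly: bool
    uses_home_computer_regularly: bool
    weekly_internet_hours: float


def qualifies(response):
    """Return True only if every screening requirement is met."""
    return (response.reads_periodicals_regularly
            and response.uses_home_computer_regularly
            and response.weekly_internet_hours >= MIN_WEEKLY_INTERNET_HOURS)


# Invented responses, for illustration only.
responses = [
    ScreenerResponse("A", True, True, 6),
    ScreenerResponse("B", True, False, 10),
    ScreenerResponse("C", False, True, 5),
    ScreenerResponse("D", True, True, 2),
]

qualified = [r.name for r in responses if qualifies(r)]
print(f"{len(qualified)} of {len(responses)} respondents qualify: {qualified}")
```

Even with three modest requirements, only one of the four invented respondents qualifies, which illustrates how quickly mounting requirements can shrink the pool you are able to recruit from.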
12 That said, it is unwise to offer a large honorarium. You may recruit participants who are desperate for the money, which in turn can have a deleterious effect on your data. How much is enough? How much is too much? It’s hard to say, but we’ve encountered problems when honoraria exceed several hundred dollars. Gift certificates or software can be a suitable substitute for cash honoraria, but remember that Mac users resent Windows software and vice versa.
5.4.3 Developing an Interview Script
An interview script13 helps you remember the topics and questions you want to be sure to cover with each participant. Some field researchers cover all of the material on the script in the order it appears. Others just use the questions as a checklist to ensure they cover everything they want to ask about, and they allow the interview to shape itself organically, as different topics arise in conversation. Similarly, some field researchers stick to their scripts, steering the participant back on track if the participant rambles; others find that this storytelling is a rich source of data. It is up to you to decide how you will treat your script.
13 Some field researchers refer to the plan embodied by the script as an interview protocol.
When you are developing a script, remember that the best data comes from specific examples and events. So rather than asking, “What do you generally do with articles you’ve clipped from NYTimes.com?” (assuming you already know the participant reads The New York Times online), you might instead ask, “Can you show me the last article you clipped from the Times?” If the answer to that question is “No,” you can loosen the constraints: “Can you show me any articles you’ve clipped from the Times?” Asking for specific examples helps participants get away from self-stereotypes (how they think they behave or how they’d like to behave) and it can prompt information that the participant would have otherwise forgotten.
Note that asking the same question several different ways, from different perspectives, is sometimes helpful. Asking the same thing in different ways helps the participant recall the situation and enables you to triangulate and confirm the answers. Of course, there are limits: you don’t want to irritate the participant.
Beware of leading questions. Leading questions are an interviewer’s worst enemy; they usually involve assumptions about what you think people do or how you think they will answer your question. For example, in the clipping study, a leading question might be, “Do you save clippings from the newspaper in a folder?” Suppose the participant answers, “No. I don’t usually save clippings that way.” Do we even know if he reads the newspaper? If this is asked as an open-ended question and we ask about particular artifacts, we might learn that the participant reads the morning newspaper in print, but doesn’t actually save articles from it because his partner takes it to work with him. Instead, he finds the same article in the online version of the newspaper and copy-pastes the article’s text into an email message that he sends to himself. The way you ask the question can either facilitate or preclude the richest and most complete answer.
Figure 5.2 shows the student interview scripts for the Pocket PC deployment.14 Introductory material reminding the interviewer to explain the study, data collection methods, and participant privacy is omitted for the sake of brevity; usually, a script includes prompts for standard prefatory remarks and getting permission to record.15
Tour of primary work setting
  Technologies (computers, devices, peripherals like printers, software). Ask about utility/use of each.
  Documents (printed & electronic materials, reference materials, etc.).
  Other aspects of the interview setting (video sweep).
Past experiences
  Any previous classes that used online materials? Tell me about them.
  How did you read them in past classes? (Ask for specific example)
  If there’s an example, ask about annotations, both for online and for print.
  Do you save old course materials? How about your own notes? Ever refer to them again? (Ask to see)
Reading questions, situated in this week’s activities
  Where did you do this week’s assignment (can be more than one place)? (Did you finish it?)
  What did you bring to class with you this week?
  How did you read the assignment (on paper, on Pocket PC, on desktop, on laptop, combination)?
  Did you annotate/mark-up/highlight etc. the materials? (ask to see)
  Did you work with anyone on the assignment? How?
Pocket PC specific questions
  When’s the last time you used the Pocket PC? (frequency of use)
  What’s the last thing you did with it? (getting at whether it’s used for other purposes)
  Downloads? How found? What did you do with them?
  Discuss audio, try to tease out how multi-modal capabilities should work.
  Did you bring the Pocket PC to class? Did you use it in class? For what?
  Do you carry it with you anywhere else? Use it for anything else?
  Would you get one? Would you keep this one?
  What’s useful? What would make it more useful?
  How did you get started? Did you get training? (Find out whether students help each other.)
Integration with other activities
  What work do you expect to do to finish this class (e.g. read, find add’l sources, write a paper, final)?
  What have you done so far?
  Ask about communication & collaboration (students and faculty)
FIGURE 5.2: Sample interview script from Pocket PC study.
5.4.4 Preparing Materials
Field studies of electronic publications—if they involve a technology deployment—may require material preparation. This step involves identifying all of the appropriate material, transforming it into the necessary format, and loading it onto the reading platform the participant will use. Depending on the type of study (and how open-ended it is), you may want to develop a process that allows participants to put additional documents on their devices. This is a good time to address the question of support. If you are deploying technology in the field, you will need to be prepared to support it. Support is a deliberately broad term; it can mean anything from installing software to rescuing data after a crash to helping a participant get data off the device at the conclusion of the study. Support is an implicit part of the bargain. You can also prepare in other ways. Be sure to do whatever advance research you can before you meet participants. Be familiar with the kind of materials you’re planning to ask about. For example, if the interview is about The New York Times, read the newspaper yourself for some period before (and possibly during) the field portion of the study; also explore the Web site. That way, you’ll know when weekly features appear and you’ll be able to ask sensible follow-up questions.
14 As a logistical note, I like to print out my scripts in a large font on one side of a piece of paper that I place in a plastic cover sheet so that it’s easy to find and consult it. I seldom write out questions verbatim, but rather write short cues that remind me of topics to be sure to cover.
15 Your organization or university will have a policy regarding human subjects. These policies and procedures for approving protocols that are used in studies involving human subjects vary from institution to institution.
5.4.5 In the Field
Whether you’re observing or interviewing, there are certain things to remember when you head out into the field. Because there is so much to do and so much to remember, field researchers often go out in pairs; occasionally, for home visits, it also seems safer to conduct visits in pairs.
First, make sure that you have the right recording equipment for the job and that it’s in working order. You’ll need sufficient media to cover the duration of the interview (and more). These recordings are a fundamental part of your raw data. Label all media as soon as possible after you finish recording (preparing labels in advance is a good habit to get into). Think carefully about how you’re going to record at a field site. Remember that recordings are irreplaceable, so treat them with care and back them up if you are able to. Video recordings capture settings very well and are important if you’re planning to analyze interactions (most modern digital cameras will record enough video to do a lay-of-the-land sweep of the room; in fact, even cell phones will do this in a pinch). Audio recording is essential; the fastest note taker in the world can’t keep up with a speaker and notes don’t capture vital characteristics like vocal inflections. Whether you’re observing or interviewing, think about where to place the equipment in the room. You want to be able to record all the participants. Often, artifacts (e.g., documents or publications) can help you understand an interview. You can either take photos of these artifacts, or sometimes a participant is able to give you samples of something that has been under discussion. Today’s high-resolution digital cameras are capable of capturing a document at a remarkable level of fidelity to the original. Even if you’re recording everything, take notes. Recordings are not infallible; even the most experienced field researcher will sometimes delete a file or ruin a recording. Notes can also capture things that elude the recording (e.g., something that’s just off-camera). Remember to keep administrative paperwork (e.g., recording permissions, schedules, protocols, screeners) on file in case you need them later. 
One practice that we have found helpful is to discuss the interview with your research partner immediately after you have left the field site. What are the three or four high-level observations you are walking away with? What surprised you the most? The anthropologist Francoise Brun-Cottan has called these preliminary findings intelligent noticings. They are no substitute for a careful analysis of ALL of the data you have gathered in the field, but they help keep you on track and ensure that you’re asking the right questions and attending to the right details.
5.4.6 Data Analysis
Data analysis is another over-broad term. It can mean many things. But the most important thing to remember is not to let the intelligent noticings you’ve extracted from your field visits substitute for real data analysis. Those epiphanies may be important, but they also may be misleading. They are what struck you in the moment, when you were deeply engaged in the interview itself; the critical distance that is part and parcel of the analytic process cannot be overemphasized. You will need to support your conclusions, and you can’t do that by appealing to drive-time epiphanies.
When you have completed a field visit, the first thing to do is write up your field notes. It is best to do this immediately after the conclusion of an interview (or set of interviews); the longer you wait, the more you will forget (and the more painful the process will become). Then transcribe your recordings and organize all of the data you’ve gathered. At this point, the analytic methods you bring to bear on your data will depend crucially on your disciplinary commitments and the set of tools you feel comfortable using. Don’t confuse tools with methods. This discussion is going to stop short of describing specific methods. There are many, and each method has its strengths and weaknesses, its advocates and detractors. There are also fine, detailed references describing any method you may choose to apply. Several popular methods are discussed briefly in Sharp, Rogers, and Preece (2007). You’ll probably have an analytic method in mind before you start and you may need to apprentice yourself to a practitioner of one method or another to pick up the nuances of putting it into practice. Once you have selected an analytic method—or at least an analytic strategy—choose a tool that supports the method. Some field researchers prefer to put their data in the semiformal representations offered by software like NVivo, while others use Excel to code their data in an enormous table (or a collection of special purpose tables); still others use generic database software and develop analysis schemas, while others use Post-its and wall space. Generally speaking, most techniques require the researcher to code, categorize, organize, and reorganize the data until patterns emerge. If you are using formal categories, it is important to recognize when you are shoehorning your data into them; similarly, if you are performing a top-down analysis that organizes your data according to a predefined structure, it is important to be on the alert for emergent patterns.
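As a small illustration of what coding data in a table can look like, the Python sketch below tallies how many distinct participants each code was applied to. It is a sketch under stated assumptions rather than a prescribed method: the participants, excerpts, and code names are all invented for this example, and a tally of this kind supports, but does not replace, the interpretive work of reorganizing the data until patterns emerge.

```python
from collections import defaultdict

# Invented coded excerpts: (participant, excerpt summary, codes assigned).
coded_excerpts = [
    ("P1", "emails article text to himself", {"digital", "save-for-self"}),
    ("P1", "partner takes the print paper to work", {"print", "obstacle"}),
    ("P2", "mails a magazine page to her sister", {"print", "share"}),
    ("P3", "forwards a link to a coworker", {"digital", "share"}),
    ("P3", "forwards a second link to the same coworker", {"digital", "share"}),
]


def participants_per_code(rows):
    """Count how many distinct participants each code was applied to."""
    seen = defaultdict(set)
    for participant, _excerpt, codes in rows:
        for code in codes:
            seen[code].add(participant)
    return {code: len(people) for code, people in sorted(seen.items())}


for code, count in participants_per_code(coded_excerpts).items():
    print(f"{code}: {count} participant(s)")
```

Counting participants rather than excerpts keeps one talkative participant from dominating a category, which is one reasonable design choice among several.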
Principled analysis makes identifying and supporting your findings much easier. Drawing design implications and recommendations from analytic findings is more difficult (some would say it is something of an art). At the very least, experience and examples help. Some resources for this process can be found in qualitative method books aimed at design practitioners (e.g., Sharp, Rogers, and Preece 2007). Because this section has focused on practical aspects of basic field research at the expense of other possibly relevant methods and techniques (e.g., usability studies and focus groups that one might use to refine hardware and/or software design, or cooperative prototyping methods one might use to solicit greater end user involvement), the HCI-oriented reader may wish to consult other guides, textbooks, and references to learn more about these approaches to informing design.
chapter 6
Content: Markup and Genres
So far, we have focused on eBook platforms and the practice of reading. What’s still missing is content: what will we read on eBook platforms? In other words, we’ll shift our attention from reading to writing. There are several perspectives we might take on content. The first is pragmatic: to be an eBook, text must be properly prepared so that the software knows how to display it. This underlying representation of the content will be transformed into what the reader sees on the screen. Furthermore, we must consider that there is a significant amount of digital content already, and this content is available in a variety of formats: it may be in HTML, marked up for display in a Web browser; it may be in a word processing or document preparation format; or it may be tagged with XML tags and compliant with modern Web standards. The first wave of eBooks focused less on markup standards and more on multimedia production; in those early eBooks, there was an understandable concern for creating content. The second wave of eBooks saw a greater attention to the negotiation of a uniform representation for eBooks. This attention to content representation is crucial. After all, who wants to buy an eBook that is tied to a particular platform that might become obsolete in 5 years? Ideally, from a publisher’s perspective, content creation must be reasonably easy, device-independent, and must foster reuse in a number of different formats for different purposes. Thus, the first section of this chapter will focus on markup and standards. We also discuss Digital Rights Management (DRM), the means by which publishers have copy-protected digital works. We then briefly talk about the content preparation process in its many variations.
Recent mass digitization efforts have resulted in a great many books that may be read online; but at the same time, other materials have been prepared and marked up in different ways—by rekeying, by intercepting existing electronic texts, and by transforming existing formats and collections. There are many ways to create and deliver content.
Ironically, the final form for Kindle-resident eBooks is incompatible with other reading platforms (even eBook readers running on PC laptops) in spite of the significant standardization efforts that took place in the wake of the second wave of eBooks.
The next two sections of this chapter delve into genres. We first examine paper genres that have been reborn for a digital era—books, magazines, newspapers, textbooks, and academic journals have all been reinvented, in some cases to replace forms that are quickly growing obsolete, and in other cases to stand side by side with their print cousins. It is also becoming apparent that new digital genres are evolving quickly. Many of us don’t get through our days without checking in on our favorite blogs or looking up something in Wikipedia. These common forms serve as a springboard to launch us into other new genres: wikis, hypertext fiction, cell phone novels, and other emerging forms. We then devote a section to eBooks in libraries by presenting a short account of an eBook program in a California public library. Details of the program allow the reader to see how the library administration developed the program and handled problems such as populating the devices with content; checking out eBooks that consist of bundled hardware, software, and content; and supporting patrons’ use of the new eBook devices. Finally, we include a brief discussion of eBook sustainability. Naturally this is a topic that merits considerably more attention than we are able to give it here; no one wants to collect a personal library that is rendered obsolete at regular intervals throughout their lifetime. Yet digital preservation is far from a solved problem; it is complicated by issues of hardware and software obsolescence, DRM schemes, changing content standards, and the overall vulnerability of digital storage. This discussion is intended to flag the importance of digital sustainability. Interested readers are directed to supplementary reading to understand the scope and magnitude of the problem.
6.1 CONTENT REPRESENTATION
When we look at a page of—or a window onto—an electronic book, what we see is text prepared for reading. That is, a software application transforms an underlying representation of an eBook, usually stored in one or more computer files, into a form suitable for the eBook platform to render on the screen. This transformation may be accomplished in a number of different ways, depending on the publishers’ and software developers’ goals. We often refer to this underlying representation as a format or language; eBook files usually have extensions that reflect which format the publisher has chosen for the eBook. Figure 6.1 illustrates this basic process. Content preparation involves important choices about how the book’s structure and appearance are represented and stored, and whether the eBook’s appearance is computed on the fly or
So, for example, Moby-Dick-chapter1.html might be one of a number of source files that the publisher has prepared in HTML, the standard used by Web pages. An eBook application is used to compile all of the eBook’s source files into the form that may be put on a reading device or rendered by reading software. This final processed eBook file might be MobyDick.azw (AZW is the type of file used by the Kindle) or MobyDick.lit (LIT is the type of file used by Microsoft Reader).
FIGURE 6.1: Content preparation process.
specified fully within the file. If the eBook’s appearance is computed on the fly, the content is often marked up in such a way that it reflects the book’s structure (e.g., an opening tag such as <chapter> might signal the beginning of a new chapter and a closing tag such as </chapter> might likewise signal its end); an XML-derived format is usually chosen for this purpose. If the eBook’s appearance is predetermined and communicated directly to the application, the format of the stored eBook usually specifies the precise location of textual and graphical elements on its pages. For example, the eBook format might tell the rendering application that a particular image is on page 130 starting at (0,0) and that the image is 4.5×7.4 inches. Formats may also occupy a territory that is somewhere in between these extremes. As we will see, each choice of format has both advantages and drawbacks. EBook formats are still far from standardized, partially because the industry is still young and partially because there are commercial reasons for adopting particular formats (not the least of which is tying content to a specific platform). There are currently dozens of eBook formats; thus, this chapter will not document a specific format, but rather will give a high-level account of different types of formats and their underlying motivations. There is an ongoing effort to reach consensus on an XML-based eBook format that many different publishers and platform manufacturers would use, but commercial efforts haven’t uniformly adopted it; for example, Amazon’s Kindle uses its own wrapper around the more widely used Mobipocket format. However, the effort to create an eBook format standard resulted in an organization—the International Digital Publishing Forum (IDPF)—to advocate for this approach to representing
FIGURE 6.2: A range of format choices.
and packaging content; their continued advocacy may eventually lead to a more uniform content representation. To understand the different approaches to representing eBook content, it helps to look at the extremes and several points between them. Figure 6.2 illustrates this range of formats. Of course, formats can be converted from one to another, but since they are not all equally expressive, some transformations are not completely reversible. The choice of format is important since format specifies and constrains the way the content can be used subsequently. At one extreme of the range, pages are preformatted in such a way that a rendering engine, the application that presents content on the screen, is given precise specifications for how each page should appear; in other words, much like a printed page, the page will look the same regardless of the hardware used to display it. The primary advantage of this type of representation is that the publisher has precise control over the layout—the look—of every page of the eBook. As we saw in Chapter 2, it might be important to control the look of poetry, where line breaks in the proper places can mean the difference between the perception of verse and doggerel. Furthermore, the fixity of the paper book is preserved. In other words, if I tell you to look at page 130 in your book, it will be the same as page 130 in mine regardless of whether you’re reading the book on your phone or on your new high-resolution monitor. Some publishers—and some readers—find this fixity to be tremendously important; others do not. It is easy to see how fixity helps address the co-navigation problem identified in the “Reading together” discussion in Chapter 4. However, it is possible to achieve this intent via other means (e.g., through navigation functionality).
See http://www.openebook.org/ for more information about the organization and the specifications that it maintains.
PDF and PostScript are examples of this strategy for describing how eBook content is displayed on the screen. Choosing a fixed layout has the implicit assumption that the text is probably neither changing nor evolving and is worth the upfront effort that is put into the way the page looks. At the other end of the spectrum, software bears the burden of computing the page layout from a description of the work’s structure and a separate specification of its style; the XML markup language, coupled with cascading style sheets (CSS), is an example of this strategy. XML is extensible and can be made to conform to the requirements of a book. Style is separated from functional structure and so may be applied across many files intended to have the same look. Because the layout is computed, ongoing content changes are not expensive. An adaptive strategy has the distinct advantage of making the content reflow to fit the display characteristics or the reader’s preferences. That is, if the reader is visually impaired, larger fonts may be chosen, or if the screen is small, the page can be rendered to fit so the reader need not pan to see the complete page. Furthermore, functional markup allows people and applications to refer abstractly to different elements: I can link to Chapter 3 rather than needing to know that Chapter 3 begins on page 38. Certainly the choice of content representation has an immediate effect on how functionality is implemented. If the publisher has chosen a fixed page representation like PDF, an annotation may be anchored at a specific point on a page: a given piece of marginalia may be rendered starting 0.3 inches down and 0.25 inches across on the 50th page of the book. On the other hand, if the publisher has selected a markup language for representing content, that same annotation might be anchored to the third paragraph in Chapter 3.1. 
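The contrast between the two kinds of annotation anchor can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `page_of` helper, and the page numbers in the two layouts are hypothetical and do not correspond to any particular eBook format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageAnchor:
    """Anchor in a fixed-layout format: a point on a numbered page."""
    page: int
    inches_down: float
    inches_across: float


@dataclass(frozen=True)
class StructuralAnchor:
    """Anchor in a reflowable format: a position in the work's structure."""
    section: str    # e.g. "3.1"
    paragraph: int  # 1-based paragraph index within the section


def page_of(anchor: StructuralAnchor, layout: dict) -> int:
    """Look up the page a structural anchor lands on in one computed layout.

    `layout` maps (section, paragraph) to a page number and stands in for
    whatever the rendering engine computes for a given screen and font size.
    """
    return layout[(anchor.section, anchor.paragraph)]


# The marginal note from the text, expressed both ways.
fixed_note = PageAnchor(page=50, inches_down=0.3, inches_across=0.25)
reflow_note = StructuralAnchor(section="3.1", paragraph=3)

# Hypothetical layouts: the same paragraph falls on different pages.
phone_layout = {("3.1", 3): 112}
monitor_layout = {("3.1", 3): 41}
```

The fixed anchor always means page 50, whatever the device; the structural anchor has to be resolved against each layout, which is the price paid for content that can reflow.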
Naturally, these are just examples and the details of an implementation may look very different, but it is easy to see how something as basic as the content representation can have considerable bearing on what is easy to develop, or even what is possible. Between the two extremes are two other points that may well represent how much existing content has been prepared. A file in a word processor’s format (such as MS Word) makes it easy to manipulate layout, but more difficult to control it precisely; RTF also mixes style and function. HTML, when used properly, may describe a document’s structure and promotes interoperability among platforms, but does not lend itself to precise control of how the document looks. Again, the style is inextricably tangled with the document’s structure and various work-arounds have been devised to achieve a desired appearance. Thus, eBook markup choices come down to a series of trade-offs:
1. Is it more important to impose a fixed layout (as is the case with certain kinds of poetry) or is it more important to support a flexible and extensible layout (as is the case with content that changes frequently such as news articles)?
2. What kind of format information exists already? How have the eBooks been prepared?
3. What are the basic characteristics of the content files: are the works long or short? Are there many of them or few?
4. What are the accessibility concerns? Will the content need to be re-rendered to meet the needs of many different audiences?
5. Who are the intended audiences and what kinds of devices will they have? What kinds of reading will they be doing?
6. What is the internal structure of the works? How will the structure play into navigation? Will the content need to be subdivided, excerpted, or restructured according to the internal structure?

In practice, most commercial eBook manufacturers try to support multiple formats or they support the conversion into their own proprietary format (a format which generally incorporates their own DRM approach, a topic we will discuss at some length later in this chapter). The Amazon Kindle uses Amazon’s proprietary format, AZW, which is based on the Mobipocket format. Beyond the basic choice of how to represent book content, which we will cover in brief subsections on page description languages and markup languages, there are several other problems associated with the preparation of content for display on the screen. Each problem will be discussed briefly. First comes the simple logistical matter of composing books from smaller units—how can files be packaged so they can be assembled into coherent units? Next we’ll look at functionality that goes beyond the paper book. Here too there are many choices to be made: is additional code embedded in the content files? What language is it in? Will it fall victim to malware? What about embedded media files? Then there’s the matter of protecting the content: for better or worse, many publishers are worried about how they will protect the content from the obvious threat of piracy. One of the primary advantages of digital content—ease of copying and transmission—becomes a formidable problem for publishers when they consider the prospects of unlawful use.
Finally, there’s the matter of content preparation: how does all this content get marked up or turned into a page-ready format? Although most publishers receive new material in digital form these days, it is not usually in the form that they want it to be in. How does this initial transformation take place? Taken together, the following sections should give you a good idea of how the constituent content elements are prepared and woven together to form an eBook.
Mobipocket SA is a French company that was acquired by Amazon in 2005. Mobipocket Reader was the eBook software the company offered. Mobipocket format is based on the OEB format, a format that was defined by a publishing consortium, the International Digital Publishing Forum. Many of the current eBook formats have their roots in OEB, but they are modified in minor ways to work with different DRM software and to accommodate slight variations in source material.
6.1.1 Page Description Languages
There are some obvious reasons for storing electronic books in a document image format like PDF or PostScript: a format that so closely reproduces the form of paper books is familiar and provides both the publisher and the reader with some of the advantages of paper. The publisher has strong control over the book’s design, how it is laid out and rendered as a platform-independent two-dimensional document. Text, fonts, images, and vector graphics are included in the source file. Because PostScript is a programming language, to render any page, all of the previous pages must be computed; in PDF, pages are static and displayed independently (Andersson et al. 1997). Nonetheless, for the purposes of this discussion, they are quite similar and will be treated as exemplars of a class. Naturally content is not created directly in this type of format (unless we’re talking about images themselves, which may be created in a graphics format). Instead, a conversion process transforms a file written in an editor (e.g., a Word document) and saves it as a PDF file or a TIFF file or some other sort of literal description of what is on each page.
6.1.2 Markup Languages
Markup languages are tags that are embedded in documents to specify structural elements of the document (e.g., paragraphs, headings, and tables). They may also encode presentation instructions (e.g., what font to use; which characters to italicize), but many document designers have argued for principled use that pulls these stylistic elements out into a separate template known as a style sheet. Tags may also be further specialized to encode semantic elements of a document; that is, the encoding of these elements specifies additional information about the role the element plays within the genre. For example, a proper noun in a screenplay might be tagged to specify that the name represents a character. In some markup languages, a document type definition (DTD) acts as a schema that specifies and constrains the generic structure of a set of documents; more recently, XML schemas have taken the place of DTDs as the means of constraining the logical structure of a marked-up document. Tags are usually enclosed by standard delimiters like angle brackets (“<” and “>”) so they may be easily parsed out of running text. For the same reason, tags usually include an opening tag (e.g., <chapter>) and a closing tag (e.g., </chapter>). This pairing enables the rendering software (e.g., a browser)
At the logical extreme of this approach, an eBook can be literally stored as a series of page images (e.g., as fixedsize tiffs). Since this is an impractical way of storing eBook content—for example, it is difficult to search tiffs and an eBook that is a sequence of tiffs offers few advantages over paper—we will not discuss it further, but will focus instead on the more practical PDF solution.
to easily pick out what text the tag pertains to. These conventions also allow the markup to be checked for well-formedness (e.g., Are all the tags closed? Is the document structure hierarchical?) and validity (e.g., Are all the tags defined by an explicit or implicit DTD?). HTML is the original markup language of the Web. It began as a simple functional specification of basic Web page elements like <p> (for paragraph), <h1> (for heading level 1), and so on. It also included basic layout instructions (e.g., <em> for emphasis). The original HTML specification was lax on closing tags; that is, the markup did not have to be well formed to be rendered. It also did not have to be valid; tags that were not recognized were usually just skipped by the rendering program. This laxness enabled people to put their material on the Web with little struggle, which was essential for bootstrapping. XML is the current de facto language for specifying the markup to be applied to a given collection of documents. That is, XML is a meta-language that allows tags to be defined and interpreted to specify any necessary elements. There are many extensions to XML that have been defined to solve specific problems on the Web; for example, XPath allows different components of an XML document to be extracted from the document, and XQuery allows components to be extracted via a query. The reader who is interested in XML should refer to the many textbooks and guides written on the subject (e.g., Harold & Means 2004; St. Laurent & Fitzgerald 2005) and can consult the standards documents on the Web that describe XML extensions like XPath and XQuery. The Text Encoding Initiative (TEI) is relevant to our discussion of eBooks for two reasons: many scholarly texts in the humanities, social sciences, and linguistics have already been encoded according to this evolving specification, and it supports some types of processing that we will discuss in Chapter 7. The TEI is a set of conventions—realized as tags—for encoding texts so these texts can be processed in various ways. These conventions have evolved through use; that is, texts that are representative of different genres have been marked up using TEI and the tag sets and DTDs have been extended accordingly.
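To make the ideas of well-formedness and XPath-style extraction concrete, here is a small sketch that uses Python's standard `xml.etree.ElementTree` module. The `<book>`, `<chapter>`, `<title>`, and `<p>` tag names and the sample text are assumptions chosen for illustration; they are not taken from any eBook standard. ElementTree checks well-formedness only and supports a limited subset of XPath; validating against a DTD or schema would need a different tool.

```python
import xml.etree.ElementTree as ET

# A tiny, hypothetical structural markup of a book.
BOOK = """<book>
  <chapter n="1"><title>Loomings</title><p>First paragraph.</p></chapter>
  <chapter n="2"><title>The Carpet-Bag</title>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </chapter>
</book>"""


def is_well_formed(text: str) -> bool:
    """Return True if every tag is closed and properly nested."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(BOOK)

# Refer to content by structure rather than by page:
# "the second paragraph of chapter 2".
second_para = root.find("./chapter[@n='2']/p[2]")

# Collect every chapter title, e.g. to build a table of contents.
titles = [t.text for t in root.findall("./chapter/title")]
```

Because the query names a chapter and a paragraph rather than a page, it keeps working however the text is eventually laid out on the screen.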
6.1.3 Packaging Files
There is another factor to be considered in representing content: how are complex works united if they have been prepared in multiple parts? A book may consist of a number of separate files; these files need to be prepared so they may be compiled to form a complete work. A packaging file is often used for this purpose. It also enables common elements to be factored out and specified in a single place so the work may be kept internally consistent. Packaging files usually include the following elements:
Many people, myself included, feel that if the Web had been stricter about well-formedness and validity and had offered more complex tags, it never would have taken off the way it did.
• The metadata umbrella for the complete work (e.g., the packaging file might specify metadata elements such as the work’s title and author);
• An enumeration of all of the separate components of the work such as the files that contain the constituent parts, navigation structures, images included by reference, and style specifications;
• The linear reading order of the components (i.e., the order in which they are assembled to create the final work);
• Any additional information to knit the files together into a book such as a predesigned (rather than computed-on-the-fly) Table of Contents; and
• Any necessary DRM provisions.
Packaging files also get around the problem of platform variations. That is, different platforms have slightly different requirements and limitations; thus, books must be built for them using slightly different constituents. These variations can be handled by the packaging file. The DRM provisions are the most controversial element of the packaging file; these provisions are discussed more fully later in this chapter.
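The elements of a packaging file listed above can be sketched as a plain Python data structure. This is a hedged illustration, not the IDPF packaging format: the key names (`metadata`, `manifest`, `reading_order`), the file names, and the `files_in_reading_order` helper are all hypothetical.

```python
# A hypothetical packaging description for a two-chapter book.
package = {
    # Metadata umbrella for the complete work.
    "metadata": {"title": "Moby-Dick", "author": "Herman Melville"},
    # Enumeration of every component, keyed by a short identifier.
    "manifest": {
        "ch1": "Moby-Dick-chapter1.html",
        "ch2": "Moby-Dick-chapter2.html",
        "style": "book.css",
    },
    # Linear reading order, expressed with manifest identifiers.
    "reading_order": ["ch1", "ch2"],
}


def files_in_reading_order(pkg: dict) -> list:
    """Resolve the reading order to file names, rejecting undeclared parts."""
    missing = [i for i in pkg["reading_order"] if i not in pkg["manifest"]]
    if missing:
        raise ValueError(f"reading order names undeclared components: {missing}")
    return [pkg["manifest"][i] for i in pkg["reading_order"]]


chapter_files = files_in_reading_order(package)
```

Keeping the reading order separate from the list of components is what lets a build for a different platform swap in slightly different constituents without touching the content files themselves.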
6.1.4 Accessibility
Accessibility refers to the characteristics of eBooks that allow people with visual impairments to read them. Disability advocates have maintained pressure on eBook content providers and eBook platform manufacturers to adhere to accessibility standards and principles. These standards have been developed for the Web and are documented at http://www.w3.org/WAI/.
6.1.5 Digital Rights Management
Digital Rights Management (DRM) is an inclusive term that refers to a range of techniques for controlling (or restricting) the use of electronic content. In its broadest sense, DRM might mean that the content is simply encrypted (to prevent free copying); it might mean that some actions are prohibited (e.g., modifying the text in any way); and it might mean that specific rights have been granted (the reader might be able to make a limited number of copies, but might not be able to print the text). The license to use the content is granted to a principal, which might be a person (who is authenticated in some way) or a thing (a device, such as a laptop or eBook reader, or a storage unit, such as a CD-ROM or flash memory) and might constrain use to particular actions (the reader may be able to excerpt passages up to a particular length). Originally, DRM was conceived as statements in a declarative formal language which were embedded in the source file to specify exactly what the reader could and could not do with the content he or she had purchased (Stefik 1997). Many library professionals and copyright lawyers
were enraged by the technique because it flattened fair use, which is a nuanced concept that allows content to be used in a flexible way in educational or other noncommercial settings. For DRM to work correctly, every use scenario had to be anticipated and codified appropriately. To many, DRM was an affront. So how did something as unpopular and controversial as DRM come about? Digital material is easy to copy. What’s more, digital copies do not degrade like analog copies do: a copy of a copy is no lower in fidelity than the original. This meant that publishers were initially resistant to the very idea of eBooks, even though electronic content meant reduced preparation and distribution costs for them. DRM mollified the publishing industry to some extent; it protected the content against the unlimited copying that had so upset the music industry. Meanwhile, other stakeholders voiced concerns. Libraries and educational institutions have long relied on fair use to support their missions. Teachers have freely excerpted material to present to their classes, and libraries have protected scholarly use from unwarranted restrictions unintentionally imposed by publishers. Readers did not want to purchase eBooks that would “expire” if they weren’t read quickly enough or eBooks that were tied to hardware that was bound to become obsolete. Nor were eBook purchasers or readers always comfortable with being identified; generally, there has been a tradition of reader anonymity in institutions like libraries. Furthermore, there have never been restrictions on passing on print books to friends to expose them to a new writer or an interesting topic. DRM introduced barriers that didn’t exist for print material. Distributors and authors have a stake in digital rights too: distributors don’t want to adopt a DRM scheme that introduces new barriers to purchase; if readers are scared away, the distributors (bookstores and the like) will needlessly lose customers.
But they also have an interest in protecting content from being freely shared. Authors have similar concerns. Naturally, they would like to protect their revenue streams, but they are also interested in attracting new readers and gaining broader exposure. It is not clear that publishers, readers, authors, and other stakeholders will ever reach an equitable solution to the digital rights problem if they continue to use existing DRM technologies; it’s very difficult to anticipate the range of possible uses and situations. If we step back and apply the dreadful term “user” to the people who purchase or read eBooks (for no one who reads can say
Publishers realized from the start that different media and genres required different levels of protection. Thus, a newspaper containing time-sensitive information only required minimal protection; trade reference books, textbooks, and cookbooks required medium protection; and genre fiction (romance, science fiction, mystery) and selfhelp books required maximum protection.
Amazon’s Kindle, for example, has used DRM that prevents protected books from being read on any other device (including other Kindles).
“users of books” without the mental image of a book being used to prop up an uneven table leg), we can imagine a whole range of licensing models that might challenge explicit DRM provisions, including:
• Subscription models;
• Leasing models (e.g., even now some students have a de facto lease on their textbooks since they purchase them in the fall and sell them back to the bookstore in the spring, at a substantial loss);
• Purchase; and
• Sampling (print books offer the opportunity to examine the content fairly thoroughly before the book is purchased; what comparable options are offered to the eBook reader?).
Note that these are only examples. One of the problems facing DRM is the need to anticipate the range of these models and the full spectrum of uses to which the material is put. The following are examples of use situations that might strain DRM:
• Annotating (or otherwise interacting with content);
• Extracting (e.g., fans may wish to extract passages from genre fiction to use in subsequent email discussions);
• Repurposing (e.g., chapters are taken from different books to form a new book as they are with course packs);
• Computation (e.g., new works may be created by algorithmic processing of the content of a number of works; texts can be represented in new ways using visualization techniques; and even search requires advance computation of an index); and
• Sequential use (one copy is passed from hand to hand). How is institutionalized lending (i.e., the lending that is done by institutions like libraries) related to individual lending?
Again, this is a limited set of examples; in practice there are many more. To make matters worse, licensing might apply to a number of different granularities of material. A book in its current material embodiment is partly constrained by the technologies and social structures that have evolved in the wake of the printing press. Can a chapter out of a book be licensed? Can a reader buy a license to a previously unanthologized set of works? Can a reader buy a license to a whole library (i.e., to unrelated collections of books)? The licensing models remain to be worked out.
Amazon and other booksellers provide limited samples of certain books for prospective readers to examine, but the extent of the sample and the portion of the book offered for examination is up to the publisher, not the reader.
6.1.6 DRM Technologies
DRM technologies encompass several approaches: formal languages that describe how the material can be used (we will refer to them as rights languages), encryption, and piracy tracking via digital watermarks and fingerprints. These approaches are frequently combined to get the desired mix of capabilities.

Private key encryption. When DRM uses private key encryption, the publisher has a copy of a key which allows it to encode a file to a form that needs to be decrypted to be read. The reader also has a copy of the key and can therefore decode the content he or she has purchased from the publisher, but the file is of little use to anyone who does not have the key. Note that the file can still be copied, but it cannot be decoded. To protect the key, it is sometimes tied to the reader’s computer, so it is impossible for the reader to transfer the key along with the content.

Rights languages. Rights languages specify what actions are allowed for the work; the eBook software then decodes these statements. Typical actions that are either allowed or disallowed are printing, copying content, accessibility capabilities (which might include, for example, text-to-speech), and commenting. The rights provisions may also specify which version of the software the eBook may be opened with; hence, if additional rights provisions have been added in later versions of the software, the reader cannot get around them by opening the eBook file with an earlier version of the software.

Piracy tracking. A DRM strategy might include a means of tracking pirated content. This means may include digital watermarks or digital fingerprints. Digital watermarks consist of extra information embedded in the content before it is distributed (e.g., at publication time); a digital watermark supports the detection of unlicensed content and it may also enable the distribution chain to be identified, thus limiting the number of parties that are implicated in the piracy.
Digital fingerprints are also hidden information that is embedded in the content, but in this case the information has been added to the content during decryption by the reader; hence, the work can be traced to the principal (the person who purchased it or the device or storage originally associated with the device).
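The way a rights language gates individual actions, including the minimum-software-version provision described above, can be sketched as follows. This is a minimal illustration under stated assumptions: the `Rights` class, the action names, and the `may_perform` function are hypothetical and do not reproduce any vendor's actual rights language.

```python
from dataclasses import dataclass, field


@dataclass
class Rights:
    """A hypothetical, much-simplified rights statement for one eBook."""
    allowed: set = field(default_factory=set)  # e.g. {"copy", "comment"}
    min_reader_version: tuple = (1, 0)         # oldest software allowed to open it


def may_perform(rights: Rights, action: str, reader_version: tuple) -> bool:
    """Decide whether the reading software should permit an action.

    Older software is refused outright, so provisions added in later
    versions cannot be sidestepped by opening the file with an earlier one.
    """
    if reader_version < rights.min_reader_version:
        return False
    return action in rights.allowed


# A publisher allows copying and commenting, but not printing or
# text-to-speech, and requires at least version 2.0 of the reader.
rights = Rights(allowed={"copy", "comment"}, min_reader_version=(2, 0))
```

Note that anything not explicitly listed is refused. That default is exactly why unanticipated uses, such as those a fair-use argument might cover, tend to be blocked by this style of DRM.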
6.1.7 DRM in Use
Microsoft Reader. Microsoft’s Reader software takes a multilevel approach to offering DRM protection to publishers and authors. These levels of DRM are apt to reflect the degree to which publishers want to protect certain kinds of material. First, there is the minimal protection of sealed eBooks. Sealed eBooks provide full access to the capabilities of the Microsoft Reader software, but protect the eBook content against modification. At first, this might strike you as odd: who modifies books? But consider what havoc could be wreaked by changing the author’s name. The next level is inscribed eBooks. Inscribed eBooks take a piracy-tracking approach to DRM; they enforce strictures against naive piracy (unintentional piracy that is usually the result of a purchaser who hasn’t thought through the problems with unauthorized copying) by making the purchaser’s name visible on the eBook’s cover page. Finally, Owner Exclusive eBooks provide full DRM protection; that is, they require that a principal be identified and associated with the eBook before the book can be read. They also enforce whichever digital rights restrictions the publishers specify (e.g., restricting the copy–paste of passages or preventing the reader from printing more than a page at a time). Unfortunately, one of the Microsoft Reader capabilities that may be disabled by some publishers who use full DRM is the text-to-speech function that is an essential component of accessibility; this restriction is a good example of how DRM may work against the greater good.

These examples have been taken from Adobe Acrobat.

Kindle. Amazon’s Kindle uses a proprietary DRM system which protects the purchased content against being read on a device with a serial number other than the one it was purchased for; the DRM is built into the Kindle’s native AZW format. As with Microsoft’s Reader, the text-to-speech capability is often disabled via the DRM (this time at the request of the Authors’ Guild). On the original version of the Kindle, the DRM prevented the user from introducing his or her own content onto the device. After the hardware’s release, the Kindle was quickly hacked to enable other content in Mobipocket format to be “wrapped” in such a way that the content knew the user’s Kindle’s serial number and could thus be read on the specific hardware.
The next release of the Kindle accepted unprotected Mobipocket eBooks (Mobipocket books with their native DRM protection are not readable on the Kindle).
¹¹ Note that this may also be due to the author’s unwillingness to sign away rights to derivative works. Careful publishers may categorize rendering print content as spoken word as a derivative work.
Creative Commons. Creative Commons is a licensing alternative to DRM that allows publishers and authors to mark their work to indicate the conditions they wish to apply to it. There are four main licensing conditions that enable authors to specify that:
1. they want attribution if their work is reused;
2. they want any derivative works shared under the same licensing conditions as the original;
3. they want their work to only be distributed or reused for noncommercial purposes; or
4. they only want verbatim copies of the work to be distributed (no derivative works allowed).
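The four Creative Commons conditions listed above combine into a small set of license codes. The following is a minimal sketch of that combination, not an official tool: the class and method names are hypothetical, and the abbreviations BY, SA, NC, and ND are the ones Creative Commons itself uses rather than anything named in this chapter. The sketch also encodes one consequence of the definitions: a work cannot both require derivatives to be shared alike and forbid derivatives altogether.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseConditions:
    """Hypothetical container for the four conditions described in the text."""
    attribution: bool = True       # condition 1: credit the author on reuse
    share_alike: bool = False      # condition 2: derivatives keep the same license
    noncommercial: bool = False    # condition 3: noncommercial use only
    no_derivatives: bool = False   # condition 4: verbatim copies only

    def code(self) -> str:
        """Return a short license code such as 'CC BY-NC-SA'."""
        if self.share_alike and self.no_derivatives:
            # Share-alike governs derivative works; no-derivatives forbids them.
            raise ValueError("share_alike and no_derivatives cannot be combined")
        parts = []
        if self.attribution:
            parts.append("BY")
        if self.noncommercial:
            parts.append("NC")
        if self.no_derivatives:
            parts.append("ND")
        if self.share_alike:
            parts.append("SA")
        if not parts:
            raise ValueError("at least one condition must be selected")
        return "CC " + "-".join(parts)


if __name__ == "__main__":
    print(LicenseConditions(noncommercial=True, share_alike=True).code())
```

Unlike the DRM schemes described earlier, nothing here is enforced by software; the code only labels the work so that readers know which uses the author permits.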
6.1.8 Standards Efforts
The primary standards efforts associated with eBooks are in the form of specifications that have been developed by working groups within the IDPF, an industry consortium. The main specifications the group has developed thus far have to do with formats. The first is an XML-based specification for reflowable digital books and publications, the Open Publication Structure, which addresses many of the challenges we have identified in this section; it was the first specification the group developed. The second specification is the Open Packaging Format Specification, which describes the format elements that allow different content files to be pulled together to form coherent books. This specification includes publication-level metadata, format schemas, and navigation structures. Finally, a separate working group has developed a specification for an Open Container Format, which is an attempt to standardize the way eBook files are encapsulated (i.e., wrapped with DRM and other elements that support archival storage and interchange of publications). IDPF has not included a DRM specification in its mission, possibly because of DRM’s controversial nature and because vendors already have proprietary DRM systems that they see little value in standardizing. Hence, interoperability (the ability to read eBooks on multiple devices) largely hinges on the ability to cope with publishers’ varying DRM schemes. DRM standards for eBooks, such as they are, have been pushed into the Rights Expression Language (REL) for MPEG-21, which is targeted at multimedia content.¹²
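To make the division of labor among the three specifications concrete, here is a minimal sketch of how they fit together in an unprotected EPUB file: XHTML content files, a package document listing them, and a ZIP container wrapping both. The file names (`OEBPS/content.opf`, `chapter1.xhtml`), the function names, and the sample metadata are assumptions for illustration; the details follow my understanding of the published IDPF specifications rather than anything stated in this chapter. The navigation file is omitted, so the result is not a complete, valid publication.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Container descriptor: tells a reading system where the package document lives.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def package_document(title, identifier, chapter_names):
    """Build a bare-bones package document: metadata, manifest, and spine."""
    manifest = "\n".join(
        f'    <item id="ch{i}" href="{name}" media-type="application/xhtml+xml"/>'
        for i, name in enumerate(chapter_names, 1)
    )
    spine = "\n".join(
        f'    <itemref idref="ch{i}"/>' for i, _ in enumerate(chapter_names, 1)
    )
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{escape(title)}</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">{escape(identifier)}</dc:identifier>
  </metadata>
  <manifest>
{manifest}
  </manifest>
  <spine>
{spine}
  </spine>
</package>
"""


def build_epub(target, title, identifier, chapters):
    """Write chapters (file name -> XHTML string) into a ZIP-based container.

    Chapter file names are assumed to be simple and XML-safe.
    """
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry comes first and is stored without compression.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf",
                    package_document(title, identifier, list(chapters)))
        for name, xhtml in chapters.items():
            zf.writestr(f"OEBPS/{name}", xhtml)


if __name__ == "__main__":
    buffer = io.BytesIO()
    build_epub(
        buffer,
        "A Sample Book",
        "urn:uuid:00000000-0000-0000-0000-000000000000",
        {"chapter1.xhtml": "<html xmlns='http://www.w3.org/1999/xhtml'>"
                           "<body><p>Hello, reader.</p></body></html>"},
    )
    print(zipfile.ZipFile(buffer).namelist())
```

Note that nothing in this sketch involves DRM, which is consistent with the point above: protection is layered on by each vendor outside the shared format specifications.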
6.2 CONTENT PREPARATION
There is no “gold standard” for eBook preparation, nor are there identified best practices. Although most writers have been using digital text editors for the last 20 years or so, eBook projects have rarely started from digital files. Rather, they have required that a process be developed for moving from a print book to a digital file or series of files. Even when the digital form of the material has been available, sometimes the preparers have elected to start fresh, because the digital files have been in a native format suitable for going to press (e.g., Quark) that is difficult to work with as source material for other digital forms.
¹² Jerome McDonough, private communication, May 29, 2009.
The preparation of new eBooks as part of a publication process differs from other digitization processes (i.e., conversion of existing books to eBooks). New books published as eBooks must be marked up with the tags that are required by the eBook software that is going to render the book. These tags may conform to the OEB specification we discussed earlier, or they may be specific to the particular eBook provider; in practice, most of these formats are close to XHTML with some very simple CSS information to specify the particular look of the eBook. Publishers, distributors, or libraries that are interested in publishing to more than one platform, or in creating a more sustainable form, often create eBooks using an XML interlingua that allows them to preprocess the files in different ways for different platforms; in academic settings, this markup often conforms to TEI conventions.¹³ Once an eBook has been marked up, it is compiled into its final form using a special purpose application that converts the marked-up text to the form needed by the rendering program.
EBooks may also be created from existing print works. Over the past two decades, many libraries have sponsored their own digitization projects, often aimed at putting special collections online. The University of Virginia’s eText Center was one of the first of its kind and is an example of an organization that was designed to support the digitization of scholarly collections.¹⁴ We will first discuss this type of effort, and then we will move on to a second related type of effort, mass digitization, which involves the wholesale conversion of entire libraries and large collections to eBook form.
The creation of eBooks from print usually involves scanning to capture the literal page images and may involve either OCR or re-keying of text to recover the content. The texts are then marked up, usually in the general way we have described, for further processing into any number of eBook formats. In practice, the difficulties associated with these early scanning efforts were often underestimated (Marshall 2003). A graduate student who worked at the eText Center described the conversion process this way:
I spent two weeks sitting on the scanner. Open the scanner. Put the book in. Close the scanner. Run the scan. Open the scanner. Flip the page. [laughs] And this was fortunately just images. We were not trying to OCR it. It had been keyboarded by a company in India or wherever they do keyboarding . . . The handling wasn’t that big a deal, but these were like 1616 folio pages, and they’d already been disbound. People don’t have an appreciation for how tough older paper is.
The good all-rag paper. So they weren’t that fragile, and I wasn’t under that many strictures as far as the handling. But I was trying not to beat them up too much. That just slowed me down a little bit. And then I spent some time – the texts had been keyboarded, and in the keyboarding, they’d done a lot of the basic markup tags. And what I did was – more scut work – [I] ran them through [Perl scripts] to check for errors . . . And then you get this file that spits out and says there are problems on all of these lines. And my job was to go to those lines and figure out what the problem was, fix it . . . Sometimes it was simple things like [missing paragraph tags]. It was actually kind of engaging. It was more engaging than the scanning. Because sometimes it would actually be a problem-solving thing. It would be a mystery. Why is the machine choking on this? It looks fine. What’s the problem? And so there would be some detective work in figuring out what is the problem. And sometimes it would be the tag 30 paragraphs before that opened and didn’t get closed. Or because it was still open was clashing with this tag.
¹³ For self-publishers, this step usually involves saving a word processing file to HTML format before editing the tags to meet the needs of the specific eBook format; for academic libraries, this step usually involves starting from a minimally tagged flat text file and adding extra markup until it conforms to the interlingua they have selected as their core eBook format.
¹⁴ Project Gutenberg is a second early digital book effort; it has been around since 1971 and relies on volunteer effort to process the books.
While these scholarly print-to-digital conversion processes produced collections suitable for developing innovative analysis techniques, they did not bridge the growing print/digital divide. Born-digital materials offered on the Web were replacing a legacy of print books. Students, scholars, and more or less everyone else were gradually abandoning the library and only looking for information online; we were reaching an era in which things that weren’t online simply didn’t exist. Mass digitization projects sought to change that.
Mass digitization projects are a relatively recent phenomenon; they involve the conversion of content on a much larger scale than previous projects. Rather than selecting individual items of significance or small collections of interest to convert, these projects aim at digitizing entire academic libraries (Coyle 2006). Currently, there are several such efforts underway. Google Books, which is a partnership between an industrial concern, Google, and a growing number of academic libraries, and the Open Content Alliance (OCA), which is a library-driven initiative that also involves multiple partners and the Internet Archive, are two such large-scale efforts.
These efforts rely on efficient page-by-page scanning techniques. Google has developed proprietary techniques and scanning hardware for their project; the Internet Archive has developed parallel techniques that are available to others. OCR is used to process the texts so that the underlying content is recorded as text and thus may be indexed and searched. The processes are not fully automated (i.e., sometimes you will see a page with a photograph of a human finger or thumb that was used to hold the book open); human labor must also be used to add minimal markup to the texts. The Google Books project is not intended to result in eBooks; rather, it will provide a sophisticated full text index to literally millions of books. Users will be able to read short excerpts of the books, but mostly they will be able to identify books they want to purchase. On the other hand, the OCA’s effort is intended as a means of providing eBook access to a huge body of works that are now in the public domain. The Google Books effort is enormously controversial as of this writing (especially with regard to its potential dominant position in controlling access to Orphan Works) and the interested reader is invited to look for further accounts of the project and its potential impacts on society.
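Returning to the markup-checking step that the eText Center graduate student describes earlier in this section: the scripts in question were written in Perl and are not reproduced here, so the following is only a rough sketch, in Python, of the kind of check involved. The function name and the message wording are hypothetical. It tracks open tags on a stack and reports, by line number, a closing tag that arrives while an earlier tag is still open (the "tag 30 paragraphs before that opened and didn’t get closed" case).

```python
import re

# Matches opening, closing, and self-closing tags; comments and
# declarations (which start with "<!") are deliberately ignored.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^<>]*?(/?)>")


def check_tags(text):
    """Return a list of human-readable problems found in the markup."""
    problems = []
    stack = []  # entries are (tag name, line number where it was opened)
    for line_no, line in enumerate(text.splitlines(), 1):
        for match in TAG.finditer(line):
            closing, name, self_closing = match.groups()
            if self_closing:
                continue  # e.g. <br/> opens and closes in one step
            if not closing:
                stack.append((name, line_no))
            elif any(open_name == name for open_name, _ in stack):
                # Unwind to the matching opener, reporting anything left open.
                while stack[-1][0] != name:
                    unclosed, opened = stack.pop()
                    problems.append(
                        f"line {line_no}: </{name}> found while "
                        f"<{unclosed}> from line {opened} is still open"
                    )
                stack.pop()
            else:
                problems.append(
                    f"line {line_no}: </{name}> has no matching opening tag"
                )
    for name, opened in stack:
        problems.append(f"line {opened}: <{name}> is never closed")
    return problems


if __name__ == "__main__":
    sample = "<div>\n<p>first paragraph\n<p>second paragraph</p>\n</div>"
    for problem in check_tags(sample):
        print(problem)
```

As in the student’s account, the report points at the line where the clash surfaces and also names the line where the offending tag was opened, which is where the detective work usually ends up.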
6.3 PAPER GENRES REBORN
EBooks represent the rebirth of what is essentially a print genre. In some ways, eBooks invite the implementation of “book emulators”, user interfaces that are aimed at reproducing the experience of reading a print book. Indeed, some software duplicates the print book right down to the physics of page turning (Liesputra & Witten 2008). Thus the initial wave of digital content has parroted its print predecessors. eBooks closely parallel print books. eNewspapers work at capturing the form of the printed broadsheet (almost down to making your hands inky). eMagazines, eTextbooks, and electronic journals all stand alongside (and in some instances are targeted at replacing) what has gone before them. In this section, we will examine some representative genres. Needless to say, this list is not fully inclusive: the hope is that there are sufficient examples that the reader can extrapolate from them.
6.3.1 eNewspapers
Of late, the print newspaper (and indeed the ultimate future of journalism) has been thrown into turmoil; the long-term commercial viability of the endeavor has been called into question, and a number of news dailies have simply folded or have been thrown onto the auction block in search of sponsors with deep pockets and a deeper sense of the public good. It’s not that people have become any less interested in the news; it sometimes seems that they are more obsessed with consulting up-to-date sources of news than ever before. It is just that increasingly readers have been turning to the Web for this genre of material and newspapers’ economic support from regular subscriptions and advertising has concomitantly weakened. Although citizen journalists have been playing a more important role in reporting on local matters (Thurman 2006), it is still the case that people are relying on traditional sources of reportage when it comes to big-ticket investigative journalism and international affairs. Although some of this journalism comes to them from news aggregators (rather than from editorially assembled newspapers), it still seems that they are seeking online the same (or similar) engagement that they had with the print genre (Marshall 2007). That is, although journalism is in a state of crisis, normal newspaper readers are not wringing their hands over the situation. Rather they are looking expectantly for a genre that has all of the advantages of the print newspaper coupled with the increased timeliness and interactivity of an online form. In 2006, we conducted a field study of a Really Simple Syndication (RSS)-based news reading application called the Times News Reader that was intended to provide an experience that was closer to that of reading a daily paper.
The application literally delivered The New York Times to subscribers’ laptops and desktops; that is, it took advantage of intermittent broadband connectivity (assumed to be the state of affairs in many households) to download the entire paper (or subscriber-selected sections) onto the subscriber’s reading device so the subscriber would have it
in hand when it came time to read the paper. The application would display the paper when and where the subscriber felt like reading it; much attention was paid to elements like legibility and navigation. The picture the developers (a team from the Times and Microsoft) had in mind went something like this: a commuter would grab her Kindle-like device as she ran out the door to catch the morning train into the city. Overnight, the newest edition of the daily paper would have been downloaded and cached on her small portable device. She’d turn on the device on the crowded train and page through the morning paper; the application was designed such that by using one button, a reader could quickly scan the entire paper. There’d be no problem folding the paper and no ink-smeared hands. On the way home, the commuter might read specific articles to relax; the text was highly legible and used a Times-branded typeface. The text of the articles would reflow into specially designed templates according to adaptive layout capabilities (e.g., the layout would be two columns on a small display, three columns on a somewhat larger display, and perhaps four or five columns on a high-resolution monitor); photographs could be enlarged so detail was visible on demand. Advertising would resize itself to fit alongside the story and would not intrude. When the reader was back online (at home or in the office), normal online functionality would once again be available. Live links would take her to supplementary material or special interactive multimedia presentations on the Web. The application also offered the usual functionality one associates with news Web sites, such as the ability to save, share, search, and print articles, coupled with standard desktop functionality, such as the ability to annotate content.
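The adaptive layout described above can be illustrated with a small sketch. This is not the Times News Reader’s actual algorithm, which the study does not document; the function names, the pixel thresholds, and the balancing rule are all assumptions chosen only to reproduce the two-to-five column behavior described in the text.

```python
def column_count(display_width_px, min_column_px=320, min_columns=2, max_columns=5):
    """Pick a column count for a display width (thresholds are assumed)."""
    if display_width_px <= 0:
        raise ValueError("display width must be positive")
    fitted = display_width_px // min_column_px  # columns that fit comfortably
    return max(min_columns, min(max_columns, fitted))


def reflow(paragraphs, columns):
    """Pour paragraphs into columns in reading order, roughly balancing length."""
    target = sum(len(p) for p in paragraphs) / columns
    layout = [[] for _ in range(columns)]
    current, used = 0, 0
    for paragraph in paragraphs:
        # Move to the next column once this one has reached its share.
        if used >= target and current < columns - 1:
            current += 1
            used = 0
        layout[current].append(paragraph)
        used += len(paragraph)
    return layout


if __name__ == "__main__":
    story = ["Lead paragraph.", "Background.", "A quotation.", "The kicker."]
    for width in (640, 1024, 2560):
        columns = column_count(width)
        print(width, columns, reflow(story, columns))
```

The point of the sketch is that the same article text is laid out differently for each display without the reader or the editor intervening, which is what distinguishes this design from a fixed page image of the broadsheet.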
The field study of the application in use among a selected group of Times subscribers in three cities provided a window onto some aspects of the future of newspapers as a genre and how news might be presented on a screen. The Times News Reader mixed the characteristics of a print newspaper with those of a newspaper Web site: would it be perceived as a replacement for the print paper or a replacement for the Web site? What are people looking for as the form becomes all digital? In an earlier study, Watters et al. (2004) found that a broadsheet metaphor was more effective than a document metaphor for presenting news; the Times News Reader was a good vehicle for investigating this laboratory finding in the field. We were also able to explore subscribers’ expectations of a print-like publication: did it fulfill their purposes in reading the paper? Finally, we were able to investigate additional newsreader functionality that would potentially retain subscribers (or deepen their loyalty to the publication).
What did we learn? First, we learned that a hybrid newsreader of this sort is bound to be compared to news Web sites as much as it is compared to the print newspaper. And as such, it must have favorable characteristics in both realms. There are some purposes for which the print newspaper cannot be replaced. One participant said that she would not give up the Sunday print edition because, “it’s a religion for me . . . I like vegging out and getting a Bloody Mary;” another said:
In the summer, my family and I . . . have a beach house and that’s one of the times that we’re all together. We all sit around and we read different sections of the paper. That wouldn’t happen if we were all getting our info from the computer.
But in many other regards, participants realized that the news reader had the potential to go beyond the print form: it could (1) adapt to a range of reading practices; (2) offer more extensive functionality than the Web site; and (3) help the reader to construct a sustainable resource of things he or she has read. In other words, post-Web newspaper readers knew that they didn’t simply read the articles one after another, but rather they skimmed and scanned, looking for additions to a continuing narrative or getting a quick fix on current news. Participants wanted post-Web content and features that were at least as extensive, and as new, as what was on the Web. For example, crossword puzzle fans could already work the puzzle competitively or socially on the newspaper’s Web site; this ability was conspicuously absent from the software in the study. Readers have grown to expect personalization coupled with timeliness in online news.
In the end, it seems like the digital genre will continue to echo some of the social functions of the print newspaper. Readers of newspapers like The Times realize that complete disintermediation can lead to naive interpretations of events and their significance and a fragmentary awareness of the world at large, yet they have also come to expect more from their online news experience.
6.3.2 eMagazines
Although we have already discussed electronic newspapers in this section, other issues emerge when we broaden the genre to include other kinds of periodicals, especially magazines. One particular area of note is the role of electronic periodicals in current models of information behaviors (Pettigrew, Fidel, & Bruce 2001). That is, much of the other reading we have focused on is either immersive (as with a novel) or purposeful (as it is in active reading). When people read electronic magazines, there is more opportunity for serendipity and for encountering information they weren’t even looking for. Reading this way (a type of reading we will see again later when we explore new digital genres) is an important counterpart to directed browsing and searching (Erdelez 1997). Encountering information can facilitate discovery or foster creativity (Toms 2000). There are many reasons to keep this type of reading experience in mind as we design digital genres from print ones. There is a trend toward minimizing this sort of encounter in the digital world. After all, we can pinpoint people’s interests and needs; we might think that there is no
particular reason to continue to foster overly broad delivery vehicles like magazines. Even though magazines show no real sign of disappearing, they do show signs of becoming highly specialized and personalized and are designed to minimize accidental brushing up against content that might be outside the reader’s normal interests. We must be careful that these efforts to meet readers’ specific information needs don’t short circuit current channels that enable people to encounter new ideas outside of their defined interests. As we saw in Chapters 2 and 3, we read magazines differently than we read other sorts of material. EMagazine publishers have attempted to duplicate the experience of turning magazine pages,¹⁵ but casual navigation (e.g., flipping quickly through the magazine’s pages or refolding a magazine page to focus on sidebars, cartoons, and other embedded content) has yet to be duplicated, and it’s not clear whether it can be. Even the physical venues in which one buys magazines, such as newsstands and bookstores, are conducive to encountering new publications and unanticipated stories. Readers are attracted by pictures, advertising, cartoons, and other material that is more or less segregated in the digital world. It seems important to preserve some of these aspects of the print genre.
6.3.3 eTextbooks and Course Packs
Electronic textbooks. Textbooks are often seen as a genre that would derive great benefit from being produced, revised, and distributed electronically. We have seen students at all levels burdened by backpacks full of print textbooks; eTextbooks would theoretically reduce that burden. Furthermore, textbooks require frequent revision and replacement; eTextbooks would facilitate that process. Yet early studies of electronic textbooks found them to be poorly accepted by students and faculty alike and unlikely to be adopted if anyone has a choice in the matter. In 2005, Jay Dominick, a graduate student at the University of North Carolina and the CIO at nearby Wake Forest University, took advantage of an opportunity to study eTextbooks in a real classroom setting as part of his dissertation research (Dominick 2005). He wanted to know why the genre transition from print to digital was so universally unsuccessful. His deployment of eTextbooks involved four courses and five instructors who used different kinds of health, physiology, exercise, and anatomy texts and taught using a range of pedagogical techniques. Dominick makes many interesting observations about the textbook genre before he goes on to discuss his findings about eBooks. The first observation is that textbooks themselves, as print artifacts, are not a well-loved form. The second is that eTextbooks suffer from being constrained to parrot the physical form of a textbook: making the eTextbook so closely follow the existing print genre is a mistake.
¹⁵ For example, Zinio (http://www.zinio.com) is a service that more or less duplicates print magazines on the screen.
Dominick goes on to make the point that eTextbooks must change in four ways if they are to catch on (and he feels that they will catch on in due course). First, the economic ecosystem that surrounds the textbook must change, mainly to take full advantage of the fact that an eTextbook is digital.¹⁶ The second change he anticipates involves the legal environment of eTextbooks: the DRM was simply too cumbersome and too unforgiving for the students to feel that they had made a fair purchase (i.e., among other things, their fair use rights were limited, and they did not have a reasonable expectation that they could even open the textbook in 10 years); furthermore, they could not engage in the normal practice of reselling the books at the end of the course. The third problem that Dominick notes is precisely along the lines that we would expect from Chapters 2 and 3: the students did not like reading from the eTextbooks; reading on the screen caused fatigue; the eTextbooks were less mobile than regular textbooks; and the presentation on the screen did not allow them to maintain a proper sense of orientation. Dominick observes:
The overall conclusion that I draw is that it is textbook reading in general that causes the sense of fatigue, and that the addition of an electronic interface probably served to give focus to a general displeasure with school reading. Said in a different way, it is the nature of the textbook itself as currently conceived, independent of the form of its presentation that causes that discontent (Dominick 2005, p. 366).
He concludes that the third problem arises from the stilted mode of interaction the eTextbooks offer, coupled with the fact that the material in the books is simply not that engaging for the students. The final point that Dominick makes about the eTextbooks is a general indictment of the educational system.
The important lesson to take away here is that there is little point in carrying an unsuccessful genre forward from print to digital. Rather, this transition might provide exactly the right opportunity to rethink the form and its social role. As he winds up his dissertation, Dominick predicts that in two generations, textbooks as we know them will disappear completely.
Course packs or course readers. Course packs or course readers are the materials that have been assembled by the instructor for a particular course. They are usually copyright-cleared, photocopied, and bound by a third party.¹⁷ The course materials may include everything from journal papers to newspaper articles, to individual chapters of longer books, to private essays.
¹⁶ Dominick makes the important observation that the person purchasing the book (often, the student’s parents) is not the person reading the book. Furthermore, the publisher is selling the book to the instructors, not to the students. The bizarreness of the commercial circumstances that make up textbook economics cannot be overstated.
¹⁷ In the United States, copy shops like Kinko’s make the assembly of course packs a part of their business.
Course readers are already making the transition to digital form and are perhaps more interesting candidates for eTextbooks than normal textbooks are. In our Pocket PC eBook study, almost all of the students (particularly the undergraduates) reported being eager to replace their course packs with electronic texts (Marshall & Ruotolo 2002). The course packs are heavy and bulky; the materials are usually read quickly; they have no long-term value to the students (they are usually tossed or recycled at the end of a class); they usually represent secondary materials for the course; they are costly and cannot be sold back to the bookstore; and the readability could easily be improved if they were in eBook form. It is important to note that copyright clearance is a major aspect of preparing the course readers; it contributes to the difficulties associated with their production and their ultimate cost. Unfortunately, some of the students who participated in the study did not realize that copyright clearance was at the root of the high cost of course readers.
6.3.4 Electronic Journals
Academic journals have been in a state of crisis for the past several decades. Subscriptions to academic journals are expensive for research libraries to maintain, yet they are a critical resource for researchers in any field. Over the years, the literature has become progressively more fragmented and esoteric; library budgets can no longer cover the growing number of expensive periodicals that faculty members demand. Furthermore, because much of the labor that goes into journals is free (researchers write articles, and other researchers and editors participate in the peer-review process as part of their everyday work), there has been a growing realization that the system not only needs to change, but also that it can change (Amiran, Orr, & Unsworth 1991). In the early 1990s, humanities researchers realized that they had the means to wrest control from the traditional academic publishers. The Journal of Postmodern Culture was the first journal to “go digital” (Amiran & Unsworth 1991); it has been followed by many others that are continuing to pioneer a new open-access publication model. There is much written on the subject from many different perspectives; the interested reader need only start from pivotal publications in the humanities and social sciences (Unsworth 2006) or in the sciences (Lynch 2007). There is a very real perception that the model for scholarly publishing can and will change (and is indeed in the process of changing already).
6.4 NEW DIGITAL GENRES
The past two decades have seen the rise of numerous new digital genres such as Web logs (blogs), wikis (Wikipedia, in particular), hypermedia and multimedia, hypertext fiction, and non-textual forms. It seems more important to look at two phenomena that these emerging genres share rather than to pin down how the new genres are used and by whom or how they might be realized as
eBooks. These genres are mercurial and the list keeps growing (e.g., Twitter tweets, cell phone novels, status walls and profile pages, and chat logs). The more important of the two phenomena is the way these emerging digital genres have blurred the line between reading and writing: blog readers are often blog writers; wikis are a participatory form; and hypertext fiction is most famously about decentering the text, eroding the authority of the writer. A number of new forms (tweets, update walls, and transcripts of yesterday’s instant messages) fall under this rubric. Many believe that it is this blurring of roles between reader and writer that will keep reading alive. A culture of participation and interactivity is central to today’s emerging digital genres. The second important aspect of emerging digital genres is the way in which they fluidly recombine and are delivered in multiple modes: the situation of reading (the reader’s context, the delivery device, and the reading application) can no longer be carefully anticipated. One digital form is scraped, funneled, or otherwise projected into or onto another. Tweets are folded into blogs; maps are the portals through which one finds stories; wikis form a scaffolding for organizing other written forms. Because these forms are familiar, this discussion will identify a few salient points about each one, identify a few starting points in the literature, and move on.
Web logs or blogs. Web logs, or more commonly blogs, are a form of journal, episodic chronicle, or news report that is published periodically using a preconfigured format template. The publication is presented in reverse chronological order and often includes links to other material on the Web. Because they are often published irregularly, many blogs are available through subscription feeds such as RSS or Atom.
Subgenres of blogs include political blogs, gossip blogs, personal journals, technology reviews, travel writing, and cooking and food blogs; blogs may also be sponsored by corporations and serve business purposes. They may sometimes appear in non-textual forms (e.g., podcasts, which are audio files, or photo blogs, which are obviously visual content). Interactivity is provided by a comment feature; comments are an important part of blogs, with whole discussions taking place in the wake of the initiating blog entry or post. Sometimes the blog itself is considered the participatory arm of a less interactive print form like a newspaper. Current research topics that fall under the eBook purview include finding which blog to read (Hearst, Hurst, & Dumais 2008); visualizing blog archives (Indratmo, Vassileva, & Gutwin 2008); citizen journalism (Thurman 2006); and exploring reader and writer roles (Baumer, Sueyoshi, & Tomlinson 2008). Wiki writing and Wikipedia. A Wiki is a style of Web site that uses software to manage a set of interlinked Web pages. A fundamental aspect of Wikis is that they support the easy creation and modification of content by a number of contributors; versioning and version history mechanisms allow changes to be tracked, and associated discussion pages allow the changes to be explained and
deliberated. Wikipedia is the best known example of a large-scale Wiki and is something of a Petri dish for different kinds of studies of how the content evolves through negotiation (Almeida, Mozafari, & Cho 2007). Although there is a long history of software to manage collaborative hypertexts (e.g., Chang 1998), Wikis are the first such application to be widely used by non-researchers. Hypertext fiction. Hypertext fiction began as a participatory form. Some of the fiction itself was collaborative (Coover 1992); in other cases, the reading itself—the order in which the reader chose to encounter the individual nodes or lexia—was considered a form of participation (Moulthrop 1993). Early hypertext novels required their own specialized reader software; even today, specific navigation functionality used by some of these novels is not available in common reading software like Web browsers.
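Returning briefly to blogs: the reverse-chronological presentation and feed-based subscription described above are simple enough to sketch. The example below assumes an RSS 2.0 feed in which every item carries a `pubDate`; the function name and the sample feed are made up for illustration, and a real reader application would also need to handle Atom feeds, missing dates, and network retrieval.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>A Hypothetical Reading Blog</title>
    <item>
      <title>On marginalia</title>
      <link>http://example.org/marginalia</link>
      <pubDate>Mon, 01 Jun 2009 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>On page turning</title>
      <link>http://example.org/page-turning</link>
      <pubDate>Wed, 03 Jun 2009 12:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>
"""


def posts_newest_first(rss_xml):
    """Parse RSS 2.0 text and return its posts in reverse chronological order."""
    channel = ET.fromstring(rss_xml).find("channel")
    posts = []
    for item in channel.findall("item"):
        posts.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS 2.0 dates use the RFC 822 format that email headers use.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return sorted(posts, key=lambda post: post["published"], reverse=True)


if __name__ == "__main__":
    for post in posts_newest_first(SAMPLE_FEED):
        print(post["published"].date(), post["title"])
```

Because the feed is just structured text, the same entries can be projected into other forms (a news aggregator, a sidebar on another site), which is one concrete instance of the recombination of genres discussed in this section.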
6.5
EBOOKS AND LIBRARIES
EBooks occupy a puzzling niche in public and academic libraries. Libraries have been in the business of serving content in one form or another to their constituencies. They have adapted to the availability of different genres (e.g., film, sound); new media (e.g., record albums, videodisks, CD-ROMs, and microforms); and new delivery technologies (e.g., OPACs, the Internet). Yet eBooks don’t fit neatly into this ecology; there’s something about them that just doesn’t work. Libraries have been bravely embarking on pilot programs that involve eBooks. They have purchased reading hardware (Rocket eBooks, SoftBooks, and now Kindles and Sony Readers), eBooks that can be read in standard Web browsers, and various other experimental means for delivering material into their patrons’ hands (e.g., see McKnight & Dearnley 2003). A number of these programs have disappointed the librarians who have tried valiantly to make them work. In this section, I will give a brief account of one such program,18 followed by a summary of the results of comparable efforts of this type. There are general problems with eBooks in libraries, not the least of which is the result of the interaction of eBook technology, DRM restrictions, sustainability, and the library’s mission. The reason I’m including a description of a specific program is that the problems that the library encountered in implementing this program are general and instructive; I have deliberately omitted the name of the library director and the public library district involved in this program to protect their privacy.
6.5.1 A Pilot EBook Program in a Public Library
This program took place as the second generation of eBook readers reached maturity, resulting in the availability of purpose-built eBook platforms and prepared titles. The library district’s
18 This account is derived from a phone interview I conducted with the Library Operations Director of a public library district who started the pilot program.
eBook program was funded by the local Library Foundation19; initial funding for the program was $15,000. At the time of the interview, the library director had spent $3,500 of a $5,000 advance; $2,000 was spent on content. How did the library director decide to launch an eBook program? She had experience with technology as a librarian at a large aerospace company’s technical information center before she had taken the public library position. She became interested in eBooks as a new library technology, and she turned to listservs as a resource when she needed more information about them. Her enthusiasm for this type of project was reinforced when she saw Robert Garthwaite, Franklin’s Vice President of Worldwide Sales and Marketing, speak about the eBookman at a big public library conference. About 100 of her fellow library directors attended the talk. When Garthwaite asked “Who has eBook readers,” only about four hands shot up. She wanted to be one of the pioneers. The library director decided to use funds to purchase purpose-built eBook readers because she strongly believed that a public library has a service mission, and if she had gone with a NetLibrary-like content model, library patrons would have to own PCs to read eBooks (or they would need an MP3 player to access Audible’s content). At the time of the program, the library director felt that some members of the community would be excluded by a content-only model that required the patron to own hardware.20 The library director initially purchased four REB1100s (the successor to the Nuvomedia’s Rocket eBook devices). SoftBook Press was the content provider; at the time of the program, content could also be purchased from Powell’s Bookstore or Barnes and Noble. SoftBook Press set up a deposit account for her (deviating from their usual credit card purchase model), but they still needed a credit card to open the account. She ended up using her personal credit card. 
The library director bought the actual hardware for the program at a large office supply chain store. The store’s convenience was fortunate because two of the original four devices were defective; she returned them, but one from the second batch was similarly defective. She was dismayed at the devices’ 50% failure rate, because she didn’t have the time to troubleshoot them over the phone with customer support; she needed them to work from the outset. Because the device required an analog phone line to download eBook content, and the library had a (new) digital phone system, the library director had to pay $450 for an analog line to be
19 The Library Foundation is a local nonprofit supported by community members. The nonprofit approached the library director for proposals, and she gave them several, including the eBook project. They funded it because they wanted to fund something technological, although they didn’t know what an eBook was at the time.
20 The library district’s stated mission, in fact, is: “to provide all people with unrestricted and free access to its services and to a balanced, unbiased, and diverse collection of books and other materials to meet the community’s informational needs.” At the time of the program’s inception, laptop computers and MP3 players were less ubiquitous than they are today.
reinstalled at the library. As a work-around, a member of the library staff took the eBook devices home to load new content. The library director originally wanted to load new content using removable storage, but she had difficulties finding the appropriate hardware. What did patrons need to check out an eBook from the public library? The cost and potential fragility of the hardware made it necessary for patrons to provide a picture identification card and a signed agreement in addition to their usual library cards. To package the eBook readers for checkout, the library director purchased thermal lunch pails to store them in. She thought hard about the characteristics of the container and asked other librarians who had implemented similar programs what they had done. The container she chose had a hard liner so it retained its shape and a black bottom so it didn’t show dirt. Each container was outfitted with a luggage tag and embroidered with the name of the library district on the front. The eBook kits she designed included:
• The REB1100 eBook platform;
• A laminated reference card with information;
• A user agreement to explain the fines and fees;21
• A user survey to gather information for the library;
• A cleaning cloth; and
• A charger for the eBook.
The four devices that were available for checkout were each loaded with different titles (representative of the library’s most popular genres). Two of the eBook readers contained four mystery/adventure titles, one contained six nonfiction best sellers, and the other contained six fiction best sellers. Circulation desk clerks had to be trained to check the eBook kits out and in. At check-in time, the circulation clerks reset the eBooks to the first page and removed all highlights and bookmarks. The reference desk provided user support. The library director felt that the circulation desk and the reference librarians would thus know whether the program was a success; she was interested not just in the early adopters who would be immediately drawn to the eBooks, but also in the nonusers and their potential to become interested. The library director felt that people in the community were not sufficiently aware of the electronic services the library offered them; whether this feeling translated into promoting new capabilities provided by the library’s catalog or raising patron awareness of the library’s resources was not clear. Because the library director cited a veteran library volunteer as an example of this shortfall (the volunteer was unaware that the library had a subscription to InfoTrac22), her feelings about this problem might indicate something more systemic that would have a strong bearing on whether the typical patron knew that the eBooks even existed.
21 The overdue fine was $1 per day. There was a $20 fine for a lost battery, and a $5 fine for a lost stylus.
6.5.2 EBook Experiences in Other Libraries
Although this anecdotal account of one library’s experience with the devices might seem peculiar, more rigorous case studies of a library’s adoption of eBooks are similarly riddled with stories about work-arounds. How are the eBooks checked out and returned? How is material loaded onto the devices? Although the devices are loaded with the most popular reading material, the vehicle is necessarily more fragile than the comparable print books. McKnight and Dearnley (2003) reported similar results: the outcome of their pilot (using similar hardware) caused them to conclude, “It is not clear from the outcomes that portable eBooks provide a viable delivery mechanism within a public library.” But what of other pilot programs that did not feel that the library needed to supply eBook hardware and instead just supplied content? Garrod cites a less formal source, OverDrive, an intermediary source for digital content, as saying that their titles are “flying off the library shelves” of the Cleveland Public Library (Garrod, 2003). Indeed these programs seem to have met with greater success, especially since laptop and audio player ownership has grown more ubiquitous; many libraries now support downloadable eBook content, and there are well-documented best practices that librarians can consult when they are setting up their own programs.23 In general, the success of these programs depends on different characteristics, including:
• The degree to which the eBooks are discoverable as part of the library’s collection (e.g., Are eBooks well integrated into the catalog and the library’s Web presence? Is new material promoted so that library patrons are aware of it as it is acquired?);
• The library staff’s familiarity with the procedures for accessing and downloading the eBooks, and their general familiarity with the material that is available electronically;
• The patron’s ability to use the digital material in a larger activity (e.g., If the patron is a student, does the library support storing notes or annotations? Does the library help enforce proper citation practices?);
• If the library is an academic library, are the eBooks well-integrated with other types of educational material? (e.g., Do courseware management systems refer directly to the eBooks?);
• Are the library’s electronic resources sufficiently visible on the open Web?
• Are there hardware stations in the library that allow patrons who do not own the appropriate hardware to use the digital content locally? (e.g., Are there MP3 players or appropriate applications and headphones on hand that would enable patrons to listen to audio content at the library?)
• Is the DRM embedded in the eBooks sufficiently flexible to address patrons’ needs? In other words, can the eBooks be circulated appropriately?
• Does the authentication system strike an appropriate balance between privacy interests and the need to protect the content?
22 InfoTrac is a popular information resource for periodical literature. It is marketed to libraries and schools.
23 For example, see http://www.slideshare.net/chadmairn/library-best-practices-for-ebook-and-eaudiobook-circulation.
Certainly eBooks have advantages that are bound to make them popular in a public or academic library, not the least of which is the ability to access them over the Internet and the inherent portability of digital content. In some circumstances, they may also not be copy-limited (i.e., multiple users might be able to access them at the same time). But what of eBooks that require specialized hardware: will Amazon’s Kindle suffer the same fate in libraries as the earlier REB1100? The Kindle has some of the same characteristics, albeit more storage and better overall ergonomic design. Studies to date have not involved actually putting the Kindles into normal library circulation, but many of the same issues are bound to arise.
6.6
SUSTAINABILITY AND DIGITAL PRESERVATION
Sustainability and digital preservation are topics that merit more than a section, but here we will raise a few major issues intrinsic to the long-term fate of eBooks. Let’s just raise the simplest possible question: suppose a consumer purchased an REB1100 in 2001. A decade from now—when the hardware, software, and content are almost 20 years old—will the consumer still be able to read her personal library of eBooks? It is immediately obvious that many forces are at work, beyond the immediate lifespan of the REB1100’s display and storage: will the content’s format still be viable? Will the DRM provisions let the eBooks’ owner move her library to a modern platform? How will she move the content from the REB1100 hardware to another platform (ignoring the first question of whether the DRM will let her do so)? And what of her annotations—can they be moved to the new platform too? Will they still be readable? All at once we can see what all the fuss about standards is about. In the current state of affairs, it is unlikely that the answer to the first question will be a resounding yes. In fact, it’s far more likely it will be a sheepish no. Because most readers did not invest very much in content for the second-generation reading platforms, it is likely those platforms will fade away without the consumer (who was albeit an early adopter) even noticing that he or she has lost a small collection of eBooks and perhaps even a few experimental annotations and bookmarks. It is not clear how today’s
Kindle owners (who have a great deal more storage and hence may spend a lot more money on content) will feel; some provision for converting or replacing obsolete content will be necessary. What would we need to reproduce our personal digital libraries? Let’s use a narrow description of personal digital libraries for the moment and omit the many types of content we may have created ourselves—photos, videos, email, calendars, address books, and the other digital belongings we amass in the course of our everyday use of computers—and just concern ourselves with eBooks and records of our interactions with eBooks. To sustain our personal libraries, we’ll need at least:
• A record of the published content that’s in our library (something tantamount to our own card catalog);
• A record of our library’s personal geography (in other words, records of our interactions such as which books we’ve read, which books we’ve shared, our annotations, our clippings and notes, and any organization we’ve imposed on our library);
• Access to the eBooks that are still “in print” (i.e., eBooks that can be replaced by editions formatted to be readable on current platforms without negotiating a repurchase);
• Access to the eBooks that are still available through other sources (i.e., eBooks available from an alternate source, perhaps through OCA, but not necessarily from the same publisher as our original purchase); and
• A potential means of converting (migrating) the eBooks that are out of print and no longer available electronically.
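The requirements listed above can be pictured as a small record structure plus a rule for choosing among replacement, an alternate source, and migration. The following Python sketch is only an illustration under stated assumptions: the class name `LibraryEntry`, its fields, and the function `sustain_plan` are hypothetical and do not correspond to any existing eBook platform or standard.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Plan(Enum):
    REPLACE = "replace with an edition formatted for current platforms"
    ALTERNATE = "obtain the title from an alternate source"
    MIGRATE = "convert (migrate) the copy we already hold"


@dataclass
class LibraryEntry:
    """One row of a personal 'card catalog' (all field names are hypothetical)."""
    title: str
    file_format: str           # label for the format of the copy we hold
    in_print: bool             # a current edition exists without a repurchase
    alternate_source: bool     # available elsewhere, not from the original publisher
    # The "personal geography": records of what we did with the book.
    annotations: List[str] = field(default_factory=list)
    read: bool = False


def sustain_plan(entry: LibraryEntry) -> Plan:
    """Pick the least effortful way to keep a title readable, in the order listed above."""
    if entry.in_print:
        return Plan.REPLACE
    if entry.alternate_source:
        return Plan.ALTERNATE
    return Plan.MIGRATE
```

Note that the annotations live in the record rather than in the eBook file; under this assumption they survive even when the published content itself is replaced by a newer edition.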
In other words, to sustain a personal digital library of published material, we will need a record of what we had; a record of what we did with it; and a means of either replacing or migrating the published content to new platforms. While it is unwise to speculate that personal digital libraries contain material that is truly permanent—valuable beyond a lifetime, passed on as a digital legacy to heirs—it seems prudent to believe that at least some eBooks and the artifacts produced by reading them will have meaning for a long time.24 However, we cannot exclude shorter-term material either because people are notoriously poor judges of what they’ll keep “forever” and what they’ll toss at the next opportunity and never miss. The clipping about vacation spots might seem useless after reservations have been made and the vacation is long over, but it may retain its value as a pleasing reminder of a perfect
24 I’m deliberately omitting personal libraries that are important from a cultural perspective (e.g., the libraries of well-known writers or of historically important figures). These are the province of professional archivists and are likely to be handled in a different manner than we handle our own digital belongings.
moment in the sun (or even a humorous reminder of 7 days huddled inside a yurt during an unexpected rainy spell). We can’t necessarily predict an item’s value or permanence. It’s easier to keep electronic material than to cull it, and it’s easier to lose it than it is to maintain it. Many people are beginning to realize exactly that. For example, most students say they would (or do) keep more of their class work if the materials were electronic; they lose their patience with comparable paper archives, since they must be moved or stored. When I asked the students how they felt about giving up the Pocket PC at the conclusion of the study, they said things like: I only ever got rid of two course books in my life. Those were the Astronomy and Calculus. I keep all my books. And that’s something about the eBook. It’s partly that I have to give up the device, and I won’t be able to play solitaire on the bus anymore. But it’s also kind of creepy, kind of weird, being a grad student, and having been a committed student for a long time, being a bibliophile. Being used to keeping my books. They’re so ephemeral. And even if I save the files somewhere, I’ll have to go through this process of reconstruction someday, of resurrecting them. Reconstructing them. You know, buying a computer, transferring them to the new computer, opening them up. And I also wonder, am I going to refer to them? Although not all students feel this way—some count on the revenue generated by selling their textbooks back to the bookstore at the end of the term—it’s also clear that scholarship requires at least some measure of stability. Elsewhere we have explored what researchers and scholars expect to keep (Marshall 2008a) and how they go about ensuring the sustainability of their digital belongings (to varying degrees of success) today (Marshall 2008b). 
Others have looked into the larger problem of maintaining the viability of published material at significant timescales (Abrams 2005; Arms & Fleischhauer 2005; Baker et al. 2006; Lynch 1999; Maniatis et al. 2005). Instead of going into detail here, we will simply say that content formats and storage strategies should be designed with an eye toward their permanence and sustainability, and realistic economic models for sustainability still need to be developed.
chapter 7
Beyond the Book
The greatest temptation we face in innovating to go beyond the personal library, print book, and printed page in designing eBooks is to simply follow our imaginations. While our imaginations may well be a great source for new capabilities, we should take care to temper them with an understanding of what readers—readers with different backgrounds and skills, readers with different purposes, readers using eBooks and other digital material that represent radically different genres—actually do. This knowledge won’t keep us from being innovative; rather, it will help shape the products of our imagination so they are genuinely useful (and usable). In the book Close to the Machine, Ellen Ullman talks about the day the users of the software she was developing became real flesh-and-blood beings to her:
I started to panic. Before this meeting, the users existed only in my mind, projections, all mine. . . . Now I was confronted with their fleshly existence. . . . The machine events already had more reality, had been with me longer, than the human beings at the conference table. Immediately, I saw it was a problem not of replacing one reality with another but of two realities (Ullman 1997).
It is easy to see how eBook and personal digital library functionality can diverge from human practice; keeping functionality and practice well aligned is the motivation behind Chapter 5, and it is something that is important to keep in mind as we start talking about beyond paper capabilities for eBooks.
7.1
BEYOND PAPER CAPABILITIES
The developers of the XLibris prototype initially focused on creating a reading experience that was very much like reading on paper (Schilit, Price, & Golovchinsky 1998b). Thus, the first time we deployed the pen tablet reading platform, readers found the experience intuitive, and very much like reading and annotating a print technical article (Marshall et al. 1999). But after they had used the prototype for a while, they began to ask the obvious question: why would I read on a computer if it is just like reading on paper?
A reading group member who had volunteered to take part in the deployment said: I could have more from this device, because it was too much like plain, ordinary paper. And there must be a high powered computer behind it. But I wasn’t really taking advantage of the power. The original prototype had several experimental capabilities that took advantage of the digital representation of the document and the reader’s interactions with it. For example, a feature called the Reader’s Notebook could be used to gather all of the annotated passages in the document; it was as if the annotations (the highlights, underlines, and marginalia, along with the passages that the marginalia referred to) had been clipped out of the document with a scissors and taped in order to a single page (or multiple pages if there were a lot of annotations). During the initial deployment, we thought our target audience—a reading group that was discussing a series of technical articles—would use this collection of annotations during their face-to-face meeting to talk about the paper; each person’s Reader’s Notebook would remind the individual of what he or she was interested in. Thus, before the meeting began, instead of printing out each person’s marked up copy of the article, we gave each participant a hardcopy of his or her Reader’s Notebook. We thought these personalized collections of marked passages would help focus the discussion on what each person had found to be important in the article. In practice, this capability was not as useful as we had hoped. The readers who had made few marks on the technical article needed the rest of the text to provide context. In fact, everyone wanted the whole document with their marks in context, but this effect was particularly profound for those readers who had made very few marks. 
One of them said: “[The annotations] were all there in the physical document, and we’d gone from the beginning to the end anyway.” Readers who had annotated extensively were similarly taken aback: they had received such a lengthy collection of passages that their Reader’s Notebooks were longer than the original article (the layout of the Reader’s Notebook inserted extra space to separate excerpted passages for the sake of readability). Thus, a reader who had highlighted extensively and had written substantial marginalia said: “I realized [the annotations] are totally useless because I highlighted so much of the paper.” Furthermore, any visual cues to keep the reading group members on the same page during their discussion were missing: they had lost the advantage of all looking at copies of the same document. The Reader’s Notebook had produced personalized views of the article that were necessarily different for each reader. Perhaps, the Reader’s Notebook would have been useful for a different activity (e.g., writing an essay synthesizing a number of sources), for a different document genre
(e.g., one that was longer), or if it had included some different capabilities (e.g., the ability to manipulate the clippings in a two-dimensional space). But as it was, the functionality was mismatched with participants’ practices in our original technology intervention. A second somewhat obvious capability that was missing from the XLibris prototype (by design) was the ability to follow embedded links; instead, XLibris suggested related reading using a thumbnail in the margin as a link to a new document. This new document was potentially relevant to the content that the reader had annotated. We also proposed a citation-chasing capability to the reading group members, reasoning that technical articles often referred explicitly to external content; these links could also be presented in a way that was not visually disruptive. But when we discussed citation following with the study participants (who were all seasoned researchers), they expressed little desire to interrupt their reading to chase down a citation. One researcher said, “The problem with references is, if you don’t know them, and you’re not familiar with the sources, they’re usually duds.” Another said, “Well, you know, it’s hard. Because during the reading you don’t want to go to it, right? Because it’s too distracting.” In a later version of the prototype, we reached a compromise solution and designed a “to read” list that would hold the links the reader wanted to follow until he or she was done reading. The lesson to take away from these examples is not to throw up our hands in frustration and limit development to building the perfect paper emulator; there would be little point in that (beyond potentially saving paper). Instead of developing the perfect paper emulator, we can develop capabilities based on our understanding of specific disciplinary practices, specific types of reading, and specific genres of material. We can tame the vagaries of our imaginations by checking in with our readers.
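The “to read” compromise described above amounts to deferring link traversal: the reader marks a citation without leaving the page, and the links are handed back only when the reading session ends. The short Python sketch below illustrates that idea; it is not the XLibris implementation, and the class `ToReadList` and its method names are assumptions made for illustration.

```python
class ToReadList:
    """Holds links the reader wants to follow later (hypothetical design)."""

    def __init__(self):
        self._pending = []  # citations in the order the reader marked them

    def defer(self, citation):
        """Remember a citation without interrupting the current reading."""
        if citation not in self._pending:  # marking the same link twice is harmless
            self._pending.append(citation)

    def finish_reading(self):
        """Hand back everything that was deferred and start with an empty list."""
        deferred, self._pending = self._pending, []
        return deferred
```

A reader who marks two citations while reading would see nothing change on the page; calling `finish_reading()` at the end returns both, in the order they were marked.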
7.1.1 Domain- and Practice-Specific Capabilities
Adding value to eBooks requires a good understanding of what people are doing when they read and why they are doing it. Let’s look at a few brief examples, and how some specific capabilities may be derived from them. Translation of ancient Greek texts. One predictable mode of student annotation occurs in foreign language texts used as textbooks; students routinely translate difficult words and squeeze the translations between the lines, presumably so they can refer to the translations again if the words
This finding was echoed in our later research on scholarly research and digital preservation (Marshall 2008). However, during their meeting, the story might have been different: several times, they referred to other papers that several members of the group had read, and several other members had not. If wireless connections had been available at that time, it might have been interesting to have live links to other papers during the group’s meeting.
FIGURE 7.1: A dynamically added back button supports quick checking of a link’s destination.
are forgotten (Marshall 1997). Most instructors who teach from foreign language texts confirm this need. Hence, we might predict that students studying these texts on the screen would appreciate per-word translation features. Indeed, students found the Perseus project’s translation feature (one in which they could retrieve a word-by-word English translation of the Ancient Greek texts) to be one of the most compelling reasons to read their assignments on a computer (Marchionini 1995). Citation linking in legal briefs. Citations are essential for legal research; outbound links connect a case with precedents and inbound links can help an attorney or law student discover whether the case still represents good law (i.e., whether the case has been overturned or reinterpreted by subsequent cases). The major legal research services, such as Lexis and Westlaw, have facilities for chaining forward and back to facilitate this kind of checking. In a study of law students preparing for Moot Court (Marshall et al. 2001), we found that the students typically liked to initiate their research using a central case as a starting point, looking for “a string to pull,” in lieu of searching Lexis or Westlaw. This style of working suggested that the standard eBook forward and back controls be replaced with two alternate types of navigation: one to support quick citation checking (the ability to look at a link’s destination and return quickly to the original document) and another one to juggle among pages in multiple documents. Hence, two domain-specific navigation facilities were added to the reading software: (1) a dynamic “back” button to support quick traversal-and-backtrack appeared in the margin of legal
FIGURE 7.2: A transparent overlay shows thumbnails of working pages to support quick transitions among them.
FIGURE 7.3: Two chapter-based visualizations of the content of Melville’s Moby Dick. (a) From Moby Dick, Chapter 1, Loomings. (b) From Moby Dick, Chapter 41, Moby Dick.
cases that were link destinations (Golovchinsky 2002); and (2) a semitransparent overlay was added to support moving among several documents used in a writing activity. Figure 7.1 shows the Back button that appeared in the margin. Figure 7.2 shows the navigation overlay that supports juggling among recently used documents; the document pages shown include not only pages from legal cases but also a page from the clippings notebook and a page that represents a document workspace. Visualizing a text. The availability of digital texts and eBooks has produced some fundamental changes in humanities scholarship. Later in this chapter, we will look at analysis tools that work at the collection level, but first we can consider eBook facilities that support analysis of the work, rather than supporting other kinds of work that the reader is doing with the book. There is a tradition of developing this type of tool, starting with facilities to analyze link patterns within a work (Bernstein et al. 1991). Instead of looking at link structures, we will use a simple illustration of book-level analysis: two word-cloud visualizations using the tool Wordle that show parts of Moby Dick’s narrative arc. Figure 7.3a shows a visualization of Chapter 1, Loomings, that establishes the novel’s setting, and Figure 7.3b shows a visualization of Chapter 41, Moby Dick, in which Captain Ahab is in pursuit of the whale. Note that this example is not intended as a meaningful analysis of the content of Moby Dick, but is simply meant to illustrate the idea of within-text analysis. In practice, readers often find these visualizations difficult to interpret. Using them in any meaningful way often requires training; choosing the right analytic tool for the job requires careful thought.
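A word cloud of the kind described above is driven by per-chapter word frequencies. The Python sketch below shows one simple way such counts might be computed; it is not Wordle’s actual algorithm, and the function name `top_words` and the tiny stopword list are assumptions chosen for illustration.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stopword list; real tools use much longer ones.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that", "was", "i", "me"}


def top_words(chapter_text, n=5):
    """Return the n most frequent non-stopwords in a chapter as (word, count) pairs."""
    words = re.findall(r"[a-z']+", chapter_text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)
```

Running `top_words` on the text of each chapter and comparing the results gives the raw material for the two visualizations; sizing and laying out the words is a separate step that this sketch does not attempt.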
7.1.2 Within-Book Search
Within-book search is often the function readers cite when they are asked about the advantages of an eBook. Yet if they haven’t used a within-book search capability, they may not be able to tell
http://www.wordle.net/.
you exactly how they would use it. Might they use it to find the name of a forgotten character in a complicated murder mystery? Perhaps. Locating a character’s introduction into the story is an oft-described search scenario. But, in practice, there are other ways in which search may be useful. Let’s listen to a graduate student in English Literature as she talks about an interface in which she specifies the search term by highlighting the word she wants to find in the text. The search feature will take her to the next (or previous) instance of that word: If I were doing the highlight thing [to specify the search term], I’d have to go through the whole rigamarole. Stop. Highlight. Again, that feels to me like a feature of the text that has been designed by some guy sitting in a room going, “Oh. What are readers gonna want to do with texts? Well, they might want to search for a word.” And so they put in a really . . . elementary search thing. Which works great if you are like, “Okay. I’m reading this Agatha Christie mystery. I want to find the first appearance of Mr. Brown. So I write in ‘Brown’ and I find—oh, there’s the appearance of Mr. Brown. I’m all good.” It just feels like something that has been designed for this sort of theoretical general reader. I’m not theoretical. And I’m not a general reader. I’m doing a specific kind of reading. I’m an English grad student. I’m engaging with the text in these really particular ways. I’m thrilled to have the ability to search. But . . . I need to be able to do certain things in the search that I’m not able to do. The eBook software’s designer was thinking about the forgotten character in the complicated murder mystery when he developed the search capability in question. Notice the reader doesn’t have to find the soft keyboard in the user interface: all she has to do is highlight the word in question with her stylus when she re-encounters it in the story, and away she goes. What’s wrong with that? 
It’s very easy to imagine a situation in which the word the reader wants to find is nowhere in sight. What’s more, within-book search may be used in at least two more ways. In addition to using search to locate or refind a familiar place within a book (i.e., to refind something the reader has seen before), it also may be used for certain kinds of research and analysis and it can be vital for navigating through unfamiliar terrain within a book. Let’s examine all three functions (navigation, refinding, and analysis) and explore how these functions might be added to the eBook’s user interface. Remember that these are within-book functions; there are several other functions that will be added to the list when we talk about collection level functionality. First, let’s look at navigation. The following snippet is taken from an interview with the same English literature graduate student as she explains how she read her eBook to prepare for class;
she has already explained that she had looked at the book before, but the last time she’d had time to read, she had only read particular sections: This week [the professor] asked us to look at an early 19th century critical text that relates in some way to the 18th century reading we’ve been doing . . . I don’t have any sense of what is in the Biographia, and what I’m looking for. . . . So okay, begin at the beginning. Chapter 1. And I read the heading to Chapter 1. It would be really nice if there were a table of contents to this. There isn’t. [pause] So Chapter 1. I get to the end of the description of what’s in the chapter. It says, “Comparison between the poets before and since Mr. Pope.” Well that sounds like [it’s] right up my alley. . . . So I basically searched for “Pope” and got to this page. Which was page 43 in the Reader. Well, once I’m there, it’s obviously in the middle of a paragraph, in the middle of a section. So to get an idea of where I am, and what he’s talking about, and sort of get oriented, I have to scroll up a little bit. Well, the beginning of the paragraph is, “The second advantage, which I owe to my early perusal.” I still need more context. I need to find out what he’s talking about. So I go back, skim read, get a picture of what he’s talking about. Then I scroll forward again to where I landed and start reading more seriously now that I’ve got the gist of [it]. (Marshall & Ruotolo 2002) While this use is not completely dissimilar from the mystery scenario we poked fun at earlier, it is different in some important ways. First, after she arrives at the word she searched for, she does not simply start reading there. Naturally, she has to back up, “to get an idea of where I am, and what he’s talking about.” She backs up to the first structural element she encounters, the beginning of the paragraph. With that act, she verifies that she is in the right place, but she still doesn’t have enough context to start reading. 
Thus, she goes back to another structural element (the beginning of the subsection, perhaps) and reads a bit. Then she navigates back to where she first found the term and starts reading again. Search-based navigation thus doesn’t just use terms; it also uses structural elements. And it doesn’t just involve one jump, but rather several jumps. The second use—refinding a familiar passage in the work—may borrow some of the techniques that we have seen in desktop search (Dumais et al. 2003), but it will need to use what’s at hand for context (which may be more limited than what would be available to perform desktop
search—consider search on a Kindle). When a user is looking for a familiar passage in an eBook, we will be able to assume that she will only want to look at pages she has seen; thus, if she has only read as far as page 67 or has skipped pages 101 through 131, this may be taken into account in the search. Annotations and other signs of reader interaction may also be important for refinding. A short segment of a student thinking aloud as she tries to remember what she read sheds some light on how we might want to implement this kind of capability:

I eventually got to—I think it was Chapter 16—but I’m not sure of that. Go to the annotations index. [pause] Okay. See, I’ve got two annotations. One is “poets of the present age,” and the other is “in the present age,” and I can’t remember which of them is the start of the other section that I wanted to look at. Um . . . Let’s try “poets . . . ” Oh, yes. Poets of the present age. Chapter 16. Oops. “Striking points of difference between the Poets of the present age and those of the 15th and 16th centuries—Wish expressed for the union of the characteristic merits of both.” This looks pretty relevant to what I was looking for. . . . Almost the whole page is highlighted, because it was all, “oh! This is what I would read out in class if I if I were saying, this is what I found in the text” (Marshall & Ruotolo 2002)

The third common type of eBook search is performed in service of analysis. That is, the reader may be looking for instances of a phenomenon or examples of how an author uses a particular word, image, or concept in his writing. For this type of search, the reader may need a thesaurus-driven capability because she’s not just looking for a word, she is also looking for evidence of her own past interaction. She is looking for a comprehensive list of passages that match her analytic needs.

I was searching one of the things that [the professor] had suggested . . . Things to look for in the critical reading.
And just one of the offhand things that she tossed off was, “do they say anything about universals versus particulars?” . . . And that caught my interest. . . . First I searched for “particular” and he uses the word particular a lot. So, you know, like a “particular theory,” a “particular couch,” whatever. And not in the sense that I’m looking for (Marshall & Ruotolo 2002).
Footnote: In other words, the two strings the student quotes from the annotations index (“poets of the present age” and “in the present age”) are phrases she has highlighted in the text.
If we consider all of the ways that within-text search is used, it is clear that it may be a very powerful tool for eBook readers. But it is by no means as straightforward as the implementations offered by commercial eBook platforms thus far. Let’s collect the diverse search functionality implied by the students’ quotes in this section:
• Multiple input modes for specifying search terms;
• The ability to use structure in search (e.g., “find the first chapter where this term is used”);
• The ability to find the next instance of a term (this functionality is the most similar to what exists in eBook products today);
• Interaction-based search (e.g., “find this term on any page I’ve read so far”);
• Annotation-based search (e.g., “find any place this term appears in an annotation anchor or in my marginalia”); and
• The ability to find all instances of a term, image, or concept (possibly using a thesaurus to broaden the reach of the search).
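To make the search modes collected above a little more concrete, here is a minimal sketch of how they might sit side by side in a reading application. It is an illustration only, not a description of any existing eBook product: the `Page` and `Book` classes, the function names, and the caller-supplied thesaurus dictionary are all hypothetical, and matching is reduced to case-insensitive substring search.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Page:
    number: int
    chapter: int
    text: str


@dataclass
class Book:
    # Hypothetical model: pages are assumed to be stored in reading order.
    pages: List[Page]
    pages_read: Set[int] = field(default_factory=set)                # implicit record of pages seen
    annotations: Dict[int, List[str]] = field(default_factory=dict)  # page -> highlights and marginalia


def find_all(book: Book, term: str) -> List[int]:
    """All pages containing the term (analysis-style search)."""
    needle = term.lower()
    return [p.number for p in book.pages if needle in p.text.lower()]


def find_next(book: Book, term: str, current_page: int) -> Optional[int]:
    """Next page after current_page containing the term, if any."""
    for number in find_all(book, term):
        if number > current_page:
            return number
    return None


def first_chapter_with(book: Book, term: str) -> Optional[int]:
    """Structure-based search: the first chapter in which the term is used."""
    needle = term.lower()
    for page in book.pages:
        if needle in page.text.lower():
            return page.chapter
    return None


def find_on_read_pages(book: Book, term: str) -> List[int]:
    """Interaction-based search: restrict hits to pages the reader has seen."""
    return [n for n in find_all(book, term) if n in book.pages_read]


def find_in_annotations(book: Book, term: str) -> List[int]:
    """Annotation-based search: pages whose highlights or notes contain the term."""
    needle = term.lower()
    return sorted(
        page for page, notes in book.annotations.items()
        if any(needle in note.lower() for note in notes)
    )


def find_concept(book: Book, term: str, thesaurus: Dict[str, List[str]]) -> List[int]:
    """Broaden a search with related terms from a caller-supplied thesaurus."""
    terms = [term] + thesaurus.get(term, [])
    return sorted({n for t in terms for n in find_all(book, t)})
```

Even in this reduced form, the sketch suggests why a single search box falls short: each function answers a different question about the same text, and several depend on records of reading that the platform would have to keep.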
We can see that the power of within-text search has yet to be fully realized in eBook products, although some interesting search-related capabilities have been implemented in various eBook prototypes (Golovchinsky, Price, & Schilit 1999). It will be difficult to develop a search capability that is sufficiently powerful that it meets the needs of the most sophisticated reader without being too arcane for basic use. Similarly, search functionality may be very domain- or task-specific; it will be difficult to balance the complex needs of a particular user community with the much more limited needs of the everyday reader when it comes to providing general, easy-to-use search.
7.2 PORTABLE PERSONAL LIBRARIES AND COLLECTION-LEVEL FUNCTIONALITY
Thus far, we have explored eBooks as independent entities in spite of our initial appeal to Clifford Lynch’s now 10-year-old argument that we should “think of portable personal digital libraries, not portable eBooks, as the future role of these appliances” (Lynch 2001). Now it is time to step back and examine eBooks in their broader context, either as elements of an external collection that the reader is using as a resource (e.g., Open Content Alliance’s collection of digitized books) or as part of the reader’s own library (Lynch’s portable personal digital library). We will distinguish between the two in this section only as far as we need to; naturally, some functionality and conceptual frameworks are the same. This section is by no means comprehensive, but it is a start. How will readers interact with large numbers of eBooks? How will they find them? How will they gather and organize them? How might they visualize these collections? And how might they work with the accumulated artifacts of
their interactions with individual eBooks (e.g., the annotations they’ve made or the excerpts they’ve saved)? We have already explored some of these questions in Chapter 4’s discussion of social use. This chapter is an opportunity to wrap it up, to look beyond the immediate. Because a discussion of collections more or less concludes the topic and borders on related topics in this series (including IR and PIM), this is the final chapter of this lecture. The interested reader is directed to the relevant companion lectures to take up the discussion where this one leaves off.
7.2.1 Search at the Collection Level

How important is within-collection search? It depends on both the collection itself (how it is structured) and the task it is being used for. Collection-level search may be used to locate a specific item, to refind an item, or in service of an exploratory task. Other lectures in the series will address these topics.

As we might suspect, if the material in the collection is prepared and structured in a manner that facilitates access by titles, headings, and tables of contents, search may be less important for basic navigational functions than if the material is basically unstructured. For example, in the Pocket PC study, the undergraduates were using Salem witch trial documents that had been prepared in numerous short segments and were accessible through a comprehensive table of contents; it was easy for the students to browse and navigate to where they needed to go via this explicit, well-organized structure. They especially used the alphabetical listings to find courtroom documents by name. On the other hand, the secondary materials that the graduate students were using were longer and less numerous. What’s worse, the displayed metadata they were offered by the eBooks, the individual titles, were almost indistinguishable from one another. Thus, the graduate students were far more dependent on search.

The undergraduate class tended to revert to search to look for biographical material on the Salem witch trials that was spread across multiple documents. The students couldn’t search over the eBook collections on their Pocket PCs; they had to switch to the eText Center’s online collection to search. Thus, there was good evidence of the point at which the students gave up on using structure to navigate and began to search across the collection to find what they needed. Some of the graduate students actively sought this capability for the mobile collections on their Pocket PCs:

Is there a way—and of course this would be hideous on the memory and everything—so I probably wouldn’t actually want to do it even if I could—a way from the [Pocket PC’s eBook] Library to search multiple texts at once for the same string?

We need not venture further into how search is used to locate a specific eBook (known item search), to refind an eBook, or to explore the collection, since we can appeal to the other lectures in this series that cover these topics.

Footnote: We have seen this shift from structure-based navigation to search in the Web at large. When the Web had fewer sites, and they could be organized by hand, a good index like Yahoo made the Web readily browseable. Now that the Web is much larger, and the structure is basically unknown, we rely on search engines to get where we are going, whether we know the specific Web site we are looking for or not. In fact, Google’s initial success might be traced in a large measure to the Web crossing that threshold and Google’s search taking advantage of the Web’s implicit social structure.
7.2.2 Re-encountering

If searching one’s personal library is an activity triggered by memory (“I know I have Naked Lunch here somewhere”) and browsing one’s personal library is an activity triggered by interest (“I’m going to look through Burroughs to find a topic for my term paper”), then re-encountering is their complement. It is an activity intended to stimulate memory (“Seeing this magazine article will remind me to send my father a letter”) or evoke past interests (“This newspaper front page reminds me of where I was during the first moon landing”).

Let’s look more closely at re-encountering. People deliberately put printed material (books, clippings, and so on) in places in the physical world so they will see them again without having to remember to look for them (Barreau & Nardi 1995; Malone 1983). If these things are left in plain sight or put into files, where they will be re-encountered when the person is looking for something else, they may be intended as reminders. Table 7.1 summarizes the role of re-encountering relative to other modes of finding material stored in our personal libraries.

For example, an IT manager reading a trade publication at work might notice an article about a new application that he thinks might be useful for his family’s computer. Because he’s at work, he may not read the article at the time he encounters it; instead, he may tear it out and put it in his briefcase to read at home. He doesn’t expect to remember to look at it. Rather, he fully expects to re-encounter it when he goes through his briefcase in the evening. Or an activist might tear an op–ed piece from The New York Times and leave it on her kitchen counter to remind her to write a letter to her congressperson. Her letter may use nothing from the op–ed piece, but by encountering it again and again, she knows it will eventually spur her to action. People save all kinds of things (e.g., books, clippings, music) with the hope that they will re-encounter
Footnotes: In truth, the IT manager may not see the article again for weeks. The examples in this passage have been adapted from interview data.
Table 7.1: Re-encounter as a complement to searching and browsing.

| General Technique | Basic Assumption | Applicability to Personal Digital Libraries |
|---|---|---|
| Standard IR based on content analysis | User has the ability to express and reformulate information needs. | Does not take advantage of personal library characteristics. |
| Desktop search | Users know what they are looking for; they remember having seen it, but not where it is. | Refinding is a powerful mode of access for personal libraries when user remembers what to look for. |
| Browsing via hierarchical structure and hypertext links | Collection structure matches an area of interest or task. Multiple items may be sought. | Appropriate for well-organized personal collections. |
| Re-encounter | Documents/items in conjunction with place stimulate memory. Users don’t remember what they are looking for until they see it. | Effective for refinding material that was saved without a purpose in mind or material that has been forgotten. |
them, and the materials—both their form and content—will evoke memories and moods from the past. But as it is, even in our limited physical storage systems—our bookshelves, filing cabinets, attics, basements, storage areas, garages—we have stuff we don’t remember we have. When we see the stuff, we usually remember where we got it and why we have kept it, but that’s after the fact. We would never actively seek these items. We rely on re-encountering them to give them value. Empirically, in contextual interviews in homes and offices over the years, people are often surprised by the stuff they have. It’s not that the material is unfamiliar once they’ve seen it; it’s just
that they didn’t remember keeping it in the first place. They will not search for this content, and they have no motivation to browse for it. Yet study participants were visibly pleased by forgotten physical and digital items they unearthed.

Staging potential re-encounters with digital materials is more difficult than staging the comparable re-encounters with physical items. Our personal computers and mobile devices have a dearth of places where you can really “leave things out”—especially for extended periods of time—despite the fact that we use physical metaphors like the desktop. In fact, operating system user interface developers do their best to encourage people to put digital things away, to manage display space better, to iconify, to truncate text, to make things fit, and to make things neat. Furthermore, as the ability to search for personal resources improves and removes the burden of filing from computer users (Dumais et al. 2003), the chances of certain kinds of serendipitous re-encounter diminish. After all, you are more likely to find exactly what you want and are less likely to run into something you’ve left in a remote corner of your personal digital library.

How can this effect be overcome? There is no easy answer. In the 1980s, Xerox introduced Rooms, software that implemented an extensible virtual desktop (Henderson & Card 1986). A user could maintain as many desktops as she wanted to: a room for her email, a room for each project she was involved in, a room where she kept personal stuff, and so on. But Rooms ultimately did not catch on. Physical space metaphors have their limits in virtual environments, especially when they lack a sense of place and social purpose. Harrison and Dourish point out that places can be readily distinguished from spaces:

Physically, a place is a space which is invested with understandings of behavioural appropriateness, cultural expectations, and so forth . . .
A conference hall and a theatre share many similar spatial features (such as lighting and orientation); and yet we rarely sing or dance when presenting conference papers, and to do so would be regarded as at least slightly odd . . . We wouldn’t describe this behaviour as “out of space”; but it would most certainly be “out of place” (Harrison & Dourish 1996). So, the activist might go into the kitchen and see the op–ed piece on the counter because she’s feeding the cat and then remember that she was supposed to write a letter. Leaving aside the problems with re-representing the physical space in the virtual (and the fact that it seems a bit silly to have a virtual kitchen), this form of re-encountering has no counterpart in the digital world.
Footnote: While there is considerable appeal to these re-representations of physical space (witness Second Life), the costs of such an approach must be carefully weighed against the benefits.
Understanding the role of re-encounter, and the way re-encountering works in the physical world, presents us with an opportunity to move beyond current metaphors for presenting and manipulating stored personal digital library content.
7.2.3 Gathering and Triage

Gathering is a counterpart to annotation. Annotations may represent within-document interpretation; gathering and triage (sorting according to more specific criteria) represent the interpretation of the relevant documents relative to one another. It is important to record where the material that readers have gathered came from, and it is equally important for readers to be able to informally express why they have kept it. In other words, a reader should be able to say, “this is the most significant thing I found” or “I’m only going to read this if I have time” as easily as she can say what the material is about. Our previous discussion of annotation may inform the many functions of gathering and triage.

Paper practices don’t always support gathering very well. In past studies, we have found that there is a tension between how people organize material and how they use it. Furthermore, there is a need for spatial persistence that is not supported by many work settings. For example, in a study of law students, we found that the students tended to organize printouts of legal cases into three categories: precedents that supported their side, precedents that ran counter to their side, and precedents that were a close match on the basis of legal facts. The law students had two common ways of implementing this organization. The first was to file the printouts in a three-ring notebook. That way, the organization stayed in place, but to use any two documents together, the student had to take them out of the notebook. The second was to spread out the documents around them in piles, but in that case, the students seldom had the luxury of leaving them that way, especially after an assignment was finished. So physical space is useful and promotes informal expression and critical thinking and is effective for memory, but it is limited and difficult to manage.
Figure 7.4 shows four different ways the law students organized and maintained the materials they used to write legal briefs. Based on observations of this sort (including extensive observations of the work of intelligence analysts), for the last two decades, my colleagues and I have been working on applications for gathering information and representing this kind of lightweight interpretation. Readers use the visual and spatial characteristics of document surrogates to express evolving structure (Marshall & Shipman 1995). The Visual Knowledge Builder (VKB) (Shipman et al. 2001) and Tinderbox (Bernstein 2009) are good current examples of applications designed to support gathering and triage. Our older research on information triage demonstrated that the tools used to organize information shape the process in crucial ways (Marshall & Shipman 1997). In this study, participants performed an analysis task under one of three conditions. Participants were given either (1) hardcopy documents, writing implements, and a physical surface upon which to organize the documents; (2)
FIGURE 7.4: Different methods of organizing cases gathered to write a legal brief. (a) Gathered materials are organized into rough piles representing pro, con, and facts match. (b) Gathered materials are organized into a notebook. (c) Gathered materials are thrown into a milk crate to be organized when they are used. (d) Gathered materials are kept in a notebook, but a more nuanced organization is reproduced each time the materials are used.
digital documents (represented as objects that could be manipulated), a means to change the objects’ visual attributes (e.g. change the objects’ color), and a workspace in which to organize the objects; or (3) digital documents (represented as objects as in condition 2), but with the additional ability to organize the document surrogates in a spatial hierarchy. We found that participants were more likely to focus on organizing (at the expense of reading) if they had explicit tools to do so. Figure 7.5 contrasts the results of 3 different participants, one working under each of the three conditions. In this study, a search capability added a bias for performing the analysis task using information consolidated in a single document, rather than using the information that was spread over many documents. It was evident that the tools changed the way people distribute their attention
FIGURE 7.5: Using document surrogates to perform an analysis task under 3 different conditions. (a) Using print documents on a desk. (b) Using document surrogates in a two-dimensional space. (c) Using document surrogates in a hierarchy of spaces.
between an overview and the detail of the documents. In other words, given fewer organizing tools, participants read more; given a spatial overview and the ability to create hierarchies, they organized more. Recent research on VKB has addressed the interaction between reading and organizing information (Bae et al. 2006, 2008); this work provides an architectural model for how an eBook or reading application may be integrated with applications that support search, gathering, and organizing. The reader’s interest—as expressed through annotations or other manipulation of the characteristics of the materials—may be propagated among the views to support the triage and organizing activity.
7.2.4 Supporting Browsing with Computed Visualizations

There are many ways to support browsing and other forms of information encountering as counterpoints to search. One involves using the intrinsic structure of the materials in the library, possibly in conjunction with records of reading and interaction (e.g., annotations or page views), to create visualizations of one’s personal digital library or of an eBook collection. Visualizations, such as Data Mountain (Robertson et al. 1998) or Perspective Wall (MacKinlay, Robertson, & Card 1991), or library-specific visualizations that represent books and provide a means of organizing them are a step in this direction (e.g., see Cubaud, Stokowski, & Topol 2002; Good et al. 2005).

For visualizations to be effective, they must be readily interpreted by the reader; it can be difficult for nonspecialists to “read” and understand the implications of abstract visualizations like ThemeScapes (Wise et al. 1995). Furthermore, the reduced representations of individual elements (the thumbnails, icons, labels, etc.) must be meaningful too. All too often, individual elements in
visualizations are presented in such a way as to render them undecipherable—not enough words are visible; the thumbnails are indistinguishable from one another; the salient features are not displayed; and so on. Both problems—providing an overview and displaying reduced representations of individual items—must be solved in order to create an effective visualization.

Hornbæk and Frøkjær (2001) found that students preferred this type of “overview+detail” interface; they worked somewhat faster using the fisheye overview, but, as with other methods that put more information in front of a person without regard to how pleasant it is to look at, they were less accurate. A linear presentation of text in a single scrolling window, which is the most familiar presentation of a long document, is poorer along many usability dimensions. In other words, the best interface in terms of usability, performance, and reader satisfaction is one that provides a detailed look at the item in focus, presumably a page view, and an overview of the page’s position in the document structure. Similarly, extending this finding to a collection or personal library, the reader should be able to see the eBook in focus, coupled with an overview of the collection.

Some of the problems with visualization design are well known and may be addressed by consulting general authorities; Tufte (1990) and Bertin (1983) are both good references.
7.2.5 Metadata for Personal Digital Libraries

Personal library visualizations may be enhanced by choosing techniques that make good use of records of reading and interaction in addition to other per-eBook metadata. To do so, we can conceive of every interaction within a personal digital library as forming a persistent record. On its own, each record—each annotation, each clipping, each log entry—doesn’t have much value. But taken together, these records may form a personal geography of one’s own collection of reading materials.

To ground this discussion, let’s look briefly at the records of interactivity we can accumulate. To start with, let’s consider that every element in a personal digital library comes with a certain amount of helpful metadata, e.g., eBooks are likely to come with cataloging information. This metadata is intrinsic to the material.

This book has focused mainly on a small set of explicit reader interactions—annotation, clipping, and gathering—to tease out certain principles of ideal interactivity in a personal digital library setting. However, we should cast our net more broadly to explore the technological implications of gathering, saving, using, and presenting records of reading and interaction. Thus, our next concern should be implicit records of interaction, since these require no added user effort. Who gave me this news article? What part of a reference work have I accessed the most
Footnote: This finding echoes the finding by Chapparo et al. (2005) that we reported in Chapter 2: if a document is more pleasant to read, subjects will spend longer reading it.
TABLE 7.2: A summary of types of personal digital library metadata and related implications.

| Metadata Type | Examples | Interactivity and Creation | Storage and Representation | Privacy Concerns |
|---|---|---|---|---|
| Intrinsic description | Title, author, international standard book number (ISBN), source URL, etc. | Metadata that comes with the content. | Many canonical forms, usually represented as attribute value pairs; value may be a link to the actual content. | Should be similar to those of the document the metadata is describing. |
| Implicit records of reading or activity data | Access tracking (what pages were read, when, and in what order); contextual records like eBook purchase date and GPS coordinates. | Reading records that may be collected crucially depend on navigational capabilities. Capture should never degrade system performance. | Telemetry is often collected locally and moved to server later. Like use logs, there are many underlying assumptions about actions’ meanings. | Private. Since these are essentially invisible as they are captured, they are sensitive. They reveal what a reader has seen and done. May be aggregated. |
| Intentional, but unselfconscious, records of reading | Underlines, highlights, marginalia, clippings. | Direct and to-hand tools are essential. Creating these should not interrupt primary activity (reading). | Need canonical representations if we are to preserve these. External storage allows documents to be annotated by other than their owner. | Private. May be valuable (and less private) in aggregate. |
| Intentional user metadata (the result of a focal activity) | Bookmarks, shared clippings, annotations made to communicate with others, the organization of materials in a personal library. | Less sensitive to formality and indirectness (i.e., the value is high so the user will put up with more: keyboard input, menu selection, etc.). | Should be saved external to content so they can be reattached to content in the event of platform changes or content loss; external storage also permits nonowner annotation. | These records may be shared. Likely to be more intelligible than other records of reading. |
often? Which page edges would have darkened if this were a print book? Have I ever listened to this podcast? Which paths have I taken through this hypertext fiction? Because there’s so much implicit information that can be recorded, one important research topic is to ascertain which of it is actually helpful and how it may be presented (e.g., Hill et al. 1992; Kelly & Belkin 2004). Our third concern is intentional—but still unselfconscious—records of reading. Many annotations fall into this category. As I noted earlier, people are not necessarily aware of how much they’ve underlined or highlighted while they’re trying to understand the material presented in a complex technical paper; that’s why it’s best to distinguish these annotations as unselfconscious. Instead of being a separate activity, they are closely tied to the act of reading. It is indeed rare that you’d hear someone saying, “I’m going to annotate Moby Dick” instead of saying, “I’m going to read Moby Dick.” Sometimes these intentional, unselfconscious records also reflect interpretive ambiguities: I might push the icons representing two eBooks closer together on my desktop because I think they’re somewhat related, but the relationship is not so obvious that I want to put them in a folder together. Finally, some of our interactions are deliberate efforts to make the material in our personal libraries more valuable, e.g., the translations we squeeze between the lines of a foreign language text. Table 7.2 summarizes the types of reading records that can act as eBook metadata and the implications these metadata types have for storage and representation. Recording and storing these types of personal digital library metadata will aid in creating a reading geography. These records must also be presented in a sensible way to be used to the reader’s advantage in subsequent activities.
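As a rough illustration of how such records might be kept outside the content itself, the sketch below stores each interaction as a small attribute-value record, tallies page views as a stand-in for the darkened page edges of a print book, and separates shareable records from private ones. The `ReadingRecord` class, its field names, the record kinds, and the JSON layout are assumptions made for this example; they do not correspond to any standard or to any particular eBook platform.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ReadingRecord:
    # Hypothetical record layout; every field name here is an assumption.
    book_id: str          # identifier for the eBook, e.g., an ISBN
    kind: str             # "page_view", "highlight", "bookmark", ...
    page: int
    value: str = ""       # highlighted text or note, if any
    private: bool = True  # implicit and unselfconscious records default to private


def page_wear(records: List[ReadingRecord], book_id: str) -> Counter:
    """Count page views per page, a rough analogue of darkened page edges."""
    return Counter(
        r.page for r in records
        if r.book_id == book_id and r.kind == "page_view"
    )


def shareable(records: List[ReadingRecord]) -> List[ReadingRecord]:
    """Only the records the reader has marked as safe to share."""
    return [r for r in records if not r.private]


def to_json(records: List[ReadingRecord]) -> str:
    """Serialize records separately from the content they describe."""
    return json.dumps([asdict(r) for r in records], indent=2)


def from_json(text: str) -> List[ReadingRecord]:
    """Reload records so they can be reattached to the content later."""
    return [ReadingRecord(**item) for item in json.loads(text)]
```

Keeping the records in a separate file or store, rather than inside the eBook, follows the suggestion in Table 7.2 that user metadata be saved external to the content so that it can be reattached after a platform change.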
7.3 CONCLUSION
This lecture has covered eBooks from a number of different perspectives: how people read and interact with eBooks; how readers use eBooks socially; how writers and publishers prepare content; how print genres are realized as digital forms and how digital genres emerge; how eBook functionality can go beyond the print book; and finally, how eBooks and records of their use can grow into personal digital libraries. We have also explored types of studies that researchers might do, given what we know and don’t know about eBooks. In the end, it is important to remember that there is no single way that people will read; there is no single device that people will use for reading; there is no single format that will make content universally accessible; and there is no single role for eBooks in our social affairs. Instead of putting all our eggs—or our eBooks—in one basket, it is more realistic to consider how reading will grow to encompass constellations of reading technologies. While this lecture strives to point out some universals—that it is important to support reading comfort; that it is
FIGURE 7.6: An office work area: the reader is surrounded by display surfaces large and small, mobile and stationary, in use and dormant.
important to provide access to external resources such as catalogs, search engines, document repositories, and digital libraries; that mobile devices are a significant enabler to reading when and where we choose—I am also convinced that there is no single reading solution for the mobile worker in the digital or physical library (Marshall et al. 2001). An eBook can be the principal venue for reading a novel; it may also serve as an auxiliary display, almost like a sheet of paper, or as a reference while the reader writes using a different computer. As we look at the people around us, we might see them reading on the small screen at hand (e.g., an iPhone) because that’s what they have and it can be used on a crowded subway with little ado. On the other hand, the future of reading may not be so closely tied to a single platform. Figure 7.6 shows a typical work area with many screens that represent diverse form factors. Readily visible are two landscape-oriented displays and one portrait-mode display. But wait! Look more closely. There are two smaller screens all but hidden on the desktop: one on her mobile phone and one on a controller for some communications hardware. Because I’m familiar with this office, I also know there are several other displays that are just off-camera and a laptop that is blocked from view. It’s not difficult to envision a Kindle in this picture, nor is it difficult to imagine the study participant reading on any of the screens she already has on her desk. Perhaps she will even carry
one of them with her to a quieter place to read. She already has two mobile devices and six stationary screens ready-to-hand: will she want a purpose-built device like a Kindle? Modern homes and offices have no shortage of reading surfaces.
I began this lecture with a quick look at a wave of hyperbole that washed over the 1990s: the end of books, the death of text. Fewer than 20 years later, we are reading things we did not imagine then. Tweets, blogs, Facebook walls, and cell phone novels are just a few of the genres that have emerged since then. Yet now they’re a vital part of our lives. Earlier today, I saw a tweet roll by that said, “Wishing that more folks would take advantage of http://ping.fm, rather than posting their status updates only to a single site, such as FB.” Several hours before that, @FStutzman said, “For the past few weeks, I’ve been getting more traffic to my sites from Twitter than from Google. I feel this has implications.” Yes, it does have implications. For some, the seams between the genres have meaning; others wish the seams would simply disappear.
It seems impossible (or at least inadvisable) to predict the future of reading. What it is possible to say from all of this is that reading is a hybrid. It is neither the little girl curled up in a window seat in the sun with her copy of Harry Potter and the Half-Blood Prince nor is it the graduate student sequestered in her library carrel, surrounded with the source materials for her dissertation. It is both. Readers do focus on a single reading surface or display at times, but then they bring it into a broader context. Reading is an unselfconscious orchestration of many things; the success of a next generation of eBooks relies on our ability to see them as part of a larger system of diverse genres, technologies, and activities.
This tweet is attributed to a tweep I know only as @sfslim.
References
Abrams, D., Baecker, R., & Chignell, M. (1998). Information archiving with bookmarks: personal web space construction and organization, in Proceedings of CHI ’98, ACM Press, New York, pp. 41–48.
Abrams, S. (2005). Establishing a global digital format registry, Library Trends, vol. 54, no. 1, pp. 125–143. doi:10.1353/lib.2006.0001
Adler, A., Gujar, A., Harrison, B. L., O’Hara, K., & Sellen, A. (1998). A diary study of work-related reading: design implications for digital reading devices, in Proceedings of CHI ’98, ACM Press, New York, pp. 241–248.
Adler, M. J. & van Doren, C. (1972). How to read a book, Simon and Schuster, New York, NY.
Agosti, M., Ferro, N., Frommholz, I., & Thiel, U. (2004). Annotations in digital libraries and collaboratories—facets, models and usage, Lecture Notes in Computer Science, vol. 3232, pp. 244–255.
Allen, R. B. & Schalow, J. (1999). Metadata and data structures for the historical newspaper digital library project, in Proceedings of CIKM ’99, ACM Press, New York, pp. 147–153.
Almeida, R. B., Mozafari, B., & Cho, J. (2007). On the evolution of Wikipedia, in Proceedings of the International Conference on Weblogs and Social Media (ICWSM 2007), March 26–28, 2007, Boulder, CO, USA.
Amiran, E. & Unsworth, J. (1991). Postmodern culture: publishing in the electronic medium, The Public-Access Computer Systems Review, vol. 2, no. 1, pp. 67–76.
Amiran, E., Orr, E., & Unsworth, J. (1991). Refereed electronic journals and the future of scholarly publishing, in Advances in Library Automation and Networking, vol. 4, ed J Hewitt, JAI Press Inc., Greenwich, CT, Retrieved 1 June 2009, http://www3.isrl.illinois.edu/~unsworth/advances.html.
Andersson, M., Eisley, W., Howard, A., Romano, F., & Witkowski, M. (1997). PDF printing and publishing: the next revolution after Gutenberg, Micro Publishing Press, Torrance, CA.
Angelé, J. & Emeraud, T. (2006). BiNem electronic paper, Gekkan Display—Techno-Times Japan October. doi:10.1889/1.2433315
Arms, C. & Fleischhauer, C. (2005). Digital formats: factors for sustainability, functionality, and quality, in Proceedings of IS&T Archiving 2005, Society for Imaging Science and Technology, Springfield, VA, 2005.
Armstrong, A. (2008). Books in a virtual world: the evolution of the e-book and its lexicon, Journal of Librarianship and Information Science, vol. 40, no. 3, pp. 193–206. doi:10.1177/0961000608092554
Babcock, J. & Pelz, J. (2000). Building a lightweight eyetracking headgear, in Proceedings of ACM SIGCHI Eye Tracking Research & Applications Symposium 2000, ACM Press, New York, pp. 109–114. doi:10.1145/968363.968386
Badi, R., Bae, S., Moore, J. M., Meintanis, K., Zacchi, A., Hsieh, H., Shipman, F., & Marshall, C. C. (2006). Recognizing user interest and document value from reading and organizing activities in document triage, in Proceedings of the ACM Conference on Intelligent User Interfaces (IUI 2006), ACM Press, New York, pp. 218–225. doi:10.1145/1111449.1111496
Bae, S., Hsieh, H., Kim, D., Marshall, C. C., Meintanis, K., Moore, J. M., Zacchi, A., & Shipman, F. (2008). Supporting document triage via annotation-based visualizations, in Proceedings of ASIST 2008, Wiley InterScience, pp. 1–16. doi:10.1002/meet.2008.1450450241
Bae, S., Marshall, C. C., Meintanis, K., Zacchi, A., Hsieh, H., Moore, J. M., & Shipman, F. (2006). Patterns of reading and organizing information in document triage, in Proceedings of ASIS&T 2006, Wiley InterScience, pp. 1–27. doi:10.1002/meet.14504301160
Bae, S., Badi, R., Meintanis, K., Moore, J. M., Zacchi, A., Hsieh, H., Marshall, C. C., & Shipman, F. M. (2005). Effects of display configurations on document triage, Lecture Notes in Computer Science, vol. 3585, pp. 130–143. doi:10.1007/11555261_14
Baeza-Yates, R. & Ribeiro-Neto, B. (1999). Modern information retrieval, ACM Press, New York.
Bazin, P. (1996). Toward metareading, in The future of the book, ed G Nunberg, University of California Press, Berkeley, CA, pp. 153–168.
Baker, M., Shah, M., Rosenthal, D. S., Roussopoulos, M., Maniatis, P., Giuli, T., & Bungale, P. (2006). A fresh look at the reliability of long-term digital storage, in Proceedings of Eurosys 2006, ACM Press, New York, pp. 221–234. doi:10.1145/1218063.1217957
Baker, N. (1997). The size of thoughts, Vintage, New York.
Bargeron, D. & Moscovich, T. (2003). Reflowing digital ink annotations, in Proceedings of CHI ’03, ACM Press, New York, pp. 385–393. doi:10.1145/642611.642678
Barreau, D. & Nardi, B. (1995). Finding and reminding: file organization from the desktop, SIGCHI Bulletin, vol. 27, no. 3, July, pp. 39–43.
Barthes, R. (1975). The pleasure of the text, Farrar, Straus, and Giroux, New York.
Bartlett, J. F. (2000). Rock ‘n’ scroll is here to stay, IEEE Computer Graphics Applications, vol. 20, no. 3, pp. 40–45. doi:10.1109/38.844371
Bates, M. (1989). The design of browsing and berrypicking techniques for the online search interface, Online Review, vol. 13, no. 5, pp. 407–424. doi:10.1108/eb024320
Baudrillard, J. (1996). Cool memories II, trans. C Turner, Duke University Press, Durham, NC.
Baumer, E., Sueyoshi, M., & Tomlinson, B. (2008). Exploring the role of the reader in the activity of blogging, in Proceedings of CHI 2008, ACM Press, New York, pp. 1111–1120. doi:10.1145/1357054.1357228
Belkin, N. J., Oddy, R. N., & Brooks, H. M. (1982). ASK for information retrieval: Part I. Background and theory, Journal of Documentation, vol. 38, no. 2, pp. 61–71. doi:10.1108/eb026722
Bernstein, M. (1988). The bookmark and the compass: orientation tools for hypertext users, SIGOIS Bulletin, vol. 9, no. 4, pp. 34–45. doi:10.1145/51640.51645
Bernstein, M. (2009). Shadows in the cave: hypertext transformations, Journal of Digital Information, vol. 10, no. 3, Retrieved August 10, 2009, http://journals.tdl.org/jodi/article/view/714/488.
Bernstein, M., Bolter, J., Joyce, M., & Mylonas, E. (1991). Architectures for volatile hypertext, in Proceedings of Hypertext ’91, ACM Press, New York, pp. 243–260. doi:10.1145/122974.122999
Bertin, J. (1983). Semiology of graphics, University of Wisconsin Press, Madison, WI.
Betrisey, C., Blinn, J. F., Dresevic, B., Hill, B., Hitchcock, G., Keely, B., Mitchell, D. P., Platt, J. C., & Whitted, T. (2000). Displaced filtering for patterned displays, Digest of Society for Information Display Symposium, vol. 31, no. 1, pp. 296–299. doi:10.1889/1.1832941
Bier, E., Good, L., Popat, K., & Newberger, A. (2004). A document corpus browser for in-depth reading, in Proceedings of Joint Conference on Digital Libraries, ACM Press, New York, pp. 87–96. doi:10.1145/996350.996373
Birkerts, S. (1994). The Gutenberg elegies: the fate of reading in an electronic age, Faber and Faber, Boston, MA.
Bishop, A. P. (1998). Digital libraries and knowledge disaggregation: the use of journal article components, in Proceedings of ACM DL ’98, ACM Press, New York, pp. 29–39.
Blomberg, J., Giacomi, J., Mosher, A., & Swenton-Wall, P. (1993). Ethnographic field methods and their relation to design, in Participatory Design: Principles and Practices, eds D. Schuler & A. Namioka, Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 123–154.
Blomberg, J., Suchman, L., & Trigg, R. (1996). Reflections on a work-oriented design project, Human–Computer Interaction, vol. 11, no. 3, pp. 237–265. doi:10.1207/s15327051hci1103_3
Bly, S. (1997). Field work: is it product work?, Interactions, vol. 4, no. 1, pp. 25–30. doi:10.1145/242388.242398
Boardman, R. & Sasse, M. A. (2004). Stuff goes into the computer and doesn’t come out: a cross-tool study of personal information management, in Proceedings of CHI ’04, ACM Press, New York, pp. 583–590.
Bogart, L. (1989). Press and Public: who reads what, when, where, and why in American newspapers, Lawrence Erlbaum Associates, Hillsdale, NJ.
Boguraev, B., Kennedy, C., Bellamy, R., Brawer, S., & Wong, Y. Y. (1998). Dynamic presentation of document content for rapid online skimming, in Proceedings of AAAI Spring 1998 Symposium on Intelligent Text Summarization, AAAI Press, Stanford, CA, pp. 109–118.
Bolter, J. (1991). Writing space, Lawrence Erlbaum Associates, Hillsdale, NJ.
Bowker, G. & Star, S. L. (1999). Sorting things out: classification and its consequences, MIT Press, Cambridge, MA.
Braten, I. & Stromso, H. I. (2003). A longitudinal think-aloud study of spontaneous strategic processing during the reading of multiple expository texts, Reading & Writing, vol. 16, pp. 195–218.
Bringhurst, R. (1992). The elements of typographic style, Hartley & Marks, Point Roberts, WA.
British Library, Turning the pages, http://www.bl.uk/collections/treasures/digitisation1.html.
Bruce, H., Jones, W., & Dumais, S. (2004). Information behaviour that keeps found things found, Information Research, vol. 10, no. 1, http://informationr.net/ir/10-1/paper207.html.
Brun-Cottan, F. & Wall, P. (1995). Using video to re-present the user, Communications of the ACM, vol. 38, no. 5, pp. 61–71. doi:10.1145/203356.203368
Brush, A., Bargeron, D., Grudin, J., Borning, A., & Gupta, A. (2002). Supporting interaction outside of class, in Proceedings of CSCL ’02, Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 425–434.
Brush, A. J., Bargeron, D., Gupta, A., & Cadiz, J. J. (2001). Robust annotation positioning in digital documents, in Proceedings of CHI ’01, ACM Press, New York, pp. 285–292. doi:10.1145/365024.365117
Buchanan, G. & Loizides, F. (2007). Investigating document triage on paper and electronic media, in Proceedings of the European Conference on Digital Libraries 2007, pp. 416–427. doi:10.1007/978-3-540-74851-9_35
Bush, V. (1945). As we may think, Atlantic Monthly, vol. 176, no. 1 (August), pp. 101–108.
Byrd, D. A. (1999). Scrollbar-based visualization for document navigation, in Proceedings of DL ’99, ACM Press, New York, pp. 122–129.
Cadiz, J., Gupta, A., & Grudin, J. (2000). Using web annotations for asynchronous collaboration around documents, in Proceedings of CSCW ’00, ACM Press, New York, pp. 309–318. doi:10.1145/358916.359002
Cadiz, J. J., Venolia, G., Jancke, G., & Gupta, A. (2002). Designing and deploying an information awareness interface, in Proceedings of CSCW 2002, ACM Press, New York, pp. 314–323. doi:10.1145/587078.587122
Carr, L., Hill, G., DeRoure, D., Hall, W., & Davis, H. (1996). Open information services, Computer Networks and ISDN Systems, vol. 28, no. 7/11, pp. 1027–1036.
Chang, B. (1998). In-place editing of Web pages: sparrow community-shared documents, Computer Networks and ISDN Systems, vol. 30, pp. 489–498. doi:10.1016/S0169-7552(98)00118-4
Chang, M., Leggett, J., Furuta, R., Kerne, A., Williams, J. P., Burns, S., & Bias, R. (2004). Collection understanding, in Proceedings of JCDL 2004, ACM Press, New York, pp. 334–342. doi:10.1145/996350.996426
Chapparo, B., Baker, J., Shaikh, A., Hull, S., & Brady, L. (2005). Reading online text: a comparison of four white space layouts, Usability News, vol. 7, no. 1, Retrieved May 21, 2009, http://www.surl.org/usabilitynews/71/page_setting.asp.
Chartier, R. (1994). The order of books, Stanford University Press, Stanford, CA.
Chellappa, R. K. & Sin, R. G. (2005). Personalization versus privacy: an empirical examination of the online consumer’s dilemma, Information Technology and Management, vol. 6, no. 2–3, pp. 181–202. doi:10.1007/s10799-005-5879-y
Chen, N., Guimbretiere, F., Dixon, M., Lewis, C., & Agrawala, M. (2008). Navigation techniques for dual-display e-book readers, in Proceedings of CHI ’08, ACM Press, New York, pp. 1779–1788. doi:10.1145/1357054.1357331
Chu, Y. C., Bainbridge, D., Jones, M., & Witten, I. (2004). Realistic books: a bizarre homage to an obsolete medium?, in Proceedings of JCDL ’04, ACM Press, New York, pp. 78–86.
Churchill, E., Trevor, J., Bly, S., Nelson, L., & Cubranic, D. (2000). Anchored conversations, in Proceedings of CHI ’00, ACM Press, New York, pp. 454–461. doi:10.1145/332040.332475
Claypool, M., Le, P., Waseda, M., & Brown, D. (2001). Implicit interest indicators, in Proceedings of ACM IUI2001, ACM Press, New York, pp. 33–40. doi:10.1145/359784.359836
Cline, L. (2000). Buying electronic: the development of the electronic book market in academic libraries, Library Collections, Acquisitions, and Technical Services, vol. 24, no. 2, pp. 312–315. doi:10.1016/S1464-9055(00)00110-X
Cockburn, A., Gutwin, C., & Alexander, J. (2006). Faster document navigation with space-filling thumbnails, in Proceedings of CHI 2006, ACM Press, New York, pp. 1–10. doi:10.1145/1124772.1124774
Connaway, L. S. (2001). A Web-based electronic book (e-book) library: the NetLibrary model, Library Hi Tech, vol. 19, no. 4, pp. 340–349. doi:10.1108/07378830110411961
Cooper, B. F. & Garcia-Molina, H. (2002). Peer-to-peer data trading to preserve information, ACM TOIS, vol. 2, no. 2, pp. 130–170. doi:10.1145/506309.506310
Coover, R. (1992). The end of books, The New York Times, June 21, 1992.
Cox, D. & Greenberg, S. (2000). Supporting collaborative interpretation in distributed groupware, in Proceedings of CSCW ’00, ACM Press, New York, pp. 289–298. doi:10.1145/358916.359000
Coyle, K. (2003). The technology of rights: digital rights management, File downloaded 5/28/09, http://www.kcoyle.net/drm_basics.pdf.
Coyle, K. (2006). Mass digitization of books, Journal of Academic Librarianship, vol. 32, no. 6, pp. 641–645. doi:10.1016/j.acalib.2006.08.002
Crabtree, A., Twidale, M. B., O’Brien, J., & Nichols, D. M. (1997). Talking in the library: implications for the design of digital libraries, in Proceedings of ACM Digital Libraries ’97, ACM Press, New York, pp. 221–228.
Crawford, S. Y., Hurd, J. M., & Walker, A. C. (1996). From print to electronic: the transformation of scientific communication, Information Today, Medford, NJ.
Crawford, W. (1998). Paper persists: why physical library collections still matter, Online Magazine (January/February), pp. 42–48.
Creswell, J. W. (2009). Research design: qualitative, quantitative, and mixed methods approaches, Sage Publications, Thousand Oaks, CA.
Cubaud, P., Stokowski, P., & Topol, A. (2002). Binding browsing and reading activities in a 3D digital library, in Proceedings of JCDL 2002, ACM Press, New York, pp. 281–282. doi:10.1145/544220.544282
Dearman, D. & Pierce, J. (2008). It’s on my other computer!: computing with multiple devices, in Proceedings of CHI 2008, ACM Press, New York, pp. 767–776.
Dearnley, J., Morris, A., McKnight, C., Berube, L., Palmer, M., & John, J. (2004). Electronic books in public libraries: a feasibility study for developing usage models for web-based and hardware-based electronic books, New Review of Information Networking, vol. 10, no. 2, pp. 209–246.
Decurtins, C., Norrie, M. C., & Signer, B. (2003). Digital annotation of printed documents, in Proceedings of CIKM ’03, ACM Press, New York, pp. 552–555. doi:10.1145/956863.956971
Dervin, B. (1999). Chaos, order, and sense-making: a proposed theory for information design, in Information design, ed R Jacobson, MIT Press, Cambridge, MA, pp. 35–57.
Dillon, A. (1992). Reading from paper versus screens: a critical review of the empirical literature, Ergonomics, vol. 35, no. 10, pp. 1297–1326. doi:10.1080/00140139208967394
Dillon, A. (1993). Designing usable electronic text, Taylor & Francis, London.
Dillon, A., Kleinman, L., Bias, R., Choi, G., & Turnbull, D. (2004). Reading and searching digital documents, in Proceedings of ASIST 2004, Wiley InterScience, pp. 267–273. doi:10.1002/meet.1450410131
Dillon, A., Kleinman, L., Choi, G. O., & Bias, R. (2006). Visual search and reading tasks using ClearType and regular displays: two experiments, in Proceedings of CHI ’06, ACM Press, New York, pp. 503–511.
Dillon, D. (2001). E-books: the University of Texas experience, Library Hi Tech, vol. 19, no. 2, pp. 113–124. doi:10.1108/07378830110394826
Dominick, J. (2005). The in-situ study of an electronic textbook in an educational setting, Doctoral dissertation, University of North Carolina, Chapel Hill, NC.
Douglas, J. Y. & Hargadon, A. (2000). The pleasure principle: immersion, engagement, flow, in Proceedings of Hypertext 2000, ACM Press, New York, pp. 153–160.
Dumais, S., Cutrell, E., Cadiz, J. J., Jancke, G., Sarin, R., & Robbins, D. (2003). Stuff I’ve seen: a system for personal information retrieval and re-use, in Proceedings of SIGIR 2003, ACM Press, New York, pp. 72–79.
Egan, D. E., Remde, J. R., Gomez, L. M., Landauer, T. K., Eberhardt, J., & Lochbaum, C. C. (1989). Formative design–evaluation of SuperBook, ACM Transactions on Information Systems, vol. 7, no. 1, pp. 30–57. doi:10.1145/64789.64790
Erdelez, S. (1997). Information encountering: a conceptual framework for accidental information discovery, in Proceedings of International Conference on Research in Information Needs, Seeking, and Use in Different Contexts, Taylor Graham Publishing, Los Angeles, CA, pp. 412–421.
Erdelez, S. (1999). Information encountering: it’s more than just bumping into information, Bulletin of the American Society for Information Science, vol. 25, no. 3, pp. 25–29. doi:10.1002/bult.118
Erdelez, S. & Rioux, K. (2000). Sharing information encountered for others on the Web, New Review of Information Behaviour Research: Studies of Information Seeking in Context, vol. 1, pp. 219–233.
Fadiman, A. (1998). Ex Libris: confessions of a common reader, Farrar, Straus, and Giroux, New York.
Farzan, R. & Brusilovsky, P. (2005). Social navigation support through annotation-based group modeling, in Proceedings of the 10th International Conference on User Modeling, LNCS, vol. 3538, pp. 463–472.
Fenton, E. (1996). The Macintosh font book: typographic tips, techniques, and resources, Peachpit Press, Berkeley, CA.
Fetterman, D. (1998). Ethnography step by step, Sage Publications, Thousand Oaks, CA.
Fishkin, K., Gujar, A., Harrison, B., Moran, T., & Want, R. (2000). Embodied user interfaces for really direct manipulation, CACM, vol. 43, no. 9, pp. 74–80. doi:10.1145/348941.348998
Flower, L., Stein, V., Ackerman, J., Kantz, M. J., McCormick, K., & Peck, W. C. (1990). Reading-to-write: exploring a cognitive and social process, Oxford University Press, New York.
Fowler, R. L. & Barker, A. S. (1974). Effectiveness of highlighting for retention of text material, Journal of Applied Psychology, vol. 59, no. 3, pp. 358–364. doi:10.1037/h0036750
Friedman, B., Kahn Jr, P. H., & Howe, D. C. (2000). Trust online, CACM, vol. 43, no. 12, pp. 34–40. doi:10.1145/355112.355120
Fu, X., Ciszek, T., Marchionini, G., & Solomon, P. (2005). Annotating the Web: an exploratory study of web users’ needs for personal annotation tools, in Proceedings of ASIS&T, vol. 42, no. 1. doi:10.1002/meet.14504201151
Furuta, R., Shipman III, F. M., Marshall, C. C., Brenner, D., & Hsieh, H. W. (1997). Hypertext paths and the World-Wide Web: experiences with Walden’s Paths, in Proceedings of Hypertext ’97, ACM Press, New York, pp. 167–176.
Garrod, P. (2003). Ebooks in UK libraries: where are we now?, Ariadne, vol. 37, October, Retrieved August 10, 2009, http://www.ariadne.ac.uk/issue37/garrod/.
Gass, W. (1999). In defense of the book: on the enduring pleasures of paper, type, page, and ink, Harper’s Magazine, vol. 299, November, pp. 45–51.
Gibson, M. & Ruotolo, C. (2001). Beyond the Web: TEI and the ebook revolution, in Proceedings of ACH ’01, Online, http://www.nyu.edu/its/humanities/ach_allc2001/papers/gibson/index.html.
Goldberg, D., Nichols, D., Oki, B., & Terry, D. (1992). Using collaborative filtering to weave an information tapestry, Communications of the ACM, vol. 35, no. 12, pp. 61–70. doi:10.1145/138859.138867
Golovchinsky, G. (2002). Going back in hypertext, in Proceedings of Hypertext ’02, ACM Press, New York, pp. 82–83. doi:10.1145/513338.513363
Golovchinsky, G. & Denoue, L. (2002). Moving markup: repositioning freeform annotations, in Proceedings of UIST ’02, ACM Press, New York, pp. 21–30.
Golovchinsky, G., Price, M. N., & Schilit, B. N. (1999). From reading to retrieval: freeform ink annotations as queries, in Proceedings of SIGIR ’99, ACM Press, New York, pp. 19–25.
Good, L., Popat, A., Janssen, W., & Bier, E. (2005). UC: A fluid treemap interface for personal digital libraries, in Proceedings of JCDL ’05, ACM Press, New York, p. 408.
Graham, J. (1999). The reader’s helper: a personalized document reading environment, in Proceedings of CHI ’99, ACM Press, New York, pp. 481–488.
Gross, M. D. & Do, E. (1996). Ambiguous intentions—a paper-like interface for creative design, in Proceedings of UIST ’96, ACM Press, New York, pp. 183–192.
Grudin, J. (1994). Groupware and social dynamics: eight challenges for developers, Communications of the ACM, vol. 37, no. 1, pp. 92–105. doi:10.1145/175222.175230
Gugerty, L., Tyrrell, R. A., Aten, T. R., & Edmonds, K. A. (2004). The effects of sub-pixel addressing on users’ performance, ACM Transactions on Applied Perception, vol. 1, no. 2, pp. 81–101.
Gujar, A., Harrison, B. L., & Fishkin, K. P. (1998). A comparative empirical evaluation of display technologies for reading, in Proceedings of HFES ’98, ACM Press, New York, pp. 527–531.
Guterman, J. (2002). Making e-books safe for the toilet, Business 2.0, November 14.
Haas, C. (1996). Writing technologies: studies on the materiality of literacy, Lawrence Erlbaum, Mahwah, NJ.
Halasz, F. (1991). Seven issues: revisited, in Hypertext ’91 Closing Plenary, Retrieved June 10, 2009, http://www2.parc.com/spl/projects/halasz-keynote/transcript.html.
Halasz, F. G. & Schwartz, M. (1994). The Dexter reference model, Communications of the ACM, vol. 37, no. 2, pp. 26–29. doi:10.1145/175235.175237
Halasz, F. G., Moran, T., & Trigg, R. H. (1987). NoteCards in a nutshell, in Proceedings of the ACM CHI+GI Conference, ACM Press, New York, pp. 45–52. doi:10.1145/1165387.30859
Harold, E. R. & Means, W. S. (2004). XML in a nutshell, O’Reilly, Sebastopol, CA.
Harper, D. J., Koychev, I., & Sun, Y. (2003). Query-based document skimming: a user-centred evaluation, in Proceedings of 25th European Conference on IR Research, Springer, Berlin, pp. 377–392. doi:10.1007/3-540-36618-0_27
Harrison, B. L., Fishkin, K. P., Gujar, A., Mochon, C., & Want, R. (1998). Squeeze me, hold me, tilt me! An exploration of manipulative user interfaces, in Proceedings of CHI ’98, ACM Press, New York, pp. 17–24.
Harrison, S. & Dourish, P. (1996). Re-place-ing space: the roles of place and space in collaborative systems, in Proceedings of CSCW ’96, ACM Press, New York, pp. 67–76.
Hearst, M. A. (1995). TileBars: visualization of term distribution information in full text information access, in Proceedings of CHI ’95, ACM Press, New York, pp. 56–66.
Hearst, M., Hurst, M., & Dumais, S. (2008). What should blog search look like?, in Proceedings of CIKM 2008, ACM Press, New York, pp. 95–98. doi:10.1145/1458583.1458599
Henderson, D. A. & Card, S. (1986). Rooms: the use of multiple virtual workspaces to reduce space contention in a window-based graphical user interface, ACM Transactions on Graphics, vol. 5, no. 3, pp. 211–243. doi:10.1145/24054.24056
Hill, W. C., Hollan, J. D., Wroblewski, D., & McCandless, T. (1992). Edit wear and read wear, in Proceedings of CHI 1992, ACM Press, New York, pp. 3–9. doi:10.1145/142750.142751
Hill, W., Stead, L., Rosenstein, M., & Furnas, G. (1995). Recommending and evaluating choices in a virtual community of use, in Proceedings of CHI ’95, ACM Press, New York, pp. 194–201.
Hong, L., Chi, E., Budiu, R., Pirolli, P., & Nelson, L. (2008). SparTag.us: a low cost tagging system for foraging of Web content, in Proceedings of AVI ’08, ACM Press, New York, pp. 65–72.
Hornbæk, K. & Frøkjær, E. (2001). Reading of electronic documents: the usability of linear, fisheye, and overview+detail interfaces, in Proceedings of CHI 2001, ACM Press, New York, pp. 293–300.
Howe, N. (1993). The cultural construction of reading in Anglo-Saxon England, in Ethnography of Reading, ed J Boyarin, University of California Press, Berkeley, CA, pp. 58–79.
Iannella, R. (2001). Digital rights management (DRM) architectures, D-Lib Magazine, vol. 7, no. 6, http://www.dlib.org/dlib/june01/iannella/06iannella.html. doi:10.1045/june2001-iannella
Indratmo, J., Vassileva, J., & Gutwin, C. (2008). Exploring blog archives with interactive visualization, in Proceedings of AVI 2008, ACM Press, New York, pp. 39–46.
Jacob, R. J. K. & Karn, K. S. (2003). Eye tracking in human–computer interaction and usability research: ready to deliver the promises (section commentary), in The mind’s eye: cognitive and applied aspects of eye movement research, eds. J Hyona, R Radach, & H Deubel, Elsevier Science, Amsterdam, pp. 573–605.
Johansen, R. (1988). GroupWare: computer support for business teams, The Free Press, New York.
Jones, M., Rieger, R., Treadwell, P., & Gay, G. (2000). Live from the stacks: user feedback on mobile computers and wireless tools for library patrons, in Proceedings of ACM Digital Libraries 2000, ACM Press, New York, pp. 95–102.
Jones, W. (2004). Finders, keepers? The present and future perfect in support of personal information management, First Monday, vol. 9, no. 3, http://www.firstmonday.org/issues/issue9_3/jones/.
Jones, W., Dumais, S., & Bruce, H. (2002). Once found, what then?: a study of “keeping” behaviors in personal use of Web information, in Proceedings of ASIST 2002, Information Today, Medford, NJ, pp. 391–402. doi:10.1002/meet.1450390143
Jones, W., Phuwanartnurak, A., Gill, R., & Bruce, H. (2005). Don’t take my folders away! Organizing personal information to get things done, CHI ’05 Extended Abstracts, ACM Press, New York, pp. 1505–1508.
Joyce, M. (1990). Afternoon, Eastgate, Watertown, MA.
Kane, J. (2002). A type primer, Prentice Hall, Upper Saddle River, NJ.
Kelly, D. & Belkin, N. (2004). Display time as implicit feedback: understanding task effects, in Proceedings of ACM SIGIR 04, ACM Press, New York, pp. 377–384.
Kleinberg, J. (1998). Authoritative sources in a hyperlinked environment, in Proceedings of 9th ACM-SIAM Symposium on Discrete Algorithms, ACM Press, New York, pp. 668–677.
Konstan, J., Miller, B., Maltz, D., Herlocker, J. L., Gordon, L., & Riedl, J. (1997). GroupLens: applying collaborative filtering to Usenet news, CACM, vol. 40, no. 3, pp. 77–87.
Landow, G. P. (1992). Hypertext: the convergence of contemporary critical theory and technology, Johns Hopkins University Press, Baltimore, MD.
Larson, K. & Picard, R. (2005). The aesthetics of reading, Human–Computer Interaction Consortium, Retrieved May 21, 2009, http://affect.media.mit.edu/pdfs/05.larson-picard.pdf.
Levy, D. M. (2001). Scrolling forward: making sense of documents in the digital age, Arcade Publishing, New York.
Levy, D. & Marshall, C. C. (1995). Going digital: a look at assumptions underlying digital libraries, Communications of the ACM, vol. 38, no. 4, pp. 77–84. doi:10.1145/205323.205346
Liesaputra, V. & Witten, I. (2008). Seeking information in realistic books: a user study, in Proceedings of JCDL 2008, ACM Press, New York, pp. 29–38.
Lorch Jr, R. F., Lorch, E. P., & Klusewitz, M. A. (1993). College students’ conditional knowledge about reading, Journal of Educational Psychology, vol. 85, no. 2, pp. 239–252. doi:10.1037/0022-0663.85.2.239
Lupton, E. (2007). Thinking with type: a critical guide for designers, writers, editors, & students, Architectural Press, Princeton, NJ.
Lynch, C. (1999). Canonicalization: a fundamental tool to facilitate preservation and management of digital information, D-Lib Magazine, vol. 5, no. 9.
Lynch, C. (2001). The battle to define the future of the book in the digital world, First Monday, vol. 6, no. 6, http://www.firstmonday.dk/issues/issue6_6/lynch/index.html.
Lynch, C. (2007). The shape of the scientific article in the developing cyberinfrastructure, CTWatch Quarterly, August.
MacKay, W. E. (1999). Is paper safer? The role of paper flight strips in air traffic control, ACM Transactions on CHI, vol. 6, no. 4, pp. 311–340. doi:10.1145/331490.331491
Mackinlay, J., Robertson, G., & Card, S. (1991). The perspective wall: detail and context smoothly integrated, in Proceedings of CHI ’91, ACM Press, New York, pp. 173–179.
Malloy, J. (1991). Its name was Penelope, Eastgate, Watertown, MA.
Malone, T. W. (1983). How do people organize their desks? Implications for the design of office information systems, ACM Transactions on Office Information Systems, vol. 1, no. 1, pp. 99–112. doi:10.1145/357423.357430
Mander, R., Salomon, G., & Wong, Y. Y. (1992). A pile metaphor for supporting casual organization of information, in Proceedings of CHI ’92, ACM Press, New York, pp. 627–634. doi:10.1145/142750.143055
Manguel, A. (1996). A history of reading, Viking, New York.
Maniatis, P., Roussopoulos, M., Giuli, T., Rosenthal, D., Baker, M., & Muliadi, Y. (2005). LOCKSS: a peer-to-peer digital preservation system, ACM Transactions on Computer Systems, vol. 23, no. 1, pp. 2–50.
Marchionini, G. (1995). Information seeking in electronic environments, Cambridge University Press, Cambridge.
Marchionini, G. (2000). Evaluating digital libraries: a longitudinal and multifaceted view, Library Trends, vol. 49, no. 2, pp. 304–333.
Marchionini, G., Plaisant, C., & Komlodi, A. (2003). The people in digital libraries: multifaceted approaches to assessing needs and impact, in Digital library use: social practice in design and evaluation, eds. A. Bishop, N. Van House, & B. Buttenfield, MIT Press, Cambridge, MA, pp. 119–160.
Marshall, C. C. (1997). Annotation: from paper books to the digital library, in Proceedings of Digital Libraries ’97, ACM Press, New York, pp. 131–140.
Marshall, C. C. (1998). Toward an ecology of hypertext annotation, in Proceedings of ACM Hypertext ’98, ACM Press, New York, pp. 40–49. doi:10.1145/276627.276632
Marshall, C. C. (2003). Finding the boundaries of the library without walls, in Digital library use: social practice in design and evaluation, eds Bishop, Buttenfield, & Van House, MIT Press, Cambridge, MA, pp. 43–63.
Marshall, C. C. (2007). The gray lady gets a new dress: a field study of the times news reader, in Proceedings of JCDL 2007, ACM Press, New York, pp. 259–268.
Marshall, C. C. (2008a). Rethinking personal digital archiving: Part 1. Four challenges from the field, D-Lib Magazine, vol. 14, no. 3/4, doi:10.1045/march2008-marshall-pt1.
Marshall, C. C. (2008b). From writing and analysis to the repository: taking the scholars’ perspective on scholarly archiving, in Proceedings of JCDL ’08, ACM Press, New York, pp. 251–260.
Marshall, C. C. (2008c). Rethinking personal digital archiving: Part 2. Implications for services, applications, and institutions, D-Lib Magazine, vol. 14, no. 3/4, doi:10.1045/march2008-marshall-pt2.
Marshall, C. C. & Bly, S. (2004). Sharing encountered information: digital libraries get a social life, in Proceedings of JCDL ’04, ACM Press, New York, pp. 218–227.
Marshall, C. C. & Bly, S. (2005a). Saving and using encountered information: implications for electronic periodicals, in Proceedings of CHI ’05, ACM Press, New York, pp. 111–120.
Marshall, C. C. & Bly, S. (2005b). Turning the page on navigation, in Proceedings of JCDL ’05, ACM Press, New York, pp. 225–234. doi:10.1145/1065385.1065438
Marshall, C. C. & Brush, A. J. (2004). Exploring the relationship between personal and public annotations, in Proceedings of JCDL ’04, ACM Press, New York, pp. 349–357. doi:10.1145/996350.996432
Marshall, C. C. & Jones, W. (2006). Keeping encountered information, Communications of the ACM, vol. 49, no. 1, pp. 66–67. doi:10.1145/1107458.1107493
Marshall, C. C. & Ruotolo, C. (2002). Reading-in-the-small: a study of reading on small form factor devices, in Proceedings of JCDL ’02, ACM Press, New York, pp. 56–64.
Marshall, C. C. & Shipman III, F. M. (1995). Spatial hypertext: designing for change, Communications of the ACM, vol. 38, no. 8, pp. 88–97. doi:10.1145/208344.208350
Marshall, C. C. & Shipman III, F. M. (1997). Effects of hypertext technology on the practice of information triage, in Proceedings of ACM Hypertext ’97, ACM Press, New York, pp. 124–133.
Marshall, C. C., Price, M. N., Golovchinsky, G., & Schilit, B. N. (1999). Introducing a digital library reading appliance into a reading group, in Proceedings of Digital Libraries 99, ACM Press, New York, pp. 77–84. doi:10.1145/313238.313262
Marshall, C. C., Price, M., Golovchinsky, G., & Schilit, B. N. (2001). Designing e-books for legal research, in Proceedings of JCDL ’01, ACM Press, New York, pp. 41–48. doi:10.1145/379437.379445
Marshall, C. R. & Rossman, G. (2006). Designing qualitative research, Sage Publications, Thousand Oaks, CA.
McCown, F., Marshall, C. C., & Nelson, M. L. (to appear). ‘Why websites are lost (and how they’re sometimes found),’ Communications of the ACM.
McKnight, C. & Dearnley, J. (2003). Electronic book use in a public library, Journal of Librarianship and Information Science, vol. 35, no. 4, pp. 235–242. doi:10.1177/0961000603035004003
Meyer, P. (2004). The vanishing newspaper: saving journalism in the information age, University of Missouri Press, Columbia, MO.
Miles, M. B. & Huberman, A. M. (1994). Qualitative data analysis, Sage Publications, Thousand Oaks, CA.
Moran, T. P., Chiu, P., van Melle, W., & Kurtenbach, G. (1995). Implicit structures for pen-based systems within a freeform interaction paradigm, in Proceedings of CHI ’95, ACM Press, New York, pp. 487–494.
Moulthrop, S. (1993). You say you want a revolution: hypertext and the laws of media, in Essays in postmodern culture, eds E Amiran & J Unsworth, Oxford University Press, New York, pp. 69–97.
Nelson, T. (1984). Literary machines, edn 87.1, The Distributors, South Bend, IN.
Nielsen, J. (2009). Kindle 2 usability review, Alertbox, March 9, Retrieved August 10, 2009, http://www.useit.com/alertbox/kindle-usability-review.html.
Nielsen, J. (1997). How users read on the Web, Alertbox, October 1, Retrieved May 19, 2009, http://www.useit.com/alertbox/9710a.html.
Nunberg, G. (ed.) (1996). The future of the book, University of California Press, Berkeley, CA.
Nunberg, G. (1993). The places of books in the age of electronic reproduction, Representations, vol. 24, pp. 13–37.
OEBF (2002). Open eBook™ publication structure 1.2 specification, Open eBook Forum, Retrieved August 10, 2009, http://www.idpf.org/oebps/oebps1.2/download/oeb12-xhtml.htm.
O’Hara, K. & Sellen, A. (1997). A comparison of reading paper and on-line documents, in Proceedings of CHI ’97, ACM Press, New York, pp. 335–342. doi:10.1145/258549.258787
O’Hara, K., Smith, F., Newman, W., & Sellen, A. (1998). Student readers’ use of library documents: implications for library technologies, in Proceedings of CHI ’98, ACM Press, New York, pp. 233–240.
Pettigrew, K. E., Fidel, R., & Bruce, H. (2001). Conceptual frameworks in information behavior, Annual Review of Information Science and Technology, vol. 35, pp. 43–78.
Pettigrew, K. E., Durrance, J. C., & Unruh, K. T. (2002). Facilitating community information seeking using the internet: findings from three public library-community network systems, Journal of the American Society for Information Science and Technology, vol. 53, no. 11, pp. 894–903. doi:10.1002/asi.10120
Pickens, J., Golovchinsky, G., Shah, C., Qvarfordt, P., & Back, M. (2008). Algorithmic mediation for collaborative exploratory search, in Proceedings of SIGIR 2008, ACM Press, New York, pp. 315–322. doi:10.1145/1390334.1390389
Price, M. N., Golovchinsky, G., & Schilit, B. N. (1998). Linking by inking: trailblazing in a paperlike hypertext, in Proceedings of HT ’98, ACM Press, New York, pp. 30–39.
Putnam, R. (1995). Bowling alone: America’s declining social capital, Journal of Democracy, vol. 6, no. 1, pp. 65–78. doi:10.1353/jod.1995.0002
Qayyum, A. & Bilykh, I. (2005). Navigational characteristics of e-document readers, in Proceedings of ASIS&T, vol. 42, no. 1. doi:10.1002/meet.1450420109
Quinn, S. (2009). In search of: the best online reading experience, Poynter Online, March 4, Retrieved May 18, 2009, http://www.poynter.org/content/content_view.asp?id=78569.
Rayner, K. (1983). Eye movements in reading: perceptual and language processes, Academic Press, New York.
Remde, J. R., Gomez, L. M., & Landauer, T. K. (1987). SuperBook: an automatic tool for information exploration—hypertext?, in Proceedings of ACM Hypertext ’87, ACM Press, New York, pp. 175–188.
Renda, M. E. & Straccia, U. (2005). A personalized collaborative digital library environment: a model and an application, Information Processing & Management, vol. 41, no. 1, pp. 5–21. doi:10.1016/j.ipm.2004.04.007
Rioux, K. S. (2000). Sharing information found for others on the World Wide Web: a preliminary examination, in Proceedings of the 63rd Annual Meeting of the American Society for Information Science, Information Today, Medford, NJ, pp. 68–77.
Robertson, G., Czerwinski, M., Larson, K., Robbins, D., Thiel, D., & van Dantzich, M. (1998). Data mountain: using spatial memory for document management, in Proceedings of UIST ’98, ACM Press, New York, pp. 153–162.
Roescheisen, M., Mogensen, C., & Winograd, T. (1995). Shared web annotations as a platform for third-party value-added information providers: architecture, protocols, and usage examples, Technical Report STAN-CS-TR-97-1582, Stanford University, Stanford, CA.
Salton, G. & McGill, M. J. (1986). Introduction to modern information retrieval, McGraw-Hill, New York, NY.
Samuelson, P. (2003). DRM {and, or, vs.} the law, Communications of the ACM, vol. 46, no. 4, pp. 41–45. doi:10.1145/641205.641229
Sarwar, B., Konstan, J., Borchers, A., Herlocker, J., Miller, B., & Riedl, J. (1998). Using filtering agents to improve prediction quality in the grouplens research collaborative filtering system, in Proceedings of CSCW ’98, ACM Press, New York, pp. 345–354. doi:10.1145/289444.289509
Schilit, B. N., Golovchinsky, G., & Price, M. N. (1998). Beyond paper: supporting active reading with free form digital ink annotations, in Proceedings of CHI ’98, ACM Press, New York, pp. 249–256.
Schilit, B. N. & Kolak, O. (2008). Exploring a digital library through key ideas, in Proceedings of JCDL ’08, ACM Press, New York, pp. 177–186.
Schilit, B. N., Price, M. N., & Golovchinsky, G. (1998). Digital library information appliances, in Proceedings of DL ’98, ACM Press, New York, pp. 217–226. doi:10.1145/276675.276700
Schilit, B. N., Price, M. N., Golovchinsky, G., Tanaka, K., & Marshall, C. C. (1999). As we may read: the reading appliance revolution, IEEE Computer, vol. 32, no. 1, pp. 65–73.
Schraefel, M. C., Zhu, Y., Modjeska, D., Wigdor, D., & Zhao, S. (2002). Hunter gatherer, in Proceedings of WWW ’02, ACM Press, New York, pp. 172–181. doi:10.1145/511446.511469
Schriver, K. (1997). Dynamics of document design, John Wiley and Sons, New York.
Sellen, A. & Harper, R. (2001). The myth of the paperless office, MIT Press, Cambridge, MA.
Sharp, H., Rogers, Y., & Preece, J. (2007). Interaction design: beyond human–computer interaction, Wiley and Sons, Sussex, UK.
Sheedy, J. E., Subbaram, M. V., Zimmerman, A. B., & Hayes, J. R. (2005). Text legibility and the letter superiority effect, Human Factors, vol. 47, no. 4, pp. 797–815. doi:10.1518/001872005775570998
Sheridon, N., Howard, M., & Richley, E. (1997). Gyricon displays and electric paper, in Proceedings of Society for Information Display, San Jose, CA.
Shipman, F., Hsieh, H., Moore, J. M., & Zacchi, A. (2004). Supporting personal collections across digital libraries in spatial hypertext, in Proceedings of JCDL ’04, ACM Press, New York, pp. 358–367. doi:10.1145/996350.996433
Shipman, F., Hsieh, H., Maloor, P., & Moore, J. M. (2001). The visual knowledge builder: a second generation spatial hypertext, in Proceedings of Hypertext ’01, ACM Press, New York, pp. 113–122.
Shipman, F. M., Price, M. N., Marshall, C. C., Golovchinsky, G., & Schilit, B. N. (2003). Identifying useful passages in documents based on annotation patterns, in Proceedings of ECDL ’03, Springer Verlag, Heidelberg, Germany, pp. 101–112.
Shipman, F. & Marshall, C. C. (1999). Formality considered harmful: experiences, emerging themes, and directions on the use of formal representations in interactive systems, Computer Supported Cooperative Work (CSCW), vol. 8, no. 4, pp. 333–352.
Shum, S. B., & Sumner, T. (2001). JIME: an interactive journal for interactive media, First Monday, vol. 6, no. 2.
Silberman, S. (1998). Ex Libris: the joys of curling up with a good digital reading device, Wired, vol. 6, no. 7 (July), pp. 98–104.
Spiekermann, E. (2002). Stop stealing sheep & find out how type works, Adobe Press, San Jose, CA.
Stefik, M. (1997). Shifting the possible: how trusted systems and digital property rights challenge us to rethink digital publishing, Berkeley Technology Law Journal, vol. 12, no. 1.
St. Laurent, S. & Fitzgerald, M. (2005). XML pocket reference, 3rd edn, O’Reilly, Sebastopol, CA.
Suchman, L. A. & Trigg, R. H. (1991). Understanding practice: video as a medium for reflection and design, in Design at work: cooperative design of computer systems, eds J Greenbaum & M Kyng, Erlbaum, Hillsdale, NJ, pp. 65–90.
Suh, B., Woodruff, A., Rosenholtz, R., & Glass, A. (2002). Popout prism: adding perceptual principles to overview+detail document interfaces, in Proceedings of CHI ’02, ACM Press, New York, pp. 251–258.
Sun, L. & Guimbretière, F. (2005). Flipper: a new method for digital document navigation, in Proceedings of CHI ’05 (Extended Abstracts), ACM Press, New York, pp. 2001–2004.
Surowiecki, J. (2004). The wisdom of crowds: why the many are smarter than the few and how collective wisdom shapes business, economies, societies and nations, Little, Brown, New York.
Teevan, J., Dumais, S. T., & Horvitz, E. (2005). Personalizing search via automated analysis of interests and activities, in Proceedings of SIGIR 2005, ACM Press, New York, pp. 449–456. doi:10.1145/1076034.1076111
Terry, D. (2008). Replicated data management for mobile computing, Morgan & Claypool, San Rafael, CA.
Thorngate, W. (1987). On paying attention, in Recent trends in theoretical psychology, eds Baker et al., Springer-Verlag, New York, pp. 247–263.
Thurman, N. (2006). Participatory journalism in the mainstream: attitudes and implementation at British news websites, in Proceedings of 7th International Symposium on Online Journalism, April 8, Austin, TX.
Tinker, M. A. (1963). Legibility of print, Iowa State University Press, Ames, IA.
Toms, E. G. (2000). Serendipitous information retrieval, in Proceedings of the First DELOS Network of Excellence Workshop on Information Seeking, Searching and Querying in Digital Libraries, Dec. 11–12, Zurich, Switzerland, http://www.ercim.org/publication/wsproceedings/DelNoe01/3_Toms.pdf.
Tufte, E. R. (1990). Envisioning information, Graphics Press, Cheshire, CT.
Twidale, M. B. (2000). Interfaces for supporting over-the-shoulder learning, in Proceedings, HICS 2000, The Beckman Institute, University of Illinois at Urbana–Champaign, IL, pp. 33–37.
Tyrrell, R. & Leibowitz, H. (1990). The relation of vergence effort to report of visual fatigue following prolonged near work, Human Factors, vol. 32, no. 3, pp. 341–357.
Ullman, E. (1997). Close to the machine: technophilia and its discontents, City Lights Books, San Francisco, CA.
Unsworth, J. (ed) (2006). Our cultural commonwealth: the report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences, John Unsworth (Commission Chair), with commission members and Marlo Welshons (editor), ACLS, New York.
Uren, V., Buckingham Shum, S., Bachler, M., & Li, G. (2006). Sensemaking tools for understanding research literatures: design, implementation and user evaluation, International Journal of Human–Computer Studies, vol. 64, no. 5, pp. 420–445. doi:10.1016/j.ijhcs.2005.09.004
van Dam, A. (1988). Hypertext ’87 keynote address, Communications of the ACM, vol. 31, no. 7, pp. 887–895. doi:10.1145/48511.48519
van Oostendorp, H. (1996). Studying and annotating electronic text, in Hypertext and cognition, eds J Rouet, J Levonen, A Dillon, & R Spiro, Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 137–47.
Virshup, A. (1996). The teachings of Bob Stein, Wired, vol. 4, no. 7 (July), http://www.wired.com/wired/archive/4.07/stein_pr.html.
Vishik, C. M. (1997). Internal information brokering and patterns of usage on corporate intranets, in Proceedings of Group ’97, ACM Press, New York, pp. 111–118.
Watters, C. R., Shepherd, M. A., Chiasson, T., & Manchester, L. (2004). An evaluation of two metaphors for electronic news presentation, in Digital documents: systems and principles, Springer, Berlin, pp. 223–241.
Weinreich, H., Obendorf, H., Herder, E., & Mayer, M. (2008). Not quite the average: an empirical study of Web use, ACM Transactions on the Web, vol. 2, no. 1, http://doi.acm.org/10.1145/1326561.1326566.
Wellman, B., Haase, A. Q., Witte, J., & Hampton, K. (2001). Does the Internet increase, decrease, or supplement social capital? Social networks, participation, and community commitment, American Behavioral Scientist, vol. 45, no. 3, pp. 436–455. doi:10.1177/00027640121957286
Whittaker, S. & Hirshberg, J. (2001). The character, value, and management of personal paper archives, ACM Transactions on Computer–Human Interaction, vol. 8, no. 2, pp. 150–170. doi:10.1145/376929.376932
Williamson, K. (1998). Discovered by chance: the role of incidental information acquisition in an ecological model of information use, Library & Information Science Research, vol. 20, no. 1, pp. 23–40. doi:10.1016/S0740-8188(98)90004-4
Wise Jr, J. A., Thomas, J. J., Pennock, K., Lantrip, D., Pottier, M., Schur, A., & Crow, V. (1995). Visualizing the non-visual: spatial analysis and interaction with information from text documents, in Proceedings of IEEE Symposium on Information Visualization ’95, IEEE Press, Los Alamitos, CA, pp. 51–58. doi:10.1109/INFVIS.1995.528686
Wolfe, J. (2000). Effects of annotations on student readers and writers, in Proceedings of JCDL 2000, ACM Press, New York, pp. 19–26. doi:10.1145/336597.336620
Woodruff, A., Gossweiler, R., Pitkow, J., Chi, E., & Card, S. (2000). Enhancing a digital book with a reading recommender, in Proceedings of CHI 2000, ACM Press, New York, pp. 153–160. doi:10.1145/332040.332419
Zhai, S., Smith, B. A., & Selker, T. (1997). Improving browsing performance: a study of four input devices for scrolling and pointing tasks, in IFIP Conference Proceedings, vol. 96, pp. 286–293.
Author Biography
Catherine C. Marshall is currently a senior researcher at Microsoft Research’s Silicon Valley Laboratory after a stint in Microsoft’s product divisions as part of the Advanced Reading Technologies Team. She was a long-time member of the research staff at Xerox Palo Alto Research Center (PARC) and is an affiliate of the Center for the Study of Digital Libraries at Texas A&M University. The author has delivered keynotes at WWW, Hypertext, Usenix FAST, CNI, VALA, ACH–ALLC, and a variety of other Computer Science and Library and Information Science venues. This lecture is a synthesis of many of these talks. Visit Marshall’s webpage at http://www.csdl.tamu.edu/~marshall for more information about her publications, blog, contact details, and how she is related to Elvis.