The World Told and the World Shown: Multisemiotic Issues
Eija Ventola and Arsenio Jesús Moya Guijarro
Also by Eija Ventola
FROM LANGUAGE TO MULTIMODALITY: New Developments in the Study of Ideational Meaning (edited with C. Jones, 2008)
INTERPERSONAL COMMUNICATION (edited with G. Antos and T. Weber, 2008)
PERSPECTIVES ON MULTIMODALITY (edited with C. Charles and M. Kaltenbacher, 2004)
THE LANGUAGE OF CONFERENCING (edited with C. Shalom and S. Thompson, 2002)
DISCOURSE AND COMMUNITY: Doing Functional Linguistics (editor, 2000)
COHERENCE IN SPOKEN AND WRITTEN DISCOURSE: How to Create It and How to Describe It (edited with W. Bublitz and U. Lenk, 1999)
ACADEMIC WRITING: Intercultural and Textual Issues (edited with A. Mauranen, 1996)
FUNCTIONAL AND SYSTEMIC LINGUISTICS: Approaches and Uses (editor, 1991)
THE STRUCTURE OF SOCIAL INTERACTION: A Systemic Approach to Semiotics of Service Encounters (1987)
Also by Arsenio Jesús Moya Guijarro
THE TEACHING AND LEARNING OF FOREIGN LANGUAGES WITHIN THE EUROPEAN FRAMEWORK (La enseñanza de las lenguas extranjeras en el marco europeo) (edited with J. I. Albentosa and C. Harris, 2006)
La Enseñanza de la Lengua Extranjera en la Educación Infantil (edited with J. I. Albentosa, 2003)
Narración Infantil y Discurso: Estudio Lingüístico de Cuentos en Castellano e Inglés (with J. I. Albentosa, 2001)
TALK AND TEXT: Studies in Spoken and Written Discourse (edited with A. Downing and J. I. Albentosa, 2000)
PATTERNS IN DISCOURSE AND TEXT: Ensayos de Análisis del Discurso en Lengua Inglesa (edited with A. Downing and J. I. Albentosa, 1998)
Edited by
Eija Ventola, University of Helsinki, Finland
and
Arsenio Jesús Moya Guijarro, Universidad de Castilla-La Mancha, Spain
Selection and editorial matter © Eija Ventola and Arsenio Jesús Moya Guijarro 2009
Chapters © their individual authors
All rights reserved. No reproduction, copy or transmission of this publication may be made without written permission.
No portion of this publication may be reproduced, copied or transmitted save with written permission or in accordance with the provisions of the Copyright, Designs and Patents Act 1988, or under the terms of any licence permitting limited copying issued by the Copyright Licensing Agency, Saffron House, 6-10 Kirby Street, London EC1N 8TS.
Any person who does any unauthorized act in relation to this publication may be liable to criminal prosecution and civil claims for damages.
The authors have asserted their rights to be identified as the authors of this work in accordance with the Copyright, Designs and Patents Act 1988.
First published 2009 by PALGRAVE MACMILLAN
Palgrave Macmillan in the UK is an imprint of Macmillan Publishers Limited, registered in England, company number 785998, of Houndmills, Basingstoke, Hampshire RG21 6XS.
Palgrave Macmillan in the US is a division of St Martin’s Press LLC, 175 Fifth Avenue, New York, NY 10010.
Palgrave Macmillan is the global academic imprint of the above companies and has companies and representatives throughout the world.
Palgrave® and Macmillan® are registered trademarks in the United States, the United Kingdom, Europe and other countries.
ISBN: 978–0–230–57635–3 hardback
This book is printed on paper suitable for recycling and made from fully managed and sustained forest sources. Logging, pulping and manufacturing processes are expected to conform to the environmental regulations of the country of origin.
A catalogue record for this book is available from the British Library.
A catalog record for this book is available from the Library of Congress.
Printed and bound in Great Britain by CPI Antony Rowe, Chippenham and Eastbourne
Contents
List of Illustrations
List of Tables
List of Figures
List of Appendices
Acknowledgements
Notes on Contributors
1 Introduction. The World Told and the World Shown: Multisemiotic Issues. Eija Ventola and Arsenio Jesús Moya Guijarro
Part I Multimodal Theories: Coding the Visual
2 Multisemiosis and Context-Based Register Typology: Registerial Variation in the Complementarity of Semiotic Systems. Christian M. I. M. Matthiessen
3 Developing Multimodal Texture. Martin Thomas
4 Metonymy in Visual and Audiovisual Discourse. Charles Forceville
5 What Makes Us Laugh? Verbo-Visual Humour in Newspaper Cartoons. Elisabeth El Refaie
6 Citizenship and Semiotics: Towards a Multimodal Analysis of Representations of the Relationship between the State and the Citizen. Giulio Pagani
Part II Children’s Narratives and Multisemiotics
7 On Interaction of Image and Verbal Text in a Picture Book. A Multimodal and Systemic Functional Study. Arsenio Jesús Moya Guijarro and María Jesús Pinar Sanz
8 The Text-Image Matching: One Story, Two Textualizations. María Cristina Astorga
Part III Text and Visual Interaction in Advertising and Marketing
9 Sequential Visual Discourse Frames. Kay L. O’Halloran and Victor Lim Fei
10 A Systemic Functional Framework for the Analysis of Corporate Television Advertisements. Sabine Tan
11 Multisemiotic Marketing and Advertising: Globalization versus Localization and the Media. Anna Hopearuoho and Eija Ventola
Part IV Multisemiotics in Enacted Roles and Virtual Identities
12 Taking the Viewer into the Field: Interaction between Visual and Verbal Representation in a Television Earth Sciences Documentary. Alison Love
13 Developing the Metafunctional Framework for Analysing Multimodal Hypertextual Identity Construction. Arianna Maiorani
Part V Integrating Text, Visual and Space Multimodally
14 From Musing to Amusing: Semogenesis and Western Museums. Maree Stenglin
15 Floods and Fidget Wheels: A Comparative Systemic Functional Analysis of Slessor’s ‘Five Bells’ and Olsen’s ‘Salute to Five Bells’. Kathryn Tuckwell
Index
Illustrations
3.1 The back of UK and Taiwan Head and Shoulders shampoo bottles
3.2 Three faces of a UK Sensodyne Original Toothpaste pack: front (Face 1), side (Face 2) and back (Face 3)
4.1 Billboard for Interpolis Insurances, photographed in Haarlem, Holland, summer 2006; original in colour
4.2 Billboard for ABN-Amro, photographed at Schiphol airport, Holland, 2006; original in colour
4.3 Extreme close-up of the priest’s mouth shouting in Joan’s ear (a film still from La Passion de Jeanne d’Arc, Carl Dreyer)
4.4 A priest puts a pen in Joan’s hand, urging her to sign a declaration she recants (a film still from La Passion de Jeanne d’Arc, Carl Dreyer)
4.5 Fontaine fiddles with the car door handle, considering escape (a film still from Un Condamné à Mort s’est Échappé, Robert Bresson)
4.6 Fontaine opens his handcuffs with a pin (a film still from Un Condamné à Mort s’est Échappé, Robert Bresson)
5.1 Mac (Stan McMurty), Daily Mail, 5 November 2004, p. 17
5.2 Peter Schrank, Independent, 15 October 2004, p. 38
6.1 Co-occurrence of public sector and private sector bus liveries
7.1 Narrative process: symmetrical interaction
7.2 Narrative process: complementary interaction
8.1 The orientation stage in the simplified and non-abridged versions
9.1 Cartier advertisement and centrefold (Time Asia, 8 May 2006)
11.1 Toyota advertisements in different media
13.1 First phase of LOTRO character creation – choice of race and gender. Retrieved from www.youtube.com, April 2008
Tables
2.1 Intersection of visual and aural contact values (adapted from Martin, 1992: Table 7.3), with examples of multisemiotic combinations; cells representing prototypical spoken language and prototypical written language are shaded
2.2 The different SOCIO-SEMIOTIC PROCESS types and PHENOMENAL DOMAIN values for written mode with examples of multisemiotic documents
7.1 Visual and verbal interaction
9.1 SF-MDA framework for print advertisements (based on O’Halloran, 2008a)
10.1 Phasal and information structure of the HSBC-text
10.2 Intersemiotic repetition of conceptual narrative theme
15.1 Functions and systems in painting, following O’Toole (1994: Chapter 1)
15.2 Experiential meanings in Five Bells
Figures
2.1 The stratification of semiotic systems: connotative and denotative semiotic systems; context as connotative system, stratification of denotative systems into content plane and expression plane
2.2 Multisemiotic integration and diversification as a cline represented in terms of the ordered typology of systems
2.3 Multisemiotic possibilities – cline of integration of different semiotic systems
2.4 The system network of MOOD, with systems realized prosodically by tone integrated systemically (illustrated for ‘declarative’ mood) with systems realized by the presence and relative sequence of elements of the modal structure of the clause
2.5 Integration of (ideational) meanings realized by images into linguistic semantics in WHO weekly reports
2.6 Integration of display (table) in multisemiotic text by means of identifying relational clause, with display construed as Token and interpretation of display as Value
2.7 Ontogenesis – from multimodal protolanguage (Halliday’s, 2004, phase I) via a transitional period to language and ‘paralanguages’ (phases II and III)
2.8 Mode, calibration of division of labor among systems operating in context – division of semiotic labour between denotative semiotic systems (language and paralanguage); division of socio-semiotic labour between denotative semiotic systems and social systems
2.9 Material distance and semiotic distance (tenor) – Edward T. Hall’s ‘distance sets’
2.10 Kinds of semiotic system in relation to registerial range – from special-purpose systems to general-purpose ones
3.1 Waller’s (1987) illustrations of Gestalt principles of grouping
8.1 System of reference in two versions of The Sly Fox and the Red Hen
10.1 Soundscapes: waveform analysis of the HSBC-text
13.1 Construction page schematic structure
13.2 Systemic functional representation of the hyper-contextual functional process of causation
14.1 Activities associated with the eighteenth-century museum
14.2 Participants associated with the eighteenth-century museum
14.3 Activities associated with a hybrid museum
14.4 Participants associated with a hybrid museum
Appendices
10.1a Excerpt of transcription template for the multimodal analysis of dynamic moving images
10.1b Phasal and narrative structure in the HSBC-text
10.1c List of notations/notational symbols
13.1 Systemic functional analysis of The Fellowship of the Ring
15.1 ‘Five Bells’ analysis
15.2 Olsen’s mural, Salute to Five Bells
15.3 A detail of Olsen’s mural
Acknowledgements
The editors of this book want to thank all the authors for embarking upon this project with us when it was first announced during two symposia on multisemiotics held at the University of Castilla-La Mancha and the University of Helsinki. We thank the authors for their co-operation and for their patience during the production process of the book. We would also like to thank the Departments of English of our respective universities for providing us with the necessary financial means for the working meetings we have had during the preparation of this book and for financing some editorial help for us. We are particularly grateful to Tuomo Hiippala for revising the references included in each chapter and compiling the index, and to Edie Cruise for her stylistic suggestions and helpful advice. Last but certainly not least, we wish to express our gratitude for permission to reproduce both visual and written material in this book to the following companies and institutions:
● Procter and Gamble for kind permission to reproduce Illustration 3.1, The back of UK and Taiwan Head and Shoulders shampoo bottles.
● GlaxoSmithKline for permission to use Illustration 3.2, Three faces of a UK Sensodyne Original Toothpaste pack: front, side and back.
● R. Waller for permission to use the material shown in Figure 3.1, Waller’s (1987) illustrations of Gestalt principles of grouping.
● Interpolis insurances for Illustration 4.1 from Billboard for Interpolis, photographed in Haarlem, Holland, 2006.
● ABN-Amro for Illustration 4.2 from Billboard for ABN-Amro, photographed at Schiphol airport, Holland, 2006.
● GAUMONT for permission to reproduce Illustration 4.3, Extreme close-up of the priest’s mouth shouting in Joan’s ear (a film still from La Passion de Jeanne d’Arc, Carl Dreyer, France © 1928 GAUMONT), and Illustration 4.4, A priest puts a pen in Joan’s hand, urging her to sign a declaration she recants (a film still from La Passion de Jeanne d’Arc, Carl Dreyer, France © 1928 GAUMONT).
● GAUMONT/NOUVELLES ÉDITIONS DE FILMS for permission to reproduce Illustration 4.5 (a film still from Un Condamné à Mort s’est Échappé, Robert Bresson, France © 1956) and Illustration 4.6 (a film still from Un Condamné à Mort s’est Échappé, Robert Bresson, France © 1956).
● Cartoonists Stan McMurty and Peter Schrank for their kind permission to reprint their work in Illustration 5.1, Mac (Stan McMurty), Daily Mail, 5 November 2004, p. 17 and in Illustration 5.2, Peter Schrank, Independent, 15 October 2004, p. 38.
● John Law for permission to use the photograph in Illustration 6.1, Co-occurrence of public sector and private sector bus liveries.
● Walker Books Ltd, London SE11 5HJ for permission to reproduce Illustrations 7.1 and 7.2 from GUESS HOW MUCH I LOVE YOU by Sam McBratney, illustrated by Anita Jeram. Illustrations © 1994.
● Ladybird Books Ltd for the kind permission to use Illustration 8.1, from The Sly Fox and The Little Red Hen retold by Joan Stimson, illustrated by Brian Price Thomas © Ladybird Books Ltd 1993, and from The Sly Fox and Red Hen written by Sue Ullstein, illustrated by John Dyke © Ladybird Books Ltd 1987.
● Group Marketing HSBC Holdings plc – HGHQ for the illustrations and material included in Chapter 10 from stills from HSBC’s ‘Okey Doke’ motorcycle television commercial © 2004.
● Toyota Motor Sales, U.S.A., Inc. for the Avalon image in the banner ad and Toyota Finland for the other ads in Chapter 11.
● HarperCollins Publishers for permission to use 50 lines of the poem Five Bells by Kenneth Slessor.
● Sydney Opera House Trust for kindly permitting the use of the images included in Chapter 15.
Every effort has been made to acknowledge ownership of copyright. The editors offer their apologies if any further copyright holders have been infringed upon unknowingly. The publisher will be glad to make suitable arrangements with any copyright holder whom it has not yet been possible to contact.
Contributors
María Cristina Astorga is an Associate Professor in the Faculty of Humanities at the National University of Río Cuarto, Argentina. She is also a Lecturer in Theories of Second Language Acquisition in the MA Programs at the National University of Río Cuarto and the National University of Córdoba. The focus of her research is on development in academic foreign and second language writing from systemic functional and cognitive perspectives.
Elisabeth El Refaie is a Lecturer in Communication at the Centre for Language and Communication Research at Cardiff University, United Kingdom. The focus of her research is on visual and multimodal forms of narrative, rhetoric and humour, and she is currently working on a project which uses the graphic novel to explore multimodal semiotics. Her work has appeared in scholarly journals such as Visual Communication, Journal of Pragmatics, Journal of Sociolinguistics and Journal of Contemporary European Studies.
Charles Forceville, Associate Professor in the Media Studies Department of the University of Amsterdam (Holland), studied English language and literature. He published Pictorial Metaphor in Advertising (Routledge 1996), and currently co-edits Multimodal Metaphor (Mouton de Gruyter 2009). His research has appeared in journals including Metaphor and Symbol, Journal of Pragmatics, Language and Literature, Poetics, Poetics Today, The New Review of Film and Television Studies, and in edited books. He also serves on the advisory boards of Metaphor and Symbol, Journal of Pragmatics, Public Journal of Semiotics, Atlantis, and Digital Studies.
Anna Hopearuoho is a former student of the Department of English, University of Helsinki, Finland. Her MA thesis, titled Advertising to Women through the Internet – a comparative multimodal study between Finnish and English, focused on the multisemiotic aspects of Internet advertising and the cultural factors influencing advertising through the Internet. She is currently working for a Finnish company, Wärtsilä Corporation, where her duties include preparing promotional material for both traditional and electronic media.
Victor Lim Fei is a PhD Research Scholar at the Multimodal Analysis Lab, Interactive Digital Media Institute at the National University of Singapore, Singapore. He has been awarded the Singapore Ministry of Education Postgraduate Scholarship and has been a past recipient of the National University of Singapore Research Scholarship and the Singapore Public Service Commission Scholarship. His research interests are in multimodality, literacy and pedagogy and he has also published several papers and book chapters on image–text relations and curriculum development.
Alison Love is an Associate Professor of English Language and Linguistics at the National University of Lesotho, Lesotho, where she teaches discourse analysis, sociolinguistics and pragmatics. She has also taught at universities in Zambia, Zimbabwe and Botswana. Her research interests include the discourse of academic disciplines and political discourse, particularly that of Southern Africa. She has published articles in English for Specific Purposes and Discourse and Society and chapters in a number of collections.
Arianna Maiorani is a Lecturer in Linguistics at Loughborough University, United Kingdom. Her main fields of research are discourse analysis applied to the study of literary texts and multimodality discourse analysis applied to the study of visual outputs, websites and online environments. Among her most recent publications is ‘Movies “reloaded” into commercial reality: representational structures in “The Matrix” trilogy promotional posters’, in From Language to Multimodality, New Developments in the Study of Ideational Meaning, Carys Jones and Eija Ventola, Equinox 2008.
Christian M. I. M. Matthiessen is a Chair Professor of Linguistics and Head of the Department of English at the Hong Kong Polytechnic University. He has been involved in the development of Systemic Functional theory, description, modelling and application since his early work on text generation by computer, including exploration of multimodal generation in the mid-1980s. His work on multimodality and multisemioticity has been in the area of text analysis, computational modelling (e.g. the Multex system in the second half of the 1990s), and theoretical underpinnings.
Arsenio Jesús Moya Guijarro, Professor of Language and Linguistics at the Fray Luis de León Teacher’s College, University of Castilla-La Mancha, Spain, does research in discourse and text analysis. He has published several articles on information, thematicity and multimodal discourses in international journals such as Word, Text, Functions of Language and Journal of Pragmatics. His research interests are also in Children’s Literature and Applied Linguistics. Within this framework he has co-edited The Teaching and Learning of Foreign Languages within the European Framework.
Kay L. O’Halloran is Director of the Multimodal Analysis Lab, Interactive and Digital Media Institute (IDMI) and Associate Professor in the Department of English Language and Literature at the National University of Singapore. She is an internationally recognized scholar in multimodal analysis and has given plenary addresses on multimodal approaches to mathematics and science and the use of digital technology for multimodal analysis at many international conferences. She is Principal Investigator for several large projects in the Multimodal Analysis Lab. For further information, please see http://multimodal-analysis-lab.org/.
Giulio Pagani, Department of Linguistics and English Language at Lancaster University, United Kingdom, has been awarded an Arts and Humanities Research Council grant to pursue doctoral research on representing the state in the English and French semiotic and social systems. He also teaches discourse analysis at the University of East Anglia. His research interests include systemic functional linguistics, social semiotics and political discourse analysis, particularly in application to the discourse and activities of public sector institutions.
María Jesús Pinar Sanz is a Lecturer in Linguistics and Discourse Analysis at the University of Castilla-La Mancha, Spain. Her research interests are in multimodal discourse analysis and, more specifically, in aspects related to the analysis of election campaigns and political advertising. She has published several articles on the generic structure of political ads and the relationship between the verbal and visual elements not only in political texts, but also in children’s narratives.
Maree Stenglin is a Lecturer in Literacy and Learning in the Faculty of Economics and Business at the University of Sydney, Australia. Her research interests include literacy and learning, discourse analysis, English for Academic Purposes (EAP), multimodality and the semiosis of 3D space. Her most recent publications focus on interpersonal communication, spatial semiotics and multimodal semiotics in the context of education.
Sabine Tan is a PhD Research Scholar in Language Studies at the National University of Singapore, Singapore. Her research interests include social semiotics, visual communication, and multimodal discourse analysis. She is particularly interested in applications of systemic functional theory and its derivatives to the analysis of business discourses, corporate television advertisements, corporate web pages, televisual and Internet-based news discourse, and other emergent multimodal discourse genres.
Martin Thomas works at the Centre for Translation Studies at the University of Leeds, United Kingdom, where he has contributed to various projects involving corpus linguistics and translator training. In 2005, he began his doctoral research on multimodal variation across fast-moving consumer goods packaging from English- and Chinese-speaking markets. Though based in translation studies, this project also draws on multimodal discourse analysis, information design and computational linguistics.
Kathryn Tuckwell is a researcher at the Centre for Language in Social Life at Macquarie University, Australia, and she has worked on discourse analysis projects focusing on news reporting, professional discourse in medical and legal settings and pharmaceutical advertising. She is also completing a doctoral thesis on complexity, investigating in particular the linguistic features of a popular science explanation of emergent complexity. Along with multimodality and intersemiosis, her other interests include literary stylistics and the semiotics of street art.
Eija Ventola, Professor at the Department of English at the University of Helsinki, Finland, studied English language and literature in Finland and linguistics in Australia. She has held professorial posts in Germany and in Austria and visiting researcher and guest professorships in various countries. Her research areas include functional linguistics, analysis of various kinds of discourses (e.g. casual, service, academic, business, media), applications of linguistics to teaching and learning, and issues of multisemiotic aspects of communication. She has published and edited 14 books altogether and written over 80 research articles. Together with her students, she regularly organizes international MUST-research symposia, Multi-Semiotic Talks (e-mail: [email protected]), which focus on the challenges that multisemiotic changes in global and media communication pose for our societies.
1 Introduction. The World Told and the World Shown: Multisemiotic Issues
Eija Ventola and Arsenio Jesús Moya Guijarro
This collection of papers represents the research of scholars working within different contexts, sub-disciplines and languages in different parts of the world, while sharing the frameworks of systemic functional linguistics and visual semiotics. The volume is concerned with the development of multimodal, or rather multisemiotic, meaning-making theory, and it extends the means of multisemiotic analysis of texts and visuals in today’s media-oriented world, hence the title of the book, The World Told and the World Shown: Multisemiotic Issues. It draws the attention of linguists and students alike to the fact that language rarely stands alone, that is, mono-modally, in written and spoken discourses, and that we urgently need to sharpen our tools for analysing discourses multisemiotically. We cannot continue analysing language alone, but need an integrated multisemiotic approach, and the volume shows various ways of conducting analysis within such an approach on multisemiotically realized discourses. The principal aim of the volume is to point out the ways in which spoken and written discourses combine with other modes, simultaneously making use of the multiple resources of different semiotic systems as they are created and consumed. The chapters discuss the relationship between the discourses that ‘tell’ and visuals (either still or moving, like film) that ‘show’. All the writers of the volume share the viewpoint that the various modes specialize in the transmission of particular meanings, and they understand the way discourses work in today’s world as a semiosis of such varied modes. Discourses in our modern societies always make use of the various resources of semiotic systems, and the following chapters show how we can interpret what people say and do by means of words and images.
The innovative component of this book, in comparison to existing books in the field, is the application of current multisemiotic theories to a great variety of genres: picture books, billboards, cartoons, advertising, web games, science documentaries, poetry, etc. The volume begins with chapters that take the theorizing of the relation between text/discourse and visualization a step beyond current frameworks. The book, which is divided into five sections, highlights the importance of cultural and social aspects in the configuration of language and visualizations as well as their uses in the community.
The first Part, Multimodal Theories: Coding the Visual, contains five chapters that represent multimodal views in systemic functional linguistics, cognitive linguistics and social semiotics. Each focuses, from its own perspective, on some relevant expansions of current multimodal theories; for the reader, these are complementary approaches. The concepts introduced are, for example, the cline of integration of telling and showing, multimodal cohesion, metonymy, multimodal issues in representations of humour, semiotic metaphor and resemiotization. They challenge current views and encourage the theoretical and analytical experimentation which can break conventional boundaries of research on multisemiotics.
Part I begins with Chapter 2 by Christian Matthiessen, Multisemiosis and Context-Based Register Typology: Registerial Variation in the Complementarity of Semiotic Systems. Matthiessen discusses some essential aspects of multisemiotic systems operating together in the same context. He explores these systems in terms of a typology of systems of different orders (physical, biological, social and semiotic systems), and he proposes ‘a cline of integration’ for different semiotic systems. He argues that at one pole of this cline, different semiotic systems are in fact integrated within one and the same semiotic system, and gives the integration of ‘melody’ into language in the form of intonation as an example. However, as we move towards the other pole of the cline of semiotic integration, he claims, we find semiotic systems that are increasingly distinct and separate from one another.
Thus, it is necessary to account for these distinct and separate systems that, nevertheless, operate together to create meaning in a mutually supportive way. This involves exploring the context in which they are coordinated. He illustrates the operation of parameters set out to study context, in particular Mode and Field. He shows the value of investigating the cooperation of multisemiotic systems, especially when some meanings are ‘at risk’ within a register that operates in a particular kind of context.
Chapter 3, Developing Multimodal Texture, by Martin Thomas, first shows how the theory of systemic functional linguistics has been adapted by semioticians, and how the theoretical multisemiotic tools have been expanded to cover such systems as information value, salience and framing. By looking at designs of packages that come from three distinct locales, China, Taiwan and the United Kingdom, he is able to point out the necessity of developing the theory to begin to account for multimodal texture as well. In those cases in which the systems of framing and salience proposed by the grammar of visual design are not sufficient to account for the texture of multimodal messages, the field of typography (modulation and segmentation) provides us with further tools allowing the creation of multimodal cohesion.
In Chapter 4, Metonymy in Visual and Audiovisual Discourse, Charles Forceville’s starting point is a cognitivist-oriented approach to an originally literary concept, the metaphor, which, as he points out, has traditionally been considered a matter of language. It is now a common assumption among cognitivist linguists that other tropes besides metaphor are worthy of their attention, particularly metonymy, although research has still focused strongly on the verbal manifestations of metonymy. However, Forceville argues that, like metaphor, metonymy is a conceptual phenomenon rather than a verbal one, and that it should therefore also appear in sign systems other than language. In this chapter, he formulates parameters that can help guide further research into non-verbal and multimodal metonymy. To support his claims, Forceville analyses a number of pictorial and multimodal metonyms in advertisements and film to show that cultural knowledge and narrative context turn out to be essential in the construction of the metonymy and its interpretation.
In Chapter 5, What Makes Us Laugh? Verbo-Visual Humour in Newspaper Cartoons, Elisabeth El Refaie outlines the three main approaches to humour: superiority theory, incongruity theory and release theory. She attempts to formulate an integrated approach to what is told and what is shown in cartoons in British newspapers. The chapter develops ways of understanding humour and the creative mechanisms and social functions of laughter and ridicule. Earlier approaches have focused on verbal humour only, jokes in particular, and they have ignored the important role of visuals, music, sound and voice in many cases of humour.
The chapter develops ways of theorizing and analysing multisemiotic humorous texts and emphasizes the importance of perceived intentionality, cultural knowledge and the shared common ground in understanding humour in cartoons.
In the last chapter of this section, Chapter 6, Citizenship and Semiotics: Towards a Multimodal Analysis of Representations of the Relationship between the State and the Citizen, Giulio Pagani examines the discursive construction of states and citizens by considering the meanings of the multisemiotic texts made publicly available. He proposes a systemic functionally based model for analysing multisemiotic meaning-making resources. His chapter focuses on the semiotic potential of discourses in public sector service provision. Complementing the cognitive perspective discussed by Forceville in Chapter 4, he demonstrates how a combined analysis of register, semiotic metaphor and ‘resemiotization’ can be used to track meaning-making and interaction across a range of modes. He shows how the critical investigation of multimodal discourse resources is a valuable and worthwhile task for analysing how states and citizens shape their expectations of each other.
Part II, Children’s Narratives and Multisemiotics, includes two chapters that cover the interaction between the verbal and the visual in children’s narrative picture books. The meaning potential of such tales can only be fully revealed by detailed multimodal analyses – they show how what is told and what is shown complement and enhance one another. Using and adapting earlier frameworks on visual design and functional linguistics, the authors in this section highlight the ways in which the intersemiotic interaction of verbal and non-verbal modes contributes to the process of constructing meanings in picture books written for young readers, whether native speakers or second-language learners. In Chapter 7, On Interaction of Image and Verbal Text in a Picture Book. A Multimodal and Systemic Functional Study, Arsenio Jesús Moya Guijarro and María Jesús Pinar Sanz analyse the co-deployment and interaction of verbal and visual elements in Guess How Much I Love You, a children’s narrative for six-year-olds and under. The study reveals an essentially symmetrical/complementary creation of meaning at both the visual and verbal levels. As the narrative is intended for young children, no cases of contradictory or counterpointing interactions have been identified; rather, the visual and verbal components seem to reinforce each other and fulfil complementary roles in the meaning-making process. In Chapter 8, The Text-Image Matching: One Story, Two Textualizations, María Cristina Astorga analyses the interaction of text and image within the context of EFL/L2 learning. The author compares two different versions of the same story intended for young readers whose mother tongue is English and for those learning English as a foreign or second language. This comparative analysis focuses simultaneously on both modes – the told and the shown – in order to determine to what extent these two versions may or may not share a resemblance.
Using the grammar of visual design, the author captures the essential experiential meanings of the stories as they are communicated by both language and images, and shows how links between the processes, participants and circumstances are realized both linguistically and visually. The findings from the study suggest that in order to enhance the teaching of the visualized stories, EFL/L2 teachers need to learn to reread picture books in new ways which involve the ability to uncover relationships of meanings between language and image. Part III, Text and Visual Interaction in Advertising and Marketing, brings together three papers that discuss and theorize how texts and visuals interact in advertising and marketing discourses. The advertising examples discussed in the chapters vary in that they deal with traditional paper format, TV-film and Internet modes. They share the common problem of sequencing in advertising and how to deal with this in the respective modes. Simultaneously, perspectives are given on how luxury advertising is intermingled in print media and how corporate and product advertising is realized in television and pop-up advertisements on the Internet.
In Chapter 9, Sequential Visual Discourse Frames, Kay O’Halloran and Victor Lim Fei explore questions such as: What are the systems that operate in the visual mode? and How are meanings produced through sequential visual discourses? Understanding the systemic operations of visual modality is empowering as it enables the design of advertising visuals that are communicatively and ideologically effective. But at the same time, to balance this out, consumers need to develop their ability to read these advertising texts critically. The chapter focuses on developing new possibilities for research on designing and reading visual discourses by considering the applications and limitations of the intersemiosis between language and images, and demonstrates these in practice through the analysis of a sequence of visual text in a themed Cartier paper advertisement. In Chapter 10, A Systemic Functional Framework for the Analysis of Corporate Television Advertisements, Sabine Tan shows how semiotic modes and resources combine in complex ways in corporate television advertisements. In order to enhance our understanding of these semiotic modes and their resources, this chapter proposes an integrative systemic functional multisemiotic framework for exploring the meaning potentials that are conveyed through the processes of intra- and intersemiosis in a dynamic multimodal text. It examines the multimodal meaning-making mechanisms that operate in a corporate television advertisement for an international financial institution and discusses the methodological aspects of selection criteria for the segmentation of dynamic text into appropriate constituent levels. It concludes by evaluating the semiotic approach and industrial practices in the analyses of corporate television advertisements.
In Chapter 11, Multisemiotic Marketing and Advertising: Globalization versus Localization and the Media, Anna Hopearuoho and Eija Ventola discuss the need to localize global product marketing on the Internet and the consequential multisemiotic realizational differences of global product ads for local contexts. The chapter shows how a number of advertising agencies in a local market see the ‘localization processes’ and then exemplifies some of the multimodal strategies used for globalization and localization of products in Internet marketing advertising. The analysis and results generated show that there is a growing need in this field to train interdisciplinary experts able to design such advertisements while being linguistically and semiotically sensitive to the localization needs of the global market. Hopearuoho and Ventola highlight the fact that local contexts may demand totally different linguistic and other semiotic realizations both in traditional and current means of advertising through the use of the new medium, the Internet (i.e. local languages are used for advertising, and certain cultural semiotic realizations are also highlighted in the ads). Part IV, Multisemiotics in Enacted Roles and Virtual Identities, discusses the use of multisemiotic resources in an enactment of real and virtual identities. Here the focus is first on how verbal and visual modes complement
each other in a television documentary series and thus lead viewers to interact with TV presenters and experts in the field of geology. The second focus is on the interactions that are created in a virtual world. The chapters together show how we construe our own world multimodally by the enactment of our communicative roles through various semiotic modes. In Chapter 12, Taking the Viewer into the Field: Interaction between Verbal and Visual Representation in a Television Earth Sciences Documentary, Alison Love discusses the verbal and visual strategies that are used in popularizing science in a television documentary series, Earth Story, screened by the BBC in 1998 (DVD 2006). The series sets out to answer questions about the formation of the Earth and the forces that have changed it over time. The chapter examines the ways in which the presenters use the verbal and visual modes to transport viewers into the field of geology: literally, by showing the places geologists go and the features they examine, and, more metaphorically, by introducing viewers to the principles and methods of ‘doing geology’. It shows how the two modes of representation, sometimes assisted by the musical mode, complement each other to lead viewers to share and enjoy an experience as a geologist. In Chapter 13, Developing the Metafunctional Framework for Analysing Multimodal Hypertextual Identity Construction, Arianna Maiorani focuses on the fact that thousands of players all over the world, from a wide range of ages and social backgrounds, are today attracted to the virtual world and the adventures offered by online games. The chapter analyses roles and identity construction in one type of Massively Multiplayer Online Role-Playing Game (MMORPG). Multimodal hyper-discourse is created as a result of playing the game when one enters the discourse generated by the virtual community of players.
To do so, the player has to become a visually active, interactive and creative participant. This process is social, which therefore implies interaction and communication. The identity that a player/participant takes on in order to participate in the hyper-discourse of the game is a social construction that is created as a response to the hyper-social context of the game and to his/her own social context. The chapter also tests the ability of the Hallidayan metafunctional framework and its meaning categories to capture these kinds of worlds, hyper-social multisemiotic discourse activities and identities that the game generates through the use of visual and verbal/audio resources. The last section of the book, Part V, Integrating Text, Visual and Space Multimodally, is concerned with the integration of text, visuals and space, as well as the development and use of multimodal resources in meaning-making contexts. The first chapter in this section gives us an interesting view on how Western museums have developed over time and through different stages into places of telling and showing, and even today into multimodal places of entertainment. The second chapter in this section discusses how a piece of literature, such as a poem, is visualized in a culturally
significant way as a mural, thus completing the discourse of ‘the world told and the world shown’ in this volume. Chapter 14, From Musing to Amusing: Semogenesis and Western Museums, by Maree Stenglin, applies social semiotic tools to illuminate the ideology of Western museums in two seminal moments of their evolution as cultural and multisemiotic institutions: the emergence of the public museum in the eighteenth century, and the evolution of the hybrid museum of the late twentieth and early twenty-first centuries. Stenglin uses the model of social context, developed in systemic functional linguistics and the notion of semogenesis, that is, the ways in which meanings unfold over time, to show how the ideology of telling and showing in exhibition spaces has been construed in concrete moments. In particular, semogenesis is conceived as projecting both stratified planes of social context: context of situation (register) and context of culture (genre). This relationship of projection is an important one as it enables social semioticians to systematically explore multisemiotic meanings from the perspective of social change. Finally, in Chapter 15, Floods and Fidget Wheels: A Comparative Systemic Functional Analysis of Slessor’s ‘Five Bells’ and Olsen’s ‘Salute to Five Bells’, Kathryn Tuckwell discusses multisemiotic isomorphism between a poem and a mural – two very distinct forms of art: the former ‘tells’ and the latter ‘shows’. The poem ‘Five Bells’ is very evocative of Sydney Harbour and is written by the Australian modernist poet Kenneth Slessor who lived most of his life near Sydney Harbour and drew his inspiration for the poem from it. The mural is John Olsen’s ‘Salute to Five Bells’, and it pays homage to the poet, the poem and the poem’s images of the Harbour. It was commissioned from Olsen in 1971, the year Slessor died and the famous Sydney Opera House was still being built on a small peninsula, surrounded on three sides by the Harbour. 
The comparison of the poem and the mural demonstrates how multisemiotic systemic functional analyses can improve our comprehension of how verbal and visual systems operate in meaning-making. The isomorphism between different semiotic systems gives us evidence that meaning-making has an inherent and universal structure, which is used by everyone who makes meaning, regardless of their form of expression – not just artists, writers and musicians, but everyone who uses language and other modes of meaning-making. Human communication and experience have throughout the ages been recorded through the writing systems of languages and through images (sometimes as parts of writing systems). Recording speech and action changed the description of human experience when gramophones, tape recorders, film/video cameras, computers and the Internet were developed in the previous century. The challenge for this century is to develop tools to capture the complexity of this new world of discourses as integrated multisemiotic realizations of human communication and experience. The chapters in this volume are a step towards developing ways of capturing the intersemiosis of multimodal
meaning-making. The contributors have not only argued for the necessity of studying contextual meanings in both verbal and non-verbal manifestations in different genres, but they have also sought ways and solutions for analysing such multimodal communicative artefacts as product packaging, film and TV, cartoons, picture books, games, advertising in magazines, on TV and on the Internet, even public transport, museums and works of art in order to see how the integration of modes works and what we can learn from it. The authors in this volume are concerned with the effects and implications of multisemiotic integrations and call for further research in understanding our social realities in the integrated multisemiotic global world. Their work points out that Multisemiotics seems to be an appropriate discipline to deal with the complex communicative manifestations of the world we live in now. We, as editors, hope that the readers will find their immersion into ‘The World Told and the World Shown: Multisemiotic Issues’ a rewarding experience and we hope that the discussions in this volume will entice them to participate in and contribute to the exploration of this exciting area of Multisemiotics.
Part I Multimodal Theories: Coding the Visual
2 Multisemiosis and Context-Based Register Typology: Registerial Variation in the Complementarity of Semiotic Systems
Christian M. I. M. Matthiessen
2.1 Introduction
This chapter discusses some aspects of multisemiotic systems operating together within one and the same context. Semiotic systems are systems capable of carrying or even (in the case of higher-order semiotic systems such as language) of creating meaning. Multisemiotic systems are semiotic systems that operate in parallel in the carrying or creation of meaning, working together within one and the same context (for a recent overview of systemic functional contributions to the study of such systems, see Martinec, 2005, and for a recent foundational systemic functional account of multimodal documents, see Bateman, 2008; for recent collections of contributions, see, for example, O’Halloran, 2004; Ventola, Charles and Kaltenbacher, 2004; Royce and Bowcher, 2006; and for a recent textbook, see Baldry and Thibault, 2006). A prototypical example of multisemiotic systems would be people interacting in face-to-face conversation engaging different parts of the body (vocalization, facial expression, gesture, posture) to exchange meanings. From the point of view of the interactants exchanging meaning, this semiotic deployment of different bodily, or somatic, systems is a Gesamtkunstwerk, ‘a unified work of art’.1 The question of how such systems operate together – of how they are organized to create a unified, or at least a coordinated, flow of meaning in their context – is one of the key concerns of this chapter. Section 2.2 explores multisemiotic systems briefly in terms of a typology of systems of different orders of complexity – physical (first-order systems), biological (second-order systems), social (third-order systems) and semiotic systems (fourth-order systems). This typology will make it possible to explore multisemiotic systems in the environment of systems of lower orders, that is, in the environment of social, biological and physical systems.
Then, after the presentation of this ordered typology of systems, Section 2.3 proposes a sketch of a cline of integration of different semiotic systems. At one pole of this cline, different semiotic systems are in fact integrated within one and the same semiotic system, as in the case of the integration of ‘melody’ into language in the form of intonation. However, as we move towards the other pole of the cline of semiotic integration, we find semiotic systems that are increasingly distinct and separate from one another, and we need to account for how they operate together to create meaning in a mutually supportive way by exploring the context in which they are coordinated. The final section, Section 2.4, explores the contextual parameters, in particular Mode and Field, and illustrates the value of investigating multisemiotic systems by reference to the meanings that are ‘at risk’ within a particular register (or ‘genre’) operating in a particular kind of context characterized by some range of values of Field, Tenor and Mode.2
2.2 Multisemiotic systems and types of system
When different semiotic systems, such as language and ‘body language’, language and image, or language and music, operate together in the creation of meaning in a multisemiotic system, they operate within one and the same context, and they are coordinated within this context (with context being interpreted as a connotative kind of semiotic system, within which multiple denotative semiotic systems operate; see Martin, 1992).3 So functionally these different semiotic systems are integrated within the context they operate in so that they can create meaning seamlessly and synergistically. Context is the semiotic environment, the environment of meaning, in which all semiotic systems operate. Since one key ‘architectural’ feature of all semiotic systems is that they are stratified into two planes, the content plane and the expression plane (each of which may be internally stratified into further levels of organization), context can be interpreted as the highest stratum within this hierarchy of stratification (see Halliday, 1978; Martin, 1992; Ghadessy, 1999); it is the stratum above the content planes of all denotative semiotic systems (see Figure 2.1). At the other end of the hierarchy of stratification, the expression plane, multisemiotic systems are, however, not integrated but instead diversified: they are realized through different expression systems, such as those of spoken language (vocalization), ‘body language’ and music. This is of course the reason for recognizing the condition of multimodality in the first place: ‘multimodality’ is a feature of the expression planes of semiotic systems in the first instance. In addition to being semiotic, this diversification of the expression plane is also manifested materially. We can explore this material manifestation by means of the ordered typology of systems originally proposed by Halliday
Figure 2.1 The stratification of semiotic systems: connotative and denotative semiotic systems; context as connotative system, stratification of denotative systems into content plane and expression plane [diagram labels: Context (connotative); Content and Expression (denotative)]
(e.g. Halliday, 1996, 2005; Halliday and Matthiessen, 2006; Matthiessen, 2007, forthcoming). In this typology, systems operating in different phenomenal realms are ordered in complexity from less complex systems to more complex ones:
● First-order systems: Physical systems. These were the first systems to emerge in the universe, with the ‘big bang’, and have the widest phenomenal coverage, extending throughout the universe.
● Second-order systems: Biological systems [+ life]. These are physical systems with the added feature of ‘life’; they are living physical systems, which means that they self-replicate, individuate and are subject to evolution. They emerged under very special, constrained physical conditions – what James Lovelock (1991) calls the narrow window of life – on our planet around 3.5 billion years ago.
● Third-order systems: Social systems [+ ‘value’, or social order]. These are biological systems with the added feature of ‘value’, or social order; they are biological populations organized socially into networks of social beings (‘persons’) playing different roles in different networks and characterized by division of labour.
● Fourth-order systems: Semiotic systems [+ ‘meaning’]. These are social systems with the added feature of ‘meaning’; they are social systems that can also carry or even create meaning: persons operating in roles in social networks are also ‘meaners’ taking on speech roles and creating and sustaining semiotic networks, or ‘communication networks’, through the ongoing exchange of meanings.
Physical systems and biological systems can be grouped together as material systems – systems of matter; and social systems and semiotic systems can be grouped together as immaterial systems – socio-semiotic systems: systems of value and meaning (or ‘meaning’ in a broad sense; see Halliday, 2005). We can now interpret multimodality within the expression plane of semiotic systems as multimateriality within the lower-order systems of matter – that is, within biological systems and physical systems as in Figure 2.2. This multimateriality covers the ‘signifying body’ (cf. Thibault, 2004) operating in its signifying environment. The two higher-order systems in the ordered typology of systems – that is, social systems and semiotic systems – coordinate and integrate patterns within the two lower-order systems – that is, biological systems and physical systems. Socio-semiotic systems give ‘meaning’ to matter (cf. Halliday, 2005): social systems impose social order (‘value’) on the world of matter, and semiotic systems impose semiotic order (meaning in its narrower sense of valeur and signification; cf. Hasan, 1985) on this social world. For instance, social constructs such as tools, artefacts and dwellings are manifested in many materially diverse ways, but such materially divergent manifestations may have the same ‘value’ in the social system.
Figure 2.2 Multisemiotic integration and diversification as a cline represented in terms of the ordered typology of systems [diagram labels: socio-semiotic integration, “Meaning”: semiotic {+ meaning}, social {+ value}; material diversification, “Matter”: biological {+ life}, physical; strata: connotative: context; denotative: content; denotative: expression]
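As a purely illustrative aside, the ordered typology can be pictured as a small data structure in which each order carries the features of all the orders below it. The following Python sketch is not part of the account cited above: the names SystemOrder, TYPOLOGY, features and environment are hypothetical labels introduced here, and the only content taken from the text is the four orders, their added features and the grouping into systems of matter and systems of meaning.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SystemOrder:
    """One order in the typology (hypothetical representation)."""
    order: int
    name: str
    added_feature: Optional[str]  # feature added relative to the order below
    realm: str                    # "matter" or "meaning" (in the broad sense)


# The four orders as described in the text, from least to most complex.
TYPOLOGY = [
    SystemOrder(1, "physical", None, "matter"),
    SystemOrder(2, "biological", "life", "matter"),
    SystemOrder(3, "social", "value", "meaning"),
    SystemOrder(4, "semiotic", "meaning", "meaning"),
]


def features(order: int) -> List[str]:
    """Cumulative features of a system of the given order."""
    return [s.added_feature for s in TYPOLOGY
            if s.order <= order and s.added_feature is not None]


def environment(order: int) -> List[str]:
    """Names of the lower-order systems in whose environment it operates."""
    return [s.name for s in TYPOLOGY if s.order < order]


if __name__ == "__main__":
    print(features(4))
    print(environment(4))
```

Under these assumptions, features(4) collects ‘life’, ‘value’ and ‘meaning’, and environment(4) lists the physical, biological and social systems in whose environment semiotic systems operate.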
2.3 Cline of semiotic integration
Having discussed the typology of systems, let us now focus on integration and diversification within semiotic systems. The hierarchy of stratification within semiotic systems is, as it were, a replay of the ordered typology of systems operating in different phenomenal realms – both within a primary semiotic system such as protolanguage and within a higher-order semiotic system such as language. Between the coordination and integration within the context of semiotic systems and the diversification within the expression plane of semiotic systems, we can recognize different degrees of integration within the content plane of semiotic systems (see Figure 2.3). These different degrees of integration form a continuum or cline of semiotic integration. This cline of integration is defined by its two outer poles – the pole of maximal integration and the pole of minimal integration.
(i) At the pole of maximal integration, there is one semiotic system, and the different expressive systems involving different ‘modalities’ are integrated within one and the same content stratum. This is how multimodality within spoken language has been modelled in systemic functional linguistics since Halliday’s groundbreaking work on intonation and grammar in the early 1960s (e.g. Halliday, 1963, 1967; Halliday and Greaves, 2008). The approach is illustrated by the fragment of the lexicogrammatical system of MOOD and the phonological system of TONE in Figure 2.4. Here the integration is achieved within the stratum of lexicogrammar along the systemic (paradigmatic) axis. That is, systemically it does not make a difference whether terms in systems are realized by the presence of elements in the modal structure of the clause (e.g. ‘indicative’ realized by the presence of the Mood element), by the relative sequence of elements (e.g. ‘declarative’ realized by the sequence of Subject ^ Finite), or by the direction of the pitch movement in an intonation contour (e.g. ‘reserved’ realized by ‘tone 4’ – phonetically a fall–rise pitch movement). What matters systemically is simply that systemic values such as ‘indicative’, ‘declarative’ and ‘insistent’ are realized in such a way that these values are kept distinct in the expression.
(ii) At the pole of minimal integration, two or more semiotic systems are completely separate as (denotative) semiotic systems – that is, separate in terms of both their content systems and their expression systems, and these systems are integrated and coordinated only at the highest level of semiotic organization, that is, within the (connotative) semiotic system of context (for the distinction between denotative and connotative semiotic systems, based on Hjelmslev, 1943, see Martin, 1992). An example of this case relating to the earlier illustration of intonation being integrated within language as linguistic ‘melody’ would be language and music in a folk ballad, as modelled by Steiner (1988). He describes these two semiotic systems separately, and, having done this, is then in a position to show how they interact.4
Figure 2.3 Multisemiotic possibilities – cline of integration of different semiotic systems [diagram labels: cline of integration from maximal integration to minimal integration (same system, same stratum; same system, same higher stratum; same system in context); degree of diversification (different syntagm; different lower stratum; different semiotics)]
10.1057/9780230245341 - The World Told and the World Shown: Multisemiotic Issues, Edited by Eija Ventola and Arsenio Jesús Moya Guijarro
Figure 2.4 The system network of MOOD, with systems realized prosodically by tone integrated systemically (illustrated for ‘declarative’ mood) with systems realized by the presence and relative sequence of elements of the modal structure of the clause [system network labels. Lexicogrammar: clause; STATUS (major: +Residue (+Predicator); minor); FREEDOM (free; bound); MOOD TYPE (indicative: +Mood (+Finite; +Subject); imperative); declarative (Subject ^ Finite); interrogative (yes/no interrogative: Finite ^ Subject; wh- interrogative: +Wh; Wh ^ Finite); neutral; marked (protesting; tentative; reserved; insistent). Phonology: tone group (+Tonic); PRETONIC (without pretonic; with pretonic: +Pretonic; Pretonic ^ Tonic); TONIC COMPOSITION (simple: tone 1, tone 2, tone 3, tone 4, tone 5; compound: +Tonic 2; Tonic ^ Tonic 2: tone 13, tone 53)]
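The claim in (i), that systemic terms are treated alike whether they are realized by the presence of an element, by the sequence of elements or by pitch movement, can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions rather than a rendering of the full network in Figure 2.4: the names Realization, REALIZATIONS, realize and kept_distinct are hypothetical, and only four feature-to-realization pairings named in the text and the figure labels are included.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Realization:
    """A realization statement attached to a systemic term (hypothetical)."""
    kind: str       # "insert" (presence), "order" (sequence) or "tone" (pitch)
    statement: str


# Feature -> realization statement; a small fragment only.
REALIZATIONS: Dict[str, Realization] = {
    "indicative": Realization("insert", "+Mood (+Finite; +Subject)"),
    "declarative": Realization("order", "Subject ^ Finite"),
    "yes/no interrogative": Realization("order", "Finite ^ Subject"),
    "reserved": Realization("tone", "tone 4"),
}


def realize(selection: List[str]) -> List[str]:
    """Look up each selected feature the same way, whatever its kind."""
    out = []
    for feature in selection:
        r = REALIZATIONS.get(feature)
        if r is not None:
            out.append(f"{feature}: {r.statement} [{r.kind}]")
    return out


def kept_distinct() -> bool:
    """True if no two features share the same realization statement."""
    statements = [r.statement for r in REALIZATIONS.values()]
    return len(set(statements)) == len(statements)


if __name__ == "__main__":
    for line in realize(["indicative", "declarative", "reserved"]):
        print(line)
    print(kept_distinct())
```

The lookup in realize never branches on the kind of realization, which mirrors the point that the systemic axis is indifferent to the mode of expression; kept_distinct checks only that the values stay distinct in the expression.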
(iii) Intermediate between the two outer poles of the cline of integration are cases where two or more semiotic systems can be modelled as integrated into one system at the stratum of semantics – that is, at the higher of the two content strata. This possibility has in fact been explored since the mid-1980s in computational systems capable of generating multisemiotic presentations embodying coordinated semiotic strands, such as online text accompanied by pointing gestures (e.g. Reithinger, 1987) and online text accompanied by maps (e.g. Matthiessen et al., 1998). Even though they are typically not referred to in the literature on ‘multimodal analysis’, systems of this kind are quite important because they include fully explicit models of multisemiotic systems. In the modelling of face-to-face interaction in terms of the cline of integration, it is likely that gesturing and language can be integrated within a single system of meaning at the level of semantics; the high degree of interaction between them (including subtle synchronization of the onset of gestures relative to points of the unfolding of clauses) suggests that this may be both possible and necessary (cf. e.g. McNeill and Duncan, 2000; Haviland, 2000). For example, in terms of experiential meaning, it seems clear that gestures need to be related to the elements of the figure that is realized by a clause in its experiential manifestation – the process, participants or circumstances (see Halliday and Matthiessen, 1999) – as points of complementarity and synchronization; and language-gesture systems differ with respect to the complementarity of the two in ways that indicate some form of semantic integration, as shown by McNeill and Duncan (2000, pp. 149–51) with respect to the construal of motion through space.
For instance, while Spanish tends to lexicalize the path of motion in verbs of motion, English tends to lexicalize the manner of motion (verbs incorporating path such as cross, exit, enter commonly being of Romance origin) and use them with expressions of path (as in tango into the dining room, float out of the harbour) (see Talmy, 1985, and the research building on his foundational study); and speakers of Spanish tend to indicate the manner of motion gesturally rather than lexically.5 In the modelling of the written mode in terms of the cline of integration, the same degree of integration within semantics may be possible with language and images. For example, in our study of the World Health Organization (WHO)’s Weekly Epidemiological Reports (WERs), we were able to integrate the meanings of images (maps and graphs) and text in English in an account of the semantics of the domain that the reports are concerned with, the domain of communicable diseases, as shown in Figure 2.5 (for discussion, see Matthiessen, 2006). The semantic system specific to the register of these reports can be modelled based on the meanings realized by texts in English (or French),
Figure 2.5 Integration of (ideational) meanings realized by images into linguistic semantics in WHO weekly reports
yielding the network shown in the diagram. Within this overall semantic system, it is then possible to locate the subset of meanings that are realized by images – by graphs and maps (and also by tables). These meanings are concerned with quantification and location of phenomena that are measured in the reports, like deaths and outbreaks of communicable diseases. The images are integrated into the multimodal ‘text’ by means of clauses in language that relate references to displays (Map 1, Table 1 and so on) to the linguistic text, as in: In 2003, Afghanistan reported 8 polio cases (5 P1 and 3 P3). As at the end of May 2004, 2 P1 and one P3 cases had been reported (Map 1); this relationship between text and image is often construed explicitly
Christian M. I. M. Matthiessen
by ‘relational’ clauses of the ‘identifying’ type, as in Table 1 summarizes the scope of the SIAs and their impact on reported cases of neonatal tetanus (NT) (from WER 8113). In this latter case, the clause construes an identity between a display (a table) and an interpretation of this display; the table is construed as Token (Table 1) and it is related to a Value giving a ‘gloss’ in English indicating how it is to be interpreted; Figure 2.6 represents this relationship. Similarly: Table 3 shows the results in more detail; Fig. 1 shows the proportionate coverage of at-risk populations in implementation units by type of drugs used in MDA; Fig. 3 shows the involvement of various research institutions in the priority areas of research; Table 1 reports the distribution of the 9585 cases of dracunculiasis reported in 2007 by month and compares this with 2006. The Process of such relational clauses construes the relationship between the two semiotics; it is realized by a verbal group with a ‘symbolizing’ verb: show, report, summarize and the like. The cline of integration can also be explored in terms of semohistory – in terms of phylogenesis (the evolution of semiotic systems in the species), ontogenesis (the development of semiotic systems in the individual) and logogenesis (the unfolding of semiotic systems as texts; for these three timeframes, see Halliday and Matthiessen, 1999, p. 18). It seems likely that all three semohistories may involve movements in either direction along the cline of integration: over time, semiotic systems may become more highly integrated, moving towards the pole of maximal integration; or alternatively, integrated semiotic systems may gradually split into more
[Figure 2.6 diagram: Token: Table 1 | Process: summarizes | Value: the scope of the SIAs and their impact on reported cases of neonatal tetanus (NT)]
Figure 2.6 Integration of display (table) in multisemiotic text by means of identifying relational clause, with display construed as Token and interpretation of display as Value
independent semiotic systems. Let me illustrate these movements in reference to phylogenesis and ontogenesis, and then comment briefly on logogenesis. Within the phylogenetic timeframe, there have been significant shifts in the degree of integration of language and image on the page. Before the twelfth century, illuminations ‘were a critical part of the text’, but in the eleventh and twelfth centuries ‘images became subordinated to the text which came, increasingly, to be seen as the primary conveyor of meaning’ (Olson, 1994, p. 112). After the development of printing technology in the fifteenth century, books would often be assembled from different sources: the author of the text had nothing to do with the illustrations, which would be added later by the printer. Later, text and image gradually became more highly integrated. However, today both printed matter and electronically delivered web material often take the form of a collage where images dominate, and it can be difficult for readers to know how to integrate them with the text (this being one of the challenges for educational institutions in dealing with multisemiotic literacy). Within the ontogenetic timeframe, we can explore how young children learn how to mean, and the study of this process of development can shed interesting light on movements along the cline of integration: somewhere around the age of five to eight months, human infants begin to develop a protolanguage in interaction with their immediate caregivers in four contexts that are critical in early development – regulatory (a kind of ‘enabling’ context from an adult point of view; cf. Table 2.2 in Section 2.4.4), instrumental (a kind of ‘doing’ context), interactional (a kind of ‘sharing’ context) and personal (also a kind of ‘sharing’ context) (see e.g. Halliday, 2004). This protolanguage is inherently multimodal (as shown by Halliday’s description; for discussion, cf.
Matthiessen, 2006), and from this multimodal protolanguage, both language and ‘paralanguages’ will develop, as young children make the gradual transition from their protolanguages to the mother tongues spoken around them, thus expanding their overall meaning potentials, as shown schematically in Figure 2.7. Here there would thus seem to be a move from one integrated protolinguistic system to coordinated but less integrated post-infancy semiotic systems – language and paralanguages. (Thus a (post-infancy) language, a mother tongue, can be fully instantiated in a phone conversation, where the channel can only convey vocalizations; but a protolanguage cannot, since it is not confined to vocalization.) The expression plane of protolanguage is somatic; it is based on the bodily resources of infants. The same is true of the expression planes of the language and paralanguages that emerge during the transitional period, but at some point children often begin to experiment with an exosomatic expression plane – drawing (going through a number of stages of different drawing
[Figure 2.7 diagram: Phase I: protolanguage {multimodal}, with somatic expression; Phases II and III: language {monomodal: phonology–phonetics: vocalization} and paralanguages {vocalization, gesture, facial expression, posture &c.}, with exosomatic expression added {+drawing}, {+writing}]
Figure 2.7 Ontogenesis – from multimodal protolanguage (Halliday’s, 2004, phase I) via a transitional period to language and ‘paralanguages’ (phases II and III)
systems; see e.g. Willats, 1997); and when they enter their first institution of formal education, they will gradually learn to write.6 Within the logogenetic timeframe, we can explore how speakers and writers create meaning by instantiating semantic regions within the overall meaning potential – typically, meanings that are ‘at risk’ within some particular register. As we investigate how meanings are created in the course of the unfolding of text, we can identify emergent patterns involving two or more semiotic systems. Such emergent patterns can be interpreted in terms of the notion of local, instantial systems (cf. Matthiessen, 1993a, 1995, 2002) – systems that are formed out of the patterns of the instantiation of a more general meaning potential further up the cline of instantiation (a registerial sub-potential or by a further step the overall meaning potential of a language). It seems plausible that when different semiotic systems that are part of a multisemiotic system are instantiated alongside one another in a multisemiotic text, they become more highly integrated in the instantial systems of the unfolding text, and patterns of correlations across the systems emerge.
2.4 Context: Mode, Tenor and Field
Regardless of the degree of integration of semiotic systems deploying different media of expression, the key question in a multisemiotic system is
how the different resources for creating meaning complement one another and how the semiotic labour (the work of creating meaning in context) is divided among them. We can certainly approach this question ‘from below’, considering it in terms of the affordances of the different media of expression – including, for example, the temporal characteristics and durability of the medium in material terms.7 This is clearly significant, but the possibilities will ultimately have to be accounted for within our description of the context in which the different semiotic systems operate and within which the semiotic labour is divided among them.
2.4.1 Mode: Medium and Channel
Within context, the contextual parameter of Mode relates to the ‘modality’ of the expression planes of different semiotic systems since it is Mode that is concerned with the semiotic role of the semiotic systems operating within a given context and the role of a particular semiotic system depends on the affordances of its medium or media of expression (cf. Matthiessen, 2006, in relation to multimodality). In terms of Mode, both MEDIUM (spoken/written) and CHANNEL (aural/visual/tactile/olfactory/gustatory) are important factors determining the potential for different combinations of semiotic systems. With respect to language, the MEDIUM is either spoken or written – or signed, in the case of sign languages of deaf communities such as Auslan (see e.g. Johnston, 1992); and these different modes can combine with different ranges of other semiotic systems, but these ranges will also depend on the nature of the CHANNEL.8 One complex but important aspect of the interaction between medium and channel has to do with implications in terms of time and space in semiotic terms – whether ‘speaker’ and ‘addressee’ operate in the same spatio-temporal realm or not, and whether they process instances in real time or not.
Here sign languages are particularly interesting from a multimodal point of view because they are like spoken languages in being processed in real time and like written languages in being processed visually – one can compare the cline from paralinguistic gestures to signs in sign languages (cf. McNeill, 2000a, 2000b) to the cline from drawing to writing (cf. Matthiessen, 2006, and references therein). The CHANNEL of communication is, as Martin (1992, p. 510) puts it, ‘the semiotic construction of communication technology’, thus representing the semioticization of the affordances of the material channel; it is concerned with the bandwidth of semiosis between ‘speaker’ and ‘addressee’ as far as the expression plane is concerned, ranging from minimal bandwidth when they are not in any direct sensory contact to maximal bandwidth when they are in full sensory contact. Naturally, the greater the bandwidth is, the greater the opportunities will be for multiple semiotic systems to operate together in the creation of meaning simply because a greater range of expression systems will be available.
Exploring the possibilities of channel, Martin (1992, p. 511) presents a matrix where the different possibilities of AURAL and VISUAL CONTACT are intersected. He gives examples of all the intersections represented by the cells of the matrix, and we can now present a version of his matrix with indications of the potential for combinations of semiotic systems (Table 2.1); these intersections define settings for different registers – for meanings at risk from the point of view of aural and visual types of contact. If we add other sensory channels of contact, we can make further differentiation in terms of bandwidth (e.g. tactile contact determining the potential for semiotics of touch and olfactory contact determining the potential
Table 2.1 Intersection of visual and aural contact values (adapted from Martin, 1992: Table 7.3), with examples of multisemiotic combinations; cells representing prototypical spoken language and prototypical written language are shaded
(each cell gives the setting, followed by the semiotic combination)
Aural contact | Visual contact: none | Visual contact: one-way | Visual contact: two-way
none | print media, electronic media: writing + images [diagrams, drawings, photos etc.] | silent film, surveillance: images + writing [subtitles, captions] | signing (sign language); mime: signing + gesturing
one-way | radio, audio recording: speaking + paralanguage | television, film, video; electronic media (with audio): images + speaking + paralanguage + writing [subtitles, captions] | lip-reading: ... + gesturing + facial expression + gaze + posture (+ proxemics)
two-way | telephone (including mobile), intercom, internet chat: speaking + paralanguage | video intercom: speaking + paralanguage | face-to-face conversation, video mobile, video internet chat: speaking + paralanguage + gesturing + facial expression + gaze + posture (+ proxemics) + touch + smell (+ taste)
for the semiotics of fragrances and odours, as in the use of perfume). For instance, while face-to-face conversation prototypically involves maximal bandwidth (although there are principled exceptions, like service encounters at a ticket counter with a very limited view of the server), face-to-face conversations based on the technologies of videophones and video Internet chat are more constrained (not only in terms of audio [aural contact] and video [visual contact], but also obviously in terms of smell, touch and taste since these cannot yet be handled digitally in the same way as audio and video).
2.4.2 Mode: Division of semiotic labour
Within Mode, the CHANNEL of communication thus determines the potential for different combinations of semiotic systems as far as the expression plane is concerned, but Mode is also concerned with the DIVISION OF SEMIOTIC LABOUR among the semiotic systems that enter such combinations. We can characterize this crudely in reference to language as a cline between two outer poles – one where all semiotic labour is done linguistically, and one
[Figure 2.8 diagram: connotative semiotic system: context (mode); denotative semiotic systems: language, paralanguage; social system; division of (semiotic) labour; division of socio-semiotic labour]
Figure 2.8 Mode, calibration of division of labour among systems operating in context – division of semiotic labour between denotative semiotic systems (language and paralanguage); division of socio-semiotic labour between denotative semiotic systems and social systems
where all semiotic labour is done non-linguistically, by some semiotic system or systems other than language.9 For instance, in the WHO’s Weekly Epidemiological Reports (WERs), a good deal of the semiotic labour is done linguistically (in English or French text), and images (maps and graphs, in particular) make quite restricted, specific contributions, as indicated above. Here it is possible to integrate the account of the meaning of images into the semantic system of language (the linguistic ‘domain model’; see Matthiessen et al., 1998; Matthiessen, 2006) – the mid-region of the cline of integration in Figure 2.3; and it seems plausible that this would generally be the case when one semiotic system is ‘nuclear’ in the contribution to the semiotic labour in a given context and the other system or systems make more of a supporting, peripheral contribution. In cases such as the WHO’s WERs, language carries the main semiotic burden, but in other cases it may be another kind of semiotic system that carries the main burden – as when language is used to label or annotate images for purposes of presentation, sorting or retrieval from some kind of archive. One interesting – and critical – aspect of the DIVISION OF SEMIOTIC LABOUR among denotative semiotic systems is the extent to which they operate in semiotic harmony with one another: the interpretation of music in terms of the harmony of chords, consonance, dissonance, counterpoint and so on may be a useful model for exploring how different simultaneous semiotic systems work together. For instance, a good deal of multisemiotic humour is based on the ‘dissonance’ among different semiotic systems, as in Norman Thelwell’s classic cartoons involving jocular advice about buying a house, gardening or caring for a pony; here the interpretation of the apparently straight-laced text is undercut by the meaning conveyed by the drawing. 
Taking account of the division of semiotic labour along these lines of semiotic harmony is a considerable challenge in terms of theory, modelling and description: we must take account of the meanings that are engendered under different conditions of semiotic harmony, including the tensions associated with irony and humour. In investigations of language, such phenomena are of course also familiar, as in the work on the ‘polyphony’ of the different metafunctional strands of meaning and the work on additional layers of meaning created through the strategy of metaphor.10 Baldry and Thibault (2006), for example, take account of the way that different semiotic systems work together under the heading of ‘resource integration’. In an account of multisemiotic systems and ‘texts’, Mode is thus responsible for the ‘semiotic construction’ of the affordances within the expression plane and for the division of semiotic labour. However, Field and Tenor are, of course, also part of the account ‘from above’ – the contextual account – of multisemiotic conditions. The focus will be on Field, but first the obvious point that Tenor is equally important is made briefly.
2.4.3 Tenor (in relation to Mode)
Tenor interacts in interesting ways with visual and aural contact (see Table 2.1 in Section 2.4.1). As part of the investigation he called ‘proxemics’, Hall (e.g. 1966) has shown that ‘material distance’ realizes ‘semiotic distance’ – more specifically, interpersonal distance within Tenor (to put this in our terms): see Figure 2.9. The more ‘intimate’ the Tenor of the relationship is, the wider the bandwidth of the channel of communication is – the widest bandwidth being associated with intimate face-to-face conversation, where there is two-way contact in terms of all senses (cf. Table 2.1). The wider the bandwidth is, the greater the range of interpersonal meanings that can be expressed will be, since the face is a key resource for the expression of interpersonal meanings. Conversely, the more ‘public’ the Tenor of the relationship is, the narrower the bandwidth of the channel of communication is – the narrowest being associated with public addresses such as public speeches, where there is at best one-way contact in terms of vision and hearing (members of the public can see and hear the speaker as an individual but the speaker can only see and hear members of the public as a collective).11 This interpersonal distance obviously relates directly to the design of public spaces and buildings (cf. O’Toole, 1994) and also to the design of furniture (illuminated by the investigation of the semiotics of IKEA tables by
[Figure 2.9 diagram: distance zones Intimate, Casual-personal, Social-consultative and Public, on a scale marked 8", 18", 30", 48", 7', 12' and 15', with sensory landmarks: full body contact; olfaction + heat sense; hand reach; face out of focus; whole face in foveal vision; arm’s length; no visual distortion]
Figure 2.9 Material distance and semiotic distance (tenor) – Edward T. Hall’s ‘distance sets’
Anders Björkvall, whose research shows that tables differ in terms of their potential for interaction). After these brief comments on Tenor, let us now turn to Field.
2.4.4 Field
Field has been characterized in terms of two basic parameters (see e.g. Halliday, 1978, pp. 142–3) – what I will call the SOCIO-SEMIOTIC PROCESS (‘that which is going on’) and the PHENOMENAL DOMAIN (‘subject matter’).12 Both are relevant to the ideational aspect of semiotic systems, as can be seen in the important distinction Kress and van Leeuwen (1996) draw between ‘narrative representation’ (Chapter 2) and ‘conceptual representation’ (Chapter 3). The SOCIO-SEMIOTIC PROCESS is ‘what’s going on’ in the Field – the nature of the activity. Drawing on Ure’s (unpublished) work on a context-based register typology, we can recognize eight distinct primary socio-semiotic process types (see Matthiessen, Teruya and Wu, forthcoming), grouping them into first-order and second-order processes.
(i) One of these is a first-order process in Halliday’s (1978, pp. 142–3) sense – processes that we can now interpret as processes within social systems in the ordered typology of systems in Figure 2.2; this first-order process is the social process of ‘doing’, such as teamwork in a fishing expedition or in surgery in an operating theatre, and here language and other denotative semiotic systems come in merely to facilitate the execution of this first-order social process.
(ii) The other seven processes are second-order ones (again in Halliday’s sense) – processes that we can now interpret as processes within semiotic systems in the ordered typology of systems in Figure 2.2; they are inherently semiotic processes (and so also social, of course) and operate in contexts that are constituted not only socially but also semiotically.
We can summarize the socio-semiotic processes of Field as follows:
● semiotic processes (semiotic processes constitutive of context):
❍ processes of expounding (general knowledge) – explaining/classifying;
❍ reporting (on sequences of particular events, or regions of places) – recording (events)/surveying (places);
❍ recreating various aspects of socio-semiotic life (typically particular, personal and imagined experiences) – narrating and/or dramatizing;
❍ sharing (typically particular, personal experiences and values);
❍ recommending (courses of action) – advising/exhorting (promoting);
❍ enabling (courses of action) – empowering/regulating;
❍ exploring (positions and values) – arguing/evaluating;
● social processes (social processes constitutive of context, semiotic processes facilitating):
❍ doing (social action, with semiotic processes facilitating).
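The summary above can also be written down as a small data structure, which makes the grouping into first-order and second-order processes explicit. The following is only a minimal sketch in Python: the names `Order`, `ProcessType`, `PROCESS_TYPES` and `by_order` are assumptions introduced for illustration, and the glosses are abbreviated from the summary rather than taken from any existing computational model.

```python
from dataclasses import dataclass
from enum import Enum


class Order(Enum):
    """Order of a process in Halliday's (1978) sense."""
    FIRST = "social"     # social process; semiotic processes only facilitate
    SECOND = "semiotic"  # semiotic process constitutive of the context


@dataclass(frozen=True)
class ProcessType:
    name: str
    order: Order
    gloss: str


# Hypothetical encoding of the eight primary socio-semiotic process types.
PROCESS_TYPES = [
    ProcessType("expounding", Order.SECOND, "explaining/classifying"),
    ProcessType("reporting", Order.SECOND, "recording/surveying"),
    ProcessType("recreating", Order.SECOND, "narrating and/or dramatizing"),
    ProcessType("sharing", Order.SECOND, "personal experiences and values"),
    ProcessType("recommending", Order.SECOND, "advising/exhorting"),
    ProcessType("enabling", Order.SECOND, "empowering/regulating"),
    ProcessType("exploring", Order.SECOND, "arguing/evaluating"),
    ProcessType("doing", Order.FIRST, "social action, semiotically facilitated"),
]


def by_order(order: Order) -> list[str]:
    """Return the names of the process types of the given order."""
    return [p.name for p in PROCESS_TYPES if p.order is order]
```

Calling `by_order(Order.FIRST)` returns only `doing`, while `by_order(Order.SECOND)` returns the seven semiotic process types, mirroring the one-to-seven split described in the text.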
These different socio-semiotic process types put different ideational aspects of denotative semiotic systems ‘at risk’ and also have different implications for multisemiotic combinations. For instance, ‘expounding’ processes are likely to mobilize ‘conceptual representations’ in Kress and van Leeuwen’s (1996) account, and they may also mobilize ‘narrative representations’; but ‘reporting’ and ‘recreating’ processes are more likely to mobilize ‘narrative representations’. If we intersect the different SOCIO-SEMIOTIC PROCESS types with values along the other contextual parameter within Field – the PHENOMENAL DOMAIN – we can give examples of likely multisemiotic combinations (just as we did for combinations of two types of channel within Mode, in Table 2.1), as illustrated in Table 2.2 for written documents (i.e. the upper left region of Table 2.1) with language as the primary semiotic. The examples given in Table 2.2 can, of course, be multiplied, but they illustrate certain favoured combinations of values within socio-semiotic process and values within phenomenal domain. For example:
● expounding and −temporal sequence: taxonomic diagrams;
● reporting and +temporal sequence: timeline diagrams;
● recreating and +temporal sequence: story book illustrations;
● enabling and +temporal sequence: flowcharts.
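The favoured combinations listed above amount to a lookup from a process type and a value of temporal sequence to a typical kind of image. A minimal sketch in Python follows; the names `FAVOURED_IMAGES` and `favoured_image` are assumptions made for illustration, and only the four combinations just listed are encoded.

```python
from typing import Optional

# Keys: (socio-semiotic process type, temporal sequence present?).
# Only the four favoured combinations listed in the text are encoded here.
FAVOURED_IMAGES = {
    ("expounding", False): "taxonomic diagram",
    ("reporting", True): "timeline diagram",
    ("recreating", True): "story book illustration",
    ("enabling", True): "flowchart",
}


def favoured_image(process: str, temporal_sequence: bool) -> Optional[str]:
    """Return the favoured image type, or None if the text lists none."""
    return FAVOURED_IMAGES.get((process, temporal_sequence))


print(favoured_image("enabling", True))    # flowchart
print(favoured_image("expounding", True))  # None
```

Returning `None` for an unlisted pair is deliberate: the text gives examples of favoured combinations, not an exhaustive mapping.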
An investigation of the conditions of multimodality and multisemiotic systems raises various interesting questions – one key question being to what extent different semiotic systems are general-purpose ones and to what extent they are register-specific (in the sense, for example, of a register-specific semantic system; see e.g. Halliday, 1973; Patten, 1988). It does seem that visual semiotic systems tend to be much more register-specific than language is – that is, they tend to have evolved within particular types of situation as register-specific semiotic systems for operating within a certain range of Field, Tenor and Mode settings. For example, flowcharts construe temporal sequences of events within some phenomenal domain, and they are typically used either in enabling contexts (designing or enabling a procedure, including algorithms) or in expounding ones (documenting a flow of events), having originally been designed in the 1920s for use in mechanical engineering. If we compare them with language, we can locate the systemic paths used for construing temporal sequences of events by means of clause complexes (cf. Halliday and Matthiessen, 2006, pp. 119–22, 364–5).
2.4.5 Semiotic systems and registerial range
Different settings of values within the Field, Tenor and Mode potential define different cultural domains – different sub-systems of the context of
Table 2.2 The different SOCIO-SEMIOTIC PROCESS types and PHENOMENAL DOMAIN values for written mode with examples of multisemiotic documents
(the four value columns together make up the phenomenal domain)
Socio-semiotic process | temporal sequence | spatial extension | generality | concreteness | example: written text with image
expounding | + | + | + | − | chemistry textbook: explanation with diagram of sequences of chemical reactions
expounding | − | − | + | − | chemistry textbook: classification of elements with display of periodic table
reporting | + | + | − | + | newspaper: news report of maritime disaster with sequence of photos of rescue of passengers
reporting | + | + | − | + | history book: historical recount with time line diagram
reporting | − | + | − | + | guide book: survey of a region with map showing major natural features and places of interest
recreating | + | + | − | + | story for adolescents: narrative with drawings of scenes and characters
sharing | (±) | (±) | − | + | (personal) email message with photos
doing | − | + | − | + | shopping list with drawing of product to be purchased
recommending | − | − | + | − | newspaper: agony aunt column (without image, or with image of ‘agony aunt’ as authority projecting advice)
enabling | + | + | + | + | chemistry textbook: procedure for lab experiment with flow chart
enabling | + | + | − | + | guide book: walking tour (topographic procedure) with map showing route
exploring | − | − | − | − | newspaper: book review with image of cover and/or author
culture within which (denotative) semiotic systems operate13 (see Halliday, 2007 [1991], p. 275). These different cultural domains constitute the ecological niches in which registers – different sub-systems of a semiotic system – operate. Any semiotic system can thus be characterized in terms of the range of registers that it embodies; in fact, it can be modelled as an aggregate of those registers (cf. Matthiessen, 1993b). Some semiotic systems embody a wide registerial range, whereas other semiotic systems embody a narrow range. We can thus identify a cline extending from semiotic systems that are special-purpose systems embodying a single register to semiotic systems that are general-purpose ones embodying a wide – and open – range of registers (see Figure 2.10). As far as language is concerned, the outer poles of the cline ordering semiotic systems from ‘special-purpose’ to ‘general-purpose’ ones are defined by protolanguages and standard languages. Protolanguages are very constrained in terms of their contexts of use – these contexts being instrumental, regulatory, interactional and personal, as mentioned in Section 2.3; each context of use has its own little specialized meaning potential. In contrast, standard languages have huge registerial ranges, encompassing not only the spoken registers of the home and the neighbourhood, but also the extended range of written registers associated with the modern nation states – registers of science and technology, on the one hand, and registers of administration and control, on the other. In a similar way, we can locate semiotic systems other than language along the cline represented in Figure 2.10. The extent of the registerial range of any semiotic system will be subject to considerations of what combinations of Field, Tenor and Mode values it can operate across.
For example, while language can operate across the modal combinations set out in Table 2.1, most other semiotic systems are more constrained in terms of ‘channel’: one of the reasons why language has such a wide registerial range is precisely that it has evolved a written mode alongside the earlier spoken mode (which can in turn be transferred to contact by touch through Braille). It is possible to
[Figure 2.10 diagram: kinds of semiotic system arranged from special-purpose semiotic systems (protolanguage) through language: vernacular dialect to general-purpose semiotic systems (language: standard), plotted against range of registers {variation in content plane according to context of use}]
Figure 2.10 Kinds of semiotic system in relation to registerial range – from special-purpose systems to general-purpose ones
think of other examples such as notation systems for movement, including dance (e.g. Laban Movement Analysis) and for music; but these do not seem to extend the registerial ranges of dance and music beyond transcription – beyond the fact that they can be transcribed in a notation system, just as spoken language can be transcribed using writing. Here it is significant that (unlike notation schemes for music and dance) writing did not evolve as a notation system for transcribing a different mode of itself – spoken language; rather it evolved out of drawing in contexts of use other than those associated with the existing registers of spoken language. Writing encroached on the ‘registers’ of drawing – registers of book-keeping and trade.
2.5 Conclusion
This chapter has touched on some central issues in the modelling of multisemiotic systems operating synergistically in a unified context, approaching these issues in terms of systemic functional theory – a theory that has informed some of the most central contributions to our understanding of multimodality and of multisemiotic systems. The chapter began by locating the phenomena under consideration within a holistic conception of an ordered typology of systems operating within phenomenal realms of increasing complexity – physical, biological,
social and semiotic systems. While multimodality and multisemiosis are semiotic phenomena in the first instance, they also have social, biological and physical implications precisely because the typology of systems is an ordered one – ordered in increasing complexity: semiotic systems are also social, and social systems are also biological, and biological systems are also physical. If we confine ourselves to multimodality and multisemiosis in the human species (but see e.g. Benson et al., 2002; Benson and Greaves, 2005; and cf. Matthiessen, 2004, for an evolutionary perspective), we can explore semiotic systems in terms of the conditions – both enabling and constraining ones – inherent in human societies (see e.g. Johnson and Earle, 2000) and human bodies (see e.g. Thibault, 2004). For example, we can investigate the affordances inherent in the human body in terms of the expressive resources of different semiotic systems; and we can also investigate the affordances of extending the human body as a resource for expression inherent in the (socially constructed) biological and physical environment of the human body – our ‘habitat’. After locating the phenomena under investigation within a holistic conception of systems of different kinds, the chapter sketched a cline of semiotic integration, extending from maximal integration to minimal integration. In the case of maximal integration, different ‘modalities’ operate on the expression plane within one and the same semiotic system, being integrated within the content plane of that system. An example of this case is the account of intonation within systemic functional linguistics, pioneered by Halliday in the 1960s and discussed above.
Here the integration is possible even within the lower of the two content strata – the stratum of lexicogrammar; the integration takes place within the systemic (paradigmatic) axis of organization and the ‘modality’ of intonation is handled by realization statements associated with terms in systems. There may of course be tensions between intonation and (segmental) structure, as in the case of an example such as // -2 ^ you / have a /photograph of / this girl // (from Halliday, 1970), where the modal structure is that of a ‘declarative’ clause – Subject: you ^ Finite: have, but the tone is that prototypically associated with a ‘yes/no interrogative’ clause – tone 2, phonetically a rising tone. However, such tensions are handled without difficulty in the description in terms of delicacy: while the ‘unmarked’ declarative key is realized by tone 1 (phonetically a falling tone), there are also four marked declarative keys, including a querying one realized by tone 2 (as shown in Figure 2.4 in Section 2.3). In the case of minimal semiotic integration, independent (denotative) semiotic systems operate in parallel within one and the same context (connotative semiotic system), and the coordination of semiotic processes within these parallel semiotic systems is purely a matter of context. An example of this case is the account of language and music in a folk ballad offered by Steiner (1988). Here one challenge is precisely to account for how the different semiotic systems complement one another in creating meaning and
how they create meaning synergistically so that the semiotic sum is greater than the partial contributions by the parallel semiotic systems. Steiner’s account suggests that one possibility is a metafunctional kind of complementarity, with one semiotic system contributing relatively more to one metafunctional mode of meaning and another relatively more to another metafunctional mode of meaning.
Intermediate between maximal and minimal integration are cases of multisemiotic systems where there is some degree of integration – ranging from low to high – within the semantic stratum of the content plane. Two examples of these intermediate cases were mentioned above – written reports recording events with quantitative information represented not only in running text but also in tables, graphs and maps (the WHO’s WERs) and spoken language and gestures jointly construing movement through space (as in narrative recreating texts or direction-giving enabling texts). In both these examples, the complementarity in the creation of meaning lies within the experiential mode of meaning in the first instance, and the semiotic systems are so finely coordinated that they would have to be integrated within the semantic stratum.
The cline of semiotic integration can also be seen ‘from above’, from the vantage point of context. Here all contextual parameters are important, but Mode – the parameter concerned with the role played by different semiotic systems in a given context – is central: it determines the division of semiotic labour in the coordinated, synergistic creation of meaning in a given context. Having explored Mode, I followed with a brief discussion of Tenor and Field. The general point of this final part of the chapter was that different settings of Field, Tenor and Mode are associated with different registerial combinations of semiotic systems.
Notes
1. This was the term Richard Wagner used in the middle of the nineteenth century to represent his vision of operatic performances where music, language, set design and the other components of an opera are fused or unified so that they work together to create the total effect. Thus interactants in face-to-face conversation experience a seamless flow of fused or unified meaning rather than separate ‘channels’ using different parts of the body. (The seams may of course emerge if the synchronization of the different bodily systems is disrupted, as when a film is lip-synched or when either audio or video is even slightly delayed.)
2. The chapter was written up (and the work upon which it is based was carried out) before John Bateman’s (2008) foundational contribution to the analysis of ‘multimodal documents’ instantiating recognizable genres. For my view of the complementarity of the terms ‘register’ and ‘genre’, see Matthiessen (1993b). A register is a functional variety of language (in Halliday’s, e.g. 1978, classical sense of the term); a genre is a contextual construct (in Martin’s, e.g. 1992, sense of the term) – corresponding roughly to a ‘situation type’ in Halliday’s (e.g. 2002) account.
3. It is of course also perfectly possible for two or more semiotic systems to operate independently of one another within different contexts that are manifested within the same material setting without impacting on one another in social and semiotic terms.
4. At the time of his description in the 1980s, the account of APPRAISAL proposed by Martin and White (2005) had not yet been developed, so it was not yet possible to relate the emotive meaning of music (as emphasized by Steiner) to the semantics of appraisal in language. This is now a possible project, and in computational linguistics there has been interesting work on controlling the selection of music as part of the generation of text in reference to interpersonal meaning (at Naoyuki Okada’s laboratory, Kyushu Institute of Technology, in the 1990s). (This takes us back, by another route, to the notion of a Gesamtkunstwerk; the extended collaboration by Alfred Hitchcock, as director, and Bernard Herrmann, as composer, illustrates this type of multisemiosis in film-making.)
5. There are interesting implications for second/foreign language learning. Research into English and Spanish language learning by E. Negueruela et al. (2004) suggests that even quite advanced learners tend to gesture according to the conventions of their mother tongue.
6. In the history of languages, writing systems were derived from drawing systems; see Matthiessen, 2006, and references therein.
7. This issue is a critical one in investigations of the ‘deep’ history of semiotic systems, where we must rely on the material record that can be unearthed (cf. Matthiessen, 2004, and references therein).
8. With respect to semiotic systems other than language, the nature of the MEDIUM is of course also relevant; but other semiotic systems tend only to operate with one type of medium. We can of course include systems of notation for transcribing or composing ‘text’ in a semiotic system. In the case of language, writing serves this kind of role for speech, so although writing did not originally evolve as a representation of speaking, it can now be used in that way; but in the case of certain other semiotic systems there are distinct notation systems such as notation systems for music and for dance. These are semiotic systems in their own right, not medium-based variants of another semiotic system.
9. This cline is analogous to the traditionally recognized mode cline between ‘language as constitutive’ and ‘language as ancillary’ [language in action]; this cline can now be reinterpreted as a cline between ‘denotative semiotic as constitutive’ and ‘denotative semiotic as ancillary’ – represented by the vertical axis in Figure 2.8. In reality, there are of course several such clines since semiotic systems other than language need to be further differentiated.
10. In Halliday and Matthiessen (1999, Chapter 6), we interpret metaphor as involving semantic junctures – semantically ‘joint’ categories, and this notion of juncture is also relevant to the analysis of multisemiotic systems. In one of the cognitive traditions, Fauconnier and Turner (e.g. 1998, 2002, forthcoming) have proposed the mechanism of conceptual blending or integration, and there is now a considerable body of literature exploring this mechanism. It can be applied to both the analysis of metaphor and the analysis of meaning in multisemiotic systems.
11. The possibilities may, of course, be technologically enhanced, initially by some form of public address system, but now also by means of video cameras focusing, for example, on the upper bodies of speakers on a stage and screens showing the audience a significantly magnified version.
12. Cf. ‘activity sequence’ and ‘taxonomy’ in Martin (1992, pp. 536–42).
13. Also characterized as ‘institution’ – ‘networks of regions of social semiotic space’ – in Halliday (2005 [2002], pp. 254–5).
References Baldry, A. and P. J. Thibault (2006) Multimodal Transcription and Text Analysis: A Multimedia Toolkit and Coursebook (London and Oakville: Equinox). Bateman, J. A. (2008) Multimodality and Genre: A Foundation for the Systematic Analysis of Multimodal Documents (London and New York: Palgrave Macmillan). Benson, J. D., P. H. Fries, W. S. Greaves, K. Iwamoto, E. S. Savage-Rumbaugh and J. Taglialatela (2002) ‘Confrontation and support in bonobo-human discourse.’ Functions of Language, 9(1): pp. 1–38. Benson, J. D. and W. S. Greaves (eds) (2005) Functional Dimensions of Ape-Human Discourse (London: Equinox). Fauconnier, G. and M. Turner (1998) ‘Conceptual integration networks.’ Cognitive Science: A Multidisciplinary Journal, 22.2: pp. 133–87. —— (2002) The Way We Think: Conceptual Blending and the Mind’s Hidden Complexities (New York: Basic Books). —— (2008) ‘Rethinking metaphor’, in Raymond W. Gibbs Jr (ed.) The Cambridge Handbook of Metaphor and Thought (Cambridge: Cambridge University Press), pp. 53–66. Ghadessy, M. (ed.) (1999) Text and Context in Functional Linguistics (Amsterdam: Benjamins). Hall, E. T. (1966) The Hidden Dimension (New York: Doubleday). Halliday, M. A. K. (1963) ‘Intonation in English Grammar.’ Transactions of the Philological Society, 62 (1): pp. 143–69. —— (1967) Intonation and Grammar in British English (Janua Linguarum Series Practica 48) (The Hague: Mouton). —— (1970) A Course in Spoken English: Intonation (London: Oxford University Press). —— (1973) Explorations in the Functions of Language (London: Edward Arnold). —— (1978) Language as Social Semiotic: The Social Interpretation of Language and Meaning (London and Baltimore, MD: Edward Arnold and University Park Press). —— (1991) ‘The notion of “context” in language education’, in T. Le and M. McCausland (eds) Interaction and Development: Proceedings of the International Conference, Vietnam, 30 March–1 April 1991. (University of Tasmania: Language Education). Reprinted in M. A. K. 
Halliday (2007) Language and Education. Volume 9 in the Collected Works of M. A. K. Halliday, edited by J. Webster (London and New York: Continuum), pp. 269–90. —— (1996) ‘On grammar and grammatics’, in R. Hasan, C. Cloran and D. Butt (eds) Functional Descriptions: Theory into Practice (Amsterdam: Benjamins), pp. 1–38. —— (2002) ‘Computing meanings: Some reflections on past experience and present prospects’, in G. Huang and Z. Wang (eds) Discourse and Language Functions (Shanghai: Foreign Language Teaching and Research Press), pp. 3–25. Reprinted in M. A. K. Halliday (2005) Computational and Quantitative Studies. Volume 6 in the Collected Works of M. A. K. Halliday, edited by Jonathan Webster (London and New York: Continuum), pp. 239–67. —— (2004) The Language of Early Childhood. Volume 4 of Collected Works of M. A. K. Halliday, edited by J. Webster (London and New York: Continuum). —— (2005) ‘On matter and meaning: the two realms of human experience.’ Linguistics and the Human Sciences, 1.1: pp. 59–82.
Halliday, M. A. K. and W. S. Greaves (2008) Intonation in the Grammar of English (London: Equinox). Halliday, M. A. K. and C. M. I. M. Matthiessen (1999) Construing Experience Through Meaning: A Language-Based Approach to Cognition. Republished in 2006 (London: Cassell). Hasan, R. (1985) ‘Meaning, context and text: Fifty years after Malinowski’, in J. D. Benson and W. S. Greaves (eds) Systemic Perspectives on Discourse (Norwood, NJ: Ablex), pp. 16–50. Haviland, J. (2000) ‘Pointing, gesture spaces, and mental maps’, in McNeill (ed.): pp. 13–46. Hjelmslev, L. (1943) Omkring Sprogteoriens Grundlaeggelse (Copenhagen: Akademisk Forlag). (English version. 1961 Prolegomena to a Theory of Language. Madison, Wisconsin: University of Wisconsin Press.) Johnson, A. W. and T. Earle (2000) The Evolution of Human Societies: From Foraging Group to Agrarian State (Stanford: Stanford University Press). Johnston, T. (1992) ‘The realization of the linguistic metafunctions in a sign language.’ Language Sciences, 14.4: pp. 317–55. Kress, G. and T. van Leeuwen (1996) Reading Images: The Grammar of Visual Design (London: Routledge). Lovelock, James (1991) Gaia: The Practical Science of Planetary Medicine (Sydney: Allen & Unwin). McNeill, D. (ed.) (2000a) Language and Gesture (Cambridge: Cambridge University Press). —— (ed.) (2000b) ‘Introduction’, in McNeill (ed.): pp. 1–10. McNeill, D. and S. D. Duncan (2000) ‘Growth points in thinking-for-speaking’, in McNeill (ed.): pp. 141–61. Martin, J. R. (1992) English Text: System and Structure (Amsterdam and Philadelphia: Benjamins). Martin, J. R. and P. R. R. White (2005) The Language of Evaluation: Appraisal in English (London and New York: Palgrave Macmillan). Martinec, R. (2005) ‘Topics in multimodality’, in R. Hasan, C. M. I. M. Matthiessen and J. Webster (eds) Continuing Discourse on Language. Volume 1 (London: Equinox), pp. 157–81. Matthiessen, C. M. I. M.
(1993a) ‘Instantial systems and logogenesis.’ Written version of paper presented at the third Chinese systemic functional symposium, Hangzhou University, Hangzhou, 17–20 June 1993. —— (1993b) ‘Register in the round: Diversity in a unified theory of register analysis’, in M. Ghadessy (ed.) Register Analysis: Theory and Practice (London: Pinter), pp. 221–92. —— (1995) ‘THEME as an enabling resource in ideational “knowledge” construction’, in M. Ghadessy (ed.) Thematic Developments in English Texts (London and New York: Pinter), pp. 20–55. —— (2002) ‘Lexicogrammar in discourse development: Logogenetic patterns of wording’, in G. Huang and Z. Wang (eds) Discourse and Language Functions (Shanghai: Foreign Language Teaching and Research Press), pp. 91–127. —— (2004) ‘The evolution of language: A systemic functional exploration of phylogenetic phases’, in G. Williams and A. Lukin (eds) Language Development: Functional Perspectives on Evolution and Ontogenesis (London: Continuum), pp. 45–90. —— (2006) ‘The multimodal page: A systemic functional exploration’, in T. D. Royce and W. L. Bowcher (eds) (2006) New Directions in the Analysis of Multimodal Discourse (Hillsdale, NJ: Lawrence Erlbaum).
Matthiessen, C. M. I. M. (2007) ‘The “architecture” of language according to systemic functional theory: Developments since the 1970s’, in R. Hasan, C. M. I. M. Matthiessen and J. Webster (eds) Continuing Discourse on Language. Volume 2 (London: Equinox), pp. 505–61. —— (forthcoming) The Architecture of Grammar. (New Delhi: Decent Books). Matthiessen, C. M. I. M., K. Teruya and W. Canzhong. (forthcoming) Doing Discourse Analysis. Book MS. Matthiessen, C. M. I. M., L. Zeng, M. Cross, I. Kobayashi, K. Teruya and C. Wu (1998) ‘The Multex generator and its environment: Application and development’, in Proceedings of the International Generation Workshop ‘98, August 1998, Niagara-on-the-Lake, pp. 228–37. Negueruela, E., J. P. Lantolf, S. R. Jordan, and J. Gelabert (2004) ‘The “private function” of gesture in second language communicative activity: a study on motion verbs and gesturing in English and Spanish.’ International Journal of Applied Linguistics, 14(1): pp. 113–47. O’Halloran, K. L. (ed.) (2004) Multimodal Discourse Analysis (London and New York: Continuum). Olson, David R. (1994) The World on Paper: The Conceptual and Cognitive Implications of Writing and Reading (Cambridge: Cambridge University Press). O’Toole, M. (1994) The Language of Displayed Art (London: Leicester University Press (Pinter)). Patten, T. (1988) Systemic Text Generation as Problem Solving (Cambridge: Cambridge University Press). Reithinger, N. (1987) ‘Generating referring expressions and pointing gestures’, in G. Kempen (ed.) Natural Language Generation (Dordrecht: Martinus Nijhoff), pp. 71–81. Royce, T. D. and W. L. Bowcher (eds) (2006) New Directions in the Analysis of Multimodal Discourse (Hillsdale, NJ: Lawrence Erlbaum). Steiner, E. (1988) ‘Language and music as semiotic systems: The example of a folk ballad’, in J. D. Benson, M. J. Cummings and W. S. Greaves (eds) Linguistics in a Systemic Perspective (Amsterdam: Benjamins), pp. 393–441. Talmy, L. (1985) ‘Lexicalisation patterns’, in T.
Shopen (ed.) Language Typology and Syntactic Description. Volume III. Grammatical Categories and the Lexicon (Cambridge: Cambridge University Press), pp. 57–149. Thibault, P. J. (2004) Brain, Mind and the Signifying Body: An Ecosocial Semiotic Theory (London and New York: Continuum). Ventola, E., C. Charles and M. Kaltenbacher (eds) (2004) Perspectives on Multimodality (Amsterdam: Benjamins). Willats, J. (1997) Art and Representation (Princeton: Princeton University Press).
3 Developing Multimodal Texture Martin Thomas
3.1 Introduction
This chapter describes part of a wider research agenda concerned with variation in messages on fast-moving consumer goods packaging from three markets or locales: China, Taiwan and the United Kingdom. These pack messages are multimodal. More specifically, they are graphic: in Twyman’s (1985) terms, they are realized through a combination of verbal, schematic and pictorial resources. Significantly, verbal messages are expressed typographically. Moreover, the packs on which the messages are displayed are three-dimensional objects, typically presenting a number of faces.
The aim here is to begin to describe the resources that serve the textual ‘enabling function’ (Halliday, 1978, p. 50) in multimodal texts, particularly in pack messages. This will provide a starting point for investigating the dimensions of variation across locales. The description will subsequently be tested and refined using the annotated comparable corpus I have described elsewhere (Thomas, 2007).
In this chapter, I review influential accounts of texture and cohesion from the literature on multimodality. I suggest that these approaches should be enriched by incorporating expert knowledge from the field of typography. Finally, having outlined the theoretical tools which seem useful for the analysis of multimodal texture, I show how these might be applied when analysing pack designs.
3.2 Adapting the metafunctions for multimodal analysis
In the past decade or so, significant advances have been made in adapting the metafunctional model of language developed within systemic functional linguistics (SFL) (Halliday, 1978) for use with other semiotic modes (see O’Toole, 1994; Kress and van Leeuwen, 1996). Rather than treating semiotic modes individually, as does O’Toole (1994), Kress and van Leeuwen’s account explicitly seeks to accommodate ‘composite visuals’, ‘which combine text
and image, and, perhaps, other graphic elements’ (1996, p. 183). It therefore provides a suitable point of departure for our discussion. Before looking at their account in detail, it is worth putting it in some context. Kress and van Leeuwen (1996, p. 13) maintain that communication in all semiotic modes ‘fulfils two major functions’. In Halliday’s terms, these are the ideational and interpersonal metafunctions, which they rename representation and interaction respectively. Their equivalent of Halliday’s textual metafunction is renamed composition. The resources of this function are divided into three ‘interrelated systems’: information value, salience and framing. It should also be noted that, unlike O’Toole, Kress and van Leeuwen do not adopt a rank scale: they maintain that visual structures are not like linguistic structures (1996, p. 2).¹
3.2.1 Information value
Kress and van Leeuwen map information value onto dimensions of visual space, which they call Centre-Margin, Ideal-Real and Given-New. They suggest certain limits for the applicability of their model: they are ‘largely concerned with the description of the visual semiotic of Western cultures’ and note that cultures with different reading directions ‘are likely to attach different values to these positions’ (Kress and van Leeuwen, 1996, p. 199). In the following subsections, we shall treat the definition for each dimension in turn. It will become clear that they are presented as being generally applicable within the ‘visual semiotic of Western cultures’.
3.2.1.1 Centre-Margin
[I]f a visual composition makes significant use of the centre, placing one element in the middle and the other elements around it, we will refer to the central element as Centre and to the elements around it as Margins. For something to be presented as Centre means that it is presented as the nucleus of the information on which all the other elements are in some sense subservient. The Margins are these ancillary, dependent elements. (Kress and van Leeuwen, 1996, p. 206)
Intuitively this sounds plausible. Indeed, data from the 100 million-word British National Corpus² suggest that central and marginal are commonly used in these senses, albeit in other contexts. However, despite Kress and van Leeuwen’s observation that ‘central composition plays an important role in the imagination of young Asian designers’ (1996, pp. 203–6), I have so far been unable to find convincing examples of the Centre-Margin principle in pack designs collected from the United Kingdom or Taiwan. While it is not uncommon for elements of pack design, particularly brand logos, to be organized in this way, the principle does not extend across a whole pack face. It may be that the three-dimensional form of pack designs allows
central information to be displayed on the front of the pack, while marginal information is relegated to less prominent faces.
3.2.1.2 Ideal-Real
If, in a visual composition, some of the constituent elements are placed in the upper part, and other different elements in the lower part of the picture space or the page, then what has been placed on the top is presented as the Ideal, what has been placed at the bottom as the Real. (Kress and van Leeuwen, 1996, p. 193)
There seems to be some evidence to support this. On the back of many shampoo bottles, such as those shown in Illustration 3.1, it is common to find the brand name and an accompanying promise about what the product will deliver in the upper part and more concrete instructions for use and storage, as well as details about ingredients and manufacture in the lower part. However, it also seems plausible that, as the elements towards the top are more salient (Kress and van Leeuwen, 1996, pp. 212–13), this may account for the apparent pattern in the placing of messages.
3.2.1.3 Given-New
[W]hen pictures or layouts make significant use of the horizontal axis [...] the elements placed on the left are presented as Given, the elements placed on the right as New. (Kress and van Leeuwen, 1996, p. 187)
In developing this part of their theory, Kress and van Leeuwen explicitly refer to Halliday’s account of the information structure in the English clause. Concerns have been raised about the validity of extending the concept in this way. Bateman et al. (2004, pp. 66–7) point to the absence of methodologies for establishing the general reliability of such an extension, while Thibault (2000, p. 330) questions the appropriateness of extending the notion of constituency to visual texts. For the present purposes, another consideration, to which Kress and van Leeuwen themselves allude, is the variable reading direction of Chinese.
If Kress and van Leeuwen’s prediction is correct, and reading direction influences the allocation of information value to different positions in visual layouts, we should expect the theory to hold for English examples but not necessarily for examples produced for a Chinese-speaking audience.³ However, rather than speculating further about the principles underlying the theory, let us see whether it can account for the data. Given their oblong shape and their typically landscape orientation on the supermarket shelf, toothpaste boxes would seem to make good examples of layouts ‘which make significant use of the horizontal axis’ (Kress and van Leeuwen, 1996, p. 187). Indeed, on the first face in the example shown in Illustration 3.2, on the left we are presented with the brand name, Sensodyne, which is presumably already known to the majority of potential consumers and rightfully occupies the Given position in terms of information structure. To the right, in the New position, we find the claim that the product provides ‘clinically proven relief from the pain of sensitive teeth’. It seems plausible to accept this part of the message as being ‘not yet known, or perhaps not yet agreed upon by the viewer’ (Kress and van Leeuwen, 1996, p. 187).
Illustration 3.1 The back of UK and Taiwan Head and Shoulders shampoo bottles
Illustration 3.2 Three faces of a UK Sensodyne Original Toothpaste pack: front (Face 1), side (Face 2) and back (Face 3)
Now let us turn to the second face of the same pack, also shown in Illustration 3.2. It might be noted that this face is not the one preferred for presentation to the potential consumer. Nevertheless, the background to and framing of the message elements it contains echo those on the front of the pack. Here too we have a strong horizontal axis, with different elements being distributed to the left and to the right of the layout. However, while a substantial amount of verbal text now occupies what should be the Given position on the left, the word ‘toothpaste’ is located centrally within the right-hand, New, position. It would be hard to make the case that this part of the message may be seen as ‘contestable’ or ‘problematic’ (Kress and van Leeuwen, 1996, p. 187). In sum, Kress and van Leeuwen’s account of information value offers an innovative framework through which to interpret visual layouts. They provide examples which demonstrate that their hypotheses are supported in real texts. However, my attempt to apply the theory to packaging shows that it does not account for all visual layouts, even those which appear to meet the criterion of being polarized along a particular axis. Moreover, it is not
necessary to look beyond the ‘visual semiotic of Western cultures’ to find counterexamples. Of course, the single example presented here cannot be taken as persuasive evidence in support of an alternative rule, but it does highlight the need to exercise caution in applying such rules too generally.
3.2.2 Salience and framing
Two other systems are described in Kress and van Leeuwen’s (1996) account of composition: salience and framing. Salience refers to the fact that elements ‘are made to attract the viewer’s attention to different degrees’ (1996, p. 183). With regard to framing they explain that the ‘presence or absence of framing devices [...] disconnects or connects elements of the image, signifying that they belong or do not belong together in some sense’ (1996, p. 183). While in their definition Kress and van Leeuwen refer to images, they go on to explain that the concept may be applied to ‘composite visuals’. Intuitively the systems of salience and framing seem less contestable than the application of information value to account for visual layouts. However, here too we find a number of problems. First, the presence of some realization resources, for example, colour and placement/space, under both systems (1996, pp. 224–5) suggests that it might be hard to interpret the role of a particular instance of a realization property in a given situation. This might not be a serious problem: it is quite plausible for colour to play a role in both salience and framing and these two systems might be instantiated simultaneously in a particular case. Perhaps more surprising, given the range of visual material to which Kress and van Leeuwen seek to apply their theory, is the absence of typographic resources for realizing the potential of the two systems. While colour, size and placement may be related to features of both typographic compositions and images, no resources specific to the setting of text are proposed.⁴ I will return to this matter in Sections 3.2.4 and 3.2.5.
Finally, the terms salience and framing and their definitions do not seem to cover the variety of resources for textual meaning-making that we find in ‘composite visuals’. Salience implies a rather unidimensional conception of the potential for graphic expression of messages. Moreover, the definition relies on the notion of attracting the viewer’s attention and thus involving the engagement between the producer and the consumer of the text. Hence it might be seen as straying into the realm of interpersonal meaning. Indeed, it might be noted that O’Toole (1994, p. 24) locates relative prominence under the modal function, which he associates with Halliday’s interpersonal metafunction. Meanwhile, framing emphasizes boundaries and separation. In this case, however, it seems that the term is itself the problem, rather than the intended scope of its application: Kress and van Leeuwen admit ‘continuities or similarities of color, visual shape, etc.’ into the list of resources for the realization of connection within the framing system (1996, p. 225).
3.2.3 The legacy – why does this matter?
In extending SFL to other semiotic modes, O’Toole (1994) and Kress and van Leeuwen (1996) have adopted some of the basic principles of the theory, notably those of metafunction, system and, in O’Toole’s case, rank. However, as has been shown, the allocation of systems to metafunctions is at times contestable (e.g. Kress and van Leeuwen’s inclusion of salience as part of composition) and some of these systems are incomplete or else very indelicate (e.g. Kress and van Leeuwen’s salience and framing). In addition, O’Toole’s adaptation of rank is hard to reconcile fully with the notion of constituency as conceived by Halliday.⁵ Nonetheless, these contributions should not be undervalued. They have helped to open up a field of study which was previously largely out of bounds to linguists. It is only because they have created such wide-ranging and influential theoretical frameworks that a critique of the kind presented here is either necessary or possible. It is significant that the work of O’Toole, Kress and van Leeuwen on ‘music, dance and other languages of art’ is acknowledged in the third edition of An Introduction to Functional Grammar (Halliday and Matthiessen, 2004, p. 20). However, the fact that this acknowledgement is used to draw a line between these ‘other languages’ and verbal language is perhaps symptomatic of a more general problem: there is a lack of integration between the established framework of SFL and approaches which seek to extend that framework to accommodate other semiotic modes. This is particularly evident in the case of Kress and van Leeuwen, who ‘seek to break down the disciplinary boundaries between the study of language and the study of images [and] to use compatible language, and compatible terminology in speaking about both’ (1996, p. 183).
Despite this stated aim, and the multimodal nature of their ‘composite visuals’, they make very limited reference to the verbal language components in their examples. In relation to texture and cohesion, they also make only limited reference to the terminology of linguistic analysis. Their adaptation of the concept of Given^New information structure is a notable exception. This approach is problematic for two main reasons. First, as we saw in Section 3.2.1, avoiding the terms of linguistic analysis has not enabled them to avoid criticisms, such as Thibault’s (2000, p. 330). Second, and perhaps more crucially, as was suggested in Section 3.2.2, there seems to be a gap in Kress and van Leeuwen’s ‘visual grammar’ at precisely the point where the visual meets the verbal: the graphic realization of verbal messages is neglected. These shortcomings are all the more significant because they are reproduced in the literature which has been influenced by Kress and van Leeuwen’s model. For instance, despite Thibault’s earlier scepticism, in their recent account of ‘how the metafunctions are typically enacted in visual genres’, Baldry and Thibault cite Kress and van Leeuwen (1996, pp. 186–202)
when restating the claim that
the most important textual/compositional resources when expressing the textual metafunction would appear to be: (a) horizontal structure when presenting visual information as Given or New and (b) vertical structure when presenting visual information as Ideal or Real. (Baldry and Thibault, 2006, p. 39)
This is not to say that these authors neglect other aspects of texture entirely. Thibault (2000, pp. 336–7) has developed a concept of ‘visual collocation’, building on corpus linguistics. Baldry (2004, p. 100) points out that ‘visual/verbal ellipsis is constantly at work’ in the advertisements he analyses. Baldry and Thibault (2006, pp. 136–40) describe the resources of deixis between verbal and visual semiotic modalities. However, their analysis relies on the notion of intermodal relations: there is no explicit recognition of the fact that the verbal messages to which they refer are themselves realized visually.
3.2.4 Other insights from SFL
The apparent neglect of the role of typography and its contribution to texture and cohesion in the literature discussed in the previous sections does not reflect the full picture. Notably, Bernhardt (1985) and Lemke (1998) fill in some of the gaps. In his early contribution, Bernhardt addressed the nature of cohesion in what he called ‘visually informative prose’ directly:
In visually informative prose, graphic cues take over the function normally realized by cohesive ties, especially those which would serve to assign prominence at critical boundaries, such as at paragraph divisions and at points of major transition from one large division to the next. Both visually informative and non-visually informative texts achieve rhetorical structure, but the means by which they do so differs. (Bernhardt, 1985, p. 22)
On one level, Bernhardt seems here to be setting up binary choices between linguistic ‘cohesive ties’ and ‘graphic cues’.
However, this is mitigated by his suggestion that texts sit on a ‘continuum of visual organization’ (1985, p. 20). Although the points along this continuum lack formal definition and Bernhardt seems to conflate document elements (e.g. lists) with text types (e.g. legal texts), the concept usefully breaks down the distinction between the poles of visual and non-visual informativeness: different texts make use of graphic cues to different degrees. Bernhardt (1985) compares the resources available for achieving rhetorical control in visually informative as well as in non-visually informative
texts. In doing this, he integrates well-developed linguistic approaches to cohesion (citing, e.g. Halliday and Hasan, 1976). Thus, rather than talking in terms of dual systems, such as Kress and van Leeuwen’s salience and framing, Bernhardt (1985, p. 29) is able to suggest a range of resources for functions such as subordination, coordination and other types of relations, as well as for partitioning and emphasis. Like Bernhardt, Lemke (1998) is particularly interested in the way in which typography interacts with the verbal elements in text. Rather than offering alternative names for the metafunctions as adapted for other semiotic modes, Lemke proposes names for generalized metafunctions which apply to all meaning-making. Thus, the ideational metafunction of language is part of the presentational semiotic metafunction, interpersonal belongs under orientational, and textual comes under organizational. He relates the role of typography to his metafunctional approach as follows:
Typography is quite conventionally used as an orientational as well as an organizational resource in printed text. Orientationally, the use of italic and boldface types signals emphasis or importance, as does the relative point-size of type in titles, headings, abstracts, footnotes, captions, labels, etc. Organizationally, paragraphing and sectioning of text, and geometric relations of figure space to caption space indicate to us which elements are to be preferentially read in relation to which other elements; what goes with what. (Lemke, 1998, p. 95)
Thus he seems to agree with O’Toole (1994) that emphasis is part of orientational or, in Halliday’s terms, interpersonal meaning, while the textual, or organizational, role of typography has to do with grouping elements. However, while he provides examples of these functions in his analyses, Lemke does not attempt to systematize resources for the realization of these types of meaning.
While rather less ambitious than Kress and van Leeuwen’s attempt at providing a ‘grammar of visual design’ (1996, pp. 1–5), Bernhardt (1985) and Lemke (1998) provide interesting and subtle analyses that demonstrate the importance of the typographic expression of verbal language.
3.2.5 Contributions from the field of typography
From a different, though complementary, perspective, work on typography has also made significant contributions which might usefully be incorporated into a framework for analysing multimodal cohesion and texture. Waller (1987) offers two terms to describe the typographic mediation of language: modulation and segmentation. These broadly anticipate the distinction between salience and framing made by Kress and van Leeuwen (1996). However, as we shall see, Waller’s terms cover a richer set of functions and resources.
In Waller’s words (1987, p. 99), ‘Modulation describes the way in which the meaning of an utterance may be colored or emphasized by tone of voice, facial expression or gesture’. He goes on to explain:
It is possible to use italics and bold type to add some vocal quality to writing but only to a strictly limited degree. At the discourse level, though, typographic modulation is common. Textbook designers, for example, often specify different typographic ‘voices’ to distinguish between, say, the main text, quotations, captions and study guidance. (Waller, 1987, p. 100)
Thus conceived, typographic modulation is not simply a matter of attracting attention or signalling emphasis, as is implied by Kress and van Leeuwen’s salience and which, as has been noted in Section 3.2.2, arguably strays into the interpersonal realm. In fact, typography contributes to the orchestration of elements in a text in ways which cannot be reduced to a single scale. In Waller’s treatment, graphic segmentation is related to linguistic cues. Though he does not adopt Hallidayan terminology, it is clear that Waller sees a textual role for the combined resources of typography and language:
Segmentation describes the marking of boundaries in spoken or written language. In speech, this may [be] done with pauses, with gesture or with tone of voice. In writing, boundaries may be represented by space, rules or punctuation marks. At levels higher than the sentence (for example, sections of a book, or when the subject of a conversation changes), boundaries are also typically marked by the use of ‘metalanguage’ – language whose function is to structure or monitor the discourse as a whole. Words like ‘Well’ (in speech) or ‘Introduction’ (in books) are metalinguistic. In writing, metalanguage is often signaled typographically: headings, for example, function because of the way they look as well as through what they say. (1987, p. 100)
Waller’s terms do not lend themselves to the kind of unidimensional, polarized scales of Kress and van Leeuwen’s salience and framing: it would make little sense to speak of something as being realized with ‘maximum modulation’. Moreover, while his definition of segmentation emphasizes boundaries realized at various levels, his discussion of Gestalt principles (illustrated in Figure 3.1) demonstrates clearly that the grouping of elements may be achieved through proximity (Examples a, b and c) and similarity (Example d). Typography is especially significant in studies which involve multimodal comparison across languages and cultures because the typographic resources available for modulation and segmentation depend on the writing system of the language that is being realized. Language-specific features of typography form a thematic thread running also through Bringhurst’s
Figure 3.1 Waller’s (1987) illustrations of Gestalt principles of grouping (Examples a–d)
(2002) work in which such features are present at every level from the individual character up to the page or document. Of course, Bringhurst is not alone in pointing to the language-dependent nature of layout: as we have seen, Kress and van Leeuwen (1996, p. 199) suggest that their model of information value may be dependent on reading direction. However, Bringhurst provides concrete examples of ways in which typographic realization depends on features of the language being realized. For example, he (2002, p. 37) explains that, ‘a text in German would ideally have a little more lead than the same text in Latin or French, purely because of the increased frequency of capitals’. Elsewhere, and of particular relevance to present concerns, he (2002, p. 70) notes that, ‘setting column heads vertically as a space-saving measure is quite feasible if the text is in Japanese or Chinese, but not if it is written in the Latin alphabet’. More work is needed to assess the extent to which pack designs from Chinese-speaking locales take this typographic choice. Aside from these language-specific features of typography, Bringhurst (2002, p. 63) also mentions a number of traditional typographic practices which have a cohesive function. These have been used even within linear (in Bernhardt’s terms ‘non-visually informative’) text where we might expect linguistic cohesive ties to be sufficient.
3.3 Applying the tools
Before drawing some general conclusions about the range of theoretical tools discussed so far, it would be useful to illustrate how some of them
might be applied in analysis. I will focus on the Sensodyne pack shown in Illustration 3.2. As we saw in Section 3.2.1.3, this example does not seem to fit Kress and van Leeuwen’s model of information value: despite making use of the horizontal axis, the distribution of its elements does not follow the Given-New pattern. It therefore offers a good opportunity to see what other resources contribute to its texture.
3.3.1 Typographic modulation and segmentation
Looking at examples of packaging, we find that the choices made about the use of typographic resources are not necessarily those that one might expect. Neither do the functions that the choices seem to be intended to serve necessarily meet our expectations. For example, while we might expect an italic style to be used to add emphasis (Lemke, 1998, p. 95), it is not uncommon to find examples where all of the verbal messages on a pack face have been realized in italics. Thus, in terms of Gestalt principles, it would seem that the italic style has the effect of grouping through similarity. Equally, we might expect upper-case type to be chosen in order to add emphasis. However, it is not unusual for entire lists of ingredients to be realized in upper case, which suggests that motivations other than increasing salience might be involved (cf. Waller’s ‘typographic voices’). Finally, if size were simply about salience (Kress and van Leeuwen, 1996, pp. 224–5) or emphasis (Lemke, 1998, p. 95), from Face 3 of the pack shown in Illustration 3.2 we should then infer that the digits in the barcode are intended to attract more attention than the verbal messages. In this case, it would seem that production and consumption constraints have played a defining role (Delin, Bateman and Allen, 2002, pp. 56–7): the size of the barcode is determined by the need for it to be easily readable by a scanner or, failing that, by staff, at various points in the distribution process, most significantly at the checkout.
In the Sensodyne example, the dominant resources used in the modulation and segmentation of messages are shape, colour and space. In Kress and van Leeuwen’s (1996, p. 217) terms, it makes use of ‘visual rhyme’ through the ‘repetition of colors and shapes in different elements of the composition’. This device not only links the Sensodyne brand name and logo to the background on the front of the pack, but also serves to create cohesion across the faces of the pack. Another obvious feature of the visual design of the Sensodyne pack is the use of a very restricted colour palette. This use of colour, and the way it is varied across the products in the Sensodyne brand family, reminds us of Bernhardt’s (1985, p. 26) observation that features of visual design can be used to build a shared identity across a set of texts. Indeed, the Head and Shoulders examples (see Illustration 3.1) show how colour may be used to create a shared visual identity between product variants from different locales. The Sensodyne colour palette, and the use of generous amounts of white space, also creates a rather ‘clinical’ feel. This echoes the ‘clinically
proven ...’ claim on the front of the pack and may be intended to build on GlaxoSmithKline’s pharmaceutical background in order to set up interpersonal relations between the producer and the consumer of the text. Such a stance would also seem to be realized by the heavy use of the imperative mood in the verbal messages on Face 2 of the pack in Illustration 3.2, for example Ask your dentist, Keep safely away from children, and the repeated Do not use ... . While this complementarity between the choice of linguistic and typographic resources might be argued here to support interpersonal meaning, it simultaneously contributes to textual meaning by creating relationships between parts of the text. In terms of the typographic realization of specific messages, various patterns can be noted. First of all, on Face 2, the same brighter hue of blue is used for all headings. This would seem to be an example of colour being used to realize salience as suggested by Kress and van Leeuwen’s account. However, this colour is also used for some non-heading text, which is realized in a smaller font. In general terms, only where bright blue text is immediately followed by darker blue text does this suggest a Head^Body relationship. Moreover, there is a hierarchy of headings, which is realized through type case selection: the headings indicating the important information in the first two (left and middle) columns are realized in upper case; the lists of ingredients are signalled by headings in sentence case followed by a colon. It might be noted that while such case distinctions are frequently used in the setting of English verbal messages, this resource is not available for the expression of Chinese.
3.3.2 Visual and linguistic cohesion
As was suggested in Section 3.2.4, rather than there being an exclusive binary choice between the use of linguistic and typographic resources for producing cohesion, it is not hard to find examples where they work to reinforce one another.
One such example would be the use of lists in which an abridged form of each item is presented on the front of a given pack with the same items reproduced in an elaborated form on the back. The back of our Sensodyne pack (Face 3 in Illustration 3.2) provides an example of another type of cohesion in which linguistic and typographic resources work together. In terms of colour and case, the headings and body text on Face 3 follow the patterns we saw on Face 2. White space is used to segment the messages into chunks, though this time the three chunks of text are arranged in a single, much wider column. In terms of linguistic choices, each bright blue heading consists of a Wh-interrogative clause, the answer to which is contained in the following dark blue body text. Thus, rather than salience, colour here seems to be used to differentiate typographically between the two voices present in the verbal text. As well as binding each chunk together as a semantic unit, this repeated visual and linguistic pattern builds cohesion across the face.
It is also significant that anaphoric references in the verbal language can be resolved within the typographically segmented chunk in which they occur. Thus, the It’s which opens the body text of the first chunk refers to the What is in the heading; These factors in the second chunk refers to the items in the preceding bulleted list; and This active ingredient in the third chunk refers back to the strontium chloride in the preceding sentence.
3.3.3 Intermodal relations
So far, my analysis of the Sensodyne example has focused on the graphic realization of otherwise mostly verbal messages. While it moves beyond the scope of most linguistic analysis, this might be considered as intramodal analysis: typography is largely functioning as a set of resources for the expression of verbal messages. However, as was mentioned in the introduction, packs communicate through pictorial as well as verbal and schematic resources (Twyman, 1985), such as the recycle icon shown on Face 2 in Illustration 3.2. Pictorial elements may serve a number of purposes, among them building brand and product recognition, illustrating the difference before and after use, and providing evidence for a claim, which may be realized verbally. In some cases, there may be explicit deictic references between verbal and pictorial messages. In other cases, the relationship is implicit. Implicit relations may be strengthened through proximity and the use of frames, borders and other typographic resources of segmentation. In other cases, as in the illustration of a tooth on the back of our Sensodyne pack (see Face 3 of the pack in Illustration 3.2), the relationship between the image and the linguistic message is not merely implicit, it is open to various interpretations. It might be intended to relate specifically to a particular element or elements of the verbal message. Being located towards the bottom of the pack face, it might be related by the Gestalt principle of proximity specifically to the third chunk of verbal text. Alternatively, the dark pink circle may be intended to highlight the ‘exposed dentine’ to which reference is made in each of the three chunks of verbal text. Equally, it might act as a contextualizing background for the face as a whole. It is also possible that it is intended to fulfil more than one of these functions. 
However, even where there is ambiguity with regard to the semantic value of an image, and how this relates to the ideational meaning of the verbal elements, the formal qualities of its graphic realization may make a contribution in textual terms. The tooth illustration is realized in the same restricted palette as the rest of the pack. On this face, it acts as a visual bridge between the verbal text realized in two hues of blue and the overlaid pink arcs to the right (cf. Bringhurst, 2002, p. 63). The rings from the Sensodyne logo on the front of the pack are repeated in another ‘visual rhyme’, this time acting as a frame around the illustration (Kress and van Leeuwen, 1996, p. 217). The simplified lines of the illustration also echo the unadorned sans serif type
face used to set the verbal text. In fact, we find a very similar formal consistency between images and verbal text on the UK Head and Shoulders pack in Illustration 3.1. Thus, formal qualities of the typographic realization of the verbal messages are also found in the realization of pictorial messages, adding to the Gestalt effect of grouping through similarity. Finally, in the Sensodyne example, the rather schematic, flattened style of the image may contribute further to the clinical look of the whole pack design.
3.4 Conclusion
While some pack messages may be consistent with Kress and van Leeuwen’s (1996) theory of information value, for example in terms of the distribution of Ideal and Real elements along the vertical axis, it is not difficult to find counterexamples to what are presented as general rules. Moreover, deviation from the theory does not follow the cultural lines suggested by the authors. The systems of salience and framing as proposed by Kress and van Leeuwen (1996, p. 183) are not sufficient to account for the range of typographic mediation of visual verbal messages. Rather, each constitutes a part of what Waller (1987, pp. 99–100) calls modulation and segmentation respectively. Moreover, both modulation and segmentation are often realized simultaneously using the same resources: a heading helps to orient the reader through emphasis and also realizes segmentation by indicating that the following text is related to it. Indeed, recalling Halliday’s (1973, p. 110) metaphor of ‘parallel’ wiring, this simultaneity suggests a strong case for treating modulation and segmentation as belonging to two metafunctions: orientation and organization (cf. Lemke, 1998, p. 95). In the case of pack messages, at least, it is not a matter of choosing between visual cues and linguistic cohesive devices, as has sometimes been suggested in the literature (see, e.g. Bernhardt, 1985, p. 22; Kress and van Leeuwen, 1996, p. 185): cohesion is realized through a combination of both linguistic and graphic resources. In sum, the dimensionality of cohesion, if cohesion there is, must be equal to that of the semiotic modes. Much work remains to be done. In terms of the research issues introduced here, while we know that cohesive resources, both linguistic and graphic, are language-dependent, empirical testing through corpus analysis is needed to find the extent of variation across locales.
Such analysis should also make a significant contribution to the development of a fuller account of the neglected space between language and visual layout. Indeed, the graphic expression of language would seem to offer a sensible starting point for an attempt to refine existing models which seek to extend linguistic frameworks into a general theory of multimodal meaning-making. I hope that the analyses presented here have demonstrated that cohesion does not only cut across rank, if indeed we can apply the principle
of rank multimodally, but it also transcends semiotic modes. Given the role of typography in realizing verbal messages and in enabling verbal and pictorial messages to share common formal features, it might be better to conceive of visual cohesion as operating transmodally rather than intermodally, as has previously been assumed.
Acknowledgements
I would like to express my gratitude to the editors of this volume and to Tuomo Hiippala, who offered many useful comments and suggestions on this chapter.
Notes
1. In a passage much extended in the second edition of Reading Images, Kress and van Leeuwen assert that: ‘we have not imported the theories and methodologies of linguistics directly into the domain of the visual, as has been done by others working in this field. For instance, we do not make a separation of syntax, semantics and pragmatics in the domain of the visual; we do not look for (the analogues of) sentences, clauses, nouns, verbs, and so on, in images’ (2006, p. 19).
2. Distributed by Oxford University Computing Services on behalf of the BNC Consortium.
3. Chinese perhaps offers a particularly good opportunity to test Kress and van Leeuwen’s hypothesis: while in certain contexts Chinese characters may be read vertically from top to bottom, the unmarked information structure in the Chinese clause follows the same pattern as in English, that is, Given^New (Li, 2007, p. 186).
4. In a more recent article, van Leeuwen (2005) points to what he calls ‘a fundamental oversight’ in omitting typography from Reading Images (Kress and van Leeuwen, 1996). Subsequently, van Leeuwen (2006) elaborates his approach to typography as a semiotic mode. However, despite acknowledging the cohesive work done by ‘layout, colour and typography’ (p. 139), he concentrates on letter forms and does not explore the ‘textual meaning potential of typography’ in detail (p. 153). Indeed, he refers the interested reader back to the treatment of the meaning potential of layout in Kress and van Leeuwen (1996).
5. In a personal communication (5 July 2007), O’Toole confirmed that he intended no direct mapping between them. In particular, his Work rank has no direct equivalent in Halliday’s rank scale, in which Text does not feature. As Halliday and Hasan (1976, p. 2) put it, ‘a text does not consist of sentences; it is realized by, or encoded in, sentences’.
References
Baldry, A. (2004) ‘Phase and transition, type and instance: Patterns in media texts seen through a multimodal concordancer’, in K. O’Halloran (ed.) Multimodal Discourse Analysis: Systemic Functional Perspectives (London: Continuum), pp. 83–108.
Baldry, A. and P. J. Thibault (2006) Multimodal Transcription and Text Analysis (London: Equinox).
Bateman, J., J. Delin, and R. Henschel (2004) ‘Multimodality and empiricism’, in E. Ventola, C. Charles and M. Kaltenbacher (eds) Perspectives on Multimodality (Amsterdam: John Benjamins), pp. 65–87.
Bernhardt, S. A. (1985) ‘Text structure and graphic design: The visible design’, in J. D. Benson and W. S. Greaves (eds) Systemic Perspectives on Discourse, Vol. 2 (Norwood, NJ: Ablex), pp. 18–38.
Bringhurst, R. (2002) The Elements of Typographic Style, 2.5 edn (Point Roberts: Hartley and Marks).
Delin, J., J. Bateman and P. Allen (2002) ‘A model of genre in document layout.’ Information Design Journal, 11(1): pp. 54–66.
Halliday, M. A. K. (1973) Explorations in the Functions of Language (London: Arnold).
—— (1978) Language as Social Semiotic (London: Arnold).
Halliday, M. A. K. and C. M. I. M. Matthiessen (2004) An Introduction to Functional Grammar, 3rd edn (London: Arnold).
Halliday, M. A. K. and R. Hasan (1976) Cohesion in English (London: Longman).
Kress, G. and T. van Leeuwen (1996) Reading Images: The Grammar of Visual Design (London: Routledge).
—— (2006) Reading Images: The Grammar of Visual Design, 2nd edn (London: Routledge).
Lemke, J. (1998) ‘Multiplying meaning: Visual and verbal semiotics in scientific text’, in J. R. Martin and R. Veel (eds) Reading Science: Critical and Functional Perspectives on Discourses of Science (London: Routledge), pp. 87–113.
Li, E. S. (2007) A Systemic Functional Grammar of Chinese: A Text-Based Analysis (London: Continuum).
O’Toole, M. (1994) The Language of Displayed Art (London: Leicester University Press).
Thibault, P. J. (2000) ‘The multimodal transcription of a television advertisement: Theory and practice’, in A. Baldry (ed.) Multimodality and Multimediality in the Distance Learning Age (Campobasso: Palladino), pp. 311–85.
Thomas, M. (2007) ‘Querying multimodal annotation: A concordancer for GeM’, in Proceedings of the Linguistic Annotation Workshop at ACL 2007 (Stroudsburg, PA: Association for Computational Linguistics), pp. 57–60.
Twyman, M. (1985) ‘Using pictorial language: A discussion of the dimensions of the problem’, in T. Walker and R. Duffy (eds) Designing Usable Texts (Orlando, FL: Academic Press), pp. 245–312.
van Leeuwen, T. (2005) ‘Typographic meaning.’ Visual Communication, 4(2): pp. 137–43.
—— (2006) ‘Towards a semiotics of typography.’ Information Design Journal and Document Design, 14(2): pp. 139–55.
Waller, R. (1987) The Typographic Contribution to Language. Unpublished PhD thesis, Department of Typography and Graphic Communication, University of Reading.
4 Metonymy in Visual and Audiovisual Discourse Charles Forceville
4.1 Introduction
This chapter discusses pictorial and multimodal equivalents of what in cognitive linguistics (CL) is called ‘metonymy’. CL has long focused almost exclusively on metaphor, which is defined as ‘understanding and experiencing one kind of thing in terms of another’ (Lakoff and Johnson, 1980, p. 5). According to CL, metaphor is central to cognition, since human beings are claimed systematically to understand abstract concepts in terms of concrete phenomena. However, in the past decade, metonymy has gradually begun to attract sustained attention as no less crucial in ruling human cognition. The generally accepted difference between metaphor and metonymy is that the two things combined in metaphor belong to different conceptual domains (e.g. ‘love is a battlefield’), while those in metonymy belong to the same conceptual domain (e.g. ‘count noses’). In short, in metaphor we get A-as-B; in metonymy B-for-A. As in metaphor research, the predominant focus in recent studies on metonymy (Barcelona, 2000; Dirven and Pörings, 2002) is on linguistic manifestations of this trope alone. However, it is important that claims about human thinking are not exclusively made on the basis of verbal expressions of tropes. There is already a robust body of work on pictorial metaphor (Forceville, 1988, 1996, 2005a, 2007a, 2007b; Forceville and Urios-Aparisi, in press; Whittock, 1990; Carroll, 1994, 1996) and some theorization of pictorial oxymoron and pictorial grouping (Teng and Sun, 2002; see also Kennedy, 1982; Teng, 2006). Investigating non-verbal metonymy is a logical next step. The analyses offered in this chapter will both help evaluate CL claims about metonymy and provide insights into regularities of multimodal discourse. As a general background, I will assume that a communicator always has a reason to use a metonym, and to use one metonym rather than another. 
This is commensurate with (i) Sperber and Wilson’s (1995) claim that any act of communication is presumed, by its audience, to be optimized in
terms of relevance, (ii) with Clark’s view of discourse as a ‘joint activity’, in which, crucially, ‘the knowledge, beliefs, and suppositions [the participants] believe they share about the activity’ accumulate incrementally (Clark, 1996, p. 38), (iii) with Tomasello’s insistence that it is the ‘joint attentional frame’ between speaker and listener ‘which sets the context for the reading of the specific communicative intentions behind a word or utterance’ (Tomasello, 2003, p. 89), and finally (iv) with Gibbs’ (1999, pp. 4–5) idea that ‘the recovery of communicative intentions is an essential part of the cognitive processes that operate when we understand human action of any sort’. The chapter is organized in the following way. The goal is to identify certain phenomena occurring in discourses that are not (exclusively) verbal and to propose analysing these as metonyms. This will be done by outlining CL views on metonymy in Section 4.2. Then salient metonyms in two advertising campaigns and two feature films will be analysed in Sections 4.3 and 4.4. After the discussion presented in Section 4.5, the chapter ends by making some general claims about the use and functions of metonymy in multimodal discourse.
4.2 Metonymy in CL
Metonymy ‘allows us to use one entity to stand for another’ (Lakoff and Johnson, 1980, p. 36). Probably the best-known variant of metonymy is synecdoche, in which a part stands for the whole (‘he is a brain’). Other types of metonymy include producer for product, object for user, controller for controlled, institution for people responsible, the place for the institution, the place for the event (ibid., pp. 38–9). Like metaphor, metonymy thus pertains to a relation between two phenomena, but whereas in metaphor the relation straddles what in the given context are to be understood as two different domains, a metonymy ‘involves only one conceptual domain, in that the mapping or connection between two things is within the same domain’ (Gibbs, 1994, p. 322). Similar definitions can be found in Taylor (2002, p. 325), Kövecses (2002, p. 14), and in Wales (2001, p. 252). Another aspect of metonymy needs to be emphasized: a communicator’s choice to use a specific metonym (the source concept) rather than the entity to which it metonymically refers (the target concept) always implies some change in salience or viewpoint:
Metaphor and metonymy do not only involve a mapping of a conceptual network from a source domain onto a target domain, as claimed by cognitive approaches, but also involve a shift in perspective which makes possible the mapping from the one domain to the other by selecting suitable aspects of the source network, and also the source domain, which can be satisfied on the target domain. (Bartsch, 2002, pp. 50–1)
Warren (2002, p. 123) draws attention to this phenomenon when she claims that ‘the essence of metonymy is highlighting’, a statement she later specifies when pointing out that in metonymy ‘the source expression ... forms together with the connector [i.e. shared property] a predication restricting the reference of the target’ (ibid., p. 126). Such channelling of meaning in the direction envisaged by the producer of a metonym, in turn, can only be achieved given a shared ‘body of knowledge and belief encapsulated in an appropriate frame’ (Taylor, 2002, pp. 324–5; see also Gibbs, 1994, p. 339). Finally, Ruiz de Mendoza and Díez Velasco (2002) propose to distinguish between target-in-source metonyms, in which a superordinate domain (the ‘matrix domain’) stands for a subdomain (e.g. ‘pill’ for ‘contraceptive pill’); and source-in-target metonyms, in which a subdomain stands for a matrix domain (e.g. ‘hands’ for ‘sailors’ in ‘all hands on deck’). They propose to discuss the former in terms of ‘domain reduction’ and the latter in terms of ‘domain expansion’ (Ruiz de Mendoza and Díez Velasco, 2002, pp. 495–9). On the basis of this very brief survey of CL views on verbal metonymy, let me provide the following characteristics of metonymy, phrasing them in such a way that they can be applied to non-verbal and multimodal specimens:
1. A metonym consists of a source concept/structure, which via a cue in a communicative mode (language, visuals, music, sound, gesture ...) allows the metonym’s addressee to infer the target concept/structure.
2. Source and target are, in the given context, part of the same conceptual domain.
3. The choice of metonymic source makes salient one or more aspects of the target that otherwise would not, or not as clearly, have been noticeable, and thereby makes accessible the target under a specific perspective. The highlighted aspect often has an evaluative dimension.
Before turning to real-life pictorial and multimodal specimens of metonymy, let us imagine some non-verbal variations on the endlessly cited ‘The ham sandwich is waiting for his check’ (Lakoff and Johnson, 1980, p. 35), as uttered by a waitress, Zoe, to alert a colleague, Luella, that a certain customer wants to pay for his ham sandwich. Zoe’s decision to use a metonym (here: of the source-in-target variety) can be explained by the relevance theory principle of least effort to achieve the desired communicative effect: ‘the ham sandwich’ takes less effort to process than ‘the person who ordered the ham sandwich’ (cf. Sperber and Wilson, 1995, p. 123ff.). The fact that the effort-saving is minimal does not invalidate the principle – and if Zoe and Luella communicate all day long in a busy snack bar, they save a lot of effort using such metonyms. The aspect made salient by this metonym is that the customer is considered in his capacity as paying consumer (PRODUCT for CONSUMER OF PRODUCT).
Metonymy in Visual and Audiovisual Discourse
Now imagine that the customer is a scruffy, semi-drunk beggar who has a little mouth organ on which, to Zoe and Luella’s chagrin, he never stops playing ‘Frère Jacques’. After the man has signalled he wants his cheque, Zoe, instead of saying ‘the ham sandwich wants his check’, might also whistle ‘[four whistled notes] wants his check’ to Luella. In the given situation, the first four notes of ‘Frère Jacques’ uniquely refer to the pertinent customer and thus are a perfectly appropriate metonym for ‘the man who ordered the ham sandwich’. Apart from designating the customer, the metonym also evokes certain connotations. For instance, Zoe may thus convey that they will at last be rid of the annoying man. Or she may want to impress Luella by showing what an original metonym she has invented for the customer. In relevance-theoretical terms, Zoe does what all communicators do: optimize the effect–effort balance. Not only does she get across efficiently to Luella that the customer wants his cheque – the main ‘explicature’ of the message (Sperber and Wilson, 1995, p. 182) – she also hints at something more, say, ‘good riddance!’ or ‘Am I not funny?’ – a weak implicature (ibid., p. 197 et passim). Zoe could of course choose yet other non-verbal metonyms: she could make sure that Luella looks at her and then silently mimic the customer playing his mouth organ and add ‘wants his check’. Opting for the latter might allow her to warn Luella without alerting the customer by making the gesture with her back turned to him. Again, in the given context the gesture uniquely denotes the customer by means of a metonym that could be phrased GESTURE for PERSON PERFORMING THE GESTURE, while highlighting, in Zoe’s grimacing, her negative evaluation of the man’s behaviour. In both cases, the chosen metonyms thus achieve an effect that differs from the effect resulting from deploying another metonym – or none at all. We are now ready to consider, in Sections 4.3 and 4.4, some real-life case studies.
4.3 Metonymy in advertising billboards: Two case studies
Advertisements sketch a problem, need or desire that prospective customers may have for which the product or service advertised provides the solution or fulfilment. In line with this, an advertisement always makes a positive claim for the product or service promoted. These genre conventions are part and parcel of the background knowledge governing the interpretation of advertising messages (Forceville, 1996, p. 104). Against this background, it will be argued that Interpolis and ABN-Amro, the two series of advertising billboards to be discussed here, make salient use of metonymy in a manner that is not, or not exclusively, verbal.
4.3.1 Interpolis
The series of billboards (Holland, summer of 2006) promotes an insurance company, Interpolis. The text Daarom, mannetjes die helpen bij vakantiepech can be translated as, Therefore, guys that help out with vacation misfortunes.
The pay-off, Interpolis, glashelder, translates as Interpolis, crystal clear. The billboard series, including the phrase Therefore, guys ... and the pay-off, ties in with a simultaneously broadcast TV commercial campaign. The recurrent implication of its various instalments was that Interpolis has a crystal clear policy to help out clients, and that they do so quickly and without endless bureaucracy. In one billboard (Illustration 4.1), a car has collided with a slightly out-of-kilter tower. Though there is the humorous suggestion that it was the collision that caused the tower to be unhinged, more important is that we recognize the tower as the Leaning Tower of Pisa in Italy. Because of its fame as a tourist attraction, the Pisa tower is a metonym for Pisa-as-vacation-destination; even for Italy-as-vacation-destination – and in light of the entire campaign, we could even say that it stands for vacation destination tout court. Let us briefly ponder the criteria that the producer of the billboard, looking for an appropriate metonym, had to take into account – in a manner similar to the metonym Zoe had to decide on. Given the right context, there are many entities that can serve as metonymic sources for Italy: Catholic priests, the mafia, Latin lovers, pasta, etc. (Casillo, 1985). But here the producer needed a metonym that (a) is clearly depictable and uniquely
Illustration 4.1 Billboard for Interpolis Insurances, photographed in Haarlem, Holland, summer 2006; original in colour
identifiable, (b) shows something that can be damaged in a way covered by a good insurance and (c) evokes the connotation of vacation. There may be more constraints: the billboard occurs as part of a series, and the series should be recognizable as such. It is clear that the metonym’s source domain each time is a building: two other billboards show a car crashing into Paris’ Eiffel Tower and London’s Big Ben, respectively. What makes this visual metonym in Illustration 4.1 an attractive advertising strategy? In the first place, the visuals of the building evoke the crash scenario in a more humorous manner than a verbal equivalent could do. Surely, ‘should you collide with the Leaning Tower of Pisa on your vacation, there are Interpolis guys who take care of the damage’ is less funny. Moreover, the scenario evoked by such a line would seem less plausible than the pictorial version under scrutiny. This also has to do with the image’s cartoon style, activating innumerable experiences on the audience’s part of improbable scenarios in comics and animated films. Another reason for the greater acceptability of the visual metonym vis-à-vis a verbal rendering may be the greater implicitness that inheres in visual rather than in verbal communication (Forceville, 1996, p. 102). The simplicity of the drawing could, moreover, be seen as echoing the ‘crystal clear’ of the pay-off. Finally, the metonym requires the viewer to solve a little puzzle (How is the picture the problem to which ‘Therefore ...’ is the solution?), which may enhance audience involvement.
4.3.2 ABN-Amro
The second example is part of a series of ads for the (formerly Dutch) ABN-Amro bank. Specimens of the ad appeared both in magazines and as billboards, for instance at Schiphol airport. The ad (Illustration 4.2) features a sheep, and the phrase haute couture, as well as the pay-off line making things possible and the bank’s name and logo. From these ingredients we have to construe a plausible scenario. Background knowledge supposedly possessed by people who have a reason to be at Schiphol, and/or readers of the kind of magazine in which the ad was published supplies the awareness that banks lend money to entrepreneurs, thereby ‘making things possible’. One line of business is creating ‘haute couture’. But what link is there between ‘haute couture’ and the sheep? I propose that this link is of a metonymic nature: the wool of the sheep is the fabric (or one of the fabrics) from which haute couture clothing is made. The sheep, then, is in a synecdochic relationship with the end product. The metonym has been advisedly chosen: the sheep’s wool is at the basis of the haute couture to be created, and wool is moreover a natural fabric. Without the material source from which clothes are made, there will by definition be no clothes at the end of the production process. The notion of basis or origin is no coincidental choice: another ad/billboard from the campaign shows a grape, with the accompanying text grand cru; a third displays a sprouting
Illustration 4.2 Billboard for ABN-Amro, photographed at Schiphol airport, Holland, 2006, original in colour
acorn with the text forest; and a fourth shows a brick and the text skyscraper. So, just as in the Interpolis campaign, the various metonymic sources chosen display a single concept, here that of ‘being-at-the-origin’. The metonym is thus ORIGIN for END PRODUCT. When comparing the two ads, one notable difference between the Interpolis and the ABN-Amro series is that in the first example the metonym is conveyed in purely pictorial terms. That is, the Leaning Tower of Pisa, the Eiffel Tower and Big Ben all serve as metonyms for the respective cities in which these buildings stand, a function the visuals would retain in most other contexts. This is so because these metonyms have acquired symbolic status. In contrast, the pictorial parts in the ABN-Amro series only assume metonymic status because of the link to the textual parts. In different contexts, the sheep, the grape, the acorn and the brick would not be metonyms for haute couture, grand cru, forest and skyscraper, respectively, since the former are not symbols for the latter. A distinction must be made, then, between pictorial metonyms that can be identified sui generis and metonyms that can be identified as such only thanks to additional information provided in the text.
A related difference between the two campaigns is that in the Interpolis campaign, the metonym is purely visual; in the ABN-Amro campaign, it draws on a combination of visuals and language. That is, in the latter both source and target are given – but in different modes. The puzzle to be solved here is figuring out how the pictorial part functions as a metonym for the verbal part. Analogous to the distinction proposed for metaphors (Forceville, 2005a, 2006a, 2007b, 2008), the former type of metonym could be labelled monomodal, the latter multimodal. More specifically, the latter would be a ‘multimodal metonym of the verbo-pictorial variety’. In print and billboard advertising, the pertinent modes are restricted to the pictorial and the verbal, although it is thinkable that gestures can play a role as well (e.g. Cienki, 1998; McNeill, 2005; Müller, 2008). But, as will be discussed in Section 4.4, in moving images other modes, such as non-verbal sound and music, can also function in metonyms.
4.4 Metonymy in art film: Two case studies
In the medium of film, less-than-total representation of an object or human creature is standard practice. Trivially, we could say that any filmic depiction of a referent that exists in the ‘real’ world is a metonymic representation of that referent, if only because film involves a reduction from three to two dimensions. But in order to preserve metonymy as a concept with explanatory value, I will assume that it is possible to cinematically represent an object or creature or event both in toto and metonymically. In cinematography, there is a limited number of standard framings to represent something less than totally: extreme close-up, close-up, medium close-up, medium shot, medium long shot (or plan américain). The two other conventional framings are the long shot and the extreme long shot, but since in these latter the pertinent human bodies are depicted in their entirety (namely: from a long and very long distance, respectively), these are here not considered as metonyms. Significantly, to explain the various framings, the authors of a canonical film analysis textbook announce, ‘we’ll use the standard measure: the human body’ (Bordwell and Thompson, 2008, p. 191). Framings of other objects are labelled in analogy to those pertaining to the human body. Since medium close-up, medium shot and plan américain have become standard ways of portraying the human body, for present purposes the (extreme) close-up is of greatest interest, since it allows for most freedom. The most common cinematic close-up is that of the human face. This preference for the human face makes sense: it is a better signaller of identity and emotion than any other body part, and therefore very suitable for depicting people who are talking and for ‘reaction shots’ – shots that show the facial expression of a person responding to emotional events in the scene. In addition, it is usually informative for the viewer to be able to correlate
a face with a voice via lip-sync depiction, since this helps to assess who is speaking. But apart from a facial close-up, there are other, relatively standard ways of a cinematic synecdoche: hands, too, are privileged parts of the human body, and thus may be seen as deserving a close-up. One reason for this is that, because of the flexibility of wrists and fingers, hands are capable of many more actions than other body parts. It is no coincidence, for instance, that sign languages draw primarily on hand gestures and that parts of the hand are often used figuratively in linguistic expressions (see Yu, 2000). Moreover, hand and arm positions are important for the recognition of emotional states (Forceville, 2005b; McCloud, 2006, pp. 112–13). Legs, though capable of a far less wide range of actions than hands, qualify for cinematic close-ups as well, since they typically convey the act of moving – and a character’s movement to or from a certain place is often of great narrative import (Forceville, 2006b; Forceville and Jeulink, 2007; Johnson, 2007). Thus, when in a film scene a character appears whose identity is for narrative purposes not (yet) to be revealed, clearly that person’s face should be hidden. One way to disguise the character’s identity is to film other body parts. These can be legs or feet, for instance if it needs to be stressed that the character moves in forbidden territory, such as a burglar or a spy; or hands, for instance if the character is shown to perform some secret or illegal action, such as stealing or manipulating an object, or stabbing or shooting someone. Hitherto, cinematic metonyms have been discussed in terms of framing. However, in the post-silent film era, a target referent can be cued by a sound as well. If a sound occurs without concomitant visual depiction of the phenomenon in the story world from which it emanates, this is called ‘off-screen diegetic sound’ (Bordwell and Thompson, 2008, pp. 278–9). 
By and large, a clearly recognizable off-screen diegetic sound alerts the audience (and often a character in the story as well) to what happens in the story-world outside the place framed by the camera. Many such sounds have become standard metonyms for non-visualized events: the sound of a closing door indicates someone has just entered or left the room; creaking floorboards suggest there is an (unwelcomed) visitor in the house, etc. There can be good narrative reasons to opt for a sonic metonym: a film-maker may want to show visually a character’s response to the sound (i.e. in a reaction shot) simultaneous with that sound: anxiety or happiness at hearing the closing door; fear or excitement at hearing the creaking floorboards. But sounds can also be efficient ways of suggesting something going on elsewhere simultaneously with what is being shown on-screen. And inasmuch as the audience has to infer the pertinent events from the sonic metonyms, these latter can be regarded as attractive little puzzles for the audience to solve. Once a sound or musical theme has been associated with an idea, event or character (as in the ‘Frère Jacques’ example discussed in Section 4.2), that sound or tune can be used as a metonym for the idea, event or character.
For instance, in Fellini’s Casanova (Italy 1976) the metonymic power of the music box theme that comes to be associated with the eponymous hero resides in its ability to invoke Casanova, even when not visibly present. I will now turn to the two cinematic case studies, in which certain phenomena will be discussed in terms of metonymy, bearing in mind the observations and considerations provided above with reference to the medium of film.
4.4.1 La Passion de Jeanne d’Arc (Carl Dreyer, France 1928, b/w)
Dreyer’s Jeanne d’Arc, one of the classics from the silent film era, tells the story of the Church’s trial of Joan of Arc, who is accused of blasphemy on account of her declaration that God has instructed her how to save France. She resists all pressure to retract this claim and is eventually martyred at the stake. The film is characterized by a large quantity of facial close-ups. Such standard close-ups show the person talking (the speech’s contents are rendered via intertitles or have to be inferred from narrative context) or constitute reaction shots, registering the mood or emotion in which events are absorbed: anger, derision, pity, fear, exaltation, etc. However, there are a number of more unusual close-ups or extreme close-ups – here considered as visual metonyms – that in view of their function deserve separate consideration. There are several extreme close-ups of a priest’s mouth, talking (Illustration 4.3). Here we have the metonym MOUTH for (TALKING) PERSON. It is relevant that language is associated with the Church in the film: Joan is often silent – and moreover illiterate. Thus, the metonyms of the extreme close-up of mouths emphasize the decontextualized, depersonalized ‘formality’ of church laws; the letter of the law as opposed to its spirit. This
Illustration 4.3 Extreme close-up of a priest’s mouth shouting in Joan’s ear (a film still from La Passion de Jeanne d’Arc, Carl Dreyer)
metonymic framing makes salient how Joan, and the audience with her, experiences the priest’s words. Another close-up, occurring twice, is that of Joan’s chained legs. The metonym (CHAINED) LEGS for JOAN-ON-TRIAL highlights that Joan, arrested, is fully at the mercy of the Church, which has complete control over her, at least over her body. A third recurring close-up is that of a hand holding a pen, or writing. In one case, this is a monk writing down what is said during the trial, a metonym that, again, emphasizes the idea of the letter of the law. Several other times, the close-up of a pen-holding hand is shown at moments when the judges try to pressure Joan into signing a document testifying that she recants her ‘blasphemy’ (Illustration 4.4). A close-up of hands features in another shot, namely when the presiding priest, confronted with a protest by Joan that a certain question is beside the point, turns to his fellow priests and asks them to indicate who supports his conviction that it is relevant. The row of metonymic hands here not only demonstrates the priests’ agreement; it also indicates the conformity of a de-individualized group to church policy. Other shots show soldiers’ hands putting a ‘crown’ on Joan’s head and giving her a ‘sceptre’ in a mockery of Christ’s passion; and a priest’s hand temptingly holding up the host for Joan, who fervently wants to go to mass, but is not allowed to because she shows no remorse. In all these examples, because of the hostile actions they display toward Joan, the hands acquire negative connotations. A series of metonymic close-ups that do not pertain to the human body are of torture instruments, after Joan is taken into the torture chamber, and later of the stake where she will be burned. These are shots from Joan’s point of view, here suggesting that in her panic and fear she focuses on the gruesome, painful details.
Illustration 4.4 A priest puts a pen in Joan’s hand, urging her to sign a declaration that she recants (film still from La Passion de Jeanne d’Arc, Carl Dreyer)
4.4.2 Un Condamné à Mort s’est Échappé/A Man Escaped (Robert Bresson, France 1956, black and white)
Bresson’s sober film tells the story of a French resistance fighter, Fontaine, who is arrested and imprisoned by the Nazis during World War II. He has only one goal: to escape. Besides facial close-ups, hands again provide salient metonyms: handcuffed; reaching for a car door handle (Illustration 4.5); writing a letter; hiding a letter. But above all, Fontaine’s hands metonymically reveal activities that are to aid his escape from prison: using a pin to open his handcuffs (Illustration 4.6), sharpening a spoon into a chisel, whittling away at the wooden prison door, making a rope of clothes and bed springs, bending a piece of iron into a hook, etc. In the latter cases, the metonym could be labelled HANDS for ESCAPE-PLANNING FONTAINE. Occasionally the close-ups are of German soldiers’ hands – waking up Fontaine, putting food in his cell, picking up a stick to beat up a prisoner. Here the metonym is HANDS for PRISONER-GUARDING NAZIS. Contrasting these series of HAND metonyms highlights their different valuations: in the case of Fontaine, the hands make salient Fontaine’s urge to do something, and his inventiveness; in the case of the Nazis they emphasize their anonymity, their belonging – as in the case of the priests in Jeanne d’Arc – to a suppressive organization. But Bresson’s film also draws strongly on the metonymic role of sound. Fontaine (and the film audience with him) hears clocks striking the hour, schoolchildren’s distant chatter, the bells of trams passing outside the prison walls, machine gun shots indicating another execution, the moaning of a fellow prisoner being beaten up, the knocking from a neighbouring cell to establish contact, the turning of a key to open or lock a cell,
Illustration 4.5 Fontaine fiddles with the car door handle, considering escape (a film still from Un Condamné à Mort s’est Échappé, Robert Bresson)
Illustration 4.6 Fontaine opens his handcuffs with a pin (a film still from Un Condamné à Mort s’est Échappé, Robert Bresson)
the guards’ whistle to regiment prisoners’ walking down the corridor, the sound of a passing train that allows Fontaine and Jost, his fellow escapee, to cover up their own noise when walking on gravel, the pacing of a guard, the creaky sound of another guard’s bicycle. Both the pictorial and the sonic metonyms are used in a highly functional manner: they all refer to objects or events that have a direct bearing on Fontaine’s imprisonment and his plan to escape (for more discussion of both films, see Bordwell and Thompson, 2008).
4.5 Discussion
I will now reconsider the case studies in light of the criteria for metonymy formulated in Section 4.2.
1. A metonym consists of a source concept/structure, which via a cue in a certain mode (language, visuals, music, sound, gesture ...) allows the metonym’s addressee to infer the target referent.
This is borne out by all of the examples discussed. There are also differences: in the Interpolis case, the target of the metonym must be inferred from extra-textual knowledge, whereas in the ABN-Amro campaign both source and target are provided within the text (if either of them had been omitted, there would have been no metonym). What is unusual in the latter campaign is that the metonyms are multimodal. Note that in the films a referent that is at one stage cued metonymically may at a later or earlier stage be conveyed in its entirety. The sonic metonyms in Un Condamné à Mort sometimes do reveal, at some stage or other, the visual target to which they refer (clinking keys, ringing tram
bells, cranking bicycle, a guard’s footsteps), and sometimes they do not (we never see the chatting children, the clock striking the hours, the train). But at a given moment, each of these sounds functions metonymically. 2. Source and target are, in the given context, part of the same conceptual domain. The examples discussed fulfil this criterion – indeed, that is why they were chosen in the first place, since it is a defining criterion for metonymy. What may require some discussion is the qualification ‘in the given context’. ‘Domain’, after all, is a concept with fuzzy borders. I propose that in each case in which there is a contiguous relationship between two entities, there is the potential to exploit this relationship metonymically. But while in many cases the target of a metonym is inferable in conventional fashion from its source, as in the case of a character’s body being inferable from any of its body parts, the metonymic relation is not always as predictable. Often, understanding the context is crucial for construing the metonymic relationship. Without the narrative context of Un Condamné à Mort, we might not recognize the target referents of some of the sonic metonyms (like Fontaine and Jost we cannot at first figure out what the cranking sound is they hear during their escape, until it is revealed to be the guard’s bicycle). Moreover, the metonymic target may not be a simple concept: A single metonymic source can be used to refer to more than one metonymic target within a single text (Brdar-Szabó and Brdar, 2007; Xianglan, 2007). The close-up of Joan’s pen-holding hand obviously refers metonymically to Joan, but no less to writing, and signing. That is, a metonymic source may cue a target that, in a given context, is difficult to label precisely; and/or cue more than one target referent simultaneously. 3. The use of the metonymic source makes salient, perspectivizes, and/or evaluates one or more aspects of the target. 
Bartsch’s (2002) point is supported by the cases examined here. In the Interpolis campaign, the chain of metonyms that can be summed up as BUILDING for VACATION DESTINATION highlights the notion of something you can crash into, as opposed to other possible vacation misfortunes that can befall you, such as getting ill, robbed or flooded. The metonymic sources in the ABN-Amro series systematically underline what is at the base, or origin, of the targets. The metonymic close-ups of faces in the two films in standard fashion focus on the (lack of) emotion of people speaking or responding to events, but the more unusual metonyms of body parts favour different connotations: Joan’s chained legs stress her status as prisoner, while the close-ups of hands, particularly those where HANDS stand for WRITING, make salient the power of the Church. In the context of the entire film, the metonymic chain goes further: WRITING is in turn a metonym for the CHURCH’S LEGALISTIC AND SUPPRESSIVE ARTICLES OF FAITH, to be contrasted with Joan’s deeply devout belief in God (her illiteracy is telling). In Bresson’s film, Fontaine’s hands connote his escape-planning activities; those of the Germans connote their anonymous power. The metonymic sounds in Un Condamné à Mort, too, are important for their
evaluative dimensions: almost all of them refer to phenomena that either aid or impede Fontaine’s escape. The metonyms in the two films thus carry much narrative weight. What is highlighted is the protagonist’s and audience’s experience of the metonymic targets through the choice of metonymic sources – and all these entail hope and fear.
4.6 Conclusions
Finally, I propose the following general claims for further critical scrutiny and detailed analysis in both monomodal and multimodal discourse:
1. Studying non-verbal and multimodal metonyms (and other tropes) helps illuminate their dynamic and highly contextualized character more than studying purely verbal specimens. Having to make verbally explicit (for scholarly purposes such as writing the present chapter) non-verbal or multimodal tropes exposes the artificial character of this activity more than having to make verbally explicit metonyms that originally already were of a verbal nature. While the metaphoric A IS B and the metonymic B FOR A are convenient shorthand descriptions, we should never forget that they are ‘impoverished’ formulas requiring ‘enrichment’ (Sperber and Wilson, 1995, p. 174 et passim) before it can be assessed how these tropes function in context. Müller and Cienki’s (in press) warning about metaphors that ‘the formula of TARGET IS SOURCE problematically reifies the two domains as static entities’ is no less pertinent to metonyms (see also Forceville, 2006a; Brdar-Szabó and Brdar, 2007). Any interpretation of a metaphor or metonym boils down to considering which predicates and/or evaluations cued or evoked by the source are mappable to the target, and the dry TARGET IS SOURCE (in a metaphor) or SOURCE FOR TARGET (in a metonym) by its very form does not encourage such considerations. A related dimension that runs the risk of being underestimated is that metonyms, like metaphors, can have a very short-lived, ephemeral effect on the discourses in which they occur, or by contrast exemplify elements that are profoundly constitutive of the meanings in these discourses. A third dynamic aspect of metonymy that may get short shrift because of the B FOR A formula is the audience’s conditions of access.
Somebody who has seen more than one billboard in the two series discussed, or has seen the entire films discussed, may interpret the metonyms more richly than somebody who has seen only one billboard, or only fragments of the films (cf. Gibbs, 1993). 2. The stylistic form in which a metonym occurs affects its construal and interpretation. Aspects of the specific form of the metonymic source can add to, or intensify the connotations made salient in the target. The cartoon character of the buildings in the Interpolis campaign contributes to the idea that we should perhaps take the scenario of the car crashing into a famous building with a grain of salt. The brick chosen for the ‘skyscraper’
is of a solid, old-fashioned kind, which may be a connotation transferred to the to-be-built skyscraper. The metonymic close-ups in the two black and white French films discussed here are sober, but one could imagine that in other films choices in terms of colouring or (distorting) lenses, add relevant aspects to the metonymic source – and via mapping to its target. The same holds for metonymic sounds, such as those in Un Condamné à Mort. Loudness, timbre and pitch of sounds can all play a role in activating connotations that are attributable to the target. A second dimension of form relates to how a consistent style can turn a metonym into a motif. In all four cases discussed here, the metonyms have been advisedly chosen, not just for the content they make accessible, but also to create consistency across different billboards within an advertising campaign and across scenes in a film, respectively. Indeed, this argument works two ways: only because more than one instance of the ‘same’ metonym is used, we begin to understand the full connotative value of the metonym: The Leaning Tower of Pisa is a famous building one could crash into; the sheep is an origin-part for the whole of ‘haute couture’; the hands in Jeanne d’Arc connote the repression of the Church; and in Un Condamné à Mort, the hands suggest, among other things, Fontaine’s escape plans, while the sounds in that last film acquire a value on the imprisonment–freedom continuum.1 3. In film, standard ways of framing the human body – and by extension other objects – can be usefully conceptualized in terms of conventional metonymy. If this is accepted, metonymy provides a supra-medial tool that can help build bridges between the study of visual design, film and CL. In addition, because of the way standard film frames take their cue from the human body, acceptance of this proposal yields further support for the CL notion of ‘embodied thinking’. 4. 
Non-verbal metonyms are of the source-in-target rather than the target-in-source type (see Ruiz de Mendoza and Díez Velasco, 2002). The advertising campaigns fit this model: in the Interpolis series the buildings metonymically refer to the matrix domains ‘cities’ or even ‘(vacation) countries’, and in the ABN-Amro series the visual element is part of the matrix domain cued by the verbal target. In the film examples, the non-verbal metonyms are always subdomains that refer to a more inclusive matrix domain. This may well be typical of multimodal discourse. 5. In multimodal discourse, a metonym can be cued in various modes. In this chapter, the focus has been on visuals, sound and language, but there is no reason to assume that metonyms cannot, just as metaphors, draw on music, gesture, smell and touch (see Forceville, 2006a, 2008). 6. If a metonymic source can be detached from its discursive context without losing its connection with its target referent, it moves into the direction of being a symbol. Just as the Cross is a symbol of Christ’s suffering in most contexts, so the Leaning Tower of Pisa is a symbol of Pisa, and by extension of touristic Italy. By contrast, a brick is not a symbol for a skyscraper. The
fact that in both metonymy and symbolism we say that one thing ‘stands for’ something else already suggests that symbols are metonymically motivated. This issue deserves further research by cognition as well as aesthetics scholars.
Note
1. Incidentally, there is no reason to suppose that these stylistic dimensions could not play a role in purely verbal metonymy as well: formal elements such as alliteration, assonance, rhyme and rhythm are arguably verbal equivalents of extreme close-ups, camera angle, colouring, etc. And a novelist or a poet could consistently use the same (kind of) metonym to create a narrative motif in her art.
References
Barcelona, A. (ed.) (2000) Metaphor and Metonymy at the Crossroads: A Cognitive Perspective (Berlin and New York: Mouton de Gruyter).
Bartsch, R. (2002) ‘Generating polysemy: Metaphor and metonymy’, in R. Dirven and R. Pörings (eds), pp. 49–74.
Bordwell, D. and K. Thompson (2008) Film Art: An Introduction, 8th edn (Boston, etc.: McGraw-Hill).
Brdar-Szabó, R. and M. Brdar (2007) ‘Metonymic zooming in and out within a functional domain in the process of meaning construction, exemplified on CAPITAL-FOR-GOVERNMENT metonymies’, paper presented at the International Cognitive Linguistics Conference, Jagiellonian University, Kraków, Poland, 15–20 July.
Carroll, N. (1994) ‘Visual metaphor’, in J. Hintikka (ed.) Aspects of Metaphor (Dordrecht: Kluwer), pp. 189–218.
—— (1996) ‘A note on film metaphor’, in N. Carroll (ed.) Theorizing the Moving Image (Cambridge: Cambridge University Press), pp. 212–23.
Casillo, R. (1985) ‘Dirty gondola: The image of Italy in American advertisements.’ Word & Image, 1: pp. 330–50.
Cienki, A. (1998) ‘Metaphoric gestures and some of their relations to verbal metaphoric expressions’, in J. P. Koenig (ed.) Discourse and Cognition: Bridging the Gap (Stanford, CA: CSLI), pp. 189–204.
Clark, H. H. (1996) Using Language (Cambridge: Cambridge University Press).
Dirven, R. and R. Pörings (eds) (2002) Metaphor and Metonymy in Comparison and Contrast (Berlin and New York: Mouton de Gruyter).
Forceville, C. (1988) ‘The case for pictorial metaphor: René Magritte and other Surrealists’, in A. Erjavec (ed.) Vestnik IMS 9 (Ljubljana: Inštitut za Marksistične Študije), pp. 150–60.
—— (1996) Pictorial Metaphor in Advertising (London and New York: Routledge).
—— (2005a) ‘Cognitive linguistics and multimodal metaphor’, in K. Sachs-Hombach (ed.) Bildwissenschaft: Zwischen Reflektion und Anwendung (Cologne: Von Halem), pp. 264–84.
—— (2005b) ‘Visual representations of the idealized cognitive model of anger in the Asterix album La Zizanie.’ Journal of Pragmatics, 37(1): pp. 69–88.
—— (2006a) ‘Non-verbal and multimodal metaphor in a cognitivist framework: Agendas for research’, in G. Kristiansen, M. Achard, R. Dirven and F. Ruiz de Mendoza Ibáñez (eds) Cognitive Linguistics: Current Applications and Future Perspectives (Berlin and New York: Mouton de Gruyter), pp. 379–402.
—— (2006b) ‘The source-path-goal schema in the autobiographical journey documentary: McElwee, Van der Keuken, Cole.’ The New Review of Film and Television Studies, 4(3): pp. 241–61.
—— (2007a) ‘Multimodal metaphor in ten Dutch TV commercials.’ Public Journal of Semiotics, 1(1): pp. 19–51.
—— (2007b) ‘Pictorial and multimodal metaphor in commercials’, in E. F. McQuarrie and B. J. Phillips (eds) Go Figure! New Directions in Advertising Rhetoric (Armonk, NY: ME Sharpe), pp. 272–310.
—— (2008) ‘Metaphor in pictures and multimodal representations’, in R. W. Gibbs, Jr. (ed.) The Cambridge Handbook of Metaphor and Thought (Cambridge: Cambridge University Press), pp. 462–82.
Forceville, C. and M. Jeulink (2007) ‘The source-path-goal schema in animation film’, paper presented at the International Cognitive Linguistics Conference, Jagiellonian University, Kraków, Poland, 15–20 July.
Forceville, C. and E. Urios-Aparisi (eds) (in press) Multimodal Metaphor (Berlin and New York: Mouton de Gruyter).
Gibbs, Jr, R. W. (1993) ‘Process and products in making sense of tropes’, in A. Ortony (ed.) Metaphor and Thought, 2nd edn (Cambridge: Cambridge University Press), pp. 252–76.
—— (1994) The Poetics of Mind: Figurative Thought, Language, and Understanding (Cambridge: Cambridge University Press).
—— (1999) Intentions in the Experience of Meaning (Cambridge: Cambridge University Press).
Johnson, M. (2007) The Meaning of the Body: Aesthetics of Human Understanding (Chicago: University of Chicago Press).
Kennedy, J. M. (1982) ‘Metaphor in pictures.’ Perception, 11(5): pp. 589–605.
Kövecses, Z. (2002) Metaphor: A Practical Introduction (Oxford: Oxford University Press).
Lakoff, G. and M. Johnson (1980) Metaphors We Live By (Chicago: University of Chicago Press).
McCloud, S. (2006) Making Comics: Storytelling Secrets of Comics, Manga and Graphic Novels (New York: HarperCollins).
McNeill, D. (2005) Gesture and Thought (Chicago: University of Chicago Press).
Müller, C. (2008) Metaphors Dead and Alive, Sleeping and Waking: A Dynamic View (Chicago: University of Chicago Press).
Müller, C. and A. Cienki (in press) ‘Words, gestures and beyond: Forms of multimodal metaphor in the use of spoken language’, in C. Forceville and E. Urios-Aparisi (eds) Multimodal Metaphor (Berlin and New York: Mouton de Gruyter), pp. 291–323.
Ruiz de Mendoza Ibáñez, F. J. and Díez Velasco, O. I. (2002) ‘Patterns of conceptual interaction’, in R. Dirven and R. Pörings (eds), pp. 489–532.
Sperber, D. and D. Wilson (1995) Relevance: Communication and Cognition, 2nd edn (Oxford: Blackwell).
Taylor, J. R. (2002) ‘Category extension by metonymy and metaphor’, in R. Dirven and R. Pörings (eds), pp. 323–34.
Teng, N. Y. (2006) ‘Metaphor and coupling: An embodied, action-oriented perspective.’ Metaphor and Symbol, 21(2): pp. 67–85.
Teng, N. Y. and S. Sun (2002) ‘Grouping, simile, and oxymoron in pictures: A design-based cognitive approach.’ Metaphor and Symbol, 17(4): pp. 295–316.
Tomasello, M. (2003) Constructing a Language: A Usage-Based Theory of Language Acquisition (Cambridge, MA and London: Harvard University Press).
Wales, K. (2001) A Dictionary of Stylistics, 2nd edn (Harlow: Pearson Education).
Warren, A. (2002) ‘An alternative account of the interpretation of referential metonymy and metaphor’, in R. Dirven and R. Pörings (eds), pp. 113–30.
Whittock, T. (1990) Metaphor and Film (Cambridge: Cambridge University Press).
Xianglan, C. (2007) ‘Metonymy and pragmatic inference’, paper presented at the International Cognitive Linguistics Conference, Jagiellonian University, Kraków, Poland, 15–20 July.
Yu, N. (2000) ‘Figurative uses of finger and palm in Chinese and English.’ Metaphor and Symbol, 15(3): pp. 159–75.
5 What Makes Us Laugh? Verbo-Visual Humour in Newspaper Cartoons
Elisabeth El Refaie
5.1 Introduction
There is now a vast literature on the subject of humour, incorporating contributions from the fields of philosophy, linguistics, sociology, anthropology and psychology. The majority of these studies have focused on verbal forms of humour, jokes in particular, and they have neglected visual and multimodal humorous genres like cartoons and comic strips. In spite of a small number of interesting studies about the perceptual, cognitive and psychological basis for grasping and appreciating visual cartoon humour (Herzog and Hager, 1995; Smith, 1996; Forceville, 2005; Marín-Arrese, 2004), there is thus still very little knowledge about the specific nature of verbo-visual forms of humour. The aim of this chapter is to begin to fill this gap in the literature by proposing a framework for the analysis of multimodal humour of a verbo-visual variety. This framework is illustrated by applying it to the examples of two British newspaper cartoons about the 2004 US Presidential elections (Illustrations 5.1 and 5.2), published in the Daily Mail, a right-wing tabloid newspaper, and the Independent, a more highbrow daily newspaper aimed at a liberal readership.1 I begin by arguing that a humorous text is by no means a self-evident analytical category. The concepts of genre and intentionality, I suggest, can help us address this taxing issue. The second part outlines the three main types of humour theories, and examines their usefulness with regard to the analysis of verbo-visual texts like the two cartoons under consideration. While all these theories are able to explain some aspects of multimodal humour, taken in isolation, none of them is entirely satisfactory. Building on the work by Morreall (1983), I propose an inclusive theory of humour that considers questions of content, function and form, as well as taking the important pragmatic dimension of humour into account. In the final section, I examine the formal features of multisemiotic humour in more detail.
I argue that because of historically developed conventions and the different material qualities of the two modes, words and images tend to fulfil different functions in the creation of a humorous newspaper cartoon.
Illustration 5.1 Mac (Stan McMurty), Daily Mail, 5 November 2004, p. 17
Illustration 5.2 Peter Schrank, Independent, 15 October 2004, p. 38
5.2 Defining humour
Any researcher exploring the nature of humour is faced with the difficult question of how to define the object of his or her inquiry and, related to this, how to choose appropriate data. Although humour is found in all cultures and thus seems to be innately human, there are clearly enormous individual differences regarding what people perceive to be funny. Humour appreciation has apparently changed radically over time and differs from one culture to another (Critchley, 2002, p. 67), and it may also be influenced by such variables as age, gender, education and class, as well as various psychological factors (Herzog and Hager, 1995; Kotthoff, 2006). The two examples used in this chapter (Illustrations 5.1 and 5.2), for instance, may not strike all readers as humorous, and may even be considered by some to be too controversial, upsetting or offensive to be at all funny. As Powell (1988) points out, defining humour is a little like attempting to get a handle on the concept of deviance: it is the act of labelling itself that determines whether something is humorous or not. Faced with this daunting task, many authors dodge the issue and rely on their own intuition to decide which humorous materials to use as data. This is highly problematic, not least because it allows researchers to select examples that fit their preconceived theories and to discard those that might contradict them. Davies (2002), for instance, who has conducted several studies of ethnic humour and who strenuously denies any link between the telling of such jokes and any actual prejudice or discrimination, has consistently excluded unambiguously racist jokes from his discussion. Empirical studies may go some way towards providing an answer to the question of whether or not a particular text is considered funny by a particular audience (Wiseman, 2002; El Refaie, forthcoming, 2010), but these also raise important methodological issues.
Although laughter can represent a directly observable physiological response to individual perceptions of humour, it is important to remember that humour and laughter are by no means co-extensive (Attardo, 2003, p. 1288). Asking people about their responses to particular texts may also not be the most reliable method of determining humour perceptions. Having a sense of humour is now generally seen as eminently desirable, and respondents may, for instance, be reluctant to express any views that might be interpreted as showing a lack of this apparently self-evident social virtue. More importantly, comic meaning depends upon ‘the settings and contexts in which a joke is told, the competence of its delivery, the identity of the teller, and the recipients of the joke’ (Pickering and Lockyer, 2005, p. 9). When a joke is taken out of its natural context and used in an experiment or interview situation, it may lose much of its potential humorous appeal.
For all these reasons, it seems futile to try and define a humorous text in terms of how it is perceived. Following Billig (2005), my definition of humour involves trying to determine intentionality at the point of creation or delivery: a humorous text is thus one that is intended to be humorous by its author or by the person who (re-)uses the text in a particular context. This allows the analyst to draw on the concept of genre and, in some cases, to question the creators or users of a humorous text about their intentions. In the case of some genres, such as jokes for instance, the intention to be funny is central to their purpose and determines to a large extent their style and their uses in human interaction (Howitt and Owusu-Bempah, 2005). Unfortunately, as the example of newspaper cartoons clearly demonstrates, genre is not always an entirely reliable indicator of humorous intent. A newspaper cartoon is generally a single-panel drawing which is sometimes accompanied by a caption and/or other verbal elements. In Britain, it is placed on the editorial or comments pages of a newspaper and will usually employ caricature, a particular style of drawing which simplifies some features and exaggerates others (Dines-Levy and Smith, 1988, p. 241). Cartoons published in tabloids like the Daily Mail generally refer to a topical theme, but tend to do so in a very light-hearted and humorous way. Stan McMurty, the cartoonist of Illustration 5.1, for instance, told me that 90% of his cartoons are just meant to be funny and only 10% ‘are trying to be hard hitting political cartoons’ (Telephone interview, 30 June 2005). In the more serious daily newspapers such as the Independent, on the other hand, the cartoons almost always refer to current socio-political issues or events, and they are often intended to be striking and thought-provoking rather than straightforwardly humorous.2 The cartoons I have chosen for this study were definitely intended by their creators to be humorous.
McMurty described the purpose behind his cartoon as ‘really just to have a bit of a laugh’, and Peter Schrank also said he intended his cartoon to be both satirical and humorous (Telephone interview, 19 July 2005). For this reason, I believe my discussion of the cartoons as examples of humorous verbo-visual texts to be justified, although this does not necessarily mean that I found them particularly funny or that I expect the readers of this chapter to laugh at them.
5.3 Approaches to humour
Research into humour is often categorized into three main groups (Billig, 2005; Morreall, 1983): superiority theories, release theories and incongruity theories. Broadly speaking, the first two types of theories are about the content of jokes and the psychological or social functions they fulfil, while incongruity theories focus on the specific form or structure that characterizes humorous texts. In the following discussion, I examine the degree to which the various approaches are useful in explaining cartoon humour.
5.3.1 Superiority theories
Superiority or degradation theory, which originated in ancient Greek philosophy and whose most famous proponent was the seventeenth-century English philosopher Thomas Hobbes, is essentially a theory of mockery. Laughter is explained as the response to the perception of the weakness, deformity or misfortune of others, which makes us feel a pleasant, laughter-inducing sense of disparagement and superiority. At the turn of the twentieth century, French philosopher Henri Bergson (1911 [1900]) extended this theory to account for the beneficial effects of humour as a social corrective. His main argument was that laughter discourages the rigid or mechanical – and thus socially maladjusted – behaviour of individuals. In more recent adaptations of superiority theory, our humorous response to other people’s stupidity or ill fortune is explained as a result of natural evolutionary competition (Gruner, 1997), and as a way of exercising social control through the threat of ridicule and the resulting feelings of embarrassment (Billig, 2005). We can find many examples of pictorial humour that seem to validate superiority theory, such as the many political cartoons that portray political leaders as various less than complimentary characters, objects and animals (Edwards, 1997). The cartoon in Illustration 5.1 can also be explained in these terms, if we assume that the gay couple’s rather ridiculous appearance and their misfortune is the main source of humour. Similarly, viewers of Peter Schrank’s cartoon in Illustration 5.2 might be amused by the disdain they are encouraged to feel for Americans in general, represented here by the ugly and obese Uncle Sam, and for George Bush in particular, who is drawn as a tiny little cowboy desperately trying to flatter Uncle Sam. And yet, not all cartoons involve finding some weakness or absurdity in another person, and not all portrayals of the misfortunes of others are necessarily funny.
5.3.2 Release theories
The second major theoretical explanation for human laughter, release or relief theory, emerged in the nineteenth century and is associated with writers such as Alexander Bain and Herbert Spencer, who regarded laughter as the release of pent-up nervous energy. However, release theory is best known in the version put forward by Sigmund Freud (1991 [1905]), which has since engendered an extensive psychoanalytic literature on humour (Barron, 1999). Freud proposed that jokes provide us with a release from the constant need to repress our natural aggressive and sexual desires, and are thus experienced as pleasurable. Like dreams, jokes come from the unconscious, but are first transformed into less explicit forms, thereby providing a socially acceptable way of breaking taboos. Although Freud did not claim that all jokes function like this, he believed that those that do, the ‘tendentious’ jokes, generally provoke more laughter. Unsurprisingly, we tend to delude ourselves about the reasons for our amusement.
This theory offers a convincing explanation for why jokes are most abundant in areas where there are social inhibitions, thus providing a mirror of a culture’s sense of morality. In the West, for instance, sexual humour is still an extremely common feature of many humorous greeting cards and cartoons in men’s magazines (Dines-Levy and Smith, 1988). It is also unfortunately not difficult to find examples of racist humour, which might indicate that racism is gradually taking over from sexuality as the greatest taboo in many Western societies (Billig, 2001). Unlike superiority theory, which is based on the idea that we all share the same humorous response to others’ misfortunes, a psychoanalytic approach would predict different responses from individual readers, depending on their psychological predispositions (Herzog and Hager, 1995). McMurty’s cartoon, Illustration 5.1, could certainly be interpreted in these terms, since it refers, in a suitably indirect way, to the taboo subjects of homosexuality and cross-dressing. In this case, a reader’s amusement at the depiction of a bearded bride could be interpreted as the release of repressed sexual desires, coupled perhaps with projected feelings of aggression towards gays. The cartoon in Illustration 5.2 could be described as a less ‘tendentious’ joke in Freudian terms, since it is quite explicit in its message and does not seem to refer to any strict taboos, apart perhaps from the social rules surrounding the excessive consumption of food.
5.3.3 Incongruity theories
Although superiority and release theories are very useful in helping to explain the abundance of aggressive jokes and those relating to social taboos, there are many things we laugh at that seem to be relatively independent of their subject matter.
It is this type of humour that incongruity theorists are particularly good at explaining, although in fact they often aim to describe the underlying structure that characterizes all types of (mostly verbal) humour. Incongruity theory also has a long and illustrious history, going back to German philosophers Kant and Schopenhauer, but being most closely associated with the English eighteenth-century philosopher John Locke and later with Arthur Koestler’s (1976 [1964]) work. In this view, humour is said to emerge from the sudden perception of an incongruity, or the ‘bisociation’ of two contrasting frames of reference (ibid.: 38). Many linguistic theories of humour (Raskin, 1985; Attardo, 2001) also follow an incongruity-based approach, often coupled with some form of script theory (Schank and Abelson, 1977). Attardo and Raskin (1991) suggest that humour results from the juxtaposition of distinct semantic scripts, which are always opposed to each other in a particular way (e.g. actual versus nonactual, normal versus abnormal, possible versus impossible), and the abrupt shift from one to the other. The first part of a joke seems to be pulling
listeners in one direction, but a cue suddenly reveals that they have been fooled: ‘The punchline triggers the switch from the one script to the other by making the hearer backtrack and realize that a different interpretation was possible from the very beginning’ (ibid.: 308). Although Attardo and Raskin’s ‘General Theory of Verbal Humor’ (GTVH) was developed to explain verbal jokes, there is no reason why it cannot be applied to cartoon humour as well. McAlhone and Stuart (1996) describe the basic principle involved in the creation and appreciation of visual wit as a form of ‘play’ which links something familiar with the unexpected: ‘The play involves an agile or acrobatic type of thinking – a leap, a somersault, a reversal, a sideways jump – where the outcome is unexpected. The result is not arrived at through logic, but reaches an undeniable truth’ (ibid.: 15). They believe that a witty design needs to be strongly cued: ‘The better the cue, the bigger the leap the reader can make’ (ibid.: 27). Albeit using different imagery, this description of the process involved in humour appreciation is remarkably similar to the one proposed by the GTVH: the concept of a link between the familiar and the unfamiliar is similar to that of two opposing scripts, play is akin to incongruity, the leap or somersault can be seen as another way of expressing the idea of a sudden clash, and the cue is basically another word for a script-switch trigger. If we consider our two examples again, it seems clear that we cannot simply focus on content and ignore their formal structure when trying to understand what might make them humorous to some people. In the case of Peter Schrank’s cartoon (Illustration 5.2), it is possible to discern two opposing scripts, one to do with junk food, overindulgence and obesity, and another with the contrasting political values of the Democrats and Republicans. 
As the artist explained (Telephone interview, 19 July 2005), he wanted to use a striking image in order to link the concept of ‘the US dominance of world affairs with their influence on the way the world eats – that is, junk politics meets junk food’. Illustration 5.1 appears to contain several juxtaposed discordant scripts: the first has to do with traditional weddings, the second with rural, hillbilly America, the third with a gay lifestyle, and the fourth with Christian conservative values and the Bush administration. McMurty described the thought process behind the creation of his cartoon in the following way:
I just suddenly thought you might get two gay guys who made plans and suddenly it has got to be put off for four years. I was trying to think how I could do it in a funny way so I did it in a hillbilly country rather than slick New Yorkers who might be deciding to get married if they were of the same sex [ ... ] and I made it ridiculous, because I seem to remember one of them was dressed in a bridal gown. (Telephone interview, 30 June 2005)
So, the main clash could be located in the juxtaposition of the friendly and relaxed world of the gay couple and the wedding guests and the intolerant stance of the Church and Republicans. A viewer might also focus on the incongruous idea of a gay couple preparing for a traditional wedding ceremony, or on the concept of gay culture, normally perceived to be linked to a slick urban lifestyle, being transposed to a very rural setting. The concept of a clash between opposing scripts thus seems to describe the formal features of cartoon humour reasonably well, although, as I will show in Section 5.4, some aspects of incongruity theory do not work as well for (verbo)-visual forms of humour as they do for purely verbal jokes.
5.3.4 An integrated pragmatic approach
Superiority theories, release theories and incongruity theories are thus all able to illuminate different aspects of cartoon humour, even though they individually seem unable to explain all instances and characteristics of verbo-visual humour. In fact, as some writers have recognized, these different approaches are not necessarily mutually exclusive and can be used to complement each other (Koestler, 1976 [1964]). Morreall (1983) believes the essence of humour to lie in ‘the enjoyment of incongruity’, but argues that in many instances it has an emotional component as well, involving the release of ‘hostile, sexual, or other feelings’ (ibid.: 47). The humour in both examples used in this chapter could thus be described as resulting from the enjoyment of a clash of two or more incongruous scripts, coupled with the release of hostile and, in the case of Illustration 5.1, perhaps also sexual feelings. Despite offering a useful way of understanding humour in terms of both content and form, Morreall’s theory is still flawed in that it tends to focus on individual jokes considered in isolation and abstracted from their natural context of use.
In fact, what really makes people laugh is often not so much the form or even the content of a joke, but rather the mode of delivery and the social, cultural and political context in which it is created and delivered. Humour is an implicit form of communication, which deliberately leaves something implied and requires a greater level of cooperation on the part of the viewer/reader than the processing of a piece of ordinary information (Veale, 2004; Forceville, 2005). Because of this, any theory of humour must also explore the pragmatics of humour appreciation (Attardo, 2001; Holmes, 2006). In order to read McMurty’s cartoon in the way it was intended, readers would have to be sufficiently informed about the often heated debate about gay marriage in the United States and the differences in attitudes among urban and rural communities, Democrats and Republicans, as well as grasping the reference to the Presidential elections. They would, of course, be much more likely to ‘get’ these references if they saw the cartoon on the day it was originally published. Similarly, Schrank’s cartoon will not have
the same meanings for someone who does not recognize George Bush and John Kerry, or who fails to realize that the deliberately distorted Uncle Sam figure is a metonym for the American people. Political cartoons are highly ephemeral and can quickly lose their intended meanings when taken out of their original context. But it is not just incomprehension and misinterpretation that threaten the communicative success of a joke: the hearer or reader can also refuse to be amused on ethical or aesthetic grounds (Palmer, 2005, p. 80). By laughing at a humorous communication, people can express who they are or who they would like to be and what they believe in. If, on the other hand, the subversive frame is too threatening to the recipient’s values and sense of identity, the joke may evoke feelings of anger and distress rather than amusement (Ritchie, 2005). In the case of the two cartoons considered in this chapter, individuals would probably only really enjoy them if they felt that they shared some of the opinions being expressed there. However, a cartoon rarely has only one obvious voice and can usually be interpreted in several different ways. McMurty’s cartoon, for example, could either be regarded as offensive to gays in that it relies on popular stereotypes, or conversely as quite subversive in that it challenges the social norms of heterosexual relations by depicting a committed gay relationship.3 If reprinted in a gay magazine, it could even be seen as mocking anti-gay prejudices and those people who hold such attitudes. Many regular readers of the Independent would probably interpret the cartoon in Illustration 5.2 in the way it was intended by the artist, namely as a critical comment on certain aspects of the American lifestyle and the Bush administration. A fervent Republican and Bush supporter is thus unlikely to find this cartoon particularly funny. 
Other readers might see the cartoon as generally anti-American or as offensive to obese people and might thus also reject its message, albeit for different reasons.
5.4 Exploring multimodal humour
There are very few attempts at examining in detail the role of non-verbal modes of communication in humorous texts, and these have tended to use theories and concepts developed outside the field of humour research (Smith, 1996; Kaindl, 2004; Marín-Arrese, 2004; Forceville, 2005). As I indicated in Section 5.3, superiority and release theories can be applied to humour in any mode or combination of modes, since they focus on the content and uses of humorous texts rather than on their specific form. Incongruity theories, in contrast, tend to be so firmly grounded in linguistic analysis that they require a degree of adaptation when confronted with the demands posed by verbo-visual text types. Using the two cartoons as examples, I will try to show that each mode is likely to play a slightly different role in setting
up the opposing scripts and in triggering the realization of an incongruity between them. When applying linguistic theories to multimodal text types it is important to bear in mind that each mode has different characteristics, both because of the way it has been shaped and organized by a culture over time and as a result of the materials it uses to make meaning (Kress et al., 2001, p. 15). According to a systemic functional view of multimodality, every mode is organized into networks of interlocking options or choices, which a sign-maker chooses from in order to find the most apt and plausible way of expressing a particular meaning in a particular context (van Leeuwen, 2004). Although all the different modes are able to fulfil all three metafunctions of communication identified by Halliday (1985), they do so in different ways. Kress (2000), for instance, suggests that spoken language ‘may lend itself with greater facility to the representation of action and sequences of action’ (ibid.: 147), while the spatial nature of the visual mode may be more suited to the task of representing the relation of elements to each other. Consequently, the visual mode can and often does refer to ‘things’ that have no verbal translation and vice versa. Barthes (1977, pp. 32–51) was one of the first analysts to explore the semiotic relationship between words and images. He used the term ‘anchorage’ to describe the way that language is used to fix the polysemous meaning of images, and talked about ‘relay’ to refer to word-image relations in sequential forms of communication such as comic strips and motion pictures. More recently, scholars have discovered a whole range of different relationships that can pertain between the verbal and the visual mode (Nikolajeva and Scott, 2001). Detailing the ways in which the two modes can combine in comics, for instance, McCloud (1993, pp. 
153–63) lists seven types of relations, ranging from ‘word specific’ or ‘image specific’, where one mode carries the meaning and the other merely adds non-essential detail, to ‘interdependent’, where both modes together produce meanings that neither could convey alone. As was suggested in Section 5.3, formally both cartoons in Illustrations 5.1 and 5.2 can be described in terms of a clash of incongruous scripts; these are represented either through the verbal or the visual mode alone, or through a combination of both. Both modes can also convey a sense of hyperbole, which heightens the perception of incongruity and creates additional humorous meanings.4 In the cartoon in Illustration 5.1, the hillbilly setting is conjured up visually through the clothes, landscape, architecture and means of transport, and verbally through the choice of regional dialect terms Shucks, honey and unorthodox orthography implying a ‘Southern’ pronunciation (Ah instead of I) in the caption. The Church and the Neo-Cons in the Bush administration are represented visually and verbally through the sour-looking vicar,
the poster on the wall reporting the banning of gay marriages in 11 states, and the political background implied by the gay couple’s dilemma. The ‘traditional wedding’ script emerges from visual details such as the wedding dress, the bouquet, the father of the ‘bride’, and the assorted wedding guests. The first impression is thus of the two modes working together in an interdependent way. A closer look, however, reveals a slightly more complex set of relationships. For instance, the visual mode assumes a lot of the semiotic work with regard to representing the mood of the different characters and their relationships to one another. Hyperbole also plays an important role in the drawing: the fact that the ‘bride’ sports a beard, is smoking a pipe and has distinctly hairy arms, for instance, clearly enhances the sense of a witty incongruity. The verbal mode, on the other hand, is important as a means of ‘anchoring’ the various visual elements. The headline 11 States ban gay marriages on the Church News display case, for instance, is clearly intended to guide the viewer to a specific interpretation of the depicted scene, which could otherwise have been read as a more general comment on gay marriages. The caption is a conventionalized way of representing spoken language in a written form. In this case, it also forms an essential part of the cartoon’s narrative, conveying aspects of time, sequence and causality which would have been virtually impossible to represent by purely visual means (El Refaie, 2003). The cartoon in Illustration 5.2 can be described in terms of the conflicting scripts of junk food and oppositional politics. The first script is represented by visual means only, with the hamburger and the wide girth of the Uncle Sam figure acting as symbols of the stereotypical American fast food diet and unhealthy lifestyle. 
The various guns and rifles he is carrying, and the way his hand is apparently hovering over one of the holsters, seem to create an associative link between the culture of junk food and militarism. Visual hyperbole is clearly an important feature of the humour in this cartoon as well. The caricatures of Bush and Kerry seem to function as metonyms of contrasting political ideologies, an interpretation which is supported by the verbal messages in the speech balloons5 and by compositional features such as visual connection and disconnection (Kress and van Leeuwen, 1996): the positioning of the two presidential candidates on either side of Uncle Sam and their contrasting style of dress effectively conveys the concept of conflicting values. Some of the verbal and visual details seem to indicate that the stereotypical American is much closer to Bush than he is to Kerry: the Republican candidate and Uncle Sam are both wearing cowboy boots, while Kerry, the ‘skinny guy’, is dressed in much more formal attire. Moreover, Uncle Sam is sporting a giant button on his lapel with the inscription Don’t trust the French. Vote Republican, which also implies an ideological consensus between Bush and the majority of Americans.6 There is another figure that
is even more distant visually, in terms of size, implied perspective and style: the little figure in the bottom right-hand corner, who, according to the artist, is meant to represent the rest of the world looking on in anguish. The idea of a ‘sudden clash’ posited by incongruity theorists as an essential ingredient of jokes does not seem to work as well for (verbo-)visual forms of humour such as the newspaper cartoons. The concept of a mental turnaround triggered by a punch line, in particular, presupposes a chronological movement from the salient to the less salient interpretation. This is not generally applicable to instances of single-panel visual humour, since they are organized spatially rather than sequentially and can potentially be read in any order (Schirato and Webb, 2004, p. 86). In the case of our two examples, humour appreciation is more likely to proceed incrementally, as the viewer gradually pieces together all the incongruous elements. Several verbal and visual signs could thus act as script-switch triggers in different contexts and in relation to different aspects of the cartoon. With regard to Illustration 5.1, for example, the visual element of a bearded bride could be described as a cue for the incongruity inherent in a very masculine-looking gay man with all the trappings of a traditional bride. However, if the main incongruity is located in the opposition between gay culture and conservative right-wing politics, then the caption or the facial expressions of the wedding guests could be seen to serve as a punch line.
5.5 Conclusion
As McAlhone and Stuart (1996) point out, humour appreciation is generally not achieved through logic but rather through a process that might be compared to the ineffable pleasure we gain from listening to music or engaging with a striking work of art. This is perhaps particularly evident in the case of mainly visual forms of humour, which seem to be processed at a less conscious and more emotional level than the written word. Therefore, research into the mechanisms of humour constantly runs the risk of undermining the very phenomenon under consideration by trying to break it down into its logical parts. As I have argued, this fundamental paradox even affects the very basis of research, namely the selection of appropriate data for discussion. I suggested that the best way to approach this dilemma is by trying to determine whether or not something was intended to be humorous in the context in which it was created and/or used. This means that anything that is used as a joke in a particular social situation must be treated as belonging to the category of humour, even though the researcher may consider it to be totally unfunny or even disturbing. I have argued that all three of the main strands of humour theories can be applied to verbo-visual forms of humour, and that each of these theories is useful in explaining important aspects of the same phenomenon. An
integrated approach such as the one put forward by Morreall (1983) thus provides a good basis for exploring multimodal forms of humour, but this must be supplemented by a pragmatic account of humour creation and appreciation. Moreover, some of the concepts of incongruity theory need to be adapted in order to take the specific nature of multimodal texts into account. For instance, in the case of cartoons it seems inappropriate to speak of a ‘sudden clash’ triggered by one clearly identifiable punch line, since the reading path is likely to be different for every individual reader. More empirical audience research is needed in order to understand how actual readers go about interpreting multimodal texts and what makes them appreciate or reject a particular humorous message.
Notes
1. I would like to thank cartoonists Stan McMurtry and Peter Schrank for granting me telephone interviews and for their kind permission to reprint their work. I am also grateful to the British Academy for supporting the work this chapter is based on (Grant No. SG-39469).
2. According to Nicholas Garland, who works for the Daily Telegraph, the idea that political cartoons are always funny is ‘a common misconception’; in fact, some of the best cartoons are ‘very dark and tragic and sombre’ (Telephone interview, 15 July 2005). Dave Brown, regular cartoonist for the Independent, by contrast, believes that political cartoons should always at least attempt to be funny (Telephone interview, 26 July 2005).
3. McMurtry was at pains to stress that his cartoon was in no way intended to be homophobic and that he was ‘really just having a laugh’ (Telephone interview, 30 June 2005). However, since the cartoon was published in the Daily Mail, which is certainly not known for its liberal attitudes, it is likely that at least some readers will have read it as an anti-gay joke.
4. Hyperbole is commonly seen as one of the central markers of verbal irony, but it can equally well be applied to the visual mode (El Refaie, 2005).
5. Speech balloons represent another conventional way of giving spoken language a visual form. Here, the typeface looks deliberately irregular and handwritten, which visually indexes the concept of the highly individual human voice (van Leeuwen, 2004).
6. Peter Schrank told me that he had not invented this slogan himself; apparently there were really some bumper stickers with this message around at the time.
References
Attardo, S. (2001) Humorous Texts: A Semantic and Pragmatic Analysis (Berlin: Mouton de Gruyter).
Attardo, S. (2003) ‘Introduction: The Pragmatics of Humor.’ Journal of Pragmatics, 35(9): pp. 1287–94.
Attardo, S. and V. Raskin (1991) ‘Script theory revis(it)ed: Joke similarity and joke representation model.’ Humor, 3(4): pp. 293–347.
Barron, J. W. (ed.) (1999) Humor and Psyche: Psychoanalytic Perspectives (Hillsdale and London: Analytic Press).
Barthes, R. (1977) Image-Music-Text (London: Fontana Press), pp. 32–51.
Bergson, H. (1911 [1900]) Laughter: An Essay on the Meaning of the Comic (London: Macmillan).
Billig, M. (2001) ‘Humour and hatred: The racist jokes of the Ku Klux Klan.’ Discourse and Society, 12(3): pp. 267–89.
Billig, M. (2005) Laughter and Ridicule: Towards a Social Critique of Humour (London: Sage).
Critchley, S. (2002) On Humour (London and New York: Routledge).
Davies, C. (2002) The Mirth of Nations (New Brunswick and London: Transaction Publishers).
Dines-Levy, G. and G. W. H. Smith (1988) ‘Representations of women and men in Playboy sex cartoons’, in C. Powell and G. E. C. Paton (eds) Humour in Society: Resistance and Control (Basingstoke and London: Macmillan), pp. 234–5.
Edwards, J. L. (1997) Political Cartoons in the 1988 Presidential Campaign: Image, Metaphor, and Narrative (New York/London: Garland).
El Refaie, E. (2003) ‘Understanding visual metaphor: The example of newspaper cartoons.’ Visual Communication, 2(1): pp. 75–96.
El Refaie, E. (2005) ‘ “Our purebred ethnic compatriots”: Subversive irony in newspaper journalism.’ Journal of Pragmatics, 37(6): pp. 781–97.
El Refaie, E. (Forthcoming, 2010) ‘The pragmatics of humor reception: Young people’s responses to a newspaper cartoon.’ HUMOR: International Journal of Humor Research.
Forceville, C. (2005) ‘Addressing an audience: Time, place, and genre in Peter van Straaten’s calendar cartoons.’ Humor, 18(3): pp. 247–78.
Freud, S. (1991 [1905]) Jokes and Their Relation to the Unconscious. Penguin Freud Library, Vol. 6 (Harmondsworth: Penguin).
Gruner, C. R. (1997) The Game of Humor: A Comprehensive Theory of Why We Laugh (New Brunswick and London: Transaction).
Halliday, M. A. K. (1985) An Introduction to Functional Grammar (London: Edward Arnold).
Herzog, T. R. and A. J. Hager (1995) ‘The prediction of preference for sexual cartoons.’ Humor, 8(4): pp. 385–405.
Holmes, J. (2006) ‘Sharing a laugh: Pragmatic aspects of humor and gender in the workplace.’ Journal of Pragmatics, 38(1): pp. 26–50.
Howitt, D. and K. Owusu-Bempah (2005) ‘Race and ethnicity in popular humour’, in S. Lockyer and M. Pickering (eds) Beyond a Joke: The Limits of Humour (Houndmills and New York: Palgrave), pp. 45–62.
Kaindl, K. (2004) ‘Multimodality in the translation of humour in comics’, in E. Ventola, C. Charles and M. Kaltenbacher (eds) Perspectives on Multimodality (Amsterdam: John Benjamins), pp. 173–92.
Koestler, A. (1976 [1964]) The Act of Creation (London: Hutchinson).
Kotthoff, H. (2006) ‘Gender and humor: The state of the art.’ Journal of Pragmatics, 38(1): pp. 4–25.
Kress, G. (2000) ‘Text as the punctuation of semiosis: Pulling at some of the threads’, in U. H. Meinhof and J. Smith (eds) Intertextuality and the Media: From Genre to Everyday Life (Manchester: Manchester University Press), pp. 132–54.
Kress, G., C. Jewitt, J. Ogborn and C. Tsatsarelis (2001) Multimodal Teaching and Learning: The Rhetorics of the Science Classroom (London and New York: Continuum).
Kress, G. and T. van Leeuwen (1996) Reading Images: The Grammar of Visual Design (London and New York: Routledge).
Marín-Arrese, J. I. (2004) ‘Humour as subversion in political cartooning’, in M. Labarta (ed.) Approaches to Critical Discourse Analysis [CD-ROM] (Valencia: Universitat de Valencia, Servei de Publicacions).
McAlhone, B. and D. Stuart (1996) A Smile in the Mind: Witty Thinking in Graphic Design (London: Phaidon).
McCloud, S. (1993) Understanding Comics (New York: Harper Perennial).
Morreall, J. (1983) Taking Laughter Seriously (Albany: State University of New York).
Nikolajeva, M. and C. Scott (2001) How Picturebooks Work (New York and London: Garland).
Palmer, J. (2005) ‘Parody and decorum: Permission to Mock’, in S. Lockyer and M. Pickering (eds) Beyond a Joke: The Limits of Humour (Houndmills and New York: Palgrave), pp. 79–97.
Pickering, M. and S. Lockyer (2005) ‘Introduction: The ethics and aesthetics of humour and comedy’, in S. Lockyer and M. Pickering (eds) Beyond a Joke: The Limits of Humour (Houndmills and New York: Palgrave), pp. 1–24.
Powell, C. (1988) ‘A phenomenological analysis of humour in society’, in C. Powell and G. E. C. Paton (eds) Humour in Society: Resistance and Control (Basingstoke and London: Macmillan), pp. 86–105.
Raskin, V. (1985) Semantic Mechanisms of Humor (Dordrecht, Boston and Lancaster: Reidel).
Ritchie, D. (2005) ‘Frame-shifting in humor and irony.’ Metaphor and Symbol, 20(4): pp. 275–94.
Schank, R. C. and R. P. Abelson (1977) Scripts, Plans, Goals, and Understanding: An Inquiry into Human Knowledge Structures (Hillsdale, NJ: Erlbaum).
Schirato, T. and J. Webb (2004) Understanding the Visual (London, Thousand Oaks and New Delhi: Sage).
Smith, K. (1996) ‘Laughing at the way we see: The role of visual organizing principles in cartoon humor.’ Humor, 9(1): pp. 19–38.
van Leeuwen, T. (2004) Introducing Social Semiotics: An Introductory Textbook (London: Routledge).
Veale, T. (2004) ‘Incongruity in humor: Root cause or epiphenomenon?’ Humor, 17(4): pp. 419–28.
Wiseman, R. (2002) Laughlab: The Scientific Search for the World’s Funniest Joke (London: Random House).
6 Citizenship and Semiotics: Towards a Multimodal Analysis of Representations of the Relationship between the State and the Citizen
Giulio Pagani
6.1 Introduction
What is the nature of the relationship between the state and the citizen? How does the state shape the expectations of its citizens? Representation of the state would seem to be implicated in the latter, playing its part in how the state shows its presence and tells of its achievements in the world that its citizens inhabit. Studies of discourse created and disseminated by state institutions in the United Kingdom (Fairclough, 1993; Pagani, 2007) suggest that a marketized rebalancing of the relations between those institutions and the citizens of their jurisdiction has occurred since the early 1990s, and that detectable changes in the mode of discourse were implicated in this. Keeping this in mind, this chapter examines the discursive construction of states and citizens by considering how the meanings of multimodal texts are made available publicly. The goal is to demonstrate the ways in which the state shows as well as tells citizens of its presence and activities and, further, to establish the significance of this process and then propose a means of analysing the discourse it produces. First, in Section 6.2, a brief foray is made into the domain of sociopolitical theory, as this enables definition and discussion of some key terms. Section 6.3 builds on this by proposing that symbolic and discursive representation of the state is not confined to the realm of texts and discourses realized in verbal modes. The visual and, more particularly, the material representation of the state is brought into focus and in line with the adoption of a theoretical perspective informed by approaches to multimodal discourse and social semiotics. In Section 6.4, a description of a model for analysis of material and multimodal meaning-making resources is developed. Section 6.5 analyses an example of discourse which is produced using
the whole semiotic potential of public service provision. This example demonstrates how the model, which is based upon analysis of chains of texts linked by the process of ‘resemiotization’ (Iedema, 2001), can be used to track meaning-making and exchanges across a range of modes. The examination of public transport systems and infrastructures aims to illustrate how the involvement of the state is concretely realized in the discourse the systems create when providing the services. The chapter concludes by arguing that the analysis of such material discursive resources displays comprehensively and critically the ways in which the state and the citizen interact.
6.2 What is the state?
There are two dichotomies that have to be dealt with in the discussion of the question, ‘What is the state?’ First, it is vital to differentiate the concept of ‘the state’ from that of ‘the nation’, and, second, to investigate what is here perceived to be the opposition between two ways in which the state might function discursively. The state has its basis at the organizational level, whereas a nation is culturally based, and the boundaries of the two may or may not coincide. A state can be considered to be a coherent institution with a pre-eminent role in the proper ordering of society within a particular territory. This institution consists of a complex of agencies, actors and sub-institutions in the public sector which interact with the populace. At one end of the scale, this involves regulating and restricting people’s behaviour, and at the other it may involve empowering and enhancing people’s lives. The aforementioned interaction ranges from the definition and management of law and order to the provision of certain types of services and goods. More abstractly, Hall (1992, p. 292) describes a nation-state as both a political construct and a system of representations, while Verdery (1996, p. 227) sees the relation between states and citizens as generated and represented by symbols which, being ambiguous, can be made to mean different things to different users of them. Verdery (ibid.) goes on to suggest that different meanings of this relation are each symbolized differently and represented by different discourses. Since study of discourse in our sense is not their priority, these theorists rely on terms like ‘symbol’ and ‘symbolism’. This chapter, however, examines practices and effects related to the existence and use of these symbols by considering them as elements within a social semiotic, or network of meanings (Halliday, 1978), pertaining to the surrounding culture.
This entails viewing them as semiotic resources which may be used to create discourse. Consequently, since meanings can be made and exchanged using semiotic resources other than language itself, then the relationship between state and citizen cannot be thought of as being expressed solely by means of linguistic utterances or texts. That is, it can be expressed through multimodal discourse because it involves combinations of several types of
semiotic resources in its realization (Kress and van Leeuwen, 2001, p. 20). Therefore, work needs to be done on texts or discourses which are produced by using other resources of semiotic potential that are available to the state and which may or may not be co-deployed with language itself. One of the discourses that may be propagated by a state has an ethnocultural basis and is generated from symbols of nationalism, such as flags, myths and history (Smith, 2001, pp. 7–8). This type of discourse is not the focus of this chapter. An alternative discourse may arise from a service-based version of a civic nationalism, signified through what can correspondingly be termed symbols of nationalization. These will include logos or texts, which could be said to sit at the ‘telling’ end of the spectrum, but this is by no means the limit. Items in the material infrastructure provided by the state (vehicles, buildings, and so on) and the consequent linkage in the citizen’s experience of those signs with the service provided under them by the agencies of the public sector are also implicated in the showing of the state. The focus on material signs in this chapter is, in part, an echo of Billig’s (1995) analysis of the mundane types of (discursive) practices through which the state and citizenship are expressed. He (1995, p. 6) coins the term ‘banal nationalism’ to refer to the phenomenon fed by this backgrounded everyday discourse, and he (1995, p. 93) alights upon the notion that daily life is imbued with routine symbols and uses of language which enable people to identify and reproduce themselves. This entails a complex of representations and practices necessary for this reproduction, which is, in other words, a process by which the populace is constantly reminded of its affiliation to the state in the course of its everyday life. This reminding, or flagging, is so continuous that it is unobtrusive through familiarity, rather than invisible because deeply hidden.
Indeed, in conjuring a notable illustrative image, Billig (1995, p. 8) cites the example of the national or civic flags that adorn public buildings and which we never notice thanks to their ubiquity. Billig (1995, p. 38) complains that sociologists have tended to ignore banal flagging while concentrating on more overt or spectacular practices of building a nation or a state. This tendency is corrected by Lewis (2004), who sets out a potentially more fruitful direction for analysing representation of the state by considering everyday practices, or the symbolic and material processes that aid people’s sense of belonging to their community. She suggests that citizenship and identification with a state are nurtured through existence of formal rights and discourses backed up by delivery, so that individuals’ uses of the goods and services provided by the state are implicated in this (Lewis, 2004, p. 22). Lewis thus emphasizes a connection between belonging and social, institutional and individual practices. Perhaps, by varying Billig’s formula, these could be described as banal practices by which the state is put to use, thereby forming a ‘banal statehood’.
The argument here is that these practices encompass a whole range of public services from welfare to provision of infrastructure and, for the purposes of this chapter, they can be viewed as multimodal semiotic resources. These resources have a potential for use in the making and exchanging of meanings of state and citizenship and for contributing to the creation of a ‘semiotic landscape’ (cf. Kress and van Leeuwen, 2006, p. 35) in which the state and citizen are each sited.
6.3 Citizens and citizenship
The discussion in Section 6.2 has focused on separating the state from the nation as an institution. It is also necessary to differentiate between ‘citizenship’ and ‘nationality’ in relation to the individual, since, just as the literal and metaphorical boundaries associated with the first two terms are not coterminous, neither are the identities and actualities associated with the latter pair. In the most basic terms, ‘citizenship’ is associated with a state and entails some kind of commitment to it or stake in it and receipt of some benefit for that, whereas, for our purposes here, having a certain nationality merely means being a member of a certain national community. Some recent work, exemplified by Orton (2004), considers that the study of citizenship is the preserve of the social sciences, and that, furthermore, it should be divorced from the more abstract branches of those disciplines and instead be carried out within the ambit of a more practical specialism such as public policy research. The opposition to this trend comes from what might be termed ‘discourse-based’ definitions of, or research into, citizenship. Within such a paradigm, Sbisà (2006, p. 151) builds on a tradition that stems from Marshall’s (1992) influential model1 which proposes that, as well as rights and obligations, citizenship entails legitimate expectations constructed, contested and represented in actual social interaction. Citizens are, therefore, social agents whose nature is shaped with respect to the institutional and social organization of the state in which they find themselves situated (Sbisà, 2006, p. 162). Fairclough, Pardoe and Szerszynski (2006, p. 100) suggest that the social interactions and practices that give meaning to citizenship, and which construct citizens, are part of a range of resources which could include administrative discourse, civil and legal practices and the material objects and spaces of public provision. 
Defining them as signifiers of the state and citizenship, they continue their argument in the light of Billig’s (1995) work, and they arrive at the notion that these discursive, practical and material resources actually make up the kit for assembly of a ‘banal citizenship’ (Fairclough et al., 2006, p. 101). This banal citizenship correlates with the banal statehood referred to in Section 6.2, and the role of the state and its practices in providing individuals with the means for assembly of citizenship at this level needs to be examined.
6.4 Citizenship and semiotics: Multimodal approaches
How is the state represented (multi-)semiotically? As the arguments in Sections 6.2 and 6.3 suggest, consideration must be given to whether it will be possible to describe meaning-making and exchange in relation to texts made using a whole range of semiotic resources. If so, is it possible to trace how the representation of the state is realized in them? The semiotic resources, or modes, of interest are used in discursive frameworks which are made up of routine symbols, material processes and everyday practices of welfare that the state and its agents provide in the course of delivering public services. This opens up to analysis texts that potentially have something to say, or show, about the state; such texts could include items within the built environment, such as public buildings, public housing and public transport infrastructures. Other types of hardware associated with public services and utilities, such as transport vehicles to take just one example, are also implicated in this telling and showing. Each of the items listed here can be elements of the social semiotic in that they can be used to construct a semiotic landscape and a discourse of the state. In doing this, any one of them can be co-deployed with any of the others and, of course, co-deployed with linguistic texts or utterances, too. How can analysis of discourse created through material objects and services be carried out? Clearly these modes are unlikely to fall within the bounds of a single grammar of language. O’Toole’s (1994, 2004) work on architecture is inspired by systemic functional (SF) theorizing and establishes a route to semiotic readings of buildings within a framework that explores their experiential, interpersonal and textual (‘texture’) functions and the meanings made relative to each of these when the object is considered as an element within the social semiotic surrounding it.
The work of Kress and van Leeuwen (2001) has provided a complementary framework for analysis of the built environment as a multimodal ensemble. They investigate how people use a varied range of semiotic resources to make signs within the social contexts in which they move. Kress and van Leeuwen (2001, p. 79) also turn their attention to a facet of the semiotic landscape that is the main concern here, namely the application of liveries onto public transport vehicles (also van Leeuwen, 2005, p. 53). They suggest that these colour schemes could now be thought of as regulated by grammar because they have begun to be powerful carriers of meaning over which social control is now exercised. The suggestion is that in the past the colour schemes were not thought of as semantically or semiotically significant, but now their socio-semantic power has been recognized (and arguably recruited to ideological or political ends). Van Leeuwen’s (2005, p. 53) position is that the ‘regulation colours’ formerly applied to buses and trams run by public utilities were thought of as meaningless, whereas the colour schemes that have replaced them since private
operators took over are recognized as containing meaning because they are intended to explicitly represent the operators’ identities and, as will be argued here, the values of the free-market systems in which those companies operate. This idea is extended further by suggesting that if the new liveries have a socio-semantic potential related to the paradigmatic orientation of the social system whereby public transport is apparently provided by the market, then surely the old liveries, far from being meaningless, also had a socio-semantic power that was bestowed on them by the paradigm model that operated at that time, so that the values and identities of the service-providing state agencies were expressed through them instead. Since the application of the theory and the analysis to the built environment has been successfully demonstrated elsewhere (especially O’Toole, 2004), the focus is on this vehicular element of transport networks and also on pursuing a different, though related, analytical method from within the SF-paradigm. Iedema (2001) considers the process of coming into being of a public service artefact, or piece of infrastructure, in his analysis of how a hospital renovation project unfolded. He focuses on meaning shifts between resources at the level of genre, categorizing the shift as a transfer of meaning potentials from one mode to another. He asks what the process means – in part asking what is the meaning of its end result, or alternatively, the end-meaning of the end-object it created. He suggests that if we are to concern ourselves with these end-meanings, or with the current ‘now-meanings’ of an object in front of us, then we need to look at the meanings that were created and exchanged at the beginning of, and during, the process that created it. He examines the potential of material reality as a means of communication by considering it as a semiotic system into which meanings can be translated from other semiotic systems such as language.
Materials thus become the expression plane for a set of meanings previously (and concurrently) realized in other systems, and these meanings undergo ‘resemiotization’ as they are passed from mode to mode through discursive chains, arriving in ever less negotiable durable materialities with ever greater status (2001, pp. 25–6). As the end results of semiotic translation of meaning and the end-points of chains of recontextualization, material objects imbue their meanings with maximal weight and authority (Iedema, 2001, p. 25). Iedema also argues that the construction of durable and resistant meaning, or facts, parallels the construction and organization of our spatial environment. Material objects present certain choices on pathways through space and they also present choices on pathways through which meaning is made and through which social reality is made. It would thus seem that, if it was someone’s intention to promote the meanings and social arrangements associated with a particular paradigm, then translating or resemiotizing those meanings into durable and material semiotic resources would be a powerful and productive way
of doing so. These resources, after all, have an intrinsic ability to affect the practices of their users; they are a more direct and tangible way of doing so than the more negotiable semiotic systems, such as language. Iedema (2001, p. 33) is careful to point out that, as meanings are resemiotized and transposed from one mode to another, they become subject to the constraints and affordances of the new mode. There is no complete redundancy between any two modes – each offers certain possibilities and restricts others. It is likely to be beyond the capabilities of a given mode to deliver a direct translation of the meanings realized in another mode. Resemiotization does not, therefore, produce exact replication, but instead it produces semiotic metaphors. Developed by O’Halloran (1999, 2003), ‘semiotic metaphor’ describes a process whereby a meaning previously or usually made through use of resources in one particular mode is shifted so as to become attached to resources in another mode. A material object, in its attempt to realize meanings articulated in language that has been used in an earlier part of its chain of production or creation, will inevitably realize those meanings in a metaphorical way. Part of our understanding of the term ‘metaphor’ here must be construed in a way that parallels grammatical metaphor so we can think of resemiotizing creation of semiotic metaphors as leading to a loss of congruence in the realization of meanings. This is at least partly because the constraints of the material resource as a semiotic mode require that meaning transposition has to occur as a best-fit translation that may be an approximation or a compromise. 
Although semiotic metaphor was originally conceived by O’Halloran as an explanation of intersemiotic reconstruals occurring between language and mathematical symbolism, Iedema’s work is useful in defining a framework within which productive extension of its mechanisms to other semiotic resources can be sited and attempted. The concept of semiotic metaphor is important because it suggests a mechanism for how three-dimensional objects can come to have meaning potentials and become ‘social symbols’, that is, entities to which people are encouraged to attach special meanings (Lim, 2004, p. 241). The focus of its attention is not on modes and semiotic resources by themselves but on their co-deployment, and its power lies in the fact that it may reduce the necessity or urgency of discovering or proposing grammar for a multiplicity of modes on an individual basis.
6.5 Analysis of material social semiotic resources: An example
Let us now turn to the application of the resemiotization principle and to the analysis of ‘shown’ discourse realized by public transport vehicles. The public transport vehicle, together with the surrounding package of elements that are part of the field of the transport service (timetables, tickets, stops, shelters, stations and so on) is taken here to be the benchmark example of
the ‘routine symbol’, the service it provides to be the prototypical ‘material process’ and its use by the citizen to be the epitome of the ‘everyday practice’ which was discussed earlier. As such, it is seen as an important signifier of banal statehood and banal citizenship. The aim is to show how resemiotization of meaning, as achieved via the mechanism of semiotic metaphor, can be traced through a chain of discourse so that we can attempt to establish the end-meaning which a bus and its livery show to the citizen.
Illustration 6.1 shows a number of buses parked in their depot near Bristol, England in 1986. This photo was taken shortly after the privatization of the bus service in that region. The bus in the centre of the frame retains the livery of the state-owned National Bus Company (NBC), the transport utility that operated many UK bus services until the mid-1980s. This livery was green with a horizontal white band. The NBC logo is visible on the radiator grill and the name of the NBC-district is carried near the front wheel-arch. The other buses in the illustration have been repainted and bear the livery of the privatized operator, called ‘Badgerline’, created to take over the services from the NBC. The buses with this livery have yellow ends with a large slanted vertical green stripe in the central area. As can be seen on the bus to the left of the frame, the company name is displayed on the front of the bus. It should be pointed out here that, as a consequence of the photo in Illustration 6.1 being taken in a ‘changeover’ period, the NBC livery on the bus in the centre is no longer in its pure form. It has already been ‘hybridized’ by the addition of a band above the windows that makes clear who its new owners are. Most of the discussion below will proceed as though referring to the unmodified NBC livery. 
(A better illustration of the full effect of the livery change can be viewed at <www.palgrave.com/multisemiotic>, where two colour photos of the same vehicle, taken on separate occasions before and after privatization, avoid the compromise occurring in the image reproduced in Illustration 6.1.)
To carry out the analysis, each bus must be considered as a multimodal ensemble made up of three co-deployed semiotic resources: (1) the bus itself (as material object), (2) the livery applied to it and (3) the written texts and logo present upon it that state the name of the owner/operator. Each of these elements is a resemiotization of some other utterance, text or item. The ensemble is at the intersection of many chains; the chain of discourse leading to the existence of the vehicle via its manufacture is one such, as is the chain that precedes the design and application of the livery. The resemiotizing chain that is of most significance here, however, is that which originates from agencies of the state, maybe a series of texts such as a service level agreement, contract, policy manifesto, committee paper and so on, or some other declaration of who is responsible for operating or owning the bus. The first task is to identify and assemble the elements in the chain of discourse leading from the state to the bus service and the bus itself. In the
Illustration 6.1 Co-occurrence of public sector and private sector bus liveries (panels a, b and c)
case of a bus in NBC ownership, the available point of departure is the record of parliamentary debate and discussion leading to the enactment of the legislation that created the nationalized transport utility.2 Subsequent links include the records of decision making by NBC management concerning the planning and design of its services and the acquisition of resources to deliver them. (Note that the actual discussions are not available for analysis in this case, but recorded minutes are.) Other exchanges of meaning, such as advertisements or media releases were also produced before the eventual resemiotization of all these into the ensemble that is the bus. This chain is laden with explicit realizations in language that make clear the role and responsibility of the state complex in providing the bus and the service, and this explicitness is also realized through the action of semiotic metaphor in the resemiotized end-product. In this way, a green bus carrying the NBC logo has a clear meaning potential which it makes apparent to the citizens using it, and it thereby communicates on behalf of the state. Investigating the citizens’ actual awareness of this banal act of communication is beyond the scope of this chapter, but an assumption that it is at least possible for a citizen to read the bus in this way, and to see the bus as metaphorically representing the state, might be supported by considering the citizens’ exposure to other links in this discursive chain, such as the NBC advertisements where the state ownership was declared. By the time the bus service has passed into private ownership the situation has become less straightforward. As van Leeuwen (2005) noted, the new bus/livery ensemble is a meaningful realization of the values and identity of the new private bus company. This is so because it is at the end of a chain of that company’s management discussions and documents, and this chain explicitly realizes the company’s role in the service provision. 
Reference to the state is, on the surface at least, no longer to be found among the end-meanings of the bus; that is, the state does not seem to be explicitly, or metaphorically, realized in the material ensemble. At face value this should not be surprising, but an incongruity arises when we take into account the continuing role played by the UK central and local government agencies in transport provision. A considerable sum of government money was, and is, still spent on direct or indirect funding through subsidies and grants for bus services. The private sector, as represented by the bus company, does not by itself provide buses through the operation of a free market; yet, in spite of the part played by the state in putting the service on the road, in its new colour scheme the bus ensemble is not configured to make any communication concerning the state. The involvement of the state agencies means that a chain of discourse emanating from the state relating to the bus service does exist even if its content and effects cannot be seen. Discussions occur and reports are produced on how the money of the state will be used to run the service and
yet the meanings exchanged there seem to have disappeared, or seem to have been eluded somehow, so that the citizen has no access to them and cannot see any metaphorical representation of the state in the bus. What is the reason for this? One possibility is that the bus ensemble is not capable of metaphorically realizing the concept of partnership between the state and others and a choice of agent therefore has to be made, since, as the NBC bus demonstrates, the concept of sole agency does seem to be within its range of affordances. An alternative suggestion is that political decisions as to attribution of agency override the employment of the full metaphorical potential of the mode. Since it has been argued that the bus service is an important signifier for the state, and given that the citizen might be expected to have a different relationship with the state than with a private operator (and different again when it is the state posing as a private operator), adopting a ‘critical’ approach when using the model for the tracking of meaning into and out of material resources would seem therefore to be especially important in answering the questions posed at the outset of this chapter. Critical investigation of material discursive resources would seem to require a tracing back of the chain to establish the point where there has been a discontinuity in the transfer of some meaning; that is, the point where semiotic metaphor appears to have broken down. It would also need to consider whether this breakdown is an unavoidable consequence of any restricted affordances of the material mode or whether it is due to the strategic choices made by authors overseeing the creation of the final material/multimodal ensemble. In more and more countries, public services have begun to be delivered under the banner of the private sector: described, liveried and otherwise semiotically represented as being provided by the market rather than the state. 
This process of shrinking the state has been ongoing in the United Kingdom for over twenty years and has had a substantial effect upon how the state has represented itself to its citizens. Prior to privatization of utilities, the role of the state in directly funding or providing public services or other facilities was openly declared and shown through a range of multimodal discursive resources of the kinds discussed in this chapter. The meanings of the state and the citizen that were made and exchanged under those conditions were, arguably, accurately representative of the social structures then existing. This is because these meanings were accurately resemiotized as they passed through discursive chains. In the present day, the same range of multimodal resources is employed in creating different discourses; these discourses tend to indicate that the state no longer directly or indirectly provides or pays for certain services or facilities because the multimodal resources in question are engaged in creating and showing a marketized semiotic landscape. However, this landscape may not be a totally accurate representation of the actual underlying strata if many of these services and facilities continue to be fully or partially funded by the state. So, a bus that
is subsidized by central or local government but which carries meaning potential that predominantly or uniquely realizes the meanings of a private enterprise operating under market conditions is itself a text that misrepresents the meanings of the state (posing as ‘market’) and the citizen (posing as a ‘consumer/purchaser’) and the relationship between them.
6.6 Conclusion
This chapter has explored how the state represents itself to its citizens and has established that, in order to get the whole picture, we need to look into the ‘showing’ as well as the ‘telling’ of itself. Furthermore, the ostensibly unspectacular and routinely banal instances of this showing achieved metaphorically through certain material structures and associated services have been privileged here because their understated nature masks an important power and effectiveness in handling the representation in question. This approach hinges on the assertion that these material resources are carriers of meaning, and in particular the meaning of what the state is. The model for analysing these materials works by tracking down how meanings flow into them from texts in other genres to which they are linked in a chain. The brief example given shows how material items are in fact at the junction of more than one chain, but it also shows that the meanings flowing down one of these chains may predominate so that the other meanings become submerged. In this way, the meanings shown by the item may be at odds with meanings told of or shown further back up the chain. This suggests that a critical investigation of material discursive resources, involving the unearthing of meanings now unseen but which had been apparent in the preceding discourses, is a valuable and worthwhile task when investigating how the state achieves the shaping of expectations and how (in)accurate the representation of the relationship between state and citizen is.
The work done here deals with its material subject as if it were the end link in the chain(s) of texts associated with it.3 However, the general methodological approach may become even more productive if it is recognized that this need not be the case in every instance. For example, the written or spoken discourse about vehicles, infrastructures, and so on, subsequently produced by the citizens using them adds further links to the chains. 
This further set of texts is itself open to analysis which may reveal how the ongoing exchanging of the meaning of the state is handled.
Notes
1. The Marshallian perspective is that citizenship comprises civil, political and social rights that add up to enable a person to live the life of a civilized being. (The social rights referred to are, in broad terms, ones delivered by the provision of a welfare
state.) Such a model of citizenship would seem to be consistent with a meaning of the nation-state that is promoted by a ‘civic’ discourse.
2. The Transport Act 1968.
3. Somewhat similar views of ‘chaining’ have been discussed by Ventola (1999, 2002) by using the concept of ‘Semiotic Spanning’ in the context of language of conferencing.
References
Billig, M. (1995) Banal Nationalism (London: Sage Publications).
Fairclough, N. (1993) ‘Critical discourse analysis and the marketization of public discourse: the universities.’ Discourse and Society, 4(2): pp. 133–68.
Fairclough, N., S. Pardoe and B. Szerszynski (2006) ‘Critical discourse analysis and citizenship’, in H. Hausendorf and A. Bora (eds) Analysing Citizenship Talk (Amsterdam: John Benjamins), pp. 98–123.
Hall, S. (1992) ‘The question of cultural identity’, in S. Hall, D. Held and T. McGrew (eds) Modernity and its Futures (Cambridge: Polity Press in association with the Open University), pp. 273–325.
Halliday, M. A. K. (1978) Language as Social Semiotic: The Social Interpretation of Language and Meaning (London: Edward Arnold).
Iedema, R. (2001) ‘Resemiotization’, Semiotica, 137(1/4): pp. 23–39.
Kress, G. and T. van Leeuwen (2001) Multimodal Discourse: The Modes and Media of Contemporary Communication (London: Arnold).
—— (2006) Reading Images: The Grammar of Visual Design, 2nd edn (London: Routledge).
Lewis, G. (2004) ‘“Do not go gently”: Terrains of citizenship and landscapes of the personal’, in G. Lewis (ed.) Citizenship: Personal Lives and Social Policy (Bristol: Policy Press in association with the Open University), pp. 1–37.
Lim, F. V. (2004) ‘Developing an integrative multi-semiotic model’, in K. L. O’Halloran (ed.) Multimodal Discourse Analysis: Systemic Functional Perspectives (London: Continuum), pp. 220–46.
Marshall, T. H. (1992) Citizenship and Social Class (Oxford: Oxford University Press).
O’Halloran, K. L. (1999) ‘Interdependence, interaction and metaphor in multisemiotic texts.’ Social Semiotics, 9(3): pp. 317–54.
—— (2003) ‘Intersemiosis in mathematics and science: Grammatical metaphor and semiotic metaphor’, in A-M. Simon-Vandenbergen, M. Taverniers and L. Ravelli (eds) Grammatical Metaphor: Views From Systemic Functional Linguistics (Amsterdam: John Benjamins), pp. 337–65.
Orton, M. (2004) ‘Citizenship, responsibility and community’, Warwick Institute for Employment Research Bulletin Number 74, www2.warwick.ac.uk/fac/soc/ier/publications/bulletins/ier74.pdf (accessed 9 July 2008).
O’Toole, M. (1994) The Language of Displayed Art (London: Leicester University Press).
—— (2004) ‘Opera Ludentes: The Sydney opera house at work and play’, in K. L. O’Halloran (ed.) Multimodal Discourse Analysis: Systemic Functional Perspectives (London: Continuum), pp. 11–27.
Pagani, G. (2007) ‘Expressions/representations of the relationship between the “state” and the “citizen”: Register analysis of local government discourse.’ Critical Approaches to Discourse Analysis across Disciplines, 1(1): pp. 1–18.
Sbisà, M. (2006) ‘Communicating citizenship in verbal interaction’, in H. Hausendorf and A. Bora (eds) Analysing Citizenship Talk (Amsterdam: John Benjamins), pp. 151–80.
Smith, A. D. (2001) Nationalism: Theory, Ideology, History (Oxford: Polity).
van Leeuwen, T. (2005) Introducing Social Semiotics (London: Routledge).
Ventola, E. (1999) ‘Semiotic spanning at conferences: Cohesion and coherence in and across conference papers and their discussions’, in W. Bublitz, U. Lenk and E. Ventola (eds) Coherence in Spoken and Written Discourse: How to Create It and How to Describe It (Amsterdam: Benjamins), pp. 101–25.
—— (2002) ‘Why and what kind of focus on conference presentations?’, in E. Ventola, C. Shalom and S. Thompson (eds) The Language of Conferencing (Frankfurt: Peter Lang), pp. 15–50.
Verdery, K. (1996) ‘Whither nation and nationalism’, in G. Balakrishnan (ed.) Mapping the Nation (London: Verso), pp. 226–34.
Part II Children’s Narratives and Multisemiotics
7 On Interaction of Image and Verbal Text in a Picture Book. A Multimodal and Systemic Functional Study
Arsenio Jesús Moya Guijarro and María Jesús Pinar Sanz
7.1 Introduction
There is a wide range of approaches to picture books among the existing studies on children’s literature (Schwarcz, 1982; Nikolajeva and Scott, 2001). These narratives have been analysed in connection with developmental psychology, in relation to their therapeutic effects on children (Spitz, 1999) and their thematic and stylistic diversity (Feaver, 1977). In most of these studies, the visual aspects have been considered as secondary, and their relationship to the verbal text has been practically ignored. In the past 25 years, however, a number of critics have analysed how these two forms of communication, the verbal and the visual, work together to create meaning in picture books (Moebius, 1986; Nodelman, 1988; Nikolajeva and Scott, 2000; Lewis, 2006 [2001]). They all seem to agree that the possible relationships between verbal and visual components range from those in which images simply illustrate or translate what is related in the words, to more complex and sophisticated forms of interaction. The more intricate interplay occurs when verbal and non-verbal elements are not mutually reproductive or when they tell different or contradicting stories. Thus, the understanding of meaning not only requires the analysis of language in text, but also the study of other semiotic resources, such as images, gestures or sounds.
This chapter analyses the intersemiotic relationship between visual and textual modes in Guess How Much I Love You (1994), a picture book written by Sam McBratney and illustrated by Anita Jeram for children aged six and under. We will examine the role of pictures in relation to words and vice versa, in order to determine how these modes complement one another and how this interplay makes the tale exciting and interesting for young children. Although the plot is simple, consisting of a competition between the two characters, Big Nutbrown Hare and Little Nutbrown Hare, to show how much
they love each other, the picture book has been translated into 37 languages and is still acclaimed by both literary critics and readers worldwide. This study begins with a brief introduction to multimodality and an account of multimodal approaches to picture books. Special reference is also made to Hallidayan linguistics in an attempt to clarify the boundaries between the categories that Nikolajeva and Scott distinguish, regarding the interaction between verbal and visual elements. Once our own coding scheme is established and the methodology is outlined, the focus turns to the study of the word/image relationship in this narrative. The conclusions drawn from the analysis will shed light on the way images and verbal text are combined in this picture book. This knowledge can be used to facilitate the understanding of the message to young readers and capture their attention throughout the story.
7.2 Scope of the study: Systemic functional linguistics and visual social semiotics
In this section, systemic functional linguistics (henceforth SFL) is described as a powerful and flexible model that can be applied to the analysis of semiotic modes other than language. Previous studies based on SFL will be referred to as further examples of its application in multimodal genres. After outlining the coding scheme that is adapted here, the analysis of the co-deployment of images and text in Guess How Much I Love You will be carried out. Nikolajeva and Scott’s approach to picture books has also been taken into account to determine the synergy of verbal and visual strands in this narrative, typically intended for first readers or for six-year-olds and under (Ávila, 2008).
7.2.1 Multimodality and SFL
According to Kress and van Leeuwen (2001, 2006 [1996]), multimodal texts involve the utilization of several semiotic modes within a social and cultural context, which leads to the creation of a semiotic product or event. This approach works well in the interpretation of picture books since they are polyphonic in nature (Hunt, 1999) and their meanings are made up of the combination of various semiotic resources. The interdependence of images and words demands that both visual and verbal modes are considered in the process of understanding the writer’s and the illustrator’s influence on the final outcome (Lewis, 2006). We agree with Lemke (1998) and Hagan (2007) when they state that the relationship between words and pictures is not necessarily symmetrical, as each mode specializes in the transmission of specific meanings. The verbal resources of language are essential for the representation of sequential relations and events, while the resources of image are most appropriate for the representation of spatial aspects and non-linear relationships. Words
narrate, determine time and tell us what characters say or think. However, pictures best show what characters look like, what they are doing and the setting in which they appear. Spatial and physical descriptions are typically created by the illustrator as they may instantly communicate information depicting the characters’ physical appearances and setting which, if only narrated verbally, would be too verbose. While words serve to describe space and communicate solely by means of narration, pictures can actually show this, thus transmitting information more effectively (Graham, 2000). SFL has been implemented as an application which may be useful for both the study of words and images in picture books. O’Toole (1994), O’Halloran (2003), Ventola et al. (2004), Kress and van Leeuwen (2006) and Martin (2008), among others, have demonstrated that SFL tools may not solely belong to the study of language and could also be applied to the analysis of other semiotic resources. SFL recognizes three metafunctions, concerned with different types of meaning: interpersonal, textual and ideational. Interpersonal meaning deals with enacting social relationships. In a clause, interpersonal meaning includes the type of speech act chosen (statement, question, command) and the mood of the clause, expressed in English by the presence/absence and ordering of subject and finite verb. The expressions of attitude and judgement, realized by the system of polarity and modality, are also part of the interpersonal meaning. Simultaneously, the nature of textual meaning is concerned with creating a text with relevance to the context in which it is produced and understood. Through the textual metafunction the clause is organized as a coherent message and involves the choice of a particular starting point for it, or theme, which in English tends to be located in first position. 
Finally, and of special interest is the ideational metafunction, which deals with the clause as a representation of patterns of world experiences. In the clause, the experiential component is represented by choices in transitivity. These choices are conceptualized as situation types with the following components: the type of process chosen, the number of participants involved, the attributes ascribed to them, and the circumstances of place, time, manner, etc. attendant to the process itself (Halliday and Matthiessen, 2004; Royce, 2007; Moya and Pinar, 2008).
7.2.2 Towards a model for the study of intersemiotic relations
Kress and van Leeuwen (2006 [1996]) expand on the SFL model to account for types of semiotic meanings other than those encoded by verbal language. These linguists have developed a method of social semiotic analysis of visual communication, based on Halliday’s (1978) social semiotics, and have created a descriptive framework of multimodality, assigning representational, interpersonal and compositional meanings to images.1 Their model, essentially designed for the study of advertising images, provides some valuable clues for the interpretation and understanding of visuals in multimodal
genres. However, although Kress and van Leeuwen (2006 [1996]) consider that the non-verbal component is somehow connected with the verbal, they also assume that the verbiage is in no way dependent on it. In fact, they study images independently of verbal messages and, therefore, do not make explicit references to the specific systems used for analysing the interaction between visual and verbal modes. Nikolajeva and Scott’s (2000, 2001) approach to picture books may serve to fill this gap, as they provide five categories to describe word and image interactions. These categories are symmetrical, enhancing, complementary, counterpointing and contradictory interactions. However, as Nikolajeva and Scott also suggest, these terms are not absolute, and the boundaries between the various categories are not always clear-cut. For example, the only feature that Nikolajeva and Scott consider to distinguish between enhancing and complementary relationships is based on the additional information, minimal (enhancement) versus significant (complementary) that the two media provide. In fact, the delimitation of Nikolajeva and Scott’s categories concerning the intersemiotic relationship between visual and verbal media may require a further look into the SFL account, and more specifically into the ideational metafunction of the language. We soon noticed that visual and verbal interactions could not be satisfactorily analysed only in terms of perceptual relations. The additional adaptation of Halliday’s representational metafunction of language will enable us to compare the linguistic patterns of the participant, process and circumstance in the verbal text with the represented participants, the interactive participants and the coherent structural elements of the visual mode (Kress and van Leeuwen, 2006 [1996]). 
In this sense, Halliday and Matthiessen (2004) distinguish among six types of processes associated with the specific participants’ roles and the circumstances which are represented in the clause. The three core types are: material, mental and relational processes. Material processes reflect the processes of the external world and are typically processes of doing and happening. Mental processes, however, refer to the processes of consciousness and are typically processes of experiencing or sensing: perception (seeing, hearing, feeling), cognition (knowing, believing, understanding) and affection or feeling (liking, wishing, fearing, wanting). They involve Senser and Phenomenon participants. Relational processes, in turn, are processes of classifying and identifying, whereby a participant is identified or situated circumstantially. Finally, although not clearly set apart, further categories located at the boundaries between these three core types can be distinguished: behavioural, verbal and existential processes2 (Halliday and Matthiessen, 2004). Like linguistic structures, visual structures and the visual processes within them are linked to participant roles and to specific attributes and
circumstances. Narrative images are associated both with (i) action processes, which are similar to material and behavioural processes in language, and (ii) with reaction processes, equivalent, to a certain extent, to the mental processes of perception in the linguistic system of transitivity. Although not necessarily identical in function, conceptual images are related to relational and existential processes in language and their associated participants (Kress and van Leeuwen, 2006). Thus, the analysis of the intersemiotic relation between visual and verbal modes in ideational terms requires the identification of: (i) the represented participants reflected in the verbal text and the visual, (ii) the processes or the activity described, (iii) the attributes or the qualities of the participants and, finally, (iv) the circumstances in which the action is being developed (Moya and Pinar, 2008).
Analyses of this type have already been carried out by Williams (2000), Royce (2007) and Martin (2008), among others, who have explored the benefits of applying SFL to different genres, in which the co-deployment of visual and linguistic elements plays a key role. Williams (2000), for example, observes the linguistic patterns of Actors and Goals in the lines of Anthony Browne’s Piggybook (1986) in order to show how these particular SFL tools help children to understand this story. The lack of goals in the male participants’ discourse appears to reveal gender attitudes throughout the entire narrative and subsequently builds up mental pictures of the main characters. For the study of the semantic interrelation between verbal and visual modes, Royce (2007, p. 69) also applies the Hallidayan linguistic model of ideational cohesive relations to the visual mode of multimodal texts. After analysing an extract from the issue of The Economist Magazine (2003), ‘Mountains still to climb’, Royce (2007, p. 
84) shows that the occurrence of intersemiotic synonymy and repetition shows that the visual and the verbal modes complement each other in maintaining the main topic of the text, while the use of meronymy and collocation seems to support financial discussion. Within this line, and using the tools that the SFL model offers (system of transitivity, thematic patterns, attitude, etc.), Martin (2008) analyses the picture book Photographs in the Mud, written by Diane Wolfer and illustrated by Brain Harrison-Lever (2005). The complementary interaction of verbal and visual strands in this picture book serves the purpose of the story, basically to reconcile the opposing post-war feelings between the Australians and the Japanese. To achieve this, the perspective of two fictional soldiers, their families included, represent the two opposing sides in the Kokoda Campaign in New Guinea in 1942. Martin shows how the co- deployment of images and words in this multimodal text can only be understood in relation to the functions they fulfil within the specific context for which it was created.

7.3 Our coding scheme

After having considered the application of SFL to multimodal texts, this section establishes the model upon which our analysis is based. The theoretical foundation for this analysis is mainly extrapolated from the SFL account of language as a social semiotic process (Halliday, 1978) and from Nikolajeva and Scott’s (2000, 2001) approach to picture books. SFL is considered to offer useful tools for drawing the boundaries between the categories proposed by Nikolajeva and Scott, since it helps to identify what information is carried in the verbal text and what is carried in the visual. Our model distinguishes four types of intersemiotic relations: symmetrical interaction, ideational complementarity, counterpointing interplay and, lastly, contradictory interaction.

In symmetrical interaction, words and images are considered to convey the same story, essentially repeating information through different forms of communication (Nikolajeva and Scott, 2000, p. 225). Although in this case the visual concurs with the textual, which may involve a good deal of redundancy, this does not necessarily imply a simple duplication of information, as the freedom of interpretation allowed in images may suggest additional meanings (Gill, 2002). Furthermore, for a relationship of a concurrent nature to hold, the verbiage and the imagery must have an equivalent circumstantial participant-process configuration.

The term ideational complementarity will be used to refer to those cases in which what is represented in images and what is represented in language are different but complementary (Unsworth, 2008, p. 15). Pictures amplify the meaning of the words, or the words expand upon the images, so that one of the two modes provides the additional information that the other lacks.
Unlike symmetrical interaction, in this relationship the written part and the image do not necessarily have an equivalent circumstantial participant-process pattern, and the illustrations often show new functional elements which are not referred to in the textual component. Following Gill (2002), we also assume that relationships of a complementary nature occur in the following cases: (i) where significant segments of the narrative are shown in pages consisting of image or text alone; (ii) when juxtaposed images and text jointly construct activity sequences; and (iii) when significant elements of the action occur within the images only, even though the text is also co-present on the page. Finally, as will be shown in Section 7.4.2, (iv) the use of semiotic metaphors³ (O’Halloran, 2003) also contributes to creating ideational complementary interactions.

Depending on the degree of different information presented, a counterpointing interaction may develop where words and images provide ‘alternative information’ and collaborate to communicate meanings beyond the scope of either one standing alone⁴ (Nikolajeva and Scott, 2000). By using
counterpointing interaction, the writer and the illustrator may show an intentional lack of coherence between the verbal and the visual elements, opening the storyline up to multiple interpretations. An increased effort is then required to understand the connection and to integrate the information that the words and images provide simultaneously (Lewis, 2006 [2001]). Although this is not always the case, the technique is associated with a divergence in the participant-process and circumstantial patterns of the verbal and visual storylines. Nikolajeva and Scott (2000, 2001) distinguish two types of counterpointing interaction: (i) ironic counterpoint takes place when the story is told from an ironic point of view (for instance, a traditional princess wearing a pair of jeans or displaying impolite manners); (ii) perspectival counterpoint occurs when the story is told from a perspective that differs from what a character perceives, or from what the reader or the other characters of the tale see. For example, the creatures in the textual part of Where the Wild Things Are by Sendak (1963) may suggest a terrifying story, perhaps too frightening for children. However, the pictures present a story of an entirely different nature, where the wild creatures do not prove to be such scary characters (Nodelman, 1988).

Finally, in contradictory interaction, words and pictures significantly contradict each other in some way or express entirely different things (Lewis, 2006, p. 39). The pictures usually tell what really happened, while the text, in many cases, embodies the contrast (Stevenson, 1998). This ambiguity challenges the reader to mediate between the words and the pictures in order to establish a true understanding of what is being depicted. The lack of equivalence between the verbal and the visual components usually translates into a different participant-process-attributive-circumstantial configuration.
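
The distinctions drawn in this section can be restated as a rough decision procedure. The following Python sketch is only an illustration under stated assumptions, not part of the coding scheme itself: the class Configuration, the function classify_interaction and the parameter analyst_judgement are hypothetical names, and the element labels are illustrative glosses. It treats each mode as a participant-process-circumstance configuration, codes identical configurations as symmetrical and differing ones as complementary, and leaves counterpointing and contradictory relations to the analyst, since these depend on interpretation rather than on a comparison of labels.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Configuration:
    """Ideational content of one mode (verbal or visual) in one composition."""
    participants: frozenset = frozenset()
    processes: frozenset = frozenset()
    circumstances: frozenset = frozenset()


def classify_interaction(verbal: Configuration,
                         visual: Configuration,
                         analyst_judgement: Optional[str] = None) -> str:
    # Counterpoint and contradiction rest on interpretation (irony, perspective,
    # outright conflict), so here the analyst supplies them by hand (assumption).
    if analyst_judgement in ("counterpointing", "contradictory"):
        return analyst_judgement
    # Equivalent participant-process-circumstance configuration in both modes.
    if verbal == visual:
        return "symmetrical"
    # Otherwise one mode supplies information that the other lacks.
    return "complementary"


# Book cover of Guess How Much I Love You (labels are illustrative glosses).
cover_verbal = Configuration(
    participants=frozenset({"I", "you"}),
    processes=frozenset({"guess", "love"}),
    circumstances=frozenset({"how much"}),
)
cover_visual = Configuration(
    participants=frozenset({"I", "you"}),
    processes=frozenset({"hold"}),
)
cover_relation = classify_interaction(cover_verbal, cover_visual)
print(cover_relation)  # complementary
```

In practice a single composition may host several interactions, so a sketch of this kind would be applied to each verbal segment and its associated image rather than to the page as a whole.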

7.4 The interplay of images and words: Analysis and exemplification

At this point, we will analyse the co-deployment of words and images in Guess How Much I Love You by applying the coding scheme presented in Section 7.3. Evidence taken from the tale will be provided to show the different types of interactions established between the visual and verbal modes.

7.4.1 Method of analysis

The correlation between the written and visual components of the narrative has been established by comparing the circumstantial-participant-process configuration of the verbal component with the corresponding represented participants, visual processes and settings shown in the pictures. Finally, as most of the illustrations are double spreads and include different visual and written elements, more than one of the interactive relationships
described in Section 7.3 may be found within a single illustration. This means that, although we have distinguished 18 compositions in the whole book, the number of verbal/visual interactions encountered (31) exceeds the number of illustrations identified.

In this study, we have used the first edition of Guess How Much I Love You, published in 1994 by Walker Books Ltd, London. For the visual analysis, we have considered 18 illustrations distributed over 11 double spreads and seven single pages (1, 12, 13, 14, 16, 17 and 18). The 18 illustrations have been numbered from the first to the last page, and references will be made to those numbers. For example, after Illustration 11, when the two hares are looking at the river and the hills, we have distinguished two more illustrations, 12 and 13, on the basis of the relationship between the two paragraphs and their associated pictures. These are followed by Illustration 14, where Big Nutbrown Hare starts to carry his offspring in his arms. After Illustration 15, when the little hare is sleepy and closes his eyes, Big Nutbrown Hare kisses his son and sets him on the bed (Illustrations 16 and 17). At this point, the big hare lies down next to the little one and whispers, while smiling, how much he loves him. This illustration, located between a paragraph of three lines and the expression ‘and back’, brings the story to an end.

7.4.2 The co-deployment of verbal and non-verbal components

In line with Lewis (2006 [2001]), we have assumed that picture books do not always maintain the same interanimation between words and images throughout the entire tale. The word–picture relationship might even change from composition to composition and from page to page throughout the book. This has led us to analyse how visual and verbal components act upon each other, not only within individual images, but also across images.

One example of complementary interaction can immediately be identified on the book cover. It is notable that the title makes reference to the feeling of love and also to the cognitive ability of guessing, which suggests that the main characters are human beings. It is only through the pictures that the reader realizes that the characters are not human, but hares. In addition, there is no one-to-one correspondence between the processes that the verbal and the visual components show. The mental processes of love and guess are not reflected in the illustrations, where Little Nutbrown Hare is just holding his father’s ears. Only the participants I and you are represented as such in both semiotic modes. The adverbial phrase, how much, lacks a visual representation as well.

In Illustration 7.1, the first pictorial element of the tale, the text does not seem to be richer in content than the images, since all the written episodes are illustrated visually. Big Nutbrown Hare’s very long ears are depicted as such in the illustration. In addition, the two material processes, going to and held on, and the participants (the two hares, Big Nutbrown Hare’s long ears
and the bed), mentioned in the verbal text, appear faithfully represented. Finally, Little Nutbrown Hare’s bed is also depicted.

Illustration 7.1 Narrative process: symmetrical interaction

From the second to the fourth illustrations, the verbal/visual relationship is essentially of an ideational complementary nature, since the illustrations and the text each provide relevant additional information that the other component lacks. However, symmetrical relations can also be identified in these compositions. In Illustration 2, for example, the setting, a large tree, is depicted only through the visual component. The tree is a Locative Circumstance which recurs throughout the story and which could be interpreted as a conceptual structure that symbolizes protection and security (Kress and van Leeuwen, 2006 [1996]). In addition, Big Nutbrown Hare’s answer, Oh, I don’t think I could guess that, is not reflected in the picture. The mental process included in the imperative mood structure, guess how much I love you, is not shown in the visual mode either. Only the first part of the verbal text keeps a symmetrical relationship with the pictures. The information expressed by the relational process, want to make sure, and the behavioural process, listen, in He wanted to make sure that Big Nutbrown Hare was listening ... , is represented in the illustration, where Little Nutbrown Hare is holding his father’s ears.

In Illustration 7.2, the relational process, had, and its associated participant with its attribute, long arms, are echoed in both the verbal and visual components, establishing a symmetrical interaction. However, Little Nutbrown Hare’s thought, Hmm, that is a lot, cannot be deduced from the visual component, which shows only that he is looking at his father’s outstretched arms. The verbal process, say, is not depicted either, because Big Nutbrown Hare’s
mouth is closed. Once again, the process of love is only referred to in the verbal component. In addition, the meaning of this much is exemplified in the illustration through the use of a semiotic metaphor (O’Halloran, 2003), since this circumstance (linguistic) is represented in the visual mode by the material process of stretching out the arms. This feature becomes recurrent throughout the whole story: some circumstances, with postmodifying clauses and phrases (I love you as high as I can reach; I love you all the way up to my toes; I love you as high as I can hop; I love you all the way down the lane as far as the river; I love you right up to the MOON AND BACK), are represented visually as material processes. This shift increases the complementary nature of the relationship established between words and images in the tale.

Illustration 7.2 Narrative process: complementary interaction

Symmetrical and complementary relations also define the visual/verbal interaction in Illustrations 5–8. In the fifth illustration, for example, the verbal process, said, is reflected in the visual, as the little hare’s mouth seems to be open, and the lines that emerge from it suggest a certain movement. However, Big Nutbrown Hare is shown observing his little one, although nothing about this is mentioned in the text. As mentioned in Section 7.3, in our coding scheme we have assumed that relationships of ideational complementarity also occur where significant segments of the narrative are shown in pages consisting of standalone images (Gill, 2002). Furthermore, the meaning of I love you [as high as I can reach] can only be fully understood through the visual component. Here there is a shift from circumstance (linguistic) to process (visual) as the young hare lifts up his arms. The linguistic element, as high as I can reach, realized as a rankshifted clause, becomes the major process in the illustration, with the hare as participant. This semiotic
metaphor increases the complementary nature of the interaction between the two semiotic modes. Similarly, in Illustration 7 another semiotic metaphor can be identified, as the circumstance with postmodifying phrase, up to my toes, becomes a process in the visual medium. In this composition, most of the textual part on the verso is reflected in the image when Little Nutbrown Hare tumbles upside down and reaches up to the trunk of the big tree with his feet. In contrast, the mental processes, love and had an idea, are not depicted. There is no mention of either Big Nutbrown Hare or his large ears, emerging as big branches. However, they do make an appearance on the verso.

In the previous double spreads, the exemplification of how much one hare loved the other was drawn through the positioning of the limbs (see Illustration 7.2). The exemplifying hare was placed on the left, while the one observing was positioned on the right. At this point a change of spatial orientation occurs, with the hind legs used to represent the textual aspects of I love you all the way up to my toes. In addition, the exemplifying hare is placed upside down on the right-hand side of the double spread. In this way, reading the picture book does not become monotonous, redundant or predictable.

In Illustration 9, the orientation of the story changes again, since Little Nutbrown Hare is represented alone hopping up and down, depicting movement in what has been termed simultaneous succession. This implies a sequence of pictures showing moments that are disjunctive in time but perceived as belonging together, in an unequivocal order (Nikolajeva and Scott, 2001, p. 140). In this case, the exemplification of their love is not made through body parts alone, but with the whole body. Here, the verbal/visual interaction is principally symmetrical, as the text and the pictures express the same message. Little Nutbrown Hare is represented visually, as are the material processes, hop and bounce.
However, the meaning of as high as I can hop cannot be deduced from the text alone; the illustration exemplifies and clarifies the extent of the jump, establishing in this way an ideational complementary relationship. The gaps left by the verbal component are thus filled by the pictures. The process of love is not represented in the visual mode either.

A new change in the orientation of the story is observed in Illustration 11. The body becomes unimportant when it comes to transmitting feelings. Rather, from now on, other elements, such as the landscape, the river, the path, the hills and the moon, are used to describe the hares’ love for each other. The whole landscape represented in the long shot of this visual composition is not transmitted by the text. Consequently, the visual elements add to the verbal, since we can also see small houses, trees, roads and a few bushes, not just the river and the hills mentioned verbally by the main characters. A complementary relationship is thus established. Neither of the verbal processes, cry and say, is represented visually, as the two hares’ mouths seem to be static and closed. The mental
process of an affectionate nature, love, in I love you all the way down the lane as far as the river, is not reflected in the visual component. In addition, the linguistic part, as far as the river, is a circumstance with a postmodifying phrase which becomes a process in the visual mode, creating a new semiotic metaphor. Only the two main participants (Little Nutbrown Hare and Big Nutbrown Hare) and the secondary entities (the river, the hills) are referred to in both semiotic modes. This also gives the interaction between verbal and non-verbal elements a symmetrical nature.

In Illustration 12, it seems that the story is coming to an end. Little Nutbrown Hare puts his hands over his eyes, indicating that he is tired and sleepy. However, in the following illustration, 13, and its corresponding paragraph, located on the same page, we find another temporal marker (then), which signals a change of orientation towards the sky. The text says: [...] Nothing could be further than the sky, establishing a complementary visual/verbal interaction, as the verbal text extends the information that the depicted character shows. In Illustration 14, Little Nutbrown Hare is being held by his father, but this fact is not mentioned anywhere in the text. The verbal/visual interaction is, thus, complementary in nature. The moon also appears as a temporal marker. Even though the whole story takes place at night, this is the first time we have a visual reference to it. In Illustration 15, Little Nutbrown Hare is in his father’s arms with his eyes closed and it seems as if the story is once again nearing its end, as confirmed in the following illustrations, 16 and 17, where Big Nutbrown Hare sets Little Nutbrown Hare on a bed of leaves and kisses him goodnight. These latter illustrations are basically symmetrical, because they represent the content of the text verbatim. The little hare’s bed and some bushes also form part of this picture.
The moon against a dark blue background indicates night time. The child may think at this point that Little Nutbrown Hare has won the competition against his father. However, the reader is prompted to turn the page to find Big Nutbrown Hare whispering that he loves him up to the moon AND BACK. Provided that we also take into consideration the first part of the text, the visual/verbal relationship is basically symmetrical. However, the process, whisper, and the last written part, I love you right up to the moon AND BACK, introduce a complementary relationship, since they are not echoed in the final picture.

As shown in Table 7.1, the results of the analysis confirm that the forms of word/image interaction used in Guess How Much I Love You to transmit meaning are mainly symmetrical (52%) and complementary (48%). This is probably because the tale is intended for young children whose literacy is still limited. In fact, no cases of contradictory or counterpointing relations have been found, because by using them the writer and the illustrator might add too much new information, making the message more difficult for the young child to decipher.

Table 7.1 Visual and verbal interaction

Categories        Absolute values   Values in percentages
Symmetrical       16                52%
Complementary     15                48%
Contradictory     0                 0%
Counterpointing   0                 0%
Total number      31                100%

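
The percentages in Table 7.1 follow directly from the absolute values. As a minimal check, the short Python sketch below recomputes them from the counts reported in the table; the variable names are illustrative, and rounding to the nearest whole percent is an assumption about how the published figures were obtained.

```python
# Absolute values as reported in Table 7.1.
counts = {
    "Symmetrical": 16,
    "Complementary": 15,
    "Contradictory": 0,
    "Counterpointing": 0,
}

total = sum(counts.values())

# Share of each category, rounded to the nearest whole percent (assumption).
percentages = {name: round(100 * n / total) for name, n in counts.items()}

print(total)        # 31
print(percentages)  # {'Symmetrical': 52, 'Complementary': 48, 'Contradictory': 0, 'Counterpointing': 0}
```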
The study of the visual components and their relationship with the text also shows that content is transmitted through both the verbal and the visual, but that each mode specializes in the transmission of particular meanings. In picture books, with their limited scope of verbal text, verbal descriptions of characters and setting are frequently absent or negligible. In Guess How Much I Love You the landscape, for instance, is not described in words; it is only represented through images. If only words were used to describe the setting and the characters’ physical appearance, the plot would take much longer to develop, contradicting the principle of brevity that is central to children’s narratives. However, the actions carried out by the characters to show how much they love each other are expressed through the co-deployment of both material processes and narrative images. In fact, there is a predominance of mental (33% of the cases counted) and material (27%) processes over verbal (17%), relational (15%) and behavioural (8%) processes (Moya and Pinar, 2008). Material processes give dynamism to the tale and contribute to the development of the plot, as they describe the actions performed by both Little Nutbrown Hare and Big Nutbrown Hare to show how great their love is. The writer and the illustrator describe the two hares’ expressions of love by using both material processes and narrative images. However, secondary aspects, which are less important for the development of the plot, are represented in a single mode only. For this reason, relational, mental, verbal and behavioural processes, which represent what the characters are feeling, saying or thinking, are basically reflected in the verbal text, without being depicted in the illustrations. Relational and behavioural processes show a lower frequency than the other types, as their descriptive nature might have led to detailed description that could have interfered with the narrative tension.
Verbal processes, which are usually reflected only in the verbal text, reproduce the words uttered by the main characters in direct speech, giving more immediacy to the events narrated. Finally, although manifestations of affection are central to the plot of the tale, the mental process of love is also represented solely in the verbal mode. The inner aspects of our experience of the world are
not easily shown in non-verbal language as psychological descriptions often need words to capture emotion and motivation.

7.5 Conclusion

The aforementioned results show that in Guess How Much I Love You both symmetrical and complementary interactions are used throughout the tale to create meaning. Symmetrical relations facilitate the comprehension of the message, as words and images express the same information through different modes. In books for beginning readers, illustrations are frequently associated with the participants they evoke. The visual proximity of the pictures to the objects they represent helps young children to decode the messages, especially at such an early age, when they still struggle to identify verbal graphemes. In addition, through the use of ideational complementary interactions the reader’s attention and the narrative tension are kept alive, since some of the information offered is different, albeit only partly. In complementary relations images and words contribute differently to the storyline, as either the images enhance the meaning of the words, or the words expand upon the meaning transmitted by the visual component.

In this sense, the concept of semiotic metaphor has also proven to be a useful tool for the analysis of the visual and verbal interaction in this picture book. Throughout the story, circumstantial elements in the textual component, such as this much, as high as I can reach, up to my toes, as high as I can hop, as far as the river or up to the moon and back, which quantify the extent of the love between the two hares, are realized as processes in the visual mode. This feature intensifies the complementary character of the interanimation between pictures and words, as there is a semantic shift from the linguistic to the visual and vice versa. No cases of counterpointing or contradictory interactions have been found, as they demand a more developed decoding capacity, not yet acquired at such an early age.
In addition, when these interactions are used, images enter into conflict with the information that the text expresses and, as a result, the reading process is slowed down. Thus, the early age of the children for whom this tale was written and illustrated seems to determine the symmetrical-complementary interaction that predominates in this picture book.

Although further exploration is needed to reach a definite conclusion concerning the potential of combining images and words in picture books, we hope that this chapter has contributed to clarifying the effects of combining verbal and visual strands in Guess How Much I Love You. The predominance of mainly symmetrical but also complementary interactions may be due to the limited literacy of children under age six, whose cognitive ability, still developing, does not meet the demands of intersemiotic relations of a
more complex mental nature. As demonstrated in Section 7.4, however, this does not imply that the illustrations are necessarily a faithful echo of the meanings transmitted by the verbal text. Concerning interactions, the use of symmetry and complementarity, and the lack of counterpointing and contradiction, seem to be useful techniques for creating tales that are both easy for young children to understand and, in turn, interesting enough to hold their attention. Artists and writers should be aware of the potential of combining verbal and visual modes in picture books so that they offer complementary meanings, without pushing the limits drawn by the cognitive and literacy abilities of their young readers.

Acknowledgements

Our sincere gratitude to Kay O’Halloran and Eija Ventola for their helpful comments and suggestions on the concept of semiotic metaphor and its applications to this picture book. Special thanks also to Edie Cruise and Lauren Webster for their style suggestions on previous drafts. Finally, the authors are grateful to Walker Books Ltd for granting permission to reproduce, as Illustrations 7.1 and 7.2, inside illustrations from Guess How Much I Love You (1994) by Sam McBratney, illustrated by Anita Jeram.

Notes

1. Representational meaning deals with the forms of visual representation of events in the world, the represented participants (RPs) involved in them and the circumstances in which they occur. It is defined by two key features, the Narrative and the Conceptual, which are differentiated by the presence or absence of vectors. While narrative images are characterized by vectors of motion which allow viewers to create a story about the RPs, conceptual images do not include vectors. In addition, interpersonal meaning is concerned with the type of relationship established between the viewer and what is viewed. Interpersonal meaning includes the following features: Image Act and Gaze, Social Distance and Intimacy, Horizontal Angle and Involvement, and Vertical Angle and Power Relations. Finally, compositional meaning concerns the ways in which the layout of the composition gives more information value and relative salience to certain elements drawn in the image. This type of visual meaning is characterized by three features: Information Value, Framing and Salience. For further information on this approach to images, see Kress and van Leeuwen, 2006.

2. Behavioural processes are on the borderline between material and mental processes, as they include characteristics of each. They represent the outer manifestations of inner aspects of our experience and include volitional processes (watch, listen), involuntary or spontaneous uncontrolled processes (laugh) and psychological states (sleep). Verbal processes are processes of saying and are typically associated with the following participants: Sayer, Receiver (addressee) and Verbiage (the said). Finally, by means of existential processes, phenomena of all kinds are recognized to exist, or to happen.

3. O’Halloran (2003) extends the Hallidayan concept of grammatical metaphor to the semiotic metaphor in order to determine how semiotic modes interact with each other. Like grammatical metaphor, semiotic metaphor also involves a shift in the grammatical class or function of an element. As this process does not take place intra-semiotically (as in the case of grammatical metaphor), but inter-semiotically, the reconstrual produces a semantic change in the function of that element (O’Halloran 2003, p. 357). These shifts may involve either the reconstruction of textual processes as participants in the visual representation or the introduction of new visual participants. For further information on semiotic metaphor and its application to mathematical and scientific discourse, see O’Halloran (2003).

4. Counterpoint is a term derived from Latin and describes the art of combining individual melodic voices to form a harmonious whole (Lewis, 2006). In the case of picture books, the technique of counterpointing is used to refer to the juxtaposition of the individual verbal and visual strands, their influence on each other and the possible combined effect of both layers as a whole.
References Ávila, J. A. (2008) La Progresión Temática y Tópica de Narraciones Infantiles en Lengua Inglesa. Un Estudio Contrastivo por Edades (Cuenca: Servicio de Publicaciones de la Universidad de Castilla-La Mancha). Feaver, W. (1977) When We Were Young: Two Centuries of Children’s Book Illustrations (London: Thames and Hudson). Gill, T. (2002) Visual and Verbal Playmates: An Exploration of Visual and Verbal Modalities in Children’s Picture Books (Unpublished BA (Hons): University of Sydney). Graham, J. (2000) ‘Creativity and picture books.’ Reading 34(2): pp. 61–7. Hagan, S. M. (2007) ‘Visual/verbal collaboration in print. Complementary differences, necessary ties, and an untapped rhetorical opportunity.’ Written Communication 24(1): pp. 49–83. Halliday, M. A. K. (1978) Language as Social Semiotic: The Social Interpretation of Language and Meaning (London: Edward Arnold). Halliday, M. A. K. and C. M. Matthiessen (2004) An Introduction to Functional Grammar, 3rd edn. (London: Edward Arnold). Hunt, P (ed.) (1999) Understanding Children’s Literature (London: Routledge). Kress, G. and T. van Leeuwen (2001) Multimodal Discourse. The Modes and Media of Contemporary Communication (London: Arnold). —— (2006 [1996]) Reading Images. The Grammar of Visual Design (London: Routledge). Lemke, J. (1998) ‘Multiplying meaning: Visual and verbal semiotics in scientific text’, in J. R. Martin and R. Veel (eds) Reading Science: Critical and Functional Perspectives on Discourses of Science (London: Routledge), pp. 87–113. Lewis, D. (2006 [2001]) Reading Contemporary Picturebooks. Picturing Text (London: Routledge/Falmer). Martin, J. R. (2008) ‘Intermodal reconciliation: Mates in arms’, in L. Unsworth (ed.) New Literacies and the English Curriculum: Multimodal Perspectives (London: Continuum), pp. 112–48. Moebius, W. (1986) ‘Introduction to picturebooks codes.’ Word and Image, 2(2): pp. 141–58.
On Interaction of Image and Verbal Text in a Picture Book
Moya, J. and M. J. Pinar (2008) ‘Compositional, interpersonal and representational meanings in a children’s narrative. A multimodal discourse analysis.’ Journal of Pragmatics, 40(9): pp. 1601–19.
Nikolajeva, M. and C. Scott (2000) ‘The dynamics of picturebook communication.’ Children’s Literature in Education, 31(4): pp. 225–39.
Nikolajeva, M. and C. Scott (2001) How Picturebooks Work (New York and London: Garland Publishing).
Nodelman, P. (1988) Words about Pictures: The Narrative Art of Children’s Picturebooks (Athens: The University of Georgia Press).
O’Halloran, K. L. (2003) ‘Intersemiosis in mathematics and science: Grammatical metaphor and semiotic metaphor’, in A. M. Simon-Vandenbergen, M. Taverniers and L. Ravelli (eds) Grammatical Metaphor: Views from Systemic Functional Linguistics (Amsterdam: John Benjamins), pp. 337–65.
O’Toole, M. (1994) The Language of Displayed Art (London: Leicester University Press).
Royce, T. D. (2007) ‘Intersemiotic complementarity: A framework for multimodal discourse analysis’, in T. D. Royce and W. L. Bowcher (eds) New Directions in the Analysis of Multimodal Discourse (Mahwah: Lawrence Erlbaum), pp. 63–110.
Schwarcz, J. H. (1982) Ways of the Illustrator: Visual Communication in Children’s Literature (Chicago: American Library Association).
Sendak, M. (1963) Where the Wild Things Are (US: Harper and Row).
Spitz, E. (1999) Inside Picture Books (New Haven, CT: Yale University Press).
Stevenson, D. (1998) ‘Narrative in picture books or, the paper that should have had slides’, in B. Hearne, J. Del Negro, C. Jenkins and D. Stevenson (eds) Story: From Fireplace to Cyberspace (US: the Board of Trustees of the University of Illinois).
Unsworth, L. (ed.) (2008) New Literacies and the English Curriculum: Multimodal Perspectives (London: Continuum).
Ventola, E., C. Charles and M. Kaltenbacher (2004) Perspectives on Multimodality (Amsterdam/Philadelphia: John Benjamins).
Williams, G. (2000) ‘Children’s literature, children and uses of language description’, in L. Unsworth (ed.) Researching language in schools and communities: A functional linguistic perspective (London: Cassell), pp. 111–29.
8 The Text-Image Matching: One Story, Two Textualizations María Cristina Astorga
8.1 Introduction
Due to the influence of information technology, picture books, like any other kind of book, have had to compete with television, the video industry and computer games. This phenomenon has created the need to develop integrative visual and verbal literacy practices in order to foster children’s engagement with books in ways that nurture their own understanding of the implications of text-image matching. Specialists in literacy education believe that if students are to become effective participants in emerging multiliteracies, they need to be able to understand how the resources of language, image and digital rhetorics can be deployed independently and interactively to construct different kinds of meanings (Unsworth, 2001, p. 8). Although the use of picture books in EFL/L2 classrooms is an established practice, we need more research that explores how the elements of the stories recounted in picture books for EFL/L2 learners are intermingled through the visual and verbal semiotics.
This chapter has two aims. The first is to show how semantic relations between text and image can be uncovered by drawing on the semiotics of images as analytical tools. For this purpose, two different textualizations of The Sly Fox and the Red Hen (Ullstein, 1987) for two different readerships will be considered. The second aim is to indicate the usefulness of these descriptions in facilitating a multimodal approach to the teaching of EFL/L2-reading to young learners. The study is carried out by describing and comparing how the visual and verbal systems of transitivity are simultaneously realized in the orientation, the complication and the resolution stages of the stories.
8.2 The grammar of visual images
This section will present a semiotic account of images developed by researchers who have extrapolated the visual categories they propose from the systemic functional descriptions of English grammar. Drawing on Halliday (1985), Kress and van Leeuwen (1990, p. 13) propose that visual design, like language or like all semiotic codes, fulfils the following functions: it can represent patterns of experience (the ideational function), it can enact social interactions (the interpersonal function) and it can integrate the representational and interactive patterns into a meaningful whole (the textual function). The analysis of images as representations of states of affairs in the world, and of the social relations between viewer and image, as proposed by Kress and van Leeuwen (1990), offers a new perspective on the study of written narratives where image and text interact in significant ways to make a narrative whole. Their proposal encourages a linguistically oriented approach to the study of children’s literature which, despite its widely recognized influence on children’s social, cognitive and linguistic development, has received, as Knowles and Malmkjaer (1996, p. 1) observe, little linguistic analysis. The description of text-image matching will here be focused only on the ideational metafunction, where the experience of the world is realized through:
● a variety of participants, e.g. Alice (animate), a large mushroom (inanimate);
● processes, e.g. ran (material), wondered (mental), said (verbal), swallowed (behavioural), was (relational), there was (existential); and
● circumstances, e.g. happily (how?), in a thick wood (where?), just at this moment (when?).
In the category of images that Kress and van Leeuwen (1990) propose for the analysis of the visual system of transitivity, they distinguish between conceptual images and presentational images. Conceptual images are those images that represent the meaning of a participant, its stable and visible essence, and define it as a member of a class. They serve to explain what things are like and are about ‘being’. As to the relationship that these images have with the grammar of transitivity, Kress and van Leeuwen attribute to them a function akin to – but by no means identical to – existential and relational processes. Presentational images, in contrast, deal with actions and events; they show a particular moment in time or a particular event. With respect to the system of transitivity to which they can be related, Kress and van Leeuwen observe that they fulfil a function akin to – but by no means identical to – material, behavioural and up to a point, mental processes. The presence or absence of a setting also serves to categorize these images: the setting is
not necessary in conceptual images, but it is obligatory in presentational images. According to Kress and van Leeuwen (1990), a picture of an action or interaction without a setting acquires a conceptual aspect and the more defined the setting is in conceptual images, the more the picture will blend the conceptual and the presentational. Barthes (1977, p. 39), who held the view that images are polysemous and therefore dependent on the verbal text, identified two basic image-text relations: elaboration and relay. In elaboration, the verbal text restates the meanings of the image or vice versa; in this case, the same meanings are communicated by the verbal code and the visual code. In relay, the verbal text extends the meanings of the image or vice versa, but in either case word and image are in a complementary relation and new meanings are added to complete the message. In order to describe the text-image relations of the two versions of the same story, the analysis draws simultaneously on the above-mentioned theoretical categories: first, images are identified as either conceptual or presentational as proposed by Kress and van Leeuwen (1990) and second, the relation between written text and visual text is described as one of relay or elaboration as proposed by Barthes (1977).
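The first step of the analysis, deciding whether an image is conceptual or presentational, can be illustrated with a minimal Python sketch. It is not part of Kress and van Leeuwen’s (1990) account: the class `Picture`, its two yes/no fields and the function `image_type` are hypothetical names, and reducing the question of how defined a setting is to a simple yes/no is a simplifying assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    """Hypothetical hand coding of one picture (names are illustrative)."""
    shows_action: bool  # does the picture show an action or event?
    has_setting: bool   # is a setting drawn around the participants?


def image_type(picture: Picture) -> str:
    """Label a picture using the setting criterion summarized above."""
    if picture.shows_action:
        # Presentational images require a setting; an action shown
        # without one is described as acquiring a conceptual aspect.
        if picture.has_setting:
            return "presentational"
        return "presentational with a conceptual aspect"
    # Conceptual images do not need a setting; a defined setting is
    # described as blending the conceptual and the presentational.
    if picture.has_setting:
        return "conceptual blended with presentational"
    return "conceptual"


# Illustrative codings (assumed readings, not data from the chapter).
portrait_without_background = Picture(shows_action=False, has_setting=False)
chase_through_the_wood = Picture(shows_action=True, has_setting=True)

print(image_type(portrait_without_background))
print(image_type(chase_through_the_wood))
```

A graded measure of how defined the setting is would be closer to the description above; the two-way coding is kept here only to make the four possible combinations easy to see.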
8.3 Analysis of relations between ideational meanings in language and in visual representation
This section presents a detailed analysis of how picture and language simultaneously and interactively construct the meanings of the story in the two different versions. Eggins (1994, p. 310) suggests that a useful step in comparative text analysis is to problematize the texts by asking just what is interesting about them. Before I undertook the comparative analysis, I knew that I could expect differences at the level of language between the two versions because they are intended for different readers: one is a graded reader directed at children who are beginning to learn EFL/L2; the other version is non-abridged and can therefore be suitable for children whose mother tongue is English or for EFL/L2 learners, if the version is judged by the teacher to be within their linguistic competence. However, I found that these two versions had something in common: the presence of richly colourful illustrations enhancing the content of the story. It is out of this similarity that my dimension of interest arose: I wanted to know how and to what extent the visual images enhanced the meanings of the verbal clauses about the characters, about the setting of the story and about the actions in which the main animal characters were involved. I was led by the conviction that this type of analysis offers EFL/L2 teachers a new perspective to assess picture books by giving systematic attention
to the word-image connections and that it can also inform the design of a pedagogy that promotes a multimodal processing of the stories EFL/L2 learners read.
In the comparative analysis that follows, the simplified version will be referred to as Text A and the non-abridged version as Text B. The relations to be considered in both texts are those between the clauses of the verbal text and the images that appear on the same page because it is assumed, given the design of the two stories, that the young readers will perceive them together. In order to have satisfactory baseline data for comparison, I found it necessary to examine both narrative texts in respect of how their ideational meanings were organized functionally in stages. For this purpose, following Rothery and Stenglin (1997, pp. 244–5), I identified the generic structure of the two stories and labelled each section as Orientation, Complicating action and Resolution. Then I explored how the meanings in each section were constructed by the verbal and visual systems of transitivity.
8.3.1 Orientation
In the orientation stage of Text A, both verbal and visual text open the story introducing the character and the setting. The existential clause Here is Red Hen has its visual equivalent in the conceptual image which depicts the little hen. However, while the setting is referred to verbally by means of the deictic ‘here’, the images foreground the physical location of the story by showing what it looks like. This can be seen in Illustration 8.1.
Illustration 8.1 The orientation stage in the simplified and non-abridged versions
There is a relation of relay in which the visual text communicates more meanings than the relational clause of the written text. The same relation
of relay is perceived in the next picture frame: though the relational circumstantial clause Red Hen’s home is in a tree has its visual referent in the conceptual image showing the hen’s house, the illustrations depict a larger rural setting. Consequently, the pictures contribute more meanings to the orientation than the words themselves. In the orientation stage of Text B, the pictures also provide visual support but not all the clauses have visual representation. For example, the relational clause She kept her little house neat and tidy has no visual representation because the attributes neat and tidy are not shown in the conceptual image of the house which only shows its exterior. Furthermore, none of the material processes in the clause She did all her own washing, cooking and cleaning have visual matching because the images do not show the little hen involved in these household duties. Only the following clause has a visual representation: Every day she went out with her little basket to pick up sticks for her fire. The image which matches this clause depicts the hen holding a basket as if she were about to pick up sticks. This can be seen in Illustration 8.1. Consequently, the dominant relation between text and image is that of relay, though different from the one identified in the simplified version, Text A. In the non-abridged version, Text B, the relation occurs in the opposite direction: the verbal text communicates more meanings than the visual text. The same kind of text-image matching or lack of matching also occurs with the other characters of this tale. For example, in the orientation of Text A, the Sly Fox is introduced by means of two relational clauses: This is Sly Fox and He is hungry. Both clauses have visual representation as the image shows the fox sitting on the grass with his tongue out which makes him look hungry. 
In Text B, the orientation introducing the fox is construed by a variety of clauses, some of which have visual matching and some of which do not. For example, the clause He lived with his mother in a den underground is visually depicted. However, the mental clause with the fox as Senser He wanted to catch the little red hen is not represented visually. In this case, there is a relation of relay: the written text communicates more meanings than the visual text.
8.3.2 Complicating action
In the complicating action stage of Text A, every clause with Sly Fox or Red Hen as an Actor of a material process has its visual match and every event realized by the clauses of the written text (Sly Fox goes to sleep; Red Hen gets out of the bag; She puts the stones in the bag) is represented pictorially on a separate page. Thus, the sequencing of events is signalled in the picture frames. By contrast, the events of the same stage in Text B are presented on only one page and the sequencing is signalled by language: temporal conjunctive elements, such as before long, after a while, as soon as, then. Many of
the material clauses in Text B do not have visual representation (She collected some large stones and put them into the bag). Furthermore, the clauses indicating a sequence of two related events (Sly Fox sat down to rest and began to doze) have no matching images as the picture only illustrates the meaning of the relational clause (The fox was asleep).
8.3.3 Resolution
In the resolution stage of Text A, the end of Sly Fox is indicated by means of the clause That is the end of Sly Fox and the onomatopoeic word Splash. But it is indeed the picture that dramatically shows how Sly Fox gets killed by the boiling water. In Text B, it is the verbal text that explains how the two foxes die: The boiling water splashed all over the two foxes and killed them both. Though the image that goes with the verbal text provides visual support to this scene, it is less dramatic than the image in Text A because it does not show the foxes being killed by the hot water; it only suggests how they died.
8.3.4 Summary
The analysis in this section has explained how the patterning of the language and that of the illustrations interact to construct the meanings of the simplified and non-abridged versions of the traditional tale The Sly Fox and the Red Hen. It has demonstrated that the relation of elaboration, consisting of verbal and visual semantic correspondences between participants, processes and circumstances, is more frequent in the simplified story. It has also shown that the relation of relay occurs in both versions, though in a different way: in the simplified story, Text A, the picture extends the meanings of the verbal text and in the non-abridged edition, Text B, the verbal text extends the meanings of the visual text. These findings are important because they serve to demonstrate that, apart from linguistic differences, there is another level at which the two versions are not alike: the level of text-image interaction. If EFL/L2 teachers are aware of the role played by the illustrations that accompany the written story, they will be better prepared to encourage EFL/L2 learners to examine how the visual resources confirm, explain and/or elaborate what the written text asserts. In turn, this multimodal strategy may simultaneously enhance the process of EFL/L2-reading and -learning.
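The distinction between elaboration and the two directions of relay summarized above can be made concrete with a minimal Python sketch. It assumes, purely for illustration, that the meanings carried by the clauses and by the picture on a page have already been coded by hand as sets of short labels. The function name `relation`, the labels and the three sample pages are hypothetical codings loosely based on examples discussed in this section; they are not counts or categories taken from the analysis itself.

```python
from collections import Counter
from typing import Set


def relation(verbal: Set[str], visual: Set[str]) -> str:
    """Name the text-image relation for one page of hand-coded meanings."""
    only_verbal = verbal - visual  # meanings carried by the words alone
    only_visual = visual - verbal  # meanings carried by the picture alone
    if not only_verbal and not only_visual:
        return "elaboration"  # the same meanings appear in both codes
    if only_visual and not only_verbal:
        return "relay (picture extends text)"
    if only_verbal and not only_visual:
        return "relay (text extends picture)"
    return "relay (both directions)"


# Hypothetical codings of three pages (illustrative labels only).
pages = {
    "A: Here is Red Hen": (
        {"red hen", "location"},
        {"red hen", "location", "what the location looks like"},
    ),
    "A: Sly Fox goes to sleep": (
        {"sly fox", "goes to sleep"},
        {"sly fox", "goes to sleep"},
    ),
    "B: He wanted to catch the little red hen": (
        {"fox", "den underground", "wants to catch hen"},
        {"fox", "den underground"},
    ),
}

results = {
    page: relation(verbal, visual)
    for page, (verbal, visual) in pages.items()
}
tally = Counter(results.values())

for page, label in results.items():
    print(f"{page}: {label}")
print(tally)
```

Coding meanings as sets keeps the direction of relay explicit: whichever code holds meanings the other lacks is the one doing the extending. Deciding what counts as a shared meaning remains an interpretive judgement that the sketch does not attempt to automate.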
8.4 Discussion
In this section, I provide evidence that shows further differences between the two versions and discuss how these differences may affect the processing of the story by EFL/L2 readers. It is necessary to acknowledge that the way language and image interact in a multimodal text may affect the way the text is processed by the intended reader.
Chatman (1978, p. 101) observes that with the verbal text readers have their own individual vision when they recreate existents and space in their imagination by transforming words into mental projections. In contrast, visual readers have a standard vision when characters and setting are actually shown in pictures. In both versions of The Sly Fox and the Red Hen, the pictures are the source of the standard vision, but in Text B, the non-abridged story, more mental representations have to be construed from the words than from the visual images given that not all the clauses have visual matching. The information that Text A provides for the young reader to build up location and character construct (Emmott, 1994, p. 157) is different from that of Text B. In Text A, there are no linguistic elements indicating the location of the story; in fact, it is only the visual text that provides this information at every stage. However, the way children will mentally recreate the setting of the story will depend upon their knowledge of the world: hens are usually found on farms; therefore, the picture to which the orientational word here refers may be considered to represent a farm, even if what the picture shows is an area of grass that does not look like a rural setting. In this sense, Toolan (1988, p. 103) notes:
We can even cope quite happily with stories about hens and foxes where the word ‘farm’ never appears: unless we learn to the contrary, we simply assume a stereotypical rural background.
In Text B, the non-abridged story, the reader has more clues for constructing location: the circumstantial elements in a little house in the woods, in a den underground and through the wood together with the pictures provide information about location and shift of location. There are also differences with respect to the textual indicators that Text A and Text B provide for the reader to build up information for the two animal characters.
Whatever the textual clues, this is a story where the characters’ traits need to be reconstructed from their actions: the characters demonstrate their ‘cleverness’ by taking steps in order to fool each other. In Text A, except for the presence of ‘sly’ in the title, the characters are not assigned attributes; these have to be inferred from the characters’ actions which are simultaneously communicated by language and images. The images succeed in presenting Red Hen in the role of a victim: she can be seen falling straight into the Fox’s bag because at this stage the Fox proves to be the cleverest. The reversal of the situation is also pictorially depicted: Red Hen is shown putting stones into the bag after she has managed to get out of it. For this reason, in Text A, the construction of characterization requires a certain degree of inference-making because the written message has no clauses describing the characters’ traits; the reader simply witnesses their actions, in language and image, and has to reconstruct a mental image of such traits from mere verbal representations of actions.
By contrast, in Text B, apart from the images that also show how they fool each other, there are direct textual indicators of character: the reader learns from the written text that Red Hen is too clever for Sly Fox, and that Sly Fox is capable of thinking up a very cunning plan, which again suggests that he is clever. Emmott (1994) notices that in fictional narrative the characters’ pro-forms cannot be classed as exophoric since they refer to the fictional world, not to the real world and, because of this, the reader cannot physically look around him or herself for the referent. In the two versions under analysis, whenever the characters are referred to, either lexically or pronominally, they can be identified in the pictures accompanying the verbal text. Thus, the reader does not have to do much cognitive work to recover the referents; they are present in the visual images. But as far as the language realizations are concerned, there are differences with respect to the way the system of reference is used in each version. It is not surprising that in Text A, the simplified version intended for learners with limited competence in English, the characters are most of the time referred to lexically, while pronominal reference is significantly less frequent. By contrast, in Text B pronominal reference is as frequent as lexical reference. These differences can be seen in Figure 8.1.
Figure 8.1 System of reference in two versions of The Sly Fox and the Red Hen (bar chart comparing lexical and pronominal reference in Text A and Text B; vertical axis 0–40; bar labels 35, 35, 29 and 8)
In the simplified version, which can be considered as a bald kind of narrative, the story is presented as an account of purely temporally ordered events, without any elaboration. For this reason, the information that is not explicit, such as characters’ attributes, temporal and consequential relations must be inferred from the written text. However, this process may be aided by the pictures because there is text-image matching at every stage of the story. For example, from the picture of Sly Fox looking hungry and at the same time holding a bag, the young EFL reader may infer that he wants to eat Red Hen; in fact, the word eat appears much later in the story. Other logical relations need to be inferred: Red Hen is dizzy because Sly Fox runs round and round on purpose to make her fall into his bag, or it is because Sly Fox is asleep that Red Hen decides to get out of the bag and to put stones in her place. The identification of these and other relations is essential if the central point of this fable is to be grasped. The characters make things happen because they have goals and they expect their actions to have effects and results.
So, it may be concluded that the simplicity of Text A is deceiving; even if it contains language carefully selected for the initial stages of EFL/L2-reading, it requires more inference work than Text B. My point is that simplified versions, on the one hand, contain language that makes the story accessible to EFL/L2 learners with limited knowledge of the English language system but, on the other hand, they deprive the learners of the linguistic resources they need to be able to understand how the events in the narratives are related, why some events are caused by other events or why some characters are different from others. EFL/L2 learners, irrespective of their age, will build their knowledge of the new language upon what they know about language as native speakers. Rutherford (1987, p. 7) stresses this point when he observes that ‘the learner does not embark upon his EFL/L2-learning experience as a tabula rasa or in total ignorance of everything concerning language and what we use it for’. The ability to perceive events in terms of temporal and causal connections is precisely part of their prior knowledge. For example, Martin (1983) reports that children (six- to seven-year-olds) tend to use additive relations while older children (10- to 11-year-olds) tend to make explicit use of temporal and consequential relations, apart from additive ones.
Crombie (1985, p. 83) notes that [A]lthough there may be differences between one language and another in terms of which relational values are distinctively encoded, it would appear to be the case that except for a few peripheral relational distinctions which may be specific to a particular language or a particular group of languages, relational values are conceptual universals. We may argue that, if the learners already master sequential and consequential relations from L1, they will only have to learn the target words to express relations that are already familiar to them and that this will reduce the cognitive difficulty of learning new target words. The simplicity of abridged picture books written for EFL/L2 learners may also be deceiving with respect to the pictures that illustrate them. Astorga (1999) found a direct relationship between the type of processes that predominated in the written text and the type of text-image relation. In simplified stories verbal processes are more frequent, as in the Ugly Duckling
(Ullstein, 1987), and the stories contain fewer relations of elaboration, since there is no matching between such processes and the images. This means that the meaning of those clauses is not made comprehensible by the pictures. In contrast, in non-abridged stories material processes predominate, and they were found to contain more relations of elaboration. This has the result that the meanings of the nominal groups that name the characters, of the verbal groups that indicate the events and actions undertaken by the characters, as well as of the adverbial groups that locate the story in time and place, are made clear by the use of both conceptual and presentational images. One of the main implications of these findings is that EFL/L2 teachers do not have to discard non-abridged picture books as being unsuitable for beginning EFL/L2 readers; at least, not without having first explored the functions of the visual images that illustrate the verbal meanings of the story. For the same reason, the suitability of simplified stories cannot be determined without a prior analysis of the visual resources that accompany the written text.
8.5 Pedagogic implications
This final section develops a pedagogic proposal that deals with the text-image relation with the aim of enhancing both the process of EFL/L2-reading and -learning by young learners. We need to acknowledge that picture books, in addition to providing children with an aesthetic and multisensory experience, have an important role in assisting children’s EFL/L2 development. Research by Walsh (2000) has provided evidence that illustrations help EFL/L2 learners to comprehend the meanings of the stories they read. It has also shown that the difficulties that EFL/L2 children experience are associated with the written text and not with the visual text. Therefore, it seems pedagogically sound to propose an approach that exploits the fascination that children feel for illustrations and, at the same time, attempts to extend this interest to the verbal elements of the stories. Pictures can be used to bring language to children’s conscious attention, specifically to features that have to do with the macro- as well as microstructures of the narrative texts. What I am suggesting is that, as part of the process of teaching EFL/L2-reading, visual images can be exploited to promote genre-oriented noticing (Astorga, 2007, p. 255) which requires the use of both top-down noticing and bottom-up noticing. Top-down noticing may call the children’s attention to the overall meanings conveyed at each stage (orientation, complication, resolution) of the schematic structure of narratives. In contrast, bottom-up noticing may orient children’s attention to the lexicogrammatical and discourse features that realize the meanings of the stories.
At the macro-level, EFL/L2 learners can be made aware of the generic structure of a story by comparing how each stage is realized visually and verbally. For example, they may be asked to compare a picture that illustrates the orientation stage in order to determine whether the verbal and visual texts introduce the same information about the characters and the setting or, else, whether the verbal text tells more than the visual text or the other way round. Similar procedures may be used with the other stages of a story. For the micro-level aspects of a story, the children’s attention may be drawn to units larger than the word. The noticing technique of underlining may be useful for this purpose. For instance, the teacher may underline the verbal groups that realize the different processes (material, verbal, relational, etc.) and ask children to find out whether the pictures show the same actions, thoughts, behaviours and relations. Nominal groups which name the human and non-human participants may also be underlined for children to see whether there are semantic correspondences between these groups and the narrative images. The main advantage of this kind of multimodal reading practice resides in the fact that, early on, young EFL/L2 readers may be initiated in interpretative textual practices and be trained to use the bottom-up reading strategy, known as segmentation, through the identification of meaningful units of language, such as the groups that make up the clauses. Most importantly, through this approach, children can consider the meaning and function of these groups in the context of the narrative texts. Wray and Medwell (1991, p. 96) claim that ‘efficient reading demands a great deal of knowledge of and familiarity with syntax or sentence structure of the written language’. 
In my opinion, encouraging young EFL/L2 learners to work with the groups as well as with the clauses of the written stories in conjunction with the visual images is an effective way of scaffolding practice in EFL/L2-reading and of giving learners the opportunity to explore how the forms of the target language make the meanings of the stories they read. Holderness (1991, p. 23) observes that cognitive challenge is one of the factors that make a task attractive and interesting to children. Open-ended activities are among the most challenging tasks for children because they lead to problem-solving and investigation. The task of finding similarities and differences between the macro- and micro-aspects of verbal and visual texts has a problem-solving component: the learners have to go through a process of analysis, description and comparison in order to find out what meanings words and images share or do not share. This kind of work is suitable for young learners, whatever their level of EFL/L2 proficiency, because they can learn the EFL/L2 by investigation which, in turn, is motivated and promoted by their natural interest in illustrations.
8.6 Conclusion
This chapter has presented a multimodal approach to the reading of picture books that draws on the grammar of visual images and has made explicit, through the analysis of two different versions of the same story, how it is possible to uncover relations of meanings between the verbal and the visual modes by comparing conceptual and presentational images to the clauses of the written texts. Kress and van Leeuwen (1996, p. 12) observe that educationalists everywhere have become aware of the increasing role of visual communication in learning materials of various kinds, and they are asking themselves what kinds of maps, charts, diagrams, pictures and forms of layout will be most effective for learning. To answer this question, they need a language for speaking about the forms and meanings of these visual learning materials. It is my contention that the ability to use language to speak about the forms and meanings of visual instructional materials is equally important for EFL/L2 teachers, especially when they need to be able to scaffold the process of EFL/L2-reading. Undoubtedly, EFL/L2 teachers need to develop their own pedagogy for engaging the young readers in an exploration of how image and language simultaneously encode the meanings of the stories they read. The value implicit in this multimodal approach to picture book evaluation lies in the process of redescription that all educators have to carry forward. In this respect, the remark made by Meek in relation to the process of literacy seems equally relevant to the field of second language teaching: ‘We need to redescribe both children reading and children’s books and give ourselves lessons from both’ (Meek, 1992, p. 178).
References
Astorga, M. C. (1999) ‘The text-image interaction and second language learning.’ The Australian Journal of Language and Literacy, 22(3): pp. 212–33.
—— (2007) ‘Teaching academic writing in the EFL context: Redesigning pedagogy.’ Pedagogies: An International Journal, 2(4): pp. 251–67.
Barthes, R. (1977) Image-Music-Text (London: Fontana).
Chatman, S. (1978) Story and Discourse (Ithaca: Cornell University Press).
Crombie, W. (1985) Discourse and Language Teaching: A Relational Approach to Syllabus Design (Oxford: Oxford University Press).
Eggins, S. (1994) An Introduction to Systemic Functional Linguistics (London: Pinter).
Emmott, C. (1994) ‘Frames of reference: Contextual monitoring and the interpretation of narrative discourse’, in M. Coulthard (ed.) Advances in Written Text Analysis (London: Routledge), pp. 157–66.
Halliday, M. A. K. (1985) An Introduction to Functional Grammar, 2nd edn (London: Edward Arnold).
Holderness, J. (1991) ‘Activity-based teaching: Approaches to topic-centred work’, in C. Brumfit, J. Moon and R. Tongue (eds) Teaching English to Children (London: Collins), pp. 18–32.
Knowles, M. and K. Malmkjaer (1996) Language and Control in Children’s Literature (London: Routledge).
Kress, G. and T. van Leeuwen (1990) Reading Images (Geelong, VIC.: Deakin University Press).
—— (1996) Reading Images: The Grammar of Visual Design (London: Routledge).
Martin, J. R. (1983) ‘The development of register’, in J. Fine and R. Freedle (eds) Developmental Issues in Discourse (Norwood, NJ: Ablex), pp. 1–40.
Meek, M. (1992) ‘Children reading – now’, in M. Styles, E. Bearne and V. Watson (eds) After Alice (London: Cassell), pp. 160–89.
Rothery, J. and M. Stenglin (1997) ‘Entertaining and instructing: Exploring experience through story’, in F. Christie and J. R. Martin (eds) Genres and Institutions: Social Processes in the Workplace and School (London: Cassell), pp. 231–63.
Rutherford, W. (1987) Second Language Grammar: Learning and Teaching (London: Longman).
Toolan, M. (1988) Narrative: A Critical Linguistic Introduction (London: Routledge).
Walsh, M. (2000) ‘Text-related variables in narrative picture-books: Children’s responses to visual and verbal texts.’ The Australian Journal of Language and Literacy, 23(2): pp. 139–56.
Wray, D. and J. Medwell (1991) Literacy and Language in the Primary Years (London: Routledge).
Unsworth, L. (2001) Teaching Multiliteracies across the Curriculum: Changing Contexts of Text and Image in Classroom Practice (Philadelphia: Open University Press).
Children’s books referred to in this study
Ullstein, S. (1987) The Sly Fox and Red Hen (Ladybird Graded Readers) (Loughborough: Ladybird Books).
Ullstein, S. (1987) The Ugly Duckling (Ladybird Graded Readers) (Loughborough: Ladybird Books).
Part III Text and Visual Interaction in Advertising and Marketing
9 Sequential Visual Discourse Frames Kay L. O’Halloran and Victor Lim Fei
9.1 Introduction
The cover of the weekly American news magazine Time (8 May 2006) features ‘The lives and ideas of the world’s most influential people’ with a photograph montage of the people included on the 2006 Time list. Preceding the cover story ‘People Who Shape Our World’, Time Asia contains 1–2 full-page colour advertisements for Rolex, Shell, DHL, Breitling, Mercedes-Benz, Toshiba, Lufthansa, Sony Ericsson, Lenovo, Longines and Rado products and services, in addition to some news articles. The cover story (72 pages) contains further advertisements for Bayers Healthcare, Asian Games, Thai Airways, Toyota Formula 1 and Cartier which unfold generically until something unusual happens. The Cartier advertisement (Frame 1, Illustration 9.1) becomes a two-page visual spread (Frames 2–3, Illustration 9.1) that unfolds at the centre to reveal the four-page article Time 100 ‘Power Couples’ (Centrefold, Illustration 9.1). Upon folding back these two pages (Frames 2–3) and turning the page, the Cartier advertisement contains one further page (Frame 4, Illustration 9.1). This chapter adopts a socio-semiotic perspective (e.g. Halliday, 1978, 2004; Kress and van Leeuwen, 1996; van Leeuwen, 2005) to analyse the sequential multisemiotic discourse of the Cartier advertisement which unfolds in the context of the Time magazine’s 2006 list of the world’s 100 most influential people. The findings are related to identity (e.g. Bauman, 2004; Iedema and Caldas-Coulthard, 2008) and the functions of print media in the age of software-based modernity which has fundamentally changed ‘all aspects of the human condition’ (Bauman, 2000). What is the place of print media today, and how is corporate identity maintained in a globalized consumer market known for its fluidity, transience and change (Bauman, 2000)?
The Cartier advertisement and Time magazine’s special issues of the 100 most influential people are investigated for this purpose from the perspective of Michael Halliday’s systemic functional social semiotic theory.
Illustration 9.1 Cartier advertisement and centrefold (Time Asia, 8 May 2006). Panels: Frame 1, Frames 2–3, Frame 4 and Centrefold.
9.2 Systemic functional social semiotic theory
Michael Halliday’s (e.g. 1978, 2004; Halliday and Matthiessen, 1999; Martin, 1992) systemic functional social semiotic theory provides an approach to modelling, analysing and interpreting multimodal phenomena, known as systemic functional-multimodal discourse analysis (SF-MDA) (Djonov, 2005; O’Halloran, 2007, 2008a). In this approach, multimodal phenomena are conceptualized as choices from semiotic resources (e.g. language, images, music, mathematical symbolism, gesture and movement) which integrate across visual, auditory and somatic (haptic, gustatory and olfactory) modes to construct meaning in the context of their instantiation. Halliday (2004) developed SF-theory in relation to language, which he describes as having abstract grammatical systems which realize four metafunctions: (a) experiential meaning: to construct experience of the world; (b) logical meaning: to make logical connections in that world; (c) interpersonal meaning: to enact social relations; and (d) textual meaning: to organize the semiotic choices which unfold. Halliday (2004) contains a comprehensive description of the grammatical systems of the English language through
which the four metafunctions are realized. Halliday views language as ‘one of a number of [social semiotic] systems of meaning that, taken all together, constitute human culture’ (Halliday, 1985, p. 4). Halliday’s SF-theory has been extended to other semiotic resources, including paintings and other forms of displayed art (O’Toole, 1994), visual design (Kress and van Leeuwen, 1996), mathematical symbolism and images (O’Halloran, 2005), action and gesture (Martinec, 2000, 2004) and music and sound (van Leeuwen, 1999). Furthermore, Halliday’s SF-theory provides a platform for theorizing and analysing how semiotic choices combine to create meaning in multimodal phenomena (e.g. van Leeuwen, 1985, 2005; Kress and van Leeuwen, 1996, 2001; Ventola, Charles and Kaltenbacher, 2004; O’Halloran, 2005; Baldry and Thibault, 2006; Royce and Bowcher, 2006; Unsworth, 2006). The SF-MDA framework in Table 9.1, based on Halliday’s (2004) SF-theory for language and O’Toole’s (1994) SF-model for paintings, is used to analyse the Cartier advertisement displayed in Illustration 9.1. In the SF-MDA framework, choices from metafunctionally organized systems function ‘intersemiotically’ within and across the context, content and expression planes, giving rise to contextualizing relations and semantic expansions. That is, semiotic choices in the constituent ranks for language (word, word group, clause, clause complex and discourse) integrate with image choices (part, figure, episode, work and inter-visual relations), resulting in co-contextualized (similar) and/or re-contextualized (new) semantic fields. In addition, semiotic choices from systems on the expression plane (e.g. colour, font style and paper quality) result in the materiality of the text with its associated semantic field (e.g. glossy paper versus recycled paper), which is contextualized in relation to linguistic and visual choices made on the content plane.
These configurations of semiotic choices integrate within and across items and mini-genres (e.g. photos, written text and logos) to realize the register of the advertisement in terms of tenor (the social relations), field (the content) and mode (visual, aural and somatic). Print advertisements unfold as a genre with associated views and ideologies about the world. While semiosis in single-image texts has been theorized (e.g. O’Toole, 1994; Kress and van Leeuwen, 1996), there have been fewer investigations into the meanings arising from sequences of images (e.g. Lim, 2004, 2005; O’Halloran, 2005; Painter, 2007; Moya and Pinar, 2008). Therefore, with the larger objective of understanding the ways in which re-contextualizing and co-contextualizing relations take place within and across multimodal phenomena, it is useful to explore the mechanisms and processes through which meaning expansions occur across sequential images. Sequential image frames can be understood in terms of the concept of the ‘emergent narrative’ realized through the integration of linguistic choices (Halliday, 2004; Martin, 1992, 2008; Martin and Rose, 2003) and
Table 9.1 SF-MDA framework for print advertisements (based on O’Halloran, 2008a)
Ideology
Context: Genre; Register
Items and mini-genres
Content (Language | Intersemiosis | Images)
Discourse semantics: Discourse systems | Intersemiosis | Inter-visual relations
Grammar: Clause complex, Clause, Word group, Word | Intersemiosis | Work, Episode, Figure, Part
Expression: Graphology, typography and graphics
image choices (O’Toole, 1994), including the inter-visual relations for image sequences on the discourse semantics stratum (see Table 9.1). Lim (2006, pp. 195–213) proposes the following discourse systems for inter-visual relations: (a) experiential meaning: VISUAL TAXONOMY and Associating Elements; (b) logical meaning: VISUAL TAXIS and Transition Relations; (c) textual meaning: VISUAL REFERENCE and Visual Linking Devices; and (d) interpersonal meaning: VISUAL CONFIGURATION and Flow. Lim’s (2006) systems are based on Martin’s (1992, 2008; Martin and Rose, 2003) discourse systems for language which informed O’Halloran’s (2005, pp. 133–5) discourse systems for mathematical images. Lim’s (2006, pp. 195–213) discourse systems are briefly described below. Gestalt theory explains that viewers have an overall perception of forms and objects and that when their parts become the focus, they are perceived in relation to the whole (O’Toole, 1994, p. 23). Therefore, Associating Elements constitute the pictorial part–part and part–whole relations to account for the experiential meaning arising from the actions and settings across different frames. Transition Relations, adapted from McCloud (1993, p. 74), realize the logical relations between frames. The types of Transition Relations are Moment–Moment, Action–Action, Subject–Subject,
Scene–Scene, Aspect–Aspect and Non-Sequitur. Visual Linking Devices function textually to provide coherence and cohesion to the sequential images, and the recurrence of such choices adds to the overall cohesion of the image sequence. In their role of providing continuity between frames, the Visual Linking Devices are analogous to the concept of motifs in language. Lastly, Flow is the level of the reader’s interpersonal engagement necessary to comprehend the emergent narrative arising from the image sequence. A strong Flow demands less involvement on the part of the reader to make sense of the narrative, and vice versa. System choices for Associating Elements, Transition Relations and Visual Linking Devices contribute to the strength of Flow across the frames. The SF-MDA framework and Lim’s (2006) discourse systems for inter-visual relations are used for the multisemiotic analysis of the Cartier advertisement. The ideologies arising from the semiotic choices are interpreted within the context of Time magazine and the larger socio-cultural context of Western culture. The analysis reveals that the primary aim of the advertisement is to reinforce the brand identity of Cartier, with the ultimate aim of selling its products. A secondary aim is to lead the intended reader to visit the Cartier website for further details and retail information. The multisemiotic analysis highlights the strategies through which these aims are achieved, and an investigation of the Cartier and Time websites contributes to our understanding of the functions and affordances of print media in the age of interactive digital media. This chapter therefore hopes to demonstrate how meanings are made in the immediate context of their instantiation in print media, and how these meanings function intertextually with digital media sites in the globalized market world of today.
9.3 The Cartier advertisement (Time Asia, 8 May 2006)
The Cartier advertisement text spans Frames 1–4 displayed in Illustration 9.1. The intended reading path is sequential from Frame 1 (the page before the centrefold), to Frames 2–3 (the centrefold) to Frame 4 (the page after the centrefold). Frames 2–3 can be opened up to view the ‘Time 100 Power Couples’ article displayed in Illustration 9.1.
9.3.1 Haptic mode and the semiotics of action
While the dominant mode of the Cartier advertisement is visual, the text also operates as a communicative artefact through the haptic (tactile) mode on the expression plane. The paper on which the text is printed, as well as the physical form which the text takes, is ideationally, interpersonally and textually meaningful. In this case, the texture of the Cartier advertisement is thicker and glossier compared to other pages in
Time magazine. In addition, the design of the centrefold which requires the reader to manually ‘open up’ the advertisement to read the ‘Power Couple’ article is a marked option as it departs from the usual practice of just turning over the pages in the magazine. It engages the reader interpersonally as it requires the reader to perform a different action of opening up the pages instead of turning them over. This departure draws attention to the Cartier advertisement, as the reader has to literally stop to open up the pages. The main aim of advertisements is to attract and retain (momentarily, at least) the reader’s attention, and thus the material choices for the haptic mode and the semiotics of action (Martinec, 2000, 2004) are marked in the Cartier advertisement through its unconventional selections. The reader’s attention is inevitably drawn to the Cartier advertisement if they are flipping through or reading Time magazine.
9.3.2 The visual mode
The system choices for the haptic mode and the semiotics of action function together with system choices in the visual mode to engage the reader. In what follows, the semiotic choices for language and image are analysed to investigate the emerging narrative which unfolds across Frames 1–4, and the ideologies underlying the advertisement.
Frame 1
The preferential point of entry, or the Centre of Visual Impact (CVI) (Bohle, 1990), in Frame 1 of the Cartier advertisement is the linguistic text ‘LOVE’, which appears in capitalized letters and white font set against a red background. The contrast provided by choices from the system of Colour functions to engage the reader interpersonally. The system of Chiaroscuro, the application of light and shadows, also works powerfully to create an engaging background, which helps to draw the reader’s attention towards the linguistic text in the frame.
Therefore, the system of Colour is deployed effectively by manipulating the choices available in the sub-systems of Hue, Tone and Saturation. Colour is somewhat different to other grammatical and discourse systems, because it functions as an open system with the potential to realize more than one metafunction. O’Toole (2005, p. 88), for example, argues that ‘choices in the Representational function like Action/Scene/Portrayal are systemic ... [but] Clarity, Light and Color in the Modal function ... are more a question of degree, or points on an almost infinite cline, not discrete options in a closed system’. Furthermore, systems such as Colour are not dedicated to a single metafunction, and so they have a ‘low system-metafunction fidelity’ (Lim, 2004, p. 223). Hence, it is possible for the system to simultaneously fulfil two or more metafunctions in a text. For instance,
Colour serves not only the interpersonal metafunction, but the ideational and textual metafunctions as well. However, there are certain environments or conditions, known as the Critical Impetus, which lead to the dominance of a particular metafunction (Lim, 2004, p. 224). The Critical Impetus of salience functions to realize interpersonal meaning in the Cartier advertisement. The salience is brought about by the contrasting shades of the red hue, the result of the play on lighting, which accentuate the linguistic text ‘LOVE’ which is located directly below the scattered muffled white lights appearing in the background. The white font colour of LOVE juxtaposed against the red background also achieves a Critical Impetus of salience. The red background is meaningful experientially as well, and this functionality of colour may be understood through the concept of Denotative Value and Connotative Value (Barthes, 1977) (see Lim, 2004, p. 233). The Denotative Value operates on the sensory perceptual level and is the literal sense of the item. For instance, the Denotative Value of the colour red literally refers to the red tone. On the other hand, the Connotative Value is contextual and ideologically determined. For instance, the Connotative Value of the colour red in the Cartier advertisement suggests romance, passion and intrigue. The Connotative Value is context-sensitive and culture-specific, distinctive to particular semiotic communities, the people who share the same understanding and agreement to a common usage of semiotic choices. The choice of Typeface in the system of Font functions to create meaning (e.g. van Leeuwen, 2006; Machin, 2007). Roman Typeface is used for the word ‘LOVE’, with the characteristic that each letter of the word is distinguished with a discrete Internal Space.
Although a wide selection in the system of Internal Space can diminish readability, the choices made in the design of the word ‘LOVE’ draw attention to each letter, especially to the stylized letters ‘O’ and ‘E’. In fact, a rather clever integration has taken place in the image of the letter ‘O’ as a close repetition of the letter ‘E’. Experientially, the solid ‘static’ Typeface for ‘LOVE’ suggests an entity, rather than the process of loving. In contrast, the Cartier logo has a trademark Script Typeface with its interpersonal appeal arising from the dynamic handwriting style reminiscent of writing-masters from earlier centuries. This Typeface gives rise to a Connotative Value of class and sophistication. Lastly, Frame 1 achieves salience through the fact that there are only two items, the word ‘LOVE’ and the Cartier logo. The simplicity and open space in Frame 1 contrasts sharply with other pages in Time magazine which are densely packed with linguistic text and multiple images. Therefore salience is created through the singularity of the words and their relatively large font size, aided by the colour contrast and the concentration of white lights in the background. The semiotic choices underscore the prominence of the
word ‘LOVE’ and its central role in this advertisement. The Cartier logo is instantaneously recognizable to many readers, with its trademark Script Typeface and red background.
Frames 2–3
The experiential meaning of the lights scattered across Frame 1 emerges into greater clarity in Frames 2–3 to reveal a landscape with tall buildings. The represented landscape changes sequentially across the two frames signifying a movement from left-to-right, which results in a spatial shift in visual temporality. The linguistic text in Frame 2 identifies the landscape as New York City. In this case, the text anchors the image (Barthes, 1977), adding definitiveness to the representation. This demonstrates how language functions to co-contextualize the image, in this case the bright lights of New York City, quite possibly Broadway, which is one of the most famous theatres in the world. ‘Along with London’s West End theater, Broadway theater is usually considered to represent the highest level of commercial theater in the English-speaking world’.1 Evidently, Cartier identifies its market brand with famous people, places and things. The CVI in Frame 2 is the word ‘LOVE’, and the linguistic text ‘In the 1970s, New York was the place where Cartier found the inspiration for its famous bracelet. Locked in place by a loved one, it symbolizes an everlasting bond.’ The text has three cases of marked thematic organization. First, the marked theme ‘In the 1970s’ creates a timeframe for the Cartier bracelet, suggesting that it has a noteworthy history. Second, the predication of ‘New York’ in ‘New York was the place’ functions to place New York City in the subject position, instead of a circumstantial adjunct in the non-predicated form ‘In the 1970s, Cartier found the inspiration for its famous bracelet in New York.’ These two choices are meaningful because they foreground the dynamic interplay between an abstract nostalgia for the past and the material physicality of the famous city.
Visually, the landscape enshrouded in haze in Frames 2–3 clears to reveal a more tangible representation of buildings, reinforcing the historical significance of the bracelet. Cartier is the agent responsible for finding the ‘inspiration for its famous bracelet’ in New York, but the use of the grammatical metaphor ‘inspiration’ permits the person who was inspired and the phenomenon which inspired him/her to design the bracelet to remain unknown and somewhat mysterious. The third instance of marked textual organization occurs in the sentence ‘Locked in place by a loved one, it symbolizes an everlasting bond’ because the dependent clause ‘Locked in place by a loved one’ precedes the primary clause ‘it symbolizes an everlasting bond’. The marked thematic organization foregrounds the fact that the bracelet is locked into place, but the voice is passive (i.e. ‘by a loved one’) with ellipsed subject ‘the bracelet’. The primary clause ‘it symbolizes an everlasting bond’ attaches the abstract value
of ‘an everlasting bond’ to the token ‘it’, which is the bracelet. Thus linguistically, ‘the bracelet’ only appears in the written text in the first sentence as a postmodifier for ‘the inspiration [for its famous bracelet]’. The linguistic backgrounding of the bracelet in Frame 2 is semiotically inverted in Frame 3 where the visual image of the dazzling gold Cartier bracelet dominates the entire page. The bracelet is dynamically depicted through the image of the key and chain flying across Frames 2–3 to lock (or unlock) the bracelet. The bracelet is positioned at an oblique angle which gives it an added sense of dynamism of its own, accentuated by the flash of light and the brilliant shiny gold surface which functions to engage the reader and draw his/her attention to the insignias engraved on the bracelet. The systems of Parallelism and Repetition connect the Cartier logo and the design of the ‘O’ in the word ‘LOVE’ to the bracelet. In addition, the salient position occupied by the word ‘LOVE’ in Frames 1–2 is replaced by the image of the Cartier bracelet in Frame 3, thus constructing the Cartier ‘LOVE bracelet’.
Frame 4
The Cartier LOVE bracelets, one gold and the other silver, completely dominate Frame 4 through Size, Position and Colour which are accentuated by the flash of light. The two bracelets are intertwined, a love match made against the bright lights of New York City. The oblique lines (O’Toole, 1994) created by the bracelets lead the reader’s gaze down, from the CVI of the bracelets to the brand name Cartier and to the website address which appears in relatively small font. This provides the crucial connection between Cartier’s advertisement and their website, inviting further engagement which moves beyond the initial reading of the printed advertisement. Unsurprisingly, the Cartier website provides information on the addresses of the Cartier outlets throughout the world, in addition to sophisticated hypermedia displays of the company and its products.
Thus, the Cartier print advertisement is effectively used to attract the potential consumer, with an intended progression to their website for information and possible purchases. In this way, the Cartier print advertisement is able to remain minimalist in terms of linguistic and visual items and components, which increases its markedness in the context of Time magazine.
9.3.3 The emergent narrative
The sequence of images in Frames 1–4 has a strong Flow in relation to the degree of the reader’s engagement and reasoning necessary to understand the emergent narrative. The logical meaning made through the system of Visual Taxis in the form of transition relations is Subject-to-Subject. The main subject in Frame 1 is the word ‘LOVE’. The reappearance of the word ‘LOVE’ in Frame 2 connects the two frames as a Visual Linking Device. The two-page presentation in Frames 2–3 provides continuity, and the Visual
Linking Devices of the bracelet and the Cartier brand name, together with the Associating Elements of the landscape, facilitate the Subject-to-Subject transition relations across frames. The experiential meaning is achieved through the deployment of Associating Elements in the visual taxonomy. The Associating Elements in Frame 1 are the shades of white lights against the red backdrop. As Associating Elements, their recurrences across Frames 2–4 lend cohesion to the sequential text. Although the white spots seem to be part of the background design, they become meaningful as Associating Elements in relation to Frames 2–3 because the red hue lightens to reveal a backdrop of city lights and buildings in Frame 3. This becomes more evident in Frame 4, when the lights and buildings become clearer, as the ‘fog’ subsides. The Associating Elements suggest an urban setting, likely to be the heart of a modern city. Co-contextualized by the linguistic text in Frame 2, the reader learns that the landscape is New York City. This is further reinforced by the representation of the Empire State Building as an Associating Element, which is discernible in Frames 3 and 4. Retrospectively, it becomes certain that the red background in Frame 1 is the landscape of New York City enshrouded in haze. There are two Episodes which are featured in Frames 2–3. Frame 2 contains the word ‘LOVE’ and the text about how Cartier found the inspiration for the bracelet in New York City in the 1970s. Frame 3 contains the visual representation of the chain and bracelet where the selections for Stance (O’Toole, 1994) suggest movement and energy. This is accentuated by the radiant burst of light through the bracelet. Therefore, the relative calm and static representations in Frames 1–2 are juxtaposed against the dynamism conveyed in Frame 3. An emergent narrative reading of this advertisement points to Frame 3 as the climax in the plot.
This is consistent with the expectation of the rise in energy and intensity leading to the climax, in this case, the unveiling of the product, the Cartier bracelet. The close analysis of the linguistic and visual texts in Frames 1–3 is fruitful for understanding the meanings made intersemiotically in the Cartier advertisement. One mechanism for conceptualizing intersemiosis between language and images is semiotic metaphor (O’Halloran, 1999, 2003). As an extension of grammatical metaphor, semiotic metaphor is the process whereby a ‘semantic reconstrual’ across different semiotic resources occurs with a shift in the functional status of an element, consequently leading to a multiplication of meaning. That is, ‘the new functional status of the element does not equate with its former status in the original semiotic or, alternatively, a new functional element is introduced in the new semiotic, which previously did not exist’ (O’Halloran, 1999, p. 348). The main message in the Cartier advertisement is contained within the semiotic metaphor whereby the abstract entity of love (in reality, a process) in Frames 1–2 is construed as a visual entity, the bracelet in Frame 3. However, the material process whereby ‘the loved one’ locks the bracelet ‘in
place’ becomes the semiotic metaphor for love. In other words, the abstract entity ‘love’ is re-construed as the physical process of locking the bracelet in place, the special characteristic of the Cartier LOVE bracelet with its key and chain. Moreover, the LOVE bracelet re-contextualizes the concept of handcuffs and other locking devices such as the chastity belt which the powerful use to restrict the actions of the less powerful through material restraint. The re-contextualization of locking devices as luxury jewellery items which signify ‘love’ (in much the same way as the wedding ring symbolizes marital commitment and fidelity) provides the underlying metaphorical foundation for the Cartier advertisement. The ultimate climax and resolution of the emergent narrative is the intertwining of the two bracelets in Frame 4 which replaces the word ‘LOVE’ in Frame 2 and the bracelet in Frame 3. The landscape becomes disambiguated due to the radiance produced by the flashes of light, and the energy and dynamism displayed by the bracelets suggest the passion and emotion surrounding the act of love. Connection between Frames 1–4 is made through the Visual Linking Device of the insignia on the bracelets, that is, the letter ‘O’ in the word ‘LOVE’. The recurrence of the ‘O’ reinforces its importance and significance as the representative insignia of the Cartier brand. The repetition functions to strengthen the impression of the brand through its emblem. The metaphorical foundation of the emergent narrative and the reinforcement of the brand identity on the Cartier LOVE bracelet present an ideology about the nature of love and its realization in the world today.
9.3.4 Ideology
The Cartier advertisement invites the interpretation that brilliance and power are not only in the materiality of the physical object itself, but really in the emotions and passion behind the transformation of love into the act of giving and locking the gift into place.
Ideologically, the realization of love (entity) in the dynamism generated by the action (process) is a subtle and clever advertising strategy to associate love with action and material goods, in this case, a luxury locking device. This expression of love creates a force so potent and a light so blinding that it can shatter the darkness and enable sight. Metaphorically represented by the energy and brilliance of the burst of light against the backdrop of one of the most famous cities in the world, it is suggested that the Cartier bracelet brings vision and power, which defies the traditional concept of locking devices as mechanisms of restraint. In this case, love becomes the act of giving luxury jewellery and restraint becomes power, an inversion which somehow mirrors the traditional marriage ceremony in Western culture. In addition, an exclusive identity for Cartier and its customers is created through a link to the people featured in Time magazine’s list of the 100 most influential people in the world for 2006 and the Time 100 ‘Power Couples’. Thus the reader is offered the chance to
150 Kay L. O’Halloran and Victor Lim Fei
join the celebrity class of the powerful and influential through the material token of the Cartier bracelet. The final message of the advertisement is the website address, www.love.cartier.com. The positioning of the website address in relatively small font at the bottom of the final frame serves as the final line in this advertisement – now that you have appreciated the association between love, the Cartier bracelet and the powerful, you are invited to visit the website for more information. This provides the platform for the translation of the reader’s abstract ideas and emotions into the tangible physical action of buying and giving. Translating the interest in the advertisement into the impetus to take action by buying the product is the ultimate motivation behind all advertisements. In conclusion, the Time magazine website for the list of the 100 most influential people and the Cartier website are briefly visited to investigate their intertextual relations with Time magazine and the Cartier LOVE bracelet advertisement. Given the extensiveness of the Time and Cartier websites, it is not possible to undertake a detailed investigation, but observations will be made with regard to the respective functionalities of the print and digital media genres.
9.4 The Time 100 series and Cartier websites
Time magazine, with American, European, Asian and South Pacific editions, publishes the Time 100 series which features their list of the 100 most influential people in the world. The winners are divided into five categories of Leaders and Revolutionaries, Builders and Titans, Artists and Entertainers, Scientists and Thinkers, and Heroes and Icons, so the list covers the major domains of human endeavour with global significance. The Time 100 list was originally conceived in 1999 as Time 100: The Most Important People of the Century to document the most influential politicians, artists, innovators, scientists and cultural icons in the twentieth century. Given the success of the twentieth-century list, Time began to publish in 2004 an annual list of the top 100 people who continue to influence the world.2 The Time 100 list generates much controversy and reasons are sought for those who are included as well as those who are excluded. Time’s Editor-at-large Michael Elliott explained in 2004 that three kinds of candidates are considered: those with public possession of power (e.g. George Bush), those with real influence but not necessarily a public presence (e.g. Ali Husaini Sistani, the Grand Ayatullah of Iraq’s Shi’ites) and those who influence through their moral example (e.g. Nelson Mandela).3 Richard Stengel, Managing Editor for the 2007 list, further explains ‘that the Time 100 was not a list of the hottest, most popular or most powerful people, but rather the most influential’.4 The person appearing the most is six-times listed Oprah Winfrey (2008, 2007, 2006, 2005, 2004 and the
Sequential Visual Discourse Frames 151
twentieth century), followed by George Bush, Hillary Clinton, Bill Gates and Hu Jintao who have been listed four times. Barack Obama was listed three times (2008, 2007 and 2006) before his inauguration as President of the United States on 20 January 2009. Time in partnership with CNN has an extensive online daily news website (www.time.com) with a link to Time magazine (www.time.com/time/magazine). The Time website contains links to archived material for the Time 100 list, making it possible to search and retrieve information.5 The latest 2008 Time 100 website6 features people on the list according to the five categories, with a link to the ‘On the Red Carpet’ page which contains a video of interviews at the 2008 Time 100 awards ceremony which appears to unfold like the Oscar awards in Hollywood. The 2007 Time 100 website does not contain the ‘On the Red Carpet’ page. However, the Fame Game7 website which ‘maps and analyses your social connections and media attention to help you promote meaningful ideas, people and organizations in culture’ contains photos of people at the Time 100 awards 2007 party.8 According to the website, the 2007 Time 100 party had ‘organizational connections’ to Cartier. In addition, Cartier was the ‘official media’ for other social events listed on the Fame Game website,9 and the Cartier logo and ‘the bracelet’ are displayed in connection with those events. Therefore, Cartier advertising materials are associated with the Time 100 list across multiple sites, connecting their products with influential people ‘whose power, talent or moral example is changing our world’.10 The organizational connections between Time 100 and Cartier are not surprising because Cartier is a French jeweller and watch manufacturer with a long history of providing luxury jewellery and watches to royalty, stars and celebrities.
The company is a subsidiary of Compagnie Financière Richemont SA, and the name of the corporation dates back to the Cartier family of jewellers who sold the company in 1962. Today, Cartier is a global company with an extensive website,11 which includes sites in Africa, Asia/Oceania, Europe, Middle East, North America and Latin America. The website ‘www.love.cartier.com’ displayed on the Cartier advertisement opens a web page with ‘HOW FAR WOULD YOU GO FOR LOVE?’ with links to their international sites across the world. Clicking on the ‘Asia/Oceania’ link results in a menu of Asian languages and English. Selecting ‘English’ leads the viewer to the Cartier website,12 which features a dynamic menu which moves across the screen left-to-right or vice versa, according to the user’s mouse click. Music accompanies the moving menu which features HOME, LOVE MUSIC, LOVE GALLERY, LOVE CHARITY, PLAYLIST US and LOVE COLLECTION. The hypermedia in the Cartier website is sophisticated beyond description in this paper. The US PLAYLIST, for example, traces an outline of New York City as the menu moves across the screen. The intertextual links to New York City are reinforced, and the black and white photo links (for black and white video clips) create a nostalgic feel for the past
in relation to the present, replicating the sentiments of the Cartier print advertisement. The Cartier LOVE COLLECTION features the LOVE bracelet, in addition to LOVE cuffs, LOVE rings, LOVE necklaces and LOVE watches. The LOVE bracelet is featured as the first dynamic display. A golden screwdriver twirls from its base and majestically floats across the screen to the locking mechanism on the LOVE bracelet while a swirling line moves across the screen through the screwdriver and bracelet. The line moves down towards the bottom of the screen to trace the outline of a woman and a man lying on their backs, head to head, joined together with locked hands. The LOVE bracelet floats down to the woman’s wrist and simultaneously a second bracelet appears on the man’s wrist. The dynamics of the animation are sleek, innovative and interpersonally engaging as the lines unfold to create a captivating multimodal narrative of the locking of the LOVE bracelets on the couple’s wrists. The Cartier website recreates the emerging narrative in the Cartier advertisement in Time Asia on 8 May 2006 in ways which extend beyond the semantics of print media which is limited to the haptic and visual modes. The creative dynamic integration of music, graphics and visual images on the Cartier website creates a multimodal experience which utilizes the visual, auditory and somatic (action) modes to create multiple complex emergent narratives which are not constrained by the meaning potential of static frames. Indeed, the simplicity of the Cartier print advertisement can be contrasted to the complexity of the Cartier hypermedia genres with their music clips, photographs, drawings, videos and ‘remediated’ (Bolter and Grusin, 2000) print texts. Furthermore, the interactive digital catalogue contains an expanded version of the text in the Cartier print advertisement. 
In the hypertext catalogue, ‘the ultimate symbol of loving commitment’ is the locking device of the LOVE bracelet which becomes ‘a rallying cry for modern love, totally free from convention’, an ironical play on the notions of freedom and restraint.
In the 1970s, imagination soared to new heights and any folly was possible. Cartier created the ultimate symbol of loving commitment: in the manner of handcuffs, the famous bracelet closes with a tiny screwdriver. Its humorous impudence appealed and was immediately adopted by legendary couples as a cult jewel. LOVE is a provocative talisman, a rallying cry for modern love, totally free from convention.13
The semantic potential of hypermedia includes and extends the print media experience and so the two media genres ‘semiotically span’ (Ventola, 1999, 2002) each other to create a co-contextualizing semantic field which reinforces the exclusive brand identity of Cartier with its accompanying identity of elegance, sophistication and cult of celebrity. However, print
media contains a meaning potential which cannot as yet be replicated in digital media, and that is the somatic modality whereby the reader physically interacts with the advertisement to feel (and smell) the quality of Cartier. It is a matter of time, however, before everyday digital media technologies incorporate the complete range of modalities experienced in material lived-in reality.
9.5 Conclusion
The configuration of marked choices in the Cartier advertisement in Time Asia 8 May 2006 guarantees that the viewer will see the advertisement, if the magazine is opened, and promotions at celebrity events ensure a visibility for Cartier which is hard to achieve on the Internet alone. Therefore, print and digital media combine to form a strategic multimodal semiotic campaign for Cartier whereby print advertisements lead the viewer to their website where the full range of products is shown, plus much more. The Cartier website provides in-depth information about the company, its history, products and services, which is presented using dynamic visual, aural and somatic modes of presentation. Customers proceed by their own volition to purchase Cartier luxury goods, but ideological motivations shaped by consumerism provide the basis for their actions. ‘Industries reflect the world through consumer choices, but they also shape the world by offering a limited set of choices, giving rise to Foucault’s view of consumerism as technology of the self’ (Martin et al., 1988; O’Halloran, 2008b, p. 59). We need to develop theories and practices to understand the relations between consumerism, identity and power in the digital age where transnational franchises increasingly market their goods and services across multiple sites (Lemke, in press). Systemic-functional social semiotics offers a theoretical platform to investigate the multimodal semiotic landscape in the digital age. As suggested in this chapter, the semantics afforded by print media are insufficient to capture the meaning potential of digital media; thus an approach utilizing digital technology is required to map visual, aural and somatic modalities which operate in the new world told and shown by new media.
Notes
Website references
1. http://en.wikipedia.org/wiki/Broadway_theatre (accessed 19 January 2009).
2. See http://en.wikipedia.org/wiki/Time_100 (accessed 19 January 2009).
3. http://www.timewarner.com/corp/newsroom/pr/0,20812,670354,00.html (accessed 19 January 2009).
4. http://en.wikipedia.org/wiki/Time_100 (accessed 20 January 2009).
5. http://en.wikipedia.org/wiki/Time_100 (accessed 20 January 2009).
6. www.time.com/time/specials/2007/0,28757,1733748,00.html?iid=redirecttime100 (accessed 19 January 2009).
7. www.famegame.com/ (accessed 21 January 2009).
8. www.famegame.com/party/362963/ (accessed 21 January 2009).
9. www.famegame.com/org/97592 (accessed 21 January 2009).
10. www.time.com/time/2006/time100/ (accessed 20 January 2009).
11. www.cartier.com/ (accessed 21 January 2009).
12. http://love.cartier.com/home.php?idlangue=ukandidcontinent=ao (accessed 21 January 2009).
13. www.cartier.com/en/Creation,B4064400,,Love-Rings (accessed 10 February 2009).
References
Baldry, A. P. and P. J. Thibault (2006) Multimodal Transcription and Text Analysis (London: Equinox).
Barthes, R. (1977) ‘Rhetoric of the image’, in Image-Music-Text (London: Fontana), pp. 32–51.
Bauman, Z. (2000) Liquid Modernity (Cambridge: Polity Press).
Bauman, Z. (2004) Identity (Cambridge: Polity Press).
Bohle, R. (1990) Publication Design for Editors (New Jersey: Prentice Hall).
Bolter, J. D. and R. Grusin (2000) Remediation: Understanding New Media (Cambridge, MA: The MIT Press).
Djonov, E. (2005) Analysing the Organisation of Information in Websites: From Hypermedia Design to Systemic Functional Hypermedia Discourse Analysis. PhD Thesis, University of New South Wales.
Halliday, M. A. K. (1978) Language as Social Semiotic (London: Arnold).
Halliday, M. A. K. (1985) ‘Part A’, in M. A. K. Halliday and R. Hasan (eds) Language, Context, and Text: Aspects of Language in a Social-Semiotic Perspective (Geelong, VIC: Deakin University Press), pp. 1–49. [Republished by Oxford University Press, 1989.]
Halliday, M. A. K. (2004) An Introduction to Functional Grammar (3rd edn, revised by C. M. I. M. Matthiessen, ed.) (London: Arnold).
Halliday, M. A. K. and C. M. I. M. Matthiessen (1999) Construing Experience through Meaning: A Language-Based Approach to Cognition (London: Cassell).
Iedema, R. and C. R. Caldas-Coulthard (2008) ‘Introduction: Identity trouble: Critical discourse and contested identities’, in R. Iedema and C. R. Caldas-Coulthard (eds) Identity Trouble: Critical Discourse and Contested Identities (New York: Palgrave Macmillan), pp. 1–14.
Kress, G. and T. van Leeuwen (1996) Reading Images: The Grammar of Visual Design (London: Routledge). [2nd revised edition 2006.]
Kress, G. and T. van Leeuwen (2001) Multimodal Discourse: The Modes and Media of Contemporary Communication (London: Arnold).
Lemke, J. L. (in press) ‘Transmedia traversals: Marketing meaning and identity’, in A. Baldry and E. Montagna (eds) Interdisciplinary Approaches to Multimodality: Theory and Practice. Readings in Intersemiosis and Multimedia (Campobasso: Palladino).
Lim, F. V. (2004) ‘Developing an integrative multisemiotic model’, in K. L. O’Halloran (ed.) (2004) Multimodal Discourse Analysis (London and New York: Continuum), pp. 220–46.
Lim, F. V. (2005) ‘Problematising semiotic resource’, in E. Ventola, C. Charles and M. Kaltenbacher (eds) Perspectives on Multimodality (Amsterdam: John Benjamins), pp. 51–64.
Lim, F. V. (2006) ‘The visual semantics stratum: Making meaning in a sequential series of visual images’, in T. Royce and W. Bowcher (eds) New Directions in the Analysis of Multimodal Discourse (New Jersey: Lawrence Erlbaum), pp. 195–214.
Machin, D. (2007) Introduction to Multimodal Analysis (London: Hodder).
Martin, L. H., H. Gutman and P. H. Hutton (eds) (1988) Technologies of the Self: A Seminar with Michel Foucault (Amherst: University of Massachusetts Press).
Martin, J. R. (1992) English Text: System and Structure (Amsterdam: Benjamins).
Martin, J. R. (2008) ‘Intermodal reconciliation: Mates in arms’, in L. Unsworth (ed.) New Literacies and the English Curriculum: Multimodal Perspectives (London: Continuum), pp. 112–48.
Martin, J. R. and D. Rose (2003) Working with Discourse: Meaning Beyond the Clause (London: Continuum). [2nd revised edition 2007.]
Martinec, R. (2000) ‘Types of processes in action.’ Semiotica, 130(3/4): pp. 243–68.
Martinec, R. (2004) ‘Gestures that co-occur with speech as a systematic resource: the realization of experiential meanings in indexes.’ Social Semiotics 14(2): pp. 193–213.
McCloud, S. (1993) Understanding Comics: The Invisible Art (Northampton, MA: Tundra).
Moya, J. and M. J. Pinar (2008) ‘Compositional, interpersonal and representational meanings in a children’s narrative. A multimodal discourse analysis.’ Journal of Pragmatics 40(9): pp. 1601–19.
O’Halloran, K. L. (1999) ‘Interdependence, interaction and metaphor in multisemiotic texts.’ Social Semiotics 9(3): pp. 317–54.
O’Halloran, K. L. (2003) ‘Intersemiosis in mathematics and science: Grammatical metaphor and semiotic metaphor’, in A. M. Simon-Vandenbergen, M. Taverniers and L. Ravelli (eds) Grammatical Metaphor: Views from Systemic Functional Linguistics (Amsterdam: John Benjamins), pp. 337–65.
O’Halloran, K. L. (2005) Mathematical Discourse: Language, Symbolism and Visual Images (London and New York: Continuum).
O’Halloran, K. L. (2007) ‘Systemic functional multimodal discourse analysis (SF-MDA) approach to mathematics, grammar and literacy’, in A. McCabe, M. O’Donnell and R. Whittaker (eds) Advances in Language and Education (London and New York: Continuum), pp. 75–100.
O’Halloran, K. L. (2008a) ‘Inter-semiotic expansion of experiential meaning: Hierarchical scales and metaphor in mathematics discourse’, in C. Jones and E. Ventola (eds) New Developments in the Study of Ideational Meaning: From Language to Multimodality (London: Equinox), pp. 231–54.
O’Halloran, K. L. (2008b) ‘Power, identity and life in the digital age’, in M. Amano, M. O’Toole, S. Shigemi and S. Wei (eds) Proceedings of the Third Global COE Conference (Nagoya: Nagoya University), pp. 45–61.
O’Toole, M. (1994) The Language of Display Art (London: Leicester University Press).
O’Toole, M. (2005) ‘Pushing out the boundaries: Designing a systemic-functional model for non European visual arts’, in J. Webster, C. Loran and G. Williams (eds) Linguistics and the Human Sciences (London: Equinox), pp. 83–97.
Painter, C. (2007) ‘Children’s picture book narratives: Reading sequences of images’, in A. McCabe, M. O’Donnell and R. Whittaker (eds) Advances in Language and Education (London: Continuum), pp. 40–59.
Royce, T. and W. Bowcher (eds) (2006) New Directions in the Analysis of Multimodal Discourse (New Jersey: Lawrence Erlbaum).
Unsworth, L. (2006) ‘Towards a meta-language for multiliteracies education: Describing the meaning making resources of language-image interaction.’ English Teaching: Practice and Critique, 5(1): pp. 55–76.
van Leeuwen, T. (1985) ‘Rhythmic structure of the film text’, in T. A. Van Dijk (ed.) Discourse and Communication: New Approaches to the Analysis of Mass Media Discourse and Communication (Berlin: Walter de Gruyter), pp. 216–32.
van Leeuwen, T. (1999) Speech, Music, Sound (London: Macmillan).
van Leeuwen, T. (2005) Introducing Social Semiotics (London: Routledge).
van Leeuwen, T. (2006) ‘Towards a semiotics of typography.’ Information Design Journal, 14(27): pp. 139–55.
Ventola, E. (1999) ‘Semiotic spanning at conferences; Cohesion and coherence in and across conference papers and their discussions’, in W. Bublitz, U. Lenk and E. Ventola (eds) Coherence in Spoken and Written Discourse. How to Create It and How to Describe It (Amsterdam: Benjamins), pp. 101–25.
Ventola, E. (2002) ‘Why and what kind of focus on conference presentations?’, in E. Ventola, C. Shalom and S. Thompson (eds) Conference Language (Frankfurt am Main: Peter Lang), pp. 15–50.
Ventola, E., C. Charles and M. Kaltenbacher (eds) (2004) Perspectives on Multimodality (Amsterdam: John Benjamins).
10 A Systemic Functional Framework for the Analysis of Corporate Television Advertisements Sabine Tan
10.1 Introduction
Every day, corporate television advertisements stream into the living spaces of mass audiences. Companies frequently choose corporate advertisements to communicate a diverse range of messages about themselves, their values, doctrines and beliefs. Yet, despite uncertainties about the effectiveness of this medium, the meaning-making mechanisms of corporate advertisements have remained largely unexplored. Accordingly, to aid our understanding of the complex ways in which visual, auditory and linguistic message elements combine to form meaning in corporate television advertisements, this chapter introduces an analytic, integrative multisemiotic framework and transcription template for decoding and interpreting the meaning-making propensities of dynamic multimodal texts. The chapter begins by highlighting the historical impact of television on the advertising industry and discussing the corporate ideology of the featured advertising text. It then provides an outline of current developments in multimodal analysis and introduces the theoretical framework and transcription template. Subsequently, it discusses the methodological aspects of selection criteria for the segmentation of a film text into appropriate constituent levels before illustrating the framework in the analysis of a sample text, and it concludes by considering the imperatives of a semiotic approach and industrial practice for the analysis of corporate television advertisements.
10.1.1 The history of television advertising
Advertising started in print. When radio came along ... we all had to ... sell with words and music and no pictures ... Then along came television and everything changed again – back to pictures, the era of visual demonstration. (Edwin L. Artzt, former chairman of the board and CEO of Procter & Gamble) (Budd et al., 1999, p. 9)
For advertisers, a new era began in the 1950s and 1960s, when advertisements first began to infiltrate the living spaces of target audiences through the medium of television. For the next 40 years, television helped to popularize many renowned brand names almost overnight by harnessing the dynamic power of moving pictures (Budd et al., 1999, p. 9). Traditionally, commercial advertising was – and still is – driven by business needs and priorities. Established practices in film production in terms of audience positioning also helped to shape the tone of commercial television. According to Ellsworth (1997, p. 23), most decisions about the overall structure of a film text are made in light of conscious and unconscious assumptions about who the intended audience is, what they want, and how they read films. Hence, the advertising industry learnt very quickly ‘to speak in terms meaningful to those it would target’ (Budd et al., 1999, pp. 13–16). Unlike other, more conventional, forms of advertising, corporate advertisements do not promote a particular product or service to specific target audiences, but instead tend to communicate some information about the advertiser as a whole. Many businesses, banks included, use corporate advertising to put across an idea or philosophy about their company as part of a campaign to reinforce their brand name and/or to promote long-term goodwill (Arens, 2002, p. 94).
10.1.2 The advertising text
The advertisement analysed in this chapter is the 2004 ‘Okey Doke’ motorcycle television commercial by HSBC, one of the world’s largest international banking and financial services organizations; it is hereafter referred to as the HSBCtext. Its storyline revolves around the impressions of a biker travelling all over South America and his ultimate ignorance about the culturally diverse meaning of the kinetic expression ‘Everything’s OK’. The advertisement constituted part of HSBC’s ‘Cultural Collisions’ advertising campaign, originally launched in 2002, designed to stress the importance of local knowledge and cultural diversity across the globe, and to reinforce HSBC’s brand image as ‘The world’s local bank’ (see HSBC, 2002).1
10.2 The theoretical framework
In terms of theoretical approaches, however, the medium of corporate advertising has often been treated with reserve. According to Arens (2002, p. 357), ‘[h]istorically, companies and even professional people have questioned, or misunderstood, the effectiveness of corporate advertising’. As Kress and van Leeuwen (2006, pp. 31–2) aptly point out, ‘[t]he world represented visually on the screens of the “new media” is a differently constructed world to that which had been represented on the densely printed pages of the print media of some thirty or forty years ago. The resources it offers for understanding and for meaning-making differ from those of the world represented in
language’. As a result, the diverse meaning-making potentials of corporate advertisements are often left unexplored because, as Budd et al. (1999, p. 86) observe, ‘[w]e may know how to use the tools of language, how to understand television and film [...] But [because] we’re operating from “inside” these systems, we don’t fully understand their rules and implications’. Moreover, the medium of television does not derive its visual or verbal import in isolation, but routinely co-deploys devices from more than one semiotic mode of communication. In order to demystify the multisemiotic language of corporate television advertisements, one needs to look closely at the systems and synergistic processes that operate to make meaning in such dynamic texts, which the proposed framework hopes to address. However, while various theoretical semiotic models exist for analysing self-contained, static forms of mostly print media, integrative multisemiotic frameworks that take into consideration the dynamic, multivariate character of filmic representations are currently still under-represented. The most eminent semiotic theories for the analysis of visual images within a systemic functional tradition are perhaps represented by the works of O’Toole (1994) and Kress and van Leeuwen (1996, 2001). Traditionally, other linguists have drawn on and expanded these established concepts by taking into account processes of intersemiosis, such as Royce (1998), whose analytical framework for exploring intersemiotic complementarity exemplifies that verbal and visual modes ‘work together’ to create and reinforce meaning in page-based multimodal texts. Although some of these theories have been expanded to include three-dimensional media such as sculpture and architecture (O’Toole, 1994; Kress and van Leeuwen, 1996, 2001), they are still largely concerned with the analysis of static texts.
Advances in the analysis of dynamic media were made by Thibault (2000), Iedema (2001), Baldry (2004) and O’Halloran (2004). Thibault (2000, p. 359), who developed a multimodal transcription template for a television advertisement, is particularly critical of the use of existing, two-dimensional, frameworks for the analysis of dynamic representations. He argues that they remain too ‘closely tied to the linguistic semiotic’ and are ‘not adequately motivated [by] the nature of the link between any particular metafunction and the syntagmatic structuring principle(s) which realize it’. Thibault’s (2000), Iedema’s (2001) and Baldry’s (2004) frameworks all highlight the multifunctional nature of meaning-making resources and the importance of rhythm as a source for constructing meaning in dynamic texts, while O’Halloran’s (2004) systemic functional model for the analysis of film also acknowledges the significance of filmic genres and cinematographic techniques. The proposed framework (which is outlined in Section 10.3.1) was developed by drawing on adaptations of these theories and concepts. In addition, following O’Halloran (2004), the framework also considers the impact of cinematographic conventions based on Bordwell and Thompson’s (2004) discussions on film theory, by exploring how these interact dynamically
with other semiotic resources, to aid our understanding of the intricacies of meaning-making processes that may operate in dynamic multimodal texts.
10.3 The transcription template
The proposed transcription template (an excerpt of which is included in Appendix 10.1a, together with a List of Notations/Notational Symbols in Appendix 10.1c) evolved as a hybridized adaptation of Thibault’s (2000) and Baldry’s (2004) manual transcription templates for the analysis of television commercials. While Thibault’s (2000) template is stacked vertically on the basis of a fixed time-per-second correlation between shots and selected visual frames (i.e. one shot per second), Baldry’s (2004) row-based transcription template adopts a horizontal, continuous layout that unfolds like a musical score, with the potential to move beyond the confines of the printed page (see Baldry, 2004, pp. 84, 87). Following Baldry (2004), a horizontal format is preferred as it supports a continuous, wrap-around presentation of visual frames based on actual shot length or duration. It is my contention that a horizontal layout aids intersemiotic analysis, as it captures the composite process of the ways in which the different resources are co-deployed across modes. Moreover, it allows aspects of the Soundtrack to be traced and correlated more precisely as they coincide with a particular Frame within a Shot or Mise-en-Scène. A further benefit is that a row-based, horizontal system allows the analytical categories to be expanded indefinitely.
10.3.1 Template outline and description of analytic categories
The proposed transcription template is structured according to four broad analytic blocks or categories, which are divided into smaller sub-categorical units, largely consisting of two rows each (see Appendix 10.1a). Block 1 is concerned with the sequential array of frames, shots, scenes, sequences, phases and sub-phases. Block 2 captures aspects of the Soundtrack. Block 3 is metafunctionally organized and concerned with manifestations of experiential/representational, interpersonal and textual/compositional meaning potential(s) conveyed by visual message elements, while Block 4, the last one, records intersemiotic relations.
10.3.1.1 Block 1 – Frame, Shot, Scene, Sequence, Phase and Sub-Phase
The uppermost row of the template reflects the number and title of the respective Phase and/or Sub-phase, while the second row, labelled Visual Frame, features the still image of the visual representations, complete with Frame, Shot, Scene, and Sequence numbers.
10.3.1.2 Block 2 – the soundtrack
Block 2 aims to be a stepping stone for uncovering the meanings conveyed by the Soundtrack as realized, for example, by Soundscapes and Sound
Perspectives (van Leeuwen, 1999). The first two rows in this block are concerned with recording the modal input in Music, for example, the degree of loudness, volume and tempo, and instances of melodic changes and Song, while the subsequent row includes a prosodical transcript of the accented and rhythmical Speech units as they co-occur in sequence with the respective Visual Frame(s). The notational symbols and glosses used to record aspects of the Soundtrack, particularly the element of Music, are adapted from Thibault (2000) and Baldry (2004), while the transcription symbols to capture accented and/or stressed speech units are adaptations of notational conventions used for conversation analysis and prosody as devised by Atkinson and Heritage (1984) and Bolinger (1985) (see Appendix 10.1c).
10.3.1.3 Block 3 – visual message elements
The first row in Block 3 offers a brief Verbal Description of who or what is represented in the specific Shot or Mise-en-Scène, and/or what the represented participants are doing. More precisely, it offers a brief description of the depicted Setting, kinetic features and actions. Drawing largely on adaptations of the theories and concepts expressed by Halliday (1994), Kress and van Leeuwen (1996, 2001), van Leeuwen (1996, 2001) and Bordwell and Thompson (2004), the next six rows in Block 3 concentrate on manifestations of experiential/representational, interpersonal and textual/compositional meaning potential(s).
For example, the first two of these metafunctionally oriented rows reflect on the ostentations of experiential/representational meaning potential(s), first by Narrative Representations, with a focus on Participants, Gaze- or Kinetic Action Vectors, Material-, Reactional-, Mental-, and Existential Processes, and Circumstances and, secondly, by Conceptual Representations, that is, instantiations of Relational and Semiotic Processes, as realized through processes of Denotation, such as Classificational, Symbolic Attributive and Symbolic Suggestive Processes, and/or processes of Connotation. The following two metafunctionally oriented rows record interpersonal realizations of meaning. While the row labelled Mood is concerned with Image Acts, Visual Offers or Demands, as conveyed by the Gaze, and visual realization of Social Distance, Interpersonal Involvement, Power Relations and Subjectivity, as conveyed by Size of Frame, Camera Angle, Camera Movement and Perspective, the row headed Modality is concerned with ostentations of visual reality, as conveyed through Colour Coding, Contextualization, Depth Perspectives, and Cinematographic Devices, such as Photographic Exposure and Speed of Motion/Camera Movement. The last two of the metafunctionally oriented rows are dedicated to recording textual/compositional relations. The first row of these, Composition, is concerned with aspects of Information Value and Visual Salience as mediated by Given & New structures, realizations of the Ideal & Real, and aspects of Compositional Balance, Frame Lines, and Viewing Paths. The second, entitled Graphic-, Rhythmic-, Spatiotemporal and Conjunctive Relations, forms the interstice between intra- and intersemiotic
selections, and is concerned with instantiations of logical meaning, and considers these relations between adjacent (and other) Shots. This row is specifically concerned with the meanings resulting from cinematographic techniques and editing devices, on the basis of Bordwell and Thompson’s (2004) discussions on film theory.
10.3.1.4 Block 4 – intersemiotic relations
While the intra-semiotic meaning potentials form an important aspect of the intersemiotic framework, they do not tell us how a shot, scene or sequence integrates or combines with other semiotic modalities, that is, they do not reveal how a filmic phase ‘relates to and is synchronized with say, music, language, and other sounds of the soundtrack’ (Thibault, 2000, p. 319). To describe the meaning-making mechanisms that are achieved through Intersemiotic Relations and to illustrate how the meanings projected through various semiotic modes and resources function to co- or re-contextualize each other in interactive space, the proposed framework adopts Royce’s (1998, 2002) terminology, which, in turn, is based on Halliday and Hasan’s (1976, 1985) categories of lexical cohesion. Consequently, to account for experiential complementarity, Block 4 seeks to identify intersemiotic Repetitions of identical meaning, and intersemiotic collaborations as realized through Synonymy (similarity relations), Antonymy (opposite relations), Meronymy (part-whole relations), Hyponymy (class-subclass relations), and Collocation (expectancy relations). Instantiations of interpersonal complementarity, in turn, are concerned with the consistency of interpersonal relations, sustained through Reinforcements of Address, and aspects of Attitudinal Congruence or Dissonance in terms of Mood and Modality.
The analysis of textual/compositional complementarity, however, presents a departure from Royce’s (1998) framework, and instead focuses on the realization of Intertextual Synonymy, such as the resolve of Given and New Structures across modes, and the conjunctive relations between image and narration, as instantiated through Iconographical Symbolism (where textual arguments act as visual ‘pointers’ which tell the viewer how a given motif is to be interpreted symbolically). While instances of intersemiotic complementarity at the rank of Shot, Scene or Sequence contribute mostly to reinforcing experiential meaning potential (see Tan, 2005) at the micro-analytic stage, instances of interpersonal and textual/compositional complementarity seem to have more immediate bearing on the overall meaning potential of the texts, and may be best analysed at the macro-analytic stage at the rank of Phase and Work As Whole as outlined in Section 10.4.2.
10.3.2 Aspects of methodology: Criteria for segmenting the film text into constituent levels of analysis
The development of an integrative, multisemiotic framework for analysing the meaning-making potential(s) conveyed through the intra- and intersemiotic
processes in corporate television advertisements invariably involves the segmentation of the texts into appropriate analytical categories. O’Halloran (2004), for instance, suggests the segmentation of filmic text according to Film Type, Film Form, and Genre, and a ‘metafunctionally organized rank constituent structure’ based on Film Plot, Sequences, Scene, Mise-en-Scène (i.e. everything which is seen within the frame as it unfolds in time together with the accompanying soundtrack) and Frame (O’Halloran, 2004, pp. 114–17). Alternatively, Iedema (2001) proposes six levels of analysis, namely, Frame, Shot, Scene, Sequence, Generic Stage, and Work As Whole (see Iedema, 2001, pp. 188–9). The smallest unit, the Frame, is essentially a salient or representative still of a Shot. Shots, in turn, are composed of a variable series of representative stills or frames, characterized by uncut camera actions. If there are perceptible camera movements within shots, ‘this may be due to panning, tracking, zooming, and so on, but not editing cuts’ (Iedema, 2001, p. 189). Scenes comprise more than one shot, whereby the principal identifying criterion is temporal and spatial continuity. Sequences, in turn, ‘comprise a range of contiguous scenes which are linked ... on the basis of thematic or logical continuity’ (Iedema, 2001, p. 189). The proposed framework for analysing the intra- (Micro-level Analysis) and intersemiotic (Macro-level Analysis) meaning potentials largely follows Iedema’s (2001) constituent levels of Frame, Shot, Scene and Sequence, although Iedema’s rank Generic Stage is substituted by the concept of Phase (explained in Section 10.4.2), while the notions of Film Form, Type and Genre, and Mise-en-Scène are incorporated, where appropriate, at the respective micro- and/or macro-analytic stage.
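The constituent levels of Shot, Scene and Sequence described above can be pictured as a simple nested data structure. The following is only a minimal sketch: the class and field names are illustrative assumptions rather than part of the framework, strict nesting is a simplification, and the frame counts and the sequence number are placeholders. The one grounded detail is that the chapter identifies SCENE 3 as spanning SHOTS 18–20.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shot:
    """An uncut stretch of camera action, made up of frames."""
    number: int
    frame_count: int  # placeholder values below, not taken from the transcription


@dataclass
class Scene:
    """One or more shots held together by temporal and spatial continuity."""
    number: int
    shots: List[Shot] = field(default_factory=list)


@dataclass
class FilmSequence:
    """Contiguous scenes linked by thematic or logical continuity."""
    number: int
    scenes: List[Scene] = field(default_factory=list)

    def shot_numbers(self) -> List[int]:
        # Flatten the hierarchy to list every shot in the sequence.
        return [shot.number for scene in self.scenes for shot in scene.shots]


# SCENE 3 comprises SHOTS 18-20; frame counts are hypothetical.
scene_3 = Scene(number=3, shots=[Shot(18, 10), Shot(19, 12), Shot(20, 11)])
# The sequence number 0 is a placeholder, not a label from the analysis.
example_sequence = FilmSequence(number=0, scenes=[scene_3])
print(example_sequence.shot_numbers())
```

A strictly nested model of this kind fits Iedema's (2001) ranks; it would need loosening for texts in which Sequences are built directly from individual Shots.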
10.4 Analysis of the HSBC-text: An alternative view
Following the selection criteria outlined in Section 10.3.2, the HSBC-text was first segmented into individual frames of 0.08 seconds each. With a total screen-time of about 60 seconds, it comprises a total of 750 frames, with an average number of 11 frames per Shot, and an average Shot duration of about 0.87 seconds, although it includes many individual Shots which are significantly shorter. The number of Shots included in the transcription template is based on actual length or duration, as this may have a considerable impact on the rhythmic-dynamic relations of the text, and needs to be taken into consideration in the analysis of these characteristics.2 In addition, many Shots, Scenes and Sequences in the HSBC-text seem to disregard the spatiotemporal relationships which form the underlying basis for Iedema’s (2001) classification. Instead they appear to be dictated by other forms of continuity, as well as discontinuity, motivated by formal and/or stylistic devices, elements of Mise-en-Scène, and unusual editing conventions. This is explored further in Section 10.4.1.
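The segmentation figures quoted above can be checked with a few lines of arithmetic. This is a sketch only: the constant names are illustrative, and the shot count is an estimate derived here, not a figure reported in the chapter. Eleven frames of 0.08 seconds give 0.88 seconds, which is in line with the reported ‘about 0.87 seconds’ if the average of 11 frames per Shot is itself a rounded value.

```python
# Figures quoted in the text.
FRAME_DURATION_S = 0.08       # duration of one frame in seconds
TOTAL_SCREEN_TIME_S = 60.0    # approximate total screen-time in seconds
FRAMES_PER_SHOT = 11          # reported average number of frames per Shot

# Total number of frames in the text.
total_frames = round(TOTAL_SCREEN_TIME_S / FRAME_DURATION_S)

# Average shot duration implied by 11 frames per Shot.
avg_shot_duration_s = FRAMES_PER_SHOT * FRAME_DURATION_S

# Rough shot count implied by these averages (an estimate, not a reported value).
estimated_shot_count = total_frames / FRAMES_PER_SHOT

print(total_frames)
print(round(avg_shot_duration_s, 2))
print(round(estimated_shot_count))
```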
10.4.1 Micro-level analysis
This section discusses the impact of intra-semiotic choices realized through editing conventions and other forms of (dis)continuity at the rank of Shot, Scene, and Sequence.
10.4.1.1 The impact of editing and other (dis)continuity devices
Commonly used editing devices, apart from straight cuts, are dissolves or fades, which unambiguously mark the transition between shots (see Bordwell and Thompson, 2004). In the HSBC-text, however, it is at times unclear where one shot ends and another begins. For example, in SCENE 1, shot boundaries are marked, or rather masked, by what may be termed a Flash (see Appendix 10.1b for visualization).3 Here, a Flash (SHOT 11), characterized by a brief burst of psychedelic lights and colours, separates SHOT 10 (an out-of-focus, overexposed, close-up shot of a little native South American boy) from SHOT 12 (a medium shot of two human figures, of which the little boy in SHOT 10 is now a participant, and which may thus be rendered a Scene). Another distinctive editing device for segregating shots in the HSBC-text is what is here termed a Swoosh. A Swoosh is characterized by a rapid diminishing of sharpness and focus, or blurring of the image, as demonstrated in SCENE 2, SEQUENCE 3, SHOT 14. While Swooshes may resemble whip pans (see Bordwell and Thompson, 2004, p. 401), they are distinctively different devices in that they can be present even when the camera is held static. In the HSBC-text, Swooshes often function as transition or boundary markers between Shots as well as Phases, such as in SHOT 29–30, for example, but they may also constitute a single Shot, as in SHOT 7. In addition, by conveying a sense of rapid movement, Swooshes in the HSBC-text also fulfil an important narrative function.
10.4.1.2 The impact of conjunctive relations
The viewers’ overall understanding of the unfolding of the filmic event operates largely within the Logical Metafunction, which is concerned with the ways in which one event is related to another in the overall structure of the film text, on the basis of the respective Conjunctive Relations that exist between Shots. In dynamic moving images, actions and events are frequently linked on the basis of Temporal Sequences, whereby ‘[t]he first shot shows one action or transaction or event, and the next shot shows the next action or transaction or event’ (van Leeuwen, 1996, p. 97). The logic of Temporal Sequences may be preserved through Continuity Editing, specifically ‘match on action’ shots, whereby a person’s action is shown as beginning in one shot and continued in the following shot (Bordwell and Thompson, 2004, p. 315). Temporal Conjunctions, however, can also be found in shots or scenes that cross the ‘Axis of Action’, where the logic of the unfolding action or event
is supported by mobile framing, as demonstrated in SCENE 7 (SHOTS 42–43), where the action of the biker stopping to pick up his female companion is first captured in a lateral tracking shot from a frontal perspective (SHOT 42).4 The camera then crosses the ‘axis of action’ as it continues to capture the action from a different angle in SHOT 43. Although the directional shift gives rise to a rhythmic/dynamic disjunction, temporal as well as spatial contiguity is preserved. Another aspect of a temporal conjunction relevant to the criteria for scene selection is Simultaneity, where ‘the first shot shows one action, transaction or event, the next shot shows another one, understood as happening at the same time’ (van Leeuwen, 1996, p. 97). This is demonstrated in SCENE 4 (SHOTS 21–22), which, at the same time, represents a transformation or shift in point of view. The sequential logic of the scene is further established in terms of Spatial Conjunctions, represented by the concept of Overview and Detail, which concerns ‘the relation between one shot which shows the whole of something (“overview”) and another which shows a part (“detail”)’ (van Leeuwen, 1996, p. 101). In SCENE 4, for example, SHOT 21 represents an Overview of the Scene (with the truck and biker captured in a long shot from the perspective of a neutral observer or co-traveller), while SHOT 22 represents the Detail, which captures what the biker himself might presumably see during the process of overtaking, thus creating the perception of simultaneity. In contrast, SCENE 3 depicts a reverse construction of this concept, where the camera first zooms in on the Detail in a medium close-up of native South Americans in carnival costume (SHOT 19), while the following long shot presents an Overview of the Scene (SHOT 20), inclusive of background features and Setting.
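The conjunctive relations discussed in this section can be recorded as annotations on pairs of adjacent Shots. The sketch below is one possible encoding, not the notation used in the transcription template: the enum labels, the `ShotLink` record and the `links_with` helper are hypothetical names, while the shot pairs and their relations are those described in the paragraph above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, List, Tuple


class Conjunction(Enum):
    TEMPORAL_SEQUENCE = "temporal sequence"
    SIMULTANEITY = "simultaneity"
    OVERVIEW_TO_DETAIL = "overview to detail"
    DETAIL_TO_OVERVIEW = "detail to overview"


@dataclass(frozen=True)
class ShotLink:
    """A relation (or set of relations) holding between two adjacent shots."""
    scene: int
    first_shot: int
    next_shot: int
    relations: FrozenSet[Conjunction]


# Shot pairs and relations as described in the text.
LINKS = [
    ShotLink(7, 42, 43, frozenset({Conjunction.TEMPORAL_SEQUENCE})),
    ShotLink(4, 21, 22, frozenset({Conjunction.SIMULTANEITY,
                                   Conjunction.OVERVIEW_TO_DETAIL})),
    ShotLink(3, 19, 20, frozenset({Conjunction.DETAIL_TO_OVERVIEW})),
]


def links_with(relation: Conjunction) -> List[Tuple[int, int]]:
    """Return the (first_shot, next_shot) pairs annotated with a relation."""
    return [(link.first_shot, link.next_shot)
            for link in LINKS if relation in link.relations]


print(links_with(Conjunction.SIMULTANEITY))
```

Storing the relations as a set allows a single pair, such as SHOTS 21–22, to carry both a temporal and a spatial conjunction at once.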
10.4.1.3 The impact of graphic relations
Graphic Relations, as represented by similarities in terms of Setting, Colour and Lighting, may also help to uphold the logical contiguity of Scenes and Sequences. In SCENE 3 (SHOTS 18–20), temporal and spatial continuity is mainly transmitted through an optically subjective perspective, or point-of-view (POV) shot structure, as well as elements of Mise-en-Scène (see Bordwell and Thompson, 2004, pp. 176, 264). The same also holds for SCENE 5 (SHOTS 24–25), and SCENE 6 (SHOTS 34–35). Consequently, in view of perceptible contiguities in terms of Setting and Lighting, it could be reasonably surmised that SHOT 44 (which captures the image of a mangy dog running along the roadside, presumably from the perspective of the biker or his pillion rider) effectively forms part of SCENE 7, while on the basis of Iedema’s (2001) classification, it is merely considered part of a larger Sequence. Some Scenes may have disruptions of temporal and spatial continuity, as in SCENE 1 and SCENE 2 (see Appendix 10.1b), while others can stand entirely on their own without entering into thematic or logical relationships
with other Scenes to form a Sequence. Alternatively, Scenes may simultaneously constitute Sequences, such as SCENE 2/SEQUENCE 3, where the camera seems to move with the main protagonist (the biker) within and across time–space boundaries. Conversely, Sequences need not necessarily be made up of contiguous Scenes but can be composed of individual Shots that are linked either by repetitious elements of Mise-en-Scène, or on the basis of narrative, thematic or logical continuity. In fact, given the underlying narrative theme of cross-continental travel in the HSBC-text, it could even be construed as representing one, long, contiguous Sequence, offering a varied ‘sampling of local sights and customs’ (Bordwell and Thompson, 2004, p. 133).
10.4.2 Macro-level analysis
While the previous section dealt with micro-analytical considerations at the rank of Shot, Scene and Sequence, the macro-analytic stage is concerned with the wider organizational principles of the film text and the dynamic impact of intersemiosis at the rank of Phase and Work As Whole.
10.4.2.1 The concept of Phase
Although narrative strands may contribute in some measure to the overall organization of a film text, television advertisements routinely unfold in wave-like, rhythmical patterns, or Phases, which arise out of the constant shift in selection choices from one or more semiotic modes or resources (see Thibault, 2000; Baldry, 2004). Building on the work of Gregory (1995, 2002), Thibault (2000, pp. 325–6) defines a phase as a ‘set of co-patterned semiotic selections that are co-deployed in a consistent way over a given stretch of text’. In order to capture the dynamic interaction of semiotic choices, macro-analysis is based on the phasal organization of the film text.
10.4.2.2 The phasal structure of the HSBC-text
The phasal boundaries in the HSBC-text (see Appendix 10.1b), which draw largely on the generic features of a travel documentary in narrative film form, were identified and assigned a title on the basis of the overall narrative theme of the text.5 Sub-phasal titles, in turn, reflect the thematic content of the internal narrative strands, which do indeed reveal a consistent, wave-like pattern, as they cut back and forth between interrelated conceptual themes. The individual phases, however, do not necessarily correlate with the narrative stages of thematic development. Rather, they coincide with the Given and New structures of the text, at both thematic and phasal level. For example, Phase 1 ‘All over South America’ may be seen as constituting the GIVEN in terms of the overall information structure of the text (see Halliday, 1994), but it encompasses both narrative stages of Orientation and Complication.6 The subsequent phase, Phase 2 ‘In Brazil’, which corresponds with the narrative stage of Conflict, may be seen as realizing the NEW,
which, intersemiotically, is further supported by the thematic structure of the voice-over narrative.
Table 10.1 Phasal and information structure of the HSBC-text
Phase 1 | All over South America, this gesture means everything’s OK. | GIVEN
Phase 2 | Apart from Brazil, that is. Where it’s really rather rude. | NEW
However, in terms of metafunctional realization, it may not be the phases but the transitions that are of particular relevance (see Thibault, 2000, p. 321; Baldry, 2004, p. 95). As Baldry (2004, p. 93) points out, phasal transitions ‘are not necessarily equated with the cutting from one shot to another’, nor need they coincide with Scenes or Sequences. Rather, transitions are instantiated by changes in the metafunctional organization of the film text (see Baldry, 2004, p. 94). In addition, these shifts are often accompanied by cinematographic devices, such as a change in camera movement, or a change in the music or the soundtrack (see Iedema, 2001, p. 190).
10.4.2.3 The impact of rhythmic/dynamic and graphic relations
In dynamic moving images, Rhythm is considered to be primarily responsible for the organization of filmic meaning (see Iedema, 2001, p. 193). Often, this is achieved by manipulating elements of Mise-en-Scène, that is, choices within the Editing function, which ‘can control not only what we look at’ (Bordwell and Thompson, 2004, p. 218), but also how we look at a multimodal text. As pointed out by Bordwell and Thompson (2004, p. 207), ‘[t]he general formal principles of unity, disunity, similarity, difference, and development will guide us in analyzing how specific elements of mise-en-scene can function together’, or which should be treated as separate and independent, or perhaps even contrasting items of information (see van Leeuwen, 1996, p. 96). The fluidity of dynamic moving images may thus be influenced by the rhythmic/dynamic sequencing relations between Shots, which regulate aspects of spatiotemporal, as well as logical, continuity and/or discontinuity.
Sub-phasal boundaries and transitions in the HSBC-text are often marked by Rhythmic/Dynamic Disjunctions in terms of camera movement and Graphic Conflicts in terms of colour saturation and lighting. In SCENE 1, for example, a shock cut (Bordwell and Thompson, 2004, p. 400), realized by abrupt changes in Setting, Colour Saturation and Lighting (SHOT 10 and 11), not only upsets the spatial and temporal continuity of the preceding Shots, but also effectively marks the beginning of a new sub-phase. By the same token, connections between Shots, Scenes and Sequences may be established purely on the basis of Graphic Relations; that is, similarities in terms of shapes, colour, lighting conditions, camera orientation, etc.
Table 10.2 Intersemiotic repetition of conceptual narrative theme
this gesture | means | everything’s OK
Identified | Process: intensive | Identifier
Token | | Value
Signifier | | Signified
In the HSBC-text, similarities in terms of Setting, Colour Saturation and Lighting not only inform the modal function, but essentially operate on the intersemiotic plane by creating recurrent Visual Themes or Motifs. Placed at major transition points, they fulfil the function of demarcating sub-phasal boundaries. Frequently, this is reinforced by simultaneous changes in the Soundtrack, such as the intervallic sub-phases ‘On the road again’ (Sub-phase 1b, 1k, 1m – SHOT 3, SHOT 32, SHOT 40; and Sub-phase 1g, 1o – SHOT 17, SHOT 47, SHOT 49), where extreme long, underexposed shots of the biker against a dark-blue and ominous background are often accompanied by the synthesized hum of the drone of the motorcycle (simultaneously representing a change in the interpersonal orientation of the text). Finally, in the HSBC-text, Graphic Relations also play a part in advancing the overall narrative development of the text. Sub-phase 1q ‘Everything’s OK’, for example, presents a congruent stream of visual motifs (conveyed through the kinetic hand-gesture) that unfold in a quick succession of Graphic Conflicts. These effectively function to realize the Conceptual Narrative Theme of ‘Everything’s OK’ (visually and verbally) through the process of Intersemiotic Repetition (see Appendix 10.1a for close analysis) established on the basis of Iconographical Symbolism, with the verbal arguments acting as a visual ‘pointer’ which literally tells us how the given motif should be interpreted (see van Leeuwen, 2001, p. 107). In this instance, the verbal utterance fulfils a particular defining function by ‘anchoring’ or fixing the meaning of the signified through an Intensive Identifying process of the ‘equative’ kind (see Halliday and Hasan, 1985, p. 123). 
Additionally, in light of the overall narrative development of the text at the rank of Work As Whole, the rapidly changing Mise-en-Scène and conflicting graphic relations evident in this sub-phase may also help build up viewers’ excitement, and may thus be construed as a prelude to the climax, which culminates in the cross-cultural faux pas committed by the biker in the subsequent phase ‘In Brazil’.7
10.4.2.4 The impact of soundscapes and sound perspectives
Lastly, phasal transition points can also be marked by changes in the Soundtrack. Quite clearly, the composite meaning of filmic representations is not realized through the semiotic dynamics of moving images alone, but in a symbiotic relation with the Soundtrack. For example, the manner in
which sound can attract our attention may be relayed through to textual/compositional choices, that is, the way the different auditory elements (i.e. speech, music and noise) relate to each other in terms of their spatial, temporal and rhythmic integration. Like visual representations, sound events may be compositionally foregrounded in terms of Salience, that is, the way in which the different elements stand out to differing degrees. Unlike visual modes which are inherently two-dimensional, van Leeuwen (1999, p. 14) elaborates that sound is a ‘wrap-around’ medium that tends to be ‘hierarchized’ in terms of perspective. He introduces Schafer’s (1977) terms of Figure, Ground and Field to differentiate between individual auditory elements that take precedence over one another in terms of their relative loudness, or acoustic salience (van Leeuwen, 1999, pp. 14–23). Consequently, Figure is declared to be ‘the most important sound, the sound which the listener must identify with, and/or react to and/or act upon’ (van Leeuwen, 1999, p. 23), whereas Ground and Field generally do not ‘speak to us’ with their ‘own voice’, but rather act as a form of accompaniment, setting, or ‘aural wallpaper’ (van Leeuwen, 1999, p. 112) that can be heard in the background. This is particularly evident in the HSBC-text, where the change to Sub-phase 1q ‘Everything’s OK’ is heralded by distinct changes in the perspective of the Soundtrack, which makes individual sounds stand out to differing degrees. This contrasts sharply with the overall pattern of the HSBC-soundtrack, which – for the most part – presents the viewer/listener with the uniform,
impenetrable ‘wall of sound’ that is characteristic of the lo-fi soundscape, which is the typical perspective of hard rock (see van Leeuwen, 1999, pp. 17, 21). Consequently, the transition to Phase 2 ‘In Brazil’ is marked not only linguistically through Conjunctive Relations (‘Apart from Brazil, that is’), but also by the abrupt suspension of all background music, song and voice-over narration, which, in terms of its waveform, is only punctuated at salient points by the relative silence of diegetic sounds (see Figure 10.1).8 On a structural level, these sudden, disjunctive shifts in sound texture, perspective, source and volume may even be seen as fulfilling a deictic function by creating a marked break in the narrative flux in which the diegetic space is momentarily (and rather unexpectedly) shared with the viewer (see Goodman and Graddol, 1996, p. 59). Placed at strategic points in the filmic narrative, these sound events fulfil a textual function that channels viewers’ attention to what is thematically (and visually) most important (namely, the cross-cultural faux pas committed by the biker). In addition, as these structural juxtapositions have the potential to sufficiently startle the viewers, and thus engage them emotionally on an interpersonal level, these devices may perhaps be construed as representing Acoustic Demands, which effectively highlight the synergistic qualities of the system of Sound in conveying the overall meaning potential of the text.
Figure 10.1 Soundscapes: waveform analysis of the HSBC-text
[Waveform of the soundtrack, with amplitude in dB plotted against time (h:m:s, 0:05.0 to 0:55.0). Labelled events: soft drums; motorbike drone; rock music; song (‘There goes easy’, ‘Easy Rider riding on the high-way’, ‘take me higher’); voice-over narration (‘All over South America this gesture means everything’s OK’, ‘Apart from Brazil that is’, ‘Where it’s really rather rude’, ‘We never underestimate the importance of local knowledge’, ‘HSBC The world’s local bank’); diegetic sound (radio; spoon clinking against glass); scream. Key: small capitals = music, sound; italics = song; bold = voice-over narration; lower case = diegetic sound.]
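The distinction between Figure, Ground and Field discussed in this section rests on the relative loudness of co-occurring sounds. As a rough illustration only, the sketch below ranks a set of sounds by level and labels them accordingly. The function name and the dB values are hypothetical and are not measurements from the HSBC-text, and reducing sound perspective to loudness alone is a deliberate simplification.

```python
from typing import Dict


def assign_perspective(levels_db: Dict[str, float]) -> Dict[str, str]:
    """Label the loudest sound Figure, the next loudest Ground, the rest Field."""
    # Sort sound names from loudest to quietest.
    ranked = sorted(levels_db, key=levels_db.get, reverse=True)
    labels = {}
    for position, name in enumerate(ranked):
        if position == 0:
            labels[name] = "Figure"
        elif position == 1:
            labels[name] = "Ground"
        else:
            labels[name] = "Field"
    return labels


# Hypothetical levels in dB, chosen purely for illustration.
example_levels = {
    "voice-over": -3.0,
    "rock music": -12.0,
    "motorcycle drone": -18.0,
}
print(assign_perspective(example_levels))
```

In a lo-fi ‘wall of sound’ the levels would be close together, so a ranking of this kind would separate the layers far less clearly than in the example.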
10.5 Conclusion
As demonstrated in this chapter, the meaning potential of corporate television advertisements is multifunctional, multidimensional and interactive. However, as Kress and van Leeuwen (2001, p. 112) point out, ‘[m]any semiotic resources are reserved for specialists, or known in different ways by those who actively use them for semiotic production’. In advertising, especially in large agencies, the production of visual and verbal representations often falls within the purview of individual experts, which may result in differing ‘views’ and ‘voices’ being represented in a single text. This problematic issue may be further compounded by the fact that up until now not many theoretical or practical approaches to advertising take into account the meaning potential of semiotics. Although Kress and van Leeuwen (1996) point to an increasing acceptance by the advertising industry towards combining systematic analysis and practice, they do accede that ‘within the media, visual design is still the province of specialists who generally see little need for methodical and analytically explicit approaches, and rely on creative sensibilities honed through experience’ (Kress and van Leeuwen, 1996, p. 12). Moreover, further inconsistencies in the encoding and decoding of dynamic multimodal texts may result from the fact that people are presumed to be ‘more critically attuned to what is said, but less so to what is shown or “sounded” ’ (Iedema, 2001, p. 201). Without having access to
the meanings produced by semiotic systems, especially ‘[i]f we turn our attention to images or sound, we often have no other resources for dealing with them than intuition and commonsense. But if we cannot deconstruct visuals and sounds, a whole universe of meanings escapes critical notice’ (Iedema, 2001, p. 202). While a semiotic approach may not be the answer to all these problems, the framework proposed in this chapter may constitute a viable means for ‘deconstructing’ and making transparent ‘what might otherwise remain at the level of vague suspicion and intuitive response’ (Iedema, 2001, p. 200). Nevertheless, the proposed transcription template is not without its limitations. First, while it tries to be reasonably detailed and comprehensive, it is essentially exploratory in its approach. Secondly, there is a strong interpretative component involved, as the analysis does not take into account the views of the advertiser or advertising specialists who were responsible for the creation of the text. Thirdly, as the meaning of television commercials is inherently context-bound and generically specific, any account of the meaning-making mechanisms in a television commercial is therefore partial and subjective, and not necessarily applicable to other advertising texts. Consequently, more in-depth research into other genres, film-forms, institutions and industries would be needed – which, in addition to the social semiotic, also takes into account business practices and behaviour – to form a more conclusive assessment about the meaning-making potentialities in corporate advertisements. Despite these limitations, it is nevertheless hoped that the proposed framework and transcription template for multimodal analysis will help advance our understanding of the complex ways in which semiotic modes and resources may contribute to form meaning in a dynamic advertising text.
Appendix 10.1a Excerpt of transcription template for the multimodal analysis of dynamic moving images

Block 1
Phase/Sub-phase: 1q – Everything’s OK
Shots: SHOT 54; SHOT 55
Visual Frame: Frames 292–300

Block 2
Sound: Music
SHOT 54: background rock music continues; Volume (p); Tempo: F
SHOT 55: background rock music continues; Volume (p); Tempo: F
Sound: Song
SHOT 54: song continues, but inaudible
SHOT 55: song continues, but inaudible
Speech
SHOT 54: every thing’s
SHOT 55: O K

Block 3
Verbal Description
SHOT 54: [Two figures on snow-capped mountaintop; figure on the right waves its arm. Argentine flag flaps in foreground.]
SHOT 55: Butcher, clad in green T-shirt, surrounded by flanks of brightly colored red meat, holds up hand in “Everything’s OK” gesture.
Experiential/Representational Meaning: Narrative Representations
SHOT 54: P:2; Vector: Y:gaze:off-screen:viewer; Process: Existential/Circumstance of Location; Movement (flag): Y, Process: non-transactional, intransitive, material process of action
SHOT 55: P:1; Vector: Y:gaze:off-screen:viewer; Circumstance of Means
Experiential/Representational Meaning: Conceptual Representations
SHOT 54: Relational Process: Symbolic Attributive: circumstantial:attributive: participants are mountaineers from (OR in) Argentina; Semiotic Process: Denotation:Categorization/Typification: Setting, props; Visual Collocation/Iconographical Symbolism: Argentine flag; Visual Metaphor; Visual Theme/Motif: impaired vision (implied); Conceptual/Narrative Theme: Everything’s OK
SHOT 55: Relational Process: Symbolic Attributive: intensive:attributive: participant is a butcher; Semiotic Process: Denotation:Categorization/Typification: props; Visual Collocation: meat; Conceptual/Narrative Theme: Everything’s OK
Interpersonal Meaning: Mood
SHOT 54: Direct Address: Y:demand; Size of Frame: extreme long to long shot; Social Distance: public to far social; Angle/Power: HP:slightly oblique/detached, VP:low; CM:stat
SHOT 55: Direct Address: Y:demand; Size of Frame: medium shot; Social Distance: far personal/close social; Angle/Power: HP:frontal:involved, VP:median; CM:stat
Interpersonal Meaning: Modality
SHOT 54: Color: less than naturalistic S/D; CX: median-low; Depth: medium-shallow:angled; CD: Exposure:under
SHOT 55: Color: naturalistic S/D; CX: low; Depth: shallow:central
Textual/Compositional Meaning: Composition
SHOT 54: Salience: Figure:placement
SHOT 55: Salience: Figure:Meat:perspective contrast:color
Textual/Compositional Meaning: Graphic/Rhythmic/Spatio-Temporal Relations
SHOT 54: ↔ Graphic Conflict: Setting, color, lighting ↔; ↔ Rhythmic/Dynamic Match: CM; ↔ Conceptual/Narrative Relation to SHOT 12, 22, 26, 38, 39, 45, 52, 53, 55, 56, (60)
SHOT 55: ↔ Graphic Conflict: Setting, color, lighting; ↔ Graphic Relation to SCENE 3 SEQUENCE 6; ↔ Rhythmic/Dynamic Match: CM; ↔ Conceptual/Narrative Relation to SHOT 12, 22, 26, 38, 39, 45, 52, 53, 54, 56, (60)

Block 4
Intersemiotic Relations
Intersemiotic Complementarity: Intersemiotic Repetition; Iconographical Symbolism

10.1057/9780230245341 - The World Told and the World Shown: Multisemiotic Issues, Edited by Eija Ventola and Arsenio Jesús Moya Guijarro
Appendix 10.1b
Phasal and narrative structure in the HSBC-text
Narrative Stage
Orientation
Phase/
1: All over South America
Sub-phase
1a – Setting the Scene
1b – On the road
1d – The natives
1c – The biker
Visual Frame SHOT 1
SHOT 2
SHOT 3
SHOT 4
SHOT 5
SHOT 6
SHOT 7
SHOT 8
SHOT 9
Scene/
SCENE 1
Sequence
Sound: Music
Narrative Stage
SHOT 10
SEQUENCE 1
SEQUENCE 2
beat of soft drums, cymbals Volume (n); Tempo: M
drums, :synthesized motorcycle drone Volume (f); Tempo: F
new beat: rock music; Volume (ff); Tempo: F
Orientation
Phase/
1: All over South America
Sub-phase
1d – The natives
1e – The biker
1f – The natives
1g – On the road again
1h – Carnival – the men
Visual Frame SHOT 11 Scene/ Sequence Sound: Music
SHOT 12
SCENE 1
SHOT 13
SHOT 14
SHOT 15
SHOT 16
SHOT 17
SCENE 2
SHOT 18
SHOT 19 SCENE 3
SEQUENCE 3 new beat: rock music continues Volume (ff); Tempo: F
SHOT 20
Narrative Stage
Orientation
Phase/
1: All over South America
Sub-phase
1i – On the road again
1j – The natives
1k – On the road again
Visual Frame
SHOT 21 Scene/
SHOT 22
SHOT 23
SCENE 4
SHOT 24
SHOT 25
SHOT 26
SHOT 27
SHOT 28
SHOT 29
SHOT 30
SCENE 5
Sequence
SEQUENCE 4
SEQUENCE 5
new beat: rock music continues Volume (ff); Tempo: F
Sound: Music
Narrative Stage
Orientation
Phase/
1: All over South America
Sub-phase
1k – On the road again
1l – Carnival – the women
1m – On the road again
Visual Frame
SHOT 31
SHOT 32
SHOT 33
Sequence Sound: Music
SHOT 34
SHOT 35
SHOT 36
SHOT 37
SHOT 38
SHOT 39
SCENE 6
Scene/ SEQUENCE 5 new beat: rock music Volume (ff); Tempo: F :dig:motorcycle drone (p)
SEQUENCE 6 new beat: rock music Volume (ff); Tempo: F
SHOT 40
Narrative Stage Phase/ Sub-phase
Orientation
Complication
1: All over South America
1m – On the road again
1n – The companion
1o – On the road again
1p – Trouble
Visual Frame
SHOT 41
SHOT 42 SHOT 43 SCENE 7
SHOT 45
SHOT 46
SHOT 47
SHOT 48
SHOT 49
SHOT 50
SEQUENCE 8 background rock music continues Volume (f); Tempo: F
new beat: rock music continues Volume (ff); Tempo: F
There (?goes?) (? /i:/ /ze / ?) Easy
Song Speech
Narrative Stage Phase/ Sub-phase
SHOT 44 SEQUENCE 7
e
Scene/ Sequence Sound: Music
Rider
riding
Complication
Conflict
1: All over South America 1p – Trouble 1q – Everything’s OK
2: In Brazil 2a – Setting 2b – The faux pas the Scene
on the highway All over
Visual Frame
SHOT 51 Scene/ Sequence Sound: Music
Song Speech
SHOT 52 SCENE 8
SHOT 53
SHOT 54
SHOT 55
background rock music continues Volume (p); Tempo: F;
South America
song continues, but inaudible this gesture (.) means everything’s OK
SHOT 56
SHOT 57
SHOT 58
SHOT 59 SCENE 9 SEQUENCE 9 :dig:radio:rock music Volume: (p), Tempo: M
background rock music stops take me higher (uhm) Apart from that is (•) (•) (uhm) Brazil
Where it’s really
SHOT 60 :dig:radio stops
ra::ther ru:::de
Narrative Stage
Conflict
Resolution
Coda
Phase/
2: In Brazil
3: HSBC The world’s local bank
4: All’s Well
Sub-phase
2b – The faux pas
2c – The monkey
Visual Frame
SHOT 61
SHOT 62
Sequence
Sound: Music
SHOT 63
SHOT 64
SHOT 65
SHOT 66
SCENE 9
Scene/
SHOT 67
SEQUENCE 9 :silence
:dig:spoon clinking against glass
:dig:silence, bird twitter, crickets (pp)
drums, rock music Volume: (f), Tempo: F
background rock music: new beat Volume: (ff), Tempo: F
Song (SCREA:::M) Speech
SHOT 68
SCENE 10
We never underestimate the importance of local knowledge
HSBC (.) The world’s local bank
178 Sabine Tan
Appendix 10.1c List of notations/notational symbols
Notational Symbol – Description
The Soundtrack
Musical note – Denotes instrumental music; more than one musical note indicates changes in beat, rhythm or melody
Long directional arrow – Indicates continuation of sound (speech/music/sound effects)
Marks the end of a continuous stretch of sound-input
Degree of Loudness/Volume and Tempo: Music
(pp) – Very soft
(p) – Soft
(n) – Normal
(f) – Loud
(ff) – Very loud
S – Slow
M – Median
F – Fast
Accented/Rhythmical Units: Speech and Song
(.) – A small dot enclosed in round brackets indicates a short pause
(•) – A large dot enclosed in round brackets indicates a long pause
::: – Colons indicate that the speaker has stretched the preceding sound or vowel; the more colons, the greater the extent of stretching
underline, bold underline – Underlining and bold underlining indicate stressed/accented speech units of varying degrees
>...< – Indicates stretches of speech that are noticeably quicker than the surrounding text
Indicates stretches of speech that are noticeably slower than the surrounding text
every thing’s OK – Words or speech units displayed higher or lower on the line denote rising or falling pitch levels
Verbal Description of Kinetic Movement or Setting
[text] – Text enclosed in square brackets designates a series of actions or movements which occur simultaneously
(text) – Round brackets indicate a sequence of movement(s) in time
—text— – Text enclosed in long dashes indicates static images with little or no movement
Narrative Representations: Identification of Represented Participants
P – Denotes represented participant(s)
P:1 – Only one participant in visual frame
P:1+ – More than one participant, but other participants are part of Setting
P:2 – Two participants; identifies transactive relations
Y/N – Denotes the presence or absence of Gaze- or Kinetic Action Vectors
A Systemic Functional Framework for the Analysis of TV-Ads
Visual Demands/Offers
Y – Denotes a visual demand
N – Denotes a visual offer
Size of Visual Frame
close-up – Shows just the head, hands, feet, or a small object
extreme close-up – Singles out a portion of the face (eyes or lips)
extreme long shot – Human Figure is barely visible; landscapes, bird’s-eye views
long shot – Full view of human figure(s) with background
medium long shot – Human Figure is framed from about the knees up
medium shot – Frames the human body from the waist up
medium close-up – Frames the body from the chest up
Social Distance/Proxemics
close personal – Distance at which ‘one can hold or grasp the other person’ and therefore also the distance between people who have an intimate relation with each other
far personal – Distance at which ‘subjects of personal interests and involvements are discussed’
close social – Distance at which ‘impersonal business occurs’
far social – More formal and impersonal
public – Distance between people who are and remain strangers
Angle/Power, Perspective
HP – Horizontal Angle: frontal angle signals involvement, oblique angle signals detachment
VP – Vertical Angle denotes power relations: high/median/low
POV – Point-of-View (subjective image)
Camera Movement
CM – Camera Movement
stat – Stationary Camera
mobile – Mobile Framing
dolly – Camera travels in any direction along the ground: forward, backward, circularly, diagonally, or from side to side
pan – Camera scans space horizontally from left to right or right to left
tilt – Camera scans space vertically up or down
zoom-in/out – Camera does not alter position; space is either magnified or de-magnified
Directionality of camera movement is indicated by short directional arrows
Naturalistic Modality Markers
Color – Summarizes aspects of Color Saturation (running from full color saturation to the absence of color, to black and white), Color Differentiation (running from a maximally diversified range of colors to monochrome)
S/D – Color Saturation/Color Differentiation
CX – Contextualization (running from the absence of background to fully articulated and detailed background)
Depth – Depth of Perspective (running from the absence of depth to maximally deep perspective, with central perspective having highest modality)
Cinematographic Devices
CD – Cinematographic Device
Exposure:under – Identifies shots that are underexposed
Exposure:over – Identifies shots that are overexposed
Speed – Identifies kinetic or camera movements that appear less than real, i.e. either too fast or too slow
Racking Focus – Identifies instances where perspective has been adjusted by racking focus, e.g., whereby a shot begins with an object in the foreground shown in sharp focus with the background fuzzy, then adjusts perspective by racking focus so that the background elements come into focus and the foreground becomes blurred
Graphic/Rhythmic Relations
↔ – Indicates graphic, rhythmic, or spatio-temporal relations with preceding and succeeding Shots
Special effects/editing devices
ST-Discontinuity – Identifies instances of Spatio-Temporal Discontinuity
FX – Special effect/editing device
super – Superimposition: identifies instances where one image is laid over another; also identifies non-diegetic inserts such as superimpositions of written text or logos within the visual frame
dissolve – Fades-out one shot while superimposing the next
fade-in – Gradually lightens a shot from black
fade-out – Gradually darkens the end of shot to black
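The volume and tempo codes of the soundtrack notation lend themselves to mechanical decoding. The following Python sketch is offered only as an illustration: it maps the codes used in Appendix 10.1b, such as (ff) and F, back to the descriptions given in the list of notations. The function name and the two patterns are assumptions made for this example (the appendix writes both "Volume (ff)" and "Volume: (p)"); they are not part of the transcription framework itself.

```python
import re

# Legend copied from the notation list (Degree of Loudness/Volume and Tempo: Music).
VOLUME = {"pp": "Very soft", "p": "Soft", "n": "Normal", "f": "Loud", "ff": "Very loud"}
TEMPO = {"S": "Slow", "M": "Median", "F": "Fast"}

# Hypothetical patterns: they accept "Volume (ff)" as well as "Volume: (p)".
VOLUME_RE = re.compile(r"Volume:?\s*\((pp|p|n|f|ff)\)")
TEMPO_RE = re.compile(r"Tempo:\s*([SMF])\b")


def decode_music(annotation):
    """Return the volume and tempo descriptions found in one soundtrack annotation."""
    volume = VOLUME_RE.search(annotation)
    tempo = TEMPO_RE.search(annotation)
    return {
        "volume": VOLUME[volume.group(1)] if volume else None,
        "tempo": TEMPO[tempo.group(1)] if tempo else None,
    }


if __name__ == "__main__":
    # Annotations taken from the soundtrack rows of Appendix 10.1b.
    for line in (
        "beat of soft drums, cymbals Volume (n); Tempo: M",
        "new beat: rock music; Volume (ff); Tempo: F",
        ":dig:radio:rock music Volume: (p), Tempo: M",
    ):
        print(line, "->", decode_music(line))
```

Annotations that carry no volume or tempo code, such as ":silence", simply yield empty values, so the sketch can be run over every soundtrack row without special handling.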
Notes
1. It should be noted that the advertisement is merely used here as a generic model for illustrating the various choices that are available within a semiotic framework. This chapter does not purport to interpret the text in terms of the advertiser’s philosophy.
2. Due to space constraints, only every other representative Frame was eventually selected for inclusion in the Transcription Template.
3. Appendix 10.1b presents the phasal and narrative structure of the HSBC-text as it unfolds on a shot-by-shot basis, together with the soundtrack. Whilst the proposed framework calls for shots to be depicted in terms of their actual length and duration, due to space constraints, Appendix 10.1b includes only the most salient visual frame.
4. For a detailed discussion on mobile framing, see Bordwell and Thompson (2004).
5. For a discussion on film form, type and genre, see Bordwell and Thompson (2004).
6. For a definition of social narratives, see Labov (1972) and Toolan (1988).
7. Shots in Sub-phase 1q consist of 7–8 Frames with an average duration of 0.62 seconds, which is considerably shorter than the gross average of 11 Frames per Shot with an average Shot duration of about 0.87 seconds.
8. The waveform analysis (performed with Adobe Audition™) reflects the degree of loudness/volume measured in decibels.
References
Arens, W. F. (2002) Contemporary Advertising (New York: McGraw-Hill/Irwin).
Atkinson, J. M. and J. Heritage (1984) Structures of Social Action: Studies in Conversation Analysis (Cambridge: Cambridge University Press).
Baldry, A. P. (2004) ‘Phase and transition, type and instance: Patterns in media texts as seen through a multimodal concordancer’, in K. L. O’Halloran (ed.) Multimodal Discourse Analysis: Systemic Functional Perspectives (London and New York: Continuum), pp. 83–108.
Bolinger, D. (1985) Intonation and Its Parts (London: Edward Arnold).
Bordwell, D. and K. Thompson (2004) Film Art: An Introduction (Boston: McGraw-Hill).
Budd, M., S. Craig and C. Steinman (1999) Consuming Environments: Television and Commercial Culture (New Brunswick, NJ: Rutgers University Press).
Ellsworth, E. (1997) Teaching Positions: Difference, Pedagogy, and the Power of Address (New York: Teachers College Press).
Goodman, S. and D. Graddol (1996) Redesigning English: New Texts, New Identities (London: Routledge).
Gregory, M. (1995) ‘Generic expectancies and discoursal surprises: John Donne’s The Good Morrow’, in P. H. Fries and M. Gregory (eds) Discourse in Society: Systemic Functional Perspectives, Meaning and Choice in Language: Studies for Michael Halliday (Norwood, NJ: Ablex), pp. 67–84.
Gregory, M. (2002) ‘Phasal analysis within communication linguistics: Two contrastive discourses’, in P. H. Fries, M. Cummings, D. Lockwood and W. Spruiell (eds) Relations and Functions within and around Language (New York: Continuum), pp. 316–45.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar, 2nd edn (London: Edward Arnold).
Halliday, M. A. K. and R. Hasan (1976) Cohesion in English (London: Longman).
Halliday, M. A. K. and R. Hasan (1985) Language, Context and Text: Aspects of Language in a Social-Semiotic Perspective (Oxford: Oxford University Press).
Iedema, R. (2001) ‘Analysing film and television: A social semiotic account of hospital: an unhealthy business’, in T. van Leeuwen and C. Jewitt (eds) Handbook of Visual Analysis (London; Thousand Oaks; New Delhi: Sage Publications), pp. 183–204.
Kress, G. and T. van Leeuwen (1996) Reading Images: The Grammar of Visual Design (London: Routledge).
Kress, G. and T. van Leeuwen (2001) Multimodal Discourse: The Modes and Media of Contemporary Communication (London: Arnold).
Labov, W. (1972) Language in the Inner City (Philadelphia: University of Pennsylvania Press).
O’Halloran, K. L. (2004) ‘Visual semiosis in film’, in K. L. O’Halloran (ed.) Multimodal Discourse Analysis: Systemic Functional Perspectives (London and New York: Continuum), pp. 109–30.
O’Toole, M. (1994) The Language of Displayed Art (London: Leicester University Press).
Royce, T. D. (1998) ‘Synergy on the page: Exploring the intersemiotic complementarity in page-based multimodal text.’ JASFL Occasional Papers, 1(1): pp. 25–48.
Royce, T. D. (2002) ‘Multimodality in the TESOL classroom: Exploring visual-verbal synergy.’ TESOL Quarterly, 36(2): pp. 191–205.
Schafer, R. M. (1977) The Tuning of the World (Toronto: McClelland and Stewart).
Tan, S. (2005) A Systemic Functional Approach to the Analysis of Corporate Television Advertisements, Unpublished Master’s thesis (Singapore: National University of Singapore).
Thibault, P. J. (2000) ‘The multimodal transcription of a television advertisement: Theory and practice’, in A. Baldry (ed.) Multimodality and Multimediality in the Distance Learning Age: Papers in English Linguistics (Campobasso: Palladino), pp. 311–85.
Toolan, M. J. (1988) Narrative: A Critical Linguistic Introduction (London; New York: Routledge).
van Leeuwen, T. (1996) ‘Moving English: The visual language of film’, in S. Goodman and D. Graddol (eds) Redesigning English: New Texts, New Identities (London: Routledge), pp. 81–105.
van Leeuwen, T. (1999) Speech, Music, Sound (Houndmills, Basingstoke, Hampshire and London: Macmillan Press Ltd).
van Leeuwen, T. (2001) ‘Semiotics and iconography’, in T. van Leeuwen and C. Jewitt (eds) Handbook of Visual Analysis (London; Thousand Oaks; New Delhi: Sage), pp. 92–118.
Website: HSBC (2002) HSBC Global Site, www.hsbc.com/1/2/newsroom/news/news-archive2002/new-campaign-for-the-worlds-local-bank (accessed 11 November 2007).
11 Multisemiotic Marketing and Advertising: Globalization versus Localization and the Media
Anna Hopearuoho and Eija Ventola
11.1 Introduction
Advertising is one of the principal genres that surround us in abundance today. In the modern world, we cannot avoid encountering ads locally and globally: they ‘bounce’ upon us on the pages of the daily papers and journals; they draw our attention along motorways on billboards; we have to listen to and watch them on the radio, on TV and on the Internet. Ads draw our attention to the qualities of marketed products or services and appeal to our needs and emotions by using highly creative and ‘colourful’ language. Ads are carefully designed to catch people’s attention, and with them advertisers push consumers to remember specific product names by using some typical, marked linguistic choices, for example, wordplay, imperative mood, inventing new words and using familiar words in new contexts (Dyer, 1982, pp. 139–41). Such phenomena and advertising genres have frequently and extensively been studied by linguists (see Bell, 1991, pp. 135–42; Geis, 1982; Vestergaard and Schrøder, 1985; Coleman, 1990; Cook, 1992; Myers, 1994; Goddard, 1998; Crook, 2004). Today, however, there is a notable shift in research foci and ways of doing research, as advertising has moved from the media of paper, radio and TV onto the Internet. Why is this so? The Internet as a new technological medium offers flexible ways of combining different kinds of modes. It has quickly developed into a medium that enables complex meaning-making ‘multisemiotically’ (the term used here, but cf. e.g. Norris (2004) ‘modal density’; Baldry and Thibault (2006) ‘co-deployment of resources’). Written texts, still images, layouts similar and dissimilar to paper ads, spoken texts and sound effects as in radio ads, and even moving images, that is, animation and film, until now typical only of TV ads, are all now combined in one on the Internet (see also Crystal, 2001, p. 201; Bateman, 2008, pp. 1–2). One Internet page can simultaneously use all these modes and
media for advertising – for creating meanings whereby advertisers try to connect the object that is being advertised with certain feelings that make the object appealing enough for potential customers to buy it, and now not only locally but also globally. Yet, as with the traditional media ads, even on the Internet, consumers can ‘opt out’ of the reading process. They can skip those parts of the layout where the advertising texts and images appear; they can turn off the volume if a voice ad appears and close the image and film clip ads if they start automatically. But advertisers seem to have developed one way of advertising on the Internet which consumers cannot ‘skip’ and which is thus hard to ignore unless mechanically ‘switched off’: the ‘pop-up’ banner ads. Due to the multisemiotic versatility of both the Internet page ad and the ‘pop-up’ banner ad appearing on it, this form of advertising is a particularly interesting choice for global players when they advertise their products and services. The chapter first discusses briefly the marketing contexts for services and products and the shift from mere local marketing to global marketing and, further, the ‘multisemiotization’ processes that coincide with the ‘globalization processes’ (Section 11.2). The chapter uses the systemic functional linguistic (SFL) and the multisemiotic views developed within that framework to discuss whether traditional ads and Internet ads (non-pop-ups and pop-ups) use the same or different means of creating meanings metafunctionally, that is, experientially, interpersonally and textually/compositionally in local and global advertising (Section 11.3). What is of specific interest here is: how are the three metafunctions fulfilled in local and global advertisements and to what effect, what kind of a role do the similarities or differences play in the localization vs.
globalization processes of marketing techniques of products and what do we learn about local strategies of global players? These issues will be discussed in relation to some results of the data example analyses (Section 11.4). Finally, conclusions are made about the meaning-making in traditional media and Internet advertising: how the local becomes global and how it is enforced upon us multisemiotically. This seems to indicate a growing need, on the one hand, for training interdisciplinary experts who can design such advertisements but who at the same time are linguistically and semiotically sensitive to the localization needs in the global market and, on the other hand, for raising critical multisemiotic awareness among viewer-readers of ads so that they can make informed choices of the services and products they consume (Section 11.5).
11.2 Changes in marketing contexts – from local to global marketing
Traditionally, products and services have been consumed in the local neighbourhood or area, or at least in the same country in which they have been
produced. This meant that advertising was primarily local. It was designed thematically and linguistically to appeal to local consumers and it was carried out with local media: first paper, then radio and TV. This largely applied up to the 1980s. ‘Foreign advertising’ also reached local consumers to some degree, when they bought foreign papers, listened to foreign radio stations and, somewhat later, were able to watch foreign cable TV programmes and their foreign ads. But these ads were designed with the foreign local market, not the global market, in mind. The availability of products has traditionally been confined to the regions where advertising has been seen as profitable; that is, there was no point in advertising an ‘exotic’ product, for example, in Finland, if it could not be bought in Finland. Today, the mechanisms of advertising have also changed. Due to the expansion of media, but particularly with the new possibilities offered by such technology as the Internet, there has been a shift in marketing design ‘from local to global’. The nature of advertising has changed because the nature of international business has changed. As international trade has expanded and transporting goods and services across countries and continents has become easier, companies have also become multinational and their globalization has become inevitable. In fact, many companies have had no choice but to go global, just to sustain their competitive advantage. Consequently, potential customers are no longer necessarily in the same region or even in the same country as the producers, nor do they speak the same language. Foreign products and services are now transported and available globally (and one cannot always be fully aware of their true origins). Marketing and advertising have also taken a turn to globalism, both in the traditional means of advertising and in the latest medium, the Internet.
Particularly this technology has given consumers the possibility to order products that are advertised worldwide quickly and efficiently, simply by dialling a number or by clicking a mouse button from virtually any country of the world. This global availability of products has also generated strategies for ‘global advertising’. Multinational companies prefer marketing strategies and media that reach maximal amounts of potential customers all over the world. Thus, for example, a global Finnish product can no longer be advertised in Finnish or with the marketing strategies and themes that appeal only to the local population (Nokia advertising ‘Connecting People’ may be considered one prime example of such worldwide campaigns). Thus, when studying ‘global advertising’, it becomes interesting to consider, for example, the following questions: What will get represented in global ads? How will global customers be addressed? How are global ads composed for the currently available media? Globalization is almost forced upon consumers through globalized marketing and availability of goods and services, but problems connected with globalized marketing also appear. Cultural diversity is a difficult challenge for any multinational company (see e.g. the discussion in Chapter 10 in
this volume). Globalization also meets resistance. Many buyers reject globalization and the products the exact origins of which they do not know. In other words, an opposite movement to ‘globalization’ has (re-)emerged, ‘localization’, as many people have started to value local products more than the imported ones. Consuming domestic goods is fashionable, and some people want to reject worldwide brands and images that are used for marketing products throughout the world using the same advertising strategies, themes and images. Thus, ‘globalization–localization’ seems to create tension to marketing designs: how to advertise global products and services globally and yet locally maximize appeal to customers? It seems that companies now try to place their marketing strategies somewhere between the two end positions on the cline of ‘globalization–localization’. Many multinational corporations now strive to build a consistent international brand image and thus use maximally the same global marketing material in every country, assuming that the same brand can principally be advertised everywhere in the same way. They try to develop a global identity for themselves and their products and services by not making significant differences in the marketing and advertising in different cultures. But, although these kinds of overall marketing means may be global, many big, multinational companies may also have to rely on ‘localization’ strategies in order to reach customers for their global products and services in different countries. This means developing a unique ‘marketing mix of global and local’ to match the distinctive features of each culture. The global advertisements are made to vary across countries to some degree because the local contexts may demand different linguistic and other semiotic realizations (i.e. local languages are used for advertising; certain cultural semiotic choices are highlighted in the ads; etc.). 
In this chapter, the discussion on placing the ads on the marketing scale of ‘globalization–localization’ will be elaborated with examples from a big multinational company, Toyota, which uses both ‘global and local’ strategies to advertise its cars. It will further be suggested that this positioning on the scale of ‘globalization–localization’ will also depend on the modal realizations of the ads on paper, on a web page or as a ‘pop-up’ banner, and that the differences and the similarities can fruitfully be analysed with the current multisemiotic theories developed within the systemic functional linguistic (SFL) framework.
11.3 Multisemiotics for analysing local and global ads
As noted in Section 11.2, multinational companies have to consider not only how to globalize but also how to combine the global with the local in ads in different media, when localizing advertisement campaigns. Thus, they have to find solutions for the right mix of the global representation of products and services with the local – the mix that addresses different
global audiences with enough local appeal – and then use globally effective media to infiltrate the consciousness of local consumers maximally. A theory that seems to handle the ‘what is represented, to whom and with what means linguistically’ well is the Hallidayan SFL theory and its metafunctional analysis of texts. Functionally, language systems cluster into roughly three different aspects of meaning-making: ideational, interpersonal and textual. The ideational metafunction construes a model of experience, and it is concerned with clauses as representations. The interpersonal metafunction enacts social relationships, especially the relationship between speaker and hearer, and it is concerned with clauses as exchanges. The textual metafunction creates relevance to context, and it is concerned with clauses as messages (Halliday, 1994, pp. 34–6). But since in ads we also deal with modes other than language in meaning-making processes, in the analysis we also apply the latest developments in SFL towards a multimodal and multimedial theory or theories, also called ‘multimodality’ (e.g. O’Toole, 1994; Kress and van Leeuwen, 1996), ‘multimodal discourse analysis’ (e.g. O’Halloran, 2004) or ‘multimodal semiotics’ (e.g. Unsworth, 2008), or simply ‘multisemiotics’, a term preferred here. These approaches are firmly grounded within the Hallidayan metafunctionally oriented meaning-making theory of language which is now pushed towards descriptions of other modes in addition to language, towards multifunctionally organized ‘grammars’ for explaining the other modal meaning-making choices and their combinations (e.g. the ‘visual grammar’ of images of Kress and van Leeuwen, 1996). The argument is that images, just as language, combine parts into meaningful wholes.
While the grammar of a language describes how words combine in clauses, sentences and texts, the grammar of visuals describes the way in which depicted people, places and things combine in visual statements (Kress and van Leeuwen, 1996, p. 1). Here only a brief explanation of the expansion of metafunctions beyond their use in text analysis to multisemiotics is necessary before the actual advertisements are considered. The ideational metafunction relates to how the participants, processes and relationships are represented in the text or image. O’Toole (1994, pp. 14–15) calls this metafunction the representational function, since its purpose is to convey to the viewer what the picture is actually about. The transitivity, that is, who is doing what to whom, can be explored by looking at what kind of processes there are in the text or image. Of the processes involved in representation, the ones most important to our current purposes are the material processes, which describe action, for example, in verbal texts where the actor is doing something concrete or abstract. In images, these may also involve a direct portrayal of the process. The other processes, although also important, receive less attention here for reasons of space (for more details, see e.g. Kress and van Leeuwen, 1990, pp. 65–70; Kress and van Leeuwen, 1996; O’Toole, 1994).
The interpersonal metafunction (or engagement function; O’Toole, 1994, pp. 5–8) includes the reader or viewer in an interaction with the text or image. A parallel is made between an offer and a command in text, for example, Would you like a cup of tea? / Give me a cup of tea! and images that either demand or offer (Kress and van Leeuwen, 1996). A demanding image engages viewers by addressing them directly. The participant’s gaze demands something from the viewer. An image can also address the viewer indirectly, and just offer something for them to see. These kinds of ‘offer pictures’ do not engage the viewer with a gaze. The textual metafunction relates to how a text is organized and structured (Hofinger and Ventola, 2004, p. 202), but when referring to pages, images and the Internet, the label compositional can also be used. Here, the Theme/Given and Rheme/New realizations in texts, for example, John ate the cake vs. The cake was eaten by John (where ‘John’ and ‘The cake’, respectively, are the Theme/Given), are applied to pictures: what is placed on the left is the ‘Theme’ of the picture, while the rest is the ‘Rheme’. The ‘Given’ is presented as the material that the reader already knows, and the ‘New’ is the unknown part. As in written texts, these elements conflate frequently with the ‘Theme’ and ‘Rheme’: the ‘Given’ element is on the left and the ‘New’ is on the right side of the page (Bateman et al., 2004, p. 66; see also Bateman, 2008 for a critical discussion). The assumption is that what is placed on the left is presented to the viewer as the familiar ‘point of departure’ for reading the image. This does not, however, mean that the element on the left is necessarily something that the reader is familiar with; it is merely presented as such (Kress and van Leeuwen, 1996, p. 187). Another way of organizing the elements in an image is the ‘Ideal’/‘Real’ division. The ideal is presented at the top of the picture, while the real is placed at the bottom, according to Kress and van Leeuwen (1996, pp. 191–206).
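As a rough illustration of the two divisions just described, the following Python sketch labels a position on a page or image as ‘Given’ or ‘New’ (left or right) and as ‘Ideal’ or ‘Real’ (top or bottom). The function name, the coordinate convention (measured from the top-left corner) and the hard half-way thresholds are assumptions made purely for this example; the account cited above treats these values as ways of presenting information in layouts read from left to right, not as a fixed geometric rule.

```python
def information_value(x, y, width, height):
    """Label a position on a page or image with its compositional values.

    x and y are measured from the top-left corner of the page or image.
    Splitting the area exactly in half is an illustrative assumption,
    not a rule stated in the framework.
    """
    horizontal = "Given" if x < width / 2 else "New"   # left vs. right
    vertical = "Ideal" if y < height / 2 else "Real"   # top vs. bottom
    return {"horizontal": horizontal, "vertical": vertical}


if __name__ == "__main__":
    # A headline near the top left of a hypothetical 800 x 600 layout.
    print(information_value(100, 50, 800, 600))
    # A product shot near the bottom right of the same layout.
    print(information_value(700, 500, 800, 600))
```

A sketch of this kind only records where an element sits; whether the viewer actually treats it as familiar or new remains a matter of interpretation, as the discussion above stresses.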
The information value varies according to different areas of the page: the centre usually contains the nucleus of information, and the other elements are in the margins. Salience can also create a hierarchy of importance among elements in an image. The greater the salience, the more important the element is. Salience means the weight of an object in a picture, and it consists of the size, distance, sharpness of focus, colour and placement of the object. The importance of a particular element can also be emphasized by framing. Framing increases the individuality of the element, while the lack of framing makes it part of a group. In compositions, whether paper layouts, images or web pages, salient elements are important since they attract the eye first and thus ‘the reading path’ of an image depends on the salient elements (Kress and van Leeuwen, 1996, p. 208). In texts, the traditional reading path in Western countries goes from left to right, top to bottom. However, the reading path of a web page is significantly different from the traditional one. Crystal notes (2001, p. 196) that there are large quantities of texts that can be read in a multidimensional way. The readers, or viewers, move their eyes about the page according to
their interests and the design of the page. They do not necessarily read the text in fixed sequences. One significant factor affecting the reading path of a web page is the organization of information on the page. According to Crystal (2001, pp. 196–7), particular areas of a web page are designed to break the normal reading path to attract the reader’s attention (these issues have been discussed more specifically by Bateman, 2008). Today, there seem to be certain conventions about what kind of information a particular part of the page is supposed to contain. Especially in ads, these conventional patterns can be broken to catch the viewers’ attention. For example, advertisement banners are expected to be at the top of the page but can be placed in unexpected areas of the page. When describing multimodal artefacts, O’Toole (1994) extends the notion of ‘rank’ to the analysis of not only text, but also to other semiotics, such as paintings, sculptures, architectural buildings, suggesting that a multimodal object like a painting can be analysed at the ranks of work, episode, figure, member (for details, see O’Toole, 1994). Although the issue of ‘rank’ in multisemiotic analyses is still somewhat controversial, this notion will also be made use of here, when analysing the ads. These kinds of multisemiotic SFL considerations beyond text and their role in discourse studies have amply been demonstrated in research papers dealing with the use of different modes like language, image, sound, music, etc. for communicative purposes, and combining the modes with the use of different media – paper, telephone, radio, TV, Internet (see Ventola et al., 2004; O’Halloran, 2004; Jones and Ventola, 2008; Unsworth, 2008; this volume, etc.). It is not an easy task to describe the mix of ‘globalization–localization’ multisemiotically and to analyse if and how in ads the global products are multisemiotically ‘pushed’ into the consciousness of local consumers. 
It involves exploring the semiosis with this framework and analysing how the ads have been construed through combinations of linguistic, pictorial and sometimes even sound choices into structures of various kinds that consumers consider ‘unified, harmonious and appealing meaningful wholes’ in advertising. Section 11.4 discusses some of the Toyota car ads from the sample texts and looks at how they are construed metafunctionally to represent, to appeal and to be read as discourses in their compositions. The focus is on what is being taken for granted in an ad; this can help us to establish the link between language, multisemiotics and ideology, which seems to be particularly important when studying global and local aspects of advertisements.
11.4 Multisemiotics and ‘globalization–localization’ in car ads
To illustrate the processes of ‘globalization–localization’ at work in advertising, let us focus on a commodity available worldwide – a Toyota car,
Anna Hopearuoho and Eija Ventola
produced by a successful, multinational company – and its ads in a traditional local magazine, on an Internet page and as a pop-up ad where Toyota is imposed on readers who are actually reading other web pages on the Internet. The following Toyota ads will be discussed: a ‘paper ad’ (paper as a medium) from 2005; the web pages of the car company from 2005 and the current local country ‘web page ads’ (available at the time of writing from www.toyota-europe.com/) (Internet as a medium); and two ‘pop-up banner ads’ (Internet web page as a medium) from Toyota, one from 2005 and the other provided by the company in 2008. Each of the media is represented in Illustration 11.1. All of the ads include at least text as well as images. Therefore they are multisemiotic; they display ‘the use of several semiotic modes in the design of a semiotic product or event’ (Kress and van Leeuwen, 2001, p. 20). What is interesting in this context is how the different modes are combined and how they reinforce one another in pushing a global product. Both the text and the images (whether still or moving) of the ads will be considered important, as well as their relationship with one another. Each individual ad is first described in detail and then some aspects of the metafunctional analysis will be highlighted and related to ‘globalization–localization’ issues. That is, the general goal is to find out how the three metafunctions are realized in the advertisements and what effect they are used to construe. When analysing the meanings that advertisements create, it should be kept in mind that advertisements from a particular corporation are always part of its marketing strategy, which in this case is in turn the overall global selling strategy of that corporation.
Let us begin with the traditional magazine ‘paper’ advertisement of Toyota, then move on to the Toyota ‘web page’ advertising and finally to the Toyota ‘pop-up’ ad. We discuss some of the significant realizations of the metafunctional choices in the ads, the similarities and differences between them, and the general implications of how the ads function as ‘semiotic ensembles’ to enhance the ‘globalization–localization’ processes in advertising global products.
11.4.1 Multisemiotics and ‘globalization–localization’ in a traditional ‘paper’ car ad
The ‘paper ad’ (paper as medium), the first to be discussed, was taken from a monthly supplement magazine of a Finnish newspaper, Helsingin Sanomat, in spring 2005. Here the ad represents the choices that a multinational company makes when wanting to be ‘local’. The ad appeared at the time of the Athletics World Championships held in Helsinki in August 2005, so, as a paper ad, it is both ‘place-bound’ and ‘time-bound’. The ad presents four images, each representing a different car model, Yaris, Avensis, RAV4 and Corolla, as options for buying. The images of Yaris, Avensis and RAV4 are in the top part of the ad. In the top right-hand corner, the
Illustration 11.1 Toyota advertisements in different media: ‘Paper ad’, ‘Web page ad’ and ‘Pop-up banner ad’
10.1057/9780230245341 - The World Told and the World Shown: Multisemiotic Issues, Edited by Eija Ventola and Arsenio Jesús Moya Guijarro
Internet page address, www.toyota.fi, is visible, but of course the reader cannot click it to get to the Internet. The choice of Corolla as the most salient picture in the middle is most likely not accidental, as Corolla has for years led Toyota sales in Finland, being a popular, reasonably priced family car (see www.fi.wikipedia.org/wiki/Toyota_Corolla). This image has in its top right-hand corner a round, white-based circle, like a logo, with Helsinki Athletics World Championships 2005 written in Finnish on it. The word HELSINKI appears in the logo in capitals and the round logo also includes the Finnish flag – a blue cross against a white background. Beneath the round logo is the following text in Finnish, here given in English translation: The main official partner of the 10th IAAF World Championships in Athletics. Underneath the Corolla image, there is an eye-catching headline in Finnish in red letters: The top-equipped Championship 2005-Edition models – now available (again in translation), and the descriptive text on the left explains that the special edition equipment can also be obtained for all the models represented in the ad. All the text in the ad is in Finnish, except for a few English expressions intermingled with the Finnish. Linguistically, and appropriately for the purposes of the ad, the Finnish text reinforces the link between the Athletics World Championships and the top performance of athletes and cars: The thoroughly considered quality, refined to the detail, creates the excellence of the top-equipped Toyota Championship 2005 Edition that leaves others far behind. On the right below the Corolla image, the company gives an example of how the Corolla MM 2005 Edition can be equipped for €300 (the retail value of the chosen equipment being €670).
Below these there is a framed box titled in red capital letters, SUPER DAY, and below the title the date, 14.8.05; the text on the right within the frame invites readers to participate in a competition by that date, either at Toyota representative offices or on the web page www.toyota.fi/superday, to win VIP passes for two to the Athletics World Championships. To the right of this SUPER DAY box, we see the round Toyota logo and its slogan in capitals: TODAY (grey) TOMORROW (black) TOYOTA (red), with the colours highlighting the importance of the choice of Toyota. At the bottom of the ad, separated by a line, in two columns, we are given the technical information on the Toyota cars advertised here, for example, their price, their consumption per 100 km, etc. When this ‘paper ad’ is analysed metafunctionally, following the multisemiotic principles of O’Toole (1994) and others, the representational metafunction at the rank of the whole work shows that this advertisement is more a portrayal of a car than, for example, a narrative telling a story; alternatively, it can be considered to represent a collection of portrayals of items of the same class. At the rank of episode, the ad includes material processes: the cars seem to be moving forward. The actors involved seem to be men (in the smaller images of the other models advertised, the drivers are not quite as visible as the man driving the Corolla). In the processes, action is realized
by vectors created by the diagonal line of action. In all of the pictures the vectors run slightly diagonally from top left to bottom right. This gives an impression of speed. From the interpersonal perspective, and following Kress and van Leeuwen (1996), this ‘paper ad’ is an ‘offer’ picture, not a ‘demand’. There is no direct engagement with the viewer, and the participants in the images are not engaged with one another. In terms of the compositional metafunction, the four cars are presented as choices for the consumer, Corolla being the most salient one. Again, following Kress and van Leeuwen’s (1996) argumentation, in this ad, as in most car ads, the ‘ideal’ (the cars) is at the top and the ‘real’ at the bottom: the heading and the text about the special MM 2005 Edition model, the SUPER DAY competition in which you can win VIP tickets to the Athletics World Championships, and the technical details about the cars. As magazines are usually read by local readers, it can be assumed that ‘paper ads’ encourage relatively locally oriented combinations of metafunctional realizations. That is, what will be represented, to whom and by what means is to a certain degree determined by the ‘paper layout page’ as the medium for the local readership. This Toyota ‘paper medium’ ad has significant localization features. Linguistically it is realized in Finnish, and thus the text can only be read by those who know that language, so the language is determined by the local readership. The logo in the Corolla image is also local: the blue letters of HELSINKI and the Finnish flag would appeal only to the Finnish audience, both as text and through the national colours – blue and white. The choice, representation and composition of the Corolla as the biggest, central image in the ad could also be considered local, as it is the most favoured of the Toyotas in Finland.
This ad, as already mentioned, is also ‘time-bound’. The label MM 2005 realizes both the ‘time-boundedness’ and the locality: in the Corolla MM 2005 Edition, where the Corolla is given as an example of the specially equipped car for €300, MM can only be interpreted by Finns, as an abbreviation referring to the World Championship (Maailmanmestaruus). That multinational companies mix ‘global and local’ in their advertising campaigns can easily be seen in this magazine ‘paper’ ad. Although the readers of this ad are Finnish and its distribution is clearly within Finland, the ad has various ‘global’ features. Linguistically we get the ‘borrowed’ English words within the Finnish text, for example, Edition, SUPER DAY and the English words of the Toyota slogan, TODAY, TOMORROW, TOYOTA. Most Finnish buyers would of course understand these English words; their inclusion may perhaps be motivated by the ‘international’ atmosphere of the 2005 World Championships, but in fact they come from the European advertising strategy of Toyota. Interpersonally, the ad is designed as an ‘offer’ picture globally, but the language makes it local, with the exception of the
few foreign words. The global company origins of the ad can also be seen representationally in the details of the images portraying the different Toyota models. None of the images has a Finnish background: the Yaris driver is cruising on a road that has steep cliffs (very unlikely in Finland), the Avensis driver is crossing a bridge (that could be Finnish, but could be anywhere), the RAV4 driver parks his car in front of a house where a palm tree is growing on the lawn (definitely not likely in Finland – yet). The Corolla image could be set in Finland, as it is a representation of a male driver sitting in a car that is just starting off from what seems to be a relatively neutral car exhibition hall. It is very likely that all the images, as in fact most of the compositional layout of the ad, have ‘global’ company origins; the ad is merely conveniently localized linguistically by the Finnish translation and by setting the 2005 World Championships materials into the ad. Localization requires profound understanding of the people in the target culture; in the Corolla ad the advertiser has chosen to ‘localize’ with national symbols (flag) and colours; yet nothing has been done to localize the images, which most likely are provided globally by Toyota for advertising campaigns throughout the world. When looking at these images, most consumers find themselves in an environment other than their own familiar one. When ads are realized as ‘static’, ‘paper-bound’, ‘place-bound’ and ‘time-bound’ ads, multinational companies may or may not try to find a balance between the ‘mix of global and local’. But when the medium of ads changes to the Internet, the choices of multisemiotic realizations multiply in terms of text, image, layout and movement in animation and film clip choices and, of course, through hyperlinking.
Thus, what on the Internet initially seems equivalent to the ‘static’ paper ad very soon becomes more ‘dynamic’ and also ‘interactive’ as readers follow their own reading paths on the Internet. Let us now leave the paper ad, get active and go to the company ‘web pages’ of Toyota to see how they have developed and how they differ from the ‘paper’ ads.
11.4.2 Multisemiotics and ‘globalization–localization’ on a car web ad
At the same time as the 2005 magazine ‘paper ad’ of Toyota appeared, Toyota cars were also advertised on the Internet. The address for that web page was and still is www.toyota.fi, but today the Toyota advertisement is realized differently from the ad in 2005. Today (at the time of finalizing this chapter) that particular web page is part of the European Toyota web page, www.toyota-europe.com/, which displays the various designs of Toyota web pages for individual local countries in their respective languages. The following discussion of the Internet and the web page ads relates to all of the Toyota sub web pages, all of which may in turn have totally changed in multisemiotic content by the time the readers of this book are actually reading this chapter. This fact in itself indicates how difficult it is to conduct multisemiotic
research with continuously changing data. Another difficulty relates to reporting the research in the ‘paper medium’ of the current volume. Not only is the number of images allowed in the volume limited, but there is also no way of showing everything that happens on the screen on these varying Toyota web pages (e.g. movement). Here the discussion, unfortunately, has to be presented in writing – more by telling than by showing. A company ‘web page’, like the one that Toyota had in 2005, can contain a lot of written text and texts of various kinds (different genres) since it can be scrolled. But usually the ‘home pages’ (first pages) of car web ads are not designed to be scrolled but rather to be seen in one go on the screen. This limits the amount of written text in these web ads, or the text has to appear under separate hyperlinks. In 2005, when potential Toyota buyers went to the address www.toyota.fi, they were faced with what looked very similar to the printed page discussed in Section 11.4.1, but this time displaying the Toyota RAV4, a bigger model than the Corolla. Like many other Internet pages of that time, its layout was organized in two vertical columns. The wider of these, on the right, contained a big picture of the Toyota RAV4 on a highway, driven, again, by a man (the background is hard to identify). The car was at the same kind of diagonal angle as in the images in the paper ad, indicating movement forward and speed. Above the image, placed horizontally on the right, was the round Toyota logo and underneath it the words TODAY, TOMORROW, TOYOTA with the same colour coding (Toyota in red). In the margin beneath the image and the Finnish slogan for RAV4 – Make the most of every day of your life (translation) – were three separate columns under the headline in Finnish ‘News’. The news columns informed the readers about the new Toyota agreement, Toyota rental car options for customers and the New Land Cruiser.
Each of these news items had a hyperlink for reading more, which opened a new page on the screen. Therefore the static-looking page, which had even less text than the ‘paper’ ad, provided more information through hyperlinks. Right at the bottom, the logo and the slogan of Toyota were repeated. On the left in the margin, there were various menu options: for finding the right Toyota dealership close to you, for ordering brochures of cars, the newsletter, etc. Beneath that was a typical ‘Search’ function, a miniature image of RAV4, and further below a listing of all the models for sale in 2005 and an option menu for choosing any of them. Initially then, this example page of the RAV4 from 2005 seemed to include fewer images than the paper ad, and less text. But the possibilities for effective advertising are enormously multiplied by the fact that once the viewer-reader starts clicking, the amount of information in the ad is not as limited as in the traditional paper ad. A car ad realized as a web page on the Internet allows many more pictures, although viewing is limited to one screen (and one model) at a time. Hence, a company ‘web page ad’ already in 2005 enhanced tremendously the amount
and the effectiveness of advertising for a multinational car company such as Toyota. Yet, from the point of view of the metafunctional analysis, in terms of the images that appeared on the web page www.toyota.fi in 2005, the web page did not differ markedly from the paper ad of 2005. The greatest difference is that customers can create their own interactive ‘reading paths’ with hyperlinks, and thus the reading experience and information retrieval in the form of images and written text expand, perhaps even more than customers need or want. The web page viewer may not look at everything on the screen, and thus the initial page designers had created good, informative, attractive, short texts and link labels in Finnish, which lured the customer to click through to the details. Similarly, the other semiotic material should also be appealing, but here the images were very similar to those of the paper ad of the same year: no extras could be done with the images; for example, when clicked, they did not start moving; they merely represented. From the point of view of ‘globalization–localization’, it seems that at least the Finnish web page in 2005 was linguistically more Finnish than the Corolla MM 2005 Edition ad. The RAV4 ad had no English words other than the Toyota slogan and some of the car type labels, like Land Cruiser, and the link tax free. So in this respect, the ‘localization’ of the global product had been done; yet all the other realizations suggest global solutions, including the layout and particularly the picture of the RAV4, which, judging by the scenery and by the open space and the trees in the background, could hardly be Finland. It can be argued, then, that in 2005 almost the same options were offered to the viewer-reader of the ‘web page ad’ as to the viewer-reader of the traditional magazine ‘paper ad’. Reading a web page required somewhat more effort from the customer, but also gave more information.
The traditional paper ad could be read ‘in one go’ by the viewer; the ad was just a page within the magazine and could be read quickly or skipped. The web page ad could not be skipped in the same way, since it was the viewer-reader who wanted to get information about Toyota and who actively sought out the Toyota home page, where the whole page was about Toyota cars and also advertised Toyota as a company. The web page ad involved more work and more activity on the part of the viewer-reader; the initial page did not give much information on the individual cars, but just indicated the links where one could go. From the point of view of efficient car advertising by the global car company Toyota, it is important that the whole Internet page is the domain of the global player, while the local subsidiaries, like the Finnish www.toyota.fi, are the ones who localize the web page as local needs require, or as local resources allow, as will be discussed next. Today, in 2009, web advertising has advanced even further technologically and multisemiotically. This can be witnessed by a visit to www.toyota-europe.com/ and the various subpages that link to the individual Toyota
web advertising in various European countries and, for some reason, Israel as well. The basic ‘home’ layout for these current local Toyota subsidiary pages is the same ‘global’ page (see www.toyota-europe.com/) and in certain respects this common ‘home’ is even simpler than the 2005 web version of www.toyota.fi. But it now seems that the differences between the local subsidiary pages are increasing. When the Lithuanian Toyota web page, www.toyota.lt/, is visited, it looks very much like the Finnish web page of 2005. Predominant are, of course, the Toyota logo and the slogan TODAY, TOMORROW, TOYOTA. The centre is occupied from left to right by a huge space for images; the page displays an image of the Toyota Avensis, diagonally, similarly to the Finnish ads discussed, but this time the direction is from left to right. But the Lithuanian Toyota Avensis has no driver and the car is set against a neutral (white) background. There is no sense of speed and, more importantly, the image itself does not move either; it neither zooms nor rotates, nor moves from left to right, etc. Only the hyperlinks can be activated on this page; thus, nothing much has changed in the ads of this subsidiary of the global player. The Lithuanian web page is the simplest of the current ‘localized’ ‘global’ layouts. Others utilize the developments in Internet technology to various degrees. The Israeli Toyota page, www.toyota.co.il/, has, of course, the reading path of Hebrew – from right to left – and at the centre a picture frame where the images keep changing; that is, the large frame is used for a ‘slideshow’ to show the different models, reminiscent of PowerPoint presentations. The Finnish page, www.toyota.fi, also applies this technique and presents in the centre a PowerPoint screen, but now a real presentation context is built in (see Illustration 11.1).
Again, the big centre image shows a screen onto which an image of each car model, for example, RAV4, is projected. But now the viewer-reader can also see a PowerPoint changer and its green button for changing the slides beneath the image of the car on the screen. Then a hand appears and starts writing on the blackboard that is situated next to the screen. Thus, a human actor who changes the slides of the PowerPoint presentation and who writes the text onto the blackboard behind the PowerPoint screen is represented by the hand and its movements. The details written on the blackboard behind the screen are the details of each car, for example, the offer for the RAV4. The text unfolds in real time (the technique of unfolding is reminiscent of the way PowerPoint screen texts can be made to unfold). Today, various kinds of movement and rotation (cf. three-dimensional programs) seem to be incorporated into many of these subpages. The Austrian web page, www.toyota.at/, represents a Christmas scene, where the background, the text and the images are automated to change regularly. But even more exciting innovations and multisemiotic combinations are possible. The French web page, www.toyota.fr/, and the web page of Ireland,
www.toyota.ie/, incorporate the same music into their ads but different images. The Italian web page, www.toyota.it/toyota/welcome.aspx, incorporates a film clip relatively mechanically into one of the smaller frames on its page, whereas the Polish web page incorporates both sound and film into the major centre image. A very innovative solution has also been realized on the Spanish web page, www.toyota.es/, where the images of the cars are combined with an inserted film narrative as a frame within a bigger frame (thus creating ‘a genre within a genre’ realization). Most of the ads on local Toyota pages actually use the same images, presumably provided by the head office. The universal global material is just adapted to the local cultural circumstances of each country where the products are marketed. It seems obvious that global players aim to create a unified image of their products and brands, but they also seem to want localization in the web advertising campaigns on the local distributors’ web pages, and the locals seem to decide what appeals to their customers. The differences and similarities in the local web ads of Toyota are clear. A detailed analysis of the Toyota campaigns around the world would indicate the various strategies applied but would demand more work. Going to a particular Toyota web ad page is normally a conscious choice by the potential buyer. But what if the big multinational company Toyota ‘intrudes’ with its global and local marketing onto another page that viewer-readers happen to be currently reading?
11.4.3 Multisemiotics and ‘globalization–localization’ on a ‘pop-up’ banner ad
The last aspect of multisemiotics and the ‘globalization–localization’ issue in advertising discussed in this chapter is the relatively frequent ‘pop-up banner ads’ that appear on the web pages with which the viewer-reader is engaged.
The default display of web pages is always the topmost and leftmost portion; this is the ‘guaranteed viewing area’ (Kok Kum Chiew, 2004, p. 146). If the viewer-readers have a small screen, they see only the top left corner of the page, and need to scroll down to see the rest. Usually the right side of the page, vertically, is a typical place for ads, since that area is not the most valuable area of the page. But when companies want to compete intensely for the user’s attention on the screen, they resort to placing horizontal ‘pop-up banner ads’ in the top left area. There the ads keep ‘popping up’ at the top of the page, drawing the viewer-reader’s attention (of course, sometimes banner ads can also be vertical in shape and be placed in the ‘traditional’ place of ads, the side margins). The ever-appearing ‘pop-up ads’ disrupt the reading of the web page actually chosen. Thus it is somewhat daring of the companies to intrude on people’s reading paths; the pop-ups are potentially more likely to put customers
off than to persuade them to buy. When such an ad ‘pops up’, it is hard not to look at it, as it is often in one way or another made salient; the viewer then loses control of the original reading path, and can only get back to it by quickly closing the pop-up banner. But if one leaves it on, the pop-up ad continues to unfold and announce its advertising message exactly the way its designers wanted it to unfold, and once the ‘invader’ has finished it can automatically restart. The global company Toyota has resorted to this advertising method in its overall car advertising campaign. The first of the two Toyota pop-up ads discussed here appeared on the news website (www.foxnews.com) of an American cable and satellite news channel (also widely read internationally) at approximately the same time in 2005 as did the ‘paper ad’ and the first Toyota ‘web ad’ already discussed. The banner in focus here is a horizontal Toyota pop-up ad that intrudes on the news page that the viewer-reader is focusing on and that literally moves forward: its slogan written in white letters, Moving forward, moves from left to right on the red background of the first frame of the banner ad (although the movement cannot be represented here), and so does the brand name Toyota, and then the rest of the ad starts unfolding. We could consider this unfolding of the banner frames as an interplay of episodes (to use the closest of O’Toole’s (1994) terms of description). This pop-up ad shows how advertising continually succeeds in utilizing technological achievements for its selling and branding purposes both locally and globally. By locating its advertisement as a pop-up ad on a Fox News Internet page, the multinational company Toyota will reach the readers of this news page not only locally, in this case in the USA, but also globally, as Fox News has readers all over the globe.
This horizontal ad, when it first ‘pops up’, presents the new Toyota Avalon as an experience, as it unfolds. It parallels driving the Toyota Avalon with such experiences as a visit to the opera, attending a basketball finals game and exploring a museum. It begins with the first pop-up banner frame of a red background with white text and the click button Moving forward > actually moving from left to right. The logo and the name of Toyota are placed on the second row underneath the middle of the word forward and the click button ‘>’. Then the background of the ad turns grey. On the left, a ticket to the Metropolitan Opera from the 2000 season appears. Visible underneath the Opera ticket are also the corners of some other tickets. The text Metropolitan Philharmonic. Opening night that then appears on the right on the grey background reinforces the message first displayed by the image of the ticket and the words appearing on the ticket. The ticket then ‘drops out’ of the picture, revealing another ticket, this time for the basketball finals, 2002. The text on the right appears: Basketball finals. Game 7. Again, the ticket ‘drops out’ and reveals yet another one, a ticket for the Museum of Fine Art, 1998. The text, Museum of Fine Art. Special Preview, unfolds and the last ticket ‘drops out’ and is replaced by a photograph of a man driving (or at least sitting in) a car. The text relates to
the image: Avalon Limited’s Heated and Ventilated Front Seats. A new experience to add to your collection. Finally the picture of the driving man changes into a photo of an Avalon, shot from the back, with some trees and possibly another car in the background. The text on the left unfolds from left to right: The completely re-imagined 2005 Avalon. Click to experience more, and on the far right a red ‘button’ appears. This button has the same text as the top red banner, but in reverse: first the Toyota logo and the name of the company appear on the top and then, beneath them, the slogan Moving forward and the ‘click button’ ‘>’, which gives the viewer-reader an option to act. Representationally the most significant message in this pop-up ad is Moving forward. It is a material process, well suited for car ads whether construed linguistically or pictorially, as we have seen in the earlier Toyota examples discussed. It is also a pun: the car is obviously meant to move forward physically, but ‘moving forward’ can also mean staying up to date with modern technology. Both meanings are meant to be associated with Toyota. The material process moving is reinforced by the fact that the whole ad is moving (which cannot be represented here). The other material processes are more subtly insinuated in the ad, such as ‘going to’ the opera, the basketball game and the museum. These processes are used in defining the target audience of the ad. Middle-aged, cultivated men (like the one shown in the picture in one of the frames) use these services and might also be interested in buying the new Toyota model. The overall result is a conceptualization of a (male) customer who is actively experiencing new top events, including driving a new Toyota, and thus moving forward in his life. Interpersonally, linguistically the ad explicitly speaks of the customer in A new experience to add to your collection.
Avalon Limited’s Heated and Ventilated Front Seats, and speaks to the viewer-reader, encouraging him/her to act with an imperative in The completely re-imagined 2005 Avalon. Click to experience more. The images, however, do not directly engage the viewer. The tickets, as well as the car, are offered to the viewer rather than engaging the viewer. The man sitting in the car is shown from a horizontal angle, as an equal to the viewer. This might emphasize that the viewer could just as well be the one in the car. Compositionally, the pop-up ad begins by presenting Moving forward as ‘Given’ and Toyota as ‘New’ in the first frame. What is already a part of the potential car buyer’s experiences are operas, sports games and museums – all represented as ‘Given’ in the ad (realized on the left). When the image of the man driving a Toyota appears as part of the collection, the ‘newness’ of the experience is emphasized by the word new in the text: A new experience to add to your collection. The advertiser is working with the assumption that the viewer has some sort of a collection of valuable experiences and what he still needs as an addition to that collection is the experience of driving a new Toyota. This can be achieved by buying a Toyota – a car in the last banner frame – reinforced by the logo and the brand name, Toyota, and the way to
Multisemiotic Marketing and Advertising
act is to ‘move forward’ by clicking on the ‘>’ button. The designers of the ad play very cleverly with the Given/New structure of the text Moving forward, the click button ‘>’, the logo and the word TOYOTA. At the beginning their reading path is just as listed, with the logo and TOYOTA appearing on the second line. At the end of the banner ad, the top and bottom lines are reversed so that the reading path is: the logo, TOYOTA, Moving forward, the click button ‘>’. Thus, the encouragement to act, to click the ‘>’ button and read more, comes as a natural consequence of the unfolding of the pop-up ad. This pop-up ad and the other ads of the year 2005 seem to have been designed to appeal to men’s macho tastes for cool cars. Such models as the special edition model, the Corolla MM 2005 Edition, the RAV4, and the new Toyota Avalon in the pop-up ad, are represented as still being within one’s budget. Yet, even if a family man chooses, for example, an Avalon, a safe and solid family car, he can enjoy a unique driving experience, equivalent to the highlights of cultural and sports experiences. In the more recent images of a pop-up ad that was provided for this publication by Toyota, the idea of ‘moving forward’ is still kept: the images of cars are still displayed, moving forward (the realization of material processes) in the images, but now from right to left diagonally. A significant change is in the driver: now the Toyota Avalon is driven by a woman and the slogan for Avalon is Luxury born from technology (see Illustration 11.1). As we know, the word luxury is a word that frequently collocates with the word woman. With the pop-up ads we have discovered a new way of advertising that reaches beyond the local advertising in papers and on web pages discussed so far. What is significant in these ads is that since the www.foxnews.com page is visited not just by American but also by many international reader-viewers, the ads have global effects. 
We have thus moved from local, country-based company web-page advertising to global, ever-intruding advertising. This global advertising takes place in English, not in the national languages, as is the case with the paper ads and the national web pages. This reinforces the position of English as the lingua franca of global advertising and of international companies. We have seen that in the paper ads and on the national web ads (www.toyota-europe.com/) the Toyota slogan has been Today, Tomorrow, Toyota. With this slogan, its logo and other marketing material, Toyota has maintained a consistent overall image in European advertising. The advertising materials have formed a cluster of texts that reinforce one another and help to build the Toyota brand in Europe. The slogan on the pop-up ads is, however, American: Toyota – Moving forward. Because of its intrusive reach, the American slogan may soon take over globally. On the surface the first pop-up ad discussed may be seen to orient itself to local, American values, Metropolitan Philharmonic, Basketball Finals and The Museum of Modern Art being very valued ‘New York’ experiences of Americans. But global intrusive advertising makes them also very wanted
202 Anna Hopearuoho and Eija Ventola
and valued experiences globally, and during the ‘cheap dollar’ and before the economic sub-prime loan crisis of 2008, it was not uncommon for members of high (and even middle-class) society from Europe and Asia to take short ‘cultural excursions to New York’ to experience just what the pop-up ad promotes as valuable experiences (whether the same people have shifted their car model to Toyota is unknown).
11.5 Discussion and conclusion
Today, more and more companies ‘go global’ and need to develop a global identity for themselves and their products. The marketing and advertising of their products need to take place not only locally but also globally – especially when realized through the Internet. Thus, companies often face the dilemma of how to ‘mix and match’ the ‘local and global’ effectively when marketing products and services. This chapter has shown how local contexts (and special events in local contexts) may demand locally oriented linguistic and other semiotic realizations (i.e. local languages are used for advertising, certain cultural semiotic choices are highlighted in the ads, etc.). But it has also shown that local ads have global features in their designs and that the specific designs, for example the local web designs, use similar unified templates but realize them somewhat differently nationally, as in the case of Toyota in www.toyota-europe.com/. The tools of multisemiotics (see e.g. Bateman, 2008) help us to understand how ads are construed to make particular meanings and this in turn may help both the designers in creating ads and the viewer-readers in interpreting them. When the aim of the company is to maximize global marketing, the Internet also seems to be the most effective means, allowing various modes to be combined to create a unique multimodal semiosis when using this one medium. The Internet ad reaches the maximum number of potential customers from all over the world with no delivery chain costs, and when the ads appear as ‘pop-ups’ that are imposed on the current web page screens that the readers are reading, the readers are forced to follow the unfolding of the pop-up ad and its message (unless they have consciously blocked the pop-ups). 
Whether a full company or product web page or a pop-up ad, the Internet ads are multisemiotic in nature: written text, sound and images, both still and moving, are found on the Internet; the text and the images are made to move on the screen, change colour and shape, and animation and film clips within the ad further complicate the semiosis as they are often telling their own ‘story’, the meanings of which the viewer-readers have to work out and link up with the products and services advertised. The techniques of creating multisemiotic advertisements for local and global purposes are adopted quickly. Yet, the advertisers must be careful with the mix of ‘local and global’. If the local is represented as global, but is still within the reach of only a few (such as the visits to the Metropolitan
or the Museum of Modern Art in New York at normal times), it can also be off-putting to potential customers. Emphasizing the role of one sex over the other in the matter of choosing the right car for the family may similarly not always be the most successful advertising strategy in all contexts. Using too many foreign words may be disturbing to local customers not yet used to the ‘media-multilingualism’ brought about by the globalization processes into ads and English as the major lingua franca of such ads. Advertisements should always be analysed as part of the advertisers’ marketing strategy – local and global. There are no good or bad ads; there are only ads that match the marketing strategy and those that do not. Marketing experts (e.g. Jobber, 2007, p. 814) say that constructing a business strategy involves a ‘strategic triangle’, where three aspects have to be taken into account: the corporation itself, the customer and the competition. It is obvious that knowing this marketing triangle will not be enough for the communication needs of global businesses in the rapidly changing technological world. What are needed are interdisciplinary experts who can bring the marketing triangle together with the ‘metafunctional triangle’. That is, universities need to train experts who are semiotically sensitive enough to create representationally, interpersonally and compositionally culturally appropriate, appealing and effective advertisements to handle the demands of globalization – localization in the local and global markets. But those trained can also reverse the direction of their services and become advisers to consumers, helping them to interpret the complex ‘grammars’ of multisemiotic advertisements, which can then lead to informed decisions about what to buy and what not to buy.
References
Baldry, A. and P. J. Thibault (2006) Multimodal Transcription and Text Analysis (London: Equinox).
Bateman, J. (2008) Multimodality and Genre. A Foundation for the Systematic Analysis of Multimodal Documents (Basingstoke: Palgrave Macmillan).
Bateman, J., J. Delin and R. Henschel (2004) ‘Multimodality and empiricism: Preparing for a corpus-based approach to the study of multimodal meaning-making’, in E. Ventola, C. Charles and M. Kaltenbacher (eds) Perspectives on Multimodality (Amsterdam: John Benjamins), pp. 65–89.
Bell, A. (1991) The Language of News Media (Oxford and Cambridge: Blackwell).
Coleman, L. (1990) ‘The language of advertising.’ Journal of Pragmatics, 14(1): pp. 137–45.
Cook, G. (1992) The Discourse of Advertising (London: Routledge).
Crook, J. (2004) ‘On covert communication in advertising.’ Journal of Pragmatics, 36(4): pp. 715–38.
Crystal, D. (2001) Language and the Internet (Cambridge: Cambridge University Press).
Dyer, G. (1982) Advertising as Communication (London: Routledge).
Geis, M. (1982) The Language of Television Advertising (New York: Academic Press).
Goddard, A. (1998) The Language of Advertising (London: Routledge).
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (London: Edward Arnold).
Hofinger, A. and E. Ventola (2004) ‘Multimodality in operation: Language and picture in a museum’, in E. Ventola, C. Charles and M. Kaltenbacher (eds) Perspectives on Multimodality (Amsterdam: John Benjamins), pp. 193–209.
Jobber, D. (2007) Principles and Practice of Marketing, 5th edn (Berkshire: McGraw-Hill).
Jones, C. and E. Ventola (eds) (2008) From Language to Multimodality (London: Equinox).
Kok Kum Chiew, A. (2004) ‘Multisemiotic mediation in hypertext’, in K. L. O’Halloran (ed.) Multimodal Discourse Analysis (London, New York: Continuum), pp. 131–59.
Kress, G. and T. van Leeuwen (1990) Reading Images: Sociocultural Aspects of Language and Education (Geelong, Vic: Deakin University Press).
Kress, G. and T. van Leeuwen (1996) Reading Images. The Grammar of Visual Design (London: Routledge).
Kress, G. and T. van Leeuwen (2001) Multimodal Discourse: The Modes and Media of Contemporary Communication (London: Arnold).
Myers, G. (1994) Words in Ads (London: Edward Arnold).
Norris, S. (2004) Analyzing Multimodal Interaction (New York, London: Routledge).
O’Halloran, K. (ed.) (2004) Multimodal Discourse Analysis (London, New York: Continuum).
O’Toole, M. (1994) The Language of Displayed Art (London: Leicester University Press).
Unsworth, L. (2008) Multimodal Semiotics: Functional Analysis in Contexts of Education (London, New York: Continuum).
Ventola, E., C. Charles and M. Kaltenbacher (eds) (2004) Perspectives on Multimodality (Amsterdam: John Benjamins).
Vestergaard, T. and K. Schrøder (1985) The Language of Advertising (Oxford: Blackwell).
Electronic sources
Toyota Europe (2009) www.toyota-europe.com/, date of access 3 January 2009.
Wikipedia (2009) http://en.wikipedia.org/wiki/Toyota_Corolla, date of access 3 January 2009.
Sources for illustrations discussed in the chapter
Fox News (2005) http://www.foxnews.com (accessed 7 March 2005).
Helsingin Sanomat, monthly supplement Kuukausiliite, June 2005, pp. 2–3 (Illustration 11.1).
Toyota RAV4 (2009) http://www.toyota.fi/ (accessed 3 January 2009) (Illustration 11.1).
Toyota (2005) http://www.toyota.com/ (accessed 7 March 2005).
Toyota Avalon (2008) banner advertisement, by photographer Vincent Dente (Illustration 11.1).
Part IV Multisemiotics in Enacted Roles and Virtual Identities
12 Taking the Viewer into the Field: Interaction between Visual and Verbal Representation in a Television Earth Sciences Documentary Alison Love
12.1 Introduction
This chapter discusses the complementary verbal and visual strategies that are used in a television documentary series, Earth Story, screened by the BBC (1998/2006). The series sets out to answer questions about the formation of the Earth, the forces that have changed it over time and the influence these changes have had on the evolution of life on the planet. The chapter examines the ways in which the series uses verbal and visual modes to complement each other to ‘take viewers into the field’ of geology – literally, by showing the places geologists go and the features they examine and, more metaphorically, by engaging viewers in the semiotic process of interpreting these features as geologists do. The chapter shows how the two modes of representation, sometimes assisted by the musical mode, interact to lead viewers themselves to see as geologists, to think as geologists and to share the experience of behaving as a geologist. It also discusses how the role of the presenter as a mediator, the sequence of visual and verbal representation, the use of both verbal and visual cohesive ties and the choice of linguistic structures all contribute to engaging viewers in the experience of exploring geology. The series exploits the resources of both spoken language and the visual mode to produce a television documentary which combines a serious scientific argument about the interaction between the planet and life on it with an invitation to enjoy the experience of exploring and interacting with the Earth. The chapter takes as its starting point two issues raised by Myers in his introduction to a special issue of Discourse Studies on scientific popularization (Myers, 2003). First, Myers points out that the main focus of discourse analysis of scientific popularization has been on verbal texts and that other
codes have been relatively neglected. Yet, he continues, ‘some of the most dramatic and memorable encounters with science are primarily visual’, and for discourse analysts to neglect these other, particularly visual, semiotic modes which are utilized in representing scientific subjects ‘limits studies of popularization’ (Myers, 2003, p. 272). In response to this challenge, this chapter will analyse some of the most important aspects of the interaction between visual and verbal representation in the television documentary series, Earth Story. Second, Myers questions the dominant assumption of a clear dichotomy between specialist science genres such as the research article and ‘popular’ genres. He quotes Hilgartner’s comment that ‘popularization is a matter of degree’ and suggests a ‘continuum of popularization’ (Myers, 2003, p. 270). Acknowledging this view, the chapter will investigate the strategies this television documentary shares with those used in specialist geological discourse, particularly as described by Dressen (2003; Dressen-Hammouda, 2008). The analysis of these shared features is seen as a contribution to the discussion of the ‘continuum of popularization’ and the role that verbal-visual interaction may have in positioning the documentary on this continuum.
12.2 Earth Story: A popular science documentary
Earth Story was originally screened by the British Broadcasting Corporation in 1998 and brought out as a DVD in 2006. The series sets out, over eight episodes, to answer questions about the formation of the Earth, the forces that have changed it over time and the relationship between these geological forces and the existence of life on Earth. In some ways this series continues the tradition of major BBC popular science documentary series, epitomized by David Attenborough’s Life on Earth, which seek to present the content and excitement of the natural world to a non-specialist (though educated) audience. However, the scientific area that Earth Story covers, that of geology, poses more difficulties for the ‘mediator’ (see Moirand, 2003) between a specialized science and a general audience than does the area of ‘natural history’. Rocks have less immediate visual appeal than animals, birds and plants. Moreover, much of the focus of geology is not on the readily visible, but on features hidden beneath the Earth’s surface, and on processes which leave only subtle traces to be observed. Greater effort is therefore needed to engage viewers. The series is presented by Aubrey Manning, a professional biologist and also a seasoned television presenter. The instalments consist largely of sequences in which Manning unravels the thread of geological argument about the formation and development of the Earth and its relationship with the evolution of life in interaction with a range of geologists, sometimes in the laboratory or study, but most often ‘in the field’. However, the series aims to do more than introduce viewers to the field of geology: it aims to involve viewers in the experience of geological
discovery. To achieve this, it employs a ‘layered’, multivocal mode of presentation, by having as presenter an ‘intermediary’ figure, who enacts the curiosity, thought processes and emotions of both a professional scientist and an educated lay-person. Manning is an experienced biologist, interested in the links between his own field and that of geology and so possessing scientific authority. But he also lacks geological knowledge, and needs ‘teaching’, just as viewers do. The documentary series is structured so that Manning ‘mediates’ between geologists and viewers, by ‘going into the field’ with geologists, both literally and figuratively: he accompanies geologists to a large number of key sites and discusses with them what the viewers learn from them. From these discussions, he constructs for viewers the developing argument which underlies the series, that the structures of the Earth and life on Earth are linked in a complex, integrated system. At the same time, he interacts with the geologists by expressing his emotional responses to what he is shown and by joining in their enthusiasm for their discoveries.
12.3 Theoretical framework
The chapter draws on previous work which analyses other genres within the discourse of geology and on systemic functional linguistics (SFL) as a framework to analyse the verbal, and to some extent the visual, representation.
12.3.1 The discourse of geology
Geology is primarily an observational, rather than experimental, discipline, with its foundations firmly laid in field observation: ‘seeing’ in the field. However, geology is not, of course, purely descriptive, but seeks to interpret field data to establish the connection between geological features and the processes that formed them (see Love, 1991, 1993, for discussion of the discourse of geology in introductory textbooks in terms of ‘products’ and ‘processes’). Dressen (2003) discusses how geology research articles use implicit evidence of the geologists’ field experience to create credibility for their interpretive claims: thinking geologically must be rooted in geological observation. In a more recent paper (Dressen-Hammouda, 2008), she analyses how a novice geologist gradually develops his fluency in producing field reports, not only in terms of his mastery of the written genre (‘materialized genre’), but also in terms of his ability to show his mastery of geology’s ‘symbolic genres’ – the ‘discipline’s shared ways of seeing and interpreting field structures’ (Dressen-Hammouda, 2008, p. 239). Drawing on the concept of ‘embodied frames’, she suggests that becoming a geologist involves acquiring these ‘symbolic genres’: ‘ways of being, seeing, interpreting, behaving and thinking’ (Dressen-Hammouda, 2008, p. 239), which are passed on to novices, most strikingly when ‘in the field’.
It will be argued that the television documentary Earth Story, through its interaction of verbal and visual representation of geology, attempts not only to ‘tell the story’ of the Earth, but also to engage viewers in the ‘symbolic genres’ of geology, of seeing, thinking and behaving as a geologist.
12.3.2 Systemic functional linguistics
The chapter will draw on aspects of SFL, particularly on the experiential metafunction, which deals with ‘the clause as representation’. Halliday (1994, p. 106) suggests that ‘reality is made up of PROCESSES’ and that ‘[t]he transitivity system construes the world of experience into a manageable set of PROCESS TYPES’. The transitivity system differentiates the grammatical description of process types such as doing (Material), being (Relational), seeing, thinking (Mental), saying (Verbal) and the participant relationships involved in each. The usefulness of such an analytical framework for this study is that it facilitates a description of the verbal representation of the activities involved in geologists’ study of the Earth and of the processes involved in the history of the Earth. Simultaneously, it allows for a focus on the participant roles of the Earth and those who study it. These categories can be extended to describe the visual representation of the activity of geologists and of the action of geological forces, and, in particular, the visual focus placed on different aspects of the Earth and its features or on humans exploring the Earth. An alternative approach within SFL to the representation of events in the world is that of ergativity, which deals with whether a process has occurred ‘by itself’ or has been caused to happen (Thompson, 2004, p. 135). Ergative clauses have two participants and realize causation: for example, The heat melted the ice. Non-ergative clauses have only one participant and focus on a change of state of that participant: for example, The ice melted. This mode of analysis is particularly useful for describing the representation of the Earth undergoing processes of change, which is precisely the concern of the documentary. 
The use of these descriptive frameworks facilitates identification of the different ways that the Earth is represented in the documentary: as an object of actions initiated by geologists, as a self-contained system and as an entity that, metaphorically, initiates processes which affect geologists.
12.4 Taking the viewer into the field: The verbal and visual strategies of Earth Story
The strategies used in the documentary will be discussed in terms of how they realize three of Dressen-Hammouda’s ‘symbolic genres’ of geology: seeing, thinking and behaving as a geologist.
12.4.1 How to see as a geologist
Geology is a strongly observational discipline. Non-geologists need to be introduced to geologists’ ways of looking and seeing. In the documentary
Earth Story, viewers are gradually introduced to how to see as geologists, through a number of complementary strategies. First, the visual material introduces viewers to the different levels at which the Earth and geological features are observed by geologists. Since geology is concerned with the Earth as a system which can be observed from the macro- to the micro-scale, the visual elements focus on and shift between all these levels. The global is represented by the map of the world, while specific locations are introduced by focus on landscape, often starting with an aerial overview and then shifting to a more ‘eye-level’ representation. The camera then moves in to focus on rock outcrops, the main ‘visual evidence’ for geological processes. There may then be a close-up of ‘hand specimens’ of rocks. At some points, microscope images of the rocks are included. These representations demonstrate the systematic structure of the Earth, while also paralleling the different levels at which geologists approach this structure. Further, the series shows viewers areas of the Earth important to geologists, but rarely seen: detailed images of the ocean floor are a particularly spectacular sequence, while other sequences show rock strata exposed underground in deep mines. However, the most important strategy in leading viewers to ‘see as geologists’ involves the interaction of verbal and visual elements. Viewers are guided to see beyond the familiar landscape to observe the details relevant to geologists. To achieve this, ‘showing’ is not enough – viewers must also be ‘told’ what to see. An example of this is an early sequence devoted to the ancient rocks of Barberton in South Africa. The presenter visits the area with a geologist, Maarten de Wit, who has made geological maps of the area. Manning, the presenter, explains: When you do this, a striking pattern quickly emerges. 
He thus assists in ‘organising people’s perception’ (Goodwin, 2001, p. 169), by drawing attention to the existence of a pattern (through a non-ergative clause) before viewers look in detail at the landscape. There is then a shot of de Wit looking at the landscape, after which he remarks: Once you start mapping the hills here, you’ll notice that the landscape is dominated by stripes – stripes of rocks, like that one there ... De Wit points, and the camera then zooms in on the stripe, giving time for viewers to ‘see’. De Wit continues: If you get your eye in, after a while you’ll see in fact that all these rock layers are visible. In this case, this huge mass here has finer, vertical layers ... The verbal commentary again pauses, to give time for viewers to observe. Here there is explicit ‘training’ in ‘how to see as a geologist’. Throughout the series, the verbal commentary frequently emphasizes modes of seeing, by the use of Mental process verbs of perception. Manning – and through him the viewers – are urged to look: Look at this rock!; You can even see sand grains; You can see if you look downwards that there are several of these slabs; You can see these barnacles and seaweed. Thus, the viewers are drawn into habits of observing features which are of geological significance.
Sometimes, as in these examples, the verbal precedes the visual, guiding viewers into what and how they should see. At other times, the camera focuses briefly on important features, for viewers to observe them, before they are told what to see. For example, when Manning visits Siccar Point in Scotland, he alerts viewers to its significance: ... and the discovery [Hutton] made here changed for ever how geologists think about time. The camera then moves from an image of crashing waves to pan over and down the small patch of cliffside famous to geologists as Hutton’s Unconformity. Viewers have a short opportunity to look at the structure of the cliff, before Chris Nicholas, the geologist, describes what Hutton observed: What Hutton could see was that the grey rock that’s down at the bottom of the cliff stands vertically. But on top of it is this horizontal rock, and between the two is this sort of undulating surface. The viewers can ‘check’ their observation against the geologist’s description. The commentary also emphasizes ways of seeing through its use of Mental process verbs in Manning’s accounts of the discoveries of geologists: When he’d looked, George had seen signs of past earthquakes engraved into the land. The activity of looking and seeing is emphasized as central to solving geological puzzles. However, when geologists observe, they see more than simply rocks or the shape of landscape. They see clues to what has happened to the Earth. What they observe are geological products: these are the indicators of geological processes (Love, 1991, 1993). One of the most important aims of the series is to present the Earth as dynamic, subject to long-term processes that have shaped both the Earth itself and the development of life on it. So it is not sufficient to guide non-geologists to see relevant features: they must be guided to see how these features are clues to the interpretation of processes of formation. 
This requires more complex interaction of visual and verbal elements. The ‘meaning’ of the layering of rocks at Barberton is elucidated by explaining the processes involved in their formation. Manning remarks: ... geologists had begun to realize that the processes that create these layers were still at work around them. This is followed by the music of plucked strings, and an image of water starting to run over rock, a visual introduction to one of the forces that have shaped the landscape. The drops become a small runnel, then a full stream and then the focus shifts to a river – the Komati, in Barberton – running through a valley. Manning’s commentary then explains: As the Komati River flows through Barberton, it cuts down through the rocks, eroding them into sand and silt which it carries downstream. The silt falls to the bottom, layer upon layer, eventually turning into new rock. Viewers can see the river flowing, but are also led to understand the processes which it performs over time. The focus moves visually from the observed
phenomenon of the layered rocks to the action of the present-day river, while the verbal commentary employs lexical cohesion by the repetition of layer, linking the layers observed at Barberton to the layering of silt deposition by the river. Thus, what the geologist ‘sees’ in the layers becomes clear. This is followed by a sequence in which de Wit points out ripple marks in rocks and series of slabs: he shows viewers these features, but also tells them what they mean: a slab represents a river bed; sand grains and ripple marks represent whole histories of rivers. By his use of the identifying relational process represents, he guides non-geologists to ‘see’ beyond the surface features to identify the geological processes of which they are evidence. A later sequence uses cuts between images of rocks in Barberton and present-day volcanic activity to create visual cohesion between the rock structures and their processes of formation. Manning in an introductory voice-over announces: What the rocks reveal is that 3.5 billion years ago the Earth was a world of volcanoes. De Wit then points out little globules in the rocks, at which point the camera zooms in on the detail of the rock surface. He then explains the processes involved: Little globules form in volcanic clouds where large volcanoes erupt violently – like Mt St Helens, for example. The scene then cuts to an image of the eruption of Mt St Helens, reminding viewers that these processes are still taking place. The focus then returns to the rock in de Wit’s hand, and he invites viewers to look at these funny shapes that tells [sic] us that these volcanic rocks were erupted under water ... The visual images then cut rapidly between those of an underwater volcano, a sea coast and a cloud of water vapour – in other words, showing the processes that these funny shapes conjure up to a geologist. 
In such sequences, the rapid alternation of visual images of ‘product’ and ‘process’ enables viewers to make sense of the verbal account and embodies the interpretation of the geologist. Connected to such guidance in observation is the language used to represent the Earth and its features. They are represented as an independent system to be observed and interpreted. They are, as illustrated above, often presented as the Phenomena of Mental process verbs of perception. However, the Earth or a geological feature may also be represented as the medium of a non-ergative clause, which, as explained above, is frequently used to realize a change of state. Examples from the documentary include: Little globules form in volcanic clouds where volcanoes erupt violently; These rocks crystallized out ... ; The continents were just starting to form; The entire floor of the Pacific Ocean was sliding ... This non-ergative representation has two effects. First, it represents the Earth as a system undergoing change which is independent of humans and is available to observe. Second, it raises questions about the agents of change: What mechanisms are responsible for the eruption of volcanoes or the breaking apart of supercontinents? Showing how geologists have worked to find the answers to these questions is one of the important aims of the series.
12.4.2 How to think as a geologist
To establish such links between products and processes, however, geologists have to think in a particular way. The documentary series attempts to take viewers further into the experience of ‘being a geologist’ by representing ‘how geologists think’ – how they formulate questions and seek the answers. Here the intermediary role of the presenter is particularly important. Manning often makes structuring remarks in voice-over, which act as links between sequences. These links usually refer to the wider field of geology, commenting on the state of knowledge at a particular time or raising questions which require answers, for example: Could this be happening deep below the volcanoes and could this be the key to how continents grow? and Could the flask from Bolivia contain water from the Pacific Ocean? Such questions, posed by the presenter, invite viewers to participate in the hypotheses the geologists are making and testing. They assist viewers in structuring the knowledge they have been given. Manning also comments, frequently at episode boundaries, on the puzzles that remain after a particular piece of evidence has been observed. For example, at one episode boundary, he says in voice-over: So it looks as if the continents have all been built up by volcanoes like these, gradually building up the crust over geological time. But there was a problem with this idea. How could something happening around the Pacific explain rocks in Scotland? This was a complete mystery. [pause] But then the Ring of Fire provided another vital clue. Manning uses these verbal sequences to establish the development of geological argument. Here he comments on the significance for geological knowledge of what has been learned in the Andes, as the camera moves with a group of geologists as they leave their field area. He pauses and then announces the next sequence, at which point the visual image cuts to archive footage of the Great Alaskan Earthquake.
Further, the documentary also endeavours to engage viewers in the thinking processes of geologists as they struggle to make sense of what they observe. For example, in Barberton, de Wit is shown scrambling over a hillside. The camera then focuses on some oddly structured rocks. De Wit then emphasizes his discovery: I caught these rocks almost by accident. He expresses his bewilderment: I thought, ‘What is that?’ I hadn’t a clue. I’d never seen anything like it before. He then tells of his experience in visiting New Zealand and sitting by volcanic mud pools. The camera accordingly cuts to heaving mud pools. De Wit describes his discovery in terms of Mental processes: All of a sudden I remembered these structures. I thought, ‘Wow! That’s got to be it!’ The visual images then alternate rapidly between the New Zealand mud pools and the rock structures in Barberton, visually reflecting de Wit’s mental recognition of similar patterns. He then establishes the connection he
has made, using an elliptical relational process: Ancient mud-pool structures, frozen in the rock here, and then rephrases the discovery through a thematic equative which foregrounds the metaphorical process of ‘giving away’: And what gives it away, of course, is all these intersections. This sequence shares with viewers how geologists think, how they seek answers to questions and how they solve problems by relating structures to processes of formation. Another sequence demonstrating how geologists come to conclusions about earth processes is that concerned with the Great Alaskan Earthquake of 1964. Geologist George Plafker investigated ‘confusing stories’ of land which had apparently been raised by the earthquake and land that had apparently sunk. Manning tells us that Plafker decided to examine the shoreline in detail. Here he made his first remarkable discovery. The camera then focuses on a line of dead barnacles (small shellfish that live on rocks in tidal zones). Plafker says: You can see these barnacles. [pause] Let me show you something fascinating now. Here is a line of dead barnacles, and this line is about six feet above the living barnacles. There is then a pause in the verbal commentary, while the camera focuses on dead barnacles, accompanied by musical chords that suggest ‘significance’. As viewers struggle to interpret what they see, Plafker remarks: When I came here in 1964, some of these barnacles were still alive! Manning explains: George realized that during the few moments of the earthquake, the land here had been jacked up six feet. What was more, George had found a way to measure accurately land level changes over the whole enormous area affected. The lexical cohesion of the repetition of six feet assists the viewers to understand Plafker’s method of measurement. The significance of Plafker’s observations is explained as playing a key part in establishing the revolutionary theory of plate tectonics. 
Manning sums up the connection between observation and theory: What had begun with a line of dead barnacles ended with a new view of our planet. Underlining the connections between observation and deduction is the frequent use in the commentary of Mental processes of thinking, as in: We speculate back in time that this is the sort of place where life might have started; It dawned on scientists that ... ; all of a sudden I remembered. The importance of ‘thinking’ is further emphasized by metaphorical expression of mental processes, such as Wegener made a bold mental leap, where the material process ‘dramatizes’ the significance of the geologist’s idea. The verbal commentary complements the visual representation most particularly in establishing the connections between observation of features and deduction of processes.
12.4.3 How to behave as a geologist
The previous two sub-sections have illustrated how the documentary series employs the interaction of verbal and visual to introduce viewers to the disciplinary content of geology and the logical relations that geologists establish between product and process. However, it is clear that the series also invites viewers to share in the day-to-day experience of ‘behaving as a
geologist’. The material processes in which geologists are actors are represented visually: they walk, fly in small planes, row in inflatable boats, dive in submersibles, drive in rough country. In addition, the visual material focuses on the activities in which geologists engage in their detailed investigation of geological features and specimens. They are shown chipping at rocks with their hammers, taking samples, blowing or spitting on a sample to clean it. They use a variety of equipment in laboratories to measure and analyse samples, they draw maps, make calculations. In other words, the series portrays the variety of methods used by geologists to gain a greater understanding of the Earth and the processes affecting it. However, the images of geologists ‘in action’ can also be seen as emphasizing them as ‘in interaction’ with the Earth. Their physical engagement with the Earth is emphasized: there is much focus on walking and climbing, with many camera shots actually focusing on feet in contact with the earth. Their contact with the Earth through touch is also stressed: geologists feel the textures of rocks; Plafker is shown touching the line of barnacles. In the verbal commentary, this interaction is strikingly represented by linguistic choices. The Earth and geological features are often represented as Sayers of Verbal processes, sometimes with geologists as Receivers of their message. For example: At last the rocks really begin to speak; That’s what they’ve been trying to tell me. Thus, the rocks are metaphorically represented as attempting to communicate with geologists. In other places, geological features are represented as Actors of material processes that are concerned with ‘revelation’ and as providers of benefits to humanity: What the rocks of Barberton reveal ... ; The sea-floor gave up its secrets – on dry land!; the planet yielding up its riches ... ; The volcanoes have done all the work [i.e. through laying down economically valuable minerals]. 
Again the Earth is represented as in an interactive relationship with humanity, both in assisting geologists to understand its ‘secrets’ and in providing ‘riches’. This kind of metaphorical representation, in contrast to the representation of the Earth as a system, draws the viewer into a more emotional engagement with the practice of geology, emphasizing the excitement of discovery and the beneficial relationship between the Earth and humans. A striking multimodal representation of this interaction between the Earth and humans can be seen at different points in the episode which explains the association between volcanic activity and reserves of precious metals in the Andes. The episode opens with a reconstruction of a Spanish conquistador exploring for precious metals (with focus on his horse’s hooves on the earth). He then ‘discovers’ the Cerro Rico mountain of silver at Potosi in Bolivia and falls to his knees in thankful prayer, accompanied by religious choral music. After images of miners extracting the silver, and images of Spanish treasure made from it, there is a sequence of Aymara townspeople at Potosi celebrating their thanks to Pacha Mama, the Earth Goddess, for providing the riches. This also marks the start of a field expedition by a
team of geologists in Bolivia who are investigating the relationship between the volcanic processes at work in the region and the creation of the continents. Towards the end of the episode, after a diagram has been used to demonstrate this relationship, the geologists are shown returning from the field, with the volcanoes in the background, commenting on their feelings about experiencing the growth of continental crust. At this point the religious music reappears. This musical and visual cohesion stresses that for all three groups the volcanoes provide a kind of Holy Grail, a long-sought end of a quest. The representation of geologists in the documentary emphasizes the emotional aspect of their experience. Geologists featured in the series express their emotions openly and vigorously, particularly amazement: It’s amazing!; Wow!; Quite a romantic thought! It blows my mind actually! Significantly, emotions are related explicitly to their engagement with geology: That’s why I’m a geologist!; As you drive along, you’re witnessing the creation of new continental crust – quite an exciting thought; What’s really exciting about these rocks is what’s in them. The unsolved mysteries of the Earth are causes of concern to them: something that bothered me for a long time. The nineteenth-century debate about the age of the Earth is represented in terms of geologists’ emotions: But to field geologists like Hall, his number felt too small. When Rutherford demonstrated a much greater age for the Earth, geologists gave a sigh of relief. It is also clear that personal engagement with a geological field area can be expressed almost as possessiveness. For example, the Barberton area is referred to as: field area of Maarten de Wit. George Plafker enthuses about the 1964 Alaskan earthquake: ... and it had happened in my field area! The presenter, further into the same episode, refers to his [George Plafker’s] enormous fault. 
Thus, viewers are drawn into experiencing the emotional appeal of geology, especially in solving mysteries and being in touch with the creative processes of the Earth. Visual resources are used to represent the emotions of geologists. They are often shown looking with appreciation at breathtaking views, for example from Table Mountain in South Africa. It is, however, notable that these moments are always linked back to the geological features involved. Geologists are shown in other ways enjoying interaction with the Earth. For example, on the descent from a challenging expedition high up a volcano in the Andes, the geologists have fun bathing in hot springs, which are, of course, also a volcanic feature. One outstanding sequence exploits multimodality to represent a merging of scientific enquiry and emotional awe. A geologist in his office shows the presenter an ancient meteorite. He states the rock’s age, adding that it is the oldest object that can be held in the human hand. He then puts it into the presenter’s own hand. There follows a significant pause in the verbal commentary, overlaid with solemn chords, as the presenter gazes at the meteorite
with an expression of awe on his face: the camera focuses on the presenter, rather than the meteorite. Abruptly, the scene shifts to the laboratory, where part of the meteorite is dissolved to establish its age and the results are displayed as a computer print-out. This interplay of visual, verbal and musical strategies both gives structure to the sequence and emphasizes the merging of scientific endeavour with emotional engagement. Thus, viewers are drawn into their own emotional engagement with geology – its excitement, surprises, puzzles and awe.
12.5 Conclusion
This chapter has set out to examine how the popular science television documentary series Earth Story exploits the interaction between verbal and visual modes of the medium to both ‘tell’ its audience its argument about the history of the Earth and ‘show’ the evidence on which this argument is based. In the documentary, the two modes are fully integrated, with the visual mode presenting features of the Earth and of the activities of geologists, while the verbal mode guides viewers both into ‘how to see’ and into how to interpret the significance of what they see through understanding of ‘how geologists think’. Both modes are essential to achieving the purpose of the series. As Lemke (1998, pp. 92–5) has argued, the resources of each mode construct ‘joint meanings’, ‘multiplying the set of possible meanings that can be made’.
The chapter also set out to examine the relationship between the television documentary and specialist geological discourse. The documentary clearly has features which attract a non-specialist audience, for example, stunning pictures of beautiful scenery and spectacular ocean bed sequences, humour and expression of strong emotion on the part of the geologists featured – the enjoyment of being ‘in the field’. At the same time the series works hard to engage viewers in the ‘field’ of geology as a discipline, in experiencing ‘how to be a geologist’ in terms of organization of perception and of modes of thinking and interpretation. It attempts to initiate viewers into the ‘symbolic genres’ of geology (Dressen-Hammouda, 2008). It is important to realize that such ‘hybridity’ of purpose (see Myers, 2003, p. 271) is not unusual in different genres of scientific discourse. Dressen (2003), as mentioned earlier, has shown how the credibility of arguments in geology research articles may depend on ‘implicit persuasive strategies’ that provide subtle evidence for ‘field culture’.
She suggests that ‘geological values’ include ‘acting like a field geologist’ by showing evidence of having been to remote locations and knowing ‘how to see’ like a geologist (Dressen, 2003, p. 277). She mentions that the discourse of geology research articles may show traces of ‘on-site enthusiasms’, with lexis such as superb or spectacular. She suggests that ‘authors appear to lay territorial claim to the structures studied in their region’ (Dressen, 2003, p. 283). All these values, which Dressen suggests are
expressed by indirect strategies in research articles, become more transparent and explicit in the television documentary. This suggests that the documentary succeeds in engaging its popular audience in the full experience of ‘being a geologist’, leading viewers to appreciate how geologists see, how they think, how they behave and how they feel about their relationship with the Earth. This chapter is a very tentative excursion into the analysis of the multimodal discourse of a television popular science documentary. Much more rigorous attention needs to be paid to the connections between theories of visual and verbal representation. Further work is also needed on hybridity in popular science documentaries between the commercial need to attract and entertain viewers and the attempt to engage their understanding of the ways in which knowledge is created in a scientific discipline. An analysis of Earth Story suggests that interaction between visual and verbal modes enables viewers to enact the experience of geologists in their ways of seeing, ways of thinking and ways of behaving, including their enjoyment of their interaction with the Earth.
References
British Broadcasting Corporation (1998/2006) Earth Story (London: BBC).
Dressen, D. (2003) ‘Geologists’ implicit persuasive strategies and the construction of evaluative evidence.’ Journal of English for Academic Purposes, 2(4): pp. 273–90.
Dressen-Hammouda, D. (2008) ‘From novice to disciplinary expert: Disciplinary identity and genre mastery.’ English for Specific Purposes, 27(2): pp. 233–52.
Goodwin, C. (2001) ‘Practices of seeing visual analysis: An ethnomethodological approach’, in T. van Leeuwen and C. Jewitt (eds) The Handbook of Visual Analysis (London: Sage), pp. 157–82.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar, 2nd edn (London: Arnold).
Lemke, J. (1998) ‘Multiplying meaning: Visual and verbal semiotics in scientific text’, in J. R. Martin and R. Veel (eds) Reading Science: Critical and Functional Perspectives on Discourses of Science (London: Routledge), pp. 87–113.
Love, A. M. (1991) ‘Process and product in geology: An investigation of some discourse features of two introductory textbooks.’ English for Specific Purposes, 10(2): pp. 89–109.
Love, A. M. (1993) ‘Lexico-grammatical features of geology textbooks: Process and product revisited.’ English for Specific Purposes, 12(3): pp. 197–218.
Moirand, S. (2003) ‘Communicative and cognitive dimensions of discourse on science in the French mass media.’ Discourse Studies, 5(2): pp. 175–206.
Myers, G. (2003) ‘Discourse studies of scientific popularization: Questioning the boundaries.’ Discourse Studies, 5(2): pp. 265–79.
Thompson, G. (2004) Introducing Functional Grammar, 2nd edn (London: Arnold).
13 Developing the Metafunctional Framework for Analysing Multimodal Hypertextual Identity Construction
Arianna Maiorani
13.1 Introduction
Thousands of players all over the world are attracted by the virtual worlds and adventures offered by Massive Multiplayer Online Role-Play Games (MMORPG). These games, accessible through the Internet, offer the possibility of living virtual parallel lives and adventures in a great choice of hyper-environments: from fantasy worlds to war sites, to deep space or even to a reproduction of ‘real everyday life’. This relatively new form of entertainment uses the Internet dimension as an expansion1 of the world outside, where players can act, behave and communicate by using different virtual identities. These ‘roles’ support a huge commercial market that thrives worldwide. Players come from a variety of age groups, social and educational backgrounds, and they pay monthly or weekly subscription fees to continue visiting and acting in the virtual worlds of the games. MMORPG generate their own communities, blogs and forums, which keep on feeding their markets.2 They have become a social and cultural phenomenon, and their commercial success underlines their importance in our society and culture. MMORPG are by nature social,3 and as such they tend to replicate, through communication, the basic social aspects of the societies and cultures they are played in: social classes and hierarchies, related typologies of actions and occupations, possibilities and limitations, goals and means, ethical and moral values. They therefore imply an involvement of players who have to enter their social hyper-contexts and communicate in the roles of specific, defined social identities.
Communication through words and action is therefore the basic activity that takes place in a MMORPG, or more accurately, it is the semiotic dimension that generates the whole game activity and community. This kind of communication is realized through different semiotic modalities: language, images, sounds. Therefore, multimodal discourse analysis4 is a valuable method to investigate the social and cultural context in and by which these games are produced. The purpose of this chapter is to propose and test a developed model of multimodal systemic functional analysis primarily based on the Hallidayan (2004) framework of analysis and on Kress and van Leeuwen’s (2006) grammar of visual design, which may be applied to the analysis of the multimodal discourse realized by MMORPG. Since these game experiences change during the game itself and are different and specific for each online player, this study will focus on a fundamental phase of all games: the creation of the alias, the multimodal identity which represents the player within the hypercontext of the game. The chapter will start from the analysis of how characters are described by Tolkien in the verbal text; this presentation will then be compared with the process of creation of an alias for The Lord of the Rings Online as a multimodal text ‘written’ by the game player. Analysis will start from a passage from The Lord of the Rings, first published between 1954 and 1955, and will then explore the possibilities of the multimodal approach to the hyper-environments of the MMORPG, whose first release was in 2007. In this way, the applicability of the functional framework5 as a transmedial comparative tool for analysis will be tested and possibly new categories for multimodal discourse analysis will be proposed for the analysis of MMORPG.
13.2 Text transmediality and metafunctional analysis
As a fundamental text of Celtic-inspired fantasy literature, The Lord of the Rings has ‘migrated’ from one medium to another. This was due not only to the prolific writing of Tolkien but also to his capacity for drawing maps of the lands where the story events take place, designing a territoriality which is set deeply in the Western literary and cultural tradition. The verbal text analysis will focus on a written passage from The Fellowship of the Ring, the first book of the trilogy, where some of the main characters are described for the first time. Focus will be on how Tolkien introduces ‘visually’ three main characters on the scene of his narrative: Elrond, King of Elves, the Elf Glorfindel and Gandalf. The passage analysed is from the first chapter of Book 2, entitled ‘Many Meetings’. Gandalf, who has already appeared in the book, is here re-introduced as part of a group of representatives of good powers. Tables showing the complete systemic functional analysis of the whole text can be found in the Appendix.
Tolkien describes each character as precisely set in an environment and in a fixed posture, as if in a picture. Interestingly, many of the traits he describes are ‘attributed’ or added to the specific character by some unidentified external power. Accordingly, 16 of the 33 sentences analysed realize relational processes of the attributive kind. As far as the transitivity structure is concerned, the passage seems to be divided into two distinct parts. The first part, devoted to the description of the group of characters set in a specific environment, is characterized by the use of behavioural processes, along with existential, mental, verbal and relational ones. In the second part, where single characters are described in their specificity, the use of attributive relational processes becomes dominant. In this second part, which starts from Clause 15 (Glorfindel was tall and straight), the use of circumstances of location/space in marked thematic position is also frequent. This highlights Tolkien’s way of introducing characters precisely positioned on the scene and described in their particular attitudes, thoughts and behaviours. Inner, intrinsic qualities are not directly described but suggested by connecting them to visual features that the description tries to evoke. The character of Elrond, for example, is introduced in his posture by a behavioural, near material process (Elrond sat): he occupies a specific place in the ‘picture’, which is pointed at by the circumstance of location/space. The following two clauses feature the same kind of circumstance in marked position (and next to him), at the beginning of the clause: thus, the other two characters are also positioned. Then, the fourth character, Frodo, is mentioned as Senser of a mental process of perception: through his eyes, the writer can describe the visual aspect of the other characters and make his gaze move like a camera.
He looks in wonder: this circumstance of manner also connotes the kind of look the reader is invited to share. Each character is described mainly as having attributes given by a superior unnamed power, which uses them as the ‘site’ of eminent qualities. Almost all sentences in the second part of the passage start either with a circumstance of location/space in marked thematic position (e.g. on his brow, and in his hand), or with a deictic possessive followed by the name of a part of the body. In the first case, the characters’ physical features are highlighted as ‘places’ where special qualities and powers reside: these are ‘positioned’ mainly through the particular use of material processes (Clauses 24 and 26; an example is also in Clause 14, in the first part of the passage), behavioural processes (Clause 20), and relational attributive processes. Some peculiarities deserve to be noted: wisdom is personified as a non-typical Behaver in Clause 20, while Elrond’s memory as well as his Elf crown, signs of his majesty and noble nature, are both construed as goals of an agentless material process. The clauses where the structure ‘deictic possessive + physical feature + was/were + attribute’ is repeated, realize a grammatical parallelism which makes the reader focus on each character’s detailed description and aspect,
enhancing the symbolic value of each single feature. This repetition also realizes a parallel thematic progression that characterizes the whole second part of the passage and distinguishes the different sub-sections devoted to each character. Single characters’ descriptions can therefore be compared to symbolic portraits where, through physical description, moral qualities and special powers are also described. In The Lord of the Rings Online, the characters are also created one by one: or rather, each one is created after a prototype which is offered on the screen according to the race and gender chosen by the player. This prototype, whose aspect changes during the creation process, is set against a suggestive background, and it takes its basic features from Tolkien’s lore. The scene behind it reminds me of Tolkien’s descriptions of the lands from which each race comes. The moment of creating the alias is therefore a moment of transition from the world created by Tolkien to the world that will be generated in the MMORPG.
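As an illustration of how the transitivity counts reported earlier in this section could be tallied, the following minimal Python sketch counts process types over a list of annotated clauses. It is a sketch under stated assumptions, not the procedure used for the Appendix tables: the sample contains only the clause numbers and process types explicitly mentioned in the discussion above, the labels are descriptions rather than Tolkien's wording (except for Clause 15, which is quoted above), and the names `Clause`, `tally_processes` and `share` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Clause:
    number: int    # clause number as cited in the discussion
    label: str     # quoted wording where available, otherwise a description
    process: str   # process type assigned in the analysis


# Illustrative sample only: clauses explicitly cited in this section.
# The complete annotation is in the Appendix and is not reproduced here.
SAMPLE = [
    Clause(14, "(material process cited in the text)", "material"),
    Clause(15, "Glorfindel was tall and straight", "relational attributive"),
    Clause(20, "(wisdom as non-typical Behaver)", "behavioural"),
    Clause(24, "(material process cited in the text)", "material"),
    Clause(26, "(material process cited in the text)", "material"),
]


def tally_processes(clauses):
    """Count how many clauses realize each process type."""
    return Counter(c.process for c in clauses)


def share(count, total):
    """Return count/total as an exact fraction and a rounded percentage."""
    frac = Fraction(count, total)
    return frac, round(float(frac) * 100, 1)


if __name__ == "__main__":
    print(tally_processes(SAMPLE))
    # Figures reported in the text: 16 of the 33 sentences analysed
    # realize attributive relational processes.
    print(share(16, 33))
```

Applied to the figures given in the text, `share(16, 33)` expresses the proportion of attributive relational processes as just under half of the sentences analysed; run over a full clause-by-clause annotation, the same tally would give the distribution of all process types.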
13.3 The construction of an alias in the multimodal text: Multimodal analysis of the process of virtual identity creation in The Lord of the Rings Online
The Internet is by nature a multimodal communicative dimension where communication takes place thanks to the same semiotic modes that are used in the world outside it.6 It can be seen as an expansion of the outside world which is not subjected to the physical laws of time and space. However, MMORPG produce hyper-environments where players can imagine experiencing the physical dimension of the real world. Rather paradoxically, these games are based on the production of a ‘virtual physical presence’ of players in the hyper-environment and on the creation of virtual worlds. This trend reflects an attempt at adding a physical quality to the intrinsic non-physical nature of the Internet, which validates even more the idea that the Internet dimension is an expansion of the ‘real world’ rather than its virtual counterpart. The development of this new type of virtual entertainment has challenged all models of semiotic analysis by generating a new position of the player which is here defined as the creatively interactive represented Participant. In the hyper-contexts offered by each game, this Participant realizes both visual and verbal/sound texts by interacting with other players and the hypercontexts through multimodal resources, and creating by the same resources new interactive options both for him/herself and the other players. Individual multimodal hyper-discourse is created as a consequence of playing the game: only then does one enter the discourse generated by the virtual community of players/Participants and interact with it through multimodal resources. The identity of a player/Participant in the game hyperdiscourse is a social construction created as a response to the hyper-social
context of the game and the social context of the player: the creation of this identity is specifically functional to the online game. For these reasons, the creation of a game alias can be considered the first step of the multimodal hyper-discourse the player/Participant will realize while playing the game and interacting with the other players/Participants. Two important factors must be taken into consideration in the study of the alias creation process: one is the fact that the represented Participant in the game (the alias) and the interactive Participant (the player) inherently overlap. The other factor is that, unlike what happens in the creation of a verbal text where textual meanings enable other meanings to combine, in the process of creation of a MMORPG alias, it is the choices made at the level of potential multimodal experiential and interpersonal meanings that enable the multimodal textual meanings to be realized in the form of the alias itself. Thus, in this kind of process, the ‘enabling’ quality is transferred upon the interpersonal and experiential multimodal meanings, which will be realized depending on which features and tools will be attributed to the game character (race, gender, class, inherent powers and instruments). This happens because MMORPG need a pretext to settle their hyper-social and cultural context in the non-physical Internet dimension. It is the non-physicality of the Internet that, while allowing for virtually infinite possibilities of communication, makes it necessary for MMORPG to rely on the physical qualities of the world outside as a reference. The MMORPG that has been chosen as a case study has a pretext, J. R. R. Tolkien’s The Lord of the Rings, which has inspired many other fantasy games and has shown great transmedial potential.
The Lord of the Rings Online (hereafter LOTRO) can, however, be considered as an expansion of the movie trilogy7 inspired by the books, since many game-generated characters (or NPCs, non-player characters) as well as landscapes, environments and aesthetic features have been inspired by actors, creatures and sets of the movies. When LOTRO was put on the market, its pretext was already very well known worldwide both by book readers and by movie fans. The Lord of the Rings Online is introduced by a short movie featuring all the main book characters in some of the game sets and explaining what the context of situation is (a battle against the evil forces) and what the would-be player is expected to achieve as a fighter of evil forces. A basic knowledge of Tolkien’s world is implied as a prerequisite which shows the basic connection with the written texts and their traditional representation of Middle-earth and its inhabitants. When a player starts the process of creation of his/her virtual identity, in order to become an effective Participant of a MMORPG, he or she starts creating a multimodal meaning potential that will take the form of a character moving, acting and communicating in the hyper-context of the game. The creation of a character in LOTRO is achieved through three consecutive attributive phases which respectively imply the choice of a race and gender,
the choice of a class and the choice of a name, a geographical background and more specific physical features. As in all other MMORPG, the introduction of a character in the LOTRO multimodal hyper-context is done through the ‘physical’ creation of the character as multimodal Participant. This Participant has to be visually and audibly positioned in the hyper-context. Only after its creation will the character be placed in the hyper-context: the player is offered a choice of possible ‘sets’ or virtual worlds among which s/he has to select the one where s/he wants to start his/her ‘adventure’ and construct his/her hypertextual path. This alias is a social multimodal construction through which an ontological relationship of identification between a Participant in a virtual world and a player (and buyer of the product game) in the real one is built. Sections 13.3.1 to 13.3.3 will discuss the process of its creation as a fundamental transmedial semiotic event.

13.3.1 The physical creation of a virtual identity: Race and background

The first of the hyper-pages devoted to the creation of a virtual identity in The Lord of the Rings Online offers a selection of races and corresponding cultural backgrounds among which the player has to make his/her choice.8 When clicking on the race and gender icons on the left side of the page (Illustration 13.1), the right-hand side icons are activated. For every choice, a text appears; it describes the historical and cultural background of the race and gender selected and the powers/abilities attributed to the character accordingly. A button which links to specific descriptive trailers about the different races is positioned beneath these. This organization reflects the conventional Western textual organization, where reading proceeds from left to right and top-down, and where new information is typically visually located on the right side. At the top of the page, in a central position, there is the title and logo of the MMORPG. 
In the lower right part, a small section which opens windows offering technical options is located. The character/Participant under construction is in the centre of the page and changes in real time according to the choices made by the player: it is a visual carrier who is given attributes by the player. Beneath the character there is a platform with buttons to make him/her move, zoom in and out, and rotate; this camera-like modality construes the second level of the player’s presence in the MMORPG: s/he is both actor and director of a multimodal representation of the narrative s/he construes and directs at the same time. Figure 13.1 shows a scheme of this first phase page structure, which will be kept almost the same in the following phases. In this first phase, the player has to select combinations in a system, as when using a language. The system offered on the left side of the page is paradigmatic: all possible choices are present and available at the same time and represent the meaning potential of the game. The system of
Illustration 13.1 First phase of LOTRO character creation – choice of race and gender. Retrieved from www.youtube.com, April 2008
choices offered on the right side is syntagmatic and depends on the choices made on the left side. Choices at this stage are mainly related to the race and gender the alias will belong to in the hyper-context of situation of the specific MMORPG. This will determine the attribution of a specific background and a specific range of available tools, skills and powers to develop, which will be offered for selection in the following phases of the alias construction. Selection of a race and a gender (the textual construction of the character as a multimodal text) will therefore determine the way the player/Participant will interact with both other players and the virtual environment (potential interactive interpersonal meanings), as well as the actions s/he will be able to perform through powers and tools (potential interactive experiential meanings). At this point, a systemic description of the choices requires an important distinction between the background context of situation of the MMORPG, which in this case is based on the sets, plot and characters of Tolkien’s books
[Figure 13.1 Construction page schematic structure. Top centre: title of the MMORPG and logo. Left: ‘Select a race’, with icons for MAN male, MAN female, DWARF, HOBBIT male, HOBBIT female, ELF male and ELF female. Centre: character under construction, with the character’s movement buttons beneath. Right: history and features of the RACE/GENDER selection, with icons for the available classes and tools.]
and the hyper-context of potential situation which the player will construe through his/her choices. The former determines the number and kind of choices the player is offered. The books that inspired both the movies and the online role-play game create a general context of situation whose basic knowledge is a requirement for players. Even those who have not read the books or seen the movies will have to learn about Tolkien’s world: that is why explanatory trailer movies are offered during the alias creation process. The choices offered when creating an alias and entering the game are determined by the existence and knowledge of this background context. When the player enters the game and creates his/her own hyper-texts in the form of characters, this background context of situation becomes the game’s context of culture. The hyper-context of potential situation, in turn, is structured during the character’s process of construction and will evolve during the game sessions. Its predictability depends on the character’s functionalities, which are determined by the choices made during its process of creation. Both contexts, therefore, are related to the alias as a multimodal text: the background context of situation activates the multimodal meanings which will be realized in the hyper-context of potential situation through the alias itself. This systemic functional relationship between these two contexts and the game alias, which is schematically represented in Figure 13.2, can be defined as a hyper-contextual functional process of causation. The background context of situation activates transmedial meanings which will be realized
[Figure 13.2 Systemic functional representation of the hyper-contextual functional process of causation. The scheme links four components. (1) Background context of situation (transmedial initiating context): I Field, human experience represented in The Lord of the Rings books; II Tenor, relationships built in The Lord of the Rings books; III Mode, texture of The Lord of the Rings books. (2) Meanings (transmedial meanings): experiential, interpersonal/interactive and textual meanings. (3) Multimodal transmedial realizations in the alias construction (transmediated multimodal meanings): selection and attribution of action-enabling features (action multimodal meaning potential); selection and attribution of interaction-enabling features (interaction multimodal meaning potential); the alias as a result of selections at experiential and interpersonal/interactive level. (4) Hyper-context of potential situation (potential hyper-context): potential hyper-path of the alias in the game.]
by the game character as a creative multimodal text; the game character will enter and interact in the hyper-context of potential situation where the player will design his/her hyper-path during the game session. Thus, the systemic functional relationship between the two contexts is systematized and the alias creation process is highlighted as the transmedial phase of the MMORPG hyper-textual creation. This scheme can be applied in the study of MMORPG to verify and analyse the existence and characteristics of the social and cultural background in and by which a game is created and to trace and systematize their relationships.

13.3.2 The physical creation of a virtual identity: More specific functionalities

The second hyper-page devoted to the creation of the virtual identity is structured in the same way as the first one, but with a few changes in the selection offered. On the left side, a choice of social classes related to the race and gender previously selected is offered. As in the first page, by clicking on each class icon, the player activates an explanatory text in the right-hand side section. This text gives information on the social location of each class, background, attributes and tools, powers and social role associated with the selection. At the bottom of this right-hand side section, a button creates a link to short trailers about each class function and powers. The character under construction is always in the centre of the page and can be moved by the commands set under it. On top, over the character and always in central position, there is the title and logo of the MMORPG. Under the left section
containing the list of classes, a button for opening a window regulating technical settings is also positioned, along with the button which directs the player back to the previous page. The choice of a class determines the attribution to the character of a more specific personal background and set of powers and tools, as well as a generally recognized specific social function. This will enable more specific as well as restricted interaction possibilities in the course of the game. These restrictions are once more determined by the background context of situation and will determine the creation of hyper-paths and hyper-context of situation during the game sessions. Thus, also in this phase the character under construction activates and realizes the transmediality of the meanings between the two contexts. Each class choice is related to just one specific set of attributes: this means that whichever race and gender the player chooses, each choice made in the first page will determine a restricted number of class choices which, in its turn, will determine one set of specific powers and functions of the character under creation. In this respect, the system of choices available for the construction of the virtual identity can be compared to language systems: the more one approaches the realization of the virtual identity (a multimodal text), the more one restricts choices. However, unlike what happens in the realization of a verbal message, where the process of creation is generated by and destined for the same context of situation, in the case of a MMORPG, the character as multimodal text functions as a transmedial creative/interactive text working between two contexts. 
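The progressive restriction of choices described above can be sketched in code. The following Python fragment is only an illustrative model, not a description of how LOTRO is implemented: the race and gender options follow Figure 13.1, while the class names, attribute sets and function names are hypothetical placeholders, since the chapter does not list which classes each race makes available.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

# Phase 1 options as listed in Figure 13.1.
# The Dwarf icon carries no gender choice there, modelled here as None.
RACES = {
    "Man": ("male", "female"),
    "Dwarf": (None,),
    "Hobbit": ("male", "female"),
    "Elf": ("male", "female"),
}

# Hypothetical placeholders: which classes each race unlocks is an assumption.
CLASSES_BY_RACE = {
    "Man": ("class_a", "class_b", "class_c"),
    "Dwarf": ("class_a", "class_b"),
    "Hobbit": ("class_b", "class_c"),
    "Elf": ("class_a", "class_c"),
}

# Each class is tied to exactly one attribute set (placeholder contents).
ATTRIBUTES_BY_CLASS = {
    "class_a": frozenset({"tool_1", "power_1"}),
    "class_b": frozenset({"tool_2", "power_2"}),
    "class_c": frozenset({"tool_3", "power_3"}),
}


@dataclass(frozen=True)
class Alias:
    """The completed character: the single text left once all choices are made."""
    race: str
    gender: Optional[str]
    char_class: str
    name: str
    attributes: FrozenSet[str]


def available_classes(race: str) -> Tuple[str, ...]:
    """Phase 2 options, restricted by the phase 1 choice of race."""
    return CLASSES_BY_RACE[race]


def create_alias(race: str, gender: Optional[str], char_class: str, name: str) -> Alias:
    """Apply the three phases in order; each choice narrows what follows."""
    if race not in RACES:
        raise ValueError(f"unknown race: {race}")
    if gender not in RACES[race]:
        raise ValueError(f"gender {gender!r} is not offered for {race}")
    if char_class not in available_classes(race):
        raise ValueError(f"class {char_class} is not offered for {race}")
    # The class choice fixes one set of powers and tools.
    return Alias(race, gender, char_class, name, ATTRIBUTES_BY_CLASS[char_class])


alias = create_alias("Elf", "female", "class_a", "Example")
print(alias)
```

In this sketch the first choice narrows the second, and the second fixes the attribute set, so that the completed alias is the only option left once every phase has been traversed.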
The background context of situation is the one where the creators/producers of the MMORPG can exercise their power as far as the selection of a specific target client is concerned, because the choice of a pretext realized by a background context of situation implies a preliminary selection of readers/users and future game players. Thus, the systemic functional analysis of an alias creation process can also reveal the commercial strategies that lie behind and support the creation, production and marketing of a MMORPG.

13.3.3 The physical creation of a virtual identity: Name and aesthetic features

The third hyper-page devoted to the creation of a virtual identity in The Lord of the Rings Online has the same general structure as the first and the second one. Choices have to be made in the left and right sections while the character under construction occupies the centre of the page, with title and logo above and movement commands below. In the bottom left section, there is again the technical features button and the ‘backwards’ one. In the bottom right section, the buttons to finalize the character creation and enter the game through the virtual identity are positioned. The left-hand side section is devoted to the choice of a name: suggestions of prefixes and
suffixes are offered, according to the selection of a cultural and geographical origin, on the basis of the race and gender selected by the player. These will consequently determine the choices of more specific physical traits on the right-hand side. In this phase, the choices will restrict availabilities to only one multimodal text, the completed game character. The alias is therefore created through consecutive attributive phases; in this respect, this process is comparable to the one enacted by the writer: Tolkien’s characters, as shown in Section 13.2, are described as being progressively attributed qualities and features by an unknown external power/entity and against a specific background. The multimodally represented Participants of a MMORPG overlap by nature with the interactive Participants of the game: characters are virtual social reconstructions of the players. The players are supposed to come from a cultural background which comprises the background context of situation of The Lord of the Rings books. Therefore, through the process of creation of virtual identities, the multimodal hyper-text of a MMORPG elaborates meanings belonging to a text in a background context of situation and develops them into a multimodal text in the form of a game character with multimodal interactional potentialities. The example presented in this chapter with The Lord of the Rings may serve as a model for analysis of all other MMORPG.
13.4 Conclusion
The comparison between the systemic functional linguistic analysis performed on the Lord of the Rings verbal text and the multimodal analysis performed on the multimodal text of the online game has shown, first of all, that a MMORPG offers a virtual world that a player can enter only if he or she previously construes a multimodal pretext in the form of a character; this is achieved through the process of creation of a virtual identity which will be represented in the game hyper-environment and which overlaps with the player as interactive Participant in the multimodal discourse construed through the game play. In addition, it has been observed that the producers of a MMORPG target a range of potential buyers precisely through the choices they offer in terms of virtual identity creation. A MMORPG like The Lord of the Rings Online relies on a pre-existent verbal text in order to target potential buyers, because the choices it offers for creating virtual identities are based on the background context of situation realized by such a text. Furthermore, three important things can also be inferred about the validity and applicability of the functional framework to the study of this new thriving form of online entertainment. First, the functional framework of analysis, when applied to the multimodal hyper-discourse of a MMORPG, implies the conceptualization of new, developed typologies of meanings which derive from transmediality within the three basic Hallidayan categories.
Second, there are two types of contexts of situation involved in the creation of the hyper-text of the MMORPG. Third, the background context of situation is the factor the creators and sellers of a MMORPG can and must exploit in order to target the game/product to a certain kind of potential buyers and market. Interestingly, if the way a character is introduced on the book scene is compared to the process of creation of a virtual identity, an important fact emerges: while the meanings realized by the verbal text tend to make the reader ‘look’ at a character inserted in a specific environment as the ‘incarnation’ of a certain kind of power and a series of specific personal qualities, the potential interactional multimodal meanings selected by each player when creating a virtual identity tend to make this character ‘look to act’, that is to say, to have physical as well as additional features which can grant the character the possibility to realize specific actions and interactions in the game. A future step in the study of MMORPG functions and markets may be oriented towards the comparison between the results of analyses like the ones proposed in this chapter and the quantitative data related to the actual sales of the games as products. This would allow us to know if the potential target buyer inferred by the multimodal study corresponds to the typology of the actual buyer, and would tell us about the social response to the marketing strategies involved in the creation of this kind of multimodal form of entertainment.
Appendix 13.1 Systemic functional analysis of The Fellowship of the Ring

1) The Hall of Elrond’s House | was | filled with folks
Experiential: Carrier | Pr. Relational/Attributive | Attribute
Interpersonal: Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme

2) [There] | [were] | Elves for the most part
Experiential: - | Pr. Existential | Existent
Interpersonal: Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme

3) though there | were | a few guests of other sorts
Experiential: - | Pr. Existential | Existent
Interpersonal: Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme

4) Elrond (..*..) | sat | in a great chair at the end of the table upon the dais
Experiential: Behaver | Pr. Behavioural (near Material) | Circ. Location/Space
Interpersonal: Subject | Finite-past | Residue (Mood Block: Subject + Finite-past)
Textual: Theme | Rheme

4 bis) * as was | his custom
Experiential: Pr. Relational | Carrier
Interpersonal: Finite | Subject (Mood Block: Finite + Subject)
Textual: Theme | Rheme

5) and next to him on the one side | sat | Glorfindel
Experiential: Circ. Location/Space | Pr. Behavioural (near Material) | Behaver
Interpersonal: Residue | Finite-past | Subject (Mood Block: Finite-past + Subject)
Textual: Theme | Rheme

6) on the other side | sat | Gandalf
Experiential: Circ. Location/Space | Pr. Behavioural (near Material) | Behaver
Interpersonal: Residue | Finite-past | Subject (Mood Block: Finite-past + Subject)
Textual: Theme | Rheme

7) Frodo | looked at | them | in wonder
Experiential: Senser | Pr. Mental/Perception | Phenomenon | Circ. Manner/Quality
Interpersonal: Subject | Finite-past | Residue (Mood Block: Subject + Finite-past)
Textual: Theme | Rheme

8) for he | had never before seen | Elrond
Experiential: Senser | Pr. Mental/Perception | Phenomenon
Interpersonal: Subject | Finite, Mood Adjunct | Residue (Mood Block: Subject + Finite + Mood Adjunct)
Textual: Theme | Rheme

9) of whom | so many tales | spoke
Experiential: Circ. Matter | Sayer | Pr. Verbal
Interpersonal: Residue | Subject | Finite-past, Residue (Mood Block: Subject + Finite-past)
Textual: Theme | Rheme

10) and as they | sat | upon his right hand and his left
Experiential: Behaver | Pr. Behavioural | Circ. Location/Space
Interpersonal: Subject | Finite-past | Residue (Mood Block: Subject + Finite-past)
Textual: Theme | Rheme

11) Glorfindel, and even Gandalf (..*..), | were revealed | as lords of dignity and power
Experiential: Identified | Pr. Relational/Identifying | Identifier
Interpersonal: Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme

11 bis) *whom | he | thought | he | knew | so well
Experiential: Phenomenon 2 | Senser 1 | Pr. Mental/Cognition | Senser 2 | Pr. Mental/Cognition | Circ. Manner/Quality
Interpersonal: Residue | Subject | Finite-past | Subject | Finite-past | Residue (two Mood Blocks, each Subject + Finite-past)
Textual: Theme | Rheme

12) Gandalf | was | shorter in stature than the other two;
Experiential: Carrier | Pr. Relational/Attributive | Attribute
Interpersonal: Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme

13) but | his long white hair, his sweeping silver beard, and his broad shoulders, | made | him look | like some wise king of ancient legend.
Experiential: - | Initiator | Pr. Relational/Attributive | Carrier | Circumstance as Attribute
Interpersonal: - | Subject | Finite present | Residue (Mood Block: Subject + Finite present)
Textual: Theme | Rheme

14) In his aged face under great snowy brows | his dark eyes | were set | like coals [[that could leap suddenly into fire]].
Experiential: Circ. Location/Space | Goal | Pr. Material | Circ. Manner/Comparison
Interpersonal: Residue | Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme

15) Glorfindel | was | tall and straight;
Experiential: Carrier | Pr. Relational/Attributive | Attribute
Interpersonal: Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme
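
16) his hair | was | of shining gold,
Experiential: Carrier | Pr. Relational/Attributive | Attribute
Interpersonal: Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme

17) his face | [was] | fair and young and fearless and full of joy;
Experiential: Carrier | [Pr. Relational/Attributive] | Attribute
Interpersonal: Subject | [Finite] | Residue (Mood Block: Subject + [Finite])
Textual: Theme | Rheme

18) his eyes | were | bright and keen
Experiential: Carrier | Pr. Relational/Attributive | Attribute
Interpersonal: Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme

19) and his voice | [was] | like music;
Experiential: Carrier | [Pr. Relational/Attributive] | Circumstance as Attribute
Interpersonal: Subject | [Finite] | Residue (Mood Block: Subject + [Finite])
Textual: Theme | Rheme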

20) on his brow | sat | wisdom,
Experiential: Circ. Location/Space | Pr. Behavioural | Behaver
Interpersonal: Residue | Finite-past | Subject (Mood Block: Finite-past + Subject)
Textual: Theme | Rheme

21) and | in his hand | was | strength.
Experiential: - | Circumstance as Attribute | Pr. Relational/Attributive | Carrier
Interpersonal: - | Residue | Finite | Subject (Mood Block: Finite + Subject)
Textual: Theme | Rheme

22) The face of Elrond | was | ageless,
Experiential: Carrier | Pr. Relational/Attributive | Attribute
Interpersonal: Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme

23) [it] | [was] | neither old nor young,
Experiential: [Carrier] | [Pr. Relational/Attributive] | Attribute
Interpersonal: [Subject] | [Finite] | Residue ([Mood Block]: [Subject] + [Finite])
Textual: [Theme] | Rheme

24) though in it | was written | the memory of many things both glad and sorrowful.
Experiential: Circ. Location/Space | Pr. Material | Goal
Interpersonal: Residue | Finite, Residue | Subject (Mood Block: Finite + Subject)
Textual: Theme | Rheme

25) His hair | was | dark as the shadows of twilight,
Experiential: Carrier | Pr. Relational/Attributive | Attribute
Interpersonal: Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme

26) and | upon it | was set | a circlet of silver;
Experiential: - | Circ. Location/Space | Pr. Material | Goal
Interpersonal: - | Residue | Finite, Residue | Subject (Mood Block: Finite + Subject)
Textual: Theme | Rheme

27) his eyes | were | grey | as a clear evening,
Experiential: Carrier | Pr. Relational/Attributive | Attribute | Circ. Manner/Comparison
Interpersonal: Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme

28) and | in them | was | a light like the light of stars
Experiential: - | Circumstance as Attribute | Pr. Relational/Attributive | Carrier
Interpersonal: - | Residue | Finite | Subject (Mood Block: Finite + Subject)
Textual: Theme | Rheme

29) Venerable | he | seemed | as a king crowned with many winters,
Experiential: Attribute | Carrier | Pr. Relational/Attributive | Circ. Manner/Comparison
Interpersonal: Residue | Subject | Finite-past | Residue (Mood Block: Subject + Finite-past)
Textual: Theme | Rheme

30) and | yet hale | [he] | [seemed] | as a tried warrior in the fullness of his strength.
Experiential: - | Attribute | [Carrier] | [Pr. Relational/Attributive] | Circ. Manner/Comparison
Interpersonal: - | Residue | [Subject] | [Finite] | Residue (Mood Block: [Subject] + [Finite])
Textual: Theme | Rheme

31) He | was | the Lord of Rivendell
Experiential: Identified | Pr. Relational/Identifying | Identifier
Interpersonal: Subject | Finite | Residue (Mood Block: Subject + Finite)
Textual: Theme | Rheme

32) and | [he] | [was] | mighty among both Elves and Men.
Experiential: - | [Carrier] | [Pr. Relational/Attributive] | Attribute
Interpersonal: - | [Subject] | [Finite] | Residue (Mood Block: [Subject] + [Finite])
Textual: Theme | Rheme

Text taken from J.R.R. Tolkien, The Fellowship of the Ring (The Lord of the Rings, Part One), London, Harper Collins, 2007, p. 295.9
Notes
1. See Maiorani, 2008.
2. See Slevin, 2000.
3. Using the general term ‘video games’, Gee (2003, p. 7) comments on the social nature of these entertainment products: ‘Video-games – like many other games – are inherently social, though in video games, sometimes the other players are fantasy creatures endowed, by the computer, with artificial intelligence and sometimes they are real people playing out fantasy roles.’ Interestingly, Gee underlines the fact that social relations are established while playing video games without paying attention to the fact that one can interact either with other human players or with computer-generated identities.
4. Kress and van Leeuwen (2006), O’Halloran (2004), van Leeuwen (1999) and Ventola et al. (2004) offer several theoretical descriptions and many different examples of multimodal discourse analysis, covering a variety of semiotic systems and text typologies.
5. This work uses Halliday and Matthiessen (2004) as basic reference for the functional theory.
6. See van Leeuwen (2005).
7. The three movies, released between 2001 and 2003 and directed by Peter Jackson, keep the same titles as the books of the trilogy.
8. The author would like to thank YouTube and Andrew from Google brandpermission service who kindly and rapidly answered her requests for copyright permission. All screenshots published in this chapter are available online and are used according to the Fair Dealing law.
9. The author has asked the publisher for copyright permission but so far has received no answer. The text used in this analysis is published according to the Fair Dealing law.
References
Gee, J. P. (2003) What Video Games Have to Teach Us About Learning and Literacy (New York: Palgrave Macmillan).
Halliday, M. A. K. and C. M. I. M. Matthiessen (2004) An Introduction to Functional Grammar, 3rd edn (London: Arnold).
Kress, G. and T. van Leeuwen (2006) Reading Images: The Grammar of Visual Design, 2nd edn (London and New York: Routledge).
Maiorani, A. (2008) ‘Web experience as an expansion: A perspective from multimodal discourse analysis’, in Proceedings of the AISB 2008 Symposium on Multimodal Output Generation, Vol. 10 (Aberdeen: AISB), pp. 58–61.
O’Halloran, K. L. (ed.) (2004) Multimodal Discourse Analysis: Systemic Functional Perspectives (London: Continuum).
Slevin, J. (2000) The Internet and Society (Cambridge: Polity Press).
van Leeuwen, T. (1999) Speech, Music, Sound (London: Macmillan).
van Leeuwen, T. (2005) Introducing Social Semiotics (London and New York: Routledge).
Ventola, E., C. Charles and M. Kaltenbacher (eds) (2004) Perspectives on Multimodality (Amsterdam and Philadelphia: John Benjamins).
Part V Integrating Text, Visual and Space Multimodally
14 From Musing to Amusing: Semogenesis and Western Museums Maree Stenglin
14.1 Introduction
In recent years, there has been growing interest in museums among social semioticians, and in particular, among scholars working with systemic functional linguistics (SFL). For instance, MacLulich (1993), Ferguson, MacLulich and Ravelli (1995), Ravelli (1996, 1998, 2006) and Purser (2000) have applied SFL to the analysis of verbal texts in museums. Hofinger and Ventola (2004) have moved beyond SFL, into systemic functional-multimodal discourse analysis (SF-MDA), to explore verbiage-image relations. SF scholars have also considered the organization of space in museum exhibitions in relation to genre (White, 1994), composition (van Leeuwen, 1998) and the co-deployment of different semiotic resources (Pang, 2004). In addition, Stenglin (2004, 2007, 2008, in press for 2009a, in press for 2009b) has proposed tools for analysing ideational, interpersonal and textual meanings in three-dimensional spaces such as museums, homes and shopping centres. These tools were used to illuminate how the discourse of reconciliation was enacted in Te Papa Tongarewa, The National Museum of New Zealand (Martin and Stenglin, 2007). Complementing these studies, this chapter explores the ideology behind the evolution of Western museums as cultural institutions through a dynamic model that involves two semiotic resources: social context and semogenesis (Martin, 1997). This model is based on a central tenet of SFL: that ‘texts are social processes and need to be analysed as manifestations of the culture they in large measure construct’ (Martin, 1992, p. 493). While numerous SFL studies provide systematic accounts of the social contexts in which written and spoken texts function, SF-MDA studies have tended to focus almost exclusively on the interaction between semiotic resources or the theorization of different modalities. 
To address this gap, and specifically to engage with the broader sociocultural context in which SF-MDA studies of museums are situated, this chapter explores how semogenesis, the process of semiotic change, projects ‘valuer’ on to different time scales in the evolution of museums. Several
important insights into museums emerge as a result: a rich understanding of the social context from which museums, as cultural institutions, have evolved, and a systematic understanding of the social context in which they currently operate. Also illuminated is the way hegemony has been construed in museums over the past few centuries. If such rich perspectives on social change are systematically mapped and drawn on to inform more detailed work on multimodality in cultural institutions such as museums, the benefits for understanding, engaging with and accurately interpreting the complexities of these sites are immense.
14.2 Introducing the tools for analysing ideology: Semogenesis and social context

One way of modelling ideology semiotically involves using semogenesis (Halliday, 1992, 1993; Halliday and Matthiessen, 1999) to project social context (see Martin, 1997, pp. 9–11). Semogenesis is concerned with the ways meanings unfold over time and has three timescales to model socio-semiotic change: logogenesis, ontogenesis and phylogenesis. Logogenesis refers to short timeframes such as the unfolding of a text; ontogenesis refers to longer timeframes such as the development of language in the individual; and phylogenesis refers to extended time-depth such as the evolution of language in a culture. (See Martin, 1997, p. 9 for a diagrammatic representation of these timescales.) Social context, as developed within SFL by Martin (1992), is stratified into two levels: context of culture and context of situation. Context of culture refers to ‘genre’ or the staged, goal-oriented social processes which people use as they live their lives. Context of situation refers to three dimensions of a situation: the social activity (Field), the social relationship between interactants (Tenor) and the semiotic channel of communication (Mode). These dimensions are collectively known as register.1 When semogenesis projects social context, it enables us to see how a person, or cultural institution (in the case of Western museums), engages dynamically with individual texts/exhibitions as they unfold in space and time (logogenesis); how a person/cultural institution is positioned and repositioned throughout their life history (ontogenesis); or ‘the ways a culture reworks hegemony across generations (phylogenesis)’ (Martin, 1997, p. 10). Exploring all of these projections is beyond the scope of this chapter but Section 14.3 will investigate configurations of social context (genre and register) in relation to two critical moments in the phylogenesis of museums.
14.3 Illuminating ideology: Two seminal moments in museum history

Before exploring the key social changes that have phylogenetically shaped museums as cultural institutions, we need to acknowledge that
the evolution of the Western museum as a cultural institution actually began during the Renaissance with the cabinets of curiosities. These served the social purposes of private glorification and mastery over the world. Accordingly, their valuer was concerned with admiring and respecting the triumphs of the collectors and celebrating the material domination of the colonial world. Cabinets of curiosities were commonplace in Europe during the fifteenth century, especially in Florence, with the emergence of a wealthy merchant class.2 In the sixteenth century, however, the cabinets lost their focus on curios and became ‘cabinets of the world’, that is, encyclopaedic collections. These were the phylogenetic foundations on which the modern and post-modern museums were built. (For a comprehensive discussion of sixteenth-century cabinets of the world, see Hooper-Greenhill, 1992; Stenglin, 2004, pp. 74–84.) We now turn to two critical moments in the evolution of museums: the emergence of the first public museum in the eighteenth century, and later, the hybrid public museum of the new millennium.

14.3.1 The emergence of the public museum: The palace of the people

The eighteenth century was dominated by the philosophy of Enlightenment; coupled with the advent of democracy, this provided the stimulus for the next seminal moment in the genesis of the museum (Einreinhofer, 1997; Hooper-Greenhill, 2000). During this period, the construal of ideology shifted as the valuer of the Western museum became concerned with the triumph of invention and good citizenship. Museums entered the public sphere as secular institutions with collections held in trust for the public, but even more importantly, they were open to the citizens of entire nations. Thus, public museums, such as the Louvre in Paris and the British Museum in London, were born by appropriating royal and aristocratic collections, respectively.3 Museums also became increasingly specialized at this time (Bennett, 1995, p. 2). 
The eighteenth century saw the emergence of at least three distinct museum types: art museums such as the Louvre, natural history museums such as the British Museum, and history museums such as the Capitoline in Rome. This development was tied to the proliferation of science during the seventeenth and eighteenth centuries, and it resulted in valuing classification and hierarchy (Bennett, 1995, p. 2). In addition, technological innovations meant that scientific observations became more precise than ever before. As a result, science discredited magic and sorcery, and the eighteenth century was characterized by the appeal to ‘logic’ and ‘reason’. With the transition from the private to the public realm, the shift to scientific method and social inclusiveness, the social purpose of museums also changed. Instead of private glorification, the museum became a tool of ‘enlightenment’. Its social purpose was to educate the citizens of the
nation-state by teaching them about the universe through the display of objects. Objects were, moreover, regarded as ‘sources of knowledge, as parts of the real world that had fixed and finite meanings that could both be discovered, once and for all, and then taught through being put on show’ (Hooper-Greenhill, 2000, p. 5). The instructional purpose of the museum was strongly influenced by the widespread belief that people could be morally and intellectually ‘improved’ through education (Gay, 1984, pp. 14–17; Ozouf, 1988, pp. 198–203; Bennett, 1995, pp. 18–20). In other words, eighteenth-century museums, like schools and libraries, became pedagogical institutions with a focus on the selective transmission and the acquisition of knowledge (Bernstein, 1990, pp. 183–4). According to Bennett (1995, pp. 18–20), the aim of the British government at this time was to ‘civilize’ the population and ‘transform’ its citizens by regulating their behaviour. One of the ways it attempted to do this was by encouraging the ‘lower classes’ to visit libraries, museums and art galleries (Bennett, 1995, p. 19). The idea was that by engaging with cultural institutions such as these, the masses would learn to imitate the behaviour, dress, morals, manners, norms and values of their ‘social superiors’. The museum as an eighteenth-century pedagogic institution achieved its educational goals of enlightening the public through didactic and ‘logically organized’ exhibitions. At the Louvre, for instance, which opened to the public on 10 August 1793, the first exhibition was a display of the national collection of art, involving almost 1,200 works. The overall aim of the opening exhibition was instructional: to provide visitors with a broad understanding of the history of French art, especially the Western influences that had shaped it. To achieve this aim, the exhibition was organized as a macro-genre (Martin and Rose, 2003).
First, the objects were thematically grouped into two schools of art: the Italian and Northern. The Italian schools were allocated two ‘courts’ of the Louvre’s Grand Gallery, while four ‘courts’ were dedicated to the Northern schools. Embedded within these geographic classifications, moreover, were a series of recounts. The aim of each recount was to document and showcase the professional development of individual artists. The exhibition as a macro-genre thus classified art objects into a geographic report, and then embedded a series of biographical recounts within them. By organizing exhibitions into macro-genres in this way, it was believed that object displays were able to ‘enlighten’ the masses. They did this by presenting visitors with knowledge in clear and logical ways. So walking through an exhibition, which unfolded in space and time, was conceived to be ‘a pedagogic act’ (Hooper-Greenhill, 2000, pp. 5–6). In this way, mode, especially the modality of space, was seen as critical to achieving the museum’s pedagogical social purpose.
14.3.1.1 Mode
Although objects and collections were at the heart of the public museum of the eighteenth century, the modalities of space and language played increasingly pivotal roles in assisting the museum to achieve its underlying pedagogic purpose. For instance, organizing the Louvre’s spaces into semi-enclosed hubs which housed either the Italian or Northern schools of art was central to classifying the objects into distinct but interrelated semiotic units. This meant that, although spatial enclosures, such as walls, functioned to separate one semiotic array of objects from another, the divisions were not concerned with sealing objects from one another. Rather, they were concerned with grouping the objects into distinct but related categories – in this instance, schools of art. Within each school, moreover, chronology was used to organize the display of works in such a way that the visitor’s logogenetic movement through each hub dynamically corresponded with the unfolding of several distinguished artists’ careers. In these ways the organization of spaces was crucial to achieving the social purpose of educating visitors about the history of Western art. Similarly, in the British Museum, objects were scientifically organized using a Linnean taxonomy and displayed as semiotic units. Thus, one spatial enclosure housed the fish gallery, another the insect gallery, a third the reptile gallery and so forth. Given that the public museum of the eighteenth century was ‘an educational institution for the common people’ (Einreinhofer, 1997, p. 26), language, both spoken and written, began to play an important role in exhibitions. First, the spoken mode became part of the guided-viewing experiences offered to visitors interested in joining curators for floor talks about the works on display in the exhibition (Hooper-Greenhill, 1991, p. 258; 1992).
The written mode complemented these talks via the display of explanatory text panels and the development of catalogues, written by curators and sold cheaply, in order to encourage visitors to purchase them and read more about the history of Western art in their homes (Hooper-Greenhill, 1991, p. 258; 1992). In these ways, the modalities of space and language assisted museums in transferring knowledge from specialist curators to a general public.

14.3.1.2 Field

Essentially, the four global activity sequences of the cabinets – collecting, curating, conserving and viewing – continued. However, some activities, such as curating, did change in nature, while others, such as conserving, became more specialized. The taxonomy in Figure 14.1 presents the activity sequences of the eighteenth-century public museum. Another major difference from the cabinets of the past was the addition of a new activity: that of administering. As the size of the collection entrusted to the public museum for safekeeping was large (and kept expanding in
Figure 14.1 Activities associated with the eighteenth-century museum
Collecting: Gathering (objects); Sourcing (objects); Sorting (objects)
Conserving: Repairing (objects); Recreating (broken objects by piecing them back together)
Researching: Displaying (objects); Observing/studying (objects); Interpreting (objects); Translating (primary source materials from antiquity); Sharing (knowledge with the owners of the collection)
Viewing: Assessing (the monetary value of the objects); Inferring (the wealth, social power and status of the owners)
response to imperialist conquests), the need to administer museums was identified and directors were appointed. Their role was to supervise all of the museum’s activities including new front-of-house duties such as cleaning and surveillance. In response to the museum’s social purpose of educating the masses, which included the improvement of their morals and manners, surveillance also came into being. In particular, uniformed security guards were employed to patrol the public spaces and ensure the safety of the collections, as well as regulate the behaviour of visitors (Bennett, 1995, p. 24). In this way, the activity of visiting museums and viewing their exhibitions became ‘one of the central acts of democratic citizenship’ (Davidson, 2001, p. 13). Furthermore, as the activity sequences of collecting indicate, eighteenth-century national museums were sites of imperialism. National museums, such as the Louvre, became depositories for the spoils of wars fought by military leaders, such as Napoleon. This meant that specialized curatorial staff accompanied war leaders in order to inspect, appropriate and
Figure 14.2 Participants associated with the eighteenth-century museum
Owners: Princes; Merchants; Scholars
Viewers: Owners (princes, merchants, scholars); Other wealthy citizens; Scholars including travelling scholars; Collecting agents from other countries
Researchers/scholars
Conservators/artists
transfer the most valuable works to the national museum (Einreinhofer, 1997, pp. 25–6). In fact, the reward Napoleon bestowed on the greatest ‘collector’ of such spoils, Dominique Vivant Denon, was the directorship of the Louvre. The activities involved in developing exhibitions, that is, selecting, grouping, sequencing and displaying objects to achieve a social purpose, are also not neutral. Contrary to the popular opinion of the time, the eighteenth-century museum did not depict social realities such as the history of art or the plan of creation in ‘neutral’, ‘rational’ and ‘objective’ ways (Hooper-Greenhill, 2000, pp. 17–18; Davidson, 2001). Thus Weil (1995, p. 17) writes:

As museum workers, we are not merely passive reflectors of the world – simple recorders of its seven wonders – but active participants in how the world is perceived and understood, participants in the creation of meaning, shapers of reality.

So, the selection of some objects was always made at the exclusion of others. Yet the ways in which these meanings were made remained largely implicit throughout the eighteenth and nineteenth centuries. A taxonomy of the participants involved in activities of the eighteenth-century public museum is represented in Figure 14.2.

14.3.1.3 Tenor

As with the cabinets, the power relations enacted in the public museum were unequal. Rather than being expressions of private affluence or royal power, social relationships were distinguished by power differentials related
to knowledge and social status. At one level, social relations tended to be that of specialist to lay-person, or expert to apprentice. The status of the expert curator was so high that the curatorial voice was regarded as being ‘transcendent’ (Weil, 1990, p. 51). At another level, social relations were those of nation-state to citizen. It is not surprising, then, that many of the exhibitions held in public museums coincided with celebrations of national events, such as the birthday of Napoleon (Einreinhofer, 1997). The aim of such activities was clearly to foster a deep sense of pride, solidarity and national identity among the citizens. Thus national museums were actively involved in constructing the citizen in relation to the nation-state (Duncan, 1998; Davidson, 2001, p. 13). Frequency of contact was controlled to some extent by the institution. Attendance at the Louvre, for example, was organized in accordance with a ten-day roster. During the first five days of the cycle, the museum was exclusively open to artists and copyists. During the next two days it was closed for cleaning, while the last three days of the cycle were dedicated to the general public and open free of charge (Hooper-Greenhill, 1992, p. 183). This effectively meant that the museum was inaccessible to ordinary citizens for seven days out of every ten, that is, for 70 per cent of its operational time. From the viewpoint of affect,4 the recontextualization of a royal palace into a democratic institution dedicated to serving the general public involved many polarities: the transformation of a powerful icon of monarchy, aristocracy and wealth into an icon of liberty, equality and fraternity. This, in turn, implies that what was once accessible to a privileged few is now open to all citizens, but as previously discussed, the museum was not as accessible as initially thought. Moreover, the public museum of the eighteenth century is a pedagogical institution controlled by the state.
This means that it is simultaneously one of ‘the technologies designed to create docile bodies, and to reform the population as a resource for the government,’ (Hooper-Greenhill, 1992, p. 195). Herein lies one of its central anomalies. On the one hand, the public museum represents free and ‘open’ access to the collective treasures of the nation, while, on the other, it is a powerful instrument for propaganda and social control (Bennett, 1995, pp. 20–4). This anomaly indicates that, as was the case with Renaissance cabinets, social power still dominates Tenor relations in the eighteenth-century museum. Finally, with respect to orientation to affiliation, national museums were strongly dedicated to evoking pride in the greatness of the nation. This, in turn, facilitated a strong sense of unity, that is, nationalism. One of the ways the state achieved this was by ‘dazzling’ the public with the spectacle of the nation’s great treasures (McClellan, 1994, p. 99). The power of objects for bonding people was equally evident in its negative form. Although Einreinhofer (1997, p. 27) describes the despoiling of the Louvre after
Napoleon’s defeat as ‘a bitter and humiliating experience for the French people’, these negative emotions also united people in their adversity.

14.3.2 The emergence of a hybrid: the post-modern and post-colonial museum

From the end of the twentieth century, a new type of museum has begun to emerge: the hybrid museum of post-modernity and post-colonialism. It has a blend of social purposes, of which two stand to the fore, education and entertainment, although their differences make them challenging to reconcile. The over-riding social purpose of the post-modern museum remains pedagogic (Anderson, 1997; Anderson, 1998; Hein and Alexander, 1998; Hooper-Greenhill, 1991, 1992, 2000; Falk and Dierking, 1992, 1995, 2000; Weil, 1990, 1995).

Museums are among our pre-eminent cultural institutions for learning. Museums are where society gathers, preserves, and displays visible records of social, scientific and artistic accomplishments; where society supports scholarship that extends knowledge from paleontology to meteorites; and where people of all ages turn to build understandings of culture, history and science. (Leinhardt, Crowley and Knutson, 2002, p. ix)

To achieve its educational goals, the post-modern museum, like its predecessor, is organized as a complex of macro-genres. Thus, some of its exhibitions are recounts (van Leeuwen, 1998), while others are scientific information reports often containing embedded explanations of life cycles and reproduction. Others present thematically organized interpretations of art while still others blend persuasive genres, such as expositions, discussions and directives (White, 1994).5 Like the people’s palace of the eighteenth century, the hybrid museum still aspires to ‘civilize the masses’ (Bennett, 1995, p. 8) but in a slightly different way. Rather than improving people’s manners, its role is to facilitate moral and ethical social change (Weil, 1990, pp. 43–56; Kelly and Gordon, 2002, p. 153), for example:

● promoting reconciliation between Indigenous and non-Indigenous Australians (Griffin and Sullivan, 1997, p. 11; Kelly and Gordon, 2002, pp. 168–9);
● inspiring the conservation of the oceans (Anderson, 1988, p. 104);
● combating violence and bigotry (Anderson, 1988, p. 5).
In other words, the telos of education in the hybrid museum is oriented to realigning visitors into a new subjectivity. Conflated with education in the hybrid museum is another social purpose, that of entertainment. This conflation has its genesis in the Great International Exhibitions of the nineteenth century (Greenhalgh, 1989),
which were designed to entertain, as well as educate, the general public. The goal of entertainment appears to have arisen in response to two specific factors. The first is the pressure of economic rationalism: funding is increasingly dependent on visitor numbers. The second is the positioning of the hybrid museum by market and visitor research as a ‘leisure activity’, competing with theme parks, leisure centres and shopping malls for market share (Rojek, 1995; Kotler and Kotler, 1998, 2000; Environmetrics, 2000; Lynch et al., 2000). Economic imperatives can no longer be ignored if museums are to survive (Weil, 1990), so they have become hybrids involved in several entrepreneurial initiatives aimed at improving the quality and range of their leisure service provision. First, they have developed multiplex functions (retail, dining, providing conference and venue-hire facilities). Second, museum staff have attempted to make exhibitions more alluring as leisure options to ‘consumers’ through the following developments:

● Hands-on learning activities which are seen to provide ‘added educational value’ to consumers (Caulton, 1998, p. 2);
● Active participation in learning through focus groups and front-end evaluations asking visitors what they wish to see, do and learn in exhibitions;
● Creating learning environments that are fun (Scott, 2001).
The social purpose of entertainment, however, tends to sit uneasily with museum curators whose work is strongly situated in post-colonial discourses. Embracing the awareness that museum objects have a multiplicity of meanings, post-colonial scholarship is dedicated to exploring these meanings in ways that value the voices of the oppressed and marginalized as well as to foregrounding diversity and difference (Jordanova, 1989; Vergo, 1989; Weil, 1990, 1995; Karp, Kreamer and Lavine, 1992; Hooper-Greenhill, 2000; Davidson, 2001; Pang, 2004; Martin and Stenglin, 2007). However, given the nature of post-colonial scholarship, the hybrid museum’s emphasis on entertainment and leisure servicing is difficult to reconcile with the curatorial focus on reshaping and reclaiming the past – a challenge museums need to resolve. Having briefly explored the nature of the disparate social processes involved in the development of the emerging hybrid museum, let us now consider how combinations of Mode, Field and Tenor are phased into these social processes.

14.3.2.1 Mode

Exhibitions in the post-modern museum are characterized by a proliferation of modes. In organizing exhibitions, for instance, the modalities of space, objects (including visual images) and language are co-deployed
(White, 1994; Pang, 2004). While these semiotics indicate continuity with the museums of the past, in the hybrid museum they are also increasingly co-deployed with modalities such as action and sound/music (through demonstrations, theatre performances and hands-on activities) and with an abundance of independent multimodal texts such as computer interactives, databases and documentaries. Significantly, the one modality that is all-pervasive in all museums is space – space envelops all modes. It materializes around objects, interactives and displays as well as people as they perform, listen or negotiate a path throughout an unfolding exhibition. Increasingly, that pathway is open and multilinear. More and more museums are housed in open-plan, purpose-built structures, often constructed from glass panels. The distinction between interior and exterior is dissolving as museums seek to become transparent and ‘open’ to the public they serve.

14.3.2.2 Field

In the post-modern and post-colonial world of the new millennium, museums are evolving into complex institutions. The International Council of Museums (www.icom.org) defines a museum as involved in five main activities: acquiring, conserving, researching, communicating and exhibiting. These are taxonomized in Figure 14.3. The staff involved in pedagogical activities such as collecting, conserving, researching, exhibiting and educating participate in highly specialized activities, which depend on written transmission and institutionalized learning. This indicates that research, that is, the production of new knowledge, continues to be highly valued in museums. The ICOM definition, however, fails to acknowledge the activities of staff involved in the operational and administrative running of the museum, and yet, these are crucial to the survival of the hybrid museum.
Furthermore, the development of multiplex functions has considerably extended the type of activities in which museums are involved, for example, providing dining options as well as spaces for conferences and social functions. Naoshima Contemporary Art Museum in Japan, purpose-built as a museum/hotel, has blurred the boundaries even further by offering visitors overnight accommodation. Such activities are a direct reflection of the way post-modern museums are striving to offer consumers an optimal number of choices for leisure and recreation. The participants involved in these activities of the hybrid museum are shown in Figure 14.4. The main distinguishing feature of this taxonomy is that people working in museums today are strongly classified on the basis of their professional specialization into divisions: the Exhibitions division, the Corporate Services division, the Education division and so forth.6 Although the social roles for staff are strongly bound in these ways, there are multiple choices for classifying visitors. Figure 14.4 groups visitors on the basis
Figure 14.3 Activities associated with a hybrid museum
Main activities: Collecting; Curating; Exhibiting; Conserving; Administering; Viewing
Component activities, in the order shown in the figure: Inspecting; Packaging; Shipping; Sorting; Storing; Documenting (archival materials); Selecting (works); Grouping; Sequencing; Arranging/displaying (objects); Writing (text panels/catalogues); Teaching; Recording (inventories); Cleaning; Repairing; Restoring; Transferring works (e.g. from canvas to wood); Corresponding; Accounting; Surveilling/policing; Cleaning; Looking; Reading/listening; Learning
of frequency of contact but other choices for classification could have been made based on demographic variables: educational qualifications, the social group they visit with (families, singles, couples), age and so forth.

14.3.2.3 Tenor
Building interpersonal relationships with visitors is a major challenge for the post-modern museum. In general terms, museums have tried to be more inclusive of visitors, to embrace them, involve them as active participants
Figure 14.4 Participants associated with a hybrid museum
Ministry (Arts); Trust; Director
Deputy director (admin):
Head of Exhibitions: Project managers; Designers; Preparators; Artificers (builders); Audio visual technicians
Head of Corporate Services: Administrative staff; Managers; Accountants; Cleaners, security guards; Builders; Clerks
Head of Education: Education officers; Interpretive officers; Bookings officers; Evaluators; Volunteers
Head of Community Relations: Publicists, marketers; Publishers; Photographers; Retailers; Managers, chef, waiters
Deputy Director (science):
Head of Anthropology: Scientists, clerical staff
Head of Environment: Scientists, clerical staff
Head of Zoology: Scientists, clerical staff
Head of Info. Science: Librarians, clerical staff
Visitors: Frequent; Occasional; Non-participants
in partnership with the museum and foster a strong sense of belonging and affiliation. This means that power is the most complex Tenor dimension to negotiate. The social roles of visitors, in particular, seem to have changed quite dramatically: in addition to citizens, visitors are now consumers, partners and active agents with the power to shape (through audience surveys and focus groups) many of the choices museums offer. In an attempt to become as inclusive as possible, many museums seem to have confused the distinction between authoritarian and authoritative. They regard both as being negative, and yet, as visitors are not experts in the fields of the institution,
it is often difficult for them to make informed decisions about exhibition content and interpretation. Furthermore, museums are also involved in complex and ongoing negotiations of power relations with the multiple communities they serve, especially those who have been marginalized or silenced in the past. One of the most powerful (and perhaps least obvious) ways in which this is done is through the composition of the Board of Trustees (Anderson, 1998, p. 4). In order to be inclusive, the Board should not only represent diverse points of view, it should also reflect the mix of communities that the museum serves. Frequency of contact also involves the consideration of several factors. First, the way the contact between members of a culture and a museum is distinguished may vary in regularity. Contact may thus be classified as regular (several times a year to annual contact), intermittent, one-off or non-existent. Clearly, the more regular the contact, the higher the degree of involvement a person has with a cultural institution, and this, in turn, impacts on the strength of the social relationship a cultural institution has with its visitors. The final dimension of contact concerns the amount of contact time. Visitor studies have shown that on average visitors spend less than 20 minutes in an exhibition (Kelly, 1996, 1997; Serrell, 1996). This is an extremely illuminating finding. Given that the amount of contact time is so low, the way visitors spend this time will be crucial to achieving the institution’s educational objectives. It also impacts on the visitor’s orientation to affiliation and the positive/negative affect they feel towards museums. The positive socializing experiences of the museum through the interactions of social groups such as families and school groups seem to create a long-term predisposition towards museum visiting and a positive orientation to affiliation (Bourdieu and Darbel, 1991; Saatchi and Saatchi, 2000). 
Negative experiences, on the other hand, seem to predispose people towards becoming non-participants who hardly, if ever, engage. As a result, museums strongly foreground the importance of fostering positive experiences for visitors, especially children, through an orientation to fun and hands-on, discovery learning. Another challenge for post-modern museums is to find ways to expand their audiences. Part of this involves making people from communities that have been marginalized and silenced in the past feel safe, secure and welcome. To assist with this, museums need to do more than research the social groups that visit them. They also need strategies to help them meet the challenges involved in creating a positive orientation to affiliation by welcoming visitors to their spaces as well as to the ideas explored in their exhibitions. Both are crucial to making marginalized people feel more positively oriented towards the institution.
Related to fostering a positive orientation to affiliation is affect. The enjoyment and happiness that flow when positive affect is evoked tend to bring people together. One of the ways many museum educators in recent years have tended to approach the challenge of evoking affect is by constructing an almost causal relationship between physical contact, in the sense of ‘hands-on’ learning, and enjoyment. They thus design hands-on activities such as ‘dress ups’ for children as well as craft-making, ‘touch tables’, computer interactives, theatre performances and so forth. Such activities are thought to be much more enjoyable because they involve active rather than passive learning (Hein and Alexander, 1998, p. 26). The hands-on learning movement began in science centres such as Questacon in Canberra (Caulton, 1998, p. 2). It then spread to all types of museums: art, natural history and history. Within the museum profession it is often argued that hands-on learning (or learning by doing) is more fun than didactic learning because it involves the material, physical exploration of objects and phenomena (Hein and Alexander, 1998; Caulton, 1998). Fun, in turn, fosters a positive predisposition to museum participation and is seen as something that should be encouraged. Hands-on learning is commonly regarded as providing ‘added educational value’ to consumers, especially middle-class parents (Caulton, 1998, p. 2). However, as Caulton warns, there is no conclusive evidence that ‘hands-on’ interactions lead to ‘brains-on’ learning: ‘the evidence that [visitors] have actually learned anything, or indeed have not had previously held misconceptions reinforced, remains unproven’ (Caulton, 1998, p. 22). Furthermore, hybrid museums still retain a strong imprint of their predecessors, the Renaissance cabinets: curiosity is seen as the precursor of learning in that it provides the stimulus that attracts the attention of the museum visitor.
Museum educators believe that once stimulated, ‘... curiosity can then be redirected to more detailed information about the animal in question, or it can be used as the basis for presenting new concepts and delivering messages’ (Masters, 2003, p. 130). In fact, Masters argues that the first and most important challenge for museum exhibitions is to ‘capture the imagination and curiosity of visitors’ (Masters, 2003, p. 131). In these ways, curiosity and fun are construed as pedagogic bridges leading visitors into uncommonsense knowledge and understandings. Macken-Horarik’s domains of knowledge (1996, 1998) offer an alternative perspective that museum professionals could consider exploring. In particular, Macken-Horarik’s research offers a theory of semiosis that can be effectively used to scaffold visitors from their everyday, commonsense understandings into understandings that are theoretical and reflexive in nature. (For an account of how Macken-Horarik’s domains have been applied to the analysis of a museum, see Stenglin, 2008). Finally, linked to the ethical social purpose museums have had since the eighteenth century, another important type of learning occurs in hybrid
museums – attitudinal change. In order to facilitate attitudinal transformation, the ideal, from the point of view of the museum, is to provide visitors with a secure and comfortable environment in which they can explore their feelings (Hubbell Mackinney, 1996, p. 10). Museologists hope that this process will lead to ethical changes that realign the visitor’s ‘sense and sensibility’ with those of the cultural institution (Kelly, 2000; Lave and Wenger, 1991; Marton et al., 1993): Ultimately, museum learning is about ‘changing as a person’: how well a visit inspires and stimulates people into wanting to know more, as well as changing how they see themselves and their world as both an individual and as part of a community. (Kelly and Gordon, 2002, p. 161) Although fun and the facilitation of attitudinal change sit side by side in the post-modern museum, as the quotation above indicates, it is the museum’s role as an agent of social change that remains paramount.
14.4 Conclusion: Amusing and beyond – the challenge for social semiotics

The discussion in this chapter has indicated several differences and some important continuities in the way ideology has been construed in museums over time. The Renaissance cabinets, for example, which served the social purposes of private glorification and mastery over the universe, were dependent on wealth and power. Accordingly, their valuer was concerned with admiring and respecting the triumphs of the collectors and celebrating the material domination of the universe and the colonial world. The reversal of the cabinet’s social exclusiveness coincided with the emergence of a new and liberal middle class. Their aim was to use knowledge to enlighten and morally transform the masses by presenting the world as rational, ordered, logical – as something that can be learnt. Organized around chronological sequences and typologies, museums were the physical embodiment of rationality and progress. In this way the valuer of the eighteenth-century museum was concerned with the triumph of invention and discovery and good citizenship. Citizenship, in turn, depended on respecting knowledge as a means of freely transmitting appropriate morals, behaviours and social values. The funding crises of post-modernity, however, have stimulated another epistemic shift in museums. Consequently, the valuer of the hybrid museum is now concerned with consumption, that is, the buying and selling of goods and services. Servicing is strongly focused on the leisure industry and the museum has become a service provider in an increasingly competitive marketplace. Although the valuing of pluralism and inclusiveness is important to those concerned with scholarship, and opportunities for attitudinal
From Musing to Amusing: Semogenesis and Western Museums
growth and transformation remain highly valued by educators, these coexist alongside the leisure servicing goals driven by global economic imperatives foregrounding entertainment. Nevertheless, realigning visitors into a new subjectivity remains one of the over-arching goals of the hybrid museum. In these ways, the exploration of semogenesis and social context has been able to illuminate the ways hegemony has been construed in museums over time in their function as agents of social control. For social semioticians interested in SF-MDA, this chapter has demonstrated how projecting social context from phylogenesis yields a systematic framework for exploring the evolution of a cultural institution. However, this has not been an exhaustive account; it has simply aimed to highlight one important dimension of the rich theory with which SFL-oriented researchers work. Similarly, those interested in exhibition analysis are strongly encouraged to project social context ontogenetically and logogenetically before beginning their multimodal investigations. To exemplify the importance of doing so, Martin and Stenglin’s (2007, p. 236) exploration of ontogenesis meant they had to balance their analysis and interpretation of the reconciliation exhibition in the Museum of New Zealand, Te Papa Tongarewa, against the fact that in 2001 there was an outstanding Treaty claim on the very land and sea where the museum is built. Without this investigation into ontogenesis, their account and interpretation of spatial semiotics would have been seriously compromised. Clearly, the benefits of adopting this approach for SF-MDA are enormous.
Acknowledgement
I would sincerely like to thank Dr Emilia Djonov for her critical comments on the first draft of this chapter.
Notes
1. For more information on the earlier work from which Martin’s stratified model of context has evolved, see Ventola (1987) and Martin (1999b).
2. In Germany the cabinets of curiosity were known as ‘Wunderkammern’, in Austria they were referred to as the ‘Kunstkammern’, in France they were a ‘cabinet’, in England a ‘closet’, while in Italy a range of words were used: ‘gabinetto’, ‘studiolo’, ‘guardaroba’ and ‘museo’ (Alexander, 1979, pp. 24–6; Hooper-Greenhill, 1992, pp. 86–9).
3. The French revolutionary government appropriated, gathered, reorganized and displayed royal collections at the Louvre. The British Museum, on the other hand, originated in the private collections of Sir Hans Sloane. First housed in Great Russell Street, these were subsequently purchased by the British Parliament and installed in Montagu House in Bloomsbury. The British Museum was initially a semi-public institution as its Trustees immediately initiated a system of fee-paying in order to restrict the number and type of the general public who could enter its spaces.
Maree Stenglin
4. Affect as it is used here is one of the Tenor variables that co-articulates with status and contact (Martin, 1992). It has a positive or negative dimension together with a permanent/transient aspect as theorized by Poynton.
5. Most museum professionals involved in exhibition design, however, are not explicitly aware of the range of genres they work with. Rather, they tend to work implicitly with the notion of the exhibition as a story, storyline or narrative (Verlarde, 1988; Vergo, 1989; Dean, 1994; Serrell, 1996).
6. The taxonomy in Figure 14.4 is based on the organization of the Australian Museum in Sydney in the early years of the new millennium. It is therefore only one example of the internal organization of a contemporary cultural institution and it is possible for different museums to have different organizational structures.
References Alexander, E. (1979) Museums in Motion (Nashville: the American Association for State and Local History). Anderson, D. (1997) A COMMON WEALTH: Museums and Learning in the United Kingdom, a Report to the Department of National Heritage (London: Victoria and Albert Museum). Anderson, G. (1998) Museum Mission Statements: Building a Distinct Identity (Washington, DC: American Association of Museums). Bennett, T. (1995) The Birth of the Museum (London: Routledge). Bernstein, B. (1990) The Structuring of Pedagogic Discourse: Class, Codes and Control Vol. IV (London and New York: Routledge). Bourdieu, P. and A. Darbel (1991) For the Love of Art: European Art Museums and their Public (Cambridge: Polity Press). Caulton, T. (1998) Hands-on Exhibitions (London and New York: Routledge). Davidson, G. (2001) ‘National museums in a global age: Observations abroad and reflections at home’, in D. McIntyre and K. Wehner (eds) National Museums: Negotiating Histories (Canberra: National Museum of Australia/Centre for CrossCultural Research ANU/Australian Key Centre for Cultural and Media Policy/ Griffith University), pp. 12–28. Dean, D. (1994) Museum Exhibition – Theory and Practice (London/NY: Routledge). Duncan, C. (1998) ‘The art museum as ritual’, in D. Preziosi (ed.) The Art of Art History: A Critical Anthology (Oxford and New York: Oxford University Press): 473–85. Einreinhofer, N. (1997) The American Museum: Elitism and Democracy (London and Washington, DC: Leicester University Press). Environmetrics (2000) Leisure and Change: Implications for Museums in the 21st century (Report prepared for the Powerhouse Museum by the School of Leisure Sport and Tourism, University of Technology, Sydney). Falk, J. and L. Dierking (1992) The Museum Experience (Washington, DC: Whalesback Books). —— (1995) Public Institutions for Personal Learning: Establishing a Research Agenda (Washington, DC: American Association of Museums). 
—— (2000) Learning from Museums: Visitor Experiences and the Making of Meaning (New York/Oxford: AltaMira Press). Ferguson, L., C. MacLulich and L. Ravelli (1995) Meanings and Messages: Language Guidelines for Museum Exhibitions (Sydney: Australian Museum). Gay, P. (1984) Age of Enlightenment (Amsterdam: Time-Life).
Greenhalgh, P. (1989) ‘Education, entertainment and politics: Lessons from the great international exhibitions’, in P. Vergo (ed.) The New Museology (London: Reaktion Books), pp. 74–98. Griffin, D. and T. Sullivan (1997) ‘Shared histories.’ Muse, March/April, 3: p. 11. Halliday, M. A. K. (1992) ‘How do you mean?’, in M. Davies and L. Ravelli (eds) Recent Advances in Systemic Linguistics (London: Pinter), pp. 20–35. Halliday, M. A. K. (1993) Language in a Changing World (Canberra, ACT: Applied Linguistics Association of Australia, Occasional Paper 13). Halliday, M. A. K. and C. M. I. M. Matthiessen (1999) Construing Experience through Meaning: A Language-Based Approach to Cognition (London: Cassell). Hein, G. E. and M. Alexander (1998) Museums: Places of Learning (Washington, DC: American Association of Museums). Hofinger, A. and E. Ventola (2004) ‘Multimodality in operation: Language and picture in a museum’, in E. Ventola, C. Charles and M. Kaltenbacher (eds) Perspectives on Multimodality (Amsterdam: John Benjamins), pp. 193–209. Hooper-Greenhill, E. (1991) Museum and Gallery Education (Leicester: Leicester University Press). —— (1992) Museums and the Shaping of Knowledge (London: Routledge). —— (2000) Museums and the Interpretation of Visual Culture (London and New York: Routledge). Hubbell Mackinney, L. (1996) To See 'Em Live Brings 'Em More Into Memory: Front End Interviews About Invertebrates with Visitors to the California Academy of Sciences (California: California Academy of Sciences). Jordanova, L. (1989) ‘Objects of knowledge: A historical perspective on museums’, in P. Vergo (ed.) The New Museology (London: Reaktion Books), pp. 22–40. Karp, I., C. Mullen Kreamer and S. D. Lavine (eds) (1992) Museums and Communities: The Politics of Public Culture (Washington, DC/London: Smithsonian Institution Press). Kelly, L. (1996) ‘Tracking study results: Frogs exhibition’, Unpublished paper (Centre for Evaluation and Audience Research, Australian Museum, Sydney).
Kelly, L. (1997) ‘Tracking study results: SEX It’s only natural exhibition.’ Unpublished paper, Centre for Evaluation and Audience Research, Australian Museum, Sydney. —— (2000) ‘Understanding conceptions of learning’, in Change and Choice in the New Century: Is Education Y2K Compliant? (Proceedings of the Change in Education Research Group Conference: Sydney), pp. 115–21. Kelly, L. and P. Gordon (2002) ‘Developing a community of practice: Museums and reconciliation in Australia’, in R. Sandell (ed.) Museums, Society, Inequality (London/ New York: Routledge), pp. 153–74. Kotler, N. and P. Kotler (1998) Museum Strategy and Marketing (San Francisco: Jossey-Bass Publishers). —— (2000) ‘Can museums be all things to all people? Missions, goals and marketing’s role.’ Museum Management and Curatorship, 18(3): pp. 271–87. Lave, J. and E. Wenger (1991) Situated Learning: Legitimate Peripheral Participation (Cambridge: Cambridge University Press). Leinhardt, G., K. Crowley and K. Knutson (eds) (2002) Learning Conversations in Museums (London: Lawrence Erlbaum Associates Publishers). Lynch, R., C. Burton, C. Scott, P. Wilson and P. Smith (2000) Leisure and Change: Implications for Museums in the 21st Century (Sydney: UTS Powerhouse Publishing). Macken-Horarik, M. (1996) ‘Construing the invisible: Specialized literacy practices in junior secondary English.’ Unpublished PhD thesis (Sydney: University of Sydney).
Macken-Horarik, M. (1998) ‘Exploring the requirements of critical school literacy: A view from two classrooms’, in F. Christie and R. Misson (eds) Literacy and Schooling (London/New York: Routledge), pp. 74–103. MacLulich, C. (1993) ‘Off the wall: Theory and practice in the language of exhibition texts in museums,’ Unpublished Master of Letters thesis (Sydney: University of Sydney). Martin, J. R. (1992) English Text (Philadelphia and Amsterdam: John Benjamins). —— (1997) ‘Analysing genre: Functional parameters’, in F. Christie and J. R. Martin (eds) Genre and Institutions: Social processes in the Workplace and School (London: Cassell), pp. 3–39. Martin, J. R. and D. Rose (2003) Working with Discourse: Meaning Beyond the Clause (London and New York: Continuum). Martin, J. R. and M. Stenglin (2007) ‘Materialising reconciliation: Negotiating difference in a post-colonial exhibition’, in T. Royce and W. Bowcher (eds) New Directions in the Analysis of Multimodal Discourse (Mahwah, NJ: Lawrence Erlbaum), pp. 215–38. Marton, F., G. Dall’alba and E. Beaty (1993) ‘Conceptions of learning.’ International Journal of Educational Research, 19(3): pp. 277–300. Masters, S. (2003) ‘The use of live animals in museum exhibitions’, in L. Kelly and J. Barrett (eds) UNCOVER VOLUME 1 (Sydney: Australian Museum), pp. 127–34. McClellan, A. (1994) Inventing the Louvre; Art, Politics, and the Origins of the Modern Museum in Eighteenth Century Paris (Cambridge: Cambridge University Press). Ozouf, M. (1988) Festivals and the French Revolution (Cambridge, MA: Harvard University Press). Pang, A. (2004) ‘Making history in from colony to nation: A multimodal analysis of a museum exhibition in Singapore’, in K. O’Halloran (ed.) Multimodal Discourse Analysis (London/New York: Continuum), pp. 28–54. Purser, E. (2000) ‘Telling stories: Text analysis in a museum’, in E. Ventola (ed.) Discourse and Community: Doing Functional Linguistics (Tübingen: Günter Narr Verlag), pp. 169–98.
Ravelli, L. (1996) ‘Making language accessible: Successful text writing for museum visitors.’ Linguistics and Education, 8(4): pp. 367–87. —— (1998) ‘The consequences of choice: Discursive positioning in an art institution’, in A. Sanchez-Macarro and R. Carter (eds) Linguistic Choices across Genres: Variation in Spoken and Written English (Amsterdam and Philadelphia: John Benjamins), pp. 136–53. Ravelli, L. J. (2006) Museum Texts: Communication Frameworks (London/New York: Routledge). Rojek, C. (1995) Decentering Leisure (London: Sage). Saatchi and Saatchi (2000) Australians and the Arts: What Do the Arts Mean to Australians? (Surry Hills, NSW: Australia Council). Scott, C. (2001) ‘Future Shots.’ Humanities Research, 8(1): pp. 68–70. Serrell, B. (1996) Exhibit Labels: An Interpretive Approach (California: Alta Mira Press). Stenglin, M. (2004) Packaging Curiosities: Towards a grammar of three-dimensional space (PhD thesis: University of Sydney). —— (2007) ‘Making art accessible: Opening up a whole new world.’ Visual Communication, special edition, Immersion 6 (2): pp. 202–13. —— (2008) ‘Olympism: How a bonding icon gets its “charge” ’, in L. Unsworth (ed.) Multimodal Semiotics and Multiliteracies Education: Transdisciplinary Approaches to Research and Professional Practice (London: Continuum), pp. 50–66.
—— (forthcoming for 2009a) ‘Space Odyssey: Towards a social semiotic model of 3D space.’ Visual Communication. —— (forthcoming for 2009b) ‘Binding: A resource for exploring interpersonal meaning in 3D space.’ Social Semiotics 18(4). van Leeuwen, T. (1998) ‘Textual space and point of view’, paper presented to the Museums Australia State Conference, Who sees, who speaks – voices and points of view in exhibitions, Australian Museum, 21 September. Vergo, P. (ed.) (1989) The New Museology (London: Reaktion Books). Verlarde, G. (1988) Designing Exhibitions (London: Design Council). Weil, S. E. (1990) Rethinking the Museum: And Other Meditations (Washington, DC and London: Smithsonian Institution Press). —— (1995) A Cabinet of Curiosities: Inquiries into Museums and Their Prospects (Washington, DC and London: Smithsonian Institution Press). White, P. (1994) ‘Images of the shark: ‘Jaws’, gold fish or cuddly toy? An analysis of the Australian museum’s shark exhibition from a communicative perspective.’ Unpublished monograph (Sydney: Department of Linguistics, University of Sydney).
15 Floods and Fidget Wheels: A Comparative Systemic Functional Analysis of Slessor’s ‘Five Bells’ and Olsen’s ‘Salute to Five Bells’
Kathryn Tuckwell
15.1 Introduction
As evidence for multisemiotic isomorphism, this chapter presents a comparative analysis of two ‘highly valued texts’ (see Halliday 2002 [1981], p. 229): the poem ‘Five Bells’ (1939) by the Australian modernist poet Kenneth Slessor, and the mural that pays homage to the poem, ‘Salute to Five Bells’ (1973), by the contemporary Australian painter John Olsen. Before this analysis is presented, however, some theoretical issues surrounding the notion of isomorphism are outlined in Section 15.2, focusing first on a more general idea of isomorphism – that is, that a powerful framework for semiotic analysis can display the common features of meaning-making across all semiotic modes – and then moving on to consider the particular rigor provided by a cross-semiotic comparison of works of art with similar themes. Section 15.3 gives some background to the two texts, and then the analytical findings are presented in Section 15.4. Section 15.5 concludes the chapter with a brief return to the theoretical issues and a perspective on the potential for further research.
15.2 Theoretical considerations
This section begins by defining (in Section 15.2.1) what is meant in this chapter by the term ‘isomorphism’, and discussing the value of a multidimensional framework in displaying such structural similarities across semiotic modes. Section 15.2.2 considers the particular theoretical relevance and rigor of the type of comparative analysis that is presented in this chapter, that is, a comparison not just between two texts but between two works of art with similar themes. Finally, in Section 15.2.3, the notion of
‘resemiotization’ will be briefly considered, although it will not be revisited in any detail in this chapter.
15.2.1 Isomorphism
‘Isomorphism’ – from the Greek isos (equal) and morphe (form) – is a term usually used in relation to biological, mathematical or chemical entities that are analogous in structure while not necessarily being related in other ways, for example ancestry (for biological organisms) or composition (in relation to chemical, particularly crystalline, substances) (Dictionary.com Unabridged (v 1.1)). In the glossary of The Language of Displayed Art, O’Toole (1994, p. 280) defines isomorphism as ‘the state of sharing the same structural properties’. When discussing isomorphism explicitly, O’Toole (1994, pp. 149–51) notes that ‘Both Boris Uspensky and Michael Halliday have demonstrated the universality of certain semiotic processes’ [my emphasis], and that Uspensky is interested in ‘structural isomorphism, comparable types of semiotic patterning in literature and painting’ [my emphasis]. He goes on to cite the metafunctions as ‘Halliday’s major contribution to linguistics’ (ibid., p. 149) and as the feature of the systemic functional linguistics (SFL) framework that allows cross-semiotic comparison, saying (ibid., p. 151):
My thesis throughout this book is that Halliday’s three functions are valid as general semiotic mechanisms, that they are realized systemically in painting, sculpture, architecture, music and poetry, and that they make possible a comparative semiotics, not just between texts in a single medium, but across semiotic codes. [my emphasis]
This statement of O’Toole’s thesis, along with his use of terms such as ‘processes’, ‘patterning’ and ‘mechanism’ rather than the term ‘form’, suggests that we should think of ‘universality’ and ‘isomorphism’ not as equivalences in form as such, but as equivalences in the general principles that motivate form.
Thus, in general, we can make the same argument about isomorphism across semiotic systems that SFL-typologists make about ‘universal’ categories across languages: that in order to generate useful analytical categories for any meaning-making system, we need to begin not with the categories of another meaning-making system (in the case of typology: the categories of another language, such as Latin or English) but with a powerful theory of meaning-making, with which we can make descriptions of actual instances of meaning-making and, from these, descriptions of relevant categories (see Caffarel, Martin and Matthiessen, 2004 for an expansion of this point). The power of SFL as a descriptive tool lies in its multidimensionality, with the dimensions including metafunction, stratification, instantiation, axis, delicacy and rank (see, e.g. Caffarel, Martin and Matthiessen, 2004: Section 15.1.3). While O’Toole (1994) does not explicitly incorporate dimensions other than metafunction and rank into his model,
or discuss them explicitly with respect to isomorphism, it is clear from other discussions in the book that he is working with the other dimensions (see, e.g. his defence of ‘the role of semiotics in relation to art history, criticism and teaching’ in Chapter 5). Thus, to quote again from the glossary (O’Toole, 1994, p. 280):
A central argument of this book is that a functional-systemic approach to the visual arts enables us to recognize and describe similarities in structure (isomorphism) between painting, sculpture and architecture, and between any of them and other semiotic systems such as language, or gesture, or dance, etc. ... This does not depend, of course, on their having the same subject matter as the works by Bruegel and Auden do.
15.2.2 Isomorphism between ‘highly valued texts’
The last sentence of the above quotation refers to O’Toole’s (1994: Chapter 4) detailed comparison of Auden’s poem ‘Musée des Beaux Arts’ (1939) with the painting to which it makes reference, Pieter Bruegel the Elder’s ‘Landscape with the Fall of Icarus’ (c. 1558). Although O’Toole asserts here that isomorphism exists between semiotic systems regardless of ‘subject matter’ – which certainly fits with the understanding of isomorphism outlined above – examples like the paired poem and painting that he analyses, and those analysed in this chapter, provide a particularly rigorous test of the theory. This is not just because, following accepted scientific practice, the comparison of texts with the same ‘subject matter’ reduces one source of variability between the entities being compared. The rigor stems in part from the nature of verbal art, where ‘... the role of language is central. Here language is not as clothing is to the body; it is the body’ (Hasan, 1985, p. 91), and where there is a consistency of foregrounding, such that ‘various foregrounded patterns point towards the same general kind of meaning’ (ibid., p. 95). According to Hasan (ibid., pp.
94–9), the patterning of such patterns construes and constructs the literary theme of a work of verbal art, which Hasan (ibid., p. 97) describes as ‘the deepest level of meaning in verbal art’. If we have a framework for analysis that can display how the patterning of patterns of highly valued texts in other semiotic modes construes and constructs their artistic themes, and if we have two texts in different semiotic modes that are considered to evoke the same kinds of artistic/literary themes, then being able to show structural isomorphism between those two texts is significant evidence for isomorphism between semiotic systems. O’Toole’s (1994) analyses of individual highly valued texts – paintings, sculptures and architectural works in different styles – show that his framework does indeed allow the description of patternings of patterns in these texts and their relationship to the ‘themes’ of the works in context. His comparative analysis of Auden and Bruegel makes an even more convincing argument for isomorphism, showing that there are equivalences between
the foregrounded patterns across the two works. It is this kind of evidence that the current chapter attempts to display, and it is hopefully valuable not just as another instance of the same thing, but also as an instance of a slightly different type, in that (1) the relationship between the two texts is inverted in my study – that is, the painting is a homage to a pre-existing poem, rather than the other way around, and (2) Olsen’s mural is abstract, whereas the Bruegel painting is realist, allowing a consideration of the effect of the ‘play’ of meaning allowed/required by the abstract style on the capacity of the analytic framework to demonstrate isomorphism.
15.2.3 Resemiotization
A final theoretical point before moving on to the texts themselves is the relevance of ‘resemiotization’ (Iedema, 2001, 2003) to the kind of comparative analysis presented in this chapter. In contrast to ‘multimodality’, which is ‘concerned with the multi-semiotic complexity of a construct or a practice’ (Iedema, 2003, p. 40), ‘resemiotization’ has ‘not so much to do with the semiotic complexity of particular representations as with the origin and dynamic emergence of those representations’ (ibid.):
Resemiotization is about how meaning making shifts from context to context, from practice to practice, or from one stage of a practice to the next. (Iedema, 2003, p. 41)
One of the main examples that Iedema uses to develop this concept is the process of planning (and finally building) alterations to a mental health facility, which begins with face-to-face interactions between the stakeholders, which are then summarized in writing in the planner’s report, and then incorporated into the design proposals (see Iedema, 2001, 2003, pp. 42–3). Iedema (2003, p. 43) notes that ‘[t]hese transitions embedded the project’s progress in an increasingly durable and expensive – and therefore resistant – materiality’ [original emphasis].
While these particular kinds of transitions (in durability and cost) may not be immediately or obviously relevant in the ‘translation’ of a cultural artefact into another semiotic mode, the general principle – an investigation of how the meaning-making shifts between modes and between contexts – is certainly relevant, but probably a topic for another paper. As noted in the Introduction, I will return briefly to the potential for further investigations in this area in Section 15.5.
15.3 The texts
The two texts to be analysed are presented in Appendices 15.1 to 15.3. Appendix 15.1 is the first four and last two stanzas of Slessor’s poem, which have been divided into clauses and annotated with some features of the grammatical analysis (the focus on these stanzas is explained in
Section 15.3.2 below). Appendices 15.2 and 15.3 are images of Olsen’s mural: Appendix 15.2 is a shot of the whole mural, while Appendix 15.3 shows a detail. As already noted, both of these texts are ‘highly valued texts’ in Australia – and particularly in Sydney, as they are considered particularly evocative of Sydney Harbour. Both Olsen and Slessor are regarded not just as highly skilled artists in their respective fields, but as artists who are particularly skilled at evoking what might be thought of as a ‘modern’ perspective on Australian life; both artists’ oeuvres include sympathetic portrayals of both rural and urban landscapes and lives.1 The poem and/or painting have inspired a number of artists to produce (further) reinterpretations: an early publication of the poem (Five Bells: XX Poems, 1939) included illustrations by Slessor’s friend Norman Lindsay; and, as well as ‘Salute to Five Bells’, at least two of Olsen’s other paintings refer to the poem in their titles: ‘Five Bells, 1963’ and ‘The Sea Sun and Five Bells’ 1964 (Hart, 1991, p. 113). There are also at least seven pieces of music that make reference to the poem and/or painting: five classical compositions (Marcellino, 1984; Gyger, 1990, 1998; Beilharz, 1996; Sculthorpe, 2001) and two jazz pieces (‘Five Bells Suite’ by Miroslav Bukovsky, first played by Ten Part Invention in 2005 (see Westwood, 2005), and ‘Five Bells’ by Paul Grabowsky, recorded on the Allan Brown’s Australian Jazz Band album Five Bells and Other Inspirations, 2006). The fact that the poem and painting have been widely and repeatedly reinterpreted in a variety of modes again suggests their continued currency and value in Australian culture; there is obviously also a rich potential for multimodal work here, which again will be touched on in the final section of this chapter.
15.3.1 The poem
Slessor wrote ‘Five Bells’ in 1939, as an elegy for his friend Joe Lynch.
Lynch was a ‘black-and-white’ artist on the same Sydney newspaper where Slessor was a journalist, and they were great friends. One night in 1927, a group of friends including Slessor and Lynch were on a Sydney ferry on the way to a party – according to Slessor, Lynch’s coat pockets full of bottled beer – when it was suddenly noticed that Lynch was missing. The ferry hove to and the night water was searched, but there was no trace of Lynch, and his body was never recovered. So the ‘Joe’ of the poem is Joe Lynch. (See Stewart, 1977: Chapter 6 for a discussion of different versions of Lynch’s disappearance). The poem is quite long; the extracts in Appendix 15.1 represent about half of the complete poem. From approximately the 1950s to the 1980s, the poem was widely studied in Australian high school English classes, and there would be few students who attended a New South Wales high school during that period who have not read and studied the poem. Despite this, or perhaps because of it (Slessor himself refers to the detrimental effect that
studying a poem can have on the student’s opinion of the poem or poet (Slessor 1965 [1993], p. 128)), the poem emerged as ‘Australia’s favourite poem’ in a 1998 poll conducted by the Australian Broadcasting Corporation (the national public broadcaster). The poll ran for six weeks and 20,000 votes were cast; the voters were not constrained to vote for an Australian poem or poet.
15.3.2 The mural
Olsen’s mural hangs in the North Foyer of the Sydney Opera House Concert Hall, opposite the tall glass windows that face Sydney Harbour. It is painted in synthetic polymer on eight hardwood panels; it is 21.3 metres wide and 2.9 metres tall, and hangs on a slightly curved wall.2 Olsen was commissioned to paint a mural for the Opera House in 1971 by the William Dobell Foundation. He painted the mural during 1972–73, first in a nearby warehouse, and then on site, with the workers completing the building (and jeering insults about the painting) around him. He decided to paint a homage to Slessor’s poem partly because the subject matter seemed appropriate for a building surrounded on three sides by water: Sydney Opera House is on a small peninsula that juts into the harbour adjacent to the city’s central ferry wharf and the Sydney Harbour Bridge. He also felt, as an artist who frequently took inspiration from poetry, that poetry should have a place in the building where so many other arts would be represented (including music, theatre, dance, sculpture and, of course, architecture). However, Olsen felt that the middle of the poem was too long. In his journal of the painting process, he writes out some of the lines he particularly likes, which, unsurprisingly, come from the parts of the poem that evoke the harbour most immediately, the beginning and the end; hence the focus in this chapter on those parts of the poem. As noted at the beginning of this section, approximately ten years before he painted ‘Salute to Five Bells’, Olsen had created two other works with titles that refer to Slessor’s poem. One of these, simply called ‘Five Bells’, is almost as well known as the mural, as it has hung in a prominent position in the Art Gallery of New South Wales since the gallery purchased it from private collectors in 1999. However, as Hart (1991, pp. 113–14) points out:
In these earlier works, the links with the poem were much more tenuous than in the mural.
They were based on a feeling of shared affection for the harbour which continued in Salute to Five Bells, but were essentially full of youthful vigour and ebullience, removed from the central elegiac mood of the poem which provided the springboard for the mural.
Although comparison between the poem and these other paintings would also be interesting, the comparison with ‘Salute to Five Bells’ provides a more rigorous test of the theoretical framework, as outlined in Section 15.2.2
above, since the mural is considered to be more closely aligned with the themes of the poem. It is to this comparison that we will now turn.
15.4 The analysis
As indicated in Section 15.1, the analysis below draws largely on Halliday’s SF-grammar (following Halliday and Matthiessen, 2004) and O’Toole’s SF-semiotics of art (following O’Toole, 1994, 1999). The analysis of the mural also draws on Kress and van Leeuwen (1996), whose SF-framework for visual analysis is largely compatible with O’Toole (1994) in terms of its systems and functions (although it does not incorporate the dimension of rank scale as O’Toole’s framework usefully does), but which provides more detailed explanations of the options within systems such as Gaze. Table 15.1 is adapted from O’Toole (1994), showing the terminology for the ranks, functions and systems of his framework that will be used in the analysis below. The following discussion of the analyses is organized by metafunction, beginning with the engagement (interpersonal) metafunction, moving on to the representational (experiential) and finishing with the compositional (textual). Each section will deal with just one or two systems from the relevant metafunction that are very obviously realized in the painting, and the corresponding grammatical and semantic features of the poem; there is clearly much more that could be said about the analyses of these highly valued texts than it is possible to fit into a chapter of this size.
15.4.1 Engagement and interpersonal meanings
This section considers three systems of the Engagement function of the mural in turn – Rhythm (in Section 15.4.1.1), Gaze (in Section 15.4.1.2) and Perspective (in Section 15.4.1.3) – with each of these subsections also discussing linguistic features of the poem that correspond with features in these systems in the painting. The final sub-section of this section briefly discusses the modalizing function of abstraction in the mural.
15.4.1.1 Rhythm and regularity
O’Toole (1994, p. 5) points out that ‘the best way to start talking about a picture when you’re standing in front of it in a gallery is in terms of how it engages your attention and thoughts and emotions’. Although it is probably not apparent from the image of the mural in Appendix 15.2, the most engaging thing about the mural when standing in front of it at the Opera House is its colours (which are more accurately reproduced in the detail in Appendix 15.3): the calm expanse of beautiful violet-blue, and the vivid, almost fluorescent, figures. While the size of the mural is initially somewhat overwhelming, the fact that the deep blue is used as quite a flat background across the whole mural means that it counteracts, to some extent at least, the sense that the mural is looming over the viewer. This can be seen as a contribution to the even, gentle, wave-like Rhythm of the mural, which is one of the engagement systems at the rank of work in O’Toole’s schema. Other elements that contribute to the Rhythm are the repetition of small geometric figures, the repetition of small pieces of bright colour (particularly red and yellow), and the long, wavy horizontal lines of different colours. There is no particular direction to this Rhythm; that is, there are no consistent vectors or pathways that suggest a general movement to the left or right – the figures in general seem to be floating rather than progressing in a particular direction.

Table 15.1 Functions and systems in painting, following O’Toole (1994: Chapter 1)

| Rank ↓ / Metafunction → | Representational | Engagement (formerly ‘Modal’) | Compositional |
|---|---|---|---|
| Work (whole painting) | Narrative themes; Scenes; Portrayals; Interplay of episodes | Rhythm; Gaze; Frame; Light; Perspective; Modality | Gestalt: Framing, horizontals, verticals, diagonals; Proportion; Line; Rhythm; Geometric; Colour; Cohesion |
| Episode | Actions, events; Agents-patients-goals; Focal/side sequence; Interplay of actions | Relative prominence: Scale, centrality interplay of modalities | Relative position in Gestalt and to each other; Alignment; Coherence; Interplay of related forms |
| Figure | Character, object act/stance/gesture clothing components | Gaze; Stance; Characterization; Contrast: Scale, line, light, colour | Relative position in Gestalt, in episode, and to each other; Parallelism; Opposition; Subframing |
| Member | Part of body/object; Natural form | Stylization | Cohesion: Reference, parallelism, contrast, rhythm |

This even Rhythm of the mural reflects the literary choice that Slessor made with the poetic rhythm of ‘Five Bells’, which is a very even pentameter, but it also reflects the ‘regular’ grammatical choices made in the interpersonal function of the poem. While in the segments of the poem shown in Appendix 15.1 there are several small ‘clusters’ where this regularity is broken (e.g. the interrogatives of clauses 9 and 10, the imperatives
of clauses 19–21, and the modal finites in clauses 37–47), in the analysis of the whole poem, the great majority of clauses are declarative and unmodalized. In both the painting and the poem, these consistencies provide a regular background for more unusual choices in the other systems to be discussed in Section 15.4.1.2.

15.4.1.2 Gaze and direct address
O’Toole (1994, p. 8) makes a quite direct link between Gaze and direct address as follows:

The most obvious way that the Interpersonal function is realized in language ... is in direct address: the speaker uses a form of address of the pronoun ‘you’ – and interpersonal contact is established. Quite often paintings address us directly too. A clear example of this Modal function is the direct Gaze of one or more of the figures straight at us, the viewers.

Kress and van Leeuwen (1996, p. 121ff) make a similar link between Gaze and speech roles, referring to direct gaze at the viewer as a visual demand, and absence of gaze as a visual offer. In the mural, however, the selections of Gaze cannot be simply classified as ‘present’ or ‘absent’, or even ‘direct’ or ‘oblique’. There are no obviously human figures in the mural, but there are several figures that look like sea creatures, and which have a dot which can be interpreted as an eye; there are also a number of figures that consist of dots within concentric circles, which again might be interpreted as eyes. However, while these circular figures and elements resemble eyes viewed ‘front on’, none of these ‘eyes’ are paired, so that the Gaze is not straightforwardly ‘direct at viewer’. The effect is rather unsettling, comparable perhaps to looking at a fish in a tank or a bird in a cage, and not really knowing if they are looking back, or the extent to which they are conscious of your Gaze. Therefore the vacillation between thinking the eyes are looking at me, and knowing, or thinking I know, that they cannot be looking back at me in the same way I am looking at them, is a source of emotional tension in the painting.

Slessor’s use of direct address is a similar source of dissonance in the poem. There is the dissonance that occurs in any form of poetry (odes, elegies, etc.)
where the first-person persona (I-persona) of the poem addresses a second-person persona (you-persona), which is the tension over how much the poet aligns with the I-persona, and how much the reader aligns with either persona. In ‘Five Bells’, there is additional tension created by the fact that the first time ‘you’ is used, it is in a clause with the vocative ‘dead man’, who has, in the first stanza, been identified as Joe. The reader is thereby engaged and positioned as addressee by the repeated use of first- and second-person pronouns, as well as the clusters of interrogatives and imperatives noted above, but s/he simultaneously knows that s/he cannot really position him/herself as either ‘you’ or ‘I’, because we know that ‘I’ is not ‘I’, the reader, but ‘I’, the writer or narrator, and ‘you’ is Joe, the dead man. In this way, the Gaze in the painting and the direct address in the poem are construing a very similar interpersonal disorientation for the viewer and reader.

15.4.1.3 Perspective
One way of engaging a viewer and drawing them into a painting is through the use of classical linear (‘vanishing point’) perspective that draws the eye to a single point ‘within’ the painting. The mural, however, uses a different selection within the system of Perspective, which is planar rather than linear, and is (like the Rhythm) realized largely through the use of colour, or more particularly colour contrasts: the deep violet-blue is the deepest plane; the intense reds and yellows and fluorescent greens are on a plane ‘closest’ to the viewer; and the brighter blue and deeper green figures seem to be on an intermediate plane. The viewer is not particularly ‘drawn in’ by this simple perspective, as s/he would be by a vanishing point perspective, but it allows the viewer to position him/herself in terms of relative distance from figures or elements in the painting, that is, to answer the question ‘How far am I from these elements?’

15.4.1.4 Modality and the function of abstraction

However, at the same time as the planar Perspective allows us to sense how far we are from elements in the mural, aspects of the Representational function (discussed in Section 15.4.2.1) make it difficult for the viewer to know the answer to the question ‘Where am I in relation to these elements?’ This is another example of the simultaneous knowing and not knowing, or rather the rapid vacillation between these contrasting positions (like that which occurs when viewing a Necker cube or a drawing by Escher), which is a recurring source of conflict in both the painting and the poem. In O’Toole’s systematization of Modality, the degree of certainty or uncertainty in a work of art is construed by the degree of verisimilitude or abstraction. This seems to be a point at which the Representational and Engagement functions are inextricably linked: Olsen’s representation of Sydney Harbour is neither lifelike nor entirely abstract, so there are clues that make the viewer think they know something about the Representational meanings, but the degree of abstraction means that the viewer can never be entirely sure about those meanings. Similarly, in the poem, as already mentioned, there is very little modality, but the Experiential analysis does reveal several examples of ambiguity that we might regard as being similar to the abstraction in the mural.

15.4.2 Representational and experiential meanings
Considering the point just noted about the function of abstraction in the mural, we will now move to the representational function, without fully leaving interpersonal meanings behind. This section is divided into just two parts, the first dealing with representational meanings in the mural, and the second with corresponding experiential meanings in the poem.

15.4.2.1 The mural: Actions, events and participants

The key to saying something about what Olsen’s mural ‘represents’ seems to lie in the answers to the questions ‘What are all these things?’ and ‘What are all these things doing?’ These questions largely correspond to elements of O’Toole’s (2004) model at the level of Episode, that is, Actions and Events, and the participants in those Actions and Events (Agents, Patients and Goals). The fact that the answers to these questions have to be ‘worked out’, as they will be below – that is, the fact that the answers to these questions are not necessarily obvious and completely agreed upon by all viewers – is a result of the abstraction in the painting. The process of working out the answers, of making Representational meaning from the abstract forms, is something of a compulsion for the viewer, and is therefore a strongly interpersonal exercise.

Colour was again my starting point in trying to analyse the Representational meanings in the mural: it seemed safe to assume that the broad expanse of blue was a broad expanse of water. The two large fish shapes just to the left and right of the centre of the painting seem the most recognizable figures in the mural, and their presence confirms the assumption that the blue is water. Recognition or identification of other figures in the painting depends very much on relational process questions: Do the figures have some attributes that help identify them? (That is, can I classify or identify them using a possessive relational process?) Are they like fish or unlike fish? (Can I classify or identify them using an intensive relational process?) What might they be, in light of the fact that their circumstantial Attribute is water?
(Can I classify or identify them using a circumstantial relational process?) So, some of the figures seem recognizable as animals (rather than objects) because they have eyes; they do not look like fish but they live in water, so they are presumably types of sea creature. There are also figures that look manmade, such as the black object at the bottom of the painting, just to the left of the centre: taken in isolation, one might perceive it to be a crane or (with the red underneath) a cross-section of a mine or an oil well. However, because this object is physically related to water (it is in, on, near or by the water) it is perceived as probably being part of a boat, or possibly a crane on the docks.

In terms of the actions and events represented in the painting, it has already been noted (with respect to the lack of direction to the mural’s Rhythm) that the figures appear to be floating, and this again depends to some extent on interpreting the blue as water, and thus interpreting any Actions and Events as occurring in the context of water. Part of what contributes to the sense of floating – which we might construe in linguistic
terms, as intransitive and undirected action – is that there is very little interplay between figures, such as strong directed lines or vectors, that suggests that the figures are acting on or moving towards – or even interacting with – each other.

Identifying the figures and perhaps partly understanding some of the representational meanings is more of a hindrance than a help in answering the relational process question mentioned in the discussion of perspective in Section 15.4.1.3: Where am I in relation to the elements in this painting? While looking at the sea creatures, particularly the side-on view of the fish, they seem to be swimming and floating past the viewer: the viewer is under the water with them. But when looking at the little crescent moon on the upper-left side of the painting, I am not sure if I am looking up into the sky (from somewhere on land or from under the water) or if I am on land or in the air and looking down onto a reflection in the water. If the crane is part of a boat, am I looking at it from directly above (from somewhere in the air) or directly below (from in the water)? So once again, there is this slightly uncomfortable feeling of thinking you know something, but in the next instant knowing something quite different. In deriving meaning from the mural, it is this discomfort, rather than actually knowing what the figures are, what they are doing and where one is in relation to them, that is important: if this type of absolute knowledge was essential to making meaning, Olsen would have painted instantly recognizable figures rather than abstract ones. In the poem too, the lexicogrammar encodes this type of ambiguity, as will be discussed in Section 15.4.2.2.

15.4.2.2 The poem: Processes, participants and circumstances
In the same way that many figures in the mural are not instantly recognizable, but the viewer can ‘work out’ what they might be, nominal groups like ‘little fidget wheels’, ‘deep and dissolving verticals of light’ and ‘waves with diamond quills and combs of light’ are not signifiers that have specific signifieds in the world outside of the poem, but readers can picture in their own minds or explain in their own words what they think the signifieds might be, in the context of the poem. The analysis of the experiential grammar also reveals similar ambiguities and dissonances in relationships between Participants, Processes and Circumstances in the poem, against a kind of ‘floating’ background seen in the transitivity. In order to give some feel for the patterning of process types and participants in the poem, some of the findings of the experiential analysis are tabulated in Table 15.2 below. This table shows the entities that are construed as ‘do-ers’ of processes in the extracts of the poem in Appendix 15.1, and lists the processes associated with those entities according to process type. Mental, verbal and behavioural processes have been grouped together as ‘typically human’ processes, and material processes have been separated into ‘middle’ and ‘effective’ voice.

Table 15.2 Experiential meanings in Five Bells

| Main participant (‘do-er’) | Mental, Verbal & Behavioural processes | Material: effective | Material: middle | Relational processes |
|---|---|---|---|---|
| Time | | | | is not my Time ... (1); is over you (25) |
| I | have lived (2); think (9); felt (35); felt (36); could not feel (40); could find (41); could only find (42); could say (43); might not hear (47); looked (48); tried to hear (49) | thieve (10) | could not go (38) | was blind (37); was bound (39); all ([that I could hear]) was (50) |
| Joe | lives (3) | | | |
| light | | ferry (4) | hangs (8) | |
| (unspecified) | | rung (5) | | |
| water | | let (31) | pour (6); floats (7); ride over you (30); goes over (29) | is over you (23); is over you (24); are Water (32) |
| you | are shouting (17); cry (19); bawl (21) | squeezing (18); beat (20) | have gone from ... (11); gone even from ... (12); have gone (22) | have no suburb (28); are only part of an Idea (34); were here (44) |
| something | hits and cries against ... (15) | forms (14); beating (16) | | is (13) |
| mystery | | | | is over you (26) |
| memory | | | | is over you (27) |
| sea-pinks | | | bend (33) | are (34) |
| purpose | | gave (45); seized (46) | | |
| bells | | | ringing (51) | |

While there are many things that could be said about the experiential analysis, the main point to notice in regard to Table 15.2 is that the processes are generally lacking in dynamism (Hasan, 1985, p. 46) – the majority of processes construe mental or bodily states, relationships, and ‘happening’ rather than ‘doing’ types of
material action. The most dynamic processes – the effective material processes, and the processes that construe emotive verbal behaviour such as cry and bawl – are somewhat attenuated: some of them, such as ferry and let (down), are minimally effective and might better be analysed as middle-ranged; others have a reduced sense of dynamism because one or more of their participants are unspecified (rung), minimally specified (something hits and cries ...), dead (shout, bawl), or abstract (thieve, seized). In this way, the experiential patterns in the poem tend to construe the same sense of directionless floating that is construed by the lack of interplay and vectors noted in the mural.

Against this ‘floating’ background, there is similar ambiguity about the reader’s position in relation to the world of the poem as that seen in the painting. There are a number of ‘mismatches’ that make the world of the poem seem a bit topsy-turvy, such that the reader has the difficulty of recognizing entities and reconciling their location in relation to those entities, just as is the case in the mural. In the first stanza, for example, the poem opens with the idea of time passing at different speeds, and there is a mismatch between the length of the prepositional phrase complex that begins Clause 2, and the very short time period that it construes. At the end of the first stanza, Joe is described as long dead, but this is immediately followed by a hypotactic elaborating clause which tells us he ‘lives between five bells’. Similarly, throughout the poem, entities are construed as being in a strange relationship to one another; for example, in the last two lines of the second stanza (Clauses 7 and 8), ‘the Harbour floats / In air, the Cross hangs upside-down in water’.
In the first of these clauses, the Harbour, which as a body of water would normally be a Place circumstance of the process ‘to float’, is instead the first participant in this process; in the next clause, the constellation (the Southern Cross) ‘hangs’ in water (encoded as a Place circumstance) instead of in the sky. In the second to last stanza, the lines ‘But I was bound, and could not go that way, / But I was blind, and could not feel your hand’ (Clauses 37 to 40) create a dissonance that is emphasized by the parallelism. The first pair of clauses in this set is quite logical: if one is bound, one cannot go. The parallelism between the first pair and the second pair leads us to expect a similar logic, which is not so obviously there, so that we are forced to stop and ponder: Why does being blind prevent you from feeling?

In both the poem and the mural, then, the world of experience is construed in a less than straightforward way, such that represented entities are not completely unrecognizable or abstract, but require work on the part of the viewer and reader, respectively. The viewer/reader is compelled to divine these meanings, and this is another way (apart from the typical interpersonal systems) that the poem engages the reader, so that in both the poem and the painting, the experiential meaning-making resources are also a resource by which the viewer or reader is engaged.
15.4.3 Compositional and textual meanings
This section concludes the presentation of the analytical findings, and again (like the discussion of Representational and Experiential meanings in Section 15.4.2) is divided into two parts, the first dealing with the Compositional meanings in the mural, and the second with Textual meanings in the poem.

15.4.3.1 The mural: Horizontal stability
Many of the major compositional systems in O’Toole’s model (see Table 15.1) do not seem relevant to ‘Salute to Five Bells’, because of its size and shape: there is no obvious use of geometry or framing in the painting. The mural itself is long and heavily horizontal, emphasized by the even, deep violet blueness of the background, which gives the painting a great deal of stability. There are no uplifting verticals or destabilizing diagonals; the most prominent vertical is the large anchor shape towards the left of the painting, and its downward-pointing arrowhead and heavy deep-blue lines add to, rather than detract from, the impression of stability. The profusion of foreground figures contrasts with this stability; their brightness, variation and detail make them seem more dynamic, and they are not restricted by grids or framing. However, the contrast in size between these figures and the mural as a whole means that their dynamism does not seriously disrupt the overall impression of stability. There is a great deal of cohesion in the work at the rank of figure or member: there is a lot of repetition of forms (e.g. circles, dots, and spidery lines) and colours (particularly bright red, bright yellow, black and the two shades of blue) from one end of the painting to the other. This repetition does not create parallelism: it links the figures, but also emphasizes their differences. For example, some circles are whole figures, and some are parts of other figures; and both the sea creatures and the manmade objects have spidery black lines attached; so while elements are repeated, no two figures are the same.

15.4.3.2 The poem: Textual stability

Textually, the poem is also very ‘stable’: most of the clauses are thematically unmarked, and there is a great deal of lexical cohesion, particularly several strong lexical chains: first- and second-person pronouns, and references to time, water, sound, light and life.
As with the ‘regular’ choices in the interpersonal metafunction, the predominance of unmarked Themes and the stability provided by the lexical chains are a background for the more ‘difficult’ choices in the other metafunctions. These lexical chains overlap and intertwine throughout the poem, and this increases the textual ‘stability’ of the poem, but it also leads to the type of collocational clashes that create interest and tension in the experiential metafunction.
There are many examples of the intertwining of lexical chains in the poem, but just two will be discussed briefly here. The first is in the second stanza, Clause 6: ‘Night and water / Pour to one rip of darkness’. In this clause, the nominal group that construes the conflated Subject, Actor and unmarked Theme, ‘Night and water’, brings together ‘water’ with ‘night’, which itself conflates ideas of ‘time’ and ‘light/dark’. The process ‘pour’ then construes that whole nominal group as some kind of liquid, and the circumstance of that process, ‘to one rip of darkness’, construes the darkness ambiguously through the word ‘rip’ – the darkness may be something solid that can be ripped, or perhaps it is rough water, in which there can be a ‘rip’ (undertow). Another example of this intertwining of lexical chains can be seen in the second to last stanza, Clause 24, ‘The turn of midnight water’s over you’: again, the first participant is encoded as a nominal group that draws together ‘water’ with ‘midnight’ (again conflating ideas of ‘time’ and ‘light/dark’); ‘turn’ in this context might be the turning of the tide (itself a marking of time) or the turning of a clock; and all of this is described in terms of a circumstantial Attribute where the location is ‘you’, thus linking these chains of ideas to the chain of personal pronouns.

Again, then, the poem and the painting show structural similarities with respect to their composition, textual organization and cohesion, and a similar patterning of patterns to that discussed earlier in relation to the interpersonal meanings: the compositional and textual functions, like the Engagement and interpersonal functions, set up a regularity, calmness and stability in the selections from some systems, which then act as a stable background for more challenging selections in other systems.
15.5 Conclusion

I will now turn, in this final section, to the relevance of the above-noted consistencies across metafunctions, and between the two works, to the overall themes of the works, and then conclude with a few remarks on aspects of this study that require further investigation.

15.5.1 Structural consistencies and underlying themes

In both the mural and the poem, there is a series of consistencies that set up a contrast between ‘stability’/ ‘recognizability’ and an unsettling ambiguity. In both the mural and the poem, there are elements in each metafunction that act as a background and allow the viewer or reader to feel what we might think of as ‘calm’ and ‘secure’. In the mural, these include even, peaceful selections in the systems of Colour and Rhythm, the choice of simple planar perspective, the recognizability of the blueness as representing water and the compositional stability of the long horizontal shape of the mural and the lack of disrupting verticals and diagonals. In the poem, the ‘background’ is the regular declarative mood, the lack
of modality, the use of mainly middle voice, the unmarked Themes and the lexical cohesion (as well as the even pentameter rhythm). Against this background are the elements that provoke conflict, uncertainty and emotional tension in the viewer or reader. In the mural, there is the unsettling selection within the system of Gaze, the degree of abstraction, the profusion of isolated figures and the contrasts in colour, size and activity between the background and foreground figures. In the poem, there is the unsettling nature of the direct address, the ambiguities and contradictions in the experiential meanings, and the collocational clashes that conflate water, light/dark, time, life/death and personal pronouns.

In an essay published in one of Sydney’s daily newspapers in 1967, Slessor describes an Arabian fairy tale where a man dips his head into a bowl of magic water for five seconds and within the time that his head is underwater, dreams that he is living his whole lifetime. Slessor (1967 [1993], p. 135) goes on to say:

This is partly the idea of ‘Five Bells’, a poem which suggests that the whole span of a human life can be imagined, and even vicariously experienced, in a flash of thought as brief as the interval between the strokes of a bell. ... The poem is therefore on two planes. First it attempts to epitomize the life of a specific human being, but fundamentally it is an expression of the relativeness of ‘time’.

The opening lines of the poem express this relativeness fairly directly, through a negative identifying relational process clause in which the two participants are two different ‘types’ of time. The relativeness is then demonstrated in the marked topical Theme of the next clause, which is a very long prepositional phrase complex encoding a Time circumstance that in fact construes a very short period of time (the seconds between the strokes of the ship’s bell).
The line ‘Five bells’ is repeated three times in the poem (twice in the section omitted from Appendix 15.1), as a reminder that while the life of Joe is being construed in a quite long poem, time is passing very rapidly. These relatively obvious construals of the relativeness of time are then repeated fractally and in a more subtle manner throughout both the poem as a whole and the mural. That is, through consistencies across ranks and metafunctions (as summarized above), the contrast between the little fidget wheels and the flood that does not flow of the first line of the poem is repeated in a variety of ways in both the mural and in the poem to which it pays homage.

15.5.2 Some final considerations about isomorphism and resemiotization

As was noted in Section 15.2, a similarity of subject matter is not necessary for demonstrating isomorphism between semiotic systems, but it does
provide a rigorous test of the theory, as demonstrated by O’Toole’s (1994) comparative analysis of Auden and Bruegel, and hopefully also by the analyses in this chapter. Some obvious further tests of the theory would be analyses of Olsen’s other ‘Five Bells’ paintings, and a comparative SF-semiotic analysis of one or more of the musical pieces that make reference to the poem and/or mural, using or adapting van Leeuwen’s (1999) framework for the analysis of music and sound. In addition, there seems to be a great deal of potential for applying the notion of resemiotization (Iedema, 2001) to the ‘translations’ of highly valued texts such as these, to consider how the ‘literary’ meanings – the themes – change in the shifts between modes. It is also worth considering, in light of Hasan’s (1985, p. 96) assertion that verbal art itself is a stratified semiotic system (with the whole of the linguistic system being its expression plane, the patterning of patterns or ‘symbolic articulation’ being the equivalent of lexicogrammar, and the literary themes being the equivalent of the semantic stratum) whether we might think of the crafting of language involved in the making of verbal art as another type of resemiotization, that is, from ‘ordinary language’ to ‘literary language’. Part of the potential for investigating these ideas is provided by the artists discussed in this chapter, since both Slessor’s notebook for the making of the poem, and Olsen’s journal for the making of the mural, are held by the National Library of Australia, and are soon to be digitized (Meacham, 2008): the former might supply data for investigating how language becomes literature, and the latter, for investigating how literature can ‘become’ painting. These, however, are issues and ideas that will need to be taken up in other papers.
Appendix 15.1 ‘Five Bells’ analysis

(Unfortunately there is not space here to include the poem in full and without annotations; however, the poem is widely available on online poetry websites; see, for example, www.poemhunter.com/poem/five-bells/ [Accessed 4 October 2008].)

Key: As well as the usual conventions for dividing the text into clauses: underlining = marked topical Themes; a border around the clause number thus # = the mood of the clause is either interrogative or imperative; bold italics = a modal element (Adjunct or Finite).

1 Time [[that is moved by little fidget wheels]] Is not my Time, the flood [[that does not flow]]. |||
2 Between the double and the single bell Of a ship’s hour, between a round of bells From the dark warship [[riding there below]], I have lived many lives, and this one life Of Joe, long dead, ||
3 who lives between five bells. |||
4 Deep and dissolving verticals of light Ferry the falls of moonshine down. |||
5 Five bells Coldly rung out in a machine’s voice. |||
6 Night and water Pour to one rip of darkness, ||
7 the Harbour floats In air, ||
8 the Cross hangs upside-down in water. |||
9 Why do I think of you, dead man, ||
10 why thieve These profitless lodgings from the flukes of thought [[Anchored in Time]]? |||
11 You have gone from earth, ||
12 Gone even from the meaning of a name; ||
13 Yet something’s there, ||
14 yet something forms its lips ||
15 And hits and cries against the ports of space, ||
16 Beating their sides to make its fury heard. |||
17 Are you shouting at me, dead man, ||
18 squeezing your face In agonies of speech on speechless panes? |||
19 Cry louder, ||
20 beat the windows, ||
21 bawl your name!
[Five stanzas omitted]
22 Where have you gone? |||
23 The tide is over you, ||
24 The turn of midnight water’s over you, ||
25 As Time is over you, ||
26 and mystery, ||
27 And memory, the flood [[that does not flow]]. |||
28 You have no suburb, like those easier dead In private berths of dissolution [[laid]]– ||
29 The tide goes over, ||
30 the waves ride over you ||
31 And let their shadows down like shining hair, ||
31 But they are Water; ||
32 and the sea-pinks bend Like lilies in your teeth, ||
33 but they are Weed; ||
34 And you are only part of an Idea. |||
35 I felt [[the wet push its black thumb-balls in]], ||
36 The night [[you died]], I felt [[your eardrums crack]], And the short agony, the longer dream, The Nothing [[that was neither long nor short]]; ||
37 But I was bound, ||
38 and could not go that way, ||
39 But I was blind, ||
40 and could not feel your hand. |||
41 If I could find an answer, ||
42 could only find Your meaning, ||
43 or could say ||
44 why you were here [[Who now are gone]], ||
45 what purpose gave you breath ||
46 Or seized it back, ||
47 might I not hear your voice? |||
48 I looked out of my window in the dark At waves with diamond quills and combs of light [[That arched their mackerel-backs and smacked the sand In the moon’s drench, that straight enormous glaze]], And ships far off asleep, and Harbour-buoys [[Tossing their fireballs wearily each to each]], ||
49 And tried to hear your voice, ||
50 but all [[I heard]] Was a boat’s whistle, and the scraping squeal Of seabirds’ voices far away, and bells, Five bells. |||
51 Five bells coldly ringing out. |||
52 Five bells. |||
Appendix 15.2 Olsen’s mural, Salute to Five Bells
Appendix 15.3 A detail of Olsen’s mural
Acknowledgements Images included in this chapter are courtesy of the Sydney Opera House Trust. I would also like to thank Kelly Wheeler and Jenny Loughnan from the Sydney Opera House for their assistance in sourcing images and seeking permission. Finally, I am grateful to Martin Pilbeam, the photographer of the work.
Notes
1. For overviews of these artists’ lives and works, see Hart (1991) and Olsen (1997) on Olsen; and Haskell (1993), Dutton (1991) and Stewart (1977) on Slessor.
2. Details about the mural in this section are taken from Hart (1991, Chapter 8) and Olsen (1973), unless otherwise indicated.
References
Beilharz, K. (1996) Between Five Bells: An Instrumental Setting of Kenneth Slessor’s Poem ‘Five Bells.’ [Score] (Köln-Rodenkirchen: P.J. Tonger).
Caffarel, A., J. R. Martin and C. M. I. M. Matthiessen (2004) ‘Introduction: Systemic functional typology’, in A. Caffarel, J. R. Martin and C. M. I. M. Matthiessen (eds) Language Typology: A Functional Perspective (Amsterdam and Philadelphia: Benjamins), pp. 1–76.
Dutton, G. (1991) Kenneth Slessor: A Biography (Ringwood: Viking).
Gyger, E. (1990) Five Bells: For 6 Voices or A Cappella Chamber Choir [Facsimile score; music by Elliot Gyger, words by Kenneth Slessor; details sourced from Music Australia (National Library of Australia) catalogue on 3 October 2008; URL: http://nla.gov.au/nla.cs-ma-an26046512].
—— (1998) Deep and Dissolving Verticals of Light: Nocturnal Concerto for Orchestra [Score] (Grosvenor Place, NSW: Australian Music Centre).
Halliday, M. A. K. (2002 [1981]) ‘Text semantics and clause grammar: How is a text like a clause?’, in J. Webster (ed.) On Grammar. Vol. 1 in the Collected Works of M. A. K. Halliday (London: Continuum), pp. 219–60.
Halliday, M. A. K. and C. M. I. M. Matthiessen (2004) An Introduction to Functional Grammar, 3rd edn (London: Arnold).
Hart, D. (1991) John Olsen (Tortola, BVI: Craftsman House).
Hasan, R. (1985) Linguistics, Language, and Verbal Art (Victoria: Deakin University Press).
Haskell, D. (1993) ‘Introduction’, in K. Slessor, Selected Poems (Sydney: Angus & Robertson), pp. iv–x.
Iedema, R. (2001) ‘Resemiotization.’ Semiotica 137(1/4): pp. 23–39.
—— (2003) ‘Multimodality, resemiotization: Extending the analysis of discourse as multi-semiotic practice.’ Visual Communication 2(1): pp. 29–57.
Isomorphism (n.d.) Dictionary.com Unabridged (v 1.1). Retrieved 16 September 2008, from Dictionary.com website: http://dictionary.reference.com/browse/isomorphism.
Kress, G. and T. van Leeuwen (1996) Reading Images: The Grammar of Visual Design (London: Routledge).
Marcellino, R. (1984) Five Bells: For String Quartet and Percussion [Facsimile score; details sourced from Music Australia (National Library of Australia) catalogue on 3 October 2008; URL: http://nla.gov.au/nla.cs-ma-an26044111].
Meacham, S. (2008) ‘The conundrum of Slessor’s sixth bell.’ Sydney Morning Herald, 3 June 2008 [Sourced from Factiva].
Olsen, J. (1973) Salute to Five Bells: John Olsen’s Opera House Journal (Sydney: Angus & Robertson).
—— (1997) Drawn from Life (Potts Point: Duffy & Snellgrove).
O’Toole, M. (1994) The Language of Displayed Art (London: Leicester University Press).
—— (1999) Engaging with Art [CD-ROM] (Perth: Murdoch University).
Sculthorpe, P. (2001) Between Five Bells: For piano [Score] (Grosvenor Place, NSW: Australian Music Centre).
Slessor, K. (1965 [1993]) ‘Some notes on the poems. 1. “Five visions of Captain Cook”’, in Selected Poems (Sydney: Angus & Robertson), pp. 128–34.
—— (1967 [1993]) ‘Some notes on the poems. 2. “Five Bells” and other poems’, in Selected Poems (Sydney: Angus & Robertson), pp. 134–9.
Stewart, D. (1977) Man of Sydney: An appreciation of Kenneth Slessor (West Melbourne: Nelson).
van Leeuwen, T. (1999) Speech, Music, Sound (London: Palgrave Macmillan).
Westwood, M. (2005) ‘Tones from a watery grave’, The Australian, 2 September, p. 19.
Index abstraction 275–6 actions 276 advertisements car 189 digital 183 banner 183 pop-up 183 web page 186 print 141, 183 television 157–8, 166, 170 advertising 57, 183 affordance 23, 26, 33 ambiguity 275, 277, 279, 281 analytic categories 160 cinematographic conventions 159 devices 167 techniques 159 conjunctive relations 164, 170 discontinuity devices 164 shot boundaries 164 transitions 164 graphic relations 165–6, 168 mise-en-scène 161, 163, 166–7 phasal structures 166, 167 rhythmic–dynamic disjunctions 167 relations 163 sequencing 167 sound perspectives 160–1, 168–70 soundscapes 160–1, 168–70 soundtrack 160–1, 162, 167, 168 anchorage 84 audience 77 aural 23–5 banal citizenship 93, 96 practices 92 statehood 92, 96 biological system 13–14 cartoons 75, 78–80, 83–84, 86–87
chains discursive 95–7, 99 of recontextualization 95 channel (mode) 23–5 aural 23–5, 27 visual 23–5, 27 circumstances 277 citizen, citizenship 90–2 clauses, see also transitivity 15, 18–20, 29, 33 relational 128 mental 128 cognitive linguistics 56, 71 cohesion 39, 46–7, 49–51, 53–4, 280 lexical 213, 215, 281 visual 213, 217, 273 cohesive ties 46 colour 144, 272–3, 275–6, 280–2 communication 221 composition, see also information value 40, 45 compositional, see metafunctions (image) conceptual blending 35 conceptual domain 56–7, 69 connotative 12–15, 25, 33 value 145 construct location and character 130 contact (tenor) 27 content 12–13, 15, 18, 32–3 context of culture (genre) 246 context of situation (register) 246 context contextual parameter 23, 29, 34 field 28–9, 246, 249–51, 255 mode 23, 25, 246, 249, 254–5 tenor 27, 246, 251–2, 256–7 contextualising relations 141 co-contextualising 141 re-contextualising 141 coordination of semiotic systems 33 corpus 39, 46, 53 counterpointing interplay 112–13, 118 critical impetus 145
denotative value 145 dissonance 274, 277, 279 diversification degree of 16 division of labour semiotic 25–6 socio-semiotic 25 drawing 21–3, 26, 30, 32 elaboration 126 emotion 63, 65 engagement, see metafunctions (image) events 276 experiential meaning, see also metafunctions (language) 277–9 expression 11 face-to-face 11, 18, 24–5, 27, 34 field see also phenomenal domain see also socio-semiotic process film 63 framing of shots in 63 music in 64 silent 64 sound in 64, 67–8 font, see typography foregrounding 268 framing 40, 43–8, 53, 188 gaze 272, 274–5, 282 generic structure 127 orientation 127–8 complicating action 128–9 resolution 129 genre 59, 75, 78, 141, 183 geological discourse 209, 218 Gesamtkunstwerk 11 gesture 59 globalisation 184 hand, representation of 58, 64, 66–7, 69 haptic 143 humour 75 appreciation of 77–8, 81, 83 multimodal 83–4 verbo-visual 75, 78, 82–3, 86 hyper-context 220
hyper-environment 223 hyperlink 194 hypermedia 147 ideational complementarity 112, 116, 121 identity 139 ideology 189, 245, 246–7, 260 images 12, 18–21, 183 conceptual 125 presentational 125–6 sequences of 141 incongruity theories 80–2 information structure 41, 45, 54 information value 40–5, 49–50, 53 centre-margin 40 ideal-real 41, 46, 53, 188 given-new 41–3, 46, 201 instantiation (cline of) 267 instance 22 instantial system 22 potential 22 integration (cline of) material diversification 12 pole of maximal integration 15–16 pole of minimal integration 15–16 socio-semiotic integration 14 intentionality 75, 78 intentions 57 intermodality 46, 54 internet 183–4, 220 interpersonal meaning, see also metafunctions (language) 272–4 interplay of episodes 199 intersemiosis 141, 159, 166 intersemiotic analysis 160 complementarity 159 relationship 110–11 repetition 168 inter-visual relations 142 visual taxonomy and associating elements 142 visual taxis and transition relations 142 visual reference and visual linking devices 142 visual configuration and flow 142 intonation 12, 15, 33 intramodality 52 isomorphism 266–9, 282–3
joint activity 57 jokes 75, 77–9, 83, 86 tendentious 80 verbal 81–2 L2 development 133 visual instructional materials 135 language proficiency 134 reading, teaching of 124 language protolanguage 15, 21–2 sign language 23 standard language 33 language-dependency 49, 53 legs, representation of 64 lexical chains 280–1 localisation 184 macro-genre 248, 253 marketing 184 matter (meaning), material 14 meaning potential 22 shift 95 transfer of 95 media print 139 digital 143 medium 183 medium (mode) signed 23 spoken 23 written 23 mental representation 130 metafunctions (image) compositional 188, 273, 280–1 engagement 188, 272–3, 275, 281 representational 187, 273, 275–6 metafunctions (language) 140, 184, 267 ideational, see also experiential meaning 40, 47, 125, 187 interpersonal, see also interpersonal meaning 40, 44, 47, 51, 187 textual, see also textual meaning 39, 46, 47–8, 51–2, 187 metaphor grammatical 56 multimodal 146
pictorial 70 semiotic 63 source concept in 96, 99, 112, 116–18, 120, 148 target concept in 57 metonym 56, 83, 85 chain of 69 dynamic character of 70 evaluative power of 58–9, 69 formal style of 72 matrix domain in 58, 71 monomodal 63 as motif 71 multimodal 58, 68–9, 70 pictorial 58, 61–3, 68 sonic 64, 68 source-in-target 58, 71 target-in-source 58, 71 verbal 58, 70 MMORPG (massive multiplayer online role-playing games) 220 modal density 183 modality (image) 275 mode 58, 63, 68, 71, 140 see channel see division of labour see medium mood (system) 15, 17 multimodal discourse analysis, see also systemic functional humour, see humour 187, 221 metaphor, see metaphor phenomena processing 140 semiotics 127 text, see text 188 transcription, see transcription multimodality 12, 29, 33, 108–9, 187 multinational companies 185 multisemiotic 159 frameworks 159 sequential – discourse 139 multisemiotics 187 museums 245–7, 249–50, 252–3, 254–5, 258–9 music 12, 15, 26, 65 musical strategies 218
narrative 64, 69–70 emergent 141 themes 166, 168 stages 166 nationalism banal, see banal 92 civic 92 non-ergative clauses 210, 211, 213 noticing genre-oriented 133 top-down 133 bottom-up 133 technique 134 ontogenesis 246, 261 parallelism 279, 280 participants 276–7 patterning 267–8, 277, 281 perspective 275 phylogenesis 246, 261 physical system 13–14 pictorial grouping 56 oxymoron 56 picture books 107–9, 111, 119–20, 124 abridged 132 evaluation of 135 plane expression 12–15, 21, 23, 25 content 12–15, 32 popular science 208, 218 processes, see also transitivity 277 material 128 proxemics 24, 27 public infrastructure 91 sector 91–2 services 91, 94 transport 91, 94, 96 rank (image) 189, 272–3, 280, 282 work 189 episode 189 figure 189 member 189 reading direction 40–1, 49 path 188
register 12, 22, 24, 29, 31 registerial range 29–32 relay 84, 126 release theories 79–80 relevance 57, 58 representational, see metafunctions (image) resemiotization 91, 95, 96–7, 269, 282–3 resource material 93, 96, 100–1 semiotic, see semiotic rhythm 272–3, 281 salience 40–1, 44–48, 50–1, 53, 145, 188 semantic correspondence 129 semiosis bandwidth of 23–5, 27 semiotic landscape 93, 94, 100–1 modalities 221 mode 159, 166 resources 91–3, 94, 140, 170 co-deployment of 183 spanning 152 systems connotative 12–13, 15, 25 denotative 12–13, 15, 25, 28–9 multisemiotic 11–12, 26, 29, 34 stratification of 13 theories 159 semogenesis 245–6, 261 semohistory 20 social context 245–6, 261 social semiotics 90, 91, 140 social system 13–14, 25, 33 socio-semiotic process 28 state 91 institutions of 90 material infrastructure of 92 representations of 90 stratification 267 superiority theories 79, 83 symbol 62, 71 synecdoche 57, 61, 64 systemic functional linguistics 108, 210 multimodal discourse analysis (SF-MDA) 140, 187 social semiotic theory 140
system-metafunction fidelity 144 systems typology of 13–14 television documentary 208–9, 210, 214 tenor see context text multimodal 19–21, 39, 159, 170 chains of 91 text-image matching 125 relations see relay and elaboration textual meaning, see also metafunctions (language) 280–1 texture 39, 46–7, 50 thematic organisation 146 theme (literary, artistic) 268 theme / rheme 188 theories of humour General Theory of Verbal Humor (GTVH) 81 three-dimensional 197
transcription multimodal 159, 160–1, 171 transitivity 210 behavioural 110–11, 119, 222 existential 111, 222 material 210, 216 mental 210, 211–12, 214–15 relational 210, 215 verbal 210, 216 transmodality 54 typographic mediation 47–8 modulation 47–8, 50, 53 segmentation 47–8, 50–3 typography 39, 44, 46–8, 50–4 font 145 valuer 245, 247, 260 vectors 273, 277, 279 verbal / visual interaction 108–12, 210–12, 215, 218 contradictory 113, 120 symmetrical 112, 115, 120 verbal art 268, 283 video 25 visual art 268 visual grammar 187