Information Security and Ethics: Concepts, Methodologies, Tools, and Applications
Hamid Nemati
The University of North Carolina at Greensboro, USA
Volume I
Information Science reference Hershey • New York
Assistant Executive Editor: Meg Stocking
Acquisitions Editor: Kristin Klinger
Development Editor: Kristin Roth
Senior Managing Editor: Jennifer Neidig
Managing Editor: Sara Reed
Typesetter: Jennifer Neidig, Sara Reed, Sharon Berger, Diane Huskinson, Laurie Ridge, Jamie Snavely, Michael Brehm, Jeff Ash, Elizabeth Duke, Steve Whiskeyman
Cover Design: Lisa Tosheff
Printed at: Yurchak Printing Inc.
Published in the United States of America by Information Science Reference (an imprint of IGI Global) 701 E. Chocolate Avenue, Suite 200 Hershey PA 17033 Tel: 717-533-8845 Fax: 717-533-8661 E-mail:
[email protected] Web site: http://www.igi-global.com/reference and in the United Kingdom by Information Science Reference (an imprint of IGI Global) 3 Henrietta Street Covent Garden London WC2E 8LU Tel: 44 20 7240 0856 Fax: 44 20 7379 0609 Web site: http://www.eurospanonline.com
Library of Congress Cataloging-in-Publication Data
Information security and ethics : concepts, methodologies, tools and applications / Hamid Nemati, editor. p. cm.
Summary: "This compilation serves as the ultimate source on all theories and models associated with information privacy and safeguard practices to help anchor and guide the development of technologies, standards, and best practices to meet these challenges"--Provided by publisher.
Includes bibliographical references and index.
ISBN-13: 978-1-59904-937-3 (hardcover)
ISBN-13: 978-1-59904-938-0 (ebook)
1. Computer security. 2. Information technology--Security measures. 3. Information technology--Moral and ethical aspects. I. Nemati, Hamid R., 1958-
QA76.9.A25I54152 2008
005.8--dc22
2007031962

Copyright © 2008 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher. Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or companies does not indicate a claim of ownership by IGI Global of the trademark or registered trademark.

British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.
Editor-in-Chief
Mehdi Khosrow-Pour, DBA Editor-in-Chief Contemporary Research in Information Science and Technology, Book Series
Associate Editors
Steve Clarke, University of Hull, UK
San Diego State University, USA
Annie Becker, Florida Institute of Technology, USA
Ari-Veikko Anttiroiko, University of Tampere, Finland

Editorial Advisory Board
Sherif Kamel, American University in Cairo, Egypt
In Lee, Western Illinois University, USA
Jerzy Kisielnicki, Warsaw University, Poland
Keng Siau, University of Nebraska-Lincoln, USA
Amar Gupta, Arizona University, USA
Craig van Slyke, University of Central Florida, USA
John Wang, Montclair State University, USA
Vishanth Weerakkody, Brunel University, UK
Additional Research Collections found in the “Contemporary Research in Information Science and Technology” Book Series

Data Mining and Warehousing: Concepts, Methodologies, Tools, and Applications
John Wang, Montclair University, USA • 6-volume set • ISBN 978-1-59904-951-9

Electronic Commerce: Concepts, Methodologies, Tools, and Applications
S. Ann Becker, Florida Institute of Technology, USA • 4-volume set • ISBN 978-1-59904-943-4

Electronic Government: Concepts, Methodologies, Tools, and Applications
Ari-Veikko Anttiroiko, University of Tampere, Finland • 6-volume set • ISBN 978-1-59904-947-2

End-User Computing: Concepts, Methodologies, Tools, and Applications
Steve Clarke, University of Hull, UK • 4-volume set • ISBN 978-1-59904-945-8

Global Information Technologies: Concepts, Methodologies, Tools, and Applications
Felix Tan, Auckland University of Technology, New Zealand • 6-volume set • ISBN 978-1-59904-939-7

Information Communication Technologies: Concepts, Methodologies, Tools, and Applications
Craig Van Slyke, University of Central Florida, USA • 6-volume set • ISBN 978-1-59904-949-6

Information Security and Ethics: Concepts, Methodologies, Tools, and Applications
Hamid Nemati, The University of North Carolina at Greensboro, USA • 6-volume set • ISBN 978-1-59904-937-3

Intelligent Information Technologies: Concepts, Methodologies, Tools, and Applications
Vijayan Sugumaran, Oakland University, USA • 4-volume set • ISBN 978-1-59904-941-0

Knowledge Management: Concepts, Methodologies, Tools, and Applications
Murray E. Jennex, San Diego State University, USA • 6-volume set • ISBN 978-1-59904-933-5

Multimedia Technologies: Concepts, Methodologies, Tools, and Applications
Syad Mahbubur Rahman, Minnesota State University, USA • 3-volume set • ISBN 978-1-59904-953-3

Online and Distance Learning: Concepts, Methodologies, Tools, and Applications
Lawrence Tomei, Robert Morris University, USA • 6-volume set • ISBN 978-1-59904-935-9

Virtual Technologies: Concepts, Methodologies, Tools, and Applications
Jerzy Kisielnicki, Warsaw University, Poland • 3-volume set • ISBN 978-1-59904-955-7
List of Contributors
Abraham, Ajith / Chung-Ang University, Korea ... 1639
Acevedo, Andrés Garay / Georgetown University, USA ... 23
Adarkar, Hemant / Ness Technologies, India ... 496
Aedo, Ignacio / Universidad Carlos III de Madrid, Spain ... 1514
Agah, Afrand / University of Texas at Arlington, USA ... 3627
Ahamed, Sheikh I. / Marquette University, USA ... 927
Alam, Ashraful / The University of Texas at Dallas, USA ... 72
Andrews, Cecilia / University of New South Wales, Australia ... 3250
Andriole, Stephen J. / Villanova University, USA ... 1603
Anttila, Juhani / Quality Integration, Finland ... 3848
Ardagna, Claudio Agostino / University of Milan, Italy ... 129
Aref, Walid / Purdue University, USA ... 1378
Artz, John M. / The George Washington University, USA ... 238, 3824
Åström, Peik / Arcada Polytechnic, Helsinki, Finland ... 1339
Asvathanarayanan, Sridhar / Quinnipiac University, USA ... 1070
Auer, David J. / Western Washington University, USA ... 3659
Avolio, Bruce J. / Gallup Leadership Institute, University of Nebraska, USA ... 513
Aytes, Kregg / Idaho State University, USA ... 1994, 3866
Bajaj, Akhilesh / Carnegie Mellon University, USA ... 1108
Baker, C. Richard / University of Massachusetts, USA ... 89
Barger, Robert N. / University of Notre Dame, USA ... 3659
Barkhi, Reza / Virginia Polytechnic Institute & State University, USA ... 2335
Barlow, Michael / University of New South Wales, Australia ... 419
Baskerville, R. / University of Oulu, Finland ... 845
Baskerville, Richard / Georgia State University, USA ... 3577
Becker, Shirley Ann / Northern Arizona University, USA ... 2286
Bekkering, Ernst / Mississippi State University and Northeastern State University, USA ... 402, 2114
Bellavista, Paolo / University of Bologna, Italy ... 2163
Berbecaru, Diana / Politecnico di Torino, Italy ... 1210
Berkemeyer, Anthony / Texas Instruments Inc., USA ... 2286
Berners-Lee, Tim / Massachusetts Institute of Technology, USA ... 1774
Bertino, Elisa / Purdue University, USA ... 671, 1321, 1839
Bertoni, Guido / Politecnico di Milano, Italy ... 771
Bessette, Keith / University of Connecticut, USA ... 1741
Bhaskar, Tarun / Indian Institute of Management, India ... 1611
Bhattarakosol, Pattarasinee / Chulalongkorn University, Thailand ... 2750
Blaya, Juan A. Botía / University of Murcia, Spain ... 155
Bochmann, Gregor V. / University of Ottawa, Canada ... 3765
Boregowda, Lokesh R. / Honeywell SLL, India ... 3968
Boswell, Katherine / Middle Tennessee State University, USA ... 2335
Bouguettaya, Athman / Virginia Tech, USA ... 3713
Braber, Folker den / SINTEF ICT, Norway ... 704, 1865
Brady, Fiona / Central Queensland University, Australia ... 2986
Brandt, Sebastian / RWTH Aachen University, Germany ... 2531
Brathwaite, Charmion / University of North Carolina at Greensboro, USA ... 2561
Breu, Ruth / Universität Innsbruck, Austria ... 2686
Briggs, Pamela / Northumbria University, UK ... 2905
Britto, K. S. Shaji / Sathyabama Deemed University, India ... 1416
Buche, Mari W. / Michigan Technological University, USA ... 340
Burk, Dan L. / University of Minnesota Law School, USA ... 2448
Burmester, Mike / Florida State University, USA ... 1450
Bursell, Michael / Cryptomathic, UK ... 1488
Butcher-Powell, Loreen Marie / Bloomsburg University of Pennsylvania, USA ... 316, 2044
Candan, K. Selçuk / Arizona State University, USA ... 2187
Cardoso, Rui C. / Universidade de Beira Interior, Portugal ... 3620
Carminati, Barbara / Università dell’Insubria, Italy ... 1321
Cazier, Joseph A. / Appalachian State University, USA ... 3133
Chai, Sangmi / State University of New York at Buffalo, USA ... 2830
Chakravarty, Indrani / Indian Institute of Technology, India ... 1947
Chalasani, Suresh / University of Wisconsin – Parkside, USA ... 1184
Champion, David R. / Slippery Rock University, USA ... 2366
Chang, Elizabeth / Curtin University of Technology, Australia ... 1591
Chen, Minya / Polytechnic University, USA ... 438
Chen, Shaokang / The University of Queensland, Australia ... 1165
Chen, Shu-Ching / Florida International University, USA ... 1378
Chen, Thomas M. / Southern Methodist University, USA ... 532
Chen, Yu-Che / Iowa State University, USA ... 3739
Cheng, Hsing K. / University of Florida, USA ... 818
Cheng, Jie / Wayne State University, USA ... 570, 580
Cheng, Qiang / Wayne State University, USA ... 570, 580
Chibelushi, Caroline / Staffordshire University, UK ... 1701
Christie, Vaughn R. / Purdue University, USA ... 1396
Chun, Mark W. S. / Pepperdine University, USA ... 1727
Chung, Jen-Yao / IBM T. J. Watson Research Center, USA ... 2303
Clemente, Félix J. García / University of Murcia, Spain ... 155, 2991
Colbert, Bernard / Deakin University, Australia ... 1125
Connolly, Dan / Massachusetts Institute of Technology, USA ... 1774
Connolly, Terry / University of Arizona, USA ... 1994, 3866
Cook, Jack S. / Rochester Institute of Technology, USA ... 211, 3805
Cook, Laura / State University of New York, USA ... 3805
Córdoba, José-Rodrigo / University of Hull, UK ... 3387
Corradi, Antonio / University of Bologna, Italy ... 2163
Cotte, Pierre / STMicroelectronics, France ... 1884
Cremonini, Marco / Università di Milano, Italy ... 2095
Croasdell, David / University of Nevada, USA ... 2767
Crowell, Charles R. / University of Notre Dame, USA ... 3269, 3600
Curran, Kevin / University of Ulster at Magee, UK ... 1426
Currie, Carolyn / University of Technology, Australia ... 3229
Damiani, Ernesto / University of Milan, Italy ... 129, 1288, 2095
Dampier, David A. / Mississippi State University, USA ... 1795
Danielson, Peter / University of British Columbia, Canada ... 457, 1504
Das, Sajal K. / University of Texas at Arlington, USA ... 3627
Dave, Dinesh S. / Appalachian State University, USA ... 3133
Davis, Chris / Texas Instruments, USA ... 532
Davis, Kimberly / Mississippi State University, USA ... 402, 2114
Dekker, Anthony H. / Defence Science and Technology Organisation, Australia ... 1125
Demurjian, Steven A. / University of Connecticut, USA ... 1741
Derenale, Corrado / Politecnico di Torino, Italy ... 1210
Deshmukh, Pooja / Washington State University, USA ... 2767
Desmedt, Yvo / University College of London, UK ... 1176
Dhillon, Gurpreet / Virginia Commonwealth University, USA ... 2545
di Vimercati, Sabrina De Capitani / Università di Milano, Italy ... 1288, 2095
Díaz, Paloma / Universidad Carlos III de Madrid, Spain ... 1514
Dillon, Tharam / University of Technology, Australia ... 1591
Dittmann, Jana / Otto-von-Guericke University of Magdeburg, Germany ... 900, 3122
Doan, Thuong / University of Connecticut, USA ... 1741
Dodge, Jr., Ronald C. / United States Military Academy, USA ... 1562
Dodig-Crnkovic, Gordana / Mälardalen University, Sweden ... 2524
Doherty, Neil F. / Loughborough University, UK ... 964, 2722
Dortschy, Martin / Institute of Electronic Business – University of Arts, Germany ... 2958
Douligeris, Christos / University of Piraeus, Greece ... 1
Du, Jun / Tianjin University, China ... 2388, 3020
Dudley-Sponaugle, Alfreda / Towson University, USA ... 3112
Durresi, A. / Louisiana State University, USA ... 1012
Economides, Anastasios A. / University of Macedonia, Greece ... 596
Eder, Lauren / Rider University, USA ... 613
El-Khatib, Khalil / National Research Council and Institute for Information Technology, Canada ... 174, 299, 2012
El-Sheikh, Asim / The Arab Academy for Banking & Financial Sciences, Jordan ... 62
English, Larry P. / Information Impact International Inc., USA ... 3404
Esfandiari, Babak / Carleton University, Canada ... 912
Fairchild, Alea M. / Vesalius College, Vrije Universiteit Brussel, Belgium ... 3188
Farkas, Csilla / University of South Carolina, USA ... 3309
Fernandez, Eduardo B. / Florida Atlantic University, USA ... 654, 1828
Fernandez, Minjie H. / Florida Atlantic University, USA ... 1828
Fernandez-Medina, Eduardo / Universidad de Castilla-La Mancha, Spain ... 1048, 1288
Ferrari, Elena / University of Insubria at Como, Università dell’Insubria, Italy ... 1321
Fink, Dieter / Edith Cowan University, Australia ... 2958
Fjermestad, Jerry / New Jersey Institute of Technology, USA ... 3045
Fornas, Thierry / STMicroelectronics, France ... 1884
France, R. / Colorado State University, USA ... 2234
Frati, Fulvio / University of Milan, Italy ... 129
Freire, Mário M. / Universidade de Beira Interior, Portugal ... 3620
Friedman, William H. / University of Central Arkansas, USA ... 218
Fu, Lixin / The University of North Carolina at Greensboro, USA ... 3451
Fugini, Maria Grazia / Politecnico di Milano, Italy ... 872
Fulford, Heather / Loughborough University, UK ... 964, 2722
Furnell, Steven / University of Plymouth, UK ... 4014
Fürst, Karl / Vienna University of Technology, Austria ... 2356
Gallo, Jason / Northwestern University, USA ... 1158
Georg, G. / Colorado State University, USA ... 2234
Ghafoor, Arif / Purdue University, USA ... 1378
Gilbert, Joe / University of Nevada Las Vegas, USA ... 225
Giorgini, P. / University of Trento, Italy ... 200, 961, 3784
Glanzer, Klaus / Vienna University of Technology, Austria ... 2356
Goel, Sanjay / University at Albany, SUNY, and NYS Center for Information Forensics and Assurance, USA ... 2849
Goldman, James E. / Purdue University, USA ... 1396
Gomberg, Anna / University of Notre Dame, USA ... 3269
Goodman, Kenneth W. / University of Miami, USA ... 292
Gould, Carmen / RMIT University, Australia ... 2505
Graham, Ross Lee / Mid-Sweden University, Sweden ... 1580
Grahn, Kaj J. / Arcada Polytechnic, Helsinki, Finland ... 737, 761, 1339, 1349
Grant, Zachary / New Mexico Mounted Patrol, USA ... 2366
Griffy-Brown, Charla / Pepperdine University, USA ... 1727
Grodzinsky, Frances S. / Sacred Heart University, USA ... 2505
Guajardo, Jorge / Infineon Technologies AG, Germany ... 771
Guan, Sheng-Uei / National University of Singapore, Singapore and Brunel University, UK ... 2278, 2892
Gupta, Manish / State University of New York, USA ... 2075, 2666
Gupta, P. / Indian Institute of Technology, India ... 1947
Gurău, Călin / Groupe Sup. de Co. Montpellier, France ... 101, 3222
Hafner, Michael / Universität Innsbruck, Austria ... 2686
Haley, C. B. / The Open University, UK ... 3199
Han, Song / Curtin University of Technology, Australia ... 1591
Haraty, Ramzi A. / Lebanese American University, Lebanon ... 1531, 3572
Harrington, Kara / University of North Carolina at Greensboro, USA ... 2561
Hazari, Sunil / University of West Georgia, USA ... 2319
Hendler, Jim / University of Maryland, USA ... 1774
Hentea, Mariana / Southwestern Oklahoma State University, USA ... 350
Herath, T. C. / State University of New York at Buffalo, USA ... 2830
Herdin, Thomas / University of Salzburg, Austria ... 3676
Hitchens, Michael / Macquarie University, Australia ... 2865
Hoffman, Gerald M. / Northwestern University, USA ... 191
Hofkirchner, Wolfgang / University of Salzburg, Austria ... 3676
Hollis, David M. / United States Army, USA ... 2641
Hollis, Katherine M. / Electronic Data Systems, USA ... 2641
Hongladarom, Soraj / Chulalongkorn University, Thailand ... 3644
Horniak, Virginia / Mälardalen University, Sweden ... 2524
Hoskins, M. / University of Victoria, Canada ... 2650
Houmb, S. H. / Norwegian University of Science and Technology, Norway ... 2234
Hsieh, James S. F. / Ulead Systems Inc., Taiwan ... 2303
Hsu, H. Y. Sonya / Southern Illinois University ... 1080
Hu, Wen-Chen / University of North Dakota, USA ... 1766, 1956
Huang, Thomas S. / University of Illinois at Urbana-Champaign and Wayne State University, USA ... 570
Huegle, Tobias / Edith Cowan University, Australia ... 2958
Humphries, Lynne / University of Sunderland, UK ... 2634
Ilioudi, Christina / University of Piraeus, Greece ... 1759
Indrakanti, Sarath / Macquarie University, Australia ... 2865
Iyengar, S. Sitharama / Louisiana State University, USA ... 1012
Jahnke, Tino / University of Cooperative Education Heidenheim, Germany ... 253, 554
Jajodia, Sushil / George Mason University, USA ... 1236
James, Tabitha / Virginia Polytechnic Institute & State University, USA ... 2335
Jarke, Matthias / RWTH Aachen University and Fraunhofer FIT, Germany ... 2531
Jarmoszko, Andrzej T. / Central Connecticut State University, USA ... 3442
Jeckle, Mario / University of Applied Sciences, Germany ... 2103
Jiao, Jianxin (Roger) / Nanyang Technological University, Singapore ... 2388, 3020
Jiao, Yuan-Yuan / Nankai University, China ... 2388, 3020
Johnston, Allen C. / University of Louisiana-Monroe, USA ... 2130
Joshi, James B. D. / University of Pittsburgh, USA ... 1378
Jürjens, J. / TU Munich, Germany ... 2234
Kahai, Pallavi / Cisco Systems, USA ... 3988
Kahai, Surinder S. / University of New York at Binghamton, USA ... 513
Kajava, Jorma / Oulu University, Finland ... 3848
Kamath B., Narasimha / Indian Institute of Management, India ... 1611
Kannan, Rajgopal / Louisiana State University, USA ... 1012
Kao, Diana / University of Windsor, Canada ... 4000
Kapucu, Naim / University of Central Florida, USA ... 451
Karaman, Faruk / Okan University, Turkey ... 1931
Karlsson, Jonny / Arcada Polytechnic, Finland ... 737, 761, 1349
Khan, Latifur / University of Texas at Dallas, USA ... 1145
Khurana, Himanshu / NCSA, University of Illinois, USA ... 1361
Kimery, Kathryn M. / Saint Mary’s University, Canada ... 272
Kimppa, Kai Kristian / University of Turku, Finland ... 3856
King, Brian / Indiana University – Purdue University Indianapolis (IU-PUI), USA ... 1176
Klemen, Markus / Vienna University of Technology, Austria ... 2970
Knight, John / University of Central England, UK ... 231
Knight, Linda V. / DePaul University, USA ... 837
Koch, M. / Free University of Berlin, Germany ... 1456
Koleva, Radostina K. / NCSA, University of Illinois, USA ... 1361
Korba, Larry / National Research Council, Canada and Institute for Information Technology ... 174, 299, 2012
Kou, Weidong / Xidian University, PR China ... 1766
Krishna, K. Pramod / State University of New York at Buffalo, USA ... 3067
Krishnamurthy, Sandeep / University of Washington, Bothell, USA ... 3953
Kuivalainen, T. / University of Oulu, Finland ... 845
Kumar, Mohan / University of Texas at Arlington, USA ... 3627
Kuo, Feng-Yang / National Sun Yat-Sen University, Taiwan ... 3366
Kvasny, Lynette / The Pennsylvania State University, USA ... 3470
Kyobe, Michael / University of the Free State, South Africa ... 2704
Labruyere, Jean-Philippe P. / DePaul University, USA ... 837
Lally, Laura / Hofstra University, USA ... 3419, 3887
Lam, Herman / University of Florida, USA ... 3031
Lan, Blue C. W. / National Central University, Taiwan ... 2303
Laney, R. / The Open University, UK ... 3199
Larrondo-Petrie, M. M. / Florida Atlantic University, USA ... 654
Lauría, Eitel J. M. / Marist College, USA ... 3006
Lawson, Danielle / Queensland University of Technology, Australia ... 3321
Lazakidou, Athina / University of Piraeus, Greece ... 1759
Lazar, Jonathan / Towson University, USA ... 3112
Lee, ByungKwan / Kwandong University, Republic of Korea ... 639
Lee, Chung-wei / Auburn University, USA ... 1766, 1956, 2259
Lee, Tai-Chi / Saginaw Valley State University, USA ... 639
Leong, Jessica / Ernst & Young, New Zealand ... 2615
Leong, Leslie / Central Connecticut State University, USA ... 3442
Levy, Yair / Nova Southeastern University, USA ... 2139
Lewis, Edward / University of New South Wales, Australia ... 3250
Li, Chang-Tsun / University of Warwick, UK ... 1192, 1719, 3788
Lias, Allen R. / Robert Morris University, USA ... 3094
Lin, Ching-Yung / IBM T.J. Watson Research Center, USA ... 3282
Lin, Ping / Arizona State University, USA ... 2187
Lin, Tsau Young / San Jose State University, USA ... 1096
Lioy, Antonio / Politecnico di Torino, Italy ... 1210
Liu, Jiang-Lung / National Defense University, Taiwan ... 144, 1192
Liu, L. / Tsinghua University, China ... 2462
Loebbecke, Claudia / University of Cologne, Germany ... 1923
López, Natalia / Universidad Complutense de Madrid, Spain ... 3794
López-Cobo, J. S. / ATOS Origin, Spain ... 3691
Lotz, V. / SAP Research, France ... 3691
Lou, Der-Chyuan / National Defense University, Taiwan ... 144, 1191
Lovell, Brian C. / The University of Queensland, Australia ... 1165
Lowry, Paul Benjamin / Brigham Young University, USA ... 3542
Lund, Mass Soldal / SINTEF ICT, Norway ... 704
Lu, Chang-Tien / Virginia Tech, USA............................................................................................2259 Macer, Darryl / UNESCO Bangkok, Thailand; Eubios Ethics Institute, Japan and New Zealand; and United Nations University Institute of Advanced Studies, Japan.............................3340 Maczewski, M. / University of Victoria, Canada..........................................................................2650 Maier-Rabler, Ursula / University of Salzburg, Austria...............................................................3676 Malik, Zaki / Virginia Tech, USA...................................................................................................3713 Maña, A. / University of Malaga, Spain........................................................................................3691 Maradan, Gerald / STMicroelectronics, France...........................................................................1884 Marsh, Stephen / National Research Council of Canada, Canada..............................................2905 Martino, Lorenzo / Purdue University, USA...................................................................................671 Massacci, F. / University of Trento, Italy.......................................................................................3691 Masters, Amelia / University of Wollongong, Australia................................................................1975 McCord, Mary / Central Missouri State University, USA..............................................................272 McDonald, Sharon / University of Sunderland, UK.....................................................................2634 Mead, N. R. / Carnegie Mellon University, USA.............................................................................943 Medlin, B. Dawn / Appalachian State University, USA................................................................3133 Melideo, M. 
/ Engineering Ingegneria Informatica, Italy.............................................................3691 Melville, Rose / The University of Queensland, Australia.............................................................3612 Melzer, Ingo / DaimlerChrysler Research & Technology, Germany.............................................2103 Memon, Nasir / Polytechnic University, USA.................................................................................438 Michael, Katina / University of Wollongong, Australia................................................................1975 Mildal, Arne Bjørn / NetCom, Norway.........................................................................................1865 Miller, David W. / California State University, USA.....................................................................3352 Mishra, Nilesh / Indian Institute of Technology, India..................................................................1947 Mishra, Sushma / Virginia Commonwealth University, USA........................................................2545 Mitchell, Mark / Brigham Young University, USA........................................................................3542 Mitrakas, Andreas / Ubizen, Belgium and European Network and Information Security Agency (ENISA), Greece...................................................................................16, 1681, 2422 Mitrokotsa, Aikaterini / University of Piraeus, Greece.....................................................................1 Moffett, J. D. / The Open University, UK......................................................................................3199 Monsanto, Charlton / Prudential Fox & Roach Realtors/Trident, USA.......................................1603 Montero, Susana / Universidad Carlos III de Madrid, Spain.......................................................1514 Mortensen, Melanie J. 
/ Montreal, Canada....................................................................................380 Mouratidis, H. / University of East London, UK..........................................................200, 961, 3784 Moyes, Aaron / Brigham Young University, USA..........................................................................3542 Murugesan, San / Southern Cross University, Australia..............................................................3433 Mylopoulos, J. / University of Toronto, Canada............................................................................2462 Nagarajan, Karthik / University of Florida, USA........................................................................3031 Nambiar, Seema / Virginia Tech, USA...........................................................................................2259 Namuduri, Kamesh / Wichita State University, USA....................................................................3938 Nand, Sashi / Rushmore University, BWI......................................................................................1062 Narvaez, Darcia / University of Notre Dame, USA.......................................................................3269 Nemati, Hamid R. / University of North Carolina at Greensboro, USA.............................2561, 3451 Nes, Jone / NetCom, Norway.........................................................................................................1865 Nowak, Andrea / Austrian Research Center Seibersdorf, Austria................................................2686 Nuseibeh, B. / The Open University, UK.......................................................................................3199
Oermann, Andrea / Otto-von-Guericke University of Magdeburg, Germany..............................3122 Orr, Martin / Waitemata District Health Board, New Zealand.......................................................358 Owens, Jan / University of Wisconsin – Parkside, USA................................................................1184 Paar, Christof / Ruhr Universität Bochum, Germany.....................................................................771 Paci, Federica / University of Milano, Italy.....................................................................................671 Padmanabhuni, Srinivas / Software Engineering and Technology Labs, Infosys Technologies Limited, India....................................................................................................................................496 Pagani, Margherita / Bocconi University, Italy............................................................................3499 Pallis, George / Aristotle University of Thessaloniki, Greece..........................................................713 Pang, Les / National Defense University, USA..............................................................................2623 Parisi-Presicce, F. / University of Rome “La Sapienza,” Italy......................................................1456 Park, Eun G. / McGill University, Canada...................................................................................2500 Park, I. 
/ State University of New York at Buffalo, USA................................................................2830 Partow-Navid, Parviz / California State University, Los Angeles, USA.......................................2739 Pashupati, Kartik / Southern Methodist University, USA.............................................................3550 Paterson, Barbara / Marine Biology Research Institute, Zoology Department, University of Cape Town, South Africa............................................................................................................2432 Patrick, Andrew S. / National Research Council, Canada.................................................2012, 2905 Pattinson, Malcolm R. / University of South Australia, Australia................................................2059 Pauls, K. / Free University of Berlin, Germany.............................................................................1456 Peace, A. Graham / West Virginia University, USA...........................................................................62 Pendse, Ravi / Wichita State University, USA...............................................................................3739 Pérez, Gregorio Martínez / University of Murcia, Spain.....................................................155, 2991 Phan, Raphael C. W. / Swinburne University of Technology (Sarawak Campus), Malaysia.......2781 Phillips, Jr., Charles E. 
/ United States Military Academy, West Point, USA...............................1741 Piattini, Mario / Universidad de Castilla-La Mancha, Spain.............................................1048, 1288 Pirim, Taner / Mississippi Center for Supercomputing Research, USA........................................2235 Plebani, Pierluigi / Politecnico di Milano, Italy..............................................................................872 Pollock, Clare / Curtin University of Technology, Australia...........................................................324 Pon, Damira / University at Albany, SUNY, and NYS Center for Information Forensics and Assurance, USA........................................................................................................................2849 Potdar, Vidyasagar / Curtin University of Technology, Australia.................................................1591 Pulkkis, Göran / Arcada Polytechnic, Helsinki, Finland....................................737, 761, 1339, 1349 Radl, Alison / Iowa State University, USA.....................................................................................3739 Raghuramaraju, A. / University of Hyderabad, India..................................................................3084 Ragsdale, Daniel / United States Military Academy, USA.............................................................1562 Ram, Sudha / University of Arizona, USA.....................................................................................1108 Raman, Pushkala / Florida State University, USA.......................................................................3550 Ramim, Michelle / Nova Southeastern University, USA...............................................................2139 Rananand, Pirongrong Ramasoota / Chulalongkorn University, Thailand................................2221 Rao, H. 
Raghav / State University of New York at Buffalo, USA....................2075, 2666, 2830, 3067 Rashed, Abdullah Abdali / The Arab Academy for Banking & Financial Sciences, Jordan...........62 Reale, Salvatore / Siemens Mobile Communication S.p.A., Italy....................................................129 Reithel, Brian / University of Mississippi......................................................................................2335 Rezgui, Abdelmounaam / Virginia Tech, USA..............................................................................3713 Roberts, Lynne / University of Western Australia, Australia..........................................................324
Rodríguez, Ismael / Universidad Complutense de Madrid, Spain................................................3794 Romano, Jr., Nicholas C. / Oklahoma State University, USA.......................................................3045 Rosen, Peter A. / Oklahoma State University and University of Evansville, USA..............1550, 3590 Ross, Steven C. / Western Washington University, USA................................................................3659 Roupas, Chrysostomos / University of Macedonia, Greece...........................................................596 Rowe, Neil C. / U.S. Naval Postgraduate School, USA.................................................................2717 Rudolph, C. / Fraunhofer Institute for Secure Information Technology, Germany.......................3691 Sadri, Fereidoon / The University of North Carolina at Greensboro, USA..................................3451 Salam, A. F. / University of North Carolina at Greensboro, USA.................................................1660 Salisbury, Wm. David / University of Dayton, USA.....................................................................3352 Samarati, Pierangela / Università di Milano, Italy.............................................................1288, 2095 Sandy, Geoffrey A. / Victoria University, Australia......................................................................3375 Sankaranarayanan, P. E. 
/ Sathyabama Deemed University, India.............................................1416 Sanz, Daniel / Universidad Carlos III de Madrid, Spain...............................................................1514 Saygin, Yücel / Sabanci University, Turkey.....................................................................................589 Schilhavy, Richard / University of North Carolina at Greensboro, USA.....................................1660 Schlüter, Marcus / RWTH Aachen University, Germany..............................................................2531 Schmidt, Thomas / Vienna University of Technology, Austria......................................................2356 Schuldt, Barbara A. / Southeastern Louisiana University, USA...................................................3142 Schultz, Robert A. / Woodbury University, USA.............................................................................473 Seidman, Lee / Qualidigm, USA....................................................................................................2407 Seitz, Juergen / University of Cooperative Education Heidenheim, Germany.......................253, 554 Servida, Andrea / European Commission.....................................................................................1671 Shah, Hanifa / Staffordshire University, UK.................................................................................1701 Sharman, Raj / State University of New York at Buffalo, USA.....................................................3067 Sharp, Bernadette / Staffordshire University, UK........................................................................1701 Shukla, K. K. 
/ Institute of Technology (IT-BHU), India...............................................................3968 Shyu, Mei-Ling / University of Miami, USA.................................................................................1378 Si, Huayin / University of Warwick, UK........................................................................................3788 Simon, Ed / XMLsec Inc., Canada.................................................................................................1267 Singh, Richa / Purvanchal University and Indian Institute of Technology, India..............1947, 3968 Singh, Sanjay K. / Purvanchal University, India..........................................................................3968 Siponen, M. / University of Oulu, Finland......................................................................................845 Siraj, Ambareen / Mississippi State University, USA....................................................................1795 Skarmeta, Antonio F. Gómez / University of Murcia...........................................................155, 2991 Skovira, Robert Joseph / Robert Morris University, USA............................................................2797 Slusky, Ludwig / California State University, Los Angeles, USA.................................................2739 Smith, Alan D. 
/ Robert Morris University, USA.................................................................3094, 3728 Smith, Leigh / Curtin University of Technology, Australia.............................................................324 Smyth, Elaine / University of Ulster at Magee, UK......................................................................1426 Soininen, Aura / Lappeenranta University of Technology and Attorneys-at-Law Borenius & Kemppinen, Ltd., Finland............................................................................................2577 Song, Ronggong / National Research Council, Canada................................................................2012 Søraker, Johnny Hartz / Norwegian University of Science and Technology, Norway.................3829 Spanoudakis, G. / City University, UK..........................................................................................3691 Spiekermann, Sarah / Humboldt University Berlin, Germany...............................................481, 488
Squicciarini, Anna C. / Purdue University, USA............................................................................671 Stagg, Vernon / Deakin University, Australia................................................................................1626 Stahl, Bernd Carsten / University College Dublin, Ireland and De Montfort University, UK.......................................................................................................................3157, 3170 Stefanelli, Cesare / University of Bologna and University of Ferrara, Italy................................2163 Steinebach, Martin / Fraunhofer IPSI, Germany...........................................................................900 Stephens, Jackson / Brigham Young University, USA...................................................................3542 Stickel, Eberhard / University of Applied Sciences Bonn GmbH, Germany................................1257 Stølen, Ketil / SINTEF ICT, Norway......................................................................................704, 1865 Storey, M.-A. / University of Victoria, Canada.............................................................................2650 Stoupa, Konstantina / Aristotle University of Thessaloniki, Greece..............................................713 Su, Stanley Y.W. / University of Florida, USA..............................................................................3031 Subramanium, Mahesh / Oregon State University, USA.............................................................1027 Suhail, Mohamed Abdulla / University of Bradford, UK.............................................................3902 Sveningsson, Malin / Karlstad University, Sweden.......................................................................3484 Tarafdar, Monideepa / University of Toledo, USA.......................................................................3525 Tavani, Herman T. 
/ Rivier College, USA.....................................................................................2505 Taylor, Art / Rider University, USA.................................................................................................613 Thomas, Johnson / Oklahoma State University, USA...................................................................1639 Thompson, Paul / Dartmouth College, USA.................................................................................1006 Thomsen, Michael / Florida Atlantic University, USA.................................................................1828 Thuraisingham, Bhavani / The University of Texas at Dallas and The MITRE Corporation, USA.............................................................................................................72, 627, 1145 Tian, Qi / Institute for Infocomm Research, Singapore...................................................................109 Trček, Denis / Jožef Stefan Institute, Ljubljana, Slovenia and University of Primorska, Koper, Slovenia.....................................................................................................................1806, 2931 Trujillo, Juan / Universidad de Alicante, Spain............................................................................1046 Tso, Hao-Kuan / National Defense University, Taiwan...................................................................144 Tsybulnik, Natasha / The University of Texas at Dallas, USA..........................................................72 Turner, Eva / University of East London, UK...............................................................................3758 Tyran, Craig K. 
/ Western Washington University, USA...............................................................3659 Upadhyaya, Shambhu / State University of New York, USA....................................2075, 2666, 3067 Urbaczewski, Andrew / University of Michigan, USA..................................................................3352 Vakali, Athena / Aristotle University of Thessaloniki, Greece........................................................713 van der Velden, Maja / University of Bergen, Norway...................................................................859 Varadharajan, Vijay / Macquarie University, Australia..............................................................2865 Varonen, Rauno / Oulu University, Finland.................................................................................3848 Vatsa, Mayank / Purvanchal University, Indian Institute of Technology, India.................1947, 3968 Vaughn, Rayford B. / Mississippi State University, USA..............................................................1537 Vician, Chelley / Michigan Technological University, USA............................................................340 Villarroel, Rodolfo / Universidad Católica del Maule, Chile.......................................................1048 Vraalsen, Fredrik / SINTEF ICT, Norway............................................................................704, 1865 Vuyst, Bruno de / Vesalius College and the Institute for European Studies, Vrije Universiteit Brussel, Belgium.............................................................................................................................3188 Waddell, Dianne / Edith Cowan University, Australia..................................................................2949 Wakefield, Robin L. / Hankamer School of Business, Baylor University, USA............................2814
Wang, Harry J. / University of Arizona, USA.................................................................................818 Wang, Shouhong / University of Massachusetts Dartmouth, USA...............................................4000 Wang, Yingge / Wayne State University, USA.........................................................................570, 580 Wang, Yun / Yale University and Yale-New Haven Health System and Qualidigm, USA.............2407 Ward, Jeremy / Symantec EMEA, UK..........................................................................................4014 Warkentin, Merrill / Mississippi State University, USA.............................................402, 2114, 2130 Warren, Matthew / Deakin University, Australia.........................................................................1626 Weber, Barbara / Universität Innsbruck, Austria.........................................................................2686 Weippl, Edgar R. / Vienna University of Technology, Austria..................................1812, 2492, 2970 Weiss, Michael / Carleton University, Canada......................................................................912, 1476 Weitzner, Daniel J. / Massachusetts Institute of Technology, USA...............................................1774 Whitten, Dwayne / Mays School of Business, Texas A&M University, USA................................2814 Whitty, Monica / University of Western Sydney, Australia...........................................................3510 Wijesekera, Duminda / George Mason University, USA..............................................................1236 Wilson, Rick L. 
/ Oklahoma State University, USA............................................................1550, 3590 Wilson, Sean / Brigham Young University, USA............................................................................3542 Wippel, Gerald / Vienna University of Technology, Austria.........................................................2356 Wong, Edward K. / Polytechnic University, USA...........................................................................438 Woodcock, Leone E. / Southern Cross University, Australia........................................................3433 Wu, Richard Yi Ren / University of Alberta, Canada..................................................................1027 Wyld, David C. / Southeastern Louisiana University, USA..........................................................2149 Wylupski, Warren / University of New Mexico, USA...................................................................2366 Xu, Changsheng / Institute for Infocomm Research, Singapore.....................................................109 Xu, Yuefei / Institute for Information Technology, Canada and National Research Council, Canada.............................................................................................................174, 299, 2012 Yamamoto, Gonca Telli / Okan University, Turkey.......................................................................1931 Yang, SeungHae / Kwandong University, Republic of Korea.........................................................639 Yang, Stephen J. H. 
/ National Central University, Taiwan..........................................................2303 Yannas, Prodromos / Technological Educational Institution of Western Macedonia, Greece.......465 Yee, George / Institute for Information Technology, Canada and National Research Council, Canada...................................................................................................174, 299, 2012, 2516 Yeh, Jyh-haw / Boise State University, USA..................................................................................1956 Yu, E. / University of Toronto, Canada..........................................................................................2462 Zaitch, Damián / Erasmus University and European Network and Information Security Agency, Greece..................................................................................................................1681 Zannone, N. / University of Trento, Italy.........................................................................................981 Zhang, Eric Zhen / University of Ottawa, Canada.......................................................................3765 Zhang, Jie (Jennifer) / University of Toledo, USA........................................................................3525 Zhao, Fang / RMIT University, Australia......................................................................................2915 Zhao, J. Leon / University of Arizona, USA....................................................................................818 Zulkernine, Mohammad / Queen’s University, Canada................................................................927
Contents by Volume
Section 1. Fundamental Concepts and Theories
This section serves as a foundation for this exhaustive reference tool by addressing crucial theories essential to the understanding of information security and ethics. Chapters found within these pages provide an excellent framework in which to position information security and ethics within the field of information science and technology. Insight regarding the critical incorporation of security measures into online and distance learning systems is addressed, while crucial stumbling blocks of information management are explored. With 45 chapters comprising this foundational section, the reader can learn and choose from a compendium of expert research on the elemental theories underscoring the information security and ethics discipline.
Chapter 1.1. E-Government and Denial of Service Attacks / Aikaterini Mitrokotsa and Christos Douligeris ... 1
Chapter 1.2. Policy Frameworks for Secure Electronic Business / Andreas Mitrakas ... 16
Chapter 1.3. Audio Watermarking: Properties, Techniques, and Evaluation / Andrés Garay Acevedo ... 23
Chapter 1.4. Software Piracy: Possible Causes and Cures / Asim El-Sheikh, Abdullah Abdali Rashed, and A. Graham Peace ... 62
Chapter 1.5. Administering the Semantic Web: Confidentiality, Privacy, and Trust Management / Bhavani Thuraisingham, Natasha Tsybulnik, and Ashraful Alam ... 72
Chapter 1.6. Human and Social Perspectives in Information Technology: An Examination of Fraud on the Internet / C. Richard Baker ... 89
Chapter 1.7. Codes of Ethics in Virtual Communities / Călin Gurău ... 101
Chapter 1.8. Digital Audio Watermarking / Changsheng Xu and Qi Tian ... 109
Chapter 1.9. Secure Authentication Process for High Sensitive Data E-Services: A Roadmap / Claudio Agostino Ardagna, Ernesto Damiani, Fulvio Frati, and Salvatore Reale ... 129
Chapter 1.10. Evolution of Information-Hiding Technology / Der-Chyuan Lou, Jiang-Lung Liu, and Hao-Kuan Tso ... 144
Chapter 1.11. Description of Policies Enriched by Semantics for Security Management / Félix J. García Clemente, Gregorio Martínez Pérez, Juan A. Botía Blaya, and Antonio F. Gómez Skarmeta ... 155
Chapter 1.12. Privacy and Security in E-Learning / George Yee, Yuefei Xu, Larry Korba, and Khalil El-Khatib ... 174
Chapter 1.13. Ethical Challenges for Information Systems Professionals / Gerald M. Hoffman ... 191
Chapter 1.14. Integrating Security and Software Engineering: An Introduction / H. Mouratidis and P. Giorgini ... 200
Chapter 1.15. Ethics of Data Mining / Jack Cook ... 211
Chapter 1.16. Privacy-Dangers and Protections / William H. Friedman ... 218
Chapter 1.17. Ethics of New Technologies / Joe Gilbert ... 225
Chapter 1.18. Ethics and HCI / John Knight ... 231
Chapter 1.19. The Central Problem in Cyber Ethics and How Stories Can Be Used to Address It / John M. Artz ... 238
Chapter 1.20. Digital Watermarking: An Introduction / Juergen Seitz and Tino Jahnke ... 253
Chapter 1.21. Signals of Trustworthiness in E-Commerce: Consumer Understanding of Third-Party Assurance Seals / Kathryn M. Kimery and Mary McCord ... 272
Chapter 1.22. Moral Foundations of Data Mining / Kenneth W. Goodman ... 292
Chapter 1.23. Privacy and Security in E-Learning / Khalil El-Khatib, Larry Korba, Yuefei Xu, and George Yee ... 299
Chapter 1.24. Telework Information Security / Loreen Marie Butcher-Powell ... 316
Chapter 1.25. Conducting Ethical Research Online: Respect for Individuals, Identities and the Ownership of Words / Lynne Roberts, Leigh Smith, and Clare Pollock ... 324
Chapter 1.26. A Unified Information Security Management Plan / Mari W. Buche and Chelley Vician ... 340
Chapter 1.27. Information Security Management / Mariana Hentea ... 350
Chapter 1.28. The Challenge of Privacy and Security and the Implementation of Health Knowledge Management Systems / Martin Orr ... 358
Chapter 1.29. Would Be Pirates: Webcasters, Intellectual Property, and Ethics / Melanie J. Mortensen ... 380
Chapter 1.30. Check-Off Password System (COPS): An Advancement in User Authentification Methods and Information Security / Merrill Warkentin, Kimberly Davis, and Ernst Bekkering ... 402
Chapter 1.31. The Game of Defense and Security / Michael Barlow ... 419
Chapter 1.32. Data Hiding in Document Images / Minya Chen, Nasir Memon, and Edward K. Wong ... 438
Chapter 1.33. Ethics of Digital Government / Naim Kapucu ... 451
Chapter 1.34. Digital Morality and Ethics / Peter Danielson ... 457
Chapter 1.35. Net Diplomacy / Prodromos Yannas ... 465
Chapter 1.36. Ethical Issues in Information Technology / Robert A. Schultz ... 473
Chapter 1.37. Protecting One’s Privacy: Insight into the Views and Nature of the Early Adopters of Privacy Services / Sarah Spiekermann ... 481
Chapter 1.38. The Desire for Privacy: Insights into the Views and Nature of the Early Adopters of Privacy Services / Sarah Spiekermann ... 488
Chapter 1.39. Security in Service-Oriented Architecture: Issues, Standards, and Implementations / Srinivas Padmanabhuni and Hemant Adarkar ... 496
Chapter 1.40. Leadership Style, Anonymity, and the Discussion of an Ethical Issue in an Electronic Context / Surinder S. Kahai and Bruce J. Avolio ... 513
Chapter 1.41. An Overview of Electronic Attacks / Thomas M. Chen and Chris Davis ... 532
Chapter 1.42. An Introduction in Digital Watermarking: Applications, Principles, and Problems / Tino Jahnke and Juergen Seitz ... 554
Chapter 1.43. Digital Rights Management for E-Content and E-Technologies / Yingge Wang, Qiang Cheng, Jie Cheng, and Thomas S. Huang ... 570
Chapter 1.44. E-Health Security and Privacy / Yingge Wang, Qiang Cheng, and Jie Cheng ... 580
Chapter 1.45. Privacy and Confidentiality Issues in Data Mining / Yücel Saygin ... 589
Section 2. Development and Design Methodologies
This section provides in-depth coverage of conceptual architecture frameworks to provide the reader with a comprehensive understanding of the emerging technological developments within the field of information security and ethics. Research fundamentals imperative to the understanding of developmental processes within information management are offered. From broad examinations to specific discussions on security tools, the research found within this section spans the discipline while offering detailed, specific discussions. From basic designs to abstract development, these chapters serve to expand the reaches of development and design technologies within the information security and ethics community. This section includes over 28 contributions from researchers throughout the world on the topic of information security and privacy within the information science and technology field.
Chapter 2.1. Evaluation of Computer Adaptive Testing Systems / Anastasios A. Economides and Chrysostomos Roupas ... 596
Chapter 2.2. A Comparison of Authentication, Authorization and Auditing in Windows and Linux / Art Taylor and Lauren Eder ... 613
Chapter 2.3. Privacy-Preserving Data Mining: Development and Directions / Bhavani Thuraisingham ... 627
Chapter 2.4. A SEEP (Security Enhanced Electronic Payment) Protocol Design using 3BC, ECC (F2m), and HECC Algorithm / ByungKwan Lee, SeungHae Yang, and Tai-Chi Lee ... 639
Chapter 2.5. A Methodology to Develop Secure Systems using Patterns / E. B. Fernandez and M. M. Larrondo-Petrie ... 654
Chapter 2.6. An Adaptive Access Control Model for Web Services / Elisa Bertino, Anna C. Squicciarini, Lorenzo Martino, and Federica Paci ... 671
Chapter 2.7. Integrating Security in the Development Process with UML / Folker den Braber, Mass Soldal Lund, Ketil Stølen, and Fredrik Vraalsen ... 704
Chapter 2.8. Storage and Access Control Issues for XML Documents / George Pallis, Konstantina Stoupa, and Athena Vakali ... 713
Chapter 2.9. Taxonomies of User-Authentication Methods in Computer Networks / Göran Pulkkis, Kaj J. Grahn, and Jonny Karlsson ... 737
Chapter 2.10. WLAN Security Management / Göran Pulkkis, Kaj J. Grahn, and Jonny Karlsson ... 761
Chapter 2.11. Architectures for Advanced Cryptographic Systems / Guido Bertoni, Jorge Guajardo, and Christof Paar ... 771
Chapter 2.12. Web Services Enabled E-Market Access Control Model / Harry J. Wang, Hsing K. Cheng, and J. Leon Zhao ... 818
Chapter 2.13. Security Laboratory Design and Implementation / Linda V. Knight and Jean-Philippe P. Labruyere ... 837
Chapter 2.14. Extending Security in Agile Software Development Methods / M. Siponen, R. Baskerville, and T. Kuivalainen ... 845
Chapter 2.15. Invisibility and the Ethics of Digitalization: Designing so as not to Hurt Others / Maja van der Velden ... 859
Chapter 2.16. A Methodology for Developing Trusted Information Systems: The Security Requirements Analysis Phase / Maria Grazia Fugini and Pierluigi Plebani ... 872
Chapter 2.17. Design Principles for Active Audio and Video Fingerprinting / Martin Steinebach and Jana Dittmann ... 900
Chapter 2.18. Modeling Method for Assessing Privacy Technologies / Michael Weiss and Babak Esfandiari ... 912
Chapter 2.19. Software Security Engineering: Toward Unifying Software Engineering and Security Engineering / Mohammad Zulkernine and Sheikh I. Ahamed ... 927
Chapter 2.20. Identifying Security Requirements Using the Security Quality Requirements Engineering (SQUARE) Method / N. R. Mead ... 943
Chapter 2.21. Do Information Security Policies Reduce the Incidence of Security Breaches: An Exploratory Analysis / Neil F. Doherty and Heather Fulford ... 964
Chapter 2.22. Modelling Security and Trust with Secure Tropos / P. Giorgini, H. Mouratidis, and N. Zannone ... 981
Chapter 2.23. Text Mining, Names and Security / Paul Thompson ... 1006
Chapter 2.24. Framework for Secure Information Management in Critical Systems / Rajgopal Kannan, S. Sitharama Iyengar, and A. Durresi ... 1012
Chapter 2.25. Building an Online Security System with Web Services / Richard Yi Ren Wu and Mahesh Subramanium ... 1027
Chapter 2.26. Designing Secure Data Warehouses / Rodolfo Villarroel, Eduardo Fernández-Medina, Juan Trujillo, and Mario Piattini ... 1048
Chapter 2.27. Developing a Theory of Portable Public Key Infrastructure (PORTABLEPKI) for Mobile Business Security / Sashi Nand ... 1062
Chapter 2.28. Potential Security Issues in a Peer-to-Peer Network from a Database Perspective / Sridhar Asvathanarayanan ... 1070
Chapter 2.29. Strategic Alliances of Information Technology Among Supply Chain Channel Members / H. Y. Sonya Hsu and Stephen C. Shih ... 1080
Chapter 2.30. Chinese Wall Security Policy Model: Granular Computing on DAC Model / Tsau Young Lin ... 1096
Section 3. Tools and Technologies
This section presents extensive coverage of the tools and technologies available in the field of information security and ethics that practitioners and academicians alike can utilize to develop different techniques. These chapters enlighten readers about fundamental research on the many methods used to facilitate and enhance the integration of security controls, including defense strategies for information warfare, an increasingly pertinent research arena. It is through these rigorously researched chapters that the reader is provided with countless examples of the up-and-coming tools and technologies emerging from the field of information security and ethics. With more than 32 chapters, this section offers a broad treatment of some of the many tools and technologies within the IT security community.
Chapter 3.1. IAIS: A Methodology to Enable Inter-Agency Information Sharing in eGovernment / Akhilesh Bajaj and Sudha Ram ... 1108
Chapter 3.2. Network Robustness for Critical Infrastructure Networks / Anthony H. Dekker and Bernard Colbert ... 1125
Chapter 3.3. Secure Semantic Grids / Bhavani Thuraisingham and Latifur Khan ... 1145
Chapter 3.4. From CCTV to Biometrics Through Mobile Surveillance / Jason Gallo ... 1158
Chapter 3.5. Robust Face Recognition for Data Mining / Brian C. Lovell and Shaokang Chen ... 1165
Chapter 3.6. Securing an Electronic Legislature Using Threshold Signatures / Brian King and Yvo Desmedt ... 1176
Chapter 3.7. Use of RFID In Supply Chain Data Processing / Jan Owens and Suresh Chalasani ... 1184
Chapter 3.8. Digital Signature-Based Image Authentication / Der-Chyuan Lou, Jiang-Lung Liu, and Chang-Tsun Li ... 1192
Chapter 3.9. Digital Certificates and Public-Key Infrastructures / Diana Berbecaru, Corrado Derenale, and Antonio Lioy ... 1210
Chapter 3.10. A Flexible Authorization Framework / Duminda Wijesekera and Sushil Jajodia ... 1236
Chapter 3.11. A New Public-Key Algorithm for Watermarking of Digital Images / Eberhard Stickel ... 1257
Chapter 3.12. Protecting Privacy Using XML, XACML, and SAML / Ed Simon ... 1267
Chapter 3.13. Multimedia Security and Digital Rights Management Technology / Eduardo Fernandez-Medina, Sabrina De Capitani di Vimercati, Ernesto Damiani, Mario Piattini, and Pierangela Samarati ... 1288
Chapter 3.14. Merkle Tree Authentication in UDDI Registries / Elisa Bertino, Barbara Carminati, and Elena Ferrari ... 1321
Chapter 3.15. Current Network Security Systems / Göran Pulkkis, Kaj Grahn, and Peik Åström ... 1339
Chapter 3.16. WLAN Security Management / Göran Pulkkis, Kaj J. Grahn, and Jonny Karlsson ... 1349
Chapter 3.17. Scalable Security and Accounting Services for Content-Based Publish/Subscribe Systems / Himanshu Khurana and Radostina K. Koleva ... 1361
Chapter 3.18. A Multimedia-Based Threat Management and Information Security Framework / James B. D. Joshi, Mei-Ling Shyu, Shu-Ching Chen, Walid Aref, and Arif Ghafoor ... 1378
Chapter 3.19. Metrics Based Security Assessment / James E. Goldman and Vaughn R. Christie ... 1396
Chapter 3.20. Multiplecasting in a Wired LAN Using CDMA Technique / K. S. Shaji Britto and P. E. Sankaranarayanan ... 1416
Chapter 3.21. Exposing the Wired Equivalent Privacy Protocol Weaknesses in Wireless Networks / Kevin Curran and Elaine Smyth ... 1426
Chapter 3.22. Trust Models for Ubiquitous Mobile Systems / Mike Burmester ... 1450
Chapter 3.23. Access Control Specification in UML / M. Koch, F. Parisi-Presicce, and K. Pauls ... 1456
Chapter 3.24. Modelling Security Patterns Using NFR Analysis / M. Weiss ... 1476
Chapter 3.25. Security and Trust in P2P Systems / Michael Bursell ... 1488
Chapter 3.26. Monitoring Technologies and Digital Governance / Peter Danielson ... 1504
Chapter 3.27. Integrating Access Policies into the Development Process of Hypermedia Web Systems / Paloma Díaz, Daniel Sanz, Susana Montero, and Ignacio Aedo ... 1514
Chapter 3.28. Kernelized Database Systems Security / Ramzi A. Haraty ... 1531
Chapter 3.29. High Assurance Products in IT Security / Rayford B. Vaughn ... 1537
Chapter 3.30. Protecting Data through ‘Perturbation’ Techniques: The Impact on Knowledge Discovery in Databases / Rick L. Wilson and Peter A. Rosen ... 1550
Chapter 3.31. Deploying Honeynets / Ronald C. Dodge, Jr. and Daniel Ragsdale ... 1562
Chapter 3.32. Peer-to-Peer Security Issues in Nomadic Networks / Ross Lee Graham ... 1580
Chapter 3.33. Privacy-Preserving Transactions Protocol Using Mobile Agents with Mutual Authentication / Song Han, Vidyasagar Potdar, Elizabeth Chang, and Tharam Dillon ... 1591
Chapter 3.34. Herding 3,000 Cats: Enabling Continuous Real Estate Transaction Processing / Stephen J. Andriole and Charlton Monsanto ... 1603
Chapter 3.35. Intrusion Detection Using Modern Techniques: Integration of Genetic Algorithms and Rough Sets with Neural Nets / Tarun Bhaskar and Narasimha Kamath B. ... 1611
Chapter 3.36. A National Information Infrastructure Model for Information Warfare Defence / Vernon Stagg and Matthew Warren ... 1626
Section 4. Utilization and Application
This section discusses a variety of applications and opportunities that practitioners can consider in developing viable and effective information security programs and processes. This section includes more than 47 chapters, among them reviews of certain legal aspects of forensic investigation and of additional self-regulatory measures that can be leveraged to investigate cyber crime. Further chapters investigate issues affecting the selection of personal firewall software in organizations. Also considered in this section are the challenges faced when utilizing information security and ethics with healthcare systems. Contributions included in this section provide excellent coverage of today’s global community and how research into information security and ethics is impacting the social fabric of our present-day global village.
Chapter 4.1. Distributed Intrusion Detection Systems: A Computational Intelligence Approach / Ajith Abraham and Johnson Thomas ... 1639
Chapter 4.2. Emerging Mobile Technology and Supply Chain Integration: Using RFID to Streamline the Integrated Supply Chain / Richard Schilhavy and A. F. Salam ... 1660
Chapter 4.3. Trust and Security in Ambient Intelligence: A Research Agenda for Europe / Andrea Servida ... 1671
Chapter 4.4. Law, Cyber Crime and Digital Forensics: Trailing Digital Suspects / Andreas Mitrakas and Damián Zaitch ... 1681
Chapter 4.5. ASKARI: A Crime Text Mining Approach / Caroline Chibelushi, Bernadette Sharp, and Hanifa Shah ... 1701
Chapter 4.6. Digital Watermarking for Multimedia Security Management / Chang-Tsun Li ... 1719
Chapter 4.7. A Case Study of Effectively Implemented Information Systems Security Policy / Charla Griffy-Brown and Mark W. S. Chun ... 1727
Chapter 4.8. A Service-Based Approach for RBAC and MAC Security / Charles E. Phillips, Jr., Steven A. Demurjian, Thuong Doan, and Keith Bessette ... 1741
Chapter 4.9. Security in Health Information Systems / Christina Ilioudi and Athina Lazakidou ... 1759
Chapter 4.10. Mobile Commerce Security and Payment / Chung-wei Lee, Weidong Kou, and Wen-Chen Hu ... 1766
Chapter 4.11. Creating a Policy-Aware Web: Discretionary, Rule-Based Access for the World Wide Web / Daniel J. Weitzner, Jim Hendler, Tim Berners-Lee, and Dan Connolly ... 1774
Chapter 4.12. Intrusion Detection and Response / David A. Dampier and Ambareen Siraj ... 1795
Chapter 4.13. E-Business Systems Security for Intelligent Enterprise / Denis Trcek ... 1806
Chapter 4.14. Security and Trust in Mobile Multimedia / Edgar R. Weippl ... 1812
Chapter 4.15. Comparing the Security Architectures of Sun ONE and Microsoft .NET / Eduardo B. Fernandez, Michael Thomsen, and Minjie H. Fernandez ... 1828
Chapter 4.16. Secure Data Dissemination / Elisa Bertino, Elena Ferrari, and Barbara Carminati ... 1839
Chapter 4.17. Experiences from Using the CORAS Methodology to Analyze a Web Application / Folker den Braber, Arne Bjørn Mildal, Jone Nes, Ketil Stølen, and Fredrik Vraalsen ... 1865
Chapter 4.18. Smart Card Applications and Systems: Market Trends and Impact on Other Technological Developments / Gerald Maradan, Pierre Cotte, and Thierry Fornas ... 1884
Chapter 4.19. RFID in Retail Supply Chain / Claudia Loebbecke ... 1923
Chapter 4.20. Business Ethics and Technology in Turkey: An Emerging Country at the Crossroad of Civilizations / Gonca Telli Yamamoto and Faruk Karaman ... 1931
Chapter 4.21. Online Signature Recognition / Indrani Chakravarty, Nilesh Mishra, Mayank Vatsa, Richa Singh, and P. Gupta ... 1947
Chapter 4.22. Security Issues and Possible Countermeasures for a Mobile Agent Based M-Commerce Application / Jyh-haw Yeh, Wen-Chen Hu, and Chung-wei Lee ... 1956
Chapter 4.23. Realized Applications of Positioning Technologies in Defense Intelligence / Katina Michael and Amelia Masters ... 1975
Chapter 4.24. Computer Security and Risky Computing Practices: A Rational Choice Perspective / Kregg Aytes and Terry Connolly ... 1994
Chapter 4.25. Privacy and Trust in Agent-Supported Distributed Learning / Larry Korba, George Yee, Yuefei Xu, Ronggong Song, Andrew S. Patrick, and Khalil El-Khatib ... 2012
Chapter 4.26. Better Securing an Infrastructure for Telework / Loreen Marie Butcher-Powell ... 2044
Chapter 4.27. A Method of Assessing Information System Security Controls / Malcolm R. Pattinson ... 2059
Chapter 4.28. Electronic Banking and Information Assurance Issues: Survey and Synthesis / Manish Gupta, Raghav Rao, and Shambhu Upadhyaya ... 2075
Chapter 4.29. Security, Privacy, and Trust in Mobile Systems / Marco Cremonini, Ernesto Damiani, Sabrina De Capitani di Vimercati, and Pierangela Samarati ... 2095
Chapter 4.30. Seamlessly Securing Web Services by a Signing Proxy / Mario Jeckle and Ingo Melzer ... 2103
Chapter 4.31. A TAM Analysis of an Alternative High-Security User Authentication Procedure / Merrill Warkentin, Kimberly Davis, and Ernst Bekkering ... 2114
Chapter 4.32. IT Security Governance and Centralized Security Controls / Merrill Warkentin and Allen C. Johnston ... 2130
Chapter 4.33. Securing E-Learning Systems: A Case of Insider Cyber Attacks and Novice IT Management in a Small University / Michelle Ramim and Yair Levy ... 2139
Chapter 4.34. The Next Big RFID Application: Correctly Steering Two Billion Bags a Year Through Today’s Less-Than-Friendly Skies / David C. Wyld ... 2149
Chapter 4.35. Policy-Based Access Control for Context-Aware Services over the Wireless Internet / Paolo Bellavista, Antonio Corradi, and Cesare Stefanelli ... 2163
Chapter 4.36. Data and Application Security for Distributed Application Hosting Services / Ping Lin and K. Selçuk Candan ... 2187
Chapter 4.37. Information Privacy in a Surveillance State: A Perspective from Thailand / Pirongrong Ramasoota Rananand ... 2221
Chapter 4.38. An Integrated Security Verification and Security Solution Design Trade-Off Analysis Approach / S. H. Houmb, G. Georg, J. Jürjens, and R. France ... 2234
Chapter 4.39. Security Issues and Possible Countermeasures for a Mobile Agent Based M-Commerce Application / Jyh-haw Yeh, Wen-Chen Hu, and Chung-wei Lee ... 2259
Chapter 4.40. Secure Agent for E-Commerce Applications / Sheng-Uei Guan ... 2278
Chapter 4.41. A Case Study on a Security Maturity Assessment of a Business-to-Business Electronic Commerce Organization / Shirley Ann Becker and Anthony Berkemeyer ... 2286
Chapter 4.42. Trustworthy Web Services: An Experience-Based Model for Trustworthiness Evaluation / Stephen J. H. Yang, Blue C. W. Lan, James S. F. Hsieh, and Jen-Yao Chung ... 2303
Chapter 4.43. Perceptions of End-Users on the Requirements in Personal Firewall Software: An Exploratory Study / Sunil Hazari ... 2319
Chapter 4.44. Determining the Intention to Use Biometric Devices: An Application and Extension of the Technology Acceptance Model / Tabitha James, Taner Pirim, Katherine Boswell, Brian Reithel, and Reza Barkhi ... 2335
Chapter 4.45. Security System for Distributed Business Applications / Thomas Schmidt, Gerald Wippel, Klaus Glanzer, and Karl Fürst ... 2356
Chapter 4.46. Incident Preparedness and Response: Developing a Security Policy / Warren Wylupski, David R. Champion, and Zachary Grant ... 2366
Chapter 4.47. Applying Directory Services to Enhance Identification, Authentication, and Authorization for B2B Applications / Yuan-Yuan Jiao, Jun Du, and Jianxin (Roger) Jiao ... 2388
Chapter 4.48. Risk Factors to Retrieve Anomaly Intrusion Information and Profile User Behavior / Yun Wang and Lee Seidman ... 2407
Chapter 4.49. Information Security for Legal Safety / Andreas Mitrakas ... 2422
Section 5. Organizational and Social Implications
This section includes a wide range of research pertaining to the social and organizational impact of information security technologies around the world. Chapters introducing this section critically analyze the links between computing and cultural diversity as well as the natural environment. Additional chapters included in this section examine the link between ethics and IT and the influence of gender on ethical considerations in the IT environment. Also investigating a concern within the field of information security is research which provides an introductory overview of identity management as it relates to data networking and enterprise information management systems. With 32 chapters, the discussions presented in this section offer research into the integration of security technology as well as implementation of ethical considerations for all organizations.
Chapter 5.1. We Cannot Eat Data: The Need for Computer Ethics to Address the Cultural and Ecological Impacts of Computing / Barbara Paterson ... 2432
Chapter 5.2. Privacy and Property in the Global Datasphere / Dan L. Burk ... 2448
Chapter 5.3. A Social Ontology for Integrating Security and Software Engineering / E. Yu, L. Liu, and J. Mylopoulos
Chapter 5.4. Computer Security in E-Learning / Edgar R. Weippl
Chapter 5.5. Trust in Virtual Communities / Eun G. Park
Chapter 5.6. Online Communities, Democratic Ideals, and the Digital Divide / Frances S. Grodzinsky and Herman T. Tavani
Chapter 5.7. Security and Privacy in Distance Education / George Yee
Chapter 5.8. Ethics and Privacy of Communications in the E-Polis / Gordana Dodig-Crnkovic and Virginia Horniak
Chapter 5.9. A Process Data Warehouse for Tracing and Reuse of Engineering Design Processes / Sebastian C. Brandt, Marcus Schlüter, and Matthias Jarke
Chapter 5.10. The Impact of the Sarbanes-Oxley (SOX) Act on Information Security / Gurpreet Dhillon and Sushma Mishra
Chapter 5.11. Privacy Implications of Organizational Data Mining / Hamid R. Nemati, Charmion Brathwaite, and Kara Harrington
Chapter 5.12. Patents and Standards in the ICT Sector: Are Submarine Patents a Substantive Problem or a Red Herring? / Aura Soinen
Chapter 5.13. Gender Influences on Ethical Considerations in the IT Environment / Jessica Leong
Chapter 5.14. Radio Frequency Identification Technology in Digital Government / Les Pang
Chapter 5.15. Gender Differences in the Navigation of Electronic Worlds / Sharon McDonald and Lynne Humphries
Chapter 5.16. Identity Management: A Comprehensive Approach to Ensuring a Secure Network Infrastructure / Katherine M. Hollis and David M. Hollis
Chapter 5.17. Conducting Congruent, Ethical, Qualitative Research in Internet-Mediated Research Environments / M. Maczewski, M.-A. Storey, and M. Hoskins
Chapter 5.18. Electronic Banking and Information Assurance Issues: Survey and Synthesis / Manish Gupta, Raghav Rao, and Shambhu Upadhyaya
Chapter 5.19. Model Driven Security for Inter-Organizational Workflows in E-Government / Michael Hafner, Barbara Weber, Ruth Breu, and Andrea Nowak
Chapter 5.20. Entrepreneur Behaviors on E-Commerce Security / Michael Kyobe
Chapter 5.21. Ethics of Deception in Virtual Communities / Neil C. Rowe
Chapter 5.22. Information Security Policies in Large Organisations: The Development of a Conceptual Framework to Explore Their Impact / Neil F. Doherty and Heather Fulford
Chapter 5.23. IT Security Policy in Public Organizations / Parviz Partow-Navid and Ludwig Slusky
Chapter 5.24. Interactions among Thai Culture, ICT, and IT Ethics / Pattarasinee Bhattarakosol
Chapter 5.25. HIPAA: Privacy and Security in Health Care Networks / Pooja Deshmukh and David Croasdell
Chapter 5.26. Communication Security Technologies in Smart Organizations / Raphael C. W. Phan
Chapter 5.27. The Social Contract Revised: Obligation and Responsibility in the Information Society / Robert Joseph Skovira
Chapter 5.28. Examining User Perceptions of Third-Party Organization Credibility and Trust in an E-Retailer / Robin L. Wakefield and Dwayne Whitten
Chapter 5.29. Repeated Use of E-Gov Web Sites: A Satisfaction and Confidentiality Perspective / Sangmi Chai, T. C. Herath, I. Park, and H. R. Rao
Chapter 5.30. Information Security Risk Analysis: A Pedagogic Model Based on a Teaching Hospital / Sanjay Goel and Damira Pon
Chapter 5.31. Authorization Service for Web Services and its Application in a Health Care Domain / Sarath Indrakanti, Vijay Varadharajan, and Michael Hitchens
Chapter 5.32. Secure Agent Roaming for Mobile Business / Sheng-Uei Guan
Chapter 5.33. Social Issues of Trust and Digital Government / Stephen Marsh, Andrew S. Patrick, and Pamela Briggs
Section 6. Managerial Impact
This section presents contemporary coverage of the social implications of information security and ethics, more specifically related to the corporate and managerial utilization of information sharing technologies and applications, and how these technologies can be facilitated within organizations. Core ideas such as training and continuing education of human resources in modern organizations are discussed through these 12 chapters. Issues such as strategic planning related to the organizational elements and information security program requirements that are necessary to build a framework in order to institutionalize and sustain information systems as a core business process are discussed. Equally crucial, chapters within this section examine the internal, external/environmental, and behavioral dimensions of information privacy, while analyzing findings for e-entrepreneurship and e-business ethics. Concluding this section is research which examines the growth of the Internet and the effects of the wide availability of toolsets and documentation, which make malware development easy. Security issues such as phishing, pharming, spamming, spoofing, spyware, and hacking incidents are explained while offering security options to defend against these increasingly complex breaches of security and privacy.
Chapter 6.1. Online Information Privacy and its Implications for E-Entrepreneurship and E-Business Ethics / Carmen Gould and Fang Zhao
Chapter 6.2. E-Business Systems Security for Intelligent Enterprise / Denis Trček
Chapter 6.3. Resistance: A Medium for the Successful Implementation of Technological Innovation / Dianne Waddell
Chapter 6.4. A Model of Information Security Governance for E-Business / Dieter Fink, Tobias Huegle, and Martin Dortschy
Chapter 6.5. Implementing IT Security for Small and Medium Enterprises / Edgar R. Weippl and Markus Klemen
Chapter 6.6. Workarounds and Security / Fiona Brady
Chapter 6.7. Policy-Based Management of Web and Information System Security: An Emerging Technology / Gregorio Martínez Pérez, Félix J. García Clemente, and Antonio F. Gómez Skarmeta
Chapter 6.8. Exploring the Behavioral Dimension of Client/Server Technology Implementation: An Empirical Investigation / Eital J. M. Lauría
Chapter 6.9. A Security Blueprint for E-Business Applications: A Crime Text Mining Approach / Jun Du, Yuan-Yuan Jiao, and Jianxin (Roger) Jiao
Chapter 6.10. Integration of Business Event and Rule Management With the Web Services Model / Karthik Nagarajan, Herman Lam, and Stanley Y.W. Su
Chapter 6.11. Privacy and Security in the Age of Electronic Customer Relationship Management / Nicholas C. Romano, Jr. and Jerry Fjermestad
Chapter 6.12. Malware and Antivirus Deployment for Enterprise Security / Raj Sharman, K. Pramod Krishna, H. Raghov Rao, and Shambhu Upadhyaya
Section 7. Critical Issues
This section contains 43 chapters addressing issues such as computer ethics, identity theft, e-fraud, social responsibility, cryptography, and online relationships, to name a few. Within the chapters, the reader is presented with an in-depth analysis of the most current and relevant issues within this growing field of study. Studies of the effects of technological innovation in the light of theories of regulation are presented, while analytical frameworks for new forms of information warfare that threaten commercial and government computing systems are discussed. Crucial questions are addressed and alternatives offered, such as the notion of social responsibility and its relationship to the information age. The section closes with a discussion of the mutual influence between culture and technology on a broad inter- and transcultural level, offering the researcher many options for further research.
Chapter 7.1. Computer Ethics: Constitutive and Consequential Morality / A. Raghuramaraju
Chapter 7.2. Identity Theft and E-Fraud as Critical CRM Concerns / Alan D. Smith and Allen R. Lias
Chapter 7.3. Web Accessibility for Users with Disabilities: A Multi-faceted Ethical Analysis / Alfreda Dudley-Sponaugle and Jonathan Lazar
Chapter 7.4. Trust in E-Technologies / Andrea Oermann and Jana Dittmann
Chapter 7.5. Password Security Issues on an E-Commerce Site / B. Dawn Medlin, Joseph A. Cazier and Dinesh S. Dave
Chapter 7.6. MAMA on the Web: Ethical Considerations for Our Networked World / Barbara A. Schuldt
Chapter 7.7. What is the Social Responsibility in the Information Age? Maximising Profits? / Bernd Carsten Stahl
Chapter 7.8. Responsibility for Information Assurance and Privacy: A Problem of Individual Ethics? / Bernd Carsten Stahl
Chapter 7.9. Intellectual Property Rights, Resources Allocation and Ethical Usefulness / Bruno de Vuyst and Alea M. Fairchild
Chapter 7.10. Arguing Satisfaction of Security Requirements / C. B. Haley, R. Laney, J. D. Moffett, and B. Nuseibeh
Chapter 7.11. Negotiating Online Privacy Rights / Călin Gurău
Chapter 7.12. Integrity and Security in the E-Century / Carolyn Currie
Chapter 7.13. Simulating Complexity-Based Ethics for Crucial Decision Making in Counter Terrorism / Cecilia Andrews and Edward Lewis
Chapter 7.14. Moral Psychology and Information Ethics: Psychological Distance and the Components of Moral Behavior in a Digital World / Charles R. Crowell, Darcia Narvaez, and Anna Gomberg
Chapter 7.15. Issues on Image Authentication / Ching-Yung Lin
Chapter 7.16. Data Confidentiality on the Semantic Web: Is There an Inference Problem? / Csilla Farkas
Chapter 7.17. Blurring the Boundaries: Ethical Considerations for Online Research Using Synchronous CMC Forums / Danielle Lawson
Chapter 7.18. Computing Ethics: Intercultural Comparisons / Darryl Macer
Chapter 7.19. Does ‘Public Access’ Imply ‘Ubiquitous’ or ‘Immediate’? Issues Surrounding Public Documents Online / David W. Miller, Andrew Urbaczewski, and Wm. David Salisbury
Chapter 7.20. A Psychoanalytic Perspective of Internet Abuse / Feng-Yang Kuo
Chapter 7.21. Protection of Minors from Harmful Internet Content / Geoffrey A. Sandy
Chapter 7.22. A Critical Systems View of Power-Ethics Interactions in Information Systems Evaluation / José-Rodrigo Córdoba
Chapter 7.23. Information Quality: Critical Ingredient for National Security / Larry P. English
Chapter 7.24. Insights from Y2K and 9/11 for Enhancing IT Security / Laura Lally
Chapter 7.25. Gender Differences in Ethics Perceptions in Information Technology / Leone E. Woodcock and San Murugesan
Chapter 7.26. Cryptography: Deciphering Its Progress / Leslie Leong and Andrzej T. Jarmoszko
Chapter 7.27. Privacy-Preserving Data Mining and the Need for Confluence of Research and Practice / Lixin Fu, Hamid Nemati, and Fereidoon Sadri
Chapter 7.28. The Existential Significance of the Digital Divide for America’s Historically Underserved Populations / Lynette Kvasny
Chapter 7.29. Ethics in Internet Ethnography / Malin Sveningsson
Chapter 7.30. The Critical Role of Digital Rights Management Processes in the Context of the Digital Media Management Value Chain / Margherita Pagani
Chapter 7.31. Peering into Online Bedroom Windows: Considering the Ethical Implications of Investigating Internet Relationships and Sexuality / Monica Whitty
Chapter 7.32. Analyzing the Influence of Web Site Design Parameters on Web Site Usability / Monideepa Tarafdar and Jie (Jennifer) Zhang
Chapter 7.33. Biometrics: A Critical Consideration in Information Security Management / Paul Benjamin Lowry, Jackson Stephens, Aaron Moyes, Sean Wilson, and Mark Mitchell
Chapter 7.34. Online Privacy: Consumer Concerns and Technological Competence / Pushkala Raman and Kartik Pashupati
Chapter 7.35. Security Issues in Distributed Transaction Processing Systems / R. A. Haraty
Chapter 7.36. Hacker Wars: E-Collaboration by Vandals and Warriors / Richard Baskerville
Chapter 7.37. Does Protecting Databases Using Perturbation Techniques Impact Knowledge Discovery? / Rick L. Wilson and Peter A. Rosen
Chapter 7.38. Ethics of “Parasitic Computing”: Fair Use or Abuse of TCP/IP Over the Internet / Robert N. Barger and Charles R. Crowell
Chapter 7.39. Ethical Dilemmas in Online Research / Rose Melville
Chapter 7.40. Security Vulnerabilities and Exposures in Internet Systems and Services / Rui C. Cardoso and Mário M. Freire
Chapter 7.41. Security in Pervasive Computing / Sajal K. Das, Afrand Agah and Mohan Kumar
Chapter 7.42. Analysis and Justification of Privacy from a Buddhist Perspective / Soraj Hongladarom
Chapter 7.43. Up In Smoke: Rebuilding after an IT Disaster / Steven C. Ross, Craig K. Tyran, and David J. Auer
Chapter 7.44. Culture and Technology: A Mutual-Shaping Approach / Thomas Herdin, Wolfgang Hofkirchner, and Ursula Maier-Rabler
Section 8. Emerging Trends
This section highlights research potential within the field of information security and ethics while exploring uncharted areas of study for the advancement of the discipline. Introducing this section are chapters that set the stage for future research directions and topical suggestions for continued debate. Discussions regarding the Normal Accident Theory and the Theory of High Reliability Organizations are offered. Another debate currently at the forefront of research concerns three major ethical theories, Lockean liberalism, consequentialism, and Kantian deontology, and the implications of these theories as they are applied to intellectual property rights in digitally distributed media. Found in these 20 chapters concluding this exhaustive multi-volume set are areas of emerging trends and suggestions for future research within this rapidly expanding discipline.
Chapter 8.1. Security Engineering for Ambient Intelligence: A Manifesto / A. Maña, C. Rudolph, G. Spanoudakis, V. Lotz, F. Massacci, M. Melideo, and J. S. López-Cobo
Chapter 8.2. Enforcing Privacy on the Semantic Web / Abdelmounaam Rezgui, Athman Bouguettaya, and Zaki Malik
Chapter 8.3. Strategic Importance of Security Standards / Alan D. Smith
Chapter 8.4. Computer Security in Electronic Government: A State-Local Education Information System / Alison Radl and Yu-Che Chen
Chapter 8.5. Teaching Gender Inclusive Computer Ethics / Eva Turner
Chapter 8.6. A Secure Authentication Infrastructure for Mobile Users / Gregor V. Bochmann and Eric Zhen Zhang
Chapter 8.7. Integrating Security and Software Engineering: Future Vision and Challenges / H. Mouratidis and P. Giorgini
Chapter 8.8. Copyright Protection in Virtual Communities through Digital Watermarking / Huayin Si and Chang-Tsun Li
Chapter 8.9. Analyzing the Privacy of a Vickrey Auction Mechanism / Ismael Rodríguez and Natalia López
Chapter 8.10. The Ethics of Web Design: Ensuring Access for Everyone / Jack S. Cook and Laura Cook
Chapter 8.11. Addressing the Central Problem in Cyber Ethics through Stories / John M. Artz
Chapter 8.12. The Moral Status of Information and Information Technologies: A Relational Theory of Moral Status / Johnny Hartz Søraker
Chapter 8.13. Radio Frequency Identification as a Challenge to Information Security and Privacy / Jorma Kajava, Juhani Antilla, and Rauno Varonen
Chapter 8.14. Intellectual Property Rights—or Rights to the Immaterial—in Digitally Distributable Media Gone All Wrong / Kai Kristian Kimppa
Chapter 8.15. Computer Security and Risky Computing Practices: A Rational Choice Perspective / Kregg Aytes and Terry Connolly
Chapter 8.16. Information Technology as a Target and Shield in the Post 9/11 Environment / Laura Lally
Chapter 8.17. Digital Watermarking for Protection of Intellectual Property / Mohamed Abdulla Suhail
Chapter 8.18. Tracing Cyber Crimes with a Privacy-Enabled Forensic Profiling System / Pallavi Kahai, Kamesh Namuduri, and Ravi Pendse
Chapter 8.19. The Ethics of Conducting E-Mail Surveys / Sandeep Krishnamurthy
Chapter 8.20. Face Recognition Technology: A Biometric Solution to Security Problems / Sanjay K. Singh, Mayank Vatsa, Richa Singh, K. K. Shukla, and Lokesh R. Boregowda
Chapter 8.21. A Model for Monitoring and Enforcing Online Auction Ethics / Shouhong Wang and Diana Kao
Chapter 8.22. Malware: An Evolving Threat / Steven Furnell and Jeremy Ward
Preface
Emphasis on knowledge and information is one of the key factors that differentiate the intelligent business enterprise of the 21st century. In order to harness knowledge and information to improve effectiveness, enterprises of the new millennium must capture, manage, and utilize information rapidly in an effort to keep pace with continually changing technology. Information security and ethical considerations of technology are important means by which organizations can better manage and secure information. Not easily defined, the field of information security and ethics embodies a plethora of categories within the field of information science and technology. Over the past two decades, numerous researchers have developed a variety of techniques, methodologies, and measurement tools that have allowed them to develop, deliver, and at the same time evaluate the effectiveness of several areas of information security and ethics. The explosion of these technologies and methodologies has created an abundance of new, state-of-the-art literature related to all aspects of this expanding discipline, allowing researchers and practicing educators to learn about the latest discoveries within the field.
Rapid technological changes, combined with a much greater interest in discovering innovative techniques to manage information security in today’s modern organizations, have led researchers and practitioners to continually search for literature that will help them stay abreast of the far-reaching effects of these changes, as well as to facilitate the development and delivery of more ground-breaking methodologies and techniques utilizing new technological innovation.
In order to provide the most comprehensive, in-depth, and recent coverage of all issues related to information security and ethics, and to offer a single reference source on all conceptual, methodological, technical, and managerial issues, as well as the opportunities, future challenges, and emerging trends related to this subject, Information Science Reference is pleased to offer a six-volume reference collection on this rapidly growing discipline, in order to empower students, researchers, academicians, and practitioners with a comprehensive understanding of the most critical areas within this field of study.
This collection, Information Security and Ethics: Concepts, Methodologies, Tools, and Applications, is organized in eight distinct sections, providing wide-ranging coverage of topics such as: (1) Fundamental Concepts and Theories; (2) Development and Design Methodologies; (3) Tools and Technologies; (4) Utilization and Application; (5) Organizational and Social Implications; (6) Managerial Impact; (7) Critical Issues; and (8) Emerging Trends. The following provides a summary of what is covered in each section of this multi-volume reference collection:
Section 1, Fundamental Concepts and Theories, serves as a foundation for this exhaustive reference tool by addressing crucial theories essential to the understanding of information security and ethics. Chapters such as “Leadership Style, Anonymity, and the Discussion of an Ethical Issue in an Electronic Context” by Surinder S. Kahai and Bruce J. Avolio, as well as “Information Security Management” by Mariana Hentea, provide an excellent framework in which to position information security and ethics within the field of information science and technology. “Privacy and Security in E-Learning” by George Yee, Yuefei Xu, Larry Korba, and Khalil El-Khatib offers excellent insight into the critical incorporation of security measures into online and distance learning systems, while chapters such as “A Unified Information Security Management Plan” by Mari W. Buch and Chelley Vician address some of the basic, yet crucial, stumbling blocks of information management. With 45 chapters comprising this foundational section, the reader can learn and choose from a compendium of expert research on the elemental theories underscoring the information security and ethics discipline.
Section 2, Development and Design Methodologies, provides in-depth coverage of conceptual architecture frameworks to provide the reader with a comprehensive understanding of the emerging technological developments within the field of information security and ethics. “Framework for Secure Information Management in Critical Systems” by Rajgopal Kannan, S. Sitharama Iyengar, and A. Durresi offers research fundamentals imperative to the understanding of research and developmental processes within information management. From broad examinations to specific discussions on security tools such as Tsau Young Lin’s “Chinese Wall Security Policy Model: Granular Computing on DAC Model,” the research found within this section spans the discipline while offering detailed, specific discussions. From basic designs to abstract development, chapters such as “Do Information Security Policies Reduce the Incidence of Security Breaches: An Exploratory Analysis” by Neil F. Doherty and Heather Fulford, and “Potential Security Issues in a Peer-to-Peer Network from a Database Perspective” by Sridhar Asvathanarayanan, serve to expand the reaches of development and design technologies within the information security and ethics community. This section includes over 28 contributions from researchers throughout the world on the topic of information security and privacy within the information science and technology field.
Section 3, Tools and Technologies, presents extensive coverage of various tools and technologies available in the field of information security and ethics that practitioners and academicians alike can utilize to develop different techniques. Chapters such as Paloma Díaz, Daniel Sanz, Susana Montero, and Ignacio Aedo’s “Integrating Access Policies into the Development Process of Hypermedia Web Systems” enlighten readers about fundamental research on one of the many methods used to facilitate and enhance the integration of security controls in hypermedia systems, whereas chapters like “A National Information Infrastructure Model for Information Warfare Defence?” by Vernon Stagg and Matthew Warren explore defense strategies for information warfare, an increasingly pertinent research arena. It is through these rigorously researched chapters that the reader is provided with countless examples of the up-and-coming tools and technologies emerging from the field of information security and ethics. With more than 32 chapters, this section offers a broad treatment of some of the many tools and technologies within the IT security community.
Section 4, Utilization and Application, discusses a variety of applications and opportunities available that can be considered by practitioners in developing viable and effective information security programs and processes. This section includes more than 47 chapters, such as “Law, CyberCrime and Digital Forensics: Trailing Digital Suspects” by Andreas Mitrakas and Damián Zaitch, which reviews certain legal aspects of forensic investigation, the overall legal framework in the EU and U.S., and additional self-regulatory measures that can be leveraged to investigate cyber crime in forensic investigations. Additional chapters such as Sunil Hazari’s “Perceptions of End-Users on the Requirements in Personal Firewall Software: An Exploratory Study” investigate issues affecting selection of personal firewall software in organizations. Also considered in this section are the challenges faced when utilizing information security and ethics with healthcare systems, as outlined in Christina Ilioudi and Athina Lazakidou’s “Security in Health Information Systems.” Contributions included in this section provide excellent coverage of today’s global community and how research into information security and ethics is impacting the social fabric of our present-day global village.
Section 5, Organizational and Social Implications, includes a wide range of research pertaining to the social and organizational impact of information security technologies around the world. Introducing this section is Barbara Paterson’s chapter, We Cannot Eat Data: The Need for Computer Ethics to Address the Cultural and Ecological Impacts of Computing, which critically analyzes the links between computing and cultural diversity as well as the natural environment. Additional chapters included in this section such as Gender Influences on Ethical Considerations in the IT Environment by Jessica Leong examine the link between ethics and IT and the influence of gender on ethical considerations in the IT environment. Also investigating a concern within the field of information security is Katherine M. Hollis and David M. Hollis’ Identity Management: A Comprehensive Approach to Ensuring a Secure Network Infrastructure, which provides an introductory overview of identity management as it relates to data networking and enterprise information management systems. With 32 chapters the discussions presented in this section offer research into the integration of security technology as well as implementation of ethical considerations for all organizations. Section 6, Managerial Impact, presents contemporary coverage of the social implications of information security and ethics, more specifically related to the corporate and managerial utilization of information sharing technologies and applications, and how these technologies can be facilitated within organizations. Core ideas such as training and continuing education of human resources in modern organizations are discussed through these 12 chapters. 
A Security Blueprint for E-Business Applications by Jun Du, Yuan-Yuan Jiao and Jianxin (Roger) Jiao discusses strategic planning related to the organizational elements and information security program requirements that are necessary to build a framework in order to institutionalize and sustain information systems as a core business process. Equally crucial, chapters such as Online Information Privacy and Its Implications for E-Entrepreneurship and E-Business Ethics by Carmen Gould and Fang Zhao contain a comprehensive examination of the internal, external/environmental, and behavioral dimensions of information privacy, as well as a description of findings for e-entrepreneurship and e-business ethics. Concluding this section is a chapter by Raj Sharman, K. Pramod Krishna, H. Raghov Rao and Shambhu Upadhyaya, Malware and Antivirus Deployment for Enterprise Security. This chapter examines the growth of the Internet and the effects of the wide availability of toolsets and documentation, which make malware development easy. As blended threats continue to combine multiple types of attacks into single and more dangerous payloads, newer threats are emerging. These professors explore phishing, pharming, spamming, spoofing, spyware, and hacking incidents while offering security options to defend against these increasingly complex breaches of security and privacy.

Section 7, Critical Issues, contains 43 chapters addressing issues such as computer ethics, identity theft, e-fraud, social responsibility, cryptography, and online relationships, to name a few. Within the chapters, the reader is presented with an in-depth analysis of the most current and relevant issues within this growing field of study.
Carolyn Currie’s Integrity and Security in the E-Century studies the effects of technological innovation in the light of theories of regulation that postulate a struggle between attempts to control innovation and further innovation and regulation, while Hacker Wars: E-Collaboration by Vandals and Warriors by Richard Baskerville develops an analytical framework for new forms of information warfare that may threaten commercial and government computing systems by using e-collaboration in new ways. Crucial questions are addressed, such as that presented in Bernd Carsten Stahl’s chapter, What is the Social Responsibility in the Information Age? Maximising Profits?, which analyzes the notion of social responsibility and its relationship to the information age while expressing some of the normative questions of the information age. Culture and Technology: A Mutual-Shaping Approach by Thomas Herdin, Wolfgang Hofkirchner and Ursula Maier-Rabler closes this section with a discussion of the mutual influence between culture and technology on a broad inter- and transcultural level.
The concluding section of this authoritative reference tool, Emerging Trends, highlights research potential within the field of information security and ethics while exploring uncharted areas of study for the advancement of the discipline. Introducing this section is a chapter entitled, Security Engineering for Ambient Intelligence: A Manifesto, by A. Maña, C. Rudolph, G. Spanoudakis, V. Lotz, F. Massacci, M. Melideo, and J. S. López-Cobo which sets the stage for future research directions and topical suggestions for continued debate. Providing an alternative view of security in our post 9/11 world is Information Technology as a Target and Shield in the Post 9/11 Environment by Laura Lally. This chapter draws upon normal accident theory and the theory of high reliability organizations to examine the potential impacts of information technology being used as a target in terrorist and other malicious attacks, while arguing that IT can also be used as a shield to prevent further attacks and mitigate their impact if they should occur. Another debate which currently finds itself at the forefront of research within this field is presented by Kai Kristian Kimppa’s research, Intellectual Property Rights - or Rights to the Immaterial - in Digitally Distributable Media Gone All Wrong which discusses three major ethical theories, Lockean liberalism, consequentialism, and Kantian deontology and the implication of these three theories as they are applied to intellectual property rights in digitally distributed media. Found in these 20 chapters concluding this exhaustive multi-volume set are areas of emerging trends and suggestions for future research within this rapidly expanding discipline. 
Although the primary organization of the contents in this multi-volume set is based on its eight sections, offering a progression of coverage of the important concepts, methodologies, technologies, applications, social issues, and emerging trends, the reader can also identify specific contents by utilizing the extensive indexing system listed at the end of each volume. Furthermore, to ensure that the scholar, researcher and educator have access to the entire contents of this multi-volume set, as well as additional coverage that could not be included in the print version of this publication, the publisher will provide unlimited multiuser electronic access to the online aggregated database of this collection for the life of the edition, free of charge when a library purchases a print copy. This aggregated database provides far more content than can be included in the print version, in addition to continual updates. This unlimited access, coupled with the continuous updates to the database, ensures that the most current research is accessible to knowledge seekers. Information security and ethics as a discipline has witnessed fundamental changes during the past two decades, allowing information seekers around the globe to have access to information which, two decades ago, was inaccessible. In addition to this transformation, many traditional organizations and business enterprises have taken advantage of the technologies offered by the development of information security systems in order to expand and augment their existing programs. This has allowed practitioners and researchers to serve their customers, employees and stakeholders more effectively and efficiently in the modern virtual world.
With continued technological innovations in information and communication technology and with on-going discovery and research into newer and more innovative techniques and applications, the information security and ethics discipline will continue to witness an explosion of information within this rapidly evolving field. The diverse and comprehensive coverage of information security and ethics in this six-volume authoritative publication will contribute to a better understanding of all topics, research, and discoveries in this developing, significant field of study. Furthermore, the contributions included in this multi-volume collection series will be instrumental in the expansion of the body of knowledge in this enormous field, resulting in a greater understanding of the fundamentals while fueling the research initiatives in emerging fields. We at Information Science Reference, along with the editor of this collection, and the publisher hope that this multi-volume collection will become instrumental in the expansion of the discipline and will promote the continued growth of information security and ethics.
Introductory Chapter:
Information Security and Ethics Hamid Nemati The University of North Carolina at Greensboro, USA
This book is dedicated to those whose ethics transcend their security fears. Hamid R. Nemati
ABSTRACT

Information security and ethics has been viewed as one of the foremost areas of concern and interest by academic researchers and industry practitioners. Information security and ethics is defined as an all encompassing term that refers to all activities needed to secure information and systems that support it in order to facilitate its ethical use. In this introductory chapter, this very important field of study is introduced and the fundamental concepts and theories are discussed. A broad discussion of tools and technologies used to achieve the goals of information security and ethics is followed by a discussion of guidelines for the design and development of such tools and technologies. Managerial, organizational and societal implications of information security and ethics are then evaluated. The chapter concludes after an assessment of a number of future developments and activities on the horizon that will have an impact on this field.
INTRODUCTION

Information defines us. It defines the age we live in and the societies we inhabit. Information is the output of our human intellectual endeavors, which inherently defines who we are as humans and how we conduct our lives. We speak of the age we live in as the “information age” and our society as the “information society.” The emergence of the society based on information signals a transition toward a new society based on the production and exchange of information as opposed to physical goods (Stephanidis et al., 1984). Information society refers to the new socioeconomic and technological paradigms that affect our human activities, individual behaviors, our collective consciousness, and our economic and social environments. The information age has important consequences for our lives as well. Essentially, it has ushered in a new range of emerging computer-mediated activities that have revolutionized the way we live and interact with one another (Mesthene, 1968; Nardi, 1996; Stephanidis et al., 1984). More people
are employed generating, collecting, handling, processing and distributing information than any other profession and in any other time (Mason 1986). New technology makes possible what was not possible before. This alters our old value clusters whose hierarchies were determined by a range of possibilities open to us at the time. By making available new options, new technologies can and will lead to a restructuring of the hierarchy of values (Mesthene, 1968). Mason argues that unique challenges facing our information society are the result of the evolving nature of information itself. Our modern notion of who we are and how we interact with information is based on the works of Greek philosopher Aristotle. Aristotle’s theory of animal behavior treats animals as information-processing entities. Bynum (2006) states “that the physiology of an animal, according to Aristotle determines: (1) the kinds of perceptual information that the animal can take in, (2) how this information is processed within the animal’s body, and (3) what the resulting animal behavior will be.” Bynum goes on to say that according to Aristotle, the most sophisticated information processing occurs in human beings. This human capacity to process information and to engage in rational thinking is what Aristotle refers to as intellect. This intellect is the foundation by which humans can engage in complex activities such as “concept formation”, “reasoning”, and “decision making” (Bynum, 2006). Therefore to facilitate information and its processing is akin to enhancing human intellectual activities which uniquely distinguishes us from other beings. We are the first generation of humans where the capabilities of the technologies that support our information processing activities are truly revolutionary and far exceed those of our forefathers. 
Although this technological revolution has brought us closer and has made our lives easier and more productive, paradoxically, it has also made us more capable of harming one another and more vulnerable to being harmed by each other. Our vulnerabilities are the consequence of our capabilities. Mason argues that in this age of information, a new form of social contract is needed in order to deal with the potential threats to the information which defines us. Mason (1986) states “Our moral imperative is clear. We must insure that information technology, and the information it handles, are used to enhance the dignity of mankind. To achieve these goals we must formulate a new social contract, one that ensures everyone the right to fulfill his or her own human potential” (Mason, 1986, p. 26). In light of the Aristotelian notion of the intellect, this new social contract has a profound implication for the way our society views information and the technologies that support it. For information technology (IT) to enhance “human dignity,” it should assist humans in exercising their intellects ethically. But is it possible to achieve this without assuring the trustworthiness of information and the integrity of the technologies we are using? Without security that guarantees the trustworthiness of information and the integrity of our technologies, ethical uses of the information cannot be realized. This implies that securing information and its ethical uses are inherently intertwined and should be viewed synergistically. Therefore, we define information security and ethics as an all encompassing term that refers to all activities needed to secure information and systems that support it in order to facilitate its ethical use. Until recently, information security was exclusively discussed in terms of mitigating risks associated with data and the organizational and technical infrastructure that supported it.
With the emergence of the new paradigm in information technology, the role of information security and ethics has evolved. As Information Technology and the Internet become more and more ubiquitous and pervasive in our daily lives, a more thorough understanding of issues and concerns over the information security and ethics is becoming one of the hottest trends in the whirlwind of research and practice of information technology. This is chiefly due to the recognition that whilst advances in information technology have made it possible for generation, collection, storage, processing and transmission of data at a staggering rate from various sources by government, organizations and other groups for a variety of purposes, concerns over security of what is collected and the potential harm from personal privacy violations resulting from their unethical uses have also skyrocketed. Therefore, understanding of pertinent issues in information security
and ethics vis-à-vis technical, theoretical, managerial and regulatory aspects of generation, collection, storage, processing, transmission and ultimately use of information is becoming increasingly important to researchers and industry practitioners alike. Information security and ethics has been viewed as one of the foremost areas of concern and interest by academic researchers and industry practitioners from diverse fields such as engineering, computer science, information systems, and management. Recent studies of major areas of interest for IT researchers and professionals point to information security and ethics as one of the most pertinent. We have entered an exciting period of unparalleled interest and growth in research and practice of all aspects of information security and ethics. Information security and ethics is the top IT priority facing organizations. According to the 18th Annual Top Technology Initiatives survey produced by the American Institute of Certified Public Accountants (AICPA, 2007), information security tops the list of the ten most important IT priorities (http://infotech.aicpa.org/Resources/). According to the survey results, for the fifth consecutive year, information security is identified as the technology initiative expected to have the greatest impact in the upcoming year for organizations and is thus ranked as the top IT priority for organizations. Additionally, six of the top ten technology initiatives discussed in this report, including the top four, are issues related to information security and ethics. The interest in all aspects of information security and ethics is also manifested by the recent plethora of books, journal articles, special issues, and conferences in this area. This has resulted in a number of significant advances in technologies, methodologies, theories and practices of information security and ethics.
These advances, in turn, have fundamentally altered the landscape of research in a wide variety of disciplines, ranging from information systems, computer science and engineering to social and behavioral sciences and the law. This confirms what information security and ethics professionals and researchers have known for a long time: information security and ethics is not just a “technology” issue any more. It impacts and permeates almost all aspects of business and the economy. In this introductory chapter, we will introduce the topic of information security and ethics and discuss the fundamental concepts and theories. We will broadly discuss tools and technologies used in achieving the goals of information security and ethics, and provide guidelines for the design and development of such tools and technologies. We will consider the managerial, organizational and societal implications of information security and ethics and conclude by discussing a number of future developments and activities in information security and ethics on the horizon that we think will have an impact on this field. Our discussion in this chapter is not meant to be an exhaustive literature review of the research in information security and ethics, nor is it intended to be a comprehensive introduction to the field. The excellent chapters that follow in this multi-volume series will provide that. Our main goal here is to describe the broad outlines of the field and provide a basic understanding of the most salient issues for researchers and practitioners.
FUNDAMENTAL CONCEPTS AND THEORIES IN INFORMATION SECURITY AND ETHICS

Information Security

Information security is concerned with the identification of an organization’s electronic information assets and the development and implementation of tools, techniques, policies, standards, procedures and guidelines to ensure the confidentiality, integrity and availability of these assets. Although Information Security can be defined in a number of ways, the most salient is set forth by the government of the
United States. The National Institute of Standards and Technology (NIST) defines Information Security based on 44 United States Code Section 3542(b)(2), which states “Information Security is protecting information and information systems from unauthorized access, use, disclosure, disruption, modification, or destruction in order to provide integrity, confidentiality, and availability” (NIST, 2003, p. 3). The Federal Information Security Management Act (FISMA, P.L. 107-296, Title X, 44 U.S.C. 3532) defines Information Security as “protecting information and information systems from unauthorized access, use, disclosure, disruption, modification, or destruction” and goes on to further define Information Security activities as those “carried out in order to identify and address the vulnerabilities of computer system, or computer network” (17 U.S.C. 1201(e), 1202(d)). The United States’ National Information Assurance Training and Education Center (NIATEC) defines information security as “a system of administrative policies and procedures for identifying, controlling and protecting information against unauthorized access to or modification, whether in storage, processing or transit” (NIATEC, 2006). The overall goal of information security should be to enable an organization to meet all of its mission critical business objectives by implementing systems, policies and procedures to mitigate IT-related risks to the organization, its partners and customers (NIST, 2004). The Federal Information Processing Standards Publication 199 issued by the National Institute of Standards and Technology (NIST, 2004) defines three broad information security objectives: Confidentiality, Integrity and Availability. This trio of objectives is sometimes referred to as the “CIA Triad”.

Confidentiality: “Preserving authorized restrictions on information access and disclosure, including means for protecting personal privacy and proprietary information…” [44 U.S.C., Sec. 3542].
Confidentiality is the assurance that information is not disclosed to unauthorized individuals, processes, or devices (NIST, 2003, p. 15). Confidentiality protection applies to data in storage, during processing, and while in transit. Confidentiality is an extremely important consideration for any organization dealing with information and is usually discussed in terms of privacy. A loss of confidentiality is the unauthorized disclosure of information.

Integrity: To ensure that information is not improperly modified or destroyed. According to 44 United States Code Section 3542(b)(2), integrity is defined as “guarding against improper information modification or destruction, and includes ensuring information non-repudiation and authenticity…”. Therefore, integrity is interpreted to mean protection against the unauthorized modification or destruction of information. Integrity should be viewed both from a “data” and a “system” perspective. Data integrity implies that data has not been altered in an unauthorized manner while in storage, during processing, or while in transit. System integrity requires that a system is performing as intended, is not impaired, and is free from unauthorized manipulation (NIST, 2003).

Availability: Timely, reliable access to data and information services for authorized users (NIST, 2003). According to 44 United States Code Section 3542(b)(2), availability is “Ensuring timely and reliable access to and use of information…”. Availability is frequently viewed as an organization’s foremost information security objective. Information availability is a requirement that is intended to assure that all systems work promptly and service is not denied to authorized users. This should protect against intentional or accidental attempts to perform unauthorized access to and alteration of organizational information, to otherwise cause a denial of service, or to use a system or data for unauthorized purposes.
A loss of availability is the disruption of access to or use of information or an information system. In defining the objectives of information security, there are a number of extensions to the CIA Triad. The most prominent extensions to the CIA Triad include three additional goals of information security. They are: accountability, authentication, and nonrepudiation. One such extension appears in the National Security Agency (NSA) definition of information security as “... measures that protect and
defend information and information systems by ensuring their availability, integrity, authentication, confidentiality, and nonrepudiation. These measures include providing for restoration of information systems by incorporating protection, detection, and reaction capabilities” (CNSS, 2003). This definition is almost identical to the way “cybersecurity” was defined by the 108th US Congress. A cybersecurity bill introduced in the 108th Congress, the Department of Homeland Security Cybersecurity Enhancement Act — H.R. 5068/Thornberry; reintroduced in the 109th Congress as H.R. 285 where cybersecurity is defined as …the prevention of damage to, the protection of, and the restoration of computers, electronic communications systems, electronic communication services, wire communication, and electronic communication, including information contained therein, to ensure its availability, integrity, authentication, confidentiality, and nonrepudiation. Accountability: Is the cornerstone of organizational information security objective in which auditing capabilities are established to ensure that users and producers of information are accountable for their actions and to verify that organizational security policies and due diligence are established, enforced and care is taken to comply with any government guidelines or standards. Accountability serves as a deterrent to improper actions and as an investigation tool for regulatory and law enforcement agencies. Authentication: Security measure designed to establish the validity of a transmission, message, or originator, or a means of verifying an individual’s authorization to receive specific categories of information (CNSS, 2003. p 5). In order for a system to achieve security, it should require that all users identify themselves before they can perform any other system actions. Once the identification is achieved the authorization should be the next step. 
Authorization is the process of granting permission to a subject to access a particular object. Authentication is the process of establishing the validity of the user attempting to gain access, and is thus a basic component of access control, in which unauthorized access to resources, programs, processes and systems is controlled. Access control can be achieved by using a combination of methods for authenticating the user. The primary methods of user authentication are: access passwords; access tokens (something the user owns, which can be based on a combination of software or hardware that allows authorized access to that system, e.g., smart cards and smart card readers); biometrics (something the user is, such as a fingerprint, palm print or voice print); access location (such as a particular workstation); user profiling (such as expected or acceptable behavior); and data authentication, to verify that the integrity of data has not been compromised (CNSS, 2003).

Nonrepudiation: Assurance the sender of data is provided with proof of delivery and the recipient is provided with proof of the sender’s identity, so neither can later deny having processed the data (CNSS, 2003).

Any information security initiative aims to minimize risk by reducing or eliminating threats to vulnerable organizational information assets. The National Institute of Standards and Technology (NIST, 2003, p.
7) defines risk as “…a combination of: (i) the likelihood that a particular vulnerability in an agency information system will be either intentionally or unintentionally exploited by a particular threat resulting in a loss of confidentiality, integrity, or availability, and (ii) the potential impact or magnitude of harm that a loss of confidentiality, integrity, or availability will have on agency operations (including mission, functions, and public confidence in the agency), an agency’s assets, or individuals (including privacy) should there be a threat exploitation of information system vulnerabilities.” Risks are often characterized qualitatively as high, medium, or low. (NIST, 2003, p 8). The same publication defines threat as “…any circumstance or event with the potential to intentionally or unintentionally exploit a specific vulnerability in an information system resulting in a loss of confidentiality, integrity, or availability,” and vulnerability as “…a flaw or weakness in the design or implementation of an information system (including security procedures and security controls associated with the system) that could be intentionally or unintentionally exploited to adversely affect an agency’s operations (including missions,
functions, and public confidence in the agency), an agency’s assets, or individuals (including privacy) through a loss of confidentiality, integrity, or availability” (NIST, 2003, p. 9). NetIQ (2004) discusses five different types of vulnerabilities that have a direct impact on the governance of information security practices. They are: exposed user accounts or defaults, dangerous user behavior, configuration flaws, missing patches, and dangerous or unnecessary services. Effective management of these vulnerabilities is critical for three basic reasons. First, effective vulnerability management helps reduce the severity and growth of incidents. Second, it helps in regulatory compliance. Third, and most important, it is simply “good business practice” to be proactive in managing vulnerabilities rather than reactive in trying to control the damage from an incident.
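The NIST definition quoted above treats risk as a combination of the likelihood that a vulnerability is exploited and the impact of the resulting loss, often expressed qualitatively as high, medium, or low. The following is a minimal sketch in Python of how such a rating might be computed. The scoring rule (multiplying two three-level scores and bucketing the product), the names Level and rate_risk, and the sample register entries are assumptions made for illustration only; NIST does not prescribe this particular matrix.

```python
from enum import IntEnum


class Level(IntEnum):
    """Qualitative three-level scale used for likelihood, impact and risk."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def rate_risk(likelihood: Level, impact: Level) -> Level:
    """Combine likelihood and impact into one qualitative risk level.

    Assumed rule (illustrative, not taken from NIST): multiply the two
    ordinal scores and bucket the product.
    Products 1-2 map to LOW, 3-4 to MEDIUM, 6-9 to HIGH.
    """
    score = int(likelihood) * int(impact)
    if score >= 6:
        return Level.HIGH
    if score >= 3:
        return Level.MEDIUM
    return Level.LOW


# Hypothetical register: (vulnerability, likelihood, impact).
register = [
    ("default account on an isolated test host", Level.MEDIUM, Level.LOW),
    ("missing patches on a public web server", Level.HIGH, Level.HIGH),
    ("dangerous user behavior on finance workstations", Level.MEDIUM, Level.MEDIUM),
]

# List the highest-rated entries first so they can be handled first.
ranked = sorted(register, key=lambda row: rate_risk(row[1], row[2]), reverse=True)
for name, likelihood, impact in ranked:
    print(f"{rate_risk(likelihood, impact).name:<6} {name}")
```

Sorting the register by rating is one simple way to support the proactive vulnerability management described above, since the entries that combine high likelihood with high impact surface first.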
Information Security and Security Attacks

Vulnerable systems can open themselves to a security attack. Security attacks are not only widespread, they are growing fast. Counterpane Internet Security, Inc. monitored more than 450 networks in 35 countries, in every time zone. In 2004 they reported 523 billion network events and investigated over 648,000 information security attacks. According to a report by Internet Security Systems (http://www.ISS.net), information security attacks jumped 80 percent from 2002 to 2003. There are a large number of types of attacks that exploit vulnerabilities in systems. Here we describe some of the more recent and technologically complex attacks that have plagued information networks and systems.

• Denial of service: The attacker tries to prevent a service from being used rather than compromising it. Numerous hosts are used to perform a denial of service attack.
• Trojan horse: Malicious software which disguises itself as benign software.
• Computer virus: Reproduces itself by attaching to other executable files and, once executed, can cause damage.
• Worm: A self-reproducing program that creates copies of itself. Worms can spread easily using e-mail address books.
• Logic bomb: Lies dormant until an event triggers it, such as a date or a user action; in some cases it may have a random trigger.
• IP spoofing: An attacker may fake its IP address so that the receiver thinks a message was sent from a location that the receiver does not view as a threat.
• Man-in-the-middle attack: Sometimes referred to as session hijacking, in which the attacker accesses the network through an open session and, once the network authenticates it, attacks the client computer to disable it and uses IP spoofing to claim to be the client.
• Rootkit: A set of tools used by an attacker after gaining root-level access to a host computer in order to conceal its activities on the host and permit the attacker to maintain root-level access to the host through covert means.
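Several of the attacks listed above, notably IP spoofing and the man-in-the-middle attack, succeed because the receiver trusts where a message appears to come from. Data authentication, mentioned earlier among the authentication methods, verifies the message itself instead. The following is a minimal sketch in Python, assuming the two parties already share a secret key. The function names sign and verify and the sample messages are hypothetical, and HMAC-SHA256 is used here only as one common way to do this, not as a method prescribed by the sources cited in this chapter.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag that binds the message to the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check that the tag matches the message under the shared key.

    compare_digest performs a constant-time comparison, which avoids
    leaking how many leading bytes of the tag were correct.
    """
    return hmac.compare_digest(sign(key, message), tag)


# Hypothetical shared secret and messages, for illustration only.
shared_key = b"example-shared-secret"
original = b"transfer 100 to account 42"
tag = sign(shared_key, original)

# The untouched message verifies; a message altered in transit, or one
# signed by a party that does not hold the key, does not.
print(verify(shared_key, original, tag))
print(verify(shared_key, b"transfer 900 to account 42", tag))
print(verify(b"attacker-guess", original, tag))
```

A check of this kind addresses integrity and origin only. It does not hide the message contents, so confidentiality would need to be provided separately.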
Denial of service (DoS) attacks have become more common in recent years. Typically, the loss of service for the affected provider is the unavailability of a particular network service, such as e-mail, or the temporary loss of all network connectivity and services. A denial of service attack can also destroy files in the affected system. DoS attacks can force Web sites accessed by millions of people to temporarily cease operation, causing millions of dollars in damage. The costs of these attacks can be monumental. Forrester, IDC, and the Yankee Group estimate that the cost of a 24-hour outage for a large e-commerce company would approach $30 million. Twenty-five percent of respondents to the
2006 CSI/FBI Computer Crime and Security Survey performed by the Computer Security Institute had experienced a DoS attack (Gordon, 2006). Worldwide, as many as 10,000 such attacks occur each day. Information Security magazine reports that since 1998, annually, about 20 percent of the surveyed financial institutions have suffered disruptions of their information and network systems due to attacks from hackers. The US Department of Justice’s Office of Cyber Crime (http://www.cybercrime.gov) states that “in the week of February 7, 2000, hackers launched distributed denial of service (DDoS) attacks on several prominent Websites, including Yahoo!, E*Trade, Amazon.com, and eBay. In more recent years, there have been a number of well publicized DDoS attacks that have cost business and consumers millions of dollars. In a DDoS attack, dozens or even hundreds of computers all linked to the Internet are instructed by a rogue program to bombard the target site with nonsense data. This bombardment soon causes the target sites’ servers to run out of memory, and thus cause it to be unresponsive to the queries of legitimate customers.” However, attacks do not necessarily originate from outside of an organization. There are a number of studies that show that many hackers are employees or insiders (Escamilla, 1998; Russell and Gangemi, 1992). Viruses and their associated malware have been favorite types of attacks for hackers and others intent on harming information security. A global survey conducted by InformationWeek and PricewaterhouseCoopers LLP estimated that computer viruses and hacking took a $1.6 trillion toll on the worldwide economy and a $266 billion toll in the United States alone (Denning, 2000).
In the 2006 Computer Security Institute (CSI) and FBI survey on computer crime and security, the 313 respondents ranked computer virus contamination as the leading cause of security-related losses in 2006, with reported losses totaling $15,691,460 across the surveyed organizations (Gordon et al., 2006). Computer Economics estimates that the ILOVEYOU virus that struck in 2000 and its variants caused $6.7 billion in damage in the first 5 days alone (Computer Economics, 2007). The Melissa virus first appeared on the Internet in March of 1999. It spread rapidly throughout computer systems in the United States and Europe. On December 9, 1999, David Smith, the creator of this virus, pleaded guilty to state and federal charges associated with his creation of the Melissa virus. The US Department of Justice’s Office of Cyber Crime (http://www.cybercrime.gov) estimates that the virus caused $800 million in damages to computers worldwide and that, in the United States alone, the virus made its way through 1.2 million computers in one-fifth of the country’s largest businesses. The Office of Cyber Crime also reports on the indictment concerning the Loverspy spyware, which was designed and marketed by Mr. Perez for people to use to spy on others. According to the indictment, “prospective purchasers, after paying $89 through a Web site in Texas, were electronically redirected to Perez’s computers in San Diego, where the “members” area of Loverspy was located. Purchasers would then select from a menu an electronic greeting card to send to up to five different victims or email addresses. The purchaser would draft an email sending the card and use a true or fake email address for the sender. Unbeknownst to the victims, once the email greeting card was opened, Loverspy secretly installed itself on their computer.
From that point on, all activities on the computer, including emails sent and received, Web sites visited, and passwords entered were intercepted, collected and sent to the purchaser directly or through Mr. Perez’s computers in San Diego. Loverspy also gave the purchaser the ability remotely to control the victim’s computer, including accessing, changing and deleting files, and turning on Web-enabled cameras connected to the victim computers. Over 1,000 purchasers from the United States and the rest of the world purchased Loverspy and used it against more than 2,000 victims. Mr. Perez’s operations were shut down by a federal search warrant executed in October 2003.”
Identity theft is another major problem. In recent months, a number of high profile data breaches have highlighted the potentially great losses to organizations. One major problem associated with data breaches is identity theft. How prevalent is identity theft? It is estimated that
27 million people in the United States have been victims over the past five years. In 2003 alone, 10 million Americans were victims. The U.S. Department of Justice estimates that 36% of identity thefts were related to credit cards and other bank cards and 45% were related to non-financial use. The financial impact of identity theft in 2003 is estimated at $48 billion for U.S. businesses and around $5 billion for consumers and victims. On July 15, 2004, President Bush signed the Identity Theft Penalty Enhancement Act of 2004. In his signing remarks, the President said: “We’re taking an important step today to combat the problem of identity theft, one of the fastest growing financial crimes in our nation. Last year alone, nearly 10 million Americans had their identities stolen by criminals who rob them and the nation’s businesses of nearly $50 billion through fraudulent transactions.” There have been a number of highly publicized identity thefts. In late 2006, the U.S. Department of Veterans Affairs lost personal records of an estimated 26.5 million veterans in a data breach. A coalition of veterans groups filed a class action seeking $1,000 in damages for each person, a payout that could eventually reach $26.5 billion. Another high profile security breach occurred in December 2006, when TJX, a US retailing giant, detected a hacker intrusion against its credit card transaction processing system. Hackers stole personal information from 45.7 million customer credit and debit cards. It is estimated that this security breach will cost TJX nearly $1.7 billion (http://www.protegrity.com/). Another estimate puts the cost of this data breach at $100 per record; for 45.7 million records, the total cost could reach nearly $4.6 billion. In addition, TJX is currently facing a number of lawsuits and legal claims from customers and shareholders who were affected by this security breach. The costs and losses associated with data breaches are not just financial.
In 2005, ChoicePoint, a leading provider of data to the industry, revealed that criminals had stolen personal information on over 163,000 consumers. On the day that the breach was reported, ChoicePoint’s stock value fell 3.1 percent, and it continued to slide by as much as 9 percent. Currently, nearly two years after the reported data breach, ChoicePoint stock is about 20 percent of its all time high. Other security breaches that can prove very costly to organizations are losses from the theft and destruction of computers, laptops, and mobile hardware. The same CSI/FBI report shows that total losses among the 313 respondents due to the theft of computer and mobile hardware containing customer information amounted to $6,642,660 in 2006 (Gordon et al., 2006). The report also shows that the cost per respondent rose sharply, from $19,562 in 2005 to $30,057 in 2006. Although the actual costs of these security breaches are very high, the true damage may lie in the intangible costs associated with a tarnished reputation and the high price of earning back a customer’s trust. Not only is this a serious problem, it is getting worse. For example, the percentage of companies that stated they were the victims of hacking grew from 36% in 1996 to 74% in 2002 (Power, 2002). The Computer Emergency Response Team (CERT) Coordination Center estimates that the number of information attacks on businesses has almost doubled every year since 1997 (CERT, 2004). The number of computer intrusion cases filed with the Department of Justice jumped from 547 in 1998 to 1,154 in 1999 (Goodman and Brenner, 2002). In 1999, according to the Computer Security Institute (CSI) and FBI survey, the losses from computer crime incidents were $124 million (Gordon et al., 2006). The losses jumped to a reported $266 million in 2000 and $456 million in 2002 (Power, 2002).
Across the Computer Security Institute (CSI)/Federal Bureau of Investigation (FBI) surveys, reported losses from computer crime incidents rose from $266 million in 2000 to $378 million in 2001 and $456 million in 2002 (Power, 2002). According to a recent study by the Ponemon Institute, the average cost of a consumer data breach is $182 per record. Ponemon’s analysis of 31 different incidents showed that the total cost of each ranged from $226,000 to more than $22 million. These costs are typically incurred from legal, investigative, and administrative expenses; drops in stock prices; customer defections; opportunity loss; reputation management; and costs associated with customer support such as informational hotlines and credit monitoring subscriptions. This is alarming when considering the actual number of breaches that are reported. According to Privacy Rights Clearinghouse, a nonprofit consumer rights and advocacy organization, over 150 million data records of U.S. residents have been exposed due to security breaches since January 2005. The Privacy Rights Clearinghouse findings are congruent with the recent findings of another report by Ponemon, which surveyed nearly 500 IT security professionals. The results of the survey, entitled “Data at Risk,” showed that 81 percent of respondents reported the loss of one or more laptop computers containing sensitive information during the previous 12 months. The same Ponemon Institute survey also showed that the cost of diverting employees from their everyday tasks to managing a data breach rose from $15 per record in 2005 to $30 per record in 2006. Regardless of the types of attacks or where they originate, they can be very costly to organizations. According to the findings of Gordon et al. (2006), virus attacks continue to be the source of the greatest financial losses to organizations, followed by unauthorized access, losses related to the theft of laptops and mobile hardware, and the theft of proprietary information (i.e., intellectual property). These four categories account for more than 74 percent of the reported financial losses. The remaining sources of loss, in order of severity and impact on organizations, are: denial of service, insider abuse of network access or e-mail, bots (zombies), system penetration by outsiders, phishing by outsiders in which organizations are fraudulently represented, abuses of wireless networks, instant messaging misuse, misuse of public Web applications, sabotage of data or networks, Web site defacements, and password sniffing (Gordon et al., 2006).
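The per-record estimates quoted above are simple multiplications, and it can help to see them worked through. The following is a minimal sketch in Python; the function name `breach_cost` and the constant names are illustrative assumptions, while the record counts and per-record amounts are the ones cited in the text.

```python
def breach_cost(records: int, cost_per_record: int) -> int:
    """Total exposure in dollars: affected records times the per-record cost."""
    return records * cost_per_record


# Figures quoted in the text above.
TJX_RECORDS = 45_700_000   # credit and debit cards exposed in the TJX intrusion
VA_RECORDS = 26_500_000    # veterans' records lost by the Department of Veterans Affairs

tjx_total = breach_cost(TJX_RECORDS, 100)    # the $100-per-record estimate
va_total = breach_cost(VA_RECORDS, 1_000)    # the $1,000-per-person class action claim

print(f"TJX at $100 per record: ${tjx_total / 1e9:.2f} billion")
print(f"VA class action claim:  ${va_total / 1e9:.2f} billion")
```

The products are about $4.57 billion for TJX and $26.5 billion for the Veterans Affairs claim. Such linear estimates ignore the intangible costs discussed above, so they are best read as rough orders of magnitude.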
Ethics as Human Foundation of Information Security
The field of ethics is concerned with understanding the concepts of right and wrong behavior. Ethics is the study of what human behavior ought to be. It is defined as the study of moral values in human behavior and of making sense of human experience. Ethics is concerned with the morality of our actions. We define morality as the nature of how we treat others. Throughout the history of mankind, understanding ethics and morality has been a constant concern. During the past three thousand years, a number of powerful and highly respected ethical theories have emerged within various cultures around the globe. Some of the most influential theories are associated with great philosophers like the Buddha, Lao Tse and Confucius in Eastern societies, and Aristotle, Aquinas, Bentham and Kant in Western societies (Bynum, 2006). The Western notion of ethics is based on the works of Aristotle. In the Aristotelian notion of intellect, the capacity to “make decisions” is an exercise in evaluating the ethical consequences of human behavior. The modern field of ethics can be traced to the work of the great modern moral philosopher Thomas Hobbes (1588-1679). Hobbes argues that the natural state of men is freedom and that the concepts of good and evil are related to human desires, needs and aversions. That is to say, Hobbes sees “good” as a manifestation of what one desires and “evil” as an expression of what one loathes. This notion of good and evil is based on a philosophy of values rooted in a belief in self preservation and protection. Hobbes expresses concern over this rigid notion of good and evil. In his famous 1651 book “Leviathan,” he outlines his concept of the value of a social contract for a peaceful society. Hobbes claims that man is not naturally good but naturally a selfish hedonist, and that in every voluntary act of every man, the object is to bring some good to himself.
A peaceful society, he argues, cannot be achieved if all members of the society live by their own self interests and their own notions of good and evil. Such a society would be in a constant “state of war”. He refers to this as the “war of every man against every man”. For a peaceful state to exist, Hobbes argues, members of a society need to form a “social contract” which establishes a sovereign power that would mediate all disputes arising from members acting in their own
individual self interest and preservation. Hobbes therefore views a sovereign enforcing a social contract that delineates the boundaries of individuals’ actions as imperative in achieving a “state of peace” (Hobbes, 1651). The sovereign is given a monopoly on violence and absolute authority. In return, the sovereign promises to exercise that absolute power to maintain a state of peace. Perhaps no other philosopher has been more influential in the development of modern ethics as a distinct philosophical field than the German philosopher Immanuel Kant (1724-1804). Most of our understanding of the ethical considerations related to information security can be directly traced to the work pioneered by Kant. Kant rejects Hobbes’ notion of a monopoly of power resting with the sovereign as the means of achieving a state of peace. The central principle of Kant’s ethical theory is what he calls the categorical imperative. In describing the categorical imperative, Kant wants us to act only according to an unconditional moral law that applies to all and that represents an action as unconditionally necessary. He famously states: “Act only according to that maxim by which you can at the same time will that it should become a universal law.” In his book The Foundations of the Metaphysics of Morals (Kant, 1785), Kant discusses the “search for and establishment of the supreme principle of morality.” For Kant, the moral justification for an action is found not in the consequences of that action but in the motives of the one who takes that action. Kant sees only one thing that is inherently good without qualification, and that is the good will. Good will is our power of rational moral choice. According to Kant, what makes a will good is that it acts out of duty and not out of inclination.
Acting out of duty means acting out of respect for the moral law described in the “Categorical Imperative.”
Philosophers have traditionally divided ethical theories into three general subject areas: metaethics, normative ethics, and applied ethics. Metaethics is the study of the origin and meaning of ethical concepts. It studies the imperatives, the genesis, and the rationale of our ethical principles. Metaethics seeks to investigate the universality of truths, the will of God and the role of reason in ethical judgments. Normative ethics takes a more practical approach to understanding ethics by attempting to devise moral standards that distinguish right from wrong conduct. A classic example of a normative ethical principle is the Golden Rule: treat others as you would want to be treated. A normative approach to ethics therefore seeks to establish principles against which we judge all actions. This is much akin to the notion of the categorical imperative set forth by Kant. Applied ethics is the branch of ethics that applies ethical considerations to analyze specific controversial moral issues. In recent years, applied ethical issues have been subdivided into convenient groups such as medical ethics, business ethics, environmental ethics, and most recently computer ethics. For an issue to be considered an “applied ethical” issue, two features are necessary. First, the issue needs to be controversial, and second, the issue must be a distinctly moral issue. From a more practical perspective, we apply normative principles in applied ethics. Some of the most common examples of such principles used in applied ethics are:
• Principle of benevolence: help those in need.
• Principle of honesty: do not deceive others.
• Principle of harm: do not harm others.
• Principle of paternalism: assist others in pursuing their best interests when they cannot do so themselves.
Computer ethics, later known as information ethics or cyberethics, is the foundation on which the ethical implications of information security are studied. Computer ethics is a branch of applied ethics that has received considerable attention not only from ethicists but also from information technology researchers and professionals. Most of this interest is the natural consequence of the rapid development and change in computer technology, its uses and its implications. Although the term “Computer Ethics”
was first coined by Walter Maner (1980) to describe the ethical problems “aggravated, transformed or created by computer technology,” Bynum (2007) argues that the roots, the evolution, and the intellectual foundations of the field of computer ethics as a distinct discipline within the realm of ethics can be directly traced to the revolutionary works of the MIT computer scientist Norbert Wiener (Wiener, 1948, 1950, 1954 and 1964). In his seminal and profound book Cybernetics: or Control and Communication in the Animal and the Machine, published in 1948, Wiener discusses the importance of developing a new perspective to judge good and evil in light of our new technologies. In 1950, he went on to publish another important book, The Human Use of Human Beings, in which he foresees a society based on a ubiquitous computing technology that will eventually remake society and radically change everything (Bynum, 2007). He refers to this as the “second industrial revolution”. Bynum sees the importance of information in this second industrial revolution and recalls Wiener stating: “The needs and the complexity of modern life make greater demands on this process of information than ever before.... To live effectively is to live with adequate information.
Thus, communication and control belong to the essence of man’s inner life, even as they belong to his life in society.” Bynum (2007) holds that the consequence of this second industrial revolution will be that the “workers must adjust to radical changes in the work place; governments must establish new laws and regulations; industry and businesses must create new policies and practices; professional organizations must develop new codes of conduct for their members; sociologists and psychologists must study and understand new social and psychological phenomena; and philosophers must rethink and redefine old social and ethical concepts.” This has profound implications for the way we should view information security and ethics. We need to fundamentally rethink the way we view and approach our management techniques, our technologies, our organizational policies, our societal laws and regulations, and our professional codes of conduct.
INFORMATION SECURITY AND ETHICS TOOLS AND TECHNOLOGIES
Information security and ethics is a complex, growing and dynamic field. It encompasses all aspects of the organization. As stated earlier in this chapter, information security and ethics has received considerable attention from researchers, developers and practitioners. Given the complexities of the issues involved and the pace of technological change, the tools and technologies that support organizational security efforts are diverse and multifaceted. This diversity makes it difficult, if not impossible, for even seasoned professionals to keep up with new tools, technologies, and terminologies. Gordon et al. (2006) present a comprehensive and detailed description of the tools and technologies most widely used by organizations to secure their most precious information assets. Some of the most important are: firewalls, malicious code detection systems (e.g., anti-virus software, anti-spyware software), server-based access control lists, intrusion detection systems, encryption of data for storage, encryption of data for transmission, reusable accounts and login passwords, intrusion prevention systems, log management software, application level firewalls, smart cards, one time password tokens, forensics tools, public key infrastructures, specialized wireless security systems, endpoint security client software, and the use of biometrics technologies to secure and restrict access to the information and networks. A detailed discussion of these tools, techniques and technologies is outside the scope of the current chapter. However, given the importance of this topic, we provide the basics of six categories of such tools and technologies for information security. Although we realize that a number of very important tools and technologies are currently available and that additional promising ones are on the horizon, we have limited our discussion here to the six most important and fundamental categories. For additional discussion of tools and technologies used to achieve the goals of information security and ethics, readers are encouraged to consult other sources. Three excellent reports that we have consulted in this chapter are National Institute of Standards and Technology (NIST) Special Publications 800-12 (NIST, 1995), 800-36 (Grance et al., 2003), and 800-41 (Wack, Cutler, and Pole, 2002).
Identification and Authentication
NIST Special Publication 800-12 (NIST, 1995) defines “identification as the means by which a user provides a claimed identity to the system. Authentication is the means of establishing the validity of this claim. Authorization is the process of defining and maintaining the allowed actions. Identification and authentication establishes the basis for accountability and the combination of all three enables the enforcement of identity-based access control” (NIST, 1995, p.5). The user’s identity can be authenticated using the following mechanisms:
• Requiring the user to provide something they have (e.g., token)
• Requiring the user to provide something they alone know (e.g., password)
• Sampling a personal characteristic (e.g., fingerprint).
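The distinction between identification, authentication, and authorization can be illustrated with a minimal sketch in Python that covers only the "something they alone know" mechanism. The in-memory `USERS` store, the function names, and the iteration count are assumptions made for illustration; they do not come from NIST SP 800-12, and a production system would use a vetted authentication framework rather than this sketch.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch

# Hypothetical in-memory store: username -> (salt, password hash, allowed actions)
USERS = {}


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the password itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(username: str, password: str, allowed_actions) -> None:
    salt = os.urandom(16)
    USERS[username] = (salt, hash_password(password, salt), set(allowed_actions))


def authenticate(username: str, password: str) -> bool:
    """Identification is the claimed username; authentication validates the claim."""
    record = USERS.get(username)
    if record is None:
        return False
    salt, stored_hash, _ = record
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(stored_hash, hash_password(password, salt))


def authorize(username: str, action: str) -> bool:
    """Authorization checks the action against the user's allowed actions."""
    record = USERS.get(username)
    return record is not None and action in record[2]


register("alice", "correct horse", {"read_report"})
print(authenticate("alice", "correct horse"), authorize("alice", "delete_report"))
```

Keeping the three steps in separate functions mirrors the definition quoted above: a valid identity claim does not by itself grant any action.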
Access Control
Grance et al. (2003) state “access control ensures that only authorized access to resources occurs. Access control helps protect confidentiality, integrity, and availability and supports the principles of legitimate use, least privilege, and separation of duties. Access control simplifies the task of maintaining enterprise network security by reducing the number of paths that attackers might use to penetrate system or network defenses. Access control systems grant access to information system resources to authorized users, programs, processes, or other systems. Access control may be managed solely by the application, or it may use controls on files. The system may put classes of information into files with different access privileges” (Grance et al., 2003, p. 25). Controlling access can be based on any or a combination of the following:
• User identity
• Role memberships
• Group membership
• Other information known to the system.
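The first three bases for an access decision listed above can be combined in a single default-deny check. The sketch below is one possible arrangement in Python; the `Subject` and `Resource` classes and the `is_allowed` function are hypothetical names introduced here, not part of Grance et al. (2003).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Subject:
    user: str
    roles: frozenset = frozenset()
    groups: frozenset = frozenset()


@dataclass
class Resource:
    name: str
    allowed_users: set = field(default_factory=set)
    allowed_roles: set = field(default_factory=set)
    allowed_groups: set = field(default_factory=set)


def is_allowed(subject: Subject, resource: Resource) -> bool:
    """Default deny: grant access only if the user, a role, or a group matches."""
    return (
        subject.user in resource.allowed_users
        or bool(subject.roles & resource.allowed_roles)
        or bool(subject.groups & resource.allowed_groups)
    )


ledger = Resource("ledger", allowed_roles={"accountant"}, allowed_groups={"finance"})
alice = Subject("alice", roles=frozenset({"accountant"}))
bob = Subject("bob", groups=frozenset({"engineering"}))

print(is_allowed(alice, ledger), is_allowed(bob, ledger))
```

Starting from denial and listing only what is permitted is one way to support the least-privilege principle mentioned in the quotation.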
Intrusion Detection
Grance et al. (2003) describe intrusion detection as “... the process of monitoring events occurring in a computer system or network and analyzing them for signs of intrusions, defined as attempts to perform unauthorized actions, or to bypass the security mechanisms of a computer or network. Intrusions are caused by any of the following: attackers who access systems from the Internet, authorized system users who attempt to gain additional privileges for which they are not authorized and authorized users who misuse the privileges given them. Intrusion detection systems (IDS) are software or hardware products that assist in the intrusion monitoring and analysis process” (Grance et al., 2003, p. 30).
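To make "monitoring events and analyzing them for signs of intrusions" concrete, the sketch below flags sources that produce repeated failed logins within a short time window. This is only one simple signal among the many a real IDS would examine; the event format, the threshold, and the function name `detect_bruteforce` are assumptions made for this illustration.

```python
from collections import defaultdict, deque


def detect_bruteforce(events, threshold=3, window=60):
    """Return sources with at least `threshold` failed logins within `window` seconds.

    Each event is a tuple (timestamp_seconds, source_address, success).
    """
    recent_failures = defaultdict(deque)
    flagged = set()
    for timestamp, source, success in sorted(events):
        if success:
            continue
        failures = recent_failures[source]
        failures.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while failures and timestamp - failures[0] > window:
            failures.popleft()
        if len(failures) >= threshold:
            flagged.add(source)
    return flagged


events = [
    (0, "203.0.113.7", False),
    (10, "203.0.113.7", False),
    (20, "203.0.113.7", False),   # third failure within 60 seconds
    (5, "198.51.100.2", False),
    (500, "198.51.100.2", False), # failures too far apart to count together
    (30, "192.0.2.1", True),
]
print(detect_bruteforce(events))
```

Only the first source is flagged here. Misuse by authorized users, the third intrusion source named in the quotation, would need different signals, such as unusual access patterns, which this sketch does not attempt.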
Firewall
Wack, Cutler, and Pole (2002) define firewalls as: “… devices or systems that control the flow of network traffic between networks or between a host and a network. A firewall acts as a protective barrier because it is the single point through which communications pass. Internal information that is being sent can be forced to pass through a firewall as it leaves a network or host. Incoming data can enter only through the firewall. Network firewalls are devices or systems that control the flow of network traffic between networks employing differing security postures. In most modern applications, firewalls and firewall environments are discussed in the context of Internet connectivity and the TCP/IP protocol suite. However, firewalls have applicability in network environments that do not include or require Internet connectivity. For example, many corporate enterprise networks employ firewalls to restrict connectivity to and from internal networks servicing more sensitive functions, such as the accounting or personnel department. By employing firewalls to control connectivity to these areas, an organization can prevent unauthorized access to the respective systems and resources within the more sensitive areas. The inclusion of a proper firewall or firewall environment can therefore provide an additional layer of security that would not otherwise be available. The most basic, fundamental type of firewall is called a packet filter. Packet filter firewalls are essentially routing devices that include access control functionality for system addresses and communication sessions” (Wack et al., 2002, p. 67).
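The packet filter described at the end of the quotation can be sketched as an ordered rule list in which the first matching rule decides the outcome. The Python sketch below uses the standard `ipaddress` module; the `Rule` class, the example subnets, and the default-deny policy are illustrative assumptions rather than anything prescribed by Wack et al. (2002).

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                 # "allow" or "deny"
    source: str                 # source network in CIDR notation
    destination: str            # destination network in CIDR notation
    port: Optional[int] = None  # None matches any destination port


def matches(rule: Rule, src: str, dst: str, port: int) -> bool:
    return (
        ipaddress.ip_address(src) in ipaddress.ip_network(rule.source)
        and ipaddress.ip_address(dst) in ipaddress.ip_network(rule.destination)
        and (rule.port is None or rule.port == port)
    )


def filter_packet(rules, src: str, dst: str, port: int, default: str = "deny") -> str:
    """Apply the first matching rule; fall back to the default policy."""
    for rule in rules:
        if matches(rule, src, dst, port):
            return rule.action
    return default


# Hypothetical policy protecting an internal accounting subnet (10.0.9.0/24).
RULES = [
    Rule("allow", "10.0.1.0/24", "10.0.9.0/24", 443),  # one subnet may reach it over HTTPS
    Rule("deny", "0.0.0.0/0", "10.0.9.0/24"),          # all other traffic to accounting is blocked
    Rule("allow", "10.0.0.0/8", "0.0.0.0/0"),          # internal hosts may connect outward
]

print(filter_packet(RULES, "10.0.2.5", "10.0.9.20", 443))
```

Rule order matters in this design: placing the broad outbound rule first would let every internal host reach the accounting subnet.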
Malicious Code Protection
“Viruses, worms and other malicious code are typically hidden in software and require a host to replicate. Malicious code protection requires strict procedures and multiple layers of defense. Protection includes prevention, detection, containment, and recovery. Protection hardware and access-control software can inhibit this code as it attempts to spread. Most security products for detecting malicious code include several programs that use different techniques” (Grance et al., 2003, p. 45).
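The remark that detection products combine different techniques can be illustrated with a toy scanner that applies two of them: an exact hash match against known samples and a byte-pattern search. Everything in the sketch below is hypothetical, including the placeholder "signatures"; real products rely on far larger signature sets and on heuristic and behavioral analysis that this sketch does not attempt.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Hypothetical signature data, invented for this sketch.
KNOWN_BAD_HASHES = {sha256_of(b"pretend-malware-payload")}
SUSPICIOUS_PATTERNS = [b"pretend-malware"]


def scan(data: bytes) -> list:
    """Return a list of findings; an empty list means nothing was detected."""
    findings = []
    # Technique 1: exact match against hashes of known malicious files.
    if sha256_of(data) in KNOWN_BAD_HASHES:
        findings.append("hash-match")
    # Technique 2: byte patterns that can survive small changes to the file.
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern in data:
            findings.append("pattern:" + pattern.decode("ascii"))
    return findings


print(scan(b"header pretend-malware trailer"))
```

The modified sample in the usage line evades the hash check but is still caught by the pattern check, which is why layered techniques are used together.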
Vulnerability Scanners
“Vulnerability scanners examine hosts such as servers, workstations, firewalls and routers for known vulnerabilities. Each vulnerability presents a potential opportunity for attackers to gain unauthorized access to data or other system resources. Vulnerability scanners contain a database of vulnerability information, which is used to detect vulnerabilities so that administrators can mitigate through network, host and application-level measures before they are exploited. By running scanners on a regular basis, administrators can also see how effectively they have mitigated vulnerabilities that were previously identified. Products use dozens of techniques to detect vulnerabilities in hosts’ operating systems, services and applications” (Grance et al., 2003, p. 48).
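The core loop described in the quotation, matching what is installed on each host against a database of known vulnerabilities and rescanning after mitigation, can be sketched as follows. The service names, versions, and advisory identifiers are invented for this example, and real scanners probe hosts over the network rather than reading a prepared inventory.

```python
# Hypothetical vulnerability database: (service, version) -> advisory identifier.
VULN_DB = {
    ("webd", "1.0"): "EX-2007-001",
    ("maild", "2.3"): "EX-2007-002",
}


def scan_host(services: dict) -> list:
    """Return (service, version, advisory) for each known-vulnerable service."""
    findings = []
    for name, version in services.items():
        advisory = VULN_DB.get((name, version))
        if advisory is not None:
            findings.append((name, version, advisory))
    return sorted(findings)


def scan_network(inventory: dict) -> dict:
    """Map each host with at least one finding to its list of findings."""
    report = {}
    for host, services in inventory.items():
        findings = scan_host(services)
        if findings:
            report[host] = findings
    return report


inventory = {
    "web01": {"webd": "1.0", "sshd": "4.7"},
    "mail01": {"maild": "2.4"},
}
before = scan_network(inventory)
inventory["web01"]["webd"] = "1.1"   # the administrator applies a patch
after = scan_network(inventory)
print(before, after)
```

Comparing the two reports shows how a regular rescan lets administrators confirm that a previously identified vulnerability has been mitigated.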
UTILIZATION AND APPLICATION OF INFORMATION SECURITY AND ETHICS
Information security is not a technology issue alone. It encompasses all aspects of business, from people to processes to technology. Bruce Schneier, founder and editor of Schneier.com, states that “If you think technology can solve your security problems, then you don’t understand the problems and you don’t understand the technology.” Information security involves consideration of many interrelated fundamental issues. Among them are technological, developmental and design, and managerial
considerations. The technology component of information security is perhaps the easiest to develop and to achieve. It is concerned with the development, acquisition, and implementation of the hardware and software needed to achieve security. The developmental and design component of information security deals with issues related to the techniques and methodologies used to proactively develop and design systems that are secure. The managerial and personnel component focuses on the complex issues of dealing with the human elements in information security and ethics. It deals with the policies, procedures and assessments required for managing the operation of security activities. Undoubtedly, this is the hardest part of information security to achieve, since it requires a clear commitment to security by an organization’s leadership, assignment of appropriate roles and responsibilities, implementation of physical and personnel security measures to control and monitor access, training that is appropriate for the level of access and responsibility, and accountability. In the following section we describe these important issues further.
INFORMATION SECURITY AND ETHICS DEVELOPMENT AND DESIGN METHODOLOGIES
The design of software can have a significant effect on its vulnerability to malware. It is a general principle of information security that the more complex a piece of software is, the more vulnerable to attack it may be. Software engineers should be cognizant of the fact that the complexities of a software design may create potential vulnerabilities that malware can exploit. Additionally, complex software is more difficult to analyze for potential security vulnerabilities, some of which may be hidden even from the developers themselves. Thompson (1984) posits that it is essentially impossible to determine whether a piece of software is trustworthy by examining its source code, no matter how carefully. He argues that in order to achieve trustworthiness in a software system, the entire system must be evaluated (Thompson, 1984). Problems resulting from poor software design affect many computer systems. Among the software most notoriously exploited for its weaknesses and vulnerabilities are computer operating systems and email programs. Because these types of software are the most widely used, and are used by the largest segment of users, who are the least security conscious and may not be security savvy, they can permit individual computer systems to be compromised or allow malware to be downloaded to those systems. Consider, for example, the software patches that we routinely download, whose developers may not have been aware of specific vulnerabilities until they were discovered and exploited by attackers. Given the complexities of information security systems, teams of security specialists, security architects, systems analysts, systems programmers, and system testers, and ultimately the users of the security systems, must work together to develop security systems that meet the organizational security needs.
As with other large scale projects, a systematic development methodology needs to be utilized. Among the basic and widely used development methodology models adopted by many system development professionals are:
A. System Development Life Cycle (SDLC) Model
B. Prototyping Model
C. Rapid Application Development Model
D. Component Assembly Model
To manage the complexities of developing such a massive system, researchers have relied on system development life cycle (SDLC) models to facilitate the development process. The SDLC is a methodology for information systems development that proceeds through planning, investigation, analysis, design, implementation, and maintenance phases. Applied to information security, it is a systematic approach made up of several phases, each comprising multiple steps. In this section, we describe an SDLC approach to developing information security systems, based on the SDLC methodology presented by Bowen, Hash, and Wilson (2007). The authors present an SDLC model specifically tailored to “ensure appropriate protection for the information that the system is intended to transmit, process, and store” (Bowen et al., 2007, p. 19). The proposed SDLC is made up of the following phases: Initiation, Development and Acquisition, Implementation, Operations and Maintenance, and Disposal. In the following tables, we present the list of activities that need to be accomplished in each phase (Bowen et al., 2007, pp. 21-24).
SDLC Activities / Security Activities and Definitions
A. Initiation Phase
Needs Determination
Define a problem that might be solved through product acquisition. Traditional components of needs determination are establishing a basic system idea, defining preliminary requirements, assessing feasibility, assessing technology, and identifying a form of approval to further investigate the problem. Establish and document need and purpose of the system.
Security Categorization
Identify information that will be transmitted, processed, or stored by the system and define applicable levels of information categorization. Handling and safeguarding of personally identifiable information should be considered.
Preliminary Risk Assessment
Establish an initial description of the basic security needs of the system. A preliminary risk assessment should define the threat environment in which the system or product will operate.
B. Development and Acquisition Phase
Requirements Analysis/Development
Conduct a more in-depth study of the need that draws on and further develops the work performed during the initiation phase. Develop and incorporate security requirements into specifications. Analyze functional requirements that may include system security environment (e.g., enterprise information security policy and enterprise security architecture) and security functional requirements. Analyze assurance requirements that address the acquisition and product integration activities required and assurance evidence needed to produce the desired level of confidence that the product will provide required information security features correctly and effectively. The analysis, based on legal, regulatory, protection, and functional security requirements, will be used as the basis for determining how much and what kinds of assurance are required.
Risk Assessment
Conduct formal risk assessment to identify system protection requirements. This analysis builds on the initial risk assessment performed during the initiation phase, but will be more in-depth and specific. Security categories derived from FIPS 199 are typically considered during the risk assessment process to help guide the initial selection of security controls for an information system.
Cost Considerations and Reporting
Determine how much of the product acquisition and integration cost can be attributed to information security over the life cycle of the system. These costs include hardware, software, personnel, and training.
Security Planning
Fully document agreed-upon security controls, planned or in place. Develop the system security plan. Develop documents supporting the organization’s information security program (e.g., CM plan, contingency plan, incident response plan, security awareness and training plan, rules of behavior, risk assessment, security test and evaluation results, system interconnection agreements, security authorizations/accreditations, and plans of action and milestones). Develop awareness and training requirements, including user manuals and operations/administrative manuals.
Security Control Development
Develop, design, and implement security controls described in the respective security plans. For information systems currently in operation, the security plans for those systems may call for developing additional security controls to supplement the controls already in place, or for modifying selected controls that are deemed to be less than effective.
Developmental Security Test and Evaluation
Test security controls developed for a new information system or product for proper and effective operation. Some types of security controls (primarily those controls of a nontechnical nature) cannot be tested and evaluated until the information system is deployed; these controls are typically management and operational controls. Develop test plan/script/scenarios.
Other Planning Components
Ensure that all necessary components of the product acquisition and integration process are considered when incorporating security into the life cycle. These components include selection of the appropriate contract type, participation by all necessary functional groups within an organization, participation by the certifier and accreditor, and development and execution of necessary contracting plans and processes.
C. Implementation Phase
Security Test and Evaluation
Develop test data. Test unit, subsystem, and entire system. Ensure system undergoes technical evaluation.
Inspection and Acceptance
Verify and validate that the functionality described in the specification is included in the deliverables.
System Integration/Installation
Integrate the system at the operational site where it is to be deployed for operation. Enable security control settings and switches in accordance with vendor instructions and proper security implementation guidance.
Security Certification
Ensure that the controls are effectively implemented through established verification techniques and procedures and give organization officials confidence that the appropriate safeguards and countermeasures are in place to protect the organization’s information. Security certification also uncovers and describes the known vulnerabilities in the information system. Existing security certification may need to be updated to include acquired products. The security certification determines the extent to which the security controls in the information system are implemented correctly, operating as intended, and producing the desired outcome with respect to meeting security requirements for the system.
Security Accreditation
Provide the necessary security authorization of an information system to process, store, or transmit information that is required. This authorization is granted by a senior organization official and is based on the verified effectiveness of security controls to some agreed-upon level of assurance and on an identified residual risk to agency assets or operations. This process determines whether the remaining known vulnerabilities in the information system pose an acceptable level of risk to agency operations, agency assets, or individuals. Upon successful completion of this phase, system owners will either have authority to operate, interim authorization to operate, or denial of authorization to operate the information system.
D. Operations/Maintenance Phase
Configuration Management and Control
Ensure adequate consideration of the potential security impacts due to specific changes to an information system or its surrounding environment. Configuration management and configuration control procedures are critical to establishing an initial baseline of hardware, software, and firmware components for the information system and for subsequently controlling and maintaining an accurate inventory of any changes to the system. Develop CM plan:
o Establish baselines
o Identify configuration
o Describe configuration control process
o Identify schedule for configuration audits
Continuous Monitoring
Monitor security controls to ensure that controls continue to be effective in their application through periodic testing and evaluation. Security control monitoring (i.e., verifying the continued effectiveness of those controls over time) and reporting the security status of the information system to appropriate agency officials is an essential activity of a comprehensive information security program. Monitor to ensure system security controls are functioning as required. Perform self-administered or independent security audits or other assessments periodically. Types: using automated tools, internal control audits, security checklists, and penetration testing. Monitor system and/or users. Methods: review system logs and reports, use automated tools, review change management, monitor external sources (trade literature, publications, electronic news, etc.), and perform periodic reaccreditation.
o POA&Ms
o Measurement and metrics
o Network monitoring
E. Disposal Phase
Information Preservation
Retain information, as necessary, to conform to current legal requirements and to accommodate future technology changes that may render the retrieval method obsolete. Consult with agency office on retaining and archiving federal records. Ensure long-term storage of cryptographic keys for encrypted data. Determine whether to archive, discard, or destroy information.
Media Sanitization
Determine sanitization level (overwrite, degauss, or destroy). Delete, erase, and overwrite data as necessary.
Hardware and Software Disposal
Dispose of hardware and software as directed by governing agency policy.
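The Security Categorization and Risk Assessment activities listed in the table refer to FIPS 199 security categories. As an illustration only, the following Python sketch assumes the three FIPS 199 impact levels (low, moderate, high) assigned separately to confidentiality, integrity, and availability, and takes the highest of the three as the overall level of the system (the "high-water mark" convention). The names Impact and categorize are hypothetical and are not taken from Bowen et al. (2007).

```python
from enum import IntEnum


class Impact(IntEnum):
    """Potential impact levels, ordered so that max() picks the most severe."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


def categorize(confidentiality: Impact, integrity: Impact, availability: Impact) -> dict:
    """Return the security category triple and its high-water mark."""
    category = {
        "confidentiality": confidentiality,
        "integrity": integrity,
        "availability": availability,
    }
    # High-water mark: the overall level is the highest of the three objectives.
    overall = max(category.values())
    return {"category": category, "overall": overall}


# Hypothetical system that stores personally identifiable information.
result = categorize(Impact.HIGH, Impact.MODERATE, Impact.LOW)
print(result["overall"].name)  # prints "HIGH"
```

A categorization of this kind would typically be recorded during the Initiation phase and revisited in the formal risk assessment, where it helps guide the initial selection of security controls.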
MANAGERIAL IMPACT OF INFORMATION SECURITY AND ETHICS
Information is a critical asset that supports the mission of an organization. Protecting this asset is critical to the survivability and longevity of any organization. Maintaining and improving information security is critical to the operations, reputation, and ultimately the success and longevity of any organization. However, information and the systems that support it are vulnerable to many threats that can inflict serious damage to organizations resulting in significant losses. The concerns over information security risks can originate from a number of different security threats. They can come from hacking and unauthorized attempts to access private information, fraud, sabotage, theft and other malicious acts or they can originate from more innocuous, but no less harmful sources, such as natural disasters or even user errors. David Mackey, IBM’s Director of Security Intelligence, estimates that IBM recorded more than 1 billion suspicious computer security events in 2005. He estimates a higher level of malicious traffic in 2006. The damage from these “security events” can range from loss of integrity of the information to total physical destruction or corruption of the entire infrastructure that supports it. The damages can stem from the actions of a variety of sources, such as disgruntled employees defrauding a system, careless errors committed by trusted employees, or hackers gaining access to the system from outside of the organization. Precision in estimating computer security-related losses is not possible because many losses are never discovered, and others are “swept under the carpet” to avoid unfavorable publicity. The effects of various threats vary considerably: some affect the confidentiality or integrity of data while others affect the availability of a system.
Broadly speaking, the main purpose of information security is to protect an organization’s valuable resources, such as information, hardware, and software. The importance of securing our information infrastructure is not lost on the government of the United States. The US Department of Homeland Security (DHS) identifies a Critical Infrastructure (CI) as “systems and assets, whether physical or virtual, so vital to the United States that the incapacity or destruction of such systems and assets would have a debilitating impact on security, national economic security, national public health or safety, or any combination of those matters.” According to a recent report by the DHS titled “The National Strategy for Homeland Security,” which identified thirteen CIs, disruption in any component of a CI can have catastrophic economic, social, and national security impacts. Information security is identified as a major area of concern for the majority of the thirteen identified CIs. For example, many government and private-sector databases contain sensitive information, which can include personally identifiable data such as medical records, financial information such as credit card numbers, and other sensitive proprietary business information or classified security-related data. Securing these databases, which form the backbone of a number of CIs, is of paramount importance. Losses due to electronic theft of information and other forms of cybercrime against such databases can amount to tens of millions of dollars annually. In addition to the specific costs incurred as a result of malicious activities such as identity theft, virus attacks, or denial of service attacks, one of the major consequences of a security attack is a decrease in customer and investor confidence in the company. This is an area of major concern for management.
According to an event-study analysis by Cavusoglu, Mishra, and Raghunathan (2004), which used market valuations to assess the impact of security breaches on the market value of breached firms, announcing a security breach is negatively associated with the market value of the announcing firm. The breached firms in the sample lost, on average, 2.1 percent of their market value within two days of the announcement, an average loss in market capitalization of $1.65 billion per breach (Cavusoglu, Mishra, and Raghunathan, 2004). The study suggests that the cost of poor security is very high for investors and bad for business. Financial consequences may range from fines levied by regulatory authorities to brand erosion. As a result, organizations are spending a larger portion of their IT budgets on information security. A study by the Forrester Research Group estimates that in 2007 businesses across North America and Europe will spend almost 13% of their IT budgets on security-related activities. The same report shows that the share of security expenditure was around 7% in 2006. Information security is clearly a priority for management, as it should be. Regardless of the source, the impact on organizations can be severe, ranging from interruption in delivery of services and goods, loss of physical and other assets, and loss of customer goodwill and confidence in the organization to disclosure of sensitive data. Such breaches of sensitive data can be very costly to the organization. However, recent research shows that investing in and upgrading the information security infrastructure is a smart business practice. By doing so, an organization can reduce the frequency and severity of losses resulting from security breaches in computer systems and infrastructures.
Information Security Risk Management Cycle
Given the complexities and challenges facing organizations contemplating a complete and integrated information security program, the need for a comprehensive development framework is apparent, particularly when considering the numerous policy, managerial, technical, legal, and human resource issues that need to be integrated. In a large-scale study of leading organizations that have successfully developed an information security program, the United States General Accounting Office’s Accounting and Information Management Division proposed a comprehensive framework for developing information security programs. The report, titled “Executive Guide Information Security Management: Learning From Leading Organizations” (GAO/AIMD-98-68 Information Security Management), presents a framework for information security program development based on the successful implementation of risk management principles by the leading organizations that were studied. These principles fall into five broad groups:
• Assess risk and determine needs
• Establish a central management focal point
• Implement appropriate policies and related controls
• Promote awareness
• Monitor and evaluate policy and control effectiveness
According to the US GAO (1998), “An important factor in effectively implementing these principles was linking them in a cycle of activity that helped ensure that information security policies addressed current risks on an ongoing basis. The single most important factor in prompting the establishment of an effective security program was a general recognition and understanding among the organization’s most senior executives of the enormous risks to business operations associated with relying on automated and highly interconnected systems” (US GAO, 1998, p. 17). The GAO report describes a risk management cycle in which successful implementation requires the coordination of all activities by a central security management office, which serves as consultant and facilitator to individual business units and senior management. Figure 1 presents the proposed risk management cycle. The United States General Accounting Office, Accounting and Information Management Division, concludes that information security managers at each organization that was studied agreed that a successful implementation of the five principles presented in the risk management cycle can be achieved using the sixteen practices outlined in Figure 2. These 16 practices, which relate to the five risk management principles, were key to the effectiveness of their programs (US GAO, 1998).
Figure 1. Principles and practices to implement the risk management cycle (Source: GAO/AIMD-98-68 Information Security Management). Figure title: Information Security Risk Management Cycle. Elements shown: Assess Risk and Determine Needs; Implement Policies and Controls; Central Focal Point; Monitor and Evaluate; Promote Awareness.
Lessons for the Management
A common motivation for corporations to invest in information security is to safeguard their confidential data. This motivation is based on the erroneous view of information security as a risk mitigation activity rather than a strategic business enabler. Information security should no longer be viewed solely as a measure to reduce risk to organizational information and electronic assets; it should be viewed as the way business needs to be conducted. To succeed, an organization’s information security program should support the mission of the organization. The Information Systems Security Association (ISSA) has been developing a set of generally accepted information security principles (GAISP). GAISP include a number of information security practices, including the need for involvement of top management, the need for customized information security solutions, the need for periodic reassessment, the need for an evolving security strategy, and the need for a privacy strategy. This implies that information security should be viewed as an integral part of the organizational strategic mission and therefore requires a comprehensive and integrated approach. It should be viewed as an element of sound management in which cost-effectiveness is not the only driver of the project. Management should realize that information security is a smart business practice. By investing in security measures, an organization can reduce the frequency and severity of security-related losses. Information security requires a comprehensive approach that extends throughout the entire information life cycle. The management needs to understand that without physical security, information security would be impossible. As a result, it should take into consideration a variety of issues, both technical and managerial, from within and outside of the organization.
Figure 2. Information security risk management cycle (Source: GAO/AIMD-98-68 Information Security Management)
The management needs to realize that this comprehensive approach requires that managerial, legal, organizational policy, operational, and technical controls work together synergistically. This requires that senior managers be actively involved in establishing information security governance. Effective information security controls often depend upon the proper functioning of other controls, but responsibilities must be assigned and carried out by appropriate functional disciplines. These interdependencies often require a new understanding of the trade-offs that may exist, in that achieving one goal may actually undermine another. The management must insist that information security responsibilities and accountability be made explicit and that system owners accept responsibilities that may lie outside their own functional domains. An individual or work group should be designated to take the lead role in information security as a broad, organization-wide process. That requires that security policies be established and documented and that awareness among all employees be increased through employee training and other incentives. It also requires that information security priorities be communicated to all stakeholders, including customers and employees at all levels within the organization, to ensure a successful implementation. The management should insist that information security activities be integrated into all management activities, including strategic planning and capital planning. Management should also insist that an assessment of needs and weaknesses be initiated and that security measures and policies be monitored and evaluated continuously.
Information security professionals are charged with protecting organizations against their information security vulnerabilities. Given the importance of securing information to an organization, this is an important position with considerable responsibility. It is the responsibility of information security professionals and management to create an environment where the technology is used in an ethical manner. Therefore, one cannot discuss information security without discussing the ethical issues fundamental to the development and use of the technology. According to a report by the European Commission (EC, 1999, p. 7), “Information Technologies can be and are being used for perpetrating and facilitating various criminal activities. In the hands of persons acting with bad faith, malice, or grave negligence, these technologies may become tools for activities that endanger or injure the life, property or dignity of individuals or damage the public interest.” Information technology operates in a dynamic environment. Dynamic factors such as advances in new technologies, the dynamic nature of the user, information latency and value, systems’ ownership, the emergence of new threats and new vulnerabilities, the dynamics of external networks, changes in the environment, and the changing regulatory landscape should all be viewed as important. Therefore, the management should insist on an agile, comprehensive, integrated approach to information security.
ORGANIZATIONAL AND SOCIAL IMPLICATIONS OF INFORMATION SECURITY AND ETHICS
Professional Ethical Codes of Conduct
Most, if not all, professional organizations have adopted a code of ethical conduct. Parker (1968, p. 200) states that “… the most ancient and well known written statement of professional ethics is the Hippocratic Oath of the medical profession. Suggestions related to the Oath date back to Egyptian papyri of 2000 B.C. The Greek medical writings making up the Hippocratic Collection were put together about 400 B.C. The present form of the Hippocratic Oath originated about 300 A.D.” The accelerated pace of advances in information technologies transformed computer and information ethics from a theoretical exercise envisioned by Wiener into a reality faced by practitioners (Bynum, 2007). As information technology became more widespread and its practitioners developed a professional identity of their own, the need for a professional code of ethical conduct for information technology professionals became apparent (ACM, 1993; Barquin, 1992; Becker-Kornstaedt, 2001; Bynum, 2000, 2001, 2004, 2006; Mason, 1986). In the mid-1960s, Donn Parker, pioneer and expert in the field of computer and information crime and security, became the first computer scientist to set forth a set of formal rules of ethics for computer professionals. As the chairman of the ACM Professional Standards and Practices Committee, he discussed in his 1968 article “Rules of Ethics in Information Processing,” published in Communications of the ACM (Parker, 1968), rules of ethics for information processing that were adopted by the ACM Council on November 11, 1966, as a set of Guidelines for Professional Conduct in Information Processing. These guidelines later became the first Code of Professional Conduct for the Association for Computing Machinery.
The ACM was established in 1947 as “the world’s first educational and scientific computing society.” ACM’s code of ethics provides specific guidelines for protecting information confidentiality, protecting others’ privacy, causing no harm, and respecting others’ intellectual property. According to the ACM constitution, “This Code, consisting of 24 imperatives formulated as statements of personal responsibility, identifies the elements of such a commitment. The Code and its supplemented Guidelines are intended to serve as a basis for ethical decision making in the conduct of professional work. Secondarily, they may serve as a basis for judging the merit of a formal complaint pertaining to violation of professional ethical standards” (http://www.acm.org/constitution/code.html). This code of conduct consists of four sections: Section 1, General Moral Imperatives, outlines fundamental ethical considerations for computing professionals; Section 2, More Specific Professional Responsibilities, discusses additional, more specific considerations of professional conduct; Section 3, Organizational Leadership Imperatives, concerns the conduct of individuals who have a leadership role in the computing field; Section 4, Compliance with the Code, discusses how computing professionals can become compliant with this code. Please see the full version of the code at: http://www.acm.org/constitution/code.html. The importance of ethical conduct in the face of changing technology is not lost on the professional organizations representing diverse groups of information technology professionals. Most professional organizations in information technology have developed their own codes of ethics. A sampling of some of those codes appears below.
Table 1. Ten Commandments of Computer Ethics (Created by the Computer Ethics Institute)
1. Thou Shalt Not Use A Computer To Harm Other People.
2. Thou Shalt Not Interfere With Other People’s Computer Work.
3. Thou Shalt Not Snoop Around In Other People’s Computer Files.
4. Thou Shalt Not Use A Computer To Steal.
5. Thou Shalt Not Use A Computer To Bear False Witness.
6. Thou Shalt Not Copy Or Use Proprietary Software For Which You Have Not Paid.
7. Thou Shalt Not Use Other People’s Computer Resources Without Authorization Or Proper Compensation.
8. Thou Shalt Not Appropriate Other People’s Intellectual Output.
9. Thou Shalt Think About The Social Consequences Of The Program You Are Writing Or The System You Are Designing.
10. Thou Shalt Always Use A Computer In Ways That Insure Consideration And Respect For Your Fellow Humans.
Figure 3. Code of Ethics Association of Information Technology Professionals (AITP)
Table 2. IEEE Code of Ethics
We, the members of the IEEE, in recognition of the importance of our technologies in affecting the quality of life throughout the world, and in accepting a personal obligation to our profession, its members and the communities we serve, do hereby commit ourselves to the highest ethical and professional conduct and agree:
1. To accept responsibility in making decisions consistent with the safety, health and welfare of the public, and to disclose promptly factors that might endanger the public or the environment;
2. To avoid real or perceived conflicts of interest whenever possible, and to disclose them to affected parties when they do exist;
3. To be honest and realistic in stating claims or estimates based on available data;
4. To reject bribery in all its forms;
5. To improve the understanding of technology, its appropriate application, and potential consequences;
6. To maintain and improve our technical competence and to undertake technological tasks for others only if qualified by training or experience, or after full disclosure of pertinent limitations;
7. To seek, accept, and offer honest criticism of technical work, to acknowledge and correct errors, and to credit properly the contributions of others;
8. To treat fairly all persons regardless of such factors as race, religion, gender, disability, age, or national origin;
9. To avoid injuring others, their property, reputation, or employment by false or malicious action;
10. To assist colleagues and co-workers in their professional development and to support them in following this code of ethics.
Laws and Regulations Impacting Information Security
As our societies become increasingly dependent on information technologies, effective practical legal means will have to be employed to help manage the associated risks. Currently, a number of United States federal agencies deal specifically with information security related issues. For example, the US Department of Justice has an office dedicated to computer and cyber crimes (http://www.cybercrime.gov). The National Infrastructure Protection Center (NIPC), formerly of the US Department of Justice, was fully integrated into the Information Analysis and Infrastructure Protection Directorate of the Department of Homeland Security (DHS) (www.dhs.gov). The most widely accepted principles underlying many laws related to information use and security in the United States, Canada, the European Union, and other parts of the world are the Fair Information Practice Principles (FIPP). The Principles were first formulated by the U.S. Department of Health, Education and Welfare in 1973 for the collection and use of information on consumers. FIPP are quoted here from the Organization for Economic Cooperation and Development’s Guidelines on the Protection of Privacy and Transborder Flows of Personal Data (text reproduced from the report available at http://www1.oecd.org/publications/e-book/9302011E.PDF).
Openness: There should be a general policy of openness about developments, practices and policies with respect to personal data. Means should be readily available for establishing the existence and nature of personal data, and the main purposes of their use, as well as the identity and usual residence of the data controller.
Collection Limitation: There should be limits to the collection of personal data and any such data should be obtained by lawful and fair means and, where appropriate, with the knowledge or consent of the data subject.
Purpose Specification: The purpose for which personal data are collected should be specified not later than at the time of data collection and the subsequent use limited to the fulfillment of those purposes or such others as are not incompatible with those purposes and as are specified on each occasion of change of purpose.
Use Limitation: Personal data should not be disclosed, made available or otherwise used for purposes other than those specified as described above, except with the consent of the data subject or by the authority of law.
Data Quality: Personal data should be relevant to the purposes for which they are to be used, and, to the extent necessary for those purposes, should be accurate, complete and kept up-to-date.
Individual Participation: An individual should have the right to: (a) obtain from a data controller, or otherwise, confirmation of whether or not the data controller has data relating to him; (b) have communicated to him, data relating to him within a reasonable time; at a charge, if any, that is not excessive; in a reasonable manner; and in a form that is readily intelligible to him; (c) be given reasons if a request is denied and to be able to challenge such denial; and (d) challenge data relating to him and, if the challenge is successful, to have the data erased, rectified, completed or amended.
Security Safeguards: Personal data should be protected by reasonable security safeguards against such risks as loss or unauthorized access, destruction, use, modification or disclosure of data.
Accountability: A data controller should be accountable for complying with measures which give effect to the principles stated above.
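To make the Purpose Specification and Use Limitation principles concrete, the following Python sketch shows one way a data controller might check a proposed use of personal data. It is an illustration under stated assumptions, not part of the OECD Guidelines: the class PersonalRecord, the function use_permitted, and the purpose labels are all hypothetical, and the legal-authority flag stands in for a determination that would be made outside the code.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalRecord:
    """Personal data together with the purposes recorded for it (hypothetical model)."""
    subject: str
    data: dict
    specified_purposes: set = field(default_factory=set)  # stated at collection time
    consented_purposes: set = field(default_factory=set)  # later consent by the subject


def use_permitted(record: PersonalRecord, purpose: str, legal_authority: bool = False) -> bool:
    """Apply the Use Limitation principle to a single proposed use."""
    if purpose in record.specified_purposes:
        return True  # the use matches a purpose specified at collection
    if purpose in record.consented_purposes:
        return True  # the data subject has consented to this additional use
    return legal_authority  # otherwise only the authority of law permits it


record = PersonalRecord("subject-1", {"email": "a@example.org"}, {"billing"})
print(use_permitted(record, "billing"))
print(use_permitted(record, "marketing"))
```

Keeping the specified purposes alongside the data is a design choice of this sketch; it makes each use decision auditable, which also supports the Accountability principle.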
The Federal Trade Commission (FTC) is the primary federal agency responsible for the enforcement of various laws governing the privacy of an individual’s information on the Internet. The Federal Trade Commission Act (FTCA), 15 U.S.C. § 45(a), gives the FTC investigative and enforcement authority over businesses and organizations engaged in interstate commerce. While awaiting the enactment of new legislation, the FTC utilizes existing laws to protect consumers from unfair and deceptive trade practices. The FTC has allowed most businesses to self-regulate. However, the government has regulated some industries, such as healthcare and financial services. It also requires Web sites to follow specific rules when obtaining information from children. Currently there are a number of U.S. federal laws that directly impact the information security communities. These laws address different topics of information security, such as protecting confidentiality and privacy of information or requiring documentation and audit trails for financial data and transactions. Noncompliance with these laws can bring significant financial and legal liability. Some of the most significant laws are the Gramm-Leach-Bliley Act (GLB), the Sarbanes-Oxley Act (SOX), the Health Insurance Portability and Accountability Act (HIPAA), and the Food and Drug Administration (FDA) 21 Code of Federal Regulations (CFR) Part 11.
Gramm-Leach-Bliley Act (GLB): In November 1999, the Gramm-Leach-Bliley Act was passed to regulate the privacy and protection of customer records maintained by financial organizations. GLB compliance for financial institutions became mandatory by July 2001, including the implementation of the following security requirements:
• Access controls on customer information systems
• Encryption of electronic customer information
• Monitoring systems to detect attacks on and intrusions into customer information systems
• Specification of the actions that have to be taken when unauthorized access has occurred (GLB, 1999)
To comply with GLB, institutions have to focus on administrative and technological safeguards to ensure the confidentiality and integrity of customer records, through the implementation of security solutions and secure systems management.
Sarbanes-Oxley Act (SOX): The Sarbanes-Oxley Act, passed by the United States Congress in 2002, has caused major changes in corporate governance, the accuracy of financial reporting, financial statement disclosure, corporate executive compensation, and auditor independence. According to the CSI/FBI 2006 report, the impact of the Sarbanes-Oxley Act on information security continues to be substantial and far-reaching, and it has generally been positive. Compliance with the Sarbanes-Oxley Act has raised organizational awareness of and interest in information security and has shifted the focus of organizational information security concerns from technology to corporate governance. Complying with Sarbanes-Oxley requires companies to have specific internal controls in place to protect their data from vulnerabilities. Section 404 of SOX requires companies to put in place internal controls over business operations to ensure the integrity of financial audit records within the company, with a real emphasis on computer and network security. This involves:
• Internal operational controls: Control interactions between people and applications, and audit rights and responsibilities
• Employee and business partner controls: Put in place authentication and access control to know who can access which systems and data and what they can do with those resources
• Application controls: Apply operational controls directly to systems that will be connected to access each other’s data
• Auditing and reporting: Show compliance of all implementations of internal controls
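The access-control and auditing items above can be illustrated with a minimal sketch. It is not taken from SOX or from any compliance product: the role names, resources, and the `AccessController` class are hypothetical, and a real deployment would draw permissions from an identity management system and write audit records to protected storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical role-to-permission mapping, hard-coded only for illustration.
ROLE_PERMISSIONS = {
    "accountant": {("general_ledger", "read"), ("general_ledger", "write")},
    "auditor": {("general_ledger", "read"), ("audit_log", "read")},
    "partner": {("invoices", "read")},
}


@dataclass
class AccessController:
    """Decides who can do what, and records every decision for later audit."""

    audit_log: list = field(default_factory=list)

    def is_allowed(self, user: str, role: str, resource: str, action: str) -> bool:
        # Unknown roles get an empty permission set, so they are denied.
        allowed = (resource, action) in ROLE_PERMISSIONS.get(role, set())
        # Record both granted and denied requests.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "resource": resource,
            "action": action,
            "allowed": allowed,
        })
        return allowed


if __name__ == "__main__":
    controller = AccessController()
    print(controller.is_allowed("alice", "accountant", "general_ledger", "write"))
    print(controller.is_allowed("bob", "auditor", "general_ledger", "write"))
    print(len(controller.audit_log), "decisions recorded")
```

Logging denied requests as well as granted ones is a deliberate choice in this sketch: the reporting control is about demonstrating that the controls operate, and refused attempts are part of that evidence.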
Enforcing controls and making them operational are organizations’ main objectives in complying with SOX. Another focal point of this law is the improvement of security policies and procedures to address risks to the achievement of specific control objectives, which includes the need to:
• Define security standards of protection
• Create security education programs for employees
• Identify and document security exposures and policy exceptions
• Periodically evaluate security compliance with metrics and put in place action plans to ensure compliance with policies (SOX, 2002)
Ensuring the security and integrity of systems is a key focus of complying with SOX. Organizations have to implement new security measures to improve the integrity of their systems.
Food and Drug Administration (FDA) 21 Code of Federal Regulations (CFR) Part 11: For medical and pharmaceutical organizations, these FDA regulations became effective in 1997 and were enforced beginning in 2000. CFR Part 11 established the U.S. Food and Drug Administration requirements for electronic records and signatures. It includes the following requirements:
• Secure audit trails must be maintained on the system
• Only authorized persons can use the system and perform specific operations
• Records must be stored in a protected database
• The identity of each user must be verified before any credential is issued to that user
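One common way to make an audit trail tamper-evident is to chain its entries with a cryptographic hash, so that altering an earlier record invalidates every later one. The sketch below illustrates that general technique using Python's standard `hashlib` and `json` modules; it is not a method prescribed by Part 11, and the record fields and function names are assumptions chosen for the example.

```python
import hashlib
import json

# Placeholder "previous hash" for the first entry in a trail.
GENESIS = "0" * 64


def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with this record's content."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_entry(trail: list, user: str, action: str, timestamp: str) -> None:
    """Append a record whose hash depends on everything recorded before it."""
    prev_hash = trail[-1]["hash"] if trail else GENESIS
    record = {"user": user, "action": action, "timestamp": timestamp}
    trail.append({"record": record, "hash": _digest(prev_hash, record)})


def verify_trail(trail: list) -> bool:
    """Recompute the chain; any edited record breaks the comparison."""
    prev_hash = GENESIS
    for entry in trail:
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    trail = []
    append_entry(trail, "alice", "sign batch record", "2008-01-01T09:00:00Z")
    append_entry(trail, "bob", "approve batch record", "2008-01-01T09:05:00Z")
    print(verify_trail(trail))             # chain is intact
    trail[0]["record"]["user"] = "mallory"
    print(verify_trail(trail))             # edit to the first record is detected
```

A chain like this only detects edits to individual records. Someone able to rewrite the entire trail could recompute every hash, so in practice the latest hash would also need to be stored or signed somewhere the system's ordinary users cannot modify.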
Part 11 is very high level and does not provide strict recommendations; however, this regulation provides the basic principles for the use of computers in the pharmaceutical industry. To be compliant, an organization must define, implement, and enforce procedures and controls to ensure the authenticity, integrity, and confidentiality of electronic records.
Health Insurance Portability and Accountability Act (HIPAA): HIPAA is a U.S. law that came into effect in 1996. It provides a standard for electronic health care transactions over the Internet. Because the integrity and confidentiality of patient information are critical, compliance requires the ability to uniquely identify and authenticate an individual. HIPAA has strict guidelines on how healthcare organizations can manage private health information. These include:
• Authentication: A unique identification for individuals using the health care system
• Access control: Manage accounts and restrict access to health information
• Password management: Centrally define and enforce a global password policy
• Auditing: Centralize activity logs related to the access of health information
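As an illustration of what centrally defining and enforcing a password policy can mean in practice, the sketch below checks a candidate password against a single shared rule set. The specific rules (minimum length, required character classes, no username in the password) are assumptions chosen for the example; HIPAA itself does not prescribe these values.

```python
import string

# Hypothetical policy values, defined once so every system applies the same rules.
MIN_LENGTH = 12
REQUIRED_CLASSES = {
    "lowercase": string.ascii_lowercase,
    "uppercase": string.ascii_uppercase,
    "digit": string.digits,
    "symbol": string.punctuation,
}


def policy_violations(password: str, username: str) -> list:
    """Return the list of policy rules the password breaks (empty if none)."""
    violations = []
    if len(password) < MIN_LENGTH:
        violations.append(f"shorter than {MIN_LENGTH} characters")
    for name, chars in REQUIRED_CLASSES.items():
        if not any(c in chars for c in password):
            violations.append(f"missing {name}")
    if username and username.lower() in password.lower():
        violations.append("contains the username")
    return violations


if __name__ == "__main__":
    for candidate in ("short", "Tr1cky-Passphrase!"):
        problems = policy_violations(candidate, "jdoe")
        print(candidate, "->", problems or "accepted")
```

Returning the full list of violations, rather than a single pass or fail flag, lets a central service give the same explanation to users regardless of which application asked for the check.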
Securing the information systems necessary for the operation of the federal government is an important national security consideration, and therefore a number of laws and regulations mandate that agencies protect their computers, the information they process, their telecommunications infrastructure, and other related technologies. The most important are listed below:
• The Computer Security Act of 1987, which requires government and federal agencies to identify sensitive systems, conduct computer security training, and develop computer security plans.
• The Federal Information Resources Management Regulation (FIRMR), which is the primary regulation for the use, management, and acquisition of computer resources in the federal government.
• OMB Circular A-130 (specifically Appendix III), which requires that federal agencies establish security programs containing specified elements.
Guidelines for the Security of Information Systems
It is safe to conclude that laws will not protect us, but can we seek salvation in technology? The answer is no. Technology by itself is not the solution. Technology should be viewed as an enabler. It is the people who use the technology who are responsible for its ethical use. To illustrate, consider the following scenario. You are an information security professional charged with developing a security policy, analyzing risks and vulnerabilities, developing an organization’s security infrastructure, and setting up intrusion detection systems. Suppose you discover an unauthorized access to your network. Having done your job correctly, you are able to identify the intruder. Now, is your work as an information security professional done? Not by a long shot. Once an intruder is identified, what is the next step? Does your organization have policies to deal with this intruder? Are there laws that deal specifically with this type of crime? Given the borderless nature of the Internet, these types of crimes can be perpetrated by anyone in any geographical location. Who has jurisdiction over these laws? The question that you need to answer, therefore, is “Where do I go from here?”
Technology is advancing at a breathtaking pace. New technologies bring new possibilities to do harm and to commit crimes. The legal system is not capable of keeping pace with the development of technologies. Laws are reactive. They are the reactions of societies to adverse events. Very seldom are laws proactive. Therefore, relying on laws and the legal system to protect against crimes made possible by fast-moving technology is not a wise course of action.
The Organization for Economic Cooperation and Development (OECD) (http://www.oecd.org), an international consortium of over 30 countries established to foster good governance in the public service and in corporate activity, has released its updated Guidelines for the Security of Information Systems. These guidelines are meant to increase our understanding of the importance of good security practices and to provide specific guidance on how to achieve them. The Guidelines consist of nine core principles that aim to increase public awareness, education, information sharing, and training that can lead to a better understanding of online security and the adoption of best practices. A formal declaration from the OECD states: “These guidelines apply to all participants in the new information society and suggest the need for a greater awareness and understanding of security issues, including the need to develop a ‘culture of security’ – that is, a focus on security in the development of information systems and networks, and the adoption of new ways of thinking and behaving when using and interacting within information systems and networks. The guidelines constitute a foundation for work towards a culture of security throughout society” (OECD, 2007).
Table 3. OECD’s Guidelines for the Security of Information Systems
• Accountability - The responsibilities and accountability of owners, providers and users of information systems and other parties...should be explicit.
• Awareness - Owners, providers, users and other parties should readily be able, consistent with maintaining security, to gain appropriate knowledge of and be informed about the existence and general extent of measures... for the security of information systems.
• Ethics - Information systems and the security of information systems should be provided and used in such a manner that the rights and legitimate interests of others are respected.
• Multidisciplinary - Measures, practices and procedures for the security of information systems should take account of and address all relevant considerations and viewpoints.
• Proportionality - Security levels, costs, measures, practices and procedures should be appropriate and proportionate to the value of and degree of reliance on the information systems and to the severity, probability and extent of potential harm....
• Integration - Measures, practices and procedures for the security of information systems should be coordinated and integrated with each other and other measures, practices and procedures of the organization so as to create a coherent system of security.
• Timeliness - Public and private parties, at both national and international levels, should act in a timely coordinated manner to prevent and to respond to breaches of security of information systems.
• Reassessment - The security of information systems should be reassessed periodically, as information systems and the requirements for their security vary over time.
• Democracy - The security of information systems should be compatible with the legitimate use and flow of data and information in a democratic society.
CRITICAL ISSUES IN INFORMATION SECURITY AND ETHICS
The rapid growth in data volume is no longer surprising, but it continues to amaze even the experts. For businesses, more data isn’t always better. Organizations must assess what data they need to collect and how best to leverage it. Collecting, storing and managing business data and associated databases can be costly, and expending scarce resources to acquire and manage extraneous data fuels inefficiency and hinders optimal performance. The generation and management of business data also loses much of its potential organizational value unless important conclusions can be extracted from it quickly enough to influence decision making while the business opportunity is still present. Managers must rapidly and thoroughly understand the factors driving their business in order to sustain a competitive advantage. Organizational speed and agility supported by fact-based decision making are critical to ensure an organization remains at least one step ahead of its competitors.
According to Kakalik and Wright (1996), a normal consumer is on more than 100 mailing lists and in at least 50 databases. A survey of 10,000 Web users conducted by the Georgia Institute of Technology concludes that “Privacy now overshadows censorship as the No. 1 most important issue facing the Internet” (Machlis, 1997). According to a Price Waterhouse survey, 81 percent of Internet users and 79 percent of people who buy products and services on the Internet are concerned about threats to their personal privacy (Merrick, 1998). The UCLA study released in February 2003 reported that 88.8% of the respondents said that they were somewhat or extremely concerned about their privacy when buying online. According to the CSI/FBI 2006 survey (Gordon, 2006), the top five categories, in terms of the number of respondents identifying them as the most critical information security issues for their organizations, were (1) data protection, (2) regulatory compliance (including Sarbanes–Oxley), (3) identity theft and leakage of private information, (4) viruses and worms, and (5) management involvement, risk management and resource allocation. Table 4 summarizes the results of the survey.
Table 4. Most critical information security issues in the next two years, CSI/FBI 2006 Computer Crime and Security Survey; 426 respondents (Source: Gordon, 2006)
Critical issue for information security: percentage of respondents who ranked it as critical
Data protection (e.g., data classification, identification and encryption) and application software (e.g., Web application, VoIP) vulnerability security: 17%
Policy and regulatory compliance (Sarbanes–Oxley, HIPAA): 15%
Identity theft and leakage of private information (e.g., proprietary information, intellectual property and business secrets): 14%
Viruses and worms: 12%
Management involvement, risk management, or supportive resources (human resources, capital budgeting and expenditures): 11%
Access control (e.g., passwords): 10%
User education, training and awareness: 10%
Wireless infrastructure security: 10%
Internal network security (e.g., insider threat): 9%
Spyware: 8%
Social engineering (e.g., phishing, pharming): 8%
Mobile (handheld) computing devices: 6%
Malware or malicious code: 5%
Patch management: 4%
Zero-day attacks: 4%
Intrusion detection systems: 4%
Instant messaging: 4%
E-mail attacks (e.g., spam): 4%
Employee misuse: 3%
Physical security: 2%
Web attacks: 2%
Two-factor authentication: 2%
Bots and botnets: 2%
Disaster recovery (e.g., data back-up): 2%
Denial of service: 2%
Endpoint security: 1%
Managed cybersecurity provider: 1%
PKI implementation: 1%
Rootkits: 1%
Sniffing: 1%
Standardization, configuration management: 1%
Information Privacy
As early as 1968, invasion of privacy caused by the use of computers was seen as a “serious ethical problem in the arts and sciences of information processing” (Parker, 1968). Information security and ethics are fundamentally related to information privacy. Technological advances, decreased costs of hardware and software, and the World Wide Web revolution have allowed vast amounts of data to be generated, collected, stored, processed, analyzed, distributed and used at an ever-increasing rate by organizations and governmental agencies. Almost any activity that an organization or an individual engages in creates an electronic footprint that needs to be managed, processed, stored, and communicated. According to a survey by the U.S. Department of Commerce, an increasing number of Americans are going online and engaging in several online activities, including making purchases and banking online. The growth in Internet usage and e-commerce has offered businesses and governmental agencies the opportunity to collect and analyze information in ways never previously imagined. “Enormous amounts of consumer data have long been available through offline sources such as credit card transactions, phone orders, warranty cards, applications and a host of other traditional methods. What the digital revolution has done is increase the efficiency and effectiveness with which such information can be collected and put to use” (Adkinson, Eisenach, & Lenard, 2002).
The significance of privacy has not been lost on the information security and ethics research and practitioner communities, as revealed in Nemati and Barko’s survey of the major industry predictions that are expected to be key issues in the future (Nemati et al., 2001). Chief among them are concerns over the security of what is collected and the privacy violations of what is discovered (Margulis, 1977; Mason, 1986; Culnan, 1993; Smith, 1993; Milberg, Smith, & Kallman, 1995; Smith, Milberg, & Burke, 1996). About 80 percent of survey respondents expect data mining and consumer privacy to be significant issues (Nemati et al., 2001).
Privacy Definitions and Issues
Privacy is defined as “the state of being free from unsanctioned intrusion” (Dictionary.com, 2006). Westin (1967) defined the right to privacy as “the right of the individuals… to determine for themselves when, how, and to what extent information about them is communicated to others.” The Fourth Amendment, part of the U.S. Constitution’s Bill of Rights, states that “The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated.” This belief reaches back through history to English expressions dating from at least circa 1603, such as “Every man’s house is his castle.” The Supreme Court has since ruled that “We have recognized that the principal object of the Fourth Amendment is the protection of privacy rather than property, and have increasingly discarded fictional and procedural barriers rested on property concepts.” Thus, because the Amendment “protects people, not places,” the requirement of actual physical trespass is dispensed with and electronic surveillance was made subject to the Amendment’s requirements (Findlaw.com, 2006).
Generally, the definitions of privacy with regard to business are quite clear. On the Internet, however, privacy raises greater concerns as consumers realize how much information can be collected without their knowledge. Companies are facing an increasingly competitive business environment, which forces them to collect vast amounts of customer data in order to customize their offerings. Eventually, as consumers become aware of these technologies, new privacy concerns will arise, and these concerns will gain a higher level of importance. The security of personal data, and its subsequent misuse or wrongful use without the individual’s prior permission, raises privacy concerns and often ends in questioning the intent behind collecting private information in the first place (Dhillon & Moores, 2001).
Private information holds the key to power over the individual. When private information is held by organizations that have collected it without the knowledge or permission of the individual, the rights of the individual are at risk. By 1997, consumer privacy had become a prominent issue in the United States (Dyson, 1998).
Cost of Privacy and Why Privacy Matters
In practice, information privacy deals with an individual’s ability to control and release personal information. The individual is in control of the release process: to whom information is released, how much is released, and for what purpose the information is to be used. “If a person considers the type and amount of information known about them to be inappropriate, then their perceived privacy is at risk” (Roddick & Wahlstrom, 2001). Consumers are likely to lose confidence in the online marketplace because of these privacy concerns. Businesses must understand consumers’ concerns about these issues and aim to build consumer trust. It is important to note that knowledge about data collection can have a negative influence on a customer’s trust and confidence level online. Privacy concerns are real and have profound and undeniable implications for people’s attitudes and behavior (Sullivan, 2002).
The importance of preserving customers’ privacy becomes evident when we study the following information. In its 1998 report, the World Trade Organization projected that worldwide electronic commerce would reach a staggering $220 billion. A year later, the Wharton Forum on E-commerce revised that WTO projection down to $133 billion. What accounts for this unkept promise of phenomenal growth? The U.S. Census Bureau, in its February 2004 report, states that “Consumer privacy apprehensions continue to plague the Web and hinder its growth.” A report by Forrester Research states that privacy fears will hold back roughly $15 billion in e-commerce revenue. In May 2005, Jupiter Research reported that privacy and security concerns could cost online sellers almost $25 billion by 2006. Whether justifiable or not, consumers have concerns about their privacy, and these concerns have been reflected in their behavior. The chief privacy officer of Royal Bank of Canada said, “Our research shows that 80% of our customers would walk away if we mishandled their personal information.” Privacy considerations will become more important to customers interacting electronically with businesses. As a result, privacy will become an important business driver. People (customers) feel ‘violated’ when their privacy is invaded. They respond to it differently, despite the intensity of their feelings.
Given these divergent and varied reactions to privacy violations, many companies still do not appreciate the depth of consumer feelings and the need to revamp their information practices, as well as their infrastructure for dealing with privacy. Privacy is no longer just about staying within the letter of the latest law or regulation. Sweeping changes in people’s attitudes toward their privacy will fuel an intense political debate and put once-routine business and corporate practices under the microscope. Two components of this revolution will concern business the most: rising consumer fears and a growing patchwork of regulations. Both are already underway. Regulatory complexity will grow as privacy concerns surface in scattered pieces of legislation. Companies need to respond quickly and comprehensively. They must recognize that privacy should be a core business issue. Privacy policies and procedures that cover all operations must be enacted. Privacy-preserving identity management should be viewed as a business issue, not a compliance issue.
EMERGING TRENDS IN INFORMATION SECURITY AND ETHICS
Information security and ethics will be everyone’s business, not just that of the IT department. This change in the way companies view and approach information security will be driven primarily by consumer demand. Consumers will demand more security for the information about them and will insist on more ethical uses of that information. This demand will drive business profitability measures and will ultimately manifest itself as pressure on the government and other regulatory agencies to pass tougher and more intrusive legislation and regulations, resulting in greater pressure to comply and to demonstrate a commitment to information security. Therefore, to succeed, organizations need to focus on information security not just as an IT issue but as a business imperative. They need to develop business processes that align business, IT and security operations. For example, information security considerations will play a more prominent role in offshoring, collaboration, and outsourcing agreements. In the same vein, business partners must prove that their processes, databases and networks are secure. The need for more vigilant and improved policies and practices for monitoring insiders who may be leaking or stealing confidential information will become more apparent.
The black hat will become the norm. Hacking will increasingly become a criminal profession and will no longer be the domain of hobbyists. Attacks will be more targeted and organized and will have a criminal intent: stealing information for profit.
Regulatory and compliance requirements will continue to plague organizations. Regulations and laws will have a direct impact on IT implementations and practices. Management teams will be held accountable. Civil and criminal penalties may apply for noncompliance. Security audits will become more widespread as companies are forced to comply with new regulations and laws. Regulatory agencies and law enforcement will become more vigilant in enforcing existing laws such as HIPAA and the Sarbanes-Oxley Act.
Identity management will continue to be the sore spot of information security. With advances in technology and the need for more secure and accurate identity management, biometrics will become mainstream and widely used. Additionally, the use of federated identity management systems will become more widespread. In a federated identity management environment, users will be able to link identity information between accounts without centrally storing personal information. Users can control when and how their accounts and attributes are linked and shared between domains and service providers, allowing for greater control over their personal data. Advanced technical security measures to protect private, personally identifiable data, such as data-at-rest encryption, granular auditing, vulnerability assessment, and intrusion detection, will become more widespread. Database security continues to be a major concern for developers, vendors and customers. Organizations demand more secure code, and vendors and developers will try to accommodate them.
In addition to more secure code, the demand for an explicit focus on a unified application security architecture will force vendors and developers to seek further interoperability. This is the direct result of the increased sophistication of malware. Malware will morph and become more sophisticated than ever. The new breed of malware will be able to take advantage of operating system and browser vulnerabilities to infect end-user computers with malicious keylogging code that monitors and tracks end users’ behavior, such as their Web surfing habits. Sophisticated malware will include vulnerability assessment tools that scan corporate network defenses for weaknesses and penetrate them. Phishing will grow in frequency and sophistication, and phishing techniques will morph and become more advanced. Phishing is defined as a method in which private information such as social security numbers, usernames, and passwords is collected from users under false pretenses by criminals masquerading as legitimate organizations. Malicious Web sites that are intended to violate end users’ privacy by intentionally modifying end users’ systems, such as browser settings, bookmarks, homepage, and startup files, without their consent will gain more sophisticated code that can infect users’ computers when they simply visit these sites. These infections can range from installing adware and spyware to installing dialers, keyloggers and Trojan horses on a user’s machine. Keyloggers can be installed remotely by bypassing firewalls and e-mail scanners, and in most cases may not be detected by antivirus software. The most sophisticated keyloggers will be able to capture all keystrokes, screenshots, and passwords, encrypt them, and send this information to remote sites undetected. Malicious code such as BOTs will grow as a problem for network administrators. BOT applications are used to capture users’ computers and transform them into BOT networks.
These BOT networks can then be used for illegal network uses such as SPAM relay, generic traffic proxies, distributed denial of service (DDoS) attacks, and hosting phishing and other malicious code Web sites.
The proliferation of Internet use will accelerate. People, companies, and governments will conduct more and more of their daily business on the Internet. Not only will the Internet be used for more, but it will also be used for more complex and previously unimagined purposes. This will be partly fueled by advances in Internet technologies that will be more complex and far reaching. However, the pace of advances in security technology will be able to keep pace with the Internet’s growth and complexity.
As social computing networks such as peer-to-peer, instant messaging, and chat gain popularity and continue to be adopted, organizations will be exposed to new and more disruptive threats. These social computing networks will drain more and more corporate bandwidth and will require additional technologies to combat. For example, it is estimated that in 2007, instant messaging will surpass e-mail as the most dominant form of electronic communication. Yet instant messaging is not regulated in most companies and is not subject to the same level of scrutiny as e-mail systems are. Similarly, individuals are not as vigilant when using instant messaging tools. Therefore, these social computing technologies are fast becoming very popular with attackers. According to a recent study, the most popular malicious use of instant messaging is to send the user a link to a malicious, phishing, or fraudulent Web site, which then installs and runs a malicious application on the user’s computer in order to steal confidential information.
CONCLUSION
As early as July 1997, Vice President Albert Gore stated in a report titled A Framework For Global Electronic Commerce that “we are on the verge of a revolution that is just as profound as the change in the economy that came with the industrial revolution. Soon electronic networks will allow people to transcend the barriers of time and distance and take advantage of global markets and business opportunities not even imaginable today, opening up a new world of economic possibility and progress.” It is unmistakably apparent that the “profound revolution” that Gore was discussing has arrived. The electronic network revolution has transformed our lives in ways unimaginable only a decade ago. Yet we are only at the threshold of this revolution. The dizzying pace of advances in information technology promises to transform our lives even more drastically. In order for us to take full advantage of the possibilities offered by this new interconnectedness, organizations, governmental agencies, and individuals must find ways to address the associated security and ethical implications. As we move forward, new security and ethical challenges will likely emerge. It is essential that we are prepared for these challenges.
REFERENCES
ACM Executive Council (1993). ACM code of ethics and professional conduct. Communications of the ACM, 36(2), 99-105.
Adkinson, W., Eisenach, J., & Lenard, T. (2002). Privacy online: A report on the information practices and policies of commercial Web sites. Retrieved August 2006, from http://www.pff.org/publications/privacyonlinefinalael.pdf
American Institute of Certified Public Accountants (AICPA) (2007). Information security tops the list of ten most important IT priorities. http://infotech.aicpa.org/Resources
American Psychological Association (1992). Ethical principles of psychologists and code of conduct. American Psychologist, 47(12), 1597-1611.
Anderson, R. (1992). Social impacts of computing: Codes of professional ethics. Social Science Computing Review, 10(2), 453-469.
Anderson, R. D., Johnson, G., Gotterbarn, D., & Perrolle, J. (1993). Using the new ACM code of ethics in decision making. Communications of the ACM, 36(2), 98-107.
Aristotle. (n.d.). On the movement of animals; On the soul; Nicomachean ethics; and Eudemian ethics.
Barker, W., & Lee, A. (2004). Information security, Volume II: Appendices to guide for mapping types of information and information systems to security categories. National Institute of Standards and Technology, NIST Special Publication 800-60 Version II. http://csrc.nist.gov/publications/nistpubs/800-60/SP800-60V2-final.pdf
Barker, W. (2004). Guide for mapping types of information and information systems to security categories. National Institute of Standards and Technology, NIST Special Publication 800-60 Version 1.0. http://csrc.nist.gov/publications/nistpubs/800-60/SP800-60V1-final.pdf
Barquin, R. (1992). The Ten Commandments of Computer Ethics. Computer Ethics Institute.
Becker-Kornstaedt, U. (2001). Descriptive software process modeling: How to deal with sensitive process information. Empirical Software Engineering, 6(4).
Bynum, T. (2000). The Foundation of Computer Ethics. Computers and Society, 6-13.
Bynum, T. (2001). Computer ethics: Basic concepts and historical overview. In E.N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy. http://plato.stanford.edu/entries/ethics-computer/
Bynum, T. (2004). Ethical challenges to citizens of “the automatic age”: Norbert Wiener on the information society. Journal of Information, Communication and Ethics in Society, 2(2), 65-74.
Bynum, T. (2006). Flourishing ethics. Ethics and Information Technology, 8, 157-173.
Bynum, T. (2007). Norbert Wiener and the rise of information ethics. In W.J. van den Hoven & J. Weckert (Eds.), Moral philosophy and information technology. Cambridge University Press.
Committee on National Security Systems (CNSS) (2003). National Security Agency, “National Information Assurance (IA) Glossary.” CNSS Instruction No. 4009. http://www.cnss.gov/Assets/pdf/cnssi_4009.pdf
Computer Economics. (n.d.). Trends in IT security threats: 2007 report. www.Computereconomics.com
Computer Security Division of the Information Technology Laboratory of National Institute of Standards and Technology (2004). Standards for Security Categorization of Federal Information and Information Systems, FIPS PUB 199. http://csrc.nist.gov/publications/fips/fips199/FIPS-PUB-199-final.pdf
Computer Security Institute (n.d.). 2005 Computer Crime and Security Survey. http://www.gocsi.com/
Culnan, M. J. (1993). How did they get my name? An exploratory investigation of consumer attitudes toward secondary information use. MIS Quarterly, 17(3), 341-363.
Dhillon, G., & Moores, T. (2001). Internet privacy: Interpreting key issues. Information Resources Management Journal, 14(4).
Dictionary.com. (2006). Retrieved July 2006, from http://dictionary.reference.com/browse/privacy
Dyson, E. (1998). Release 2.0: A design for living in the digital age. Bantam Doubleday Dell Pub.
European Commission (1999). Creating a safer information society by improving the security of information infrastructures and combating computer-related crime. http://www.cybercrime.gov/intl/EUCommunication.0101.pdf
Findlaw.com. (2006). Findlaw Homepage. Retrieved July 2006, from http://public.findlaw.com/
Gordon, L., Loeb, M., Lucyshyn, W., & Richardson, R. (n.d.). The 2006 CSI/FBI Computer Crime And Security Survey. http://i.cmpnet.com/gocsi/db_area/pdfs/fbi/FBI2006.pdf
Gotterbarn, D., Miller, K., & Rogerson, S. (1999). Software engineering code of ethics is approved. Communications of the ACM, 42(10), 102-107.
Gramm-Leach-Bliley Security Requirements (1999). http://www.itsecurity.com/papers/recourse1.htm
Grance, T., Stevens, M., & Myers, M. (2003). Guide to selecting information technology security products. National Institute of Standards and Technology, NIST Special Publication 800-36. http://csrc.nist.gov/publications/nistpubs/800-36/NIST-SP800-36.pdf
HIPAA Compliance and Identity & Access Management (n.d.). http://www.evidian.com/newsonline/art040901.php
Hobbes, T. (1994). Leviathan, 1651 (E. Curley, Ed.). Chicago, IL: Hackett Publishing Company.
Huseyin, C., Mishra, B., & Raghunathan, S. (n.d.). The effect of Internet security breach announcements on market value: Capital market reactions for breached firms and Internet security developers. International Journal of Electronic Commerce, 9(1), 69-04.
Huseyin, C., Mishra, B., & Raghunathan, S. (2005). The value of intrusion detection systems in information technology security architecture. Information Systems Research, 16(1), 28-46.
IEEE Board of Directors (1990). IEEE Code of Ethics. http://www.ieee.org/about/whatiscode.html
IEEE-CS/ACM Joint Task Force on Software Engineering Ethics and Professional Practices (1998). Software Engineering Code of Ethics and Professional Practice. http://www.acm.org/serving/secode.htm
Kant, I. (1985). Grounding for the metaphysics of morals (J. W. Ellington, Trans.). Indianapolis: Hackett Publishing Company.
Kissel, R. (2006). Glossary of key information security terms. National Institute of Standards and Technology.
Laudon, K. (1995). Ethical concepts and information technology. Communications of the ACM, 38(12).
Linares, M. (2005). Identity and access management solution. SANS Conference, Amsterdam.
Machlis, S. (1997). Web sites rush to self-regulate. Computerworld, 32, 19.
Margulis, S. T. (1977). Conceptions of privacy: Current status and next steps. J.
of Social Issues, (33), 5-10. Mason, R. (1986). Four ethical issues of the information age. MIS Quarterly, 10(1). Mesthene, E. (1968). How technology will shape the future. Science, 135-143. Milberg, S. J., Smith, H. J., & Kallman, E. A. (1995). Values, personal information privacy, and regulatory approaches. Comm. of the ACM, 38, 65-74. Moor, J. (1995). What is computer ethics. In D.G. Johnson & H.Nissenbaum (Ed.), Computers, ethics & social values. Prentice-Hall. Morgan Stanley (2004). The Internet Banking Report. http://www.morganstanley.com Nardi, B. (1996). Context and consciousness: Activity theory and human computer interaction. National Institute of Standards and Technology, “Risk Management Guide for Information Technology Systems. NIST Special Publication 800-30, October 2001, p. 25 Nemati, H., Barko, R., & Christopher, D. (2001). Issues in organizational data mining: A survey of current practices. Journal of Data Warehousing, 6(1), 25-36. NetIQ (2004). Controlling your controls: Security solutions for Sarbanes-Oxley. http://download.netiq.com/Library/White_Papers/NetIQ_SarbanesWP.pdf
lxxv
NIST, Special Publication 800-12: An Introduction to Computer Security - The NIST Handbook National Institute of Standards and Technology (1995). http://csrc.nist.gov/publications/nistpubs/800-12/800-12-html/index.html NIST’s Generally Accepted Principles and Practices for Securing Information Technology Systems (1996). OECD Recommendation, guidelines and explanatory memorandum for the security of information systems (1992). Organisation for Economic Co-operation and Development. Parker, D (1968). Rules of ethics in information processing. Communications of the ACM, 11(3). Power, R. (2002). 2002 CSI/FBI computer crime and security survey. Computer Security Issues and Trends, 8. Roddick, J., & Wahlstrom, K. (2001). On the impact of knowledge discovery and data mining. Australian Computer Society. Sheehan, K. B., & Hoy, M. G. (2000). Dimensions of privacy concern among online consumer. Journal of Public Policy and Marketing, 19, 1. Smith, H. J. (1993). Privacy policies and practices: Inside the organizational maze. Comm. of the ACM, 36, 105122. SOX Achieving Sarbanes-Oxley Compliance with Oblix Management Solutions (2007). http://www.oblix.com/ resources/whitepapers/sol/wp_oblix_sarbox_compliance.pdf Stephanidis, C., Salvendy, G., Akoumianakis, D., Bevan, N., Brewer, J., Emiliani, P. L.,& Thompson, K. (1984). Reflections on trusting trust. Communications of the ACM, 27, 761-763. Sullivan, B. (2002). Privacy groups debate DoubleClick settlement. Retrieved August 2006, from http://www.cnn. com/2002/TECH/internet/05/24/doubleclick.settlement.idg/index.html Thompson, K (1984). Reflections on trusting trust. Communications of the ACM, 27(8), 761-763. Trevino, L., & Brown, M. (2004). Managing to be ethical: Debunking five business ethics myths. Academy of Management Executive, 18(2), 69-82. United States General Accounting Office, Accounting and Information Management Division, Information Security Management: Learning From Leading Organizations (1998). 
http://www.gao.gov/archive/1998/ai98068.pdf Wack, J., Cutler, K., & Pole, J. (2002). Guidelines on firewalls and firewall policy: Recommendations of the National Institute of Standards and Technology. National Institute of Standards and Technology, NIST Special Publication 800-41. 2002. http://csrc.nist.gov/publications/nistpubs/800-41/sp800-41.pdf Weckert, J., & Adeney, D. (1997). Computer and information ethics. Westport, CT: Greenwood Publishing. Westin, A. (1967). Privacy and freedom. New York: Atheneum. Wiener, N. (1948). Cybernetics or control and communication in the animal and the machine. Technology Press. Wiener, N. (1950). The human use of human beings: Cybernetics and society. Houghton Mifflin. (Second Edition Revised, Doubleday Anchor, 1954). Wiener, N. (1964). God & Golem, Inc. A comment on certain points where cybernetics impinges on religion. MIT Press.
lxxvi
About the Editor
Hamid R. Nemati is an associate professor of information systems at the Information Systems and Operations Management Department of the University of North Carolina at Greensboro (UNCG), USA. He holds a doctorate degree in management sciences and information technology from the University of Georgia and a Master of Business Administration from the University of Massachusetts. He has extensive professional IT experience as an analyst, and has consulted with a number of major corporations. Before coming to UNCG, he was on the faculty of J. Mack Robinson College of Business Administration at Georgia State University. His research specialization is in the areas of organizational data mining, decision support systems, data warehousing, and knowledge management. He has presented nationally and internationally on a wide range of topics relating to these research interests, and his research has been published in numerous top tier scholarly journals.
Section 1
Fundamental Concepts and Theories
This section serves as a foundation for this exhaustive reference tool by addressing crucial theories essential to the understanding of information security and ethics. Research found in this section provides an excellent framework in which to position information security and ethics within the field of information science and technology. Excellent insight into the critical incorporation of learning systems into global enterprises is offered, while basic, yet crucial stumbling blocks of information management are explored. With 43 chapters comprising this foundational section, the reader can learn and choose from a compendium of expert research on the elemental theories underscoring the information security and ethics discipline.
Chapter 1.1
E-Government and Denial of Service Attacks
Aikaterini Mitrokotsa
University of Piraeus, Greece
Christos Douligeris
University of Piraeus, Greece
ABSTRACT
The use of electronic technologies in government services has played a significant role in making citizens’ lives more convenient. Even though the transition to digital governance has great advantages for the quality of government services, it may be accompanied by many security threats. Denial of service (DoS) attacks are one of the major threats and hardest security problems that e-government faces. DoS attacks have already taken some of the most popular e-government sites off-line for several hours, causing enormous losses and repair costs. In this chapter, important incidents of DoS attacks and results from surveys that indicate the seriousness of the problem are presented. In order to limit the problem of DoS attacks in government organizations, we also present a list of best practices that can be used to combat the problem, together with a classification of attacks and defense mechanisms.
INTRODUCTION
Since we live in a world where electronic and Internet technologies play an important role in helping us lead easier lives, local and state governments are required to adopt and participate in this technology revolution. Digital government or e-government technologies and procedures allow local and national governments to disseminate information and provide services to their citizens and organisations in an efficient and convenient way, reducing waiting lines in offices and minimizing the time needed to pick up and return forms and to process and acquire information. This modernization of government facilitates the connection and cooperation of authorities at several levels of government (central, regional, and local), allowing an easy interchange of data and access to databases and resources that would be impossible otherwise.
Copyright © 2008, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
E-government undoubtedly makes citizens’ lives and communication easier by saving time, by avoiding and bypassing bureaucracy, and by cutting down paperwork. It also provides the same opportunities for communication with government not only to people in cities but also to people in rural areas. Moreover, e-government permits greater access to information, improves public services, and promotes democratic processes.
This shift to technology use and the transition to a “paperless government” is constantly increasing. According to Holden, Norris, and Fletcher (2003), 8.7% of local governments had Web sites in 1995, while by 2003 this figure had risen to 83%. Despite these encouraging statistics, the adoption of digital government proceeds at a slow pace, as security issues such as confidentiality and reliability hold back the progress of e-government. Since e-government is mainly based on Internet technologies, it faces the dangers of interconnectivity and the well-documented vulnerabilities of the Internet infrastructure. The Institute for E-Government Competence Center (IFG.CC, 2002) states that in 2002, 36 government Web sites were victims of intrusions. Most of the e-government attacks have taken place in Asia (25%), and more precisely in China and Singapore (19%), as well as in the USA (19%). According to the U.S. Subcommittee on Oversight and Investigations (2001), the FedCIRC incident records indicate that in 1998, 376 incidents were reported, affecting 2,732 U.S. government systems. In 1999, there were 580 incidents causing damage to 1,306,271 U.S. government systems, and in 2000 there were 586 incidents affecting 575,568 U.S. government systems. Symantec (2004) (Volume VI, released September 2004, covering activity between January 2004 and June 2004) gives information about government-specific attack data. In this report, one can see that the third most common attack e-government has faced, after worm-related attacks and the Slammer worm, is the TCP SYN Flood denial of service attack. So in order to have effective e-government services without interruptions in Web access as well as e-mail and database services, there is a need for protection against DoS attacks. Only with reliable e-government services that are not threatened by DoS attacks can governments gain the trust and confidence of citizens.
Moore, Voelker, and Savage (2001) state that denial of service (DoS) attacks constitute one of the greatest threats in globally connected networks; their impact has been well demonstrated in the computer network literature, and they have recently plagued not only government agencies but also well-known online companies. The main aim of a DoS attack is the disruption of services by attempting to limit access to a machine or service. This results in a network incapable of providing normal service, either because its bandwidth or because its connectivity has been compromised. These attacks achieve their goal by sending a stream of packets to a victim at such a high rate that the network is rendered unable to provide services to its regular clients.
Distributed denial of service (DDoS) is a relatively simple, yet very powerful, technique to attack Internet resources. DDoS attacks add the many-to-one dimension to the DoS problem, making the prevention and mitigation of such attacks more difficult and their impact proportionally more severe. DDoS attacks are composed of packet streams from disparate sources. These attacks use many Internet hosts in order to exhaust the resources of the target and cause denial of service to legitimate clients. DoS or DDoS attacks vary packet fields in order to avoid being traced back and characterized. The traffic is usually so aggregated that it is difficult to distinguish between legitimate packets and attack packets. More importantly, the attack volume is often larger than the system can handle.
Unless special care is taken, a DDoS victim can suffer
damages ranging from system shutdown and file corruption to total or partial loss of services.
Extremely sophisticated, “user-friendly,” and powerful DDoS toolkits are available to potential attackers, increasing the danger that an e-government site becomes a victim of a DoS or DDoS attack launched by someone without detailed knowledge of software and Internet technologies. Most DDoS attack tools are very simple and have a small memory footprint, which attackers exploit to deploy them easily and to hide the code carefully. Attackers constantly modify their tools to bypass security systems developed by system managers and researchers, who are on constant alert to modify their approaches in order to combat new attacks. To achieve more devastating results, attackers change their tactics and the way they launch DoS attacks. One of these tactics is the silent degradation of services over a long period of time in order to exhaust a large amount of bandwidth, instead of a quick disruption of network services. The results of these attacks on government organisations, among others, include reduced or unavailable network connectivity and, consequently, a reduction of the organisation’s ability to conduct legitimate business on the network for an extended period of time. The duration and the impact of the attack depend on the number of possible attack networks. It is also worth bearing in mind that even if an organisation is not the target of an attack, it may experience increased network latency and packet losses, or possibly a complete outage, as it may be used by the attacker in order to launch a DDoS attack.
In this chapter, we stress the severity that a DoS attack may have for e-government agencies. To this end, statistics and characteristic incidents of DoS attacks in e-government agencies are presented. Furthermore, we present a classification of DoS and DDoS attacks, so that one can have a good view of the potential problems.
Moreover, we outline a list of best practices that can be used
in government organisations in order to further strengthen the security of their systems and to help them protect their systems from being part of a distributed attack or being a target of DoS/DDoS attacks. Long-term countermeasures that should be adopted for more efficient solutions to the problem are also proposed.
Following this introduction, this chapter is organised as follows. In the section “Denial of Service Attacks,” the problem of DoS attacks is investigated, and DoS incidents, results from surveys related to DoS attacks, and a classification of DoS attacks are presented. In the section “Distributed Denial of Service Attacks,” the problem of DDoS attacks is introduced, the basic characteristics of well-known DDoS tools are given, and a taxonomy of DDoS attacks is presented. In the section “Classification of DDoS Defense Mechanisms,” we present the DDoS defense problems and propose a classification of DDoS defense mechanisms. In the section “Best Practices for Defeating Denial of Service Attacks,” best practices for defeating DoS attacks that can be used by government organizations are presented, while in the section “Long Term Countermeasures” some long-term efforts against DoS attacks are presented.
DENIAL OF SERVICE ATTACKS
Defining Denial of Service Attacks
The WWW Security FAQ (Stein & Stewart, 2002) states that “a DoS attack can be described as an attack designed to render a computer or network incapable of providing normal services.” In a DoS attack, a computer or network resource is blocked or degraded, resulting in unavailable system resources but not necessarily in damage to data. The most common DoS attacks target the computer network’s bandwidth or connectivity (Stein & Stewart, 2002). In bandwidth attacks, the network is flooded with a high volume of traffic
leading to the exhaustion of all available network resources, so that legitimate user requests cannot get through, resulting in degraded productivity. In connectivity attacks, a computer is flooded with a high volume of connection requests leading to the exhaustion of all available operating system resources, thus rendering the computer unable to process legitimate user requests.
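The connectivity case can be illustrated with a small simulation. The following is a minimal sketch, not a description of any particular operating system: it assumes a fixed-size table of pending connection requests in which a request that is never completed holds its slot until a timeout expires. The function name `simulate_backlog` and its parameters are hypothetical and chosen only for illustration.

```python
from collections import deque


def simulate_backlog(requests, backlog_size, timeout):
    """Simulate a fixed-size table of pending connection requests.

    requests: list of (arrival_time, source, completes) tuples sorted by time.
    A request that never completes holds a slot until it times out; a request
    that completes is assumed to release its slot immediately.
    Returns the sources whose requests were refused because the table was full.
    """
    pending = deque()  # (expiry_time, source), oldest entry first
    refused = []
    for arrival, source, completes in requests:
        # Free the slots of entries whose timeout has passed.
        while pending and pending[0][0] <= arrival:
            pending.popleft()
        if len(pending) >= backlog_size:
            refused.append(source)  # no slot left, request cannot be served
            continue
        if not completes:
            pending.append((arrival + timeout, source))
    return refused


# Five never-completed requests fill a table of size 3, so a legitimate
# user arriving shortly afterwards is refused as well.
flood = [(t, "flood", False) for t in range(5)]
print(simulate_backlog(flood + [(5, "user", True)], backlog_size=3, timeout=10))
```

Under these assumptions, the legitimate request is refused not because the link is saturated but because the table of pending requests is full, which is the distinction the text draws between connectivity and bandwidth attacks.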
Denial of Service Incidents
Undoubtedly, DoS attacks are a threatening problem for the Internet, causing disastrous financial losses by rendering organisations’ sites off-line for a significant amount of time, as frequent news reports confirm by naming well-known large organisations with significant exposure in the e-economy as victims of DoS attacks. Howard (1998) reports statistics on denial of service attacks that show a dramatic increase in such attacks even in the first years of the Web. The Internet Worm (Spafford, 1998) was a prominent story in the news because it “DoS-ed” hundreds of machines. But it was in 1999 that a completely new breed of DoS attacks appeared. The so-called distributed denial of service attacks struck a huge number of prominent Web sites. Criscuolo (2000) reports that the first DDoS attack occurred at the University of Minnesota in August 1999. The attack, flooding the Relay chat server, lasted for two days, and it was estimated that at least 214 systems were involved in the attack launch. In February 2000, a series of massive denial-of-service (DoS) attacks put several Internet e-commerce sites, including Yahoo.com, out of service. This attack kept Yahoo off the Internet for 2 hours and caused Yahoo a significant advertising loss. In October 2002 (Fox News, 2002), the 13 root servers that provide the DNS service to Internet users were victims of a DDoS attack. Although the attack lasted only for an hour, 7 of the 13 root servers were shut down, something that indicates the potential vulnerability of the Internet to DDoS attacks. In January of 2001,
Microsoft’s (WindowsITPro, 2001) Web sites hosting Hotmail, MSN, Expedia, and other major services were inaccessible for about 22 hours because of a DDoS attack. Despite attacks on high-profile sites, the majority of the attacks are not well publicized for obvious reasons.
CERT (2001) reports that in July 2001, the White House Web site was the target of the Code Red worm. The attack on the White House lasted from about 8 a.m. to about 11:15 a.m. Between 1 p.m. and 2 p.m., page requests continued to fail, while after 2 p.m. the site was occasionally inaccessible. In order to alleviate the effects of the attack, the White House temporarily changed the IP address of the Whitehouse.gov Web site.
Sophos.com (2002) reports that in June 2002, the Pakistani government’s Web site suffered a DoS attack that was launched by Indian sympathizers. The attack was launched through a widespread Internet worm called W32/Yaha-E, which encouraged Indian hackers and virus writers to launch an attack against Pakistani government sites. The worm arrived as an e-mail attachment with a subject related to love and friendship. The worm highlighted the political tensions between India and Pakistan and managed to render the www.pak.gov.pk Web site unreachable. The worm created a file on infected computers that urged others to participate in the attack against the Pakistani government.
ITworld.com (2001) reports that even the site of CERT was the victim of a DDoS attack on May 28, 2001. Although the CERT Coordination Center is the first place where someone can find valuable information on protecting against malicious cyber attacks, it was knocked offline for two days by a DDoS attack, receiving information at rates several hundred times higher than normal.
Cs3.Inc (2005) reports that a DDoS attack was launched on the U.S. Pacific Command in April 2001. The source addresses of the attack belonged to the People’s Republic of China, although the
exact origin of the attack has not yet been identified. Despite the fact that the internal networks of the command were not affected, no one can deny that, in the long term, critical government operations may be easily disrupted by attackers. After this incident, the political tension between the two countries increased considerably.
The U.S. government worries that U.S. critical network assets may be a target of a DDoS attack as a digital continuation of the terrorist attacks against New York in September of 2001. Government systems, however, can not only be victims of DoS attacks; they may also be used unwittingly to perform a DoS attack by hosting the agents of a DDoS attack, thus participating involuntarily in the conduct of the attack.
Moore et al. (2001) report that in February of 2001, UCSD network researchers from the San Diego Supercomputer Center (SDSC) and the Jacobs School of Engineering analyzed the worldwide pattern of malicious denial-of-service (DoS) attacks against the computers of corporations, universities, and private individuals. They proposed a new technique, called “backscatter analysis,” that gives an estimate of worldwide denial of service activity. This research provided the only publicly available data quantifying denial of service activity in the Internet and enabled network engineers to understand the nature of DoS attacks. The researchers used data sets that were collected and analyzed over a three-week period. They assessed the number, duration, and focus of the attacks in order to characterize their behaviour, and observed more than 12,000 attacks against more than 5,000 distinct targets, ranging from well-known e-commerce companies to small foreign Internet service providers and even individual personal computers on dial-up connections. Some of the attacks flooded their targets with more than 600,000 packets per second. In addition, they reported that 50% of the attacks were less than ten minutes in duration, 80% were less than thirty minutes, and 90% lasted less than an hour. Two percent of the attacks were longer than five hours, 1% were longer than ten hours, and a few dozen spanned multiple days. Furthermore, according to this research, 90% were TCP-based attacks and around 40% reached rates of 500 packets per second (pps) or greater. Analyzed attacks peaked at around 500,000 pps, while other anecdotal sources report larger attacks consuming 35 megabits per second (Mbps) for periods of around 72 hours, with high-volume attacks reaching 800 Mbps.
The Computer Security Institute (2003), in the 2003 CSI/FBI survey, reported that denial of service attacks represented more than a third of the WWW site incidents where unauthorized access or misuse was conducted. Forty-two percent of respondents to the 2003 survey reported DoS attacks, whereas in 2000, 27% reported such attacks. There appears to be a significant upward trend in DoS attacks. The Computer Security Institute (2004), in the 2004 CSI/FBI survey, reported that the highest reported financial losses due to a single DoS attack increased from $1 million in 1998 to $26 million in 2004, and that DoS emerged for the first time as the incident type generating the largest total losses.
We should also keep in mind that many government organisations interpret DDoS attacks as simply being an experience of inadequate service from their ISP and are not aware that they are under attack. As a result, nine out of ten DDoS attacks go unreported. In spite of such evidence, most government organisations overlook the necessity of using preventive mechanisms to combat DoS attacks. Although there is no panacea for all types of DoS attacks, there are many defense mechanisms that can be used to make the launch of an attack more difficult and to provide the means to identify the attacker.
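The figures in this section mix packet rates (pps) and bandwidth (Mbps), which are only comparable once a packet size is fixed. The following sketch shows the conversion. The 40-byte packet size is an assumption made here for illustration (an IPv4 header plus a TCP header, both without options, and no payload); it is not a figure taken from the cited study, and the function name is hypothetical.

```python
def flood_bandwidth_mbps(packets_per_second, packet_size_bytes):
    """Convert a packet rate into bandwidth in megabits per second (1 Mbps = 10**6 bit/s)."""
    return packets_per_second * packet_size_bytes * 8 / 1_000_000


# Assumed packet size: 20-byte IPv4 header + 20-byte TCP header, no payload.
ASSUMED_PACKET_BYTES = 40

print(flood_bandwidth_mbps(500_000, ASSUMED_PACKET_BYTES))  # 160.0
print(flood_bandwidth_mbps(600_000, ASSUMED_PACKET_BYTES))  # 192.0
```

Under this assumption, a flood of 500,000 minimum-size packets per second corresponds to roughly 160 Mbps; larger packets would raise the bandwidth figure proportionally.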
Denial of Service Attack Classification
DoS attacks can be classified into five categories based on the attacked protocol level. More specifically, Karig and Lee (2001) divide DoS attacks into attacks at the network device level, attacks at the OS level, application-based attacks, data flooding attacks, and attacks based on protocol features.
DoS attacks at the network device level include attacks that might be caused either by taking advantage of bugs or weaknesses in software, or by exhausting the hardware resources of network devices. One example of a network device exploit is the one that is caused by a buffer-overrun error in the password checking routine. Using this exploit, certain routers (Karig & Lee, 2001) could be crashed by connecting to the router via telnet and entering extremely long passwords.
The OS level DoS attacks (Karig & Lee, 2001) take advantage of the ways protocols are implemented by operating systems. One example of this category of DoS attacks is the Ping of Death attack (Insecure.org, 1997). In this attack, ICMP echo requests having data sizes greater than the maximum IP standard size are sent to the victim. This attack often results in the crashing of the victim’s machine.
Application-based attacks try to take a machine or a service out of order either by exploiting bugs in network applications that are running on the target host or by using such applications to drain the resources of their victim. It is also possible that the attacker has found points of high algorithmic complexity and exploits them in order to consume all available resources on a remote host. One example of an application-based attack (Karig & Lee, 2001) is the finger bomb. A malicious user could cause the finger routine to be recursively executed on the victim, in order to drain its resources.
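The Ping of Death case described above depends on a reassembled datagram exceeding the maximum IPv4 datagram size of 65,535 bytes. The following is a minimal sketch of the kind of bounds check a receiver could apply to each incoming fragment before reassembly. It assumes IPv4, where the fragment offset is expressed in units of 8 bytes, and a 20-byte header by default; the function name and parameters are hypothetical.

```python
MAX_IP_DATAGRAM = 65535  # bytes; largest value of the 16-bit IPv4 total-length field


def fragment_overflows(offset_units, data_len, header_len=20):
    """Return True if this fragment would push the reassembled datagram past the IPv4 maximum.

    offset_units: fragment offset field, counted in 8-byte units.
    data_len: number of payload bytes carried by this fragment.
    header_len: assumed IPv4 header length in bytes (20 without options).
    """
    end_of_data = offset_units * 8 + data_len
    return header_len + end_of_data > MAX_IP_DATAGRAM


# An ordinary first fragment is fine; a late fragment carrying too much data is not.
print(fragment_overflows(0, 1480))     # False
print(fragment_overflows(8189, 1480))  # True
```

Rejecting such fragments before they are copied into a reassembly buffer is one way an implementation can avoid the crash described above; the check itself is an illustration rather than the code of any specific operating system.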
In data flooding attacks, an attacker attempts to use the bandwidth available to a network, host, or device to its greatest extent by sending massive quantities of data, thus causing it to process extremely large amounts of data. An example is flood pinging.
DoS attacks based on protocol features take advantage of certain standard protocol features. For example, several attacks exploit the fact that IP source addresses can be spoofed. Moreover, several types of DoS attacks attempt to attack the DNS cache on name servers. A simple example of attacks exploiting DNS is when an attacker owning a name server tricks a victim name server into caching false records by querying the victim about the attacker’s own site. If the victim name server is vulnerable, it would then refer to the malicious server and cache the answer.
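The DNS example above works only if the victim name server caches whatever records the malicious server returns. One commonly used safeguard, which the chapter does not describe and which is shown here only as an assumption-laden sketch, is to cache a record only if its name lies within the zone that was actually queried. The function names and the record layout (name, type, value) are hypothetical.

```python
def in_bailiwick(record_name, queried_zone):
    """Return True if record_name equals queried_zone or is a subdomain of it."""
    name = record_name.lower().rstrip(".")
    zone = queried_zone.lower().rstrip(".")
    return name == zone or name.endswith("." + zone)


def filter_answer(records, queried_zone):
    """Keep only the records that the queried zone's server is entitled to answer for."""
    return [r for r in records if in_bailiwick(r[0], queried_zone)]


# A server for attacker.example tries to slip in a record for another domain.
answer = [
    ("www.attacker.example", "A", "192.0.2.1"),
    ("www.bank.example", "A", "192.0.2.66"),
]
print(filter_answer(answer, "attacker.example"))
```

With this filter, only the record for www.attacker.example would be cached; the unrelated record supplied by the attacker's server is discarded.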
DISTRIBUTED DENIAL OF SERVICE ATTACKS
Defining Distributed Denial of Service Attacks
The WWW Security FAQ (Stein & Stewart, 2002) states “A DDoS attack uses many computers to launch a coordinated DoS attack against one or more targets. Using client/server technology, the perpetrator is able to multiply the effectiveness of the DoS significantly by harnessing the resources of multiple unwitting accomplice computers, which serve as attack platforms.” A DDoS attack is distinguished from other attacks by its ability to deploy its weapons in a “distributed” way over the Internet and to aggregate these forces to create lethal traffic. The main goal of a DDoS attack is to cause damage to a victim for personal reasons, for material gain, or for popularity. Mirkovic, Martin, and Reiher (2001) state that the following Internet characteristics make DDoS attacks very destructive:
1. Interdependency of Internet security: When a machine is connected to the Internet, it is also connected to countless insecure and vulnerable hosts, making it difficult to provide a sufficient level of security.
2. Limited resources: Every host in the Internet has limited resources, so sooner or later its resources will be consumed.
3. Many against a few: If the attacker’s resources are greater than the victim’s resources, then the success of a DDoS attack is almost inevitable.
DDoS Strategy
3.
4.
The following steps take place in order to prepare and conduct a DDoS attack: •
A distributed denial of service attack is composed of four elements, as shown in Figure 1. 1. 2.
The real attacker The handlers or masters, who are compromised hosts with a special program capable
of controlling multiple agents, running on them (Cisco Systems, Inc., 2006) The attack daemon agents or zombie hosts, who are compromised hosts, running a special program and generate a stream of packets towards the victim (Cisco Systems, Inc., 2006) A victim or target host
•
Step 1. Selection of agents: The attacker chooses the agents that will perform the attack. The selection of the agents is based on the existence of vulnerabilities in those machines that can be exploited by the attacker in order to gain access to them. Step 2. Compromise: The attacker exploits the security holes and vulnerabilities of the agent machines and plants the attack code.
Figure 1. Architecture of DDoS attacks (the attacker sends control traffic to the handlers, the handlers send control traffic to the agents, and the agents send flood traffic to the victim)
Furthermore, the attacker tries to protect the code from discovery and deactivation. Self-propagating tools such as the Ramen worm (CIAC Information Bulletin, 2001) and Code Red (CERT, 2001) soon automated this phase. When participating in a DDoS attack, each agent program uses only a small amount of resources (both memory and bandwidth), so that the users of the computers experience minimal change in performance. The people who use the agent systems do not know that their systems are compromised and used for the launch of a DDoS attack (Specht & Lee, 2003).
• Step 3. Communication (Specht et al., 2003): Before the attacker commands the onset of the attack, he communicates with the handlers in order to find out which agents can be used in the attack, whether it is necessary to upgrade the agents, and when the best time is to schedule the attack.
• Step 4. Attack: At this step, the attacker commands the onset of the attack (Mirkovic, 2002). The victim and the duration of the attack, as well as special features of the attack such as the type, port numbers, length, TTL, and so forth, can be adjusted.
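The “many against a few” effect of this architecture can be illustrated with a small capacity model. The sketch below is only an analytical aid for defenders: it represents handlers and agents as plain data and adds up the flood traffic that would converge on a victim. All class names, rates, and capacities are hypothetical values chosen for illustration and do not come from the cited sources.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    """A compromised host (zombie) that sends flood traffic to the victim."""
    name: str
    flood_rate_mbps: float  # hypothetical per-agent sending rate


@dataclass
class Handler:
    """A compromised host that relays control traffic to its agents."""
    name: str
    agents: List[Agent] = field(default_factory=list)


def aggregate_flood_mbps(handlers: List[Handler]) -> float:
    """Sum the flood traffic of every agent behind every handler."""
    return sum(a.flood_rate_mbps for h in handlers for a in h.agents)


def is_overwhelmed(handlers: List[Handler], victim_capacity_mbps: float) -> bool:
    """True if the aggregated flood exceeds the victim's link capacity."""
    return aggregate_flood_mbps(handlers) > victim_capacity_mbps


def build_example(num_handlers: int, agents_per_handler: int,
                  rate_mbps: float) -> List[Handler]:
    """Build a hypothetical topology with identical agents."""
    return [
        Handler(
            name=f"handler-{h}",
            agents=[Agent(f"agent-{h}-{a}", rate_mbps)
                    for a in range(agents_per_handler)],
        )
        for h in range(num_handlers)
    ]


# 3 handlers x 100 agents, each agent sending only 0.5 Mbps (assumed values).
topology = build_example(3, 100, 0.5)
print(aggregate_flood_mbps(topology))   # total traffic reaching the victim
print(is_overwhelmed(topology, 100.0))  # compared with a 100 Mbps victim link
```

Even though each agent contributes very little traffic, which is why its owner is unlikely to notice, the sum across a few hundred agents exceeds the assumed capacity of the victim's link.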
In a new generation of DDoS attacks, the onset of the attack is not commanded by the attacker but is triggered automatically during the monitoring of a public location on the Internet. For instance, a chat room may be monitored, and when a specific word is typed the DDoS attack is triggered. In such an environment it is even more difficult to trace the attacker and reveal the true origin of the attack. The enormity of the danger becomes clear if the trigger word or phrase is one that is commonly used.
Specht et al. (2003) state that a multi-user, online chatting system known as Internet relay chat (IRC) is often used for the communication between the attacker and the agents, since IRC networks allow their users to create public, private, and secret channels. The IRC-based DDoS attack model differs little from the agent-handler DDoS attack model, except for the fact that an IRC server is responsible for tracking the addresses of agents and handlers and for facilitating the communication between them. The main advantage of the IRC-based attack model over the agent-handler attack model is the anonymity it offers to the participants of the attack.
DDoS Tools
There are several known DDoS attack tools. The architecture of these tools is very similar, and some tools have been constructed through minor modifications of other tools. In this section, we present the functionality of some of these tools. For presentation purposes, we divide them into agent-based and IRC-based DDoS tools.
Agent-based DDoS tools are based on the agent-handler DDoS attack model that consists of handlers, agents, and victim(s), as already described in the section on DDoS attacks. Some of the best known agent-based DDoS tools are Trinoo, TFN, TFN2K, Stacheldraht, mstream, and Shaft. Trinoo (Criscuolo, 2000) is the best known and most widely used DDoS attack tool. It is able to achieve bandwidth depletion and can be used to launch UDP flood attacks. Tribe Flood Network (TFN) (Dittrich, 1999a) is a DDoS attack tool that is able to perform resource and bandwidth depletion attacks. Some of the attacks that can be launched by TFN include Smurf, UDP flood, TCP SYN flood, ICMP echo request flood, and ICMP directed broadcast. TFN2K (Barlow & Thrower, 2000) is a derivative of the TFN tool and is able to implement Smurf, SYN, UDP, and ICMP flood attacks. TFN2K has the special feature of being able to add encrypted messaging between all of the attack components (Specht et al., 2003). Stacheldraht (Dittrich, 1999b) (German for “barbed wire”), which is based on early versions of TFN, attempts to eliminate some of TFN's weak points and implements Smurf, SYN flood, UDP flood, and ICMP flood attacks. Mstream (Dittrich, Weaver, Dietrich, & Long, 2000) is a simple TCP ACK flooding tool that is able to overwhelm the tables used by fast routing routines in some switches. Shaft (Dietrich et al., 2000) is a DDoS tool similar to Trinoo that is able to launch packet flooding attacks while controlling the duration of the attack as well as the size of the flooding packets.
Many IRC-based DDoS tools are very sophisticated, as they include some important features that are also found in many agent-handler attack tools. One of the best known IRC-based DDoS tools is Trinity (Hancock, 2000). Besides the by now well-known UDP, TCP SYN, TCP ACK, and TCP NUL packet floods, Trinity v3 (Dietrich et al., 2000) introduces TCP fragment floods, TCP RST packet floods, TCP random flag packet floods, and TCP established floods. In the same generation as Trinity are myServer (Dietrich et al., 2000) and Plague (Dietrich et al., 2000). MyServer relies on external programs to provide DoS, and Plague provides TCP ACK and TCP SYN flooding. Knight (Bysin, 2001) is a very lightweight and powerful IRC-based DDoS attack tool that is able to perform UDP flood attacks and SYN attacks and that includes an urgent pointer flooder. A DDoS tool that is based on Knight is Kaiten (Specht et al., 2003). Kaiten includes UDP and TCP flood attacks as well as SYN and PUSH+ACK attacks, and it also randomizes the 32 bits of its source address.
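For quick reference, the tool descriptions above can be condensed into a lookup table. The following sketch is merely one way to hold that summary so that an analyst who observes a given flood type can list the tools described here as capable of producing it. The dictionary layout, the label strings, and the helper name `tools_using` are illustrative assumptions; the tool-to-attack assignments are taken from the text of this section, and Shaft's unspecified packet floods are recorded under a generic label.

```python
# Summary of the tools discussed in this section: attack model and flood types.
TOOLS = {
    "Trinoo": {"model": "agent-handler", "attacks": {"UDP flood"}},
    "TFN": {"model": "agent-handler",
            "attacks": {"Smurf", "UDP flood", "TCP SYN flood",
                        "ICMP echo request flood", "ICMP directed broadcast"}},
    "TFN2K": {"model": "agent-handler",
              "attacks": {"Smurf", "TCP SYN flood", "UDP flood", "ICMP flood"}},
    "Stacheldraht": {"model": "agent-handler",
                     "attacks": {"Smurf", "TCP SYN flood", "UDP flood",
                                 "ICMP flood"}},
    "mstream": {"model": "agent-handler", "attacks": {"TCP ACK flood"}},
    "Shaft": {"model": "agent-handler", "attacks": {"packet flood"}},
    "Trinity": {"model": "IRC",
                "attacks": {"UDP flood", "TCP SYN flood", "TCP ACK flood",
                            "TCP NUL flood", "TCP fragment flood",
                            "TCP RST flood", "TCP random flag flood",
                            "TCP established flood"}},
    "myServer": {"model": "IRC", "attacks": set()},  # relies on external programs
    "Plague": {"model": "IRC", "attacks": {"TCP ACK flood", "TCP SYN flood"}},
    "Knight": {"model": "IRC",
               "attacks": {"UDP flood", "TCP SYN flood", "urgent pointer flood"}},
    "Kaiten": {"model": "IRC",
               "attacks": {"UDP flood", "TCP flood", "TCP SYN flood",
                           "PUSH+ACK flood"}},
}


def tools_using(attack: str) -> list:
    """Return the names of tools listed above that can launch `attack`."""
    return sorted(name for name, info in TOOLS.items()
                  if attack in info["attacks"])


def tools_by_model(model: str) -> list:
    """Return the names of tools that follow the given attack model."""
    return sorted(name for name, info in TOOLS.items()
                  if info["model"] == model)


print(tools_using("TCP ACK flood"))
print(tools_by_model("IRC"))
```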
DDoS Classification
To be able to understand DDoS attacks it is necessary to have a formal classification. We propose a classification of DDoS attacks that efficiently combines the classifications proposed by Mirkovic et al. (2001), Specht et al. (2003), and more recent research results. This classification is illustrated in Figure 2 and consists of two levels. In the first level, attacks are classified according to their degree of automation, exploited vulnerability, attack rate dynamics, and their impact. In the second level, specific characteristics of each first level category are recognized.

Figure 2. Classification of DDoS attacks
• Classification by degree of automation: manual, semi-automatic (direct, indirect), automatic
• Classification by exploited vulnerability: flood attack (UDP flood, ICMP flood), amplification attack (Smurf attack, Fraggle attack), protocol exploit attack, malformed packet attack
• Classification by attack rate dynamics: continuous, variable (fluctuating, increasing)
• Classification by impact: disruptive, degrading
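The two-level classification can also be written down as a nested mapping, which makes it easy to look up where a given attack type sits. The sketch below is an illustration only: the grouping of second-level entries under their first-level categories is reconstructed from the labels of Figure 2 and should be checked against the original figure, and the names `TAXONOMY` and `classify` are hypothetical.

```python
# First level: classification criterion -> category.
# Second level: category -> more specific characteristics (may be empty).
TAXONOMY = {
    "degree of automation": {
        "manual": [],
        "semi-automatic": ["direct", "indirect"],
        "automatic": [],
    },
    "exploited vulnerability": {
        "flood attack": ["UDP flood", "ICMP flood"],
        "amplification attack": ["Smurf attack", "Fraggle attack"],
        "protocol exploit attack": [],
        "malformed packet attack": [],
    },
    "attack rate dynamics": {
        "continuous": [],
        "variable": ["fluctuating", "increasing"],
    },
    "impact": {
        "disruptive": [],
        "degrading": [],
    },
}


def classify(label: str) -> list:
    """Return every path (criterion, category[, subcategory]) ending in `label`."""
    paths = []
    for criterion, categories in TAXONOMY.items():
        for category, subcategories in categories.items():
            if category == label:
                paths.append((criterion, category))
            for sub in subcategories:
                if sub == label:
                    paths.append((criterion, category, sub))
    return paths


print(classify("Smurf attack"))
print(classify("degrading"))
```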
CLASSIFICATION OF DDoS DEFENSE MECHANISMS
DDoS attack detection is extremely difficult. The distributed nature of DDoS attacks makes them extremely difficult to combat or trace back. Moreover, the automated tools that make the deployment of a DDoS attack possible can be easily downloaded. Attackers may also use IP spoofing in order to hide their true identity, which makes the traceback of DDoS attacks even more difficult. We may classify DDoS defense mechanisms using two different criteria. The first classification categorizes the DDoS defense mechanisms according to the activity deployed, as follows:
1. Intrusion prevention: Tries to stop DDoS attacks from being launched in the first place.
2. Intrusion detection: Focuses on guarding host computers or networks against being a source of network attacks as well as being a victim of DDoS attacks, either by recognizing abnormal behaviours or by using a database of known attack signatures.
3. Intrusion response: Tries to identify the attack source and block its traffic accordingly.
4. Intrusion tolerance and mitigation: Accepts that it is impossible to prevent or stop DDoS attacks completely and focuses on minimizing the attack impact and on maximizing the quality of the offered services.
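To make the idea of detection by “recognizing abnormal behaviours” more concrete, the following sketch flags a traffic measurement as anomalous when it lies far above a baseline learned from normal operation. It is a deliberately simple threshold rule and not the method of any of the cited systems; the function names, the sample values, and the choice of three standard deviations are all assumptions made for illustration.

```python
import statistics
from typing import Sequence, Tuple


def build_baseline(samples: Sequence[float]) -> Tuple[float, float]:
    """Characterize normal traffic as (mean, population standard deviation)."""
    return statistics.mean(samples), statistics.pstdev(samples)


def is_anomalous(observed: float, baseline: Tuple[float, float],
                 k: float = 3.0) -> bool:
    """Flag a measurement more than k standard deviations above the mean."""
    mean, std = baseline
    return observed > mean + k * std


# Hypothetical requests-per-second measurements taken during normal operation.
normal_traffic = [100, 110, 90, 105, 95]
baseline = build_baseline(normal_traffic)

print(is_anomalous(115, baseline))  # slightly busier than usual
print(is_anomalous(900, baseline))  # far outside the normal range
```

A signature-based detector would instead compare traffic against stored patterns of known attacks; the two approaches are complementary, since a threshold rule of this kind can react to previously unseen floods but will also raise false alarms during legitimate traffic peaks.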
The second classification divides the DDoS defenses according to their deployment location (Mirkovic, 2002), resulting in the following three categories of defense mechanisms:
1. Victim network mechanisms: Help the victim recognize when it is the main target of an attack and gain more time to respond.
2. Intermediate network mechanisms: Are more effective than victim network mechanisms, since they achieve better handling of the attack traffic and easier tracing back to the attackers.
3. Source network mechanisms: Try to stop attack flows before they enter the Internet core and facilitate the traceback and investigation of an attack.
The previous classification of DDoS defense mechanisms is described thoroughly in Douligeris and Mitrokotsa (2004).
BEST PRACTICES FOR DEFEATING DENIAL OF SERVICE ATTACKS
DoS attacks can lead to a complete standstill of entire government organisations, thereby costing millions of dollars in lost revenue and/or productivity and driving citizens away from e-services. Some governments do not understand the seriousness of the problem, resulting in vulnerable and easy-to-compromise systems. These systems pose a threat not only to the organisations themselves but also to anyone else targeted by a hacker through these systems. It is therefore critical to take preemptive measures to reduce the possibility of these attacks and minimize their impact. Since DoS attacks are extremely complicated, one must note that there is no single-point solution and that no system is completely secure. No one can deny, though, that with effective advance planning government agencies could respond efficiently and rapidly to security threats like denial of service. Below we list some practices that can be used in order to reduce these attacks and diminish their impact.
1. Establish a security policy and educate: As stated by Walters (2001), it is of great importance to establish and maintain a security policy. In addition to covering the basics of antivirus, user access, and software updates, such a policy should on no account neglect to address ways to combat DoS/DDoS attacks. Moreover, a security policy should be adequately communicated to all employees. It is important to verify that the knowledge and skills of system administrators and auditors are current, something that can be achieved by frequent certifications. Of great importance is the continuous training of the organisation’s personnel in new technologies and forensic techniques.
2. Use multiple ISPs: Government organisations should consider using more than one ISP, in order to make a DoS/DDoS attack against them harder to carry out. In the selection of ISPs, it is important to keep in mind that providers should use different access routes in order to avoid a complete loss of access in case one pipe becomes disabled (Walters, 2001). Legislation making it obligatory for ISPs to set up egress filtering has also been proposed.
3. Load balancing: Specht et al. (2003) state that a good approach to avoid being a victim of DoS attacks is to distribute an organisation’s system load across multiple servers. In order to achieve this, “Round Robin DNS” or hardware routers could be used to send incoming requests to one of many servers.
4. Avoid a single point of failure: In order to avoid a single point of failure, the best solution is to have redundant machines (“hot spares”) that can be used in case a similar machine is disabled (Householder, Manion, Pesante, Weaver, & Thomas, 2001). Furthermore, organisations should develop recovery plans that cover each potential failure point of their system. In addition, organisations should use multiple operating systems in order to create “biodiversity” and avoid DoS attack tools that target specific operating systems (OSs).
5. Protect the systems with a firewall: Walters (2001) states that, since the exposure to potential intruders is increasing, government organisations should install firewalls that tightly limit transactions across the systems’ periphery in order to provide effective defenses. Firewalls should be configured appropriately, keeping open only the necessary ports. In addition, firewalls are able to carefully control, identify, and handle overrun attempts. Moreover, ingress filtering should be established in government Web servers so that they cannot be used as zombies for launching attacks on other servers. Government departments should also specify a set of IP addresses that can be used only by government servers.
6. Disable unused services: As Leng and Whinston (2000) state, it is important that organisations’ systems remain simple by minimizing the number of services running on them. This can be achieved by shutting down all services that are not required. It is important to turn off or restrict specific services that might otherwise be compromised or subverted in order to launch DoS attacks. For instance, if UDP echo or character generator services are not required, disabling them will help to defend against attacks that exploit these services.
7. Be up to date on security issues: The best way to combat DoS attacks is to try to be always protected and up to date on security issues (Householder et al., 2001). It is important to be informed about the current upgrades, updates, security bulletins, and vendor advisories in order to prevent DoS attacks. In this way, the exposure to DoS attacks can be substantially reduced, although one would not expect the risk to be eliminated entirely.
8. Test and monitor systems carefully: The first step in detecting anomalous behaviour is to “characterize” what normal means in the context of a government agency’s network. The next step should be the auditing of access privileges, activities, and applications. Administrators should perform 24x7 monitoring in order to reduce the damaging effects of DoS attacks on government servers. Through this procedure, organisations would be able to detect unusual levels of network traffic or CPU usage (Householder et al., 2001). There is a variety of tools that are able to detect, eliminate, and analyze denial-of-service attacks.
9. Mitigate spoofing: An approach that intruders often use in order to conceal their identity when launching DoS attacks is source-address spoofing. Although it is impossible to completely eliminate IP spoofing, it is important to mitigate it (Singer, 2000). There are some approaches that can be used in order to make the origins of attacks harder to hide and to shorten the time needed to trace an attack back to its origins. System administrators can effectively reduce the risk of IP spoofing by using ingress and egress packet filtering on firewalls and/or routers.
10. Stop broadcast amplification: It is important to disable inbound directed broadcasts in order to prevent a network from being used as an amplifier for attacks like ICMP flood and Smurf (Leng et al., 2000). The best action that network hardware vendors could take is to turn off IP directed broadcast in routers and to make this the default configuration.
11. Do not use DNS for access control: Using hostnames in access lists instead of IP addresses makes systems vulnerable to name spoofing (Leng et al., 2000). Systems should not rely on domain or host names in order to determine whether an access is authorized or not. Otherwise, intruders can masquerade as a trusted system simply by modifying the reverse-lookup tables.
12. Create an incident response plan: It is important to be prepared and ready for any possible attack scenario. Government organisations should define a set of clear procedures to be followed in emergency situations and train teams of personnel with clearly defined responsibilities who are ready to respond in emergency cases (Householder et al., 2001). Any attacks or suspicious system flaws should be reported to local law enforcement and the proper authorities (such as the FBI and CERT) so that the information can be used for the defense of other users as well.
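The ingress and egress packet filtering mentioned under “Mitigate spoofing” comes down to a simple rule about source addresses, which the following sketch expresses with Python's standard `ipaddress` module. It is a conceptual illustration, not a firewall configuration: the internal prefix is a reserved documentation range standing in for an organisation's own address block, and the function names are hypothetical.

```python
import ipaddress

# Assumed address block of the organisation (a reserved documentation prefix).
INTERNAL_NETS = [ipaddress.ip_network("192.0.2.0/24")]


def is_internal(addr: str) -> bool:
    """True if the address belongs to one of the organisation's own networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in INTERNAL_NETS)


def allow_egress(src: str) -> bool:
    """Egress filtering: outbound packets must carry an internal source address,
    so hosts inside the network cannot send packets with forged sources."""
    return is_internal(src)


def allow_ingress(src: str) -> bool:
    """Ingress filtering: inbound packets must not claim an internal source
    address, since such packets cannot legitimately arrive from outside."""
    return not is_internal(src)


print(allow_egress("192.0.2.10"))    # genuine internal source leaving the network
print(allow_egress("203.0.113.5"))   # forged source leaving the network
print(allow_ingress("192.0.2.10"))   # outside packet pretending to be internal
print(allow_ingress("198.51.100.7")) # ordinary external source
```

In practice these checks are enforced by access lists on border routers or firewalls rather than in application code. The sketch only shows why both directions matter: egress filtering keeps an organisation's own machines from being used as sources of spoofed traffic, while ingress filtering rejects outside traffic that impersonates internal hosts.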
LONG-TERM COUNTERMEASURES
The variety and sophistication of DoS attacks are likely to increase, so despite the defensive measures that can be used now, we need to confront DoS attacks as a problem that requires a long-term effort in order to define and implement effective solutions. It is important to note here that governments should adopt a non-intrusive approach to protection against DoS attacks, since there is a fine line between limiting criminal activity and limiting economy, education, information, and personal freedoms. SANS Institute (2000) identifies some actions that will help in defending against DoS attacks more effectively in the distant future. Among them one finds the accelerated adoption of the IPsec components of IPv6 and Secure DNS. It is important that the security updating process be automated. Vendors should be encouraged to implement this on behalf of their clients in order to make it easier to update their products and provide information on security issues. Furthermore, research and development of safer operating systems is necessary. Topics to be addressed should include, among others, anomaly-based detection and other forms of intrusion detection. In addition, governments should consider changing their procurement policies so that security and safety are emphasized. A significant contribution to the fight against denial of service attacks would be the establishment of organisations responsible for network security monitoring and incident handling. These organisations should encourage public awareness about security issues, inform the owners of critical infrastructures and government departments about threats, promote and encourage the adoption and production of security standards, and maintain statistics and incident databases, as well as cooperate with similar organisations (e.g., CERT). Governments should also ensure that government agencies take all the necessary steps to ensure their IT security. Government departments should encourage better investigation of computer attacks while respecting the privacy and personal rights of Internet users. Additional funding for training expert personnel in securing IT systems and for educating citizens so that they are protected from cyber crime is a must. It is also important to promote and encourage the prosecution of perpetrators across national borders by law enforcement authorities and to examine the legal framework in order to facilitate this cooperation.
CONCLUSION
Undoubtedly, DoS attacks should be treated as a serious problem in the Internet. Their rate of growth and widespread use challenge the general public’s view of electronic transactions and make governments and businesses skeptical. No one can deny that DoS attacks will continue to pose a significant threat to all organisations, including government organisations. New defense mechanisms will be followed by the emergence of new DoS attack modes. A network infrastructure must be both robust enough to survive direct DoS attacks and extensible enough to adopt and embrace new defenses against emerging and unanticipated attack modes. In order to ensure high resiliency and high performance in public and private networks, administrators, service providers, and equipment manufacturers need to make a concerted effort. It is of great importance that citizens communicate with their government authorities online. No one should be allowed to shut down valuable e-government services. A more enlightened approach would be to ask all citizens to take responsibility for securing the Internet into their own hands. Public awareness is the key to existing securely and succeeding in the world of e-government.
REFERENCES
Barlow, J., & Thrower, W. (2000). TFN2K - An analysis. Retrieved from http://seclists.org/lists/bugtraq/2000/Feb/0190.html
Bysin. (2001). Knight.c Sourcecode. Retrieved from http://packetstormsecurity.nl/distributed/knight.c
CERT. (2001). CERT Coordination Center Advisory CA-2001-19 Code Red Worm Exploiting Buffer Overflow in IIS Indexing Service DLL. Carnegie Mellon Software Engineering Institute. Retrieved from http://www.cert.org/advisories/CA-2001-19.html
CIAC Information Bulletin. (2001). L-040: The Ramen Worm. Computer Incident Advisory Capability (CIAC). Retrieved from http://www.ciac.org/ciac/bulletins/l-040.shtml
Cisco Systems, Inc. (2006). Strategies to protect against distributed denial of service (DDoS) attacks (Document ID: 13634). Retrieved from http://www.cisco.com/warp/public/707/newsflash.html
Computer Security Institute. (2003). 2003 CSI/FBI Computer Crime and Security Survey. CSI Inc.
Computer Security Institute. (2004). 2004 CSI/FBI Computer Crime and Security Survey. CSI Inc.
Criscuolo, P. J. (2000). Distributed denial of service Trin00, Tribe Flood Network, Tribe Flood Network 2000, and Stacheldraht CIAC-2319 (Tech. Rep. No. UCRL-ID-136939, Rev. 1). Department of Energy Computer Incident Advisory Capability (CIAC), Lawrence Livermore National Laboratory. Retrieved from http://ftp.se.kde.org/pub/security/csir/ciac/ciacdocs/ciac2319.txt
Cs3 Inc. (2005). Defending government network infrastructure against distributed denial of service attacks. CS3-inc.com. Retrieved from http://www.cs3-inc.com/government-ddos-threat-andsolutions.pdf
Dietrich, S., Long, N., & Dittrich, D. (2000). Analyzing distributed denial of service tools: The shaft case. In Proceedings of the 14th Systems Administration Conference (LISA 2000) (pp. 329-339), New Orleans, LA.
Dittrich, D. (1999a). The tribe flood network distributed denial of service attack tool. University of Washington. Retrieved from http://staff.washington.edu/dittrich/misc/trinoo.analysis.txt
Dittrich, D. (1999b). The Stacheldraht distributed denial of service attack tool. University of Washington. Retrieved from http://staff.washington.edu/dittrich/misc/stacheldraht.analysis.txt
Dittrich, D., Weaver, G., Dietrich, S., & Long, N. (2000). The mstream distributed denial of service attack tool. University of Washington. Retrieved from http://staff.washington.edu/dittrich/misc/mstream.analysis.txt
Douligeris, C., & Mitrokotsa, A. (2004). DDoS attacks and defense mechanisms: Classification and state-of-the-art. Computer Networks, 44(5), 643-666.
Fox News. (2002). Powerful attack cripples Internet. Retrieved from http://www.linux.security.com/content/view/112716/65/
Hancock, B. (2000). Trinity v3, A DDoS tool, hits the streets. Computers & Security, 19(7), 574.
Holden, S., Norris, D., & Fletcher, P. (2003). Electronic government at the local level: Progress to date and future issues. Public Performance and Management Review, 26(4), 325-344.
Householder, A., Manion, A., Pesante, L., Weaver, G. M., & Thomas, R. (2001). Trends in denial of service attack technology (v10.0). CERT Coordination Center, Carnegie Mellon University. Retrieved from http://www.cert.org/archive/pdf/DoS_trends.pdf
Howard, J. (1998). An analysis of security incidents on the Internet 1989-1995. PhD thesis, Carnegie Mellon University. Retrieved from http://www.cert.org/research/JHThesis/Start.html
Insecure.org. (1997). Ping of death. Retrieved from http://www.insecure.org/sploits/ping-o-death.html
Institute for e-government Competence Center (IfG.CC). (2002). eGovernment: “First fight the hackers.” Retrieved from http://www.uni-potsdam.de/db/elogo/ifgcc/index.php?option=com_content&task=view&id=1450&Itemid=93&lang=en_GB
ITworld.com. (2001). CERT hit by DDoS attack for a third day. Retrieved from http://www.itworld.com/Sec/3834/IDG010524CERT2/
Karig, D., & Lee, R. (2001). Remote denial of service attacks and countermeasures (Tech. Rep. No. CE-L2001-002). Department of Electrical Engineering, Princeton University.
Leng, X., & Whinston, A. B. (2000). Defeating distributed denial of service attacks. IEEE IT Professional, 2(4), 36-42.
Mirkovic, J. (2002). D-WARD: DDoS network attack recognition and defense. PhD dissertation prospectus. Retrieved from http://www.lasr.cs.ucla.edu/ddos/prospectus.pdf
Mirkovic, J., Martin, J., & Reiher, P. (2001). A taxonomy of DDoS attacks and DDoS defense mechanisms (Tech. Rep. No. 020018). UCLA CSD.
Moore, D., Voelker, G., & Savage, S. (2001). Inferring Internet denial of service activity. In Proceedings of the USENIX Security Symposium, Washington, DC (pp. 9-22).
SANS Institute. (2000). Consensus roadmap for defeating distributed denial of service attacks (Version 1.10). Sans Portal. Retrieved from http://www.sans.org/dosstep/roadmap.php
Singer, A. (2000). Eight things that ISP’s and network managers can do to help mitigate distributed denial of service attacks. San Diego Supercomputer Center (SDSC), (NPACI). Retrieved from http://security.sdsc.edu/publications/ddos.shtml
Sophos.com. (2002). Indian sympathisers launch denial of service attack on Pakistani government. Retrieved from http://www.sophos.com/virusinfo/articles/yahae3.html
Spafford, E. H. (1998). The Internet worm program: An analysis (Tech. Rep. No. SD-TR-823). Department of Computer Science, Purdue University, West Lafayette, IN.
Specht, S., & Lee, R. (2003). Taxonomies of distributed denial of service networks, attacks, tools, and countermeasures (Tech. Rep. No. CE-L2003-03). Princeton University.
Stein, L. D., & Stewart, J. N. (2002). The World Wide Web Security FAQ version 3.1.2. World Wide Web Consortium (W3C). Retrieved from http://www.w3.org/Security/Faq
Symantec. (2004). Symantec reports government specific attack data (Article ID 4927). Symantec.com. Retrieved from http://enterprisesecurity.symantec.com/publicsector/article.cfm?articleid=4927
U.S. Subcommittee on Oversight and Investigations Hearing. (2001). Protecting America’s critical infrastructures: How secure are government computer systems? Energycommerce.house.gov. Retrieved from http://energycommerce.house.gov/107/hearings/04052001Hearing153/McDonald229.htm
Walters, R. (2001). Top 10 ways to prevent denial-of-service attacks. Information Systems Security, 10(3), 71-72.
WindowsITPro. (2001). Microsoft suffers another DoS attack. WindowsITPro Instant Doc 19770. Retrieved from http://www.windowsitpro.com/Articles/Index.cfm?ArticleID=19770&DisplayTab=Article
This work was previously published in Secure E-Government Web Services, edited by A. Mitrakas, P. Hengeveld, D. Polemi, and J. Gamper, pp. 124-142, copyright 2007 by Idea Group Publishing (an imprint of IGI Global).
Chapter 1.2
Policy Frameworks for Secure Electronic Business
Andreas Mitrakas
Ubizen, Belgium
INTRODUCTION
Terms conveyed by means of policy in electronic business have become a common way to express permissions and limitations in online transactions. Doctrine and standards have contributed to determining policy frameworks and making them mandatory in certain areas such as electronic signatures. A typical example of limitations conveyed through policy in electronic signatures is the certificate policies that certification authorities (CAs) typically make available to subscribers and relying parties. Trade partners might also use policies to convey limitations on the way electronic signatures are accepted within specific business frameworks. Examples of transaction constraints might include limitations on the roles undertaken to carry out an action in a given context, which can be introduced by means of attribute certificates. Relying parties might also use signature policies to denote the conditions for the validation and verification of electronic signatures they accept. Furthermore, signature policies might contain additional transaction-specific limitations on validating an electronic signature addressed to end users. Large-scale transactions that involve the processing of electronic signatures on a mass scale within diverse applications rely on policies to convey signature-related information and limitations in a transaction. As legally binding statements, policies are used to convey trust in electronic business. Extending further the use of policy in transaction environments can enhance security, legal safety, and transparency in a transaction. Additional improvements are required, however, in order to render applicable the terms that are conveyed through policy and to enforce them unambiguously in a transaction. The remainder of this article discusses common concepts of policies and certain applications thereof.
BACKGROUND
An early example of a transaction framework is open EDI (Electronic Data Interchange), which aims at using openly available structured data formats and is delivered over open networks. While the main goal of open EDI has been to enable short-term or ad hoc commercial transactions among organisations (Kalakota & Whinson, 1996), it has also aimed at lowering the entry barriers of establishing structured data links between trading partners by minimising the need for bilateral framework agreements, known as interchange agreements. One specific requirement of open EDI is to set up the operational and contract framework within which a transaction is carried out. Automating the process of negotiating and executing agreements regarding the legal and technical conditions for open EDI can significantly lower the entry barriers, especially for non-recurrent transactions (Mitrakas, 2000). Building on the model for open EDI, the Business Collaboration Framework is a set of specifications and guides, the centre of which is the UN/CEFACT; it aims at further lowering the entry barriers of electronic commerce based on structured data formats. The need for flexibility and versatility in loosely coupled applications and communication on the Internet has led to the emergence of Web services. A Web service is a collection of protocols and standards that are used to exchange data between applications. While applications can be written in various languages and run on various platforms, they can use Web services to exchange data over the Internet. In Web services, using open standards ensures interoperability. These standards also include formal descriptions of models of business procedures to specify classes of business transactions that all serve the same goal. A trade procedure stipulates the actions, the parties, the order, and the timing constraints on performing actions (Lee, 1996). In complex business situations, a transaction scenario might typically be divided among different trade partners, each of whom owns a piece of that scenario. Associating a scenario with a trade partner often requires electronic signatures.
When a trade partner signs with an electronic signature, she might validate or approve of the way that individual procedural components might operate within a transaction. The signatory of an electronic document or a transaction procedure depends on the performance of complex systems that are often opaque to the end user. Trust in the transaction procedures and the provision of services is a requirement that ensures that the signatory eventually adheres to transparent contract terms that cannot be repudiated (Mitrakas, 2003). Policy is seen as a way to formalise a transaction by highlighting those aspects of a transaction that are essential to the end user (Mitrakas, 2004). The immediate effect of using policies to convey limitations is that the party that relies on a signed transaction adheres to the limitations of that policy. Policy is, therefore, used to convey limitations to a large number of users in a way that makes a transaction enforceable. While these limitations are mostly meaningful at the operational or technical level of the transaction, they often have a binding legal effect and are used to convey contractual terms. Although these terms are not necessarily legal by nature, they are likely to have a binding effect. Sometimes they can be more far reaching by constraining relying parties that validate electronic signatures. Limitations might be mandated by law or merely by agreement, as in the case of limitations of qualified signatures according to European Directive 1999/93/EC on a common framework for electronic signatures (ETSI TS 101 456).
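To illustrate how limitations conveyed through policy might be applied by a relying party, the following sketch checks a signed transaction against a simple signature policy. It is a hypothetical model only: the field names, the policy identifiers, the roles, and the value limit are invented for illustration and do not correspond to any standard policy format, and cryptographic verification of the signature itself is assumed to have taken place elsewhere.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class SignaturePolicy:
    """Limitations a relying party places on the signatures it accepts."""
    accepted_certificate_policies: FrozenSet[str]  # hypothetical identifiers
    permitted_roles: FrozenSet[str]                # e.g. from attribute certificates
    max_transaction_value: float                   # transaction-specific limit


@dataclass(frozen=True)
class SignedTransaction:
    """The policy-relevant facts about a transaction that has been signed."""
    certificate_policy: str
    signer_role: str
    value: float


def policy_violations(tx: SignedTransaction,
                      policy: SignaturePolicy) -> List[str]:
    """Return the list of policy limitations the transaction fails to meet."""
    violations = []
    if tx.certificate_policy not in policy.accepted_certificate_policies:
        violations.append("certificate policy not accepted")
    if tx.signer_role not in policy.permitted_roles:
        violations.append("signer role not permitted")
    if tx.value > policy.max_transaction_value:
        violations.append("transaction value exceeds limit")
    return violations


policy = SignaturePolicy(
    accepted_certificate_policies=frozenset({"example-qualified-policy"}),
    permitted_roles=frozenset({"purchasing-officer"}),
    max_transaction_value=10_000.0,
)

ok = SignedTransaction("example-qualified-policy", "purchasing-officer", 2_500.0)
bad = SignedTransaction("example-basic-policy", "purchasing-officer", 50_000.0)

print(policy_violations(ok, policy))   # an empty list means the policy is met
print(policy_violations(bad, policy))
```

Returning the list of unmet limitations, rather than a bare yes or no, reflects the point made above that policies serve transparency: a relying party can state precisely which term of the policy caused a signature to be rejected.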
POLICY CONSTRAINTS IN ELECTRONIC BUSINESS
Electronic signatures have been seen as a lynchpin of trust in electronic transactions. The subject matter of current electronic signature regulation addresses the requirements on the legal recognition of electronic signatures used for non-repudiation and authentication (Adams & Lloyd, 1999). Non-repudiation is addressed in both technical standards such as X.509 and legislation.
Policy Frameworks for Secure Electronic Business
Non-repudiation addresses the requirement for electronic signing in a transaction in such a way that an uncontested link to the declaration of will of the signatory is established. Non-repudiation is the attribute of a communication that protects against a successful dispute of its origin, submission, delivery, or content (Ford & Baum, 2001). From a business perspective, non-repudiation can be seen as a service that provides a high level of assurance that information is genuine and non-refutable (Pfleeger, 2000). From a legal perspective, non-repudiation in the meaning of Directive 1999/93/EC on a common framework for electronic signatures is captured by the term qualified signature, which is often used to describe an electronic signature that uses a secure signature creation device and is supported by a qualified certificate. A qualified signature is defined in the annexes of the directive and is granted the same legal effect as a handwritten signature where the law requires one in a transaction.
Policies aim at invoking trust in transactions to ensure transparency and a spread of risk among the transacting parties. Policies are unilateral declarations of will that complement transaction frameworks based on private law. Policies can be seen as guidelines that relate to the technical, organizational, and legal aspects of a transaction, and they are rendered enforceable by means of an agreement that binds the transacting parties.
In Public Key Infrastructure (PKI), a CA typically uses policy in the form of a certification practice statement (CPS) to convey legally binding limitations to certificate users, that is, subscribers and relying parties. A CPS is a statement of the practices that a CA employs in issuing certificates (ABA, 1996). A CPS is a comprehensive treatment of how the CA makes its services available, and it delimits the domain of providing electronic signature services to subscribers and relying parties.
A certificate policy (CP) is sometimes used with a CPS to address the certification objectives of the CA implementation. While the CPS is typically seen as answering “how” security objectives are
met, the CP is the document that sets these objectives (ABA, 2001). A CP and a CPS are used to convey the information that subscribers and parties relying on electronic signatures need in order to assess the level of trustworthiness of a certificate that supports an electronic signature. By providing detailed information on security and procedures required in managing the life cycle of a certificate, policies become of paramount importance in transactions. Sometimes, a PKI Disclosure Statement (PDS) distils certain important policy aspects and serves the purpose of notice and conspicuousness in communicating applicable terms (ABA, 2001). The Internet Engineering Task Force (IETF) has specified a model framework for certificate policies (RFC 3647).
Assessing the validity of electronic signatures is yet another requirement of the end user, most importantly, the relying parties. A signature policy describes the scope and usage of such an electronic signature with a view to addressing the operational conditions of a given transaction context (ETSI TR 102 041). A signature policy is a set of rules under which an electronic signature can be created and determined to be valid (ETSI TS 101 733). A signature policy determines the validation conditions of an electronic signature within a given context. A context may include a business transaction, a legal regime, a role assumed by the signing party, and so forth. In a broader perspective, a signature policy can be seen as a means to invoke trust and convey information in electronic commerce by defining appropriately indicated trust conditions. In signature policies it is also desirable to include additional elements of information associated with certain aspects of general terms and conditions, to relate them to the scope of the performed action as it applies in the transaction at hand (Mitrakas, 2004).
A signature policy might, therefore, include content that relates it to the general conditions prevailing in a transaction, the discrete elements of a transaction procedure as provided by the various parties involved in building a transaction, as well as the prevailing certificate policy (ETSI TR 102 041). Trade parties might use transaction constraints to designate roles or other attributes undertaken to carry out an action within a transaction framework. Attribute certificates are used to convey such role constraints and to indicate a role, a function, or a transaction type constraint. Attribute policies are used to convey limitations associated with the use and life cycle of such attributes (ETSI TS 101 058).
Processing signed electronic invoices is an application area of using policies. By means of a signature policy, the recipient of an invoice might mandate a specific signature format and associated validation rules. The sender of the invoice might require that signing an invoice may only be carried out under a certain role; therefore, an attribute certificate issued under a specific attribute policy might be mandated. This attribute policy complements the certification practice statement that the issuer of electronic certificates makes available. It is expected that certificate policies shall influence the requirements to make a signature policy binding (Mitrakas, 2003).
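The invoice scenario above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class names, field names, policy identifier, format label, and role label are all hypothetical, and cryptographic verification of the signature is assumed to have been done elsewhere and is represented here by a single boolean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignaturePolicy:
    """Rules that must hold for a signed invoice to be accepted (all fields hypothetical)."""
    policy_id: str
    required_format: str            # signature format mandated by the recipient
    required_role: str              # role under which signing is allowed
    required_attribute_policy: str  # attribute policy the role certificate must follow


@dataclass(frozen=True)
class SignedInvoice:
    """Facts about a received invoice, as established by earlier processing steps."""
    signature_format: str
    signer_role: str
    attribute_policy: str
    signature_verifies: bool        # outcome of cryptographic verification done elsewhere


def policy_violations(invoice, policy):
    """Return the reasons the invoice fails the policy; an empty list means it is acceptable."""
    problems = []
    if not invoice.signature_verifies:
        problems.append("signature does not verify")
    if invoice.signature_format != policy.required_format:
        problems.append("signature format not allowed by policy")
    if invoice.signer_role != policy.required_role:
        problems.append("signer role not allowed by policy")
    if invoice.attribute_policy != policy.required_attribute_policy:
        problems.append("role attribute issued under a different attribute policy")
    return problems


# Made-up example values.
policy = SignaturePolicy("invoice-policy-v1", "format-A", "billing-officer", "attr-policy-1")
good = SignedInvoice("format-A", "billing-officer", "attr-policy-1", True)
bad = SignedInvoice("format-A", "trainee", "attr-policy-1", True)

print(policy_violations(good, policy))
print(policy_violations(bad, policy))
```

Returning a list of violations rather than a single yes/no keeps the reason for a rejection visible, which matters when the policy is meant to be transparent to the relying party.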
BINDING POLICIES IN ELECTRONIC BUSINESS
Communicating and rendering policies binding has been an issue of significant importance in electronic transactions. Inherent limitations in the space available for digital certificates dictate that policies are often conveyed and used in a transaction by incorporating them by reference (Wu, 1998). Incorporation by reference is to make one message part of another message by identifying the message to be incorporated, providing information that enables the receiving party to access and obtain the incorporated message in its entirety, and expressing the intention that it be part of the other message (ABA, 1996). The incorporation of policies for electronic signatures into the agreement between signatory and recipient can take place by referencing the intent to use such policy in transactions. When the recipient accepts the signed document of the signatory, he implicitly agrees on the conditions of the underlying signature policy. In practice, incorporating policy into the agreement between signatory and recipient can also be effected by:
• Referring explicitly to the policy in an agreement between the parties.
• Accepting a signed document and implicitly agreeing on the conditions of the underlying policy, although this option might be more restrictive in case of a dispute.
An issue arises with regard to how and under which conditions a particular policy framework can be incorporated into an agreement of a signatory in a way that binds a relying party, regardless of whether it acts in the capacity of consumer or business partner. Incorporation of contract terms into consumer contracts and incorporation of contract terms into business contracts follow different rules. Incorporation by reference in a business contract is comparatively straightforward, whereas in a consumer contract stricter rules have to be followed as mandated by consumer protection regulations. Limitations to the enforceability of legal terms that are conveyed by means of policy are applied as a result of consumer protection legislation. In Europe, consumer protection legislation includes Council Directive 93/13/EEC on unfair terms in consumer contracts, Directive 97/7/EC on the protection of consumers in respect of distance contracts, and Directive 1999/44/EC on certain aspects of the sale of consumer goods and associated guarantees (Hoernle, Sutter & Walden, 2002). In an effort to proactively implement these legal requirements, service providers strive to set up specific consumer protection frameworks (GlobalSign, 2004). Sometimes the scope of the underlying certificate policy frameworks is to equip the transacting parties with the ability to use a certificate
as evidence in a court of law. It is also necessary to provide transacting parties with assurance that a certificate can be admitted in legal proceedings and that it provides binding evidence against the parties involved in it, including the CA, the subscriber, and relying parties (Reed, 2000). Qualified electronic signatures in the meaning of Directive 1999/93/EC establish a rebuttable presumption that reverses the burden of proof. In other words, the court may at first admit a signature that claims to be qualified as an equivalent of a handwritten signature. The counter-party is allowed to prove that such a signature does not meet the requirements for qualified signatures, and could therefore be insecure for signing documents requiring a handwritten signature (UNCITRAL, 2000). To further answer the question of admissibility, it is necessary to examine the admissibility of electronic data as evidence in court, which is a matter that has been addressed in Directive 2000/31/EC on electronic commerce. Consequently, electronic data can be admitted as evidence as long as certain warranties are provided with regard to the production and retention of such data. In assessing the reliability of a certificate, a Court will have to examine the possibility of a certificate being the product of erroneous or fraudulent issuance, and if it is not, the Court should proclaim it as sufficient evidence against the parties involved within the boundaries of conveyed and binding policy.
FUTURE TRENDS
While case law is expected to determine and enhance the conditions of admissibility and evidential value of policy in transactions based on electronic signatures, additional technological features such as the use of object identifiers (OIDs) and hashing are expected to further enhance the certainty required to accept policies. Remarkably, to date there has been little done to address in a common manner the practical aspects of identifying individual policies and distinguishing among the versions thereof. Additionally, the difficulty of mapping and reconciling policy frameworks in overlapping transactions also threatens transactions that are based on the use and acceptance of varying terms. A typical hard case might involve, for example, overlapping policy conditions that apply to certificates issued by different CAs. The situation is exacerbated if those CAs do not have the means to recognise one another, while they issue certificates that can be used in the same transaction frameworks (ETSI TS 102 231). Although such certificates may well be complementary to a transaction framework, the varying assurance levels they provide might threaten the reliability of the transaction framework as a whole. The immediate risk for the transacting parties can be an unwarranted transaction environment that threatens to render otherwise legitimate practices unenforceable. Reconciling the methods used across various electronic signing environments is likely to contribute to creating trust in electronic business.
An additional area of future attention may address policy frameworks related to the application layer in a transaction. As present-day requirements for transparency are likely to be further raised, it is expected that online applications will become increasingly demanding in explaining to the end user what they do and in actually warranting the performance. To date general conditions and subscriber agreements cover part of this requirement; however, a comprehensive description of the technical features and functionality of the online application is also needed. In electronic business, consumers and trade partners are likely to benefit from it. Policies for the application layer are likely to become more in demand in electronic government applications, where the requirement for transparency in the transaction is even higher than in electronic business.
Finally, specifying policies further to meet the needs of particular groups of organisations is an additional expectation. Again in electronic government it is
expected that interoperability will be enhanced through best practices and standards regarding policy in specific vertical areas.
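The use of OIDs and hashing to identify a policy and to tell its versions apart, mentioned at the start of this section, can be sketched as follows. This is a minimal illustration rather than a standardised mechanism: the `PolicyReference` structure, the helper names, the policy texts, and the OID value are all made up, and SHA-256 is chosen here only as a widely available hash function.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyReference:
    """What a signed transaction could carry instead of the full policy text."""
    oid: str          # object identifier naming the policy
    sha256_hex: str   # digest of the exact policy text that was agreed to


def _digest(policy_text):
    return hashlib.sha256(policy_text.encode("utf-8")).hexdigest()


def make_reference(oid, policy_text):
    """Build a compact reference for incorporating a policy by reference."""
    return PolicyReference(oid, _digest(policy_text))


def matches(reference, oid, policy_text):
    """True only if the retrieved policy has the same OID and the same exact text."""
    return reference.oid == oid and reference.sha256_hex == _digest(policy_text)


# Hypothetical OID and policy texts, for illustration only.
EXAMPLE_OID = "1.3.6.1.4.1.99999.1.1"
policy_v1 = "Signature policy, version 1: liability limited to 1000 EUR."
policy_v2 = "Signature policy, version 2: liability limited to 5000 EUR."

reference = make_reference(EXAMPLE_OID, policy_v1)
print(matches(reference, EXAMPLE_OID, policy_v1))
print(matches(reference, EXAMPLE_OID, policy_v2))
```

The OID says which policy is meant, while the digest pins down which version of its text the parties relied on, so a later revision published under the same identifier no longer matches.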
CONCLUSION
While policies emerge as a necessary piece in the puzzle of invoking trust and legal safety in electronic transactions, policy frameworks can still have repercussions that reach well beyond the scope of single transaction elements and procedures in isolated electronic business environments. Formal policy frameworks require additional attention to ensure that apparently loosely linked policy elements do not threaten to overturn the requirements of transaction security and legal safety, which are the original objectives of using policy frameworks. Electronic transaction frameworks for diverse application areas can benefit from the processing of data on the basis of policy-invoked constraints among the parties involved. Large-scale processing that requires policy to convey operational and legal conditions in electronic transactions benefits from a combination of policy instruments, including certificate policies, signature policies, attribute certificate policies, and so forth, to enhance the outlining of the transaction framework and allow the transacting parties to further rely on electronic business for carrying out binding transactions.
NOTE
The views expressed in this article are solely the views of the author.
REFERENCES
American Bar Association. (1996). Digital signature guidelines. Washington, DC.
American Bar Association. (2001). PKI assessment guidelines. Washington, DC.
Adams, C. & Lloyd, S. (1999). Understanding public key infrastructure. Macmillan Technical Publishing, London.
ETSI TS 101 733. (2000). Electronic signature formats. Sophia-Antipolis.
ETSI TS 101 456. (2001). Policy requirements for CAs issuing qualified certificates. Sophia-Antipolis.
ETSI TR 102 041. (2002). Signature policy report. Sophia-Antipolis.
ETSI TS 101 058. (2003). Policy requirements for attribute authorities. Sophia-Antipolis.
ETSI TS 102 231. (2003). Provision of harmonized trust service provider status information. Sophia-Antipolis.
Ford, W. & Baum, M. (2001). Secure electronic commerce (2nd edition). Englewood Cliffs, NJ: Prentice-Hall.
GlobalSign. (2004). Certification practice statement. Retrieved from http://www.globalsign.net/repository
Hoernle, J., Sutter, G. & Walden, I. (2002). Directive 97/7/EC on the protection of consumers in respect of distance contracts. In A. Lodder & H.W.K. Kaspersen (Eds.), eDirectives: Guide to European Union Law on e-commerce. Kluwer Law International, The Hague.
IETF RFC 3647. (2003). Internet X.509 public key infrastructure: Certificate policies and certification practices framework. Retrieved from http://www.faqs.org/rfcs/rfc3647.html
ITU-T Recommendation X.509, ISO/IEC 9594-8. Information technology - Open systems interconnection - The directory: Public-key and attribute certificate frameworks. Draft revised recommendation. Retrieved from http://www.iso.ch/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=34551&ICS1=35
Kalakota, R. & Whinson, A. (1996). Frontiers of electronic commerce. Boston: Addison-Wesley.
Lee, R. (1996). InterProcs: Modelling environment for automated trade procedures. User documentation, EURIDIS, WP 96.10.11, Erasmus University, Rotterdam.
Mitrakas, A. (2000). Electronic contracting for open EDI. In S.M. Rahman & M. Raisinghani (Eds.), Electronic commerce: Opportunities and challenges. Hershey, PA: Idea Group Publishing.
Mitrakas, A. (2003). Policy constraints and role attributes in electronic invoices. Information Security Bulletin, 8(5).
Mitrakas, A. (2004). Policy-driven signing frameworks in open electronic transactions. In G. Doukidis, N. Mylonopoulos & N. Pouloudi (Eds.), Information society or information economy? A combined perspective on the digital era. Hershey, PA: Idea Group Publishing.
Pfleeger, C. (2000). Security in computing. Englewood Cliffs, NJ: Prentice-Hall.
Reed, C. (2000). Internet law: Text and materials. Butterworths.
United Nations. (2000). Guide to enactment of the UNCITRAL uniform rules on electronic signatures. New York.
Wu, S. (1998). Incorporation by reference and public key infrastructure: Moving the law beyond the paper-based world. Jurimetrics, 38(3).
KEY TERMS
Certification Authority: An authority such as GlobalSign that issues, suspends, or revokes a digital certificate.
Certification Practice Statement: A statement of the practices of a certificate authority and the conditions of issuance, suspension, revocation, and so forth of a certificate.
Electronic Data Interchange (EDI): The interchange of data messages structured under a certain format between business applications.
Incorporation by Reference: To make one document a part of another by identifying the document to be incorporated, with information that allows the recipient to access and obtain the incorporated message in its entirety, and by expressing the intention that it be part of the incorporating message. Such an incorporated message shall have the same effect as if it had been fully stated in the message.
Public Key Infrastructure (PKI): The architecture, organization, techniques, practices, and procedures that collectively support the implementation and operation of a certificate-based public key cryptographic system.
Relying Party: A recipient who acts by relying on a certificate and an electronic signature.
Signature Policy: A set of rules for the creation and validation of an electronic signature, under which the signature can be determined to be valid.
This work was previously published in Encyclopedia of Information Science and Technology, Vol. 4, edited by M. KhosrowPour, pp. 2288-2292, copyright 2005 by Idea Group Reference (an imprint of IGI Global).
Chapter 1.3
Audio Watermarking:
Properties, Techniques, and Evaluation
Andrés Garay Acevedo
Georgetown University, USA
ABSTRACT
The recent explosion of the Internet as a collaborative medium has opened the door for people who want to share their work. Nonetheless, the advantages of such an open medium can pose very serious problems for authors who do not want their works to be distributed without their consent. As new methods for copyright protection are devised, expectations around them are formed and sometimes unprovable claims are made. This chapter covers one such technology: audio watermarking. First, the field is introduced, and its properties and applications are discussed. Then, the most common techniques for audio watermarking are reviewed, and the framework is set for the objective measurement of such techniques. The last part of the chapter proposes a novel test and a set of metrics for thorough benchmarking of audio watermarking schemes. The development of such a benchmark constitutes a first step towards the standardization of the requirements and properties that such systems should display.
INTRODUCTION
The recent explosion of the Internet as a collaborative medium has opened the door for people who want to share their work. Nonetheless, the advantages of such an open medium can pose very serious problems for authors who do not want their works to be distributed without their consent. The digital nature of the information that traverses through modern networks calls for new and improved methods for copyright protection¹. In particular, the music industry is facing several challenges (as well as opportunities) as it tries to adapt its business to the new medium. Content protection is a key factor towards a comprehensive information commerce infrastructure (Yeung, 1998), and the industry expects new
Copyright © 2008, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
technologies will help them protect against the misappropriation of musical content. One such technology, digital watermarking, has recently brought a tide of publicity and controversy. It is an emerging discipline, derived from an older science: steganography, or the hiding of a secret message within a seemingly innocuous cover message. In fact, some authors treat watermarking and steganography as equal concepts, differentiated only by their final purpose (Johnson, Duric, & Jajodia, 2001).
As techniques for digital watermarking are developed, claims about their performance are made public. However, different metrics are typically used to measure performance, making it difficult to compare both techniques and claims. Indeed, there are no standard metrics for measuring the performance of watermarks for digital audio. Robustness does not correspond to the same criteria among developers (Kutter & Petitcolas, 1999). Such metrics are needed before we can expect to see a commercial application of audio watermarking products with a provable performance.
The objective of this chapter is to propose a methodology, including performance metrics, for evaluating and comparing the performance of digital audio watermarking schemes. In order to do this, it is necessary first to provide a clear definition of what constitutes a watermark and a watermarking system in the context of digital audio. This is the topic of the second section, which will prove valuable later in the chapter, as it sets a framework for the development of the proposed test.
After a clear definition of a digital watermark has been presented, a set of key properties and applications of digital watermarks can be defined and discussed. This is done in the third section, along with a classification of audio watermarking schemes according to the properties presented. The importance of these properties will be reflected in the proposed tests, discussed later in the chapter. The survey of different applications of watermarking techniques gives a practical view of how the technology can be used in a commercial and legal environment. The specific application of the watermarking scheme will also determine the actual test to be performed on the system.
The fourth section presents a survey of specific audio watermarking techniques developed. Five general approaches are described: amplitude modification, dither watermarking, echo watermarking, phase distortion, and spread spectrum watermarking. Specific implementations of watermarking algorithms (i.e., test subjects) will be evaluated in terms of these categories².
The next three sections describe how to evaluate audio watermarking technologies based on three different parameters: fidelity, robustness, and imperceptibility. Each one of these parameters will be precisely defined and discussed in its respective section, as they directly reflect the interests of the three main actors involved in the communication process³: sender, attacker, and receiver, respectively.
Finally, the last section provides an account of how to combine the three parameters described above into a single performance measure of quality. It must be stated, however, that this measure should be dependent upon the desired application of the watermarking algorithm (Petitcolas, 2000).
The topics discussed in this chapter come not only from printed sources but also from very productive discussions with some of the active researchers in the field. These discussions have been conducted via e-mail, and constitute a rich complement to the still low number of printed sources about this topic. Even though the annual number of papers published on watermarking has nearly doubled every year in recent years (Cox, Miller, & Bloom, 2002), it is still low. Thus it was necessary to augment the literature review with personal interviews.
WATERMARKING: A DEFINITION
Different definitions have been given for the term watermarking in the context of digital content. However, a very general definition is given by Cox et al. (2002), which can be seen as application independent: “We define watermarking as the practice of imperceptibly altering a Work to embed a message about that Work”. In this definition, the word work refers to a specific song, video or picture⁴. A crucial point is implied by this definition, namely that the information hidden within the work, the watermark itself, contains information about the work where it is embedded. This characteristic sets a basic requirement for a watermarking system that makes it different from a general steganographic tool. Moreover, by distinguishing between embedded data that relate to the cover work and hidden data that do not, we can derive some of the applications and requirements of the specific method. This is exactly what will be done later. Another difference that is made between watermarking and steganography is that the former has the additional notion of robustness against attacks (Kutter & Hartung, 2000). This fact also has some implications that will be covered later on.
Finally, if we apply Cox’s definition of watermarking to the field of audio signal processing, a more precise definition, this time for audio watermarking, can be stated. Digital audio watermarking is defined as the process of “embedding a user specified bitstream in digital audio such that the addition of the watermark (bitstream) is perceptually insignificant” (Czerwinski, Fromm, & Hodes, 1999). This definition should be complemented with the previous one, so that we do not forget that the watermark information refers to the digital audio file.
Elements of an Audio Watermarking System
Embedded watermarks are recovered by running the inverse of the process that was used to embed them in the cover work, that is, the original work. This means that all watermarking systems consist of at least two generic building blocks: a watermark embedding system and a watermark recovery system. Figure 1 shows a basic watermarking scheme, in which a watermark is both embedded and recovered in an audio file. As can be seen, this process might also involve the use of a secret key.
Figure 1. Basic watermarking system. [Diagram. Embedding: the watermark (W), the audio file (A), and the secret key (K) enter the watermarking algorithm, which outputs the watermarked audio file (A’). Recovery: A’ and the recovery key (K’) enter the inverse watermarking algorithm, which outputs the watermark or a confidence measure.]
In general terms, given the audio file A, the watermark W and the key K, the embedding process is a mapping of the form A × K × W → A’. Conversely, the recovery or extraction process receives a tentatively watermarked audio file A’ and a recovery key K’ (which might be equal to K), and it outputs either the watermark W or a confidence measure about the existence of W (Petitcolas, Anderson, & G., 1999).
At this point it is useful to attempt a formal definition of a watermarking system, based on that of Katzenbeisser (2000), which takes into account the architecture of the system. A quintuple ξ = ‹C, W, K, Dk, Ek›, where C is the set of possible audio covers⁵, W the set of watermarks with |C| ≥ |W|, K the set of secret keys, Ek: C × K × W → C the embedding function and Dk: C × K → W the extraction function, with the property that Dk(Ek(c, k, w), k) = w for all w ∈ W, c ∈ C and k ∈ K, is called a secure audio watermarking system.
This definition is almost complete, but it fails to cover some special cases. Some differences might arise between a real world system and the one just defined; for example, some detectors may not output the watermark W directly but rather report the existence of it. Nonetheless, it constitutes a good approximation towards a widely accepted definition of an audio watermarking system.
If one takes into account the small changes that a marking scheme can have, a detailed classification of watermarking schemes is possible. In this classification, the different schemes fall into three categories, depending on the set of inputs and outputs (Kutter & Hartung, 2000). Furthermore, a specific and formal definition for each scheme can be easily given by adapting the definition just given for an audio watermarking system.
Private watermarking systems require the original audio file A in order to attempt recovery of the watermark W. They may also require a copy of the embedded watermark and just yield a yes or no answer to the question: does A’ contain W?
Semi-private watermarking schemes do not use the original audio file for detection, but they also answer the yes/no question shown above. This could be described by the relation A’ × K × W → {“Yes”, “No”}.
Public watermarking (also known as blind or oblivious watermarking) requires neither the original file A nor the embedded watermark W. These systems just extract n bits of information from the watermarked audio file. As can be seen, if a key is used then this corresponds to the definition given for a secure watermarking system.
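To make the formal definition more tangible, the following toy sketch implements an embedding function E and an extraction function D for which D(E(c, k, w), k) = w holds. It is only an illustration under stated assumptions: it uses least-significant-bit substitution at key-selected sample positions, chosen because it is short rather than because it is robust, and the function names, the sample values, and the key are made up. Unlike the abstract Dk: C × K → W, the `extract` function here also takes the watermark length, which is assumed to be known to the receiver.

```python
import random


def _positions(cover_len, n_bits, key):
    """Choose n_bits distinct sample indices that depend only on the key."""
    rng = random.Random(key)  # same key, same sequence of positions
    return rng.sample(range(cover_len), n_bits)


def embed(cover, key, watermark_bits):
    """E: C x K x W -> C. Overwrite the least significant bit of the selected samples."""
    marked = list(cover)
    positions = _positions(len(cover), len(watermark_bits), key)
    for pos, bit in zip(positions, watermark_bits):
        marked[pos] = (marked[pos] & ~1) | bit
    return marked


def extract(marked, key, n_bits):
    """D: C x K -> W. Read the least significant bits back in the same key-defined order."""
    return [marked[pos] & 1 for pos in _positions(len(marked), n_bits, key)]


# Made-up integer audio samples and a 4-bit watermark.
cover = [1000, -2001, 345, 12, -7, 830, 4096, -512, 77, 6]
watermark = [1, 0, 1, 1]

marked = embed(cover, key=42, watermark_bits=watermark)
recovered = extract(marked, key=42, n_bits=len(watermark))
print(recovered == watermark)
```

Because extraction needs only the marked samples and the key, this toy corresponds to the public (blind) category described above; each sample changes by at most one quantisation step.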
Watermark as a Communication Process
A watermarking process can be modeled as a communication process. In fact, this assumption is used throughout this chapter. This will prove to be beneficial in the next section when we differentiate between the requirements of the content owner and consumer. A more detailed description of this model can be found in Cox et al. (2002). In this framework, the watermarking process is viewed as a transmission channel through which the watermark message is communicated. Here the cover work is just part of the channel. This is depicted in Figure 2, based on that from Cox et al. (2002).
In general terms, the embedding process consists of two steps. First, the watermark message m is mapped into an added pattern⁶ Wa, of the same type and dimension as the cover work A. When watermarking audio, the watermark encoder produces an audio signal. This mapping may be done with a watermark key K. Next, Wa is embedded into the cover work in order to produce the watermarked audio file A’.
After the pattern is embedded, the audio file is processed in some way. This is modeled as the addition of noise to the signal, which yields a noisy work A’n. The types of processing performed on the work will be discussed later, as they are of no importance at this moment. However, it is important to state the presence of noise, as any transmission medium will certainly induce it.
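Figure 2. Watermark communication process. [Diagram: the input message m and the watermark key K enter the watermark encoder; its output pattern is combined with the original audio file A inside the watermark embedder to give A’; noise n is added to give A’n; the watermark detector, which contains the watermark decoder and uses the watermark key K, produces the output message mn.]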
The watermark detector performs a process that is dependent on the type of watermarking scheme. If the decoder is a blind or public decoder, then the original audio file A is not needed during the recovery process, and only the key K is used in order to decode a watermark message mn. This is the case depicted in Figure 2, as it is the one of most interest to us. Another possibility is for the detector to be informed. In this case, the original audio cover A must be extracted from A’n in order to yield Wn, prior to running the decoding process. In addition, a confidence measure can be the output of the system, rather than the watermark message.
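The communication model just described can be sketched end to end in a few lines. The sketch below is an assumption-laden illustration, not the scheme of Cox et al. (2002): it carries a single message bit with a simple additive, key-seeded pattern and detects it by correlation, and the function names, the strength value, the noise level, and the synthetic cover signal are all made up. The correlation value returned by the detectors plays the role of the confidence measure.

```python
import random


def chips(key, length):
    """Key-derived pseudo-random +/-1 sequence shared by encoder and detector."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def encode(message_bit, key, length, strength=8.0):
    """Watermark encoder: map the message m to an added pattern Wa the size of the cover."""
    sign = 1.0 if message_bit else -1.0
    return [sign * strength * c for c in chips(key, length)]


def embed(cover, pattern):
    """Embedder: A' = A + Wa."""
    return [a + w for a, w in zip(cover, pattern)]


def channel(work, noise_std, seed):
    """Processing of the work, modelled as additive Gaussian noise: A'n = A' + n."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, noise_std) for x in work]


def blind_detect(received, key):
    """Blind detector: correlate A'n with the key-derived chips; A is not needed."""
    c = chips(key, len(received))
    corr = sum(r * ci for r, ci in zip(received, c)) / len(received)
    return (1 if corr > 0 else 0), corr


def informed_detect(received, original, key):
    """Informed detector: remove A from A'n to obtain Wn, then decode Wn."""
    noisy_pattern = [r - a for r, a in zip(received, original)]
    return blind_detect(noisy_pattern, key)


# Synthetic stand-in for an audio cover; a real test would use actual audio samples.
cover_rng = random.Random(2024)
cover = [cover_rng.uniform(-100.0, 100.0) for _ in range(8192)]

for m in (0, 1):
    pattern = encode(m, key=7, length=len(cover))
    received = channel(embed(cover, pattern), noise_std=5.0, seed=99)
    print(m, blind_detect(received, key=7)[0], informed_detect(received, cover, key=7)[0])
```

In the blind case the cover itself acts as interference in the correlation, which is why a long signal is used here; the informed detector removes that interference by subtracting the original before decoding.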
PROPERTIES, CLASSIFICATION AND APPLICATIONS
After a proper definition of a watermarking scheme, it is possible now to take a look at the fundamental properties that comprise a watermark. It can be stated that an ideal watermarking scheme will present all of the characteristics here detailed, and this ideal type will be useful for developing a quality test. However, in practice there exists a fundamental trade-off that restricts watermark designers. This fundamental trade-off exists between three key variables: robustness, payload and perceptibility
(Cox, Miller, Linnartz, & Kalker, 1999; Czerwinski et al., 1999; Johnson et al., 2001; Kutter & Petitcolas, 1999; Zhao, Koch, & Luo, 1998). The relative importance given to each of these variables in a watermarking implementation depends on the desired application of the system.
Fundamental Properties A review of the literature quickly points out the properties that an ideal watermarking scheme should possess (Arnold, 2000; Boney, Tewfik, & Hamdy, 1996; Cox, Miller, & Bloom, 2000; Cox et al., 1999, 2002; Kutter & Hartung, 2000; Kutter & Petitcolas, 1999; Swanson, Zhu, Tewfik, & Boney, 1998). These are now discussed.
Imperceptibility. “The watermark should not be noticeable … nor should [it] degrade the quality of the content” (Cox et al., 1999). In general, the term refers to a similarity between the original and watermarked versions of the cover work. In the case of audio, the term audibility would be more appropriate; however, this could create some confusion, as the majority of the literature uses perceptibility. This is the same reason why the term fidelity is not used at this point, even though Cox et al. (1999) point out that if a watermark is truly imperceptible, then it can be removed by perceptually-based lossy compression algorithms. In fact, this statement will prove to be a problem later when trying to design a measure of watermark perceptibility. Cox’s statement implies that some sort of perceptibility criterion must be used not only to design the watermark, but to quantify the distortion as well. Moreover, it implies that this distortion must be measured at the point where the audio file is being presented to the consumer/receiver. If the distortion is measured at the receiver’s end, it should also be measured at the sender’s. That is, the distortion induced by a watermark must also be measured before any transmission process. We will refer to this characteristic at the sending end by using the term fidelity. This distinction between the terms fidelity and imperceptibility is not common in the literature, but will be beneficial at a later stage. Differentiating between the amount and characteristics of the noise or distortion that a watermark introduces in a signal before and after the transmission process takes into account the different expectations that content owners and consumers have from the technology. However, this also implies that the metric used to evaluate this effect must be different at these points. This is exactly what will be done later in this chapter. Artifacts introduced through a watermarking process are not only annoying and undesirable, but may also reduce or destroy the commercial value of the watermarked data (Kutter & Hartung, 2000). Nonetheless, the perceptibility of the watermark can increase when certain operations are performed on the cover signal.
Robustness refers to the ability to detect the watermark after common signal processing operations and hostile attacks. Examples of common operations performed on audio files include noise reduction, volume adjustment or normalization, digital to analog conversion, and so forth. On the other hand, a hostile attack is a process specifically designed to remove the watermark.
Not all watermarking applications require robustness against all possible signal processing operations. Only those operations likely to occur between the embedding of the mark and the decoding of it should be addressed. However, the number and complexity of attack techniques is increasing (Pereira, Voloshynovskiy, Madueño, Marchand-Maillet, & Pun, 2001; Voloshynovskiy, Pereira, Pun, Eggers, & Su, 2001), which means that more scenarios have to be taken into account when designing a system. A more detailed description of these attacks is given in the sixth section. Robustness deals with two different issues; namely the presence and detection of the watermark after some processing operation. It is not necessary to remove a watermark to render it useless; if the detector cannot report the presence of the mark then the attack can be considered successful. This means that a watermarking scheme is robust when it is able to withstand a series of attacks that try to degrade the quality of the embedded watermark, up to the point where it is removed, or its recovery process is unsuccessful. “No such perfect method has been proposed so far, and it is not clear yet whether an absolutely secure watermarking method exists at all” (Kutter & Hartung, 2000). Some authors prefer to talk about tamper resistance or even security when referring to hostile attacks; however, most of the literature encompasses this case under the term robustness.
The effectiveness of a watermarking system refers to the probability that the output of the embedder will be watermarked. In other words, it is the probability that a watermark detector will recognize the watermark immediately after inserting it in the cover work. What is most striking about this definition is the implication that a watermarking system might have an effectiveness of less than 100%. That is, it is possible for a system to generate marks that are not fully recoverable even if no processing is done to the cover signal. This happens because perfect effectiveness comes at a very high cost with respect to other properties, such as perceptibility (Cox et al., 2002).
When a known watermark is not successfully recovered by a detector it is said that a false negative, or type-II error, has occurred (Katzenbeisser, 2000). Depending on the application, one might be willing to sacrifice some performance in exchange for other characteristics. For example, if extremely high fidelity is to be achieved, one might not be able to successfully watermark certain types of works without generating some kind of distortion. In some cases, the effectiveness can be determined analytically, but most of the time it has to be estimated by embedding a large set of works with a given watermark and then trying to extract that mark. However, the statistical characteristics of the test set must be similar to those of the works that will be marked in the real world using the algorithm.
Data payload. In audio watermarking this term refers to the number of embedded bits per second that are transmitted. A watermark that encodes N bits is referred to as an N-bit watermark, and can be used to embed 2^N different messages. It must be said that there is a difference between the encoded message m, and the actual bitstream that is embedded in the audio cover work. The latter is normally referred to as a pseudorandom (PN) sequence. Many systems have been proposed where only one possible watermark can be embedded. The detector then just determines whether the watermark is present or not. These systems are referred to as one-bit watermarks, as only two different values can be encoded inside the watermark message. In discussing the data payload of a watermarking method, it is also important to distinguish between the number of distinct watermarks that may be inserted, and the number of watermarks that may be detected by a single iteration with a given watermark detector. In many watermarking applications, each detector need not test for all the watermarks that might possibly be present (Cox et al., 1999). For example, one might insert two different watermarks into the same audio file, but only be interested in recovering the last one to be embedded.
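The empirical estimation of effectiveness mentioned above can be sketched as follows. The example assumes a one-bit additive watermark with a thresholded correlation detector and a synthetic test set of Gaussian noise "works"; these are illustrative choices, and a real evaluation would use audio whose statistics match the intended application. All names and parameter values are hypothetical.

```python
import random

def reference_pattern(key, length):
    """Keyed +/-1 pattern standing in for the PN sequence."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def detect(work, key, threshold):
    """Report a watermark when the normalized correlation exceeds the threshold."""
    ref = reference_pattern(key, len(work))
    corr = sum(x * w for x, w in zip(work, ref)) / len(work)
    return corr > threshold

def estimate_effectiveness(strength, n_works=200, length=1000, key=42,
                           threshold=0.1, seed=0):
    """Fraction of works in which the mark is detected right after embedding."""
    rng = random.Random(seed)
    ref = reference_pattern(key, length)
    detected = 0
    for _ in range(n_works):
        cover = [rng.gauss(0.0, 1.0) for _ in range(length)]  # synthetic work
        marked = [a + strength * w for a, w in zip(cover, ref)]
        detected += detect(marked, key, threshold)
    return detected / n_works

def distinct_messages(n_bits):
    """An N-bit watermark can carry 2^N different messages."""
    return 2 ** n_bits

if __name__ == "__main__":
    for strength in (0.1, 0.3):
        eff = estimate_effectiveness(strength)
        print(strength, eff, "false negative rate:", 1.0 - eff)
```

A weaker embedding strength (higher fidelity) leaves the correlation close to the detection threshold, so a sizeable share of freshly marked works goes undetected; this is the trade-off between effectiveness and perceptibility described in the text.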
Other Properties Some of the properties reviewed in the literature are not crucial for testing purposes; however, they must be mentioned in order to make a thorough description of watermarking systems:
• False positive rate: A false positive or type-I error is the detection of a watermark in a work that does not actually contain one. Thus a false positive rate is the expected number of false positives in a given number of runs of the watermark detector. Equivalently, one can consider the probability that a false positive will occur in a given detector run. In some applications a false positive can be catastrophic. For example, imagine a DVD player that incorrectly determines that a legal copy of a disk (for example a homemade movie) is a non-factory-recorded disk and refuses to play it. If such an error is common, then the reputation of DVD players and consequently their market can be seriously damaged.
• Statistical invisibility: This is needed in order to prevent unauthorized detection and/or removal. Performing statistical tests on a set of watermarked files should not reveal any information about the nature of the embedded information, nor about the technique used for watermarking (Swanson et al., 1998). Johnson et al. (2001) provide a detailed description of known signatures that are created by popular information hiding tools. Their techniques can also be extended for use in some watermarking systems.
• Redundancy: To ensure robustness, the watermark information is embedded in multiple places on the audio file. This means that the watermark can usually be recovered from just a small portion of the watermarked file.
• Compression ratio (or similar compression characteristics as the original file): Audio files are usually compressed using different schemes, such as MPEG-Layer 3 audio compression. An audio file with an embedded watermark should yield a similar compression ratio as its unmarked counterpart, so that its value is not degraded. Moreover, the compression process should not remove the watermark.
• Multiple watermarks: Multiple users should be able to embed a watermark into an audio file. This means that a user has to ideally be able to embed a watermark without destroying any preexisting ones that might be already residing in the file. This must hold true even if the watermarking algorithms are different.
• Secret keys: In general, watermarking systems should use one or more cryptographically secure keys to ensure that the watermark cannot be manipulated or erased. This is important because once a watermark can be read by someone, this same person might alter it since both the location and embedding algorithm of the mark will be known (Kutter & Hartung, 2000). It is not safe to assume that the embedding algorithm is unknown to the attacker. As the security of the watermarking system relies in part on the use of secret keys, the keyspace must be large, so that a brute force attack is impractical. In most watermarking systems the key is the PN-pattern itself, or at least is used as a seed in order to create it. Moreover, the watermark message is usually encrypted first using a cipher key, before it is embedded using the watermark key. This practice adds security at two different levels. In the highest level of secrecy, the user cannot read or decode the watermark, or even detect its presence. The second level of secrecy permits any user to detect the presence of the watermark, but the data cannot be decoded without the proper key. Watermarking systems in which the key is known to various detectors are referred to as unrestricted-key watermarks. Thus, algorithms for use as unrestricted-key systems must employ the same key for every piece of data (Cox et al., 1999). Those systems that use a different key for each watermark (and thus the key is shared by only a few detectors) are known as restricted-key watermarks.
• Computational cost: The time that it takes for a watermark to be embedded and detected can be a crucial factor in a watermarking system. Some applications, such as broadcast monitoring, require real time watermark processing and thus delays are not acceptable under any circumstances. On the other hand, for court disputes (which are rare), a detection algorithm that takes hours is perfectly acceptable as long as the effectiveness is high.
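The two-key arrangement described under "Secret keys" (a cipher key that encrypts the message, and a watermark key that seeds the PN-pattern) can be sketched as below. This is an illustration under stated assumptions, not a vetted design: the keystream is produced with SHA-256 in counter mode, the "cipher" is a plain XOR with that keystream, and each bit is spread over a block of `chips` samples. All function names and parameters are hypothetical.

```python
import hashlib
import random

def keyed_bits(key, n):
    """Expand a byte-string key into n pseudorandom bits (SHA-256, counter mode)."""
    bits, counter = [], 0
    while len(bits) < n:
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        bits.extend((byte >> shift) & 1 for byte in block for shift in range(8))
        counter += 1
    return bits[:n]

def embed(cover, message_bits, watermark_key, cipher_key, chips=2000, strength=0.05):
    """Encrypt the message with the cipher key, then spread it with the keyed PN pattern."""
    keystream = keyed_bits(cipher_key, len(message_bits))
    encrypted = [m ^ k for m, k in zip(message_bits, keystream)]
    pn = [2 * b - 1 for b in keyed_bits(watermark_key, chips * len(encrypted))]
    marked = list(cover)  # cover must hold at least chips * len(message_bits) samples
    for i, bit in enumerate(encrypted):
        sign = 1 if bit else -1
        for j in range(i * chips, (i + 1) * chips):
            marked[j] += strength * sign * pn[j]
    return marked

def extract(work, n_bits, watermark_key, cipher_key, chips=2000):
    """Correlate each block with the keyed PN pattern, then decrypt the recovered bits."""
    pn = [2 * b - 1 for b in keyed_bits(watermark_key, chips * n_bits)]
    keystream = keyed_bits(cipher_key, n_bits)
    message = []
    for i in range(n_bits):
        corr = sum(work[j] * pn[j] for j in range(i * chips, (i + 1) * chips))
        message.append((1 if corr > 0 else 0) ^ keystream[i])
    return message

if __name__ == "__main__":
    rng = random.Random(1)
    bits = [rng.randint(0, 1) for _ in range(16)]
    cover = [rng.gauss(0.0, 0.3) for _ in range(16 * 2000)]
    marked = embed(cover, bits, b"watermark-key", b"cipher-key")
    print(extract(marked, 16, b"watermark-key", b"cipher-key") == bits)
    print(extract(marked, 16, b"guessed-key", b"cipher-key") == bits)
```

Without the watermark key the correlations carry no usable signal, and with the watermark key but the wrong cipher key the recovered bits remain encrypted, which mirrors the two levels of secrecy described in the list.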
Additionally, the number of embedders and detectors varies according to the application. This fact will have an effect on the cost of the watermarking system. Applications such as DVD copy control need few embedders but a detector on each DVD player; thus the cost of recovering should be very low, while that of embedding could be a little higher7. Whether the algorithms are implemented as plug-ins or dedicated hardware will also affect the economics of deploying a system.
Different Types of Watermarks Even though this chapter does not relate to all kinds of watermarks that will be defined, it is important to state their existence in order to later derive some of the possible applications of watermarking systems:
• Robust watermarks: These are simply watermarks that are robust against attacks. Even if the existence of the watermark is known, it should be difficult for an attacker to destroy the embedded information without the knowledge of the key8. An implication of this fact is that the amount of data that can be embedded (also known as the payload) is usually smaller than in the case of steganographic methods. It is important to say that watermarking and steganographic methods are more complementary than competitive.
• Fragile watermarks: These are marks that have only very limited robustness (Kutter & Hartung, 2000). They are used to detect modifications of the cover data, rather than convey inerasable information, and usually become invalid after the slightest modification of a work. Fragility can be an advantage for authentication purposes. If a very fragile mark is detected intact in a work, we can infer that the work has probably not been altered since the watermark was embedded (Cox et al., 2002). Furthermore, even semi-fragile watermarks can help localize the exact location where the tampering of the cover work occurred.
• Perceptible watermarks: As the name states, these are watermarks that are easily perceived by the user. Although they are usually applied to images (as visual patterns or logos), it is not uncommon to have an audible signal overlaid on top of a musical work, in order to discourage illegal copying. As an example, the IBM Digital Libraries project (Memon & Wong, 1998; Mintzer, Magerlein, & Braudaway, 1996) has developed a visible watermark that modifies the brightness of an image based on the watermark data and a secret key. Even though perceptible watermarks are important for some special applications, the rest of this chapter focuses on imperceptible watermarks, as they are the most common.
• Bitstream watermarks: These are marks embedded directly into compressed audio (or video) material. This can be advantageous in environments where compressed bitstreams are stored in order to save disk space, like Internet music providers.
• Fingerprinting and labeling: They denote special applications of watermarks. They relate to watermarking applications where information such as the creator or recipient of the data is used to form the watermark. In the case of fingerprinting, this information consists of a code that uniquely identifies the recipient, and that can help to locate the source of a leak in confidential information. In the case of labeling, the information embedded is a unique data identifier, of interest for purposes such as library retrieval. A more thorough discussion is presented in the next section.
Watermark Applications In this section the seven most common applications for watermarking systems are presented. What is more important, all of them relate to the field of audio watermarking. It must be kept in mind that each of these applications will require different priorities regarding the watermark’s properties that have just been reviewed:
• Broadcast monitoring: Different individuals are interested in broadcast verification. Advertisers want to be sure that the ads they pay for are being transmitted; musicians want to ensure that they receive royalty payments for the air time spent on their works. While one can think about putting human observers to record what they see or hear on a broadcast, this method becomes costly and error prone. Thus it is desirable to replace it with an automated version, and digital watermarks can provide a solution. By embedding a unique identifier for each work, one can monitor the broadcast signal searching for the embedded mark and thus compute the air time. Other solutions can be designed, but watermarking has the advantage of being compatible with the installed broadcast equipment, since the mark is included within the signal and does not occupy extra resources such as other frequencies or header files. Nevertheless, it is harder to embed a mark than to put it on an extra header, and content quality degradation can be a concern.
• Copyright owner identification: Under U.S. law, the creator of an original work holds copyright to it the instant the work is recorded in some physical form (Cox et al., 2002). Even though it is not necessary to place a copyright notice in distributed copies of work, it is considered a good practice, since a court can award more damages to the owner in the case of a dispute. However, textual copyright notices9 are easy to remove, even without intention. For example, an image may be cropped prior to publishing. In the case of digital audio the problem is even worse, as the copyright notice is not visible at all times. Watermarks are ideal for including copyright notices into works, as they can be both imperceptible and inseparable from the cover that contains them (Mintzer, Braudaway, & Bell, 1998). This is probably the reason why copyright protection is the most prominent application of watermarking today (Kutter & Hartung, 2000). The watermarks are used to resolve rightful ownership, and thus require a very high level of robustness (Arnold, 2000). Furthermore, additional issues must be considered; for example, the marks must be unambiguous, as other parties can try to embed counterfeit copyright notices. Nonetheless, it must be stated that the legal impact of watermark copyright notices has not yet been tested in court.
• Proof of ownership: Multimedia owners may want to use watermarks not just to identify copyright ownership, but also to actually prove ownership. This is something that a textual notice cannot easily do, since it can be forged. One way to resolve an ownership dispute is by using a central repository, where the author registers the work prior to distribution. However, this can be too costly10 for many content creators. Moreover, there might be lack of evidence (such as sketches or film negatives) to be presented at court, or such evidence can even be fabricated. Watermarks can provide a way for authenticating ownership of a work. However, to achieve the level of security required for proof of ownership, it is probably necessary to restrict the availability of the watermark detector (Cox et al., 2002). This is thus not a trivial task.
• Content authentication: In authentication applications the objective is to detect modifications of the data (Arnold, 2000). This can be achieved with fragile watermarks that have low robustness to certain modifications. This proves to be very useful, as it is becoming easier to tamper with digital works in ways that are difficult to detect by a human observer. The problem of authenticating messages has been well studied in cryptography; however, watermarks are a powerful alternative as the signature is embedded directly into the work. This eliminates the problem of making sure the signature stays with the work. Nevertheless, the act of embedding the watermark must not change the work enough to make it appear invalid when compared with the signature. This can be accomplished by separating the cover work in two parts: one for which the signature is computed, and the other where it is embedded. Another advantage of watermarks is that they are modified along with the work. This means that in certain cases the location and nature of the processing within the audio cover can be determined and thus inverted. For example, one could determine if a lossy compression algorithm has been applied to an audio file11.
• Transactional watermarks: This is an application where the objective is to convey information about the legal recipient of digital data, rather than the source of it. This is done mainly to identify single distributed copies of data, and thus monitor or trace back illegally produced copies of data that may circulate12. The idea is to embed a unique watermark in each distributed copy of a work, in the process we have defined as fingerprinting. In these systems, the watermarks must be secure against a collusion attack, which is explained in the sixth section, and sometimes have to be extracted easily, as in the case of automatic Web crawlers that search for pirated copies of works.
• Copy control/device control: Transactional watermarks as well as watermarks for monitoring, identification, and proof of ownership do not prevent illegal copying (Cox et al., 2000). Copy protection is difficult to achieve in open systems, but might be desirable in proprietary ones. In such systems it is possible to use watermarks to indicate if the data can be copied or not (Mintzer et al., 1998). The first and strongest line of defense against illegal copying is encryption, as only those who possess the decryption key can access the content. With watermarking, one could do a very different process: allow the media to be perceived, yet still prevent it from being recorded. If this is the case, a watermark detector must be included on every manufactured recorder, preferably in a tamper resistant device. This constitutes a serious nontechnical problem, as there is no natural incentive for recording equipment manufacturers to include such a detector on their machines. This is due to the fact that the value of the recorder is reduced from the point of view of the consumer. Similarly, one could implement play control, so that illegal copies can be made but not played back by compliant equipment. This can be done by checking a media signature, or whether the work is properly encrypted, for example. By mixing these two concepts, a buyer will be left facing two possibilities: buying a compliant device that cannot play pirated content, or a noncompliant one that can play pirated works but not legal ones. In a similar way, one could control a playback device by using embedded information in the media it reproduces. This is known as device control. For example, one could signal how a digital audio stream should be equalized, or even extra information about the artist. A more extreme case can be to send information in order to update the firmware of the playback device while it is playing content, or to order it to shut down at a certain time. This method is practical, as the need for a signaling channel can be eliminated.
• Covert communication: Even though it contradicts the definition of watermark given before, some people may use watermarking systems in order to hide data and communicate secretly. This is actually the realm of steganography rather than watermarking, but many times the boundaries between these two disciplines have been blurred. Nonetheless, in the context of this chapter, the hidden message is not a watermark but rather a robust covert communication. The use of watermarks for hidden annotation (Zhao et al., 1998), or labeling, constitutes a different case, where watermarks are used to create hidden labels and annotations in content such as medical imagery or geographic maps, and indexes in multimedia content for retrieval purposes. In these cases, the watermark requirements are specific to the actual media where the watermark will be embedded. Using a watermark that distorts a patient’s radiography can have serious legal consequences, while the recovery speed is crucial in multimedia retrieval.
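The content authentication item above mentions separating the cover work into a part over which the signature is computed and a part where it is embedded. A minimal sketch of one such split is given here, assuming 16-bit integer samples: the signature is an HMAC-SHA256 over the upper 15 bits of every sample, and its 256 bits are written into the least significant bits of the first 256 samples. The split, the use of an HMAC, and all names are assumptions made for illustration; the chapter does not prescribe a particular construction.

```python
import hashlib
import hmac
import struct

DIGEST_BITS = 256  # size of a SHA-256 based signature

def _signature_bits(samples, key):
    """HMAC-SHA256 over the upper 15 bits of every 16-bit sample."""
    upper = struct.pack("<%dh" % len(samples), *[s >> 1 for s in samples])
    digest = hmac.new(key, upper, hashlib.sha256).digest()
    return [(byte >> shift) & 1 for byte in digest for shift in range(8)]

def embed_signature(samples, key):
    """Write the signature into the LSBs of the first 256 samples."""
    if len(samples) < DIGEST_BITS:
        raise ValueError("need at least 256 samples")
    bits = _signature_bits(samples, key)
    marked = list(samples)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit  # upper bits are left untouched
    return marked

def verify(samples, key):
    """Recompute the signature from the upper bits and compare with the stored LSBs."""
    if len(samples) < DIGEST_BITS:
        return False
    expected = _signature_bits(samples, key)
    found = [s & 1 for s in samples[:DIGEST_BITS]]
    return hmac.compare_digest(bytes(expected), bytes(found))

if __name__ == "__main__":
    audio = [((i * 7919) % 65536) - 32768 for i in range(1000)]  # synthetic samples
    marked = embed_signature(audio, b"secret")
    tampered = list(marked)
    tampered[500] += 2
    print(verify(marked, b"secret"), verify(tampered, b"secret"))
```

Because embedding only touches least significant bits, it cannot invalidate the signature it stores, while any change to the upper bits of any sample does. Note that in this sketch changes confined to unused least significant bits would go unnoticed.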
AUDIO WATERMARKING TECHNIQUES In this section the five most popular techniques for digital audio watermarking are reviewed. Specifically, the different techniques correspond to the methods for merging (or inserting) the cover data and the watermark pattern into a single signal, as was outlined in the communication model of the second section. There are two critical parameters to most digital audio representations: sample quantization method and temporal sampling rate. Data hiding in audio signals is especially challenging, because the human auditory system (HAS) operates over a wide dynamic range. Sensitivity to additive random noise is acute. However, there are some “holes” available. While the HAS has a large dynamic range, it has a fairly small differential range (Bender, Gruhl, Morimoto, & Lu, 1996). As a result, loud sounds tend to mask out quiet sounds. This effect is known as masking, and will be fully exploited in some of the techniques presented here (Swanson et al., 1998). These techniques do not correspond to the actual implementation of commercial products that are available, but rather constitute the basis for some of them. Moreover, most real world applications can be considered a particular case of the general methods described next.
Finally, it must be stated that the methods explained are specific to the domain of audio watermarking. Several other techniques that are very popular for hiding marks in other types of media, such as discrete cosine transform (DCT) coefficient quantization in the case of digital images, are not discussed. This is done because the test described in the following sections is related only to watermarking of digital audio.
Amplitude Modification This method, also known as least significant bit (LSB) substitution, is both common and easy to apply in both steganography and watermarking (Johnson & Katzenbeisser, 2000), as it takes advantage of the quantization error that usually derives from the task of digitizing the audio signal. As the name states, the information is encoded into the least significant bits of the audio data. There are two basic ways of doing this: the lower order bits of the digital audio signal can be fully substituted with a pseudorandom (PN) sequence that contains the watermark message m, or the PN-sequence can be embedded into the lower order bitstream using the output of a function that generates the sequence based on both the nth bit of the watermark message and the nth sample of the audio file (Bassia & Pitas, 1998; Dugelay & Roche, 2000). Ideally, the embedding capacity of an audio file with this method is 1 kbps per 1 kHz of sampled data. That is, if a file is sampled at 44 kHz then it is possible to embed 44 kilobits on each second of audio. In return for this large channel capacity, audible noise is introduced. The impact of this noise is a direct function of the content of the host signal. For example, crowd noise during a rock concert would mask some of the noise that would be audible in a string quartet performance. Adaptive data attenuation has been used to compensate for this variation in content (Bender et al., 1996). Another option is to shape the PN-sequence itself
so that it matches the audio masking characteristics of the cover signal (Czerwinski et al., 1999). The major disadvantage of this method is its poor immunity to manipulation. Encoded information can be destroyed by channel noise, resampling, and so forth, unless it is encoded using redundancy techniques. In order to be robust, these techniques reduce the data rate, often by one to two orders of magnitude. Furthermore, in order to make the watermark more robust against localized filtering, a pseudorandom number generator can be used to spread the message over the cover in a random manner. Thus, the distance between two embedded bits is determined by a secret key (Johnson & Katzenbeisser, 2000). Finally, in some implementations the PN-sequence is used to retrieve the watermark from the audio file. In this way, the watermark acts at the same time as the key to the system. Recently proposed systems use amplitude modification techniques in a transform space rather than in the time (or spatial) domain. That is, a transformation is applied to the signal, and then the least significant bits of the coefficients representing the audio signal A on the transform domain are modified in order to embed the watermark W. After the embedding, the inverse transformation is performed in order to obtain the watermarked audio file A’. In this case, the technique is also known as coefficient quantization. Some of the transformations used for watermarking are the discrete Fourier transform (DFT), discrete cosine transform (DCT), Mellin-Fourier transform, and wavelet transform (Dugelay & Roche, 2000). However, their use is more popular in the field of image and video watermarking.
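The time-domain variant of amplitude modification can be sketched in a few lines. The example assumes integer (for instance 16-bit) samples and uses a key-seeded pseudorandom number generator to choose which samples carry message bits, in the spirit of the key-dependent spacing mentioned above. The function names and the use of Python's `random` module (which is not cryptographically secure) are illustrative assumptions.

```python
import random

def lsb_embed(samples, bits, key):
    """Write each bit into the LSB of a sample chosen by a key-seeded PRNG."""
    if len(bits) > len(samples):
        raise ValueError("message longer than cover")
    positions = random.Random(key).sample(range(len(samples)), len(bits))
    marked = list(samples)
    for pos, bit in zip(positions, bits):
        marked[pos] = (marked[pos] & ~1) | bit
    return marked

def lsb_extract(samples, n_bits, key):
    """Regenerate the same positions from the key and read the LSBs back."""
    positions = random.Random(key).sample(range(len(samples)), n_bits)
    return [samples[pos] & 1 for pos in positions]

def ideal_capacity_bps(sample_rate_hz, bits_per_sample=1):
    """One bit per sample: 1 kbps for every 1 kHz of sampling rate."""
    return sample_rate_hz * bits_per_sample

if __name__ == "__main__":
    rng = random.Random(3)
    audio = [rng.randint(-32768, 32767) for _ in range(5000)]
    message = [rng.randint(0, 1) for _ in range(400)]
    marked = lsb_embed(audio, message, key=2024)
    print(lsb_extract(marked, 400, key=2024) == message)
    # Clearing the LSBs (a crude requantization) wipes the message out.
    print(lsb_extract([s & ~1 for s in marked], 400, key=2024) == message)
    print(ideal_capacity_bps(44000))
```

The second check in the usage example illustrates the poor immunity to manipulation noted in the text: any operation that disturbs the lowest bit of each sample destroys the message unless redundancy is added.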
Dither Watermarking Dither is a noise signal that is added to the input audio signal to provide better sampling of that input when digitizing the signal (Czerwinski et al., 1999). As a result, distortion is practically eliminated, at the cost of an increased noise floor.
To implement dithering, a noise signal with a known probability distribution, such as Gaussian or triangular, is added to the input audio signal. In the particular case of dithering for watermark embedding, the watermark is used to modulate the dither signal. The host signal (or original audio file) is quantized using an associated dither quantizer (RLE, 1999). This technique is known as quantization index modulation (QIM) (Chen & Wornell, 2000). For example, if one wishes to embed one bit (m=1 or m=2) in the host audio signal A then one would use two different quantizers, each one representing a possible value for m. If the two quantizers are shifted versions of each other, then they are called dither quantizers, and the process is that of dither modulation. Thus, QIM refers to embedding information by first modulating an index or sequence of indices with the embedded information and then quantizing the host signal with the associated quantizer or sequence of quantizers (Chen & Wornell, 1999). A graphical view of this technique is shown in Figure 3, taken from Chen (2000). Here, the points marked with X’s and O’s belong to two different quantizers, each with an associated index; that is, each one embedding a different value. The distance dmin can be used as an informal measure of robustness, while the size of the quantization cells (one is shown in the figure) measures the distortion on the audio file. If the watermark message m=1, then the audio signal is quantized to the nearest X. If m=2 then it is quantized to the nearest O. The two quantizers must not intersect, as can be seen in the figure. Furthermore, they have a discontinuous nature. If one moves from the interior of the cell to its exterior, then the corresponding value of the quantization function jumps from an X in the cell’s interior to one X on its exterior. Finally, as noted above, the number of quantizers in the ensemble determines the information-embedding rate (Chen & Wornell, 2000).
Figure 3. A graphical view of the QIM technique
As was said above, in the case of dither modulation, the quantization cells of any quantizer in the ensemble are shifted versions of the cells of any other quantizer being used as well. The shifts traditionally correspond to pseudorandom vectors called the dither vectors. For the task of watermarking, these vectors are modulated with the watermark, which means that each possible embedded signal maps uniquely to a different dither vector. The host signal A is then quantized with the resulting dithered quantizer in order to create the watermarked audio signal A’.
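A scalar version of dither modulation makes the roles of the two quantizers concrete. The sketch below assumes a uniform quantizer with step size `step` and two fixed dither values of ±step/4, so that the reconstruction points for m=1 and m=2 interleave with a spacing dmin = step/2. Practical systems would use keyed pseudorandom dither vectors; the fixed offsets and the function names here are simplifying assumptions.

```python
import random

def qim_embed(x, m, step):
    """Quantize x with the quantizer associated with message value m (1 or 2)."""
    dither = -step / 4 if m == 1 else step / 4
    return step * round((x - dither) / step) + dither

def qim_decode(y, step):
    """Return the message whose quantizer has a reconstruction point nearest to y."""
    d1 = abs(y - qim_embed(y, 1, step))
    d2 = abs(y - qim_embed(y, 2, step))
    return 1 if d1 <= d2 else 2

if __name__ == "__main__":
    rng = random.Random(5)
    step = 0.1
    host = [rng.uniform(-1.0, 1.0) for _ in range(10)]
    message = [rng.choice((1, 2)) for _ in host]
    marked = [qim_embed(x, m, step) for x, m in zip(host, message)]
    # Perturbations smaller than step / 4 (half of dmin) do not change the decision.
    noisy = [y + rng.uniform(-0.2 * step, 0.2 * step) for y in marked]
    print([qim_decode(y, step) for y in noisy] == message)
```

Increasing `step` widens dmin and therefore the tolerated perturbation, at the price of a larger quantization cell, that is, more embedding distortion (at most step/2 per sample in this sketch).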
Echo Watermarking Echo watermarking attempts to embed information on the original discrete audio signal A(t) by introducing a repeated version of a component of the audio signal with small enough offset (or delay), initial amplitude and decay rate αA(t – ∆t) to make it imperceptible. The resulting signal can be then expressed as A’(t) = A(t) + αA(t – ∆t). In the most basic echo watermarking scheme, the information is encoded in the signal by modifying the delay between the signal and the echo. This means that two different values ∆t and ∆t' are used in order to encode either a zero or a one. Both offset values have to be carefully chosen in a way that makes the watermark both inaudible and recoverable (Johnson & Katzenbeisser, 2000). As the offset between the original and the echo decreases, the two signals blend. At a certain
point, the human ear cannot distinguish between the two signals. The echo is perceived as added resonance (Bender et al., 1996). This point is hard to determine exactly, as it depends on many factors such as the quality of the original recording, the type of sound being echoed, and the listener. However, in general one can expect the value of the offset ∆t to be around one millisecond. Since this scheme can only embed one bit in a signal, a practical approach consists of dividing the audio file into various blocks prior to the encoding process. Then each block is used to encode a bit, with the method described above. Moreover, if consecutive blocks are separated by a random number of unused samples, the detection and removal of the watermark becomes more difficult (Johnson & Katzenbeisser, 2000). Finally, all the blocks are concatenated back, and the watermarked audio file A’ is created. This technique results in an embedding rate of around 16 bits per second without any degradation of the signal. Moreover, in some cases the resonance can even create a richer sound. For watermark recovery, a technique known as cepstrum autocorrelation is used (Czerwinski et al., 1999). This technique produces a signal with two pronounced amplitude humps or spikes. By measuring the distance between these two spikes, one can determine if a one or a zero was initially encoded in the signal. This recovery process has the benefit that the original audio file A is not needed. However, this benefit also becomes a drawback in that the scheme presented here is susceptible to attack. This will be further explained in the sixth section.
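The block-wise echo scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the cited authors: the delays of 22 and 44 samples (roughly 0.5 ms and 1 ms at an assumed 44.1 kHz sampling rate), the echo amplitude `alpha`, and all function names are hypothetical. The detector compares the plain autocorrelation at the two candidate delays instead of using the cepstrum autocorrelation mentioned in the text, which is adequate only for the noise-like test signal used here.

```python
import random

DELAY_ZERO = 22  # assumed delay (samples) encoding a 0, about 0.5 ms at 44.1 kHz
DELAY_ONE = 44   # assumed delay (samples) encoding a 1, about 1 ms at 44.1 kHz

def embed_echo(block, bit, alpha=0.3):
    """Return A'(t) = A(t) + alpha * A(t - delay); the delay encodes the bit."""
    delay = DELAY_ONE if bit else DELAY_ZERO
    return [s + (alpha * block[t - delay] if t >= delay else 0.0)
            for t, s in enumerate(block)]

def autocorrelation(x, lag):
    """Unnormalized autocorrelation of x at the given lag."""
    return sum(x[t] * x[t - lag] for t in range(lag, len(x)))

def detect_echo(block):
    """Pick the bit whose candidate delay correlates more strongly."""
    if autocorrelation(block, DELAY_ONE) > autocorrelation(block, DELAY_ZERO):
        return 1
    return 0

def embed_bits(signal, bits, block_len):
    """Split the signal into blocks, embed one bit per block, concatenate."""
    out = []
    for i, bit in enumerate(bits):
        out.extend(embed_echo(signal[i * block_len:(i + 1) * block_len], bit))
    out.extend(signal[len(bits) * block_len:])  # leftover samples stay unmarked
    return out

def extract_bits(signal, n_bits, block_len):
    """Blind recovery: the original signal A is not needed."""
    return [detect_echo(signal[i * block_len:(i + 1) * block_len])
            for i in range(n_bits)]

# Illustrative host: white noise standing in for an audio signal.
rng = random.Random(1)
host = [rng.uniform(-1.0, 1.0) for _ in range(8 * 2000)]
message = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_bits(host, message, block_len=2000)
recovered = extract_bits(marked, len(message), block_len=2000)
```

The sketch omits the random gaps between consecutive blocks that the text recommends for making detection and removal harder.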
Phase Coding It is known that the human auditory system is less sensitive to the phase components of sound than to the noise components, a property that is exploited by some audio compression schemes. Phase coding (or phase distortion) makes use of
this characteristic as well (Bender et al., 1996; Johnson & Katzenbeisser, 2000). The method works by substituting the phase of the original audio signal A with one of two reference phases, each one encoding a bit of information. That is, the watermark data W is represented by a phase shift in the phase of A. The original signal A is split into a series of short sequences Ai, each one of length l. Then a discrete Fourier transform (DFT) is applied to each one of the resulting segments. This transforms the signal representation from the time domain to the frequency domain, thus generating a matrix of phases Φ and a matrix of Fourier transform magnitudes. The phase shifts between consecutive signal segments must be preserved in the watermarked file A’. This is necessary because the human auditory system is very sensitive to relative phase differences, but not to absolute phase changes. In other words, the phase coding method works by substituting the phase of the initial audio segment with a reference phase that represents the data. After this, the phase of subsequent segments is adjusted in order to preserve the relative phases between them (Bender et al., 1996). Given this, the embedding process inserts the watermark information in the phase vector of the first segment of A. It then creates a new phase matrix Φ', using the original phase differences found in Φ. After this step, the original matrix of Fourier transform magnitudes is used alongside the new phase matrix Φ' to construct the watermarked audio signal A’, by applying the inverse Fourier transform (that is, converting the signal back to the time domain). At this point, the absolute phases of the signal have been modified, but their relative differences are preserved. Throughout the process, the matrix of Fourier amplitudes remains constant. Any modifications to it could generate intolerable degradation (Dugelay & Roche, 2000).
In order to recover the watermark, the length of the segments, the DFT points, and the data interval must be known at the receiver. Once the signal is divided into the same segments that were used for the embedding process, the next step is to calculate the DFT of each one of these segments. Once the transformation has been applied, the recovery process can measure the phase vector of the first segment and thereby restore the originally encoded value for W. With phase coding, an embedding rate between eight and 32 bits per second is possible, depending on the audio context. The higher rates are usually achieved when there is a noisy background in the audio signal. A higher embedding rate can result in phase dispersion, a distortion13 caused by a break in the relationship of the phases between each of the frequency components (Bender et al., 1996).
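The embedding and recovery steps of phase coding can be illustrated with a short sketch. Several details are assumptions made for the example rather than values taken from the text: the reference phases are chosen as +π/2 for a zero and −π/2 for a one, one bit is written per frequency bin starting at bin 1, and a naive DFT is used so that the code needs only the standard library. All function names are hypothetical.

```python
import cmath
import math
import random

def dft(x):
    """Naive O(n^2) discrete Fourier transform, kept simple for illustration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def inverse_dft_real(spectrum):
    """Inverse DFT of a conjugate-symmetric spectrum; returns real samples."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def embed_phase(signal, bits, seg_len):
    """Write bits into the first segment's phases; keep phase differences."""
    if len(bits) >= seg_len // 2:
        raise ValueError("too many bits for this segment length")
    # Samples beyond the last full segment are dropped in this sketch.
    segments = [signal[i:i + seg_len]
                for i in range(0, len(signal) - seg_len + 1, seg_len)]
    spectra = [dft(seg) for seg in segments]
    magnitudes = [[abs(c) for c in spec] for spec in spectra]
    phases = [[cmath.phase(c) for c in spec] for spec in spectra]
    # Substitute the reference phases into the first segment.
    new_phases = [list(phases[0])]
    for k, bit in enumerate(bits, start=1):
        new_phases[0][k] = -math.pi / 2 if bit else math.pi / 2
        new_phases[0][seg_len - k] = -new_phases[0][k]  # keeps the output real
    # Later segments keep their original phase differences.
    for i in range(1, len(segments)):
        new_phases.append([new_phases[i - 1][k] + phases[i][k] - phases[i - 1][k]
                           for k in range(seg_len)])
    marked = []
    for mags, phs in zip(magnitudes, new_phases):
        marked.extend(inverse_dft_real(
            [cmath.rect(m, p) for m, p in zip(mags, phs)]))
    return marked

def extract_phase(marked, n_bits, seg_len):
    """Read the sign of the phase in the first segment's bins 1..n_bits."""
    first = dft(marked[:seg_len])
    return [1 if cmath.phase(first[k]) < 0 else 0 for k in range(1, n_bits + 1)]

rng = random.Random(3)
host = [rng.uniform(-1.0, 1.0) for _ in range(4 * 32)]
message = [1, 0, 1, 1, 0]
marked = embed_phase(host, message, seg_len=32)
recovered = extract_phase(marked, len(message), seg_len=32)
```

As the text requires, the receiver in this sketch must know the segment length and the number of embedded bits, and the Fourier magnitudes are left unchanged.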
Spread Spectrum Watermarking
Spread spectrum techniques for watermarking borrow most of the theory from the communications community (Czerwinski et al., 1999). The main idea is to embed a narrow-band signal (the watermark) into a wide-band channel (the audio file). The characteristics of both A and W seem to suit this model perfectly. In addition, spread spectrum techniques offer the possibility of protecting the watermark privacy by using a secret key to control the pseudorandom sequence generator that is needed in the process. Generally, the message used as the watermark is a narrow band signal compared to the wide band of the cover (Dugelay & Roche, 2000; Kirovski & Malvar, 2001). Spread spectrum techniques allow the frequency bands to be matched before embedding the message. Furthermore, high frequencies are relevant for the invisibility of the watermark but are inefficient as far as robustness is concerned, whereas low frequencies have the opposite characteristics. If a low energy signal is embedded on each of the frequency bands, this conflict is partially solved. This is why spread spectrum techniques are valuable not only for robust communication but for watermarking as well.
There are two basic approaches to spread spectrum techniques: direct sequence and frequency hopping. In both of these approaches the idea is to spread the watermark data across a large frequency band, namely the entire audible spectrum. In the case of direct sequence, the cover signal A is modulated by the watermark message m and a pseudorandom (PN) noise sequence, which has a wide frequency spectrum. As a consequence, the spectrum of the resulting message m’ is spread over the available band. Then, the spread message m’ is attenuated in order to obtain the watermark W. This watermark is then added to the original file, for example as additive random noise, in order to obtain the watermarked version A’. To keep the noise level down, the attenuation performed to m’ should yield a signal with about 0.5% of the dynamic range of the cover file A (Bender et al., 1996). In order to recover the watermark, the watermarked audio signal A’ is modulated with the PN-sequence to remove it. The demodulated signal is then W. However, some keying mechanisms can be used when embedding the watermark, which means that at the recovery end a detector must also be used. For example, if bi-phase shift keying is used when embedding W, then a phase detector must be used at the recovery process (Czerwinski et al., 1999). In the case of frequency hopping, the cover frequency is altered using a random process, thus describing a wide range of frequency values. That is, the frequency-hopping method selects a pseudorandom subset of the data to be watermarked. The watermark W is then attenuated and merged with the selected data using one of the methods explained in this chapter, such as coefficient quantization in a transform domain. As a result, the modulated watermark has a wide spectrum. For the detection process, the pseudorandom generator used to alter the cover frequency is used to recover the parts of the signal where the watermark is hidden. Then the watermark can be
recovered by using the detection method that corresponds to the embedding mechanism used. A crucial factor for the performance of spread spectrum techniques is the synchronization between the watermarked audio signal A’ and the PN-sequence (Dugelay & Roche, 2000; Kirovski & Malvar, 2001). This is why the particular PN-sequence used acts as a key to the recovery process. Nonetheless, some attacks can focus on this delicate aspect of the model.
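A direct sequence scheme of the kind described above can be sketched in a few lines. The example is only illustrative: the number of chips per bit, the gain of 0.05 (larger than the 0.5% figure quoted in the text, so that a short test signal suffices), the use of Python's `random.Random` seeded with the key as the PN generator, and all function names are assumptions. Detection is done by correlating each block of A’ with the PN-sequence and taking the sign.

```python
import random

def pn_sequence(key, length):
    """Pseudorandom +/-1 chips; the seed plays the role of the secret key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(host, bits, key, chips_per_bit, gain):
    """A' = A + W, where W is the attenuated, PN-spread message."""
    pn = pn_sequence(key, len(bits) * chips_per_bit)
    marked = list(host)
    for i, bit in enumerate(bits):
        symbol = 1.0 if bit else -1.0
        for j in range(i * chips_per_bit, (i + 1) * chips_per_bit):
            marked[j] += gain * symbol * pn[j]
    return marked

def detect(marked, n_bits, key, chips_per_bit):
    """Correlate each block with the same PN-sequence; the sign gives the bit."""
    pn = pn_sequence(key, n_bits * chips_per_bit)
    bits = []
    for i in range(n_bits):
        span = range(i * chips_per_bit, (i + 1) * chips_per_bit)
        correlation = sum(marked[j] * pn[j] for j in span)
        bits.append(1 if correlation > 0 else 0)
    return bits

# Illustrative host: white noise standing in for an audio signal.
host_rng = random.Random(7)
message = [1, 0, 0, 1, 1, 0, 1, 0]
host = [host_rng.uniform(-1.0, 1.0) for _ in range(len(message) * 8000)]
marked = embed(host, message, key=2024, chips_per_bit=8000, gain=0.05)
recovered = detect(marked, len(message), key=2024, chips_per_bit=8000)
```

Because the detector regenerates the PN-sequence from the key and aligns it sample by sample with A’, any loss of synchronization (for example, a few dropped samples) breaks the correlation, which is the weakness noted in the text.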
MEASURING FIDELITY
Artists, and digital content owners in general, have many reasons for embedding watermarks in their copyrighted works. These reasons have been stated in the previous sections. However, there is a big risk in performing such an operation, as the quality of the musical content might be degraded to a point where its value is diminished. Fortunately, the opposite is also possible and, if done right, digital watermarks can add value to content (Acken, 1998). Content owners are generally concerned with the degradation of the cover signal quality, even more than users of the content (Craver, Yeo, & Yeung, 1998). They have access to the unwatermarked content with which to compare their audio files. Moreover, they have to decide between the amount of tolerance in quality degradation from the watermarking process and the level of protection that is achieved by embedding a stronger signal. As a restriction, an embedded watermark has to be detectable in order to be valuable. Given this situation, it becomes necessary to measure the impact that a marking scheme has on an audio signal. This is done by measuring the fidelity of the watermarked audio signal A’, and constitutes the first measure that is defined in this chapter. As fidelity refers to the similitude between an original and a watermarked signal, a statistical metric must be used. Such a metric will fall in
one of two categories: difference metrics or correlation metrics. Difference metrics, as the name states, measure the difference between the undistorted original audio signal A and the distorted watermarked signal A’. The popularity of these metrics is derived from their simplicity (Kutter & Petitcolas, 1999). In the case of digital audio, the most common difference metric used for quality evaluation of watermarks is the signal to noise ratio (SNR). This is usually measured in decibels (dB), so SNR(dB) = 10 log10 (SNR). The signal to noise ratio, measured in decibels, is defined by the formula:
$$\mathrm{SNR(dB)} = 10 \log_{10} \frac{\sum_n A_n^2}{\sum_n (A_n - A'_n)^2}$$
where An corresponds to the nth sample of the original audio file A, and A’n to the nth sample of the watermarked signal A’. This is a measure
of quality that reflects the quantity of distortion that a watermark imposes on a signal (Gordy & Burton, 2000). Another common difference metric is the peak signal to noise ratio (PSNR), which measures the maximum signal to noise ratio found on an audio signal. The formula for the PSNR, along with some other difference metrics found in the literature are presented in Table 1 (Kutter & Hartung, 2000; Kutter & Petitcolas, 1999). Although the tolerable amount of noise depends on both the watermarking application and the characteristics of the unwatermarked audio signal, one could expect to have perceptible noise distortion for SNR values of 35dB (Petitcolas & Anderson, 1999). Correlation metrics measure distortion based on the statistical correlation between the original and modified signals. They are not as popular as the difference distortion metrics, but it is important to state their existence. Table 2 shows the most important of these.
Table 1. Common difference distortion metrics
| Metric | Formula |
| --- | --- |
| Maximum Difference | $MD = \max_n \lvert A_n - A'_n \rvert$ |
| Average Absolute Difference | $AD = \frac{1}{N} \sum_n \lvert A_n - A'_n \rvert$ |
| Normalized Average Absolute Difference | $NAD = \sum_n \lvert A_n - A'_n \rvert / \sum_n \lvert A_n \rvert$ |
| Mean Square Error | $MSE = \frac{1}{N} \sum_n (A_n - A'_n)^2$ |
| Normalized Mean Square Error | $NMSE = \sum_n (A_n - A'_n)^2 / \sum_n A_n^2$ |
| LP-Norm | $LP = \left( \frac{1}{N} \sum_n \lvert A_n - A'_n \rvert^p \right)^{1/p}$ |
| Laplacian Mean Square Error | $LMSE = \sum_n (\nabla^2 A_n - \nabla^2 A'_n)^2 / \sum_n (\nabla^2 A_n)^2$ |
| Signal to Noise Ratio | $SNR = \sum_n A_n^2 / \sum_n (A_n - A'_n)^2$ |
| Peak Signal to Noise Ratio | $PSNR = N \max_n A_n^2 / \sum_n (A_n - A'_n)^2$ |
| Audio Fidelity | $AF = 1 - \sum_n (A_n - A'_n)^2 / \sum_n A_n^2$ |
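Most of the difference metrics of Table 1 reduce to a few sums over the sample-wise error. The following sketch computes several of them for two equally long sample lists; the function name and the dictionary keys are hypothetical, and the code assumes that the two signals are not identical and that the original is not silent (otherwise the ratios are undefined).

```python
import math

def difference_metrics(original, marked):
    """Compute some Table 1 metrics for an original A and a watermarked A'."""
    n = len(original)
    err = [a - b for a, b in zip(original, marked)]
    abs_err = sum(abs(e) for e in err)
    sq_err = sum(e * e for e in err)         # sum of (A_n - A'_n)^2
    energy = sum(a * a for a in original)    # sum of A_n^2
    snr = energy / sq_err                    # undefined when A' equals A
    return {
        "MD": max(abs(e) for e in err),
        "AD": abs_err / n,
        "NAD": abs_err / sum(abs(a) for a in original),
        "MSE": sq_err / n,
        "NMSE": sq_err / energy,
        "SNR": snr,
        "SNR_dB": 10 * math.log10(snr),
        "PSNR": n * max(a * a for a in original) / sq_err,
        "AF": 1 - sq_err / energy,
    }

# Toy example: a single sample is off by one.
metrics = difference_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

For this toy input the signal energy is 30 and the squared error is 1, so the SNR is 30, or about 14.8 dB.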
Table 2. Correlation distortion metrics
| Metric | Formula |
| --- | --- |
| Normalized Cross-Correlation | $NC = \sum_n A_n \tilde{A}_n / \sum_n A_n^2$ |
| Correlation Quality | $CQ = \sum_n A_n \tilde{A}_n / \sum_n A_n$ |
Here $\tilde{A}_n$ denotes the nth sample of the modified signal.
For the purpose of audio watermark benchmarking, the signal to noise ratio (SNR) should be used to measure the fidelity of the watermarked signal with respect to the original. This decision follows most of the literature that deals with the topic (Gordy & Burton, 2000; Kutter & Petitcolas, 1999, 2000; Petitcolas & Anderson, 1999). Nonetheless, in this measure the term noise refers to statistical noise, or a deviation from the original signal, rather than to perceived noise on the side of the hearer. This is because the SNR is not well correlated with the human auditory system (Kutter & Hartung, 2000). Given this characteristic, the effect of perceptual noise needs to be addressed later. In addition, when a metric that outputs results in decibels is used, comparisons are difficult to make, as the scale is not linear but rather logarithmic. This means that it is more useful to present the results using a normalized quality rating. The ITU-R Rec. 500 quality rating is perfectly suited for this task, as it gives a quality rating on a scale of 1 to 5 (Arnold, 2000; Piron et al., 1999). Table 3 shows the rating scale, along with the quality level being represented. This quality rating is computed by using the formula:
$$\mathrm{Quality} = F = \frac{5}{1 + N \cdot \mathrm{SNR}}$$
where N is a normalization constant and SNR is the measured signal to noise ratio. The resulting value corresponds to the fidelity F of the watermarked signal.
Table 3. ITU-R Rec. 500 quality rating
| Rating | Impairment | Quality |
| --- | --- | --- |
| 5 | Imperceptible | Excellent |
| 4 | Perceptible, not annoying | Good |
| 3 | Slightly annoying | Fair |
| 2 | Annoying | Poor |
| 1 | Very annoying | Bad |
Data Payload The fidelity of a watermarked signal depends on the amount of embedded information, the strength of the mark, and the characteristics of the host signal. This means that a comparison between different algorithms must be made under equal conditions. That is, while keeping the payload fixed, the fidelity must be measured on the same audio cover signal for all watermarking techniques being evaluated. However, the process just described constitutes a single measure event and will not be representative of the characteristics of the algorithms being evaluated, as results can be biased depending on the chosen parameters. For this reason, it is important to perform the tests using a variety of audio signals, with changing size and nature (Kutter & Petitcolas, 2000). Moreover, the test should also be repeated using different keys. The amount of information that should be embedded is not easy to determine, and depends on the application of the watermarking scheme. In Kutter and Petitcolas (2000) a message length of 100 bits is used on their test of image watermarking systems as a representative value. However, some secure watermarking protocols might need a bigger payload value, as the watermark W could include a cryptographic signature for both the audio file A, and the watermark message m in order to be more secure (Katzenbeisser & Veith, 2002). Given this, it is recommended to use a longer watermark bitstream for the test, so that a
real world scenario is represented. A watermark size of 128 bits is big enough to include two 56-bit signatures and a unique identification number that identifies the owner.
Speed
Besides fidelity, the content owner might be interested in the time it takes for an algorithm to embed a mark (Gordy & Burton, 2000). Although speed is dependent on the type of implementation (hardware or software), one can suppose that the evaluation will be performed on software versions of the algorithms. In this case, it is a good practice to perform the test on a machine with similar characteristics to the one used by the end user (Petitcolas, 2000). Depending on the application, the value for the time it takes to embed a watermark will be incorporated into the results of the test. This will be done later, when all the measures are combined together.
MEASURING ROBUSTNESS
Watermarks have to be able to withstand a series of signal operations that are performed either intentionally or unintentionally on the cover signal and that can affect the recovery process. Given this, watermark designers try to guarantee a minimum level of robustness against such operations. Nonetheless, the concept of robustness is ambiguous most of the time and thus claims about a watermarking scheme being robust are difficult to prove due to the lack of testing standards (Craver, Perrig, & Petitcolas, 2000). By defining a standard metric for watermark robustness, one can then assure fairness when comparing different technologies. It becomes necessary to create a detailed and thorough test for measuring the ability that a watermark has to withstand a set of clearly defined signal operations. In this section these signal operations are presented, and a practical measure for robustness is proposed.
How to Measure
Before defining a metric, it must be stated that one does not need to erase a watermark in order to render it useless. It is said that a watermarking scheme is robust when it is able to withstand a series of attacks that try to degrade the quality of the embedded watermark, up to the point where it is removed, or its recovery process is unsuccessful. This means that just by interfering with the detection process a person can create a successful attack over the system, even unintentionally. However, in some cases one can overcome this characteristic by using error-correcting codes or a stronger detector (Cox et al., 2002). If an error correction code is applied to the watermark message, then it is unnecessary to entirely recover the watermark W in order to successfully retrieve the embedded message m. The use of stronger detectors can also be very helpful in these situations. For example, if a marking scheme has a publicly available detector, then an attacker will try to tamper with the cover signal up to the point where the detector does not recognize the watermark’s presence14. Nonetheless, the content owner may have another version of the watermark detector, one that can successfully recover the mark after some extra set of signal processing operations. This “special” detector might not be released for public use for economic, efficiency or security reasons. For example, it might only be used in court cases. The only thing that is really important is that it is possible to design a system with different detector strengths. Given these two facts, it makes sense to use a metric that allows for different levels of robustness, instead of one that only allows for two different states (the watermark is either robust or not). With this characteristic in mind, the basic procedure
for measuring robustness is a three-step process, defined as follows:
1. For each audio file in a determined test set, embed a random watermark W on the audio signal A, with the maximum strength possible that does not diminish the fidelity of the cover below a specified minimum (Petitcolas & Anderson, 1999).
2. Apply a set of relevant signal processing operations to the watermarked audio signal A’.
3. Finally, for each audio cover, extract the watermark W using the corresponding detector and measure the success of the recovery process.
Some of the early literature considered the recovery process successful only if the whole watermark message m was recovered (Petitcolas, 2000; Petitcolas & Anderson, 1999). This was in fact a binary robustness metric. However, the use of the bit-error rate has become common recently (Gordy & Burton, 2000; Kutter & Hartung, 2000; Kutter & Petitcolas, 2000), as it allows for a more detailed scale of values. The bit-error rate (BER) is defined as the ratio of incorrect extracted bits to the total number of embedded bits and can be expressed using the formula:
$$\mathrm{BER} = \frac{100}{l} \sum_{n=0}^{l-1} \begin{cases} 1, & W'_n = W_n \\ 0, & W'_n \neq W_n \end{cases}$$
where l is the watermark length, Wn corresponds to the nth bit of the embedded watermark and W’n corresponds to the nth bit of the recovered watermark. In other words, this measure of robustness is the certainty of detection of the embedded mark (Arnold, 2000). It is easy to see why this measure makes more sense, and thus should be used as the metric when evaluating the success of the watermark recovery process and therefore the robustness of an audio watermarking scheme.
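A bit-wise comparison of the embedded and recovered watermarks is straightforward to compute. The sketch below follows the prose definition given above, counting mismatched bits as a percentage of the watermark length; the complementary figure, the percentage of bits recovered correctly, is returned by a second helper. Both function names are hypothetical.

```python
def bit_error_rate(embedded, recovered):
    """Percentage of recovered bits that differ from the embedded bits."""
    if len(embedded) != len(recovered):
        raise ValueError("watermarks must have the same length")
    errors = sum(1 for w, w_rec in zip(embedded, recovered) if w != w_rec)
    return 100.0 * errors / len(embedded)

def match_percentage(embedded, recovered):
    """Percentage of bits recovered correctly (the complement of the BER)."""
    return 100.0 - bit_error_rate(embedded, recovered)

# Toy example: two of the eight bits were flipped by an attack.
W = [1, 0, 1, 1, 0, 0, 1, 0]
W_recovered = [1, 0, 0, 1, 0, 1, 1, 0]
ber = bit_error_rate(W, W_recovered)
matched = match_percentage(W, W_recovered)
```

In this example two bits out of eight differ, giving a bit-error rate of 25% and a match percentage of 75%. Unlike the binary criterion used in the early literature, the value degrades gradually as an attack becomes stronger.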
A final recommendation must be made at this point. The three-step procedure just described should be repeated several times, since the embedded watermark W is randomly generated and the recovery can be successful by chance (Petitcolas, 2000). Up to this point no details have been given about the signal operations that should be performed in the second step of the robustness test. As a rule of thumb, one should include as a minimum the operations that the audio cover is expected to go through in a real world application. However, this will not provide enough testing, as a malicious attacker will most likely have access to a wide range of tools as well as a broad range of skills. Given this situation, several scenarios should be covered. In the following sections the most common signal operations and attacks that an audio watermark should be able to withstand are presented.
Audio Restoration Attack Audio restoration techniques have been used for several years now, specifically for restoring old audio recordings that have audible artifacts. In audio restoration the recording is digitized and then analyzed for degradations. After these degradations have been localized, the corresponding samples are eliminated. Finally the recording is reconstructed (that is, the missing samples are recreated) by interpolating the signal using the remaining samples. One can assume that the audio signal is the product of a stationary autoregressive (AR) process of finite order (Petitcolas & Anderson, 1998). With this assumption in mind, one can use an audio segment to estimate a set of AR parameters and then calculate an approximate value for the missing samples. Both of the estimates are calculated using a least-square minimization technique. Using the audio restoration method just described one can try to render a watermark undetectable by processing the marked audio signal A’.
The process is as follows: First divide the audio signal A’ into N blocks of size m samples each. A value of m=1000 samples has been proposed in the literature (Petitcolas & Anderson, 1999). A block of length l is removed from the middle of each block and then restored using the AR audio restoration algorithm. This generates a reconstructed block also of size m. After the N blocks have been processed they are concatenated again, and an audio signal B’ is produced. It is expected that B’ will be closer to A than to A’ and thus the watermark detector will not find any mark in it. An error free restoration is theoretically possible in some cases, but this is not desired since it would produce a signal identical to A’. What is expected is to create a signal that has an error value big enough to mislead the watermark detector, but small enough to prevent the introduction of audible noise. Adjusting the value of the parameter l controls the magnitude of the error (Petitcolas & Anderson, 1999). In particular, a value of l=80 samples has proven to give good results.
Invertibility Attack When resolving ownership cases in court, the disputing parties can both claim that they have inserted a valid watermark on the audio file, as it is sometimes possible to embed multiple marks on a single cover signal. Clearly, one mark must have been embedded before the other. The ownership is resolved when the parties are asked to show the original work to court. If Alice has the original audio file A, which has been kept stored in a safe place, and Mallory has a counterfeit original file Ã, which has been derived from A, then Alice can search for her watermark W in Mallory’s file and will most likely find it. The converse will not happen, and the case will be resolved (Craver et al., 2000). However, an attack to this procedure can be created, and is known as an invertibility attack.
Normally the content owner adds a watermark W to the audio file A, creating a watermarked audio file A’ = A+W, where the sign “+” denotes the embedding operation. This file is released to the public, while the original A and the watermark W are stored in a safe place. When a suspicious audio file à appears, the difference à − A is computed. This difference should be equal to W if A’ and à are equal, and very close to W if à was derived from A’. In general, a correlation function ƒ is used to determine the similarity between the watermark W and the extracted difference. This function will yield a value close to 1 if the two are similar. However, Mallory can do the following: she can subtract (rather than add) a second watermark w from Alice’s watermarked file A’, using the inverse of the embedding algorithm. This yields an audio file  = A’ − w = A + W − w, which Mallory can now claim to be the original audio file, along with w as the original watermark (Craver, Memon, Yeo, & Yeung, 1998). Now both Alice and Mallory can claim copyright violation from their counterparts. When the two originals are compared in court, Alice will find that her watermark is present in Mallory’s audio file, since  − A = W − w is calculated, and ƒ(W − w, W) ≈ 1. However, Mallory can show that when A −  = w − W is calculated, then ƒ(w − W, w) ≈ 1 as well. In other words, Mallory can show that her mark is also present in Alice’s work, even though Alice has kept it locked at all times (Craver, Memon, & Yeung, 1996; Craver, Yeo et al., 1998). Given the symmetry of the equations, it is impossible to decide who is the real owner of the original file. A deadlock is thus created (Craver, Yeo et al., 1998; Pereira et al., 2001). This attack is a clear example of how one can render a mark unusable without having to remove it, by exploiting the invertibility of the watermarking method, which allows an attacker to remove as well as add watermarks. Such an attack can be prevented by using a non-invertible cryptographic signature in the watermark W; that is, using a secure watermarking protocol (Katzenbeisser & Veith, 2002; Voloshynovskiy, Pereira, Pun et al., 2001).
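The deadlock can be reproduced numerically with a simple additive scheme. The sketch below is only an illustration under assumptions that are not taken from the text: the watermarks are random ±0.1 sequences, embedding is plain addition, and the correlation function ƒ is taken to be the inner product of the extracted difference with the mark, normalized by the energy of the mark. All names are hypothetical.

```python
import random

def random_mark(seed, n, strength=0.1):
    """A random +/-strength sequence standing in for a watermark."""
    rng = random.Random(seed)
    return [strength * rng.choice((-1.0, 1.0)) for _ in range(n)]

def similarity(extracted, mark):
    """Assumed correlation function f: <extracted, mark> / <mark, mark>."""
    return (sum(e * m for e, m in zip(extracted, mark))
            / sum(m * m for m in mark))

n = 10000
rng = random.Random(0)
A = [rng.uniform(-1.0, 1.0) for _ in range(n)]   # Alice's true original
W = random_mark(1, n)                            # Alice's watermark
w = random_mark(2, n)                            # Mallory's fabricated watermark

A_marked = [a + x for a, x in zip(A, W)]         # A' = A + W, released publicly
A_fake = [a - x for a, x in zip(A_marked, w)]    # Mallory's "original" = A' - w

# Alice looks for W in Mallory's file: f(W - w, W).
alice_claim = similarity([f - a for f, a in zip(A_fake, A)], W)
# Mallory looks for w in Alice's original: f(w - W, w).
mallory_claim = similarity([a - f for a, f in zip(A, A_fake)], w)
```

Both `alice_claim` and `mallory_claim` come out close to 1, because W and w are nearly uncorrelated, so the test cannot distinguish the true original from the counterfeit.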
Specific Attack on Echo Watermarking
The echo watermarking technique presented in this chapter can be easily “attacked” simply by detecting the echo and then removing the delayed signal by inverting the convolution formula that was used to embed it. However, the problem consists of detecting the echo without knowing the original signal and the possible delay values. This problem is referred to as blind echo cancellation, and is known to be difficult to solve (Petitcolas, Anderson, & Kuhn, 1998). Nonetheless, a practical solution to this problem appears to lie in the same function that is used for echo watermarking extraction: cepstrum autocorrelation. Cepstrum analysis, along with a brute force search can be used together to find the echo signal in the watermarked audio file A’. A detailed description of the attack is given by Craver et al. (2000), and the idea is as follows: If we take the power spectrum of A’(t) = A(t) + αA(t – ∆t), denoted by Φ and then calculate the logarithm of Φ, the amplitude of the delayed signal can be augmented using an autocovariance function15 over the power spectrum Φ'(ln(Φ)). Once the amplitude has been increased, then the “hump” of the signal becomes more visible and the value of the delay ∆t can be determined (Petitcolas et al., 1998). Experiments show that when an artificial echo is added to the signal, this attack works well for values of ∆t between 0.5 and three milliseconds (Craver et al., 2000). Given that the watermark is usually embedded with a delay value that ranges from 0.5 to two milliseconds, this attack seems to be well suited for the technique and thus very likely to be successful (Petitcolas et al., 1999).
Collusion Attack
A collusion attack, also known as averaging, is especially effective against basic fingerprinting schemes. The basic idea is to take a large number of watermarked copies of the same audio file, and average them in order to produce an audio signal without a detectable mark (Craver et al., 2000; Kirovski & Malvar, 2001). Another possible scenario is to have copies of multiple works that have been embedded with the same watermark. By averaging the sample values of the audio signals, one could estimate the value of the embedded mark, and then try to subtract it from any of the watermarked works. It has been shown that a small number (around 10) of different copies are needed in order to perform a successful collusion attack (Voloshynovskiy, Pereira, Pun et al., 2001). An obvious countermeasure to this attack is to embed more than one mark on each audio cover, and to make the marks dependent on the characteristics of the audio file itself (Craver et al., 2000).
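The effect of averaging can be seen in a small experiment. The following sketch assumes a basic additive fingerprinting scheme with independent random ±0.1 marks and an informed detector that has access to the original A; these choices, the detection score, and all names are illustrative assumptions rather than details from the cited works.

```python
import random

def random_mark(seed, n, strength=0.1):
    """A random +/-strength sequence standing in for one user's fingerprint."""
    rng = random.Random(seed)
    return [strength * rng.choice((-1.0, 1.0)) for _ in range(n)]

def detection_score(candidate, original, mark):
    """Correlation of (candidate - original) with the mark, normalized to 1."""
    diff = [c - a for c, a in zip(candidate, original)]
    return sum(d * m for d, m in zip(diff, mark)) / sum(m * m for m in mark)

n, copies = 5000, 10
rng = random.Random(0)
A = [rng.uniform(-1.0, 1.0) for _ in range(n)]
marks = [random_mark(seed, n) for seed in range(1, copies + 1)]
fingerprinted = [[a + m for a, m in zip(A, mark)] for mark in marks]

# The colluders average their copies sample by sample.
averaged = [sum(samples) / copies for samples in zip(*fingerprinted)]

single_score = detection_score(fingerprinted[0], A, marks[0])
colluded_scores = [detection_score(averaged, A, mark) for mark in marks]
```

With ten colluders, each fingerprint contributes only about one tenth of its original strength to the averaged file, so every score in `colluded_scores` is close to 0.1, against a score of about 1 for a single fingerprinted copy.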
Signal Diminishment Attacks and Common Processing Operations Watermarks must be able to survive a series of signal processing operations that are commonly performed on the audio cover work, either intentionally or unintentionally. Any manipulation of an audio signal can result in a successful removal of the embedded mark. Furthermore, the availability of advanced audio editing tools on the Internet, such as Audacity (Dannenberg & Mazzoni, 2002), implies that these operations can be performed without an extensive knowledge of digital signal processing techniques. The removal of a watermark by performing one of these operations is known as a signal diminishment attack, and probably constitutes the most common attack performed on digital watermarks (Meerwald & Pereira, 2002).
Given this, a set of the most common signal operations must be specified, and watermark resistance to these must be evaluated. Even though an audio file will most likely not be subject to all the possible operations, a thorough list is necessary. Defining which subset of these operations is relevant for a particular watermarking scheme is a task that needs to be done; however, this will be addressed later in the chapter. The signal processing operations presented here are classified into eight different groups, according to the presentation made in Petitcolas et al. (2001). These are: •
•
•
•
•
• Dynamics: These operations change the loudness profile of the audio signal. The most basic way of doing this is to increase or decrease the loudness directly. More complicated operations include limiting, expansion and compression, which are nonlinear operations that depend on the audio cover.
• Filter: Filters cut off or boost a selected part of the audio spectrum. Equalizers can be seen as filters, as they increase some parts of the spectrum while decreasing others. More specialized filters include low-pass, high-pass, all-pass, FIR, and so forth.
• Ambience: These operations try to simulate the effect of listening to an audio signal in a room. Reverb and delay filters are used for this purpose, as they can be adjusted to simulate the different sizes and characteristics that a room can have.
• Conversion: Digital audio files are nowadays subject to format changes. For example, old monophonic signals might be converted to stereo format for broadcast transmission. Changes from digital to analog representation and back are also common, and might induce significant quantization noise, as no conversion is perfect.
• Lossy compression: These algorithms are becoming popular, as they reduce the amount of data needed to represent an audio signal. This means that less bandwidth is needed to transmit the signal, and that less space is needed for its storage. These compression algorithms are based on psychoacoustic models and, although different implementations exist, most of them rely on deleting information that is not perceived by the listener. This can pose a serious problem for some watermarking schemes, as they sometimes hide the watermark exactly in these imperceptible regions. If the watermarking algorithm selects these regions using the same method as the compression algorithm, then one just needs to apply the lossy compression algorithm to the watermarked signal in order to remove the watermark.
• Noise: Noise can be added in order to remove a watermark. It can even be imperceptible if it is shaped to match the properties of the cover signal. Fragile watermarks are especially vulnerable to this attack. Sometimes noise appears as a by-product of other signal operations rather than being added intentionally.
• Modulation: Effects such as vibrato, chorus, amplitude modulation and flanging are not common post-production operations. However, they are included in most audio editing software packages and thus can easily be used to remove a watermark.
• Time stretch and pitch shift: These operations either change the length of an audio passage without changing its pitch, or change the pitch without changing its length in time. The use of time stretch techniques has become common in radio broadcasts, where stations have been able to increase the number of advertisements without devoting more air time to them (Kuczynski, 2000).
• Sample permutations: This group consists of specialized algorithms for audio manipulation, such as the attack on echo hiding just presented. Dropping some samples in order to misalign the watermark decoder is also a common attack on spread-spectrum watermarking techniques.
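As an illustration of the noise operation listed above, the following minimal sketch adds white Gaussian noise to a signal at a chosen signal-to-noise ratio. It is only a sketch under stated assumptions: the function names, the fixed random seed and the synthetic 440 Hz test tone are hypothetical and are not part of any benchmark, and the noise is not perceptually shaped.

```python
import math
import random


def add_white_noise(samples, snr_db, seed=0):
    """Return a copy of `samples` with white Gaussian noise added.

    The noise power is set `snr_db` decibels below the average signal power.
    (Hypothetical helper; a perceptually shaped noise attack would differ.)
    """
    rng = random.Random(seed)
    signal_power = sum(x * x for x in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in samples]


def measured_snr_db(clean, noisy):
    """Signal-to-noise ratio, in dB, of `noisy` relative to `clean`."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# One second of a 440 Hz tone at 44.1 kHz stands in for the cover signal.
cover = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
attacked = add_white_noise(cover, snr_db=40)
print(round(measured_snr_db(cover, attacked), 1))
```

The measured value will differ slightly from the requested one, since the noise is a finite random sample.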
It is not always clear how much processing a watermark should be able to withstand. That is, the specific parameters of the diverse filtering operations that can be performed on the cover signal are not easy to determine. In general terms, one could expect a marking scheme to survive several processing operations up to the point where they introduce annoying audible effects on the audio work. However, this rule of thumb is still too vague. Fortunately, guidelines and minimum requirements for audio watermarking schemes have been proposed by different organizations such as the Secure Digital Music Initiative (SDMI), International Federation of the Phonographic Industry (IFPI), and the Japanese Society for Rights of Authors, Composers and Publishers (JASRAC). These guidelines constitute the baseline for any robustness test. In other words, they describe the minimum processing that an audio watermark should be able to resist, regardless of its intended application. Table 4 summarizes these requirements (JASRAC, 2001; SDMI, 2000).
Table 4. Summary of SDMI, STEP and IFPI requirements

| Processing operation | Requirements |
| --- | --- |
| Digital to analog conversion | Two consecutive digital to analog and analog to digital conversions. |
| Equalization | 10-band graphic equalizer with the following gains: 31 Hz: -6 dB; 62 Hz: +6 dB; 125 Hz: -6 dB; 250 Hz: +3 dB; 500 Hz: -6 dB; 1 kHz: +6 dB; 2 kHz: -6 dB; 4 kHz: +6 dB; 8 kHz: -6 dB; 16 kHz: +6 dB. |
| Band-pass filtering | 100 Hz – 6 kHz, 12 dB/oct. |
| Time stretch and pitch change | +/- 10% compression and decompression. |
| Codecs (at typically used data rates) | AAC, MPEG-4 AAC with perceptual noise substitution, MPEG-1 Audio Layer 3, QDesign, Windows Media Audio, Twin-VQ, ATRAC-3, Dolby Digital AC-3, ePAC, RealAudio, FM, AM, PCM. |
| Noise addition | Adding white noise with a constant level 40 dB lower than the total averaged music power (SNR: 40 dB). |
| Time scale modification | Pitch-invariant time scaling of +/- 4%. |
| Wow and flutter | 0.5% rms, from DC to 250 Hz. |
| Echo addition | Delay up to 100 milliseconds, feedback coefficient up to 0.5. |
| Down mixing and surround sound processing | Stereo to mono, 6 channel to stereo, SRS, spatializer, Dolby surround, Dolby headphone. |
| Sample rate conversion | 44.1 kHz to 16 kHz, 48 kHz to 44.1 kHz, 96 kHz to 48/44.1 kHz. |
| Dynamic range reduction | Threshold of 50 dB, 16 dB maximum compression. Rate: 10-millisecond attack, 3-second recovery. |
| Amplitude compression | 16 bits to 8 bits. |

False Positives

When testing for false positives, two different scenarios must be evaluated. The first one occurs when the watermark detector signals the presence of a mark on an unmarked audio file. The second case corresponds to the detector successfully finding a watermark W’ on an audio file that has been marked with a watermark W (Cox et al., 2002; Kutter & Hartung, 2000; Petitcolas et al., 2001).

The testing procedure for both types of false positives is simple. In the first case one just needs to run the detector on a set of unwatermarked works. For the second case, one can embed a watermark W using a given key K, and then try to extract a different mark W’ while using the same key K. The false positive rate (FPR) is then defined as the number of successful test runs divided by the total number of test runs. A successful test run is said to occur whenever a false positive is detected.

However, a big problem arises when one takes into account the required false positive rate for some schemes. For example, a popular application such as DVD watermarking requires a false positive rate of 1 in 10^12 (Cox et al., 2002). In order to verify that this rate is achieved, one would need to run the described experiment for several years. Other applications, such as proof of ownership in court, are rare, and thus tolerate a less stringent false positive rate. Nonetheless, a false positive probability of 10^-6, required for the mentioned application, can be difficult to test.
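To give a sense of the testing burden just described, the sketch below estimates how many detector runs without a single false positive would be needed to support a target false positive rate, and how long that would take. It is a rough illustration under assumptions that do not come from the benchmark itself: the runs are treated as independent, the 95% confidence level is an arbitrary choice, and the detector throughput `RUNS_PER_SECOND` is a hypothetical figure.

```python
import math


def false_positive_rate(detections):
    """FPR as defined above: false positives divided by total test runs.

    `detections` holds one boolean per run on material that should not
    trigger the detector (True means a false positive was reported).
    """
    return sum(1 for d in detections if d) / len(detections)


def trials_needed(target_fpr, confidence=0.95):
    """Smallest number of false-positive-free runs needed to claim, at the
    given confidence, that the true rate is below `target_fpr`.

    Assumes independent runs: solves (1 - target_fpr) ** n <= 1 - confidence.
    """
    return math.ceil(math.log(1 - confidence) / math.log1p(-target_fpr))


RUNS_PER_SECOND = 1000  # hypothetical detector throughput
SECONDS_PER_YEAR = 365 * 24 * 3600

for fpr in (1e-6, 1e-12):
    n = trials_needed(fpr)
    years = n / RUNS_PER_SECOND / SECONDS_PER_YEAR
    print(f"target {fpr:g}: about {n:.3g} runs, {years:.3g} years")
```

Under these assumptions the required number of runs is roughly three divided by the target rate, which is why very small rates are impractical to verify by direct experiment.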
MEASURING PERCEPTIBILITY

Digital content consumers are aware of many aspects of emerging watermarking technologies. However, one concern prevails over all the others: the appearance of perceptible (audible) artifacts due to the use of a watermarking scheme. Watermarks are supposed to be imperceptible (Cox et al., 2002). Given this fact, one must carefully measure the amount of distortion that the listener will perceive in a watermarked audio file, as compared to its unmarked counterpart.

Formal listening tests have been considered the only relevant method for judging audio quality, as traditional objective measures such as the signal-to-noise ratio (SNR) or total harmonic distortion (THD) (note 16) have never been shown to relate reliably to the perceived audio quality: they cannot be used to distinguish inaudible artifacts from audible noise (ITU, 2001; Kutter & Hartung, 2000; Thiede & Kabot, 1996). There is therefore a need to adopt an objective measurement test for the perceptibility of audio watermarking schemes.

Furthermore, one must be careful, as perceptibility must not be viewed as a binary condition (Arnold & Schilz, 2002; Cox et al., 2002). Different levels of perceptibility can be achieved by a watermarking scheme; that is, listeners will perceive the presence of the watermark in different ways. Auditory sensitivities vary significantly from individual to individual. As a consequence, any measure of perceptibility that is not binary should accurately reflect the probability of the watermark being detected by a listener.

In this section a practical and automated evaluation of watermark perceptibility is proposed. In order to do so, the human auditory system (HAS) is first described. Then a formal listening test is presented, and finally a psychoacoustical model for the automation of such a procedure is outlined.
Human Auditory System (HAS)

Figure 4, taken from Robinson (2002), presents the physiology of the human auditory system. Each one of its components is now described. The pinna directionally filters incoming sounds, producing a spectral coloration known as the head-related transfer function (HRTF). This function enables human listeners to localize the sound source in three dimensions. The ear canal filters the sound, attenuating both low and high frequencies. As a result, a resonance arises around 5 kHz. After this, the tympanic membrane (or ear drum) and the small bones known as the malleus and incus transmit the sound pressure wave through the middle ear. Together, the outer and middle ear perform a band-pass filter operation on the input signal.
Figure 4. Overview of the human auditory system (HAS)
The sound wave arrives at the fluid-filled cochlea, a coil within the ear that is partially protected by a bone. Inside the cochlea resides the basilar membrane (BM), which semi-divides it. The basilar membrane acts as a spectrum analyzer, as it divides the signal into frequency components. Each point on the membrane resonates at a different frequency, and the spacing of these resonant frequencies along the BM is almost logarithmic. The effective frequency selectivity is related to the width of the filter characteristic at each point.

The outer hair cells, distributed along the length of the BM, react to feedback from the brainstem. They alter their length to change the resonant properties of the BM. As a consequence, the frequency response of the membrane becomes amplitude dependent. Finally, the inner hair cells of the basilar membrane fire when the BM moves upward. In doing so, they transduce the sound wave at each point into a signal on the auditory nerve. In this way the signal is half-wave rectified. Each cell needs a certain time to recover between successive firings, so the average response during a steady tone is lower than at its onset. This means that the inner hair cells act as an automatic gain control.

The net result of the process described above is that an audio signal, which has a relatively wide bandwidth and a large dynamic range, is encoded for transmission along the nerves. Each one of these nerves offers a much narrower bandwidth and a limited dynamic range. In addition, a critical process has happened during these steps. Any information that is lost due to the transduction process within the cochlea is not available to the brain. In other words, the cochlea acts as a lossy coder. The vast majority of what we cannot hear is attributable to this transduction process (Robinson & Hawksford, 1999).

Detailed modeling of the components and processes just described will be necessary when creating an auditory model for the evaluation of watermarked audio. In fact, by representing the audio signal at the basilar membrane, one can effectively model what is actually perceived by a human listener.
Perceptual Phenomena

As was just stated, one can model the processes that take place inside the HAS in order to represent how a listener responds to auditory stimuli. Given its characteristics, the HAS responds differently depending on the frequency and loudness of the input. This means that not all components of a watermark are equally perceptible. Moreover, it also shows the need for a perceptual model to effectively measure the amount of distortion that is imposed on an audio signal when a mark is embedded. Given this fact, the main processes that need to be included in a perceptual model are presented in this section.
Sensitivity refers to the ear’s response to direct stimuli. In experiments designed to measure sensitivity, listeners are presented with isolated stimuli and their perception of these stimuli is tested. For example, a common test consists of measuring the minimum sound intensity required to hear a particular frequency (Cox et al., 2002). The main characteristics measured for sensitivity are frequency and loudness. The responses of the HAS are frequency dependent; variations in frequency are perceived as different tones. Tests show that the ear is most sensitive to frequencies around 3 kHz and that sensitivity declines at very low (20 Hz) and very high (20 kHz) frequencies. Regarding loudness, different tests have been performed to measure sensitivity. As a general result, one can state that the HAS is able to discern smaller changes when the average intensity is louder. In other words, the human ear is more sensitive to changes in louder signals than in quieter ones.

The second phenomenon that needs to be taken into account is masking. A signal that is clearly audible if presented alone can be completely inaudible in the presence of another signal, the masker. This effect is known as masking, and the masked signal is called the maskee. For example, a tone might become inaudible in the presence of a second, louder tone at a nearby frequency. In other words, masking is a measure of a listener’s response to one stimulus in the presence of another. Two different kinds of masking can occur: simultaneous masking and temporal masking (Swanson et al., 1998).

In simultaneous masking, both the masker and the maskee are presented at the same time and are quasi-stationary (ITU, 2001). If the masker has a discrete bandwidth, the threshold of hearing is raised even for frequencies below or above the masker. In the situation where a noise-like signal is masking a tonal signal, the amount of masking is almost frequency independent; if the sound pressure of the maskee is about 5 dB below that of the masker, then it becomes inaudible. For other cases, the amount of masking depends on the frequency of the masker.

In temporal masking, the masker and the maskee are presented at different times. Shortly after the decay of a masker, the masked threshold is closer to the simultaneous masking of this masker than to the absolute threshold (ITU, 2001). Depending on the duration of the masker, the decay time of the threshold can vary between 5 ms and 150 ms. Furthermore, weak signals just before loud signals are masked. The duration of this backward masking effect is about 5 ms.

The third effect that has to be considered is pooling. When multiple frequencies are changed rather than just one, it is necessary to know how to combine the sensitivity and masking information for each frequency. Combining the perceptibilities of separate distortions gives a single estimate for the overall change in the work. This is known as pooling. In order to calculate it, it is common to apply the formula:

$$D(A, A') = \left( \sum_i |d[i]|^p \right)^{1/p}$$

where d[i] is an estimate of the likelihood that an individual will notice the difference between A and A’ in a temporal sample (Cox et al., 2002). In the case of audio, a value of p = 1 is sometimes appropriate, which turns the equation into a linear summation.
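The pooling formula can be illustrated with a few lines of code. This is a minimal sketch: the function name `pool_distortions` and the sample values of d[i] are made up for illustration only.

```python
def pool_distortions(d, p=1.0):
    """Pool per-sample perceptibility estimates d[i] into a single value.

    Implements D = (sum_i |d[i]| ** p) ** (1 / p); p = 1 gives a linear sum.
    """
    return sum(abs(x) ** p for x in d) ** (1.0 / p)


# Made-up per-sample estimates, for illustration only.
d = [0.1, 0.0, 0.3, 0.2]
print(pool_distortions(d, p=1))  # linear summation
print(pool_distortions(d, p=4))  # larger p emphasizes the largest distortion
```

As p grows, the pooled value is increasingly dominated by the largest individual distortion, so the choice of p controls how strongly isolated artifacts are penalized relative to many small ones.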
ABX Listening Test

Audio quality is usually evaluated by performing a listening test. In particular, the ABX listening test is commonly used when evaluating the quality of watermarked signals. Other tests for audio watermark quality evaluation, such as the one described in Arnold and Schilz (2002), follow a similar methodology as well. Given this, it becomes desirable to create an automatic model that predicts the response observed from a human listener in such a procedure.

In an ABX test the listener is presented with three different audio clips: selection A (in this case the non-watermarked audio), selection B (the watermarked audio) and X (either the watermarked or the non-watermarked audio, drawn at random). The listener is then asked to decide if selection X is equal to A or B. The number of correct answers is the basis for deciding whether the watermarked audio is perceptually different from the original audio, in which case the watermarking algorithm is declared “perceptible”. Otherwise, if the watermarked audio is perceptually equal to the original audio, the watermarking algorithm is declared transparent, or imperceptible. In the particular case of Arnold and Schilz (2002), the level of transparency is assumed to be determined by the noise-to-mask ratio (NMR).

The ABX test is fully described in ITU Recommendation ITU-R BS.1116, and has been successfully used for the subjective measurement of impaired audio signals. Normally only one attribute is used for quality evaluation. This attribute is defined to represent any and all detected differences between the original signal and the signal under test. It is known as basic audio quality (BAQ), and is calculated as the difference between the grade given to the impaired signal and the grade given to the original signal. Each one of these grades uses the five-level impairment scale that was presented previously. Given this fact, values for the BAQ range between 0 and -4, where 0 corresponds to an imperceptible impairment and -4 to one judged as very annoying.

Although its results are highly reliable, there are many problems related to performing an ABX test for watermark quality evaluation. One of them is the subjective nature of the test, as the perception conditions of the listener may vary with time. Another problem arises from the high costs associated with the test.
These costs include the setup of audio equipment (note 17), the construction of a noise-free listening room, and the cost of employing individuals with extraordinarily acute hearing. Finally, the time required to perform extensive testing also poses a problem for this alternative.

Given these facts it becomes desirable to automate the ABX listening test, and to incorporate it into a perceptual model of the HAS. If this is implemented, then the task of measuring perceptibility can be fully automated, and thus watermarking schemes can be effectively and thoroughly evaluated. Fortunately, several perceptual models for audio processing have been proposed. Specifically, in the field of audio coding, psychoacoustic models have been successfully implemented to evaluate the perceptual quality of coded audio. These models can be used as a baseline performance tool for measuring the perceptibility of audio watermarking schemes; thus they are now presented.
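The text states that the number of correct answers in an ABX test is the basis for declaring a watermark perceptible, but does not say how that decision is made. One common way to do it, shown here only as a hedged sketch, is a one-sided binomial test against pure guessing. The function names and the 5% significance level `alpha` are assumptions introduced for illustration; they are not taken from ITU-R BS.1116 or from the benchmark described here.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of at least `correct` right answers in `trials` ABX rounds
    when the listener is purely guessing (success probability 0.5)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


def is_perceptible(correct, trials, alpha=0.05):
    """Declare the watermark perceptible if guessing is an unlikely explanation.

    `alpha` is an assumed significance level, not a value from the source.
    """
    return abx_p_value(correct, trials) < alpha


# A listener who identifies X correctly in 12 of 16 rounds.
print(abx_p_value(12, 16), is_perceptible(12, 16))
```

With this decision rule, a listener must answer correctly noticeably more often than half the time before the scheme is labelled perceptible, which guards against lucky guesses in short test sessions.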
A Perceptual Model

A perceptual model used for the evaluation of watermarked content must compare the quality of two different audio signals in a way that is similar to the ABX listening test. These two signals correspond to the original audio cover A and the watermarked audio file A’. An ideal system will receive both signals as input, process them through an auditory model, and compare the representations given by this model (Thiede et al., 1998). Finally, it will return a score for the watermarked file A’ on the five-level impairment scale. More importantly, the results of such an objective test must be highly correlated with those achieved under a subjective listening test (ITU, 2001). The general architecture of such a perceptual measurement system is depicted in Figure 5.

Figure 5. Architecture of a perceptual measurement system. The original audio signal A and the watermarked audio signal A’ are each processed by an auditory model; the two representations are then compared to produce an audio quality estimate.

The auditory model used to process the input signals will have a structure similar to that of the HAS. In general terms, the response of each one of the components of the HAS is modeled by a series of filters. In particular, a synopsis of the models proposed in Robinson and Hawksford (1999), Thiede and Kabot (1996), Thiede et al. (1998), and ITU (2001) is now presented.

The filtering performed by the pinna and ear canal is simulated by an FIR filter, which has been derived from experiments with a dummy head. More realistic approaches can use measurements from human subjects. After this prefiltering, the audio signal has to be converted to a basilar membrane representation. That is, the amplitude-dependent response of the basilar membrane needs to be simulated. In order to do this, the first step consists of processing the input signal through a bank of amplitude-dependent filters, each one adapted to the frequency response of a point on the basilar membrane. The center frequencies of the filters should be linearly spaced on the Bark scale, a commonly used frequency scale (note 18). The actual number of filters to be used depends on the particular implementation. Other approaches might use a fast Fourier transform to decompose the signal, but this creates a trade-off between temporal and spectral resolution (Thiede & Kabot, 1996).

At each point on the basilar membrane, its movement is transduced into an electrical signal by the hair cells. The firing of individual cells is pseudorandom, but when the individual signals are combined, the proper motion of the BM is derived. Simulating the individual response of
each hair cell and combining these responses is a difficult task, so other practical solutions have to be applied. In particular, Robinson and Hawksford (1999) implement a solution based on calculating the half-wave response of the cells, and then using a series of feedback loops to simulate the increased sensitivity of the inner hair cells to the onset of sounds. Other schemes might just convolve the signal with a spreading function, to simulate the dispersion of energy along the basilar membrane, and then convert the signal back to decibels (ITU, 2001). Independently of the method used, the basilar membrane representation is obtained at this point.

After a basilar membrane representation has been obtained for both the original audio signal A and the watermarked audio signal A’, the perceived difference between the two has to be calculated. The difference between the signals in each frequency band has to be calculated, and then it must be determined at what level these differences become audible for a human listener (Robinson & Hawksford, 1999). In the case of ITU Recommendation ITU-R BS.1387, this task is done by calculating a series of model variables, such as excitation, modulation and loudness patterns, and using them as input to an artificial neural network with one hidden layer (ITU, 2001). In the model proposed in Robinson and Hawksford (1999), this is done as a summation over time (over an interval of 20 ms) along with weighting of the signal and peak suppression.

The result of this process is an objective difference between the two signals. In the case of the ITU model, the result is given on a negative five-level impairment scale, just like the BAQ, and is known as the objective difference grade (ODG). For other models, the difference is given in implementation-dependent units. In both cases, a mapping or scaling function, from the model units to the ITU-R 500 scale, must be used. For the ITU model, this mapping could be trivial, as all that is needed is to add a value of five to the value of the ODG. However, a more precise mapping function could be developed. The ODG has a resolution of one decimal, and the model was not specifically designed for the evaluation of watermarking schemes. Given this, a nonlinear mapping (for example, one using a logarithmic function) could be more appropriate. For other systems, determining such a function will depend on the particular implementation of the auditory model; nonetheless, such a function should exist, as a correlation between objective and subjective measures was stated as an initial requirement. For example, in the case of Thiede and Kabot (1996), a sigmoidal mapping function is used. Furthermore, the parameters for the mapping function can be calculated using a control group consisting of widely available listening test data.

The resulting grade, on the five-level scale, is defined as the perceptibility of the audio watermark. In order to estimate the perceptibility of the watermarking scheme, several test runs must be performed. Again, these test runs should embed a random mark on a cover signal, and a large and representative set of audio cover signals must be used. The perceptibility test score is finally calculated by averaging the results obtained for each one of the individual tests.
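The trivial mapping and the final averaging step can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the ODG values are made up, and the ODG is assumed to lie between -4 and 0 as described for the BAQ. A nonlinear mapping fitted to listening test data, as suggested above, would replace `odg_to_grade`.

```python
def odg_to_grade(odg):
    """Trivial mapping from an ODG in [-4, 0] to the five-level scale [1, 5]."""
    if not -4.0 <= odg <= 0.0:
        raise ValueError("ODG is assumed to lie between -4 and 0")
    return odg + 5.0


def perceptibility_score(odgs):
    """Average the mapped grades obtained over several test runs."""
    grades = [odg_to_grade(x) for x in odgs]
    return sum(grades) / len(grades)


# Made-up ODG values from three hypothetical test runs.
print(perceptibility_score([-0.5, -1.0, -1.5]))
```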
FINAL BENCHMARK SCORE

In the previous sections, three different testing procedures have been proposed in order to measure the fidelity, robustness and perceptibility of a watermarking scheme. Each one of these tests has resulted in several scores, some of which may be more useful than others. In this section, these scores are combined in order to obtain a final benchmarking score. As a result, a fair comparison amongst competing technologies is possible, as a final watermarking scheme evaluation score is obtained.

In addition, another issue is addressed at this point: defining the specific parameters to be used for each attack while performing the robustness test. While the different attacks were explained in the sixth section, the strength at which they should be applied was not specified. As a general rule of thumb, it was just stated that these operations should be tested up to the point where noticeable distortion is introduced on the audio cover file.

As previously discussed, addressing these two topics can prove to be a difficult task. Moreover, a single answer might not be appropriate for every possible watermarking application. Given this fact, one should develop and use a set of application-specific evaluation templates to overcome this restriction. An evaluation template is defined as a set of guidelines that specifies the parameters to be used for the different tests performed, and also denotes the relative importance of each one of the tests performed on the watermarking scheme. Two fundamental concepts have been incorporated into that of evaluation templates: evaluation profiles and application-specific benchmarking.

Evaluation profiles have been proposed in Petitcolas (2000) as a method for testing different levels of robustness. Their sole purpose is to establish the set of tests and media to be used when evaluating a marking algorithm.
For example, Table 4, which summarizes the robustness requirements imposed by various organizations, constitutes a general-purpose evaluation profile. More specific profiles have to be developed when evaluating more specific watermarking systems. For example, one should test a marking scheme intended for advertisement broadcast monitoring with a set of recordings similar to those that will be used in a real-world situation. There is no point in testing such an algorithm with a set of high-fidelity musical recordings. Evaluation profiles are thus a part of the proposed evaluation templates.

Application-specific benchmarking, in turn, is proposed in Pereira et al. (2001) and Voloshynovskiy, Pereira, Iquise and Pun (2001), and consists of averaging the results of the different tests performed on a marking scheme, using a set of weights that is specific to the intended application of the watermarking algorithm. In other words, attacks are weighted as a function of applications (Pereira et al., 2001). In the specific case of the evaluation templates proposed in this document, two different sets of weights should be specified: those used when measuring one of the three fundamental characteristics of the algorithm (i.e., fidelity, robustness and perceptibility); and those used when combining these measures into a single benchmarking score. After the different weights have been established, the overall watermarking scheme score is calculated as a simple weighted average, with the formula:

$$\text{Score} = w_f \, s_f + w_r \, s_r + w_p \, s_p$$
where w represents the weight assigned to a test, s the score received on a test, and the subscripts f, r and p denote the fidelity, robustness and perceptibility tests respectively. In turn, the values of s_f, s_r and s_p are also determined using a weighted average of the different measures obtained on the specific subtests.

The use of an evaluation template is a simple yet powerful idea. It allows for a fair comparison of watermarking schemes, and for ease of automated testing. After these templates have been defined, one needs only to select the intended application of the watermarking scheme that is to be evaluated, and the rest of the operations can be performed automatically. Nonetheless, time has to be devoted to the task of carefully defining the set of evaluation templates for the different applications to be tested. A very simple, general-purpose evaluation template is shown next, as an example (see Example 1).
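The two-level weighted average described above can be sketched in a few lines. The final weights and the fidelity subtest weights below are those of the general-purpose template in Example 1; everything else is an assumption made for illustration: the function name, the dictionary layout, the idea that every score is normalized to the range 0 to 1, and all of the score values themselves.

```python
def weighted_average(scores, weights):
    """Weighted sum of `scores`, using one weight per named test."""
    return sum(weights[name] * scores[name] for name in weights)


# Weights from the general-purpose template (Example 1).
FINAL_WEIGHTS = {"fidelity": 1 / 3, "robustness": 1 / 3, "perceptibility": 1 / 3}
FIDELITY_WEIGHTS = {"quality": 0.75, "payload": 0.125, "speed": 0.125}

# Hypothetical subtest scores, assumed to be normalized to [0, 1].
fidelity_subscores = {"quality": 0.8, "payload": 1.0, "speed": 1.0}

scores = {
    "fidelity": weighted_average(fidelity_subscores, FIDELITY_WEIGHTS),
    "robustness": 0.7,      # hypothetical s_r
    "perceptibility": 0.9,  # hypothetical s_p
}

final_score = weighted_average(scores, FINAL_WEIGHTS)
print(round(final_score, 4))
```

Keeping the weights in a separate structure from the scores mirrors the idea of an evaluation template: switching to another application only requires swapping the weight tables.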
Presenting the Results

The main result of the benchmark presented here is the overall watermarking scheme score that has just been explained. It corresponds to a single numerical result. As a consequence, comparison between similar schemes is both quick and easy. Having such a comprehensive quality measure is sufficient in most cases.

Under some circumstances the intermediate scores might also be important, as one might want to know more about the particular characteristics of a watermarking algorithm, rather than compare it against others in a general way. For example, one might just be interested in the perceptibility score of the echo watermarking algorithm, or in the robustness against uniform noise of two different schemes. For these cases, the use of graphs, as proposed in Kutter and Hartung (2000) and Kutter and Petitcolas (1999, 2000), is recommended.

The graphs should plot the variation of two different parameters, with the remaining parameters fixed. That is, the test setup conditions should remain constant across different test runs. Finally, several test runs should be performed, and the results averaged. As a consequence, several combinations of variable and fixed parameters are possible, and thus several graphs can be plotted. Some of the most useful graphs, based on the discussion presented in Kutter and Petitcolas (1999), along with their corresponding variables and constants, are summarized in Table 5.
Example 1.

Application: General Purpose Audio Watermarking
Final Score Weights: Fidelity = 1/3, Robustness = 1/3, Perceptibility = 1/3

FIDELITY TEST

| Measure | Parameters | Weight |
| --- | --- | --- |
| Quality | N/A | 0.75 |
| Data Payload | Watermark length = 100 bits, score calculated as BER. | 0.125 |
| Speed | Watermark length = 50 bits, score calculated as 1 if embedding time is less than 2 minutes, 0 otherwise. | 0.125 |

ROBUSTNESS TEST

| Measure | Parameters | Weight |
| --- | --- | --- |
| D/A Conversion | D/A ↔ A/D twice. | 1/14 |
| Equalization | 10-band graphic equalizer with the following gains: 31 Hz: -6 dB; 62 Hz: +6 dB; 125 Hz: -6 dB; 250 Hz: +3 dB; 500 Hz: -6 dB; 1 kHz: +6 dB; 2 kHz: -6 dB; 4 kHz: +6 dB; 8 kHz: -6 dB; 16 kHz: +6 dB. | 1/14 |
| Band-pass filtering | 100 Hz – 6 kHz, 12 dB/oct. | 1/14 |
| Time stretch and pitch change | +/- 10% compression and decompression | 1/14 |
| Codecs | AAC, MPEG-4 AAC with perceptual noise substitution, MPEG-1 Audio Layer 3, Windows Media Audio, and Twin-VQ at 128 kbps. | 1/14 |
| Noise addition | Adding white noise with a constant level 40 dB lower than the total averaged music power (SNR: 40 dB) | 1/14 |
| Time scale modification | Pitch-invariant time scaling of +/- 4% | 1/14 |
| Wow and flutter | 0.5% rms, from DC to 250 Hz | 1/14 |
| Echo addition | Delay = 100 milliseconds, feedback coefficient = 0.5 | 1/14 |
| Down mixing | Stereo to mono, and Dolby surround | 1/14 |
| Sample rate conversion | 44.1 kHz to 16 kHz | 1/14 |
| Dynamic range reduction | Threshold of 50 dB, 16 dB maximum compression. Rate: 10-millisecond attack, 3-second recovery | 1/14 |
| Amplitude compression | 16 bits to 8 bits | 1/14 |

PERCEPTIBILITY TEST

| Measure | Parameters | Weight |
| --- | --- | --- |
| Watermark perceptibility | N/A | 1 |

Of special interest to some watermark developers is the use of receiver operating characteristic (ROC) graphs, as they show the relation between false positives and false negatives for a given watermarking system. “They are useful for assessing the overall behavior and reliability of the watermarking scheme being tested” (Petitcolas & Anderson, 1999). In order to understand ROC graphs, one should remember that a watermark decoder can be viewed
as a system that performs two different steps: first it decides if a watermark is present in the audio signal A’, and then it tries to recover the embedded watermark W. The first step can be viewed as a form of hypothesis testing (Kutter & Hartung, 2000), where the decoder decides between the alternative hypothesis (a watermark is present) and the null hypothesis (the watermark is not present). Given these two options, two different errors can occur, as was stated in the third section: a false positive and a false negative. ROC graphs plot the true positive fraction (TPF) on the Y-axis, and the false positive fraction (FPF) on the X-axis. The TPF is defined by the formula:

$$\mathrm{TPF} = \frac{TP}{TP + FN}$$

where TP is the number of true positive test results, and FN is the number of false negative test results. Similarly, the FPF is defined by:

$$\mathrm{FPF} = \frac{FP}{TN + FP}$$

where FP is the number of false positive results, and TN the number of true negative results. An optimal detector will have a curve that goes from the bottom left corner to the top left, and then to the top right corner (Kutter & Petitcolas, 2000). Finally, it must be stated that the same number of watermarked and unwatermarked audio samples should be used for the test, although false-positive testing can be time-consuming, as was previously discussed in this document.
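The points of an ROC graph can be computed directly from these two fractions, where FP counts false positives and TN counts true negatives. The sketch below assumes, purely for illustration, that the detector outputs a numeric confidence score and declares a watermark present when the score reaches a threshold; the function name, the score values and the thresholds are all hypothetical.

```python
def roc_points(marked_scores, unmarked_scores, thresholds):
    """Return one (FPF, TPF) pair per detection threshold.

    `marked_scores` are detector outputs on watermarked signals and
    `unmarked_scores` are outputs on unwatermarked signals. A watermark is
    declared present when the score is at least the threshold.
    """
    points = []
    for t in thresholds:
        tp = sum(s >= t for s in marked_scores)
        fn = len(marked_scores) - tp
        fp = sum(s >= t for s in unmarked_scores)
        tn = len(unmarked_scores) - fp
        points.append((fp / (tn + fp), tp / (tp + fn)))
    return points


# Made-up detector scores; equal numbers of marked and unmarked samples.
marked = [0.9, 0.8, 0.6, 0.4]
unmarked = [0.5, 0.3, 0.2, 0.1]
for fpf, tpf in roc_points(marked, unmarked, [1.0, 0.7, 0.45, 0.0]):
    print(fpf, tpf)
```

Sweeping the threshold from its highest to its lowest value traces the curve from the bottom left corner to the top right corner of the graph.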
Automated Evaluation

The watermarking benchmark proposed here can be implemented for the automated evaluation of different watermarking schemes. In fact, this idea has been included in the test design, and has motivated some key decisions, such as the use of a computational model of the ear instead of a formal listening test. Moreover, the establishment of an automated test for watermarking systems is an industry need. This assertion is derived from the following fact: to evaluate the quality of a watermarking scheme, one can choose one of the following three options (Petitcolas, 2000):

• Trust the watermark developer and his or her claims about watermark performance.
• Thoroughly test the scheme oneself.
• Have the watermarking scheme evaluated by a trusted third party.
Only the third option provides an objective solution to this problem, as long as the evaluation methodology and results are transparent to the public (Petitcolas et al., 2001). This means that anybody should be able to reproduce the results easily. As a conclusion, the industry needs to establish a trusted evaluation authority in order to objectively evaluate its watermarking products. The establishment of watermark certification programs has been proposed, and projects such as the Certimark and StirMark benchmarks are under development (Certimark, 2001; Kutter & Petitcolas, 2000; Pereira et al., 2001; Petitcolas et al., 2001). However, these programs seem to be aimed mainly at testing of image watermarking systems (Meerwald & Pereira, 2002). A similar initiative for audio watermark testing has yet to be proposed. Nonetheless, one problem remains unsolved: watermarking scheme developers may not be willing to give the source code for their embedding and recovery systems to a testing authority. If this is the situation, then both watermark embedding and recovery processes must be performed at the developer’s side, while the rest of the operations can be performed by the watermark tester. The problem with this scheme is that the watermark developer could cheat and always report the watermark as being recovered by the detector. Even if a basic zero knowledge protocol is used
in the testing procedure, the developer can cheat, as he or she will have access to both the original audio file A and the modified, watermarked file Ã that has been previously processed by the tester. The cheat is possible because the developer can estimate the value of the watermarked file A’, even if it has always been kept secured by the tester (Petitcolas, 2000), and then try to extract the mark from this estimated signal. Given this fact, one partial solution consists of giving the watermark decoder to the evaluator, while the developer maintains control over the watermark embedder, or vice versa (see endnote 19). Hopefully, as the need for thorough testing of watermarking systems increases, watermark developers will be more willing to give out access to their systems for thorough evaluation. Furthermore, if a common testing interface is agreed upon by watermark developers, then they will not need to release the source code for their products; a compiled library will be enough for practical testing of the implemented scheme if it follows a previously defined set of design guidelines. Nonetheless, it is uncertain whether the watermarking industry and community will undertake such an effort.
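To make the idea of a common testing interface more concrete, the sketch below shows one possible shape for it. Everything in it is an assumption for illustration only: the interface name `WatermarkScheme`, its two methods, the toy scheme, and the harness function are hypothetical and do not come from any existing benchmark. A real deployment would more likely wrap a compiled library behind the same two calls.

```python
import random
from typing import Callable, List, Protocol

Audio = List[float]  # assumption: mono samples as plain floats


class WatermarkScheme(Protocol):
    """Hypothetical interface a benchmark could require from every scheme."""

    def embed(self, audio: Audio, key: int) -> Audio: ...

    def detect(self, audio: Audio, key: int) -> bool: ...


class ToyAdditiveScheme:
    """Toy stand-in, not a real watermark: adds a key-seeded +/-1 pattern
    and detects it by correlating the signal with the same pattern."""

    def __init__(self, strength: float = 0.01) -> None:
        self.strength = strength

    def _pattern(self, n: int, key: int) -> List[float]:
        rng = random.Random(key)  # same key, same pattern
        return [rng.choice((-1.0, 1.0)) for _ in range(n)]

    def embed(self, audio: Audio, key: int) -> Audio:
        pattern = self._pattern(len(audio), key)
        return [a + self.strength * p for a, p in zip(audio, pattern)]

    def detect(self, audio: Audio, key: int) -> bool:
        pattern = self._pattern(len(audio), key)
        correlation = sum(a * p for a, p in zip(audio, pattern)) / len(audio)
        return correlation > self.strength / 2


def run_trial(scheme: WatermarkScheme, audio: Audio, key: int,
              attack: Callable[[Audio], Audio]) -> bool:
    """Embed, apply the tester's attack, and report whether the mark survives."""
    marked = scheme.embed(audio, key)
    return scheme.detect(attack(marked), key)


# Usage: a mild amplitude-scaling attack on a silent test signal.
silence = [0.0] * 1000
survived = run_trial(ToyAdditiveScheme(), silence, key=42,
                     attack=lambda x: [0.9 * s for s in x])
print("watermark survived:", survived)
```

Because the harness only depends on `embed` and `detect`, a tester could run the same attacks against any scheme that exposes these two operations, without seeing how they are implemented.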
CONCLUSION
Digital watermarking schemes can prove to be a valuable technique for copyright control of digital material. Different applications and properties of digital watermarks have been reviewed in this chapter, specifically as they apply to digital audio. However, a problem arises as different claims are made about the quality of the watermarking schemes being developed; every developer measures the quality of their respective schemes using a different set of procedures and metrics, making it impossible to perform objective comparisons among their products. As the problem just described can affect the credibility of watermarking system developers, as
well as the acceptance of this emerging technology by content owners, this document has presented a practical test for measuring the quality of digital audio watermarking techniques. The implementation and further development of such a test can prove to be beneficial not only to the industry, but also to the growing field of researchers currently working on the subject. Nonetheless, several problems arise while implementing a widely accepted benchmark for watermarking schemes. Most of these problems have been presented in this document, but others have not been thoroughly discussed. One of these problems consists of including the growing number of attacks against marking systems that are proposed every year. These attacks get more complex and thus their implementation becomes more difficult (Meerwald & Pereira, 2002; Voloshynovskiy, Pereira, Pun et al., 2001); nonetheless, they need to be implemented and included if real-world testing is sought. Another problem arises when other aspects of the systems are to be evaluated. For example, user interfaces can be very important in determining whether a watermarking product will be widely accepted (Craver et al., 2000). Their evaluation is not directly related to the architecture and performance of a marking system, but it certainly will have an impact on its acceptance. Legal constraints can also affect watermark testing, as patents might protect some of the techniques used for watermark evaluation. In other situations, the use of certain watermarking schemes in court as acceptable proofs of ownership cannot be guaranteed, and a case-by-case study must be performed (Craver, Yeo et al., 1998; Lai & Buonaiuti, 2000). Such legal attacks depend on many factors, such as the economic power of the disputing parties. While these difficulties are important, they should not be considered severe and must not undermine the importance of implementing a widely accepted benchmark for audio watermarking systems.
Instead, they show the need for further
development of the current testing techniques. The industry has seen that ambiguous requirements and unmethodical testing can prove to be a disaster, as they can lead to the development of unreliable systems (Craver et al., 2001). Finally, the importance of a specific benchmark for audio watermarking must be stated. Most of the available literature on watermarking relates to the specific field of image watermarking. In a similar way, the development of testing techniques for watermarking has focused on the marking of digital images. Benchmarks currently being developed, such as StirMark and Certimark, will be extended in the future to manage digital audio content (Certimark, 2001; Kutter & Petitcolas, 2000); however, this might not be an easy task, as the metrics used in these benchmarks have been optimized for the evaluation of image watermarking techniques. It is in this aspect that the test proposed in this document proves to be valuable, as it proposes the use of a psychoacoustical model in order to measure the perceptual quality of audio watermarking schemes. Other aspects, such as the use of a communications model as the base for the test design, are novel as well, and hopefully will be incorporated into the watermark benchmarking initiatives currently under development.
REFERENCES
Acken, J.M. (1998, July). How watermarking adds value to digital content. Communications of the ACM, 41, 75-77.
Arnold, M. (2000). Audio watermarking: Features, applications and algorithms. Paper presented at the IEEE International Conference on Multimedia and Expo 00.
Arnold, M., & Schilz, K. (2002, January). Quality evaluation of watermarked audio tracks. Paper presented at the Proceedings of the SPIE, Security and Watermarking of Multimedia Contents IV, San Jose, CA.
Bassia, P., & Pitas, I. (1998, August). Robust audio watermarking in the time domain. Paper presented at the 9th European Signal Processing Conference (EUSIPCO’98), Island of Rhodes, Greece.
Bender, W., Gruhl, D., Morimoto, N., & Lu, A. (1996). Techniques for data hiding. IBM Systems Journal, 35(5).
Boney, L., Tewfik, A.H., & Hamdy, K.N. (1996, June). Digital watermarks for audio signals. Paper presented at the IEEE International Conference on Multimedia Computing and Systems, Hiroshima, Japan.
Certimark. (2001). Certimark benchmark, metrics & parameters (D22). Geneva, Switzerland.
Chen, B. (2000). Design and analysis of digital watermarking, information embedding, and data hiding systems. Boston: MIT.
Chen, B., & Wornell, G.W. (1999, January). Dither modulation: A new approach to digital watermarking and information embedding. Paper presented at the SPIE: Security and Watermarking of Multimedia Contents, San Jose, CA.
Chen, B., & Wornell, G.W. (2000, June). Quantization index modulation: A class of provably good methods for digital watermarking and information embedding. Paper presented at the International Symposium on Information Theory ISIT-2000, Sorrento, Italy.
Cox, I.J., Miller, M.L., & Bloom, J.A. (2000, March). Watermarking applications and their properties. Paper presented at the International Conference on Information Technology: Coding and Computing, ITCC 2000, Las Vegas, NV.
Cox, I.J., Miller, M.L., & Bloom, J.A. (2002). Digital watermarking (1st ed.). San Francisco: Morgan Kaufmann.
Cox, I.J., Miller, M.L., Linnartz, J.-P.M.G., & Kalker, T. (1999). A review of watermarking principles and practices. In K.K. Parhi & T. Nishitani (Eds.), Digital signal processing in multimedia systems (pp. 461-485). Marcel Dekker.
Craver, S., Memon, N., Yeo, B.-L., & Yeung, M.M. (1998). Resolving rightful ownerships with invisible watermarking techniques: Limitations, attacks and implications. IEEE Journal on Selected Areas in Communications, 16(4), 573-586.
Craver, S., Memon, N., & Yeung, M.M. (1996). Can invisible watermarks resolve rightful ownerships? (RC 20509). IBM Research.
Craver, S., Perrig, A., & Petitcolas, F.A.P. (2000). Robustness of copyright marking systems. In F.A.P. Petitcolas & S. Katzenbeisser (Eds.), Information hiding: Techniques for steganography and digital watermarking (1st ed., pp. 149-174). Boston: Artech House.
Craver, S., Wu, M., Liu, B., Stubblefield, A., Swartzlander, B., Wallach, D.S., Dean, D., & Felten, E.W. (2001, August). Reading between the lines: Lessons from the SDMI challenge. Paper presented at the USENIX Security Symposium, Washington, DC.
Craver, S., Yeo, B.-L., & Yeung, M.M. (1998, July). Technical trials and legal tribulations. Communications of the ACM, 41, 45-54.
Czerwinski, S., Fromm, R., & Hodes, T. (1999). Digital music distribution and audio watermarking (IS 219). University of California - Berkeley.
Dannenberg, R., & Mazzoni, D. (2002). Audacity (Version 0.98). Pittsburgh, PA.
Dugelay, J.-L., & Roche, S. (2000). A survey of current watermarking techniques. In F.A.P. Petitcolas & S. Katzenbeisser (Eds.), Information hiding: Techniques for steganography and digital watermarking (1st ed., pp. 121-148). Boston: Artech House.
Gordy, J.D., & Burton, L.T. (2000, August). Performance evaluation of digital audio watermarking algorithms. Paper presented at the 43rd Midwest Symposium on Circuits and Systems, Lansing, MI.
Initiative, S.D.M. (2000). Call for proposals for Phase II screening technology, Version 1.0: Secure Digital Music Initiative.
ITU. (2001). Method for objective measurements of perceived audio quality (ITU-R BS.1387). Geneva: International Telecommunication Union.
JASRAC. (2001). Announcement of evaluation test results for “STEP 2001”, International evaluation project for digital watermark technology for music. Tokyo: Japan Society for the Rights of Authors, Composers and Publishers.
Johnson, N.F., Duric, Z., & Jajodia, S. (2001). Information hiding: Steganography and watermarking - Attacks and countermeasures (1st ed.). Boston: Kluwer Academic Publishers.
Johnson, N.F., & Katzenbeisser, S.C. (2000). A survey of steganographic techniques. In F.A.P. Petitcolas & S. Katzenbeisser (Eds.), Information hiding: Techniques for steganography and digital watermarking (1st ed., pp. 43-78). Boston: Artech House.
Katzenbeisser, S., & Veith, H. (2002, January). Securing symmetric watermarking schemes against protocol attacks. Paper presented at the Proceedings of the SPIE, Security and Watermarking of Multimedia Contents IV, San Jose, CA.
Katzenbeisser, S.C. (2000). Principles of steganography. In F.A.P. Petitcolas & S. Katzenbeisser (Eds.), Information hiding: Techniques for steganography and digital watermarking (1st ed., pp. 17-41). Boston: Artech House.
Kirovski, D., & Malvar, H. (2001, April). Robust covert communication over a public audio channel using spread spectrum. Paper presented at the Information Hiding Workshop, Pittsburgh, PA.
Kuczynski, A. (2000, January 6). Radio squeezes empty air space for profit. The New York Times.
Kutter, M., & Hartung, F. (2000). Introduction to watermarking techniques. In F.A.P. Petitcolas & S. Katzenbeisser (Eds.), Information hiding: Techniques for steganography and digital watermarking (1st ed., pp. 97-120). Boston, MA: Artech House.
Kutter, M., & Petitcolas, F.A.P. (1999, January). A fair benchmark for image watermarking systems. Paper presented at the Electronic Imaging ’99. Security and Watermarking of Multimedia Contents, San Jose, CA.
Kutter, M., & Petitcolas, F.A.P. (2000). Fair evaluation methods for image watermarking systems. Journal of Electronic Imaging, 9(4), 445-455.
Lai, S., & Buonaiuti, F.M. (2000). Copyright on the Internet and watermarking. In F.A.P. Petitcolas & S. Katzenbeisser (Eds.), Information hiding: Techniques for steganography and digital watermarking (1st ed., pp. 191-213). Boston: Artech House.
Meerwald, P., & Pereira, S. (2002, January). Attacks, applications, and evaluation of known watermarking algorithms with Checkmark. Paper presented at the Proceedings of the SPIE, Security and Watermarking of Multimedia Contents IV, San Jose, CA.
Memon, N., & Wong, P.W. (1998, July). Protecting digital media content. Communications of the ACM, 41, 35-43.
Mintzer, F., Braudaway, G.W., & Bell, A.E. (1998, July). Opportunities for watermarking standards. Communications of the ACM, 41, 57-64.
Mintzer, F., Magerlein, K.A., & Braudaway, G.W. (1996). Color correct digital watermarking of images.
Pereira, S., Voloshynovskiy, S., Madueño, M., Marchand-Maillet, S., & Pun, T. (2001, April). Second generation benchmarking and application oriented evaluation. Paper presented at the Information Hiding Workshop, Pittsburgh, PA.
Petitcolas, F.A.P. (2000). Watermarking schemes evaluation. IEEE Signal Processing, 17(5), 58-64.
Petitcolas, F.A.P., & Anderson, R.J. (1998, September). Weaknesses of copyright marking systems. Paper presented at the Multimedia and Security Workshop at the 6th ACM International Multimedia Conference, Bristol, UK.
Petitcolas, F.A.P., & Anderson, R.J. (1999, June). Evaluation of copyright marking systems. Paper presented at the IEEE Multimedia Systems, Florence, Italy.
Petitcolas, F.A.P., Anderson, R.J., & Kuhn, M.G. (1998, April). Attacks on copyright marking systems. Paper presented at the Second Workshop on Information Hiding, Portland, OR.
Petitcolas, F.A.P., Anderson, R.J., & Kuhn, M.G. (1999, July). Information hiding – A survey. Paper presented at the IEEE.
Petitcolas, F.A.P., Steinebach, M., Raynal, F., Dittmann, J., Fontaine, C., & Fatès, N. (2001, January 22-26). A public automated Web-based evaluation service for watermarking schemes: StirMark Benchmark. Paper presented at the Electronic Imaging 2001, Security and Watermarking of Multimedia Contents, San Jose, CA.
Piron, L., Arnold, M., Kutter, M., Funk, W., Boucqueau, J.M., & Craven, F. (1999, January). OCTALIS benchmarking: Comparison of four watermarking techniques. Paper presented at the Proceedings of SPIE: Security and Watermarking of Multimedia Contents, San Jose, CA.
RLE. (1999). Leaving a mark without a trace [RLE Currents 11(2)]. Available at http://rleweb.mit.edu/Publications/currents/cur11-1/11-1watermark.htm
Robinson, D.J.M. (2002). Perceptual model for assessment of coded audio. University of Essex, Essex.
Robinson, D.J.M., & Hawksford, M.J. (1999, September). Time-domain auditory model for the assessment of high-quality coded audio. Paper presented at the 107th Conference of the Audio Engineering Society, New York.
Secure Digital Music Initiative. (2000). Call for proposal for Phase II screening technology (FRWG 000224-01).
Swanson, M.D., Zhu, B., Tewfik, A.H., & Boney, L. (1998). Robust audio watermarking using perceptual masking. Signal Processing, 66(3), 337-355.
Thiede, T., & Kabot, E. (1996). A new perceptual quality measure for bit rate reduced audio. Paper presented at the 100th AES Convention, Copenhagen, Denmark.
Thiede, T., Treurniet, W.C., Bitto, R., Sporer, T., Brandenburg, K., Schmidmer, C., Keyhl, K., Beerends, J.G., Colomes, C., Stoll, G., & Feiten, B. (1998). PEAQ - der künftige ITU-Standard zur objektiven Messung der wahrgenommenen Audioqualität. Paper presented at the Tonmeistertagung Karlsruhe, Munich, Germany.
Voloshynovskiy, S., Pereira, S., Iquise, V., & Pun, T. (2001, June). Attack modelling: Towards a second generation benchmark. Paper presented at the Signal Processing.
Voloshynovskiy, S., Pereira, S., Pun, T., Eggers, J.J., & Su, J.K. (2001, August). Attacks on digital watermarks: Classification, estimation-based attacks and benchmarks. IEEE Communications Magazine, 39, 118-127.
Yeung, M.M. (1998, July). Digital watermarking. Communications of the ACM, 41, 31-33.
Zhao, J., Koch, E., & Luo, C. (1998, July). In business today and tomorrow. Communications of the ACM, 41, 67-72.
ENDNOTES
1 It must be stated that when information is digital there is no difference between an original and a bit-by-bit copy. This constitutes the core of the threat to art works, such as music recordings, as any copy has the same quality as the original. This problem did not exist with technologies such as cassette recorders, since the fidelity of a second-generation copy was not high enough to consider the technology a threat.
2 A test subject is defined as a specific implementation of a watermarking algorithm, based on one of the general techniques presented in this document.
3 It is implied that the transmission of a watermark is considered a communication process, where the content creator embeds a watermark into a work, which acts as a channel. The watermark is meant to be recovered later by a receiver, but there is no guarantee that the recovery will be successful, as the channel is prone to some tampering. This assumption will be further explained later in the document.
4 Or a copy of such, given the digital nature of the medium.
5 A cover is the same thing as a work. C, the set of all possible covers (or all possible works), is known as content.
6 This pattern is also known as a pseudo-noise (PN) sequence. Even though the watermark message and the PN-sequence are different, it is the latter one we refer to as the watermark W.
7 The fingerprinting mechanism implemented by the DiVX, where each player had an embedder rather than a decoder, constitutes an interesting and uncommon case.
8 This is in accordance with Kerckhoffs’ principle.
9 In the case of an audio recording, the symbol along with the owner name must be printed on the surface of the physical media.
10 The registration fee at the Office of Copyrights and Patents can be found online at: http://www.loc.gov/copyright.
11 In fact, the call for proposal for Phase II of SDMI requires this functionality (Initiative, 2000).
12 This is very similar to the use of serial numbers in software packages.
13 Some of the literature refers to this distortion as beating.
14 This is known as an oracle attack.
15 $C(x) = E\left((x - \bar{x})(x - \bar{x})\right)$
16 THD is the amount of undesirable harmonics present in an output audio signal, expressed as a percentage. The lower the percentage the better.
17 A description of the equipment used on a formal listening test can be found in Arnold and Schilz (2002).
18 1 Bark corresponds to 100 Hz, and 24 Bark correspond to 15000 Hz.
19 This decision will be motivated by the economics of the system; that is, by what part of the system is considered more valuable by the developer.
∗
This work was previously published in Multimedia Security: Steganography and Digital Watermarking Techniques for Protection of Intellectual Property, edited by C.-S. Lu, pp. 75-125, copyright 2005 by Idea Group Publishing (an imprint of IGI Global).
Chapter 1.4
Software Piracy:
Possible Causes and Cures
Asim El-Sheikh
The Arab Academy for Banking & Financial Sciences, Jordan
Abdullah Abdali Rashed
The Arab Academy for Banking & Financial Sciences, Jordan
A. Graham Peace
West Virginia University, USA
ABSTRACT
Software piracy costs the information technology industry billions of dollars in lost sales each year. This chapter presents an overview of the software piracy issue, including a review of the ethical principles involved and a summary of the latest research. In order to better illustrate some of the material presented, the results of a small research study in the country of Jordan are presented. The findings indicate that piracy among computer-using professionals is high, and that cost is a significant factor in the decision to pirate. Finally, some potential preventative mechanisms are discussed, in the context of the material presented previously in the chapter.
INTRODUCTION
Software piracy takes place when an individual knowingly or unknowingly copies a piece of software in violation of the copyright agreement associated with that software. Despite the best efforts of industry organizations, such as the Business Software Alliance (BSA) and the Software and Information Industry Association (SIIA), and extensive legislation in many countries, piracy is rampant in most parts of the world. While illegal copying has decreased in the past few years, most likely due to the activities mentioned above, it is estimated that piracy cost the
software industry a combined US$13 billion, in 2002 alone. Thirty-nine percent (39%) of all business application software installed in 2002 was pirated (BSA, 2003). This chapter will discuss the current state of the research into software piracy, focusing specifically on potential causes and cures. The results of a study of software piracy in the country of Jordan are presented, both to demonstrate the extent of the problem outside of the typically studied Western world, and as a basis for discussion of the theories and data presented in the rest of the chapter. It is hoped that this chapter will make the reader aware of the major issues involved in preventing piracy.
BACKGROUND
The growth of the importance of software in both the personal and professional worlds has led to a corresponding increase in the illegal copying of software. While academic research often splits illegal software copying into “software piracy” (the act of copying software illegally for business purposes) and “softlifting” (the act of copying software illegally for personal use), this chapter will use the term “software piracy” to encompass both activities, as is often done in the popular press. The following provides an overview of the ethical issues involved in the decision to pirate and the results of previous research.
Ethics of Piracy
The ethics of piracy are not as cut and dried as they may first seem. By definition, when piracy is committed, the copyright agreement or software license is violated, clearly breaking the law. However, does that make the act unethical? Obviously, the fact that something is illegal does not necessarily make it unethical, and vice versa (many laws have been overturned when their unethical nature became apparent, such as laws governing
slavery). Also, in the case of digital products, such as software, we are faced with the unique situation where the product can be replicated at virtually no cost and without “using up” any of the original version. So, while software piracy is technically stealing, it is quite different in nature than the stealing of a material item, where the original owner is then denied the usage of the item taken. In the case of illegal software copying, several ethical issues come into play. In one of the few studies utilizing ethical theory to study the piracy problem, Thong and Yap (1998) found that entry-level IS personnel use both utilitarian and deontological evaluations to arrive at an ethical decision regarding whether or not to pirate. The authors concluded that efforts to encourage ethical behavior in IS personnel should include training in ethical analysis and enforcement of an organizational code of ethics. From a utilitarian or consequentialist perspective, where the focus is on the results of the action more so than the action itself, arguments can be made that an individual act of piracy is not unethical. Assume that an individual can significantly improve his or her productivity in the workplace by installing a pirated copy of Microsoft Excel. While the employee completes the same amount of work in a single day, he or she is now able to leave work earlier and spend more time with his or her family, thus increasing their happiness. If the organization was not going to purchase the software under any circumstances, it is difficult to claim that Microsoft is financially damaged, as no sale would have taken place. In any case, one further sale of Excel would do little to impact Microsoft’s overall profits and most likely would not outweigh the good created by the employee playing with his or her children for an extra hour or so each day. In the end, the individual and his or her family benefit, while the creator of the software is not significantly harmed. 
The organization, and even society, may also benefit, as the individual and his family will be happier and the employee will
be under less stress to complete things on time. From a utilitarian viewpoint, the benefits of this single case of piracy may outweigh the costs, implying that the act is ethical in nature. Some researchers have claimed that software piracy may even benefit software companies, as individuals who would never have been exposed to a software product are given the opportunity to try the software at no cost, which may lead to future purchases of the product if it benefits the user (Givon, Mahajan, & Muller, 1995). This is similar to the concept of providing trial versions of products. If this is the case, the utilitarian arguments defending piracy behavior are strengthened, although further study of this claim is required. However, what if everyone pirated software instead of just one individual? The situation now changes dramatically. Software manufacturers would see a drastic reduction in income and would eventually have to either go out of business or greatly reduce their activities. The rapid pace of technological growth seen over the past two decades would slow down significantly. Open source products, such as Linux, have demonstrated that a non-profit software industry can still lead to technological advancement, but it is hard to imagine the advance continuing at the same pace with no profit motive in place. Even programmers have to eat. Therefore, a single act of piracy in a situation where the software would never have been purchased seems the easiest to defend from an ethical standpoint. However, if the piracy is replacing a potential legitimate purchase, the equation is changed. Piracy of this type on any large scale would lead to serious damage to the software industry which, in turn, would negatively impact future software development. It could certainly be argued that the costs would outweigh the benefits. From a deontological perspective, things are somewhat clearer.
Deontologists argue that the act itself is ethical or unethical, regardless of the outcomes. In the case of piracy, the facts are
clear—the software corporation has expended its research and development money to create the software, usually for the purposes of recouping the development costs and creating an income stream. These corporations legally create software licensing agreements into which purchasers enter voluntarily when they purchase the software. Those agreements, in most cases, prohibit the unauthorized copying of the software for purposes other than backing up the software. As the purchase is voluntary and certainly not a necessity of life, one has to argue that the purchaser is ethically bound to abide by the licensing agreement. The fact that so many individuals and organizations have voluntarily purchased software and abided by the licensing agreements, without major complaint, is further evidence that these licenses are generally accepted to be fair and ethical. Therefore, allowing for that, copying software in violation of the agreement is unethical—it is the same as breaking any other contract where both sides, in full knowledge of the situation, voluntarily enter into an agreement to abide by a set of rules. Breaking those rules, especially unbeknownst to the other party, is clearly an unethical act, as it violates the other entity’s trust. It may not be stealing in the material sense, but it is a violation of a voluntary contract, nonetheless. Looked at another way, using Immanuel Kant’s Categorical Imperative, we want people to act in a way that is universally applicable (i.e., the way in which we would want all people to act, in that situation). In the case of standard legal business agreements, we certainly cannot envision a situation where we would want all people to violate those agreements, especially in secret. Therefore, it must be unethical to break the software licensing agreement by copying the software illegally or using an illegally copied version of the software against the software creator’s wishes. 
One interesting caveat to this discussion is the role of cultural norms. In the Western world, it is commonly accepted that the creator of intellectual property is granted rights to exploit that property
for financial gain, if he or she so wishes. The foundations of copyright and trademark law are based on the view of ownership. Just as material items can be owned, so can intellectual property, and the right of ownership can be protected by legal and ethical means. Given that the technology industry developed primarily in the United States and Western Europe, it is not surprising that the legal concepts of intellectual property rights were developed in parallel. However, in many other cultural traditions, most notably in Asia, the concept of individual ownership of intellectual property is not as common. For example, while in the Western world artists are rewarded and recognized for creating unique works and often criticized for “copying,” in many Eastern traditions, success can be gained through the replication of works and styles created by previous masters. In another major difference, the focus in many Eastern societies is on the collective, as opposed to the individual. In the US, in particular, individualism is encouraged and rewarded. Uniqueness is seen as a strength, in many cases, whereas in Asian culture, it is much more important to assume the proper role in the group. Individualism is often seen as a negative, and people strive to become part of the whole; individualism is sacrificed for the benefit of the group. In a culture such as this, it is easy to see how the concept of individual ownership of a virtual property, especially one that can be copied and distributed at no cost to the originator, can be difficult to establish. Hence, it is not surprising to see that countries such as Vietnam (95%), China (92%), and Indonesia (89%) lead the world in terms of software piracy rates (BSA, 2003). The cultures of these countries have a different concept of intellectual property than the cultures of Western Europe and North America. This leads to the idea of cultural relativism, which states that ethics are based on a society’s culture. 
Therefore, individuals in cultures with different attitudes and norms can undertake completely opposite acts, although both could be acting
ethically. While the concept of intellectual property in Western culture makes it easy to claim that piracy is unethical, it may be that cultural norms in societies like those found in Asia are such that the act of piracy is simply not seen as unethical. As the global marketplace becomes a reality, and Western business concepts are embraced across the international spectrum (witness China’s recent admission into the World Trade Organization), it seems inevitable that Western concepts of intellectual property will have to be accepted by other cultures and their corresponding legal systems. However, it may be a slow process and will require well-developed educational programs. Until the time that intellectual property rights are fully understood and accepted into non-Western cultures, the initial rush to judgment regarding the unethical nature of software copying in those societies must be tempered with an understanding of the cultural traditions in which those ethics were developed.
Previous Research
In recent years, a small research stream has developed in the academic literature regarding the causes and potential cures of piracy. Not surprisingly, initial studies focused on the extent of the problem. Shim and Taylor (1989) found that more than 50% of managers admitted to copying software illegally, consistent with a later study of computer-using professionals by Peace (1997). Several other studies found piracy to be common among college students (e.g., Oz, 1990; Paradice, 1990). Males have been found to commit piracy more often than females, while age has been found to be negatively correlated with piracy (i.e., younger people copy software illegally more often than older people) (Sims, Cheng, & Teegen, 1996). Combined with the yearly reports by the BSA and SIIA, these findings make it evident that a significant percentage of computer users are pirating software and that the software industry faces billions of dollars in lost sales each year.
Software Piracy
Figure 1. Model of software piracy behavior (Source: Peace et al., 2003). [Figure not reproduced. Constructs shown: Punishment Severity, Software Cost, Punishment Certainty, Attitude, Subjective Norms, Perceived Behavioral Control, and Piracy Intention.]
More recently, studies have focused on the causes of piracy. In one of the initial attempts to build a model of piracy behavior, Christensen and Eining (1991) utilized the Theory of Reasoned Action (TRA). TRA posits that a person’s behavioral intention is the leading predictor of whether or not the person will carry out that behavior. In other words, if someone intends to do something, then he or she probably will. Intention, in turn, is predicted by the individual’s subjective norms (i.e., the perception of pressures from the external environment, such as peer norms) and the individual’s attitude towards the behavior (positive or negative, based upon the perceived consequences of the behavior). The authors found that attitude and peer norms are directly related to piracy behavior (although they did not utilize a construct for intention in their study).
TRA has been expanded to include the concept of perceived behavioral control: the individual’s perception of his or her ability to actually undertake the behavior in question (Ajzen, 1991). The resulting theory is known as the Theory of Planned Behavior (TPB), and it has been empirically tested in many situations, with successful results. In the most recent major study of piracy behavior, Peace, Galletta, and Thong (2003) used TPB as a base for the development of a more complete model of piracy behavior (Figure 1). Economic Utility Theory (EUT) and Deterrence Theory were utilized to identify the antecedents of the main TPB constructs, including the cost of the software, the severity of potential punishment (punishment severity), and the probability of being punished (punishment certainty). Each was found to be an important factor in the decision to pirate, and the model was found to account for 65% of the variance in piracy intention.
Research into software piracy has come a long way from its humble beginnings in the late 1980s. The model developed by Peace et al. (Figure 1) is a major step forward from the first attempts to identify the factors that lead to the decision to pirate. We will return to the discussion of these factors, and what they tell us about piracy prevention, later in the chapter. The next section details a study of software piracy in the little-analyzed country of Jordan.
THE JORDAN STORY
Almost all academic piracy research to date has focused on the industrialized nations of Europe, Asia, and North America. To add interest to the discussion of software piracy’s causes and potential cures, the authors undertook a small study of piracy behavior in the country of Jordan. Jordan entered the World Trade Organization in 2000 and signed a free trade accord with the United States in the same year. An association agreement was signed with the European Union in 2001, leading to increases in trade and foreign investment. Eighty-three percent (83%) of the workforce is employed in the services industry, and approximately 212,000 of the country’s population of 5.5 million have regular Internet access (CIA World Factbook, 2004).
Background
The BSA’s statistics indicate that software piracy is prevalent in the Middle East, although there have been signs of significant improvement over the past several years. From a high of 84% in 1994, piracy rates have decreased to 50% in 2002, representing a dollar loss to the software industry of US$141 million (BSA, 2003). This is a small number, when compared to the nearly US$5.5 billion in losses sustained in the Asian market, or the US$2.2 billion lost in North America, which perhaps accounts for the lack of detailed research into software piracy in Middle Eastern countries.
Jordan has a small but growing information technology industry, currently employing approximately 10,000 people and generating US$167 million in annual revenue (Usaid.gov, 2004). The government has placed a clear emphasis on developing this sector, and also on reducing piracy. In 1999, Jordan’s parliament amended the country’s 1992 Copyright Law and passed various regulations to better protect intellectual property. Two years later, King Abdullah received a special award from the BSA for his efforts to enforce the country’s copyright and trademark laws. Largely due to these efforts, software piracy in Jordan has seen a steady decline since 1994, when rates reached 87%. By 2002, piracy rates had dropped to 64%, although the total losses to the software industry had risen, from US$2.2 million in 1994 to US$3.5 million in 2002 (BSA, 2003).
Method
For the purposes of this study, questionnaires were distributed to a sample of adults taking graduate-level evening classes at the Arab Academy for Banking and Financial Services in Amman, Jordan. Engineers and programmers in the telecommunications industry in Amman were also surveyed. No incentives were given for completing the questionnaire, and all respondents were promised anonymity. Almost all of the respondents were employed. This sample was chosen as it provided an available group of business professionals with the ability, opportunity, and knowledge to use computer technology. All of the respondents indicated some training with computers during their education, and 53% stated that they worked with computer technology on a daily basis.
Results
One hundred and two questionnaires were distributed and 98 were returned. However, 12 surveys were deemed unusable, as those respondents indicated that they did not use computers, either at work or at home, leaving 86 surveys for a usable response rate of 84.3%. Eighty-six percent (86%) of the respondents were male, which is not surprising given the make-up of the workforce in both the software industry and the Middle East in general, both of which are male dominated. Twenty-four (28%) of the respondents ranged in age from 20 to 25 years old, 37 (43%) ranged in age from 25 to 30 years old, and the remaining 25 (29%) were older than 30 years of age at the time of the survey. The sample was well educated, with 58% of the respondents holding a bachelor’s degree, and a further 37% holding a master’s degree or higher. The majority of the respondents (64%) were employees in industry, 20% were students only, and 15% were either university personnel or privately employed.
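The response figures reported above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the variable names and age-group labels are ours, and the counts are those stated in the text.

```python
# Counts as reported in the survey results above.
distributed = 102
returned = 98
unusable = 12  # respondents who did not use computers at work or at home

usable = returned - unusable                 # surveys kept for analysis
usable_response_rate = usable / distributed  # usable surveys per questionnaire distributed

# Age groups as reported (counts of usable respondents); labels are ours.
age_groups = {"20 to 25": 24, "25 to 30": 37, "over 30": 25}
age_shares = {label: round(100 * n / usable) for label, n in age_groups.items()}

print(f"usable surveys: {usable}")
print(f"usable response rate: {usable_response_rate:.1%}")
print(f"age shares (%): {age_shares}")
```

Note that the usable response rate is computed against the number of questionnaires distributed, not the number returned.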
Forty-six (53%) of the respondents had a computer at home, 18 (21%) used a computer at work, and 22 (26%) had computers available both at home and in the workplace. Almost half of the respondents had used computers for more than six years, while all respondents had used computers for at least one year. When asked about their knowledge of the laws regarding software copying, 86% reported understanding the concept of software piracy, while 13% reported no knowledge of the issue. Of those who reported an understanding of the subject, 24% reported learning about piracy at school, 41% from media reports, and 34% reported knowledge from both sources.
Rather surprisingly, the respondents were very open about their software copying habits. A troubling 80% of the respondents admitted to using illegally copied software. When asked for the main reason behind their usage of pirated software, price was the number one issue raised. Sixty-one percent (61%) of the respondents listed the cost of software as the main reason for committing piracy. A further 18% responded that they simply saw no reason for paying when the software was available for free. Seventy-eight percent (78%) stated that they were satisfied with their decision to pirate software, while the remaining 22% admitted to some dissatisfaction or guilt associated with their choice.
The respondents were asked to list the source of their pirated software. Eighty-six percent (86%) received software from friends or colleagues both within and outside of their organization. Surprisingly, 17% stated that their pirated software came with the PC that they had purchased, and 3.5% claimed to have received pirated software from a software industry professional, indicating that the problem is inherent in the supply chain. This may relate to the discussion of cultural relativism, described above.
When their attitudes were studied further, 76% stated that it is “fair” to be asked to pay for software, since software companies had expended effort to produce the product. Also, 74% thought that it was necessary to require the purchasing of software, in order to sustain the software industry. However, only 3.5% of the respondents stated that they had personally purchased legal software from a technology company in the past.
Discussion
The most obvious result of the survey confirms the findings of the BSA. Piracy is a serious problem in the Middle East, and the act of piracy is not seen in a negative light. The piracy rate found in this survey is much higher than the 64% found by the BSA, most likely due to the sample utilized: computer-using professionals who have the knowledge, skills, and opportunity to pirate. While the Middle East is not a major user of software, when compared to the industrialized nations of Europe and North America, the numbers are still significant and indicative of the work that must be done to combat illegal software copying. Perhaps most disturbing is the fact that 78% of the respondents seemed to show no remorse, despite the fact that 86% claimed to understand the concept of piracy, indicating that they knowingly committed an illegal act. It is also interesting to note that the majority of the software pirates believed that being asked to pay for software is fair, and even necessary to maintain the software industry, showing an obvious conflict between their views and actions. There is clearly a lot of work to be done if piracy is to be fully understood and prevented in the future.
POTENTIAL CURES
The SIIA and BSA have undertaken a two-pronged approach to reducing the problem of piracy: enactment and enforcement of applicable laws (i.e., punishment as a deterrent), and education of organizations and individuals as to the ethical and legal implications of pirating. There is evidence from the academic literature that each of these efforts is useful.
In particular, punishment is an important factor. Peace et al. (2003) found that the level of punishment is directly related to the individual’s attitude towards piracy: the higher the perceived level of punishment, the more negative the individual’s attitude, and the less likely the individual will be to intend to pirate. In fact, punishment levels are quite high. In the US, for example, punishment can include jail time and fines of up to US$250,000. However, do people truly believe that they, personally, will incur these punishments? High punishment levels are not enough; the individual must perceive the levels to be high, and they must also perceive that pirates are likely to be caught. When looking at the case of Jordan, the fact that 80% of the individuals surveyed freely admitted to copying software illegally gives the impression that they do not perceive the risks of being punished to be high. In reality, while the efforts of the BSA and SIIA to bring pirates to justice have led to some highly publicized convictions, the fact is that most pirates are not caught and freely commit the crime with no negative consequences.
The perceptions of punishment severity (the level of punishment) and punishment certainty (the chance of incurring punishment) relate to the education efforts of the industry trade groups. While the unethical nature of the act is important, making potential pirates aware of the possible punishments has been a main focus of the BSA. One look at its Web page (http://www.bsa.org) quickly makes the individual aware of the organization’s tactic to publicize the potential punishments that pirates face and the fact that some individuals and organizations are actually being caught. There is a clear goal of increasing the individual’s perception of the levels of punishment certainty and severity.
In the case of Jordan, over 73% of the respondents indicated that they had become informed of the issue of piracy at least partially through the media, which indicates that the campaign of industry groups is working: the word is being spread. However, with over a quarter of respondents claiming no information from the media, and 13% claiming to have no knowledge of the issue at all (keeping in mind that many of these respondents work in the technology industry), there is still work to be done.
On an organizational level, punishment severity and certainty can also be useful tools. Companies wishing to reduce piracy can use punishment effectively; auditing can be carried out to find pirated software, and those committing piracy can be punished. Similarly, research shows that peer norms are a factor in the decision to pirate (e.g., Peace et al., 2003). Establishing a corporate culture that promotes only the legal use of software, combined with punishment for those who do not comply, can greatly reduce piracy in an organization. As suggested in the literature, corporate codes of conduct can aid in this endeavor (e.g., Thong & Yap, 1998).
Software cost is a more interesting aspect of the problem. As can be seen in the study results, cost is a significant factor in the decision to pirate. Sixty-one percent (61%) of those admitting to piracy listed software cost as the major reason. It would not be surprising to find that cost is more of an issue in a country such as Jordan, with an annual per capita GDP of US$4,300, as opposed to the US, where per capita GDP is a much greater US$33,600 (CIA World Factbook, 2004). Peace et al.’s (2003) model also found that price plays a significant role in the piracy decision. Some interesting suggestions have been made in this area. Clearly, incomes differ in various parts of the world. Therefore, the importance of the cost of the software may vary on a regional level, based on things such as per capita GDP and income. Researchers, such as Gopal and Sanders (2000) and Moores and Dhillon (2000), have suggested that price discrimination strategies could be used as a tool to combat piracy.
In countries with lower per capita incomes or GDPs, such as Jordan, reduced prices could be used to limit the incentive to pirate. This is an area very deserving of future study.
Another area much deserving of future research is the impact of local culture on piracy. As stated above, cultural relativism in the area of ethics is a potential issue, as some cultures do not have a history of protecting intellectual property rights, and the concept of intellectual property ownership is mainly a Western ideal. Also, most major research to date has focused on the industrialized countries of Europe and North America. Gopal and Sanders (1998) have called for further study of the cross-cultural aspects of piracy, and a fruitful research stream awaits those willing to focus on this area of the problem.
CONCLUSION
This chapter provides an overview of the topic of software piracy, including the results of a study of illegal software copying in the country of Jordan. Piracy costs the software industry billions of dollars each year, but through the two-pronged approach of education and enforcement, industry groups such as the BSA and the SIIA have managed to greatly reduce piracy worldwide. However, the issue of software cost appears to be a major factor in the decision to pirate, indicating that price discrimination strategies may have to be used to truly impact illegal software copying in much of the world, and cultural relativism may make changing habits difficult in some societies.
Looking into the future, the case of software piracy provides insight into what is quickly becoming a larger intellectual property rights issue: the illegal downloading of both music and video files via the Internet. Not including Internet downloads, it is estimated that piracy of CDs and cassettes cost the entertainment industry US$4.6 billion in 2002 (IFPI, 2003). There are many similarities between software piracy and entertainment piracy, and the lessons learned in the software arena can provide insight into how to deal with this new issue. With the spread of technologies such as Kazaa and BitTorrent, the ability to copy any digital product quickly, easily, and almost anonymously threatens the value of the intellectual property that has created great wealth for Bill Gates and David Bowie alike. It is imperative that the ethical, legal, and technological factors involved are studied further, so that prevention and protection strategies can be devised to protect the rights of those creating intellectual property.
REFERENCES
Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50, 179-211.
Business Software Alliance (BSA) (2003). Eighth Annual BSA Global Software Piracy Study. Washington, DC: Business Software Alliance.
Central Intelligence Agency (CIA) (2004). CIA World Factbook (2004). U.S. Central Intelligence Agency. Washington, DC. Retrieved from http://www.cia.gov/cia/publications/factbook/
Christensen, A., & Eining, M. (1991). Factors influencing software piracy: Implications for accountants. Journal of Information Systems, 5, 67-80.
Givon, M., Mahajan, V., & Muller, E. (1995). Software piracy: Estimation of lost sales and impact on software diffusion. Journal of Marketing, 59, 29-37.
Gopal, R., & Sanders, G. (1998). International software piracy: Analysis of key issues and impacts. Information Systems Research, 9(4), 380-397.
Gopal, R., & Sanders, G. (2000). Global software piracy: You can’t get blood out of a turnip. Communications of the ACM, 43(9), 83-89.
International Federation of the Phonographic Industry (IFPI) (2003). The recording industry commercial piracy report 2003. London: International Federation of the Phonographic Industry.
Moores, T., & Dhillon, G. (2000). Software piracy: A view from Hong Kong. Communications of the ACM, 43(12), 88-93.
Oz, E. (1990). The attitude of managers-to-be toward software piracy. OR/MS Today, 17, 24-26.
Paradice, D.J. (1990). Ethical attitudes of entry-level MIS personnel. Information & Management, 18, 143-151.
Peace, A.G. (1997). Software piracy and computer-using professionals: A survey. Journal of Computer Information Systems, 38(1), 94-99.
Peace, A.G., Galletta, D.F., & Thong, J.Y.L. (2003). Software piracy in the workplace: A model and empirical test. Journal of Management Information Systems, 20(1), 153-178.
Shim, J.P., & Taylor, G.S. (1989). Practicing managers’ perception/attitudes toward illegal software copying. OR/MS Today, 16, 30-33.
Sims, R.R., Cheng, H.K., & Teegen, H. (1996). Toward a profile of student software piraters. Journal of Business Ethics, 15, 839-849.
Thong, J.Y.L., & Yap, C.S. (1998). Testing an ethical decision-making theory: The case of softlifting. Journal of Management Information Systems, 15(1), 213-237.
Usaid.gov (2004). USAID supports Jordan’s information, communication and technology sector. Retrieved from: http://www.usaid.gov/locations/asia_near_east/countries/jordan/ictjordan.html
This work was previously published in Information Ethics: Privacy and Intellectual Property, edited by L. Freeman and A. G. Peace, pp. 84-99, copyright 2005 by Information Science Publishing (an imprint of IGI Global).
Chapter 1.5
Administering the Semantic Web:
Confidentiality, Privacy, and Trust Management
Bhavani Thuraisingham, The University of Texas at Dallas, USA
Natasha Tsybulnik, The University of Texas at Dallas, USA
Ashraful Alam, The University of Texas at Dallas, USA
ABSTRACT
The Semantic Web is essentially a collection of technologies to support machine-understandable Web pages as well as Information Interoperability. There has been much progress made on the Semantic Web, including standards for eXtensible Markup Language, Resource Description Framework, and Ontologies. However, administration policies and techniques for enforcing them have received little attention. These policies include policies for security, privacy, data quality, integrity, trust, and timely information processing. This article discusses administration policies for the Semantic Web as well as techniques for enforcing them. In particular, we will discuss an approach for ensuring confidentiality, privacy, and trust for the Semantic Web. We will also discuss the inference and privacy problems within the context of administration policies.
Copyright © 2008, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
INTRODUCTION
A Semantic Web can be thought of as a Web that is highly intelligent and sophisticated so that one needs little or no human intervention to carry out tasks such as scheduling appointments, coordinating activities, searching for complex documents, as well as integrating disparate databases and information systems (Lee, 2001). Recently there have been many developments on the Semantic Web (see, for example, Thuraisingham, 2002). The World Wide Web Consortium (W3C) is specifying standards for the Semantic Web. These standards include specifications for XML (eXtensible Markup Language), RDF (Resource Description Framework), and Ontologies. While much progress has been made toward developing such an intelligent Web, there is still a lot to be done in terms of security, privacy, data quality, integrity, and trust management. It is critical that the Semantic Web be secure and trustworthy; that is, the components that constitute the Semantic Web have to be secure. The components include XML, RDF, and Ontologies. In addition, we need secure information integration. We also need to examine trust issues for the Semantic Web. Essentially what we need is a set of administration policies as well as techniques for enforcing these policies for the Semantic Web.
This article focuses on administration issues for the Semantic Web with an emphasis on confidentiality, privacy, and trust management. In the case of security policies, which we will also call confidentiality policies, we will discuss XML security, RDF security, and secure information integration. We also discuss privacy for the Semantic Web. Trust management issues include the extent to which we can trust the users and the Web sites to enforce security and privacy policies.
The organization of this article is as follows. Our definitions of confidentiality, privacy, and trust, as well as the current status on administering the Semantic Web, will be discussed first. This will be followed by a discussion of our proposed framework for securing the Semantic Web, which we call CPT (Confidentiality, Privacy, and Trust). Next we will take each of the features, Confidentiality, Privacy, and Trust, and discuss various aspects as they relate to the Semantic Web. An integrated architecture for CPT, as well as inference and privacy control, will also be discussed. Finally, the article is summarized and future directions are given.
TRUST, PRIVACY, AND CONFIDENTIALITY
Definitions
Confidentiality, privacy, trust, integrity, and availability are briefly defined here, with an examination of how these issues relate to trust management and the inference problem. Confidentiality is the prevention of the unauthorized release of information. Privacy is a subset of confidentiality: it is the prevention of the unauthorized release of information about an individual. Integrity of data is the prevention of any modifications made by an unauthorized entity. Availability is the prevention of unauthorized omission of data. Trust is a measure of confidence in data correctness and legitimacy from a particular source. Integrity, availability, and trust are all very closely related in the sense that data quality is of particular importance, and all require individuals or entities processing and sending information to not alter the data in an unauthorized manner. If all of these issues (confidentiality, privacy, trust, integrity, and availability) are guaranteed, a system can be considered secure. Thus if the inference problem can be solved such that unauthorized information is not released, the rules of confidentiality, privacy, and trust will not be broken. A technique such as inference can either be used to aid or impair the cause of integrity, availability, and trust. If correctly used, inference can be used to infer trust management policies. Thus inference can be used for good or bad purposes. The intention is to prevent inferred unauthorized conclusions and to use inference to apply trust management.
Current Successes and Potential Failures
W3C is proposing encryption techniques for securing XML documents. Furthermore, logic, proof, and trust belong to one of the layers of the Semantic Web. However, trust in that context refers to whether the Semantic Web can trust statements such as data and rules. In our definition, trust means the extent to which we can believe that the user and the Web site will enforce the confidentiality and privacy policies as specified.
Privacy has been discussed by the Semantic Web community. The main contribution of this community is the development of the Platform for Privacy Preferences (P3P). P3P requires the Web developer of the server to create a privacy policy, validate it, and then place it in a specific location on the server, as well as write a privacy policy in English. When the user enters the Web site, the browser will discover the privacy policy; if the privacy policy matches the user’s browser security specifications, then the user can simply enter the site. If the policy does not match the user’s specifications, then the user will be informed of the site’s intentions, and the user can then choose to enter or leave.
While this is a great start, it is lacking in certain areas. One concern is the fact that the privacy policy must be placed in a specific location. If a Web site, for example a student Web site on a school’s server, is to implement P3P and cannot place the policy in a folder directly on the school’s server, then the user’s browser will not find the privacy policy. Another problem with P3P is that it depends on the data collector on the server side following exactly what is promised in the privacy policy. If the data collection services on the server side decide to abuse the policy and instead do other things not stated in the agreement, then no real consequences occur. The server’s privacy policy can simply state that it will correct the problem upon discovery, but if the user does not learn of the problem until the data has been shared publicly, correcting it afterward will not restore the data’s privacy. Accountability should be addressed so that the consequences are decided not by the server but by lawmakers. When someone breaks a law or does not abide by contractual agreements, we do not turn to the accused and ask what punishment they deem necessary. Instead we look to the law and apply each law when applicable.
Another point of contention is trust and inference. Before beginning any discussions of privacy, a user and a server must evaluate how much the other party can be trusted. If the parties do not trust each other, how can either expect the other to follow a privacy policy? Currently P3P only uses tags to define actions; it uses no Web rules for inference or specific negotiations regarding confidentiality and privacy. With inference, a user can decide if certain information should not be given because it would allow the distrusted server to infer information that the user would prefer to remain private or sensitive.
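The matching step described above, in which the browser compares a site's declared policy with the user's preferences, can be sketched in a few lines of Python. This is a simplified illustration, not the P3P vocabulary or any browser's actual implementation: the `SitePolicy` and `UserPreferences` types, the `check_policy` function, and the purpose labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SitePolicy:
    """Hypothetical, simplified stand-in for a site's declared privacy policy."""
    purposes: frozenset              # what the site says it will do with collected data
    shares_with_third_parties: bool


@dataclass(frozen=True)
class UserPreferences:
    """Hypothetical user-side settings held by the browser."""
    allowed_purposes: frozenset
    allow_third_party_sharing: bool


def check_policy(policy, prefs):
    """Return (ok, mismatches). When ok is False, the browser would inform the
    user of the mismatches and let the user choose to enter or leave."""
    mismatches = sorted(policy.purposes - prefs.allowed_purposes)
    if policy.shares_with_third_parties and not prefs.allow_third_party_sharing:
        mismatches.append("third-party sharing")
    return (not mismatches, mismatches)


prefs = UserPreferences(frozenset({"site-operation", "personalization"}), False)
quiet_site = SitePolicy(frozenset({"site-operation"}), False)
nosy_site = SitePolicy(frozenset({"site-operation", "marketing"}), True)

print(check_policy(quiet_site, prefs))  # policy matches: the user simply enters
print(check_policy(nosy_site, prefs))   # mismatch: the user is informed and decides
```

A check of this kind only compares declarations. As the discussion above points out, it cannot verify that the server actually honors the policy it declares.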
Motivation for a Framework
While P3P is a great initiative for approaching the privacy problem for users of the Semantic Web, it is clear from the above discussion that more work is needed. Furthermore, we need to integrate confidentiality and privacy within the context of trust management. A new approach, to be discussed later, must be used to address these issues, such that the user can establish trust, preserve privacy and anonymity, and ensure confidentiality. Once the server and client have negotiated trust, the user can begin to decide what data can be submitted that will not violate his/her privacy. These security policies, one each for trust, privacy, and confidentiality, are described with Web rules. Describing policies with Web rules can allow an inference engine to determine what is in either the client’s or the server’s best interest and help advise each party accordingly. Also, with Web rules in place, a user and server can begin to negotiate confidentiality. Thus if a user does not agree with a server’s privacy policies but would still like to use some services, the user may begin negotiating confidentiality with the server to determine whether the user can still use some, but not all, services (depending on the final conclusion of the agreement). The goal of this new approach is to simulate real-world negotiations, thus giving semantics to the current Web and providing much-needed security.
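To make the idea of negotiating confidentiality concrete, the following Python sketch shows a user who declines part of a server's data requirements and is offered only the services whose requirements are still met. It rests on several assumptions: the service names, data items, and the `negotiate` function are hypothetical, and simple set comparisons stand in for Web rules and an inference engine.

```python
# Hypothetical server-side rules: each service lists the data items it requires.
SERVICE_REQUIREMENTS = {
    "newsletter": {"email"},
    "delivery": {"email", "postal_address"},
    "credit_check": {"email", "postal_address", "national_id"},
}


def negotiate(willing_to_share, requirements=SERVICE_REQUIREMENTS):
    """Split services into those the user can still use and those declined.

    A service is granted only if every data item it requires is one the
    user is willing to share. For declined services, the missing items
    are reported so that either party can reconsider.
    """
    granted, declined = [], {}
    for service, needed in sorted(requirements.items()):
        missing = needed - willing_to_share
        if missing:
            declined[service] = sorted(missing)
        else:
            granted.append(service)
    return granted, declined


granted, declined = negotiate({"email", "postal_address"})
print("granted:", granted)
print("declined:", declined)
```

Reporting the missing items, rather than a bare refusal, is one possible way to leave room for a further round of negotiation.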
CPT FRAMEWORK
In this section, we will discuss a framework for enforcing confidentiality, privacy, and trust (CPT) for the Semantic Web. We first discuss the basic framework where rules are enforced to ensure confidentiality, privacy, and trust. In the advanced framework, we include inference controllers that will reason about the application and determine whether confidentiality, privacy, and trust violations have occurred.
The Role of the Server
In the previous section, focus was placed on the client’s needs; now we will discuss the server’s needs in this process. The first obvious need is that the server must be able to evaluate the client in order to grant specific resources. Therefore, the primary goal is to establish trust regarding the client’s identity and, based on this identity, grant various permissions to specific data. Not only must the server be able to evaluate the client, but it must also be able to evaluate its own ability to grant permission with standards and metrics. The server also needs to be able to grant or deny a request appropriately without giving away classified information; alternatively, instead of giving away classified information, the server may desire to give a cover story. Either scenario, a cover story or protecting classified resources, must be completed within the guidelines of a stated privacy policy in order to guarantee a client’s confidentiality. One other key aspect is that all of these events must occur in a timely manner such that security is not compromised.
CPT Process
Now that the client’s and server’s needs have been discussed, focus will be placed on the actual process of our system CPT. First, a general overview of the process will be presented. After the reader has garnered a simple overview, this article will continue to discuss two systems, Advanced CPT and Basic CPT, based on the general process previously discussed.
The general process of CPT is to first establish a relationship of trust and then negotiate privacy and confidentiality policies. Figure 1 shows the general process. Notice that both parties partake in establishing trust. The client must determine the degree to which it can trust the server in order to decide how much trust to place in the resources supplied by the server and also to negotiate privacy policies. The server must determine the degree to which it can trust the client in order to determine what privileges and resources it can allow the client to access as well as how to present the data. The server and client will base their decisions of trust on credentials of each other. Once trust is established, the client and server must come to an agreement of privacy policies to be applied to the data that the client provides the server. Privacy must follow trust because the degree to which the client trusts the server will affect the privacy degree. The privacy degree affects what data the client chooses to send. Once the client is comfortable with the privacy policies negotiated, the client will then begin requesting data.
Administering the Semantic Web
Figure 1. Basic framework for CPT (steps between client and server: establish trust; negotiate privacy; request data; based on server-side confidentiality requirements, send data)
Based on the initial trust agreement, the server determines which resources the client may view and when. The server makes decisions regarding confidentiality, and regarding what data can be given to the client, based on its own confidentiality requirements and confidentiality degree. It is also important to note that the server and client must make these decisions themselves and then configure the system to act upon them. The Basic CPT system does not advise the client or server in any way regarding the outcome of any decision. Figure 2 illustrates the communication between the different components.
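The Basic CPT exchange can be sketched as a short sequence of steps. The Python sketch below is illustrative only: the class names, the numeric trust degree, the list of recognized credentials, and the threshold rule for releasing a resource are all assumptions made for the example, and only the server’s trust in the client is modeled.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical server whose resources carry a required trust degree."""
    resources: dict          # name -> (required trust in [0, 1], payload)
    privacy_policy: set      # uses of client data that the server proposes
    trust_in_client: float = 0.0

    def establish_trust(self, client_credentials: set) -> None:
        # Assumed rule: trust grows with the share of recognized credentials.
        recognized = {"employee-id", "clearance-letter"}
        self.trust_in_client = len(client_credentials & recognized) / len(recognized)

    def send_appropriate_data(self, name: str):
        required, payload = self.resources[name]
        # Confidentiality decision: release only if the trust degree suffices.
        return payload if self.trust_in_client >= required else None


@dataclass
class Client:
    credentials: set
    acceptable_uses: set     # uses of its data that the client tolerates
    privacy_agreed: bool = False

    def negotiate_privacy(self, server: Server) -> bool:
        # Basic CPT gives no advice: the client simply accepts or rejects.
        self.privacy_agreed = server.privacy_policy <= self.acceptable_uses
        return self.privacy_agreed


def basic_cpt(client: Client, server: Server, wanted: str):
    server.establish_trust(client.credentials)     # establish trust
    if not client.negotiate_privacy(server):       # negotiate privacy
        return None
    return server.send_appropriate_data(wanted)    # request data, send data


server = Server(
    resources={"public-report": (0.5, "report text"), "roster": (1.0, "names")},
    privacy_policy={"billing"},
)
client = Client(credentials={"employee-id"}, acceptable_uses={"billing", "audit"})
print(basic_cpt(client, server, "public-report"))  # trust degree 0.5 suffices
print(basic_cpt(client, server, "roster"))         # trust degree 0.5 is too low
```

In this sketch all decision rules are fixed in advance by the two parties, which mirrors the point that the Basic system acts on configured decisions without advising on their consequences.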
Advanced CPT
The previous section discussed the Basic CPT system; the Advanced CPT system is an extension of the Basic system. The Advanced CPT system is outlined in Figure 3, which incorporates three new entities not found in the Basic system: the Trust Inference Engine (TIE), the Privacy Inference Engine (PIE), and the Confidentiality Inference Engine (CIE). The first step of sending credentials and establishing trust is the same as in the Basic system, except that both parties consult their own TIE. Once each party has made a decision, the client receives the privacy policies from the server and then uses these policies in conjunction with the PIE to agree, disagree, or negotiate. Once the client and server have come to an agreement about the client’s privacy, the client sends a request for various resources. Based on the degree of trust that the server has assigned to a particular client, the server determines what resources it can give to the client. In this step, however, the server consults the CIE to determine what data is preferable to give to the client and what data, if given, could have disastrous consequences. Once the server has reached a conclusion regarding what data the client can receive, it can begin transmitting data over the network.
Trust, Privacy, and Confidentiality Inference Engines
With regard to trust, the server must realize that if it chooses to assign a certain percentage of trust, then the client will have access to the corresponding privileged resources and can possibly infer other data from the granted permissions. Thus, the primary responsibility of the trust inference engine is to determine what information can be inferred and whether this behavior is acceptable. Likewise, the client must realize that the percentage of trust which it assigns to the server will affect its permissions for viewing the site as well as how data given to the client will be processed. The inference engine on the client’s side guides the client regarding what can or will occur based on the trust assignment given to the server. Once trust is established, the privacy inference engine continues the inference process. It is important to note that the privacy inference engine resides only on the client side. The server has its own privacy policies, but these policies may not be acceptable to the client. It is impossible for the server to evaluate each client and determine how to implement an individual privacy policy without first consulting the client. Thus the privacy inference engine is unnecessary on the server’s side.
Figure 2. Communication between the components for basic CPT (messages between client and server, in order: request interaction; send credential expectations; send credentials; use credentials to establish trust and set trust degree; send privacy policies; client agrees or sends conditions for agreement; server agrees or continues negotiations; client sets privacy degree; establish privacy agreement; request data; server sets confidentiality degree; send appropriate data)
The privacy inference engine must guide the client in negotiating privacy policies. To do so, the inference engine must be able to determine how the server will use the data the client gives it, as well as who else will have access to the submitted data. Once this is determined, the inference engine must evaluate the data given by the client to the server. If the inference engine determines that this data can be used to infer other data that the client would prefer to keep private, it must warn the client and then allow the client to choose whether or not to send the data.
Figure 3. Communication between components for advanced CPT (messages between client and server, in order: request interaction; send credential expectations; send credentials; each party consults its TIE (Trust Inference Engine) and uses the credentials to establish trust and set the trust degree; send privacy policies; the client consults the PIE (Privacy Inference Engine) and agrees or negotiates with the server until in agreement over privacy policies; establish privacy agreement; client sets privacy degree; request data; the server consults the CIE (Confidentiality Inference Engine); server sets confidentiality degree; send appropriate data)
Once the client and server have agreed on the privacy policies to be implemented, the client begins requesting data, and the server must determine what data to send based on confidentiality requirements. It is important to note that the confidentiality inference engine is located only on the server side. The client has already negotiated its personal privacy issues and is ready to view the data, leaving the server to decide what the next appropriate action is. The confidentiality inference engine must first determine what data is currently available to the client, based on the current trust assignment. It must then explore what policies or data could potentially be inferred if that data is given to the client. The primary objective of the confidentiality inference engine is to consider how the client might be able to use the information given to it and then to guide the server through the process of deciding the client’s access to resources.
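The two steps attributed to the confidentiality inference engine, determining what is available at the current trust degree and then exploring what could be inferred from it, can be sketched as follows. This is a minimal sketch under stated assumptions: resources are labeled with a hypothetical required trust degree, and inference rules are simple pairs of premises and a conclusion; the resource names are invented for the example.

```python
def available(resources: dict, trust_degree: float) -> set:
    """Step 1: resources whose required trust the client's trust degree meets."""
    return {name for name, required in resources.items() if trust_degree >= required}


def derivable(known: set, rules: list) -> set:
    """Step 2: everything that follows from `known` under (premises, conclusion) rules."""
    derived = set(known)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def cie_advice(resources: dict, trust_degree: float, rules: list) -> dict:
    """Advise the server; the server still makes the final decision."""
    candidates = available(resources, trust_degree)
    # Items the client could infer although its trust degree does not cover them.
    leaked = {item for item in derivable(candidates, rules) - candidates
              if resources.get(item, 0.0) > trust_degree}
    return {"available": candidates, "inferable_beyond_trust": leaked}


# Hypothetical resources with the trust degree each one requires.
resources = {"flight_schedule": 0.4, "cargo_manifest": 0.6, "mission_destination": 0.9}
rules = [(frozenset({"flight_schedule", "cargo_manifest"}), "mission_destination")]

print(cie_advice(resources, 0.5, rules))
print(cie_advice(resources, 0.7, rules))
```

At a trust degree of 0.7 both the schedule and the manifest are individually permitted, yet together they reveal the destination, which is the kind of consequence the engine is meant to bring to the server’s attention.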
CONFIDENTIALITY FOR THE SEMANTIC WEB
Layered Architecture
By confidentiality, we mean secrecy, which is what we usually refer to as security, although in reality security may also include integrity and trust. In this section, by security issues we essentially mean confidentiality issues. In particular, we provide an overview of security issues for the Semantic Web, with special emphasis on XML security, RDF security, and secure information integration. Note that according to the vision of Tim Berners-Lee et al. (2001), logic, proof, and trust are at the highest layers of the Semantic Web. That is, how can we trust the information that the
Web site gives us? Trusting the information that the Web site gives us is essentially about trusting the quality of the information. We will not discuss that aspect of trust further in this article. Instead, we will discuss trust from the viewpoint of trusting the Web site or the user/client. Security cannot be considered in isolation. There is no one layer that should focus on security. Security cuts across all layers, and this is a challenge. We need security for each of the layers, and we must also ensure secure interoperability, as illustrated in Figure 4.
Figure 4. Layers for the secure Semantic Web (from top to bottom: logic, proof, and trust with respect to security; security for rules/query; security for RDF and ontologies; security for XML and XML schemas; security for the protocols)
For example, consider the lowest layer. One needs secure TCP/IP, secure sockets, and secure HTTP. There are now security protocols for these various lower-layer protocols. One needs end-to-end security; one cannot just have secure TCP/IP built on untrusted communication layers. We need network security. The next layer is XML and XML schemas. One needs secure XML. That is, access to various portions of the document must be controlled for reading, browsing, and modification. There is research on securing XML and XML schemas. The next step is securing RDF. With RDF, not only do we need secure XML, but we also need security for the interpretations and semantics. For example, under certain contexts portions of the document may be “unclassified”, while under other contexts the document may be “classified”. As an example, one could declassify an RDF document once the war is over. Much work has been carried out on security constraint processing for relational databases. One needs to determine whether these results could be applied to the Semantic Web (Thuraisingham et al., 1993).
Once XML and RDF have been secured, the next step is to examine security for ontologies. That is, ontologies may have security levels attached to them. Certain parts of the ontologies could be “secret” while certain other parts may be “unclassified”. The challenge is, how does one use these ontologies for secure information integration? Researchers have done some work on the secure interoperability of databases. We need to revisit this research and then determine what else needs to be done so that the information on the Web can be managed, integrated, and exchanged securely. We also need to examine the inference problem for the Semantic Web. Inference is the process of posing queries and deducing new information. It becomes a problem when the deduced information is something the user is unauthorized to know. With the Semantic Web, and especially with data mining tools, one can make all kinds of inferences. Recently, there has been some research on controlling unauthorized inferences on the Semantic Web (Stoica & Farkas, 2004).
Security should not be an afterthought. We have often heard that one needs to build security into a system right from the beginning. Similarly, security cannot be an afterthought for the Semantic Web technologies. Note also that XML, RDF, and ontologies may themselves be used to specify security policies. Therefore, not only do we need to secure XML, RDF, and ontology documents, but these languages can also be used to specify policies (Thuraisingham, 2006). In the remaining subsections, we will discuss security for the different layers of the Semantic Web.
XML Security
Various research efforts have been reported on XML security (Bertino et al., 2002). We briefly discuss some of the key points. XML documents
have graph structures. The main challenge is whether to give access to entire XML documents or to parts of the documents. Bertino et al. (2002) have developed authorization models for XML. They have focused on access control policies as well as on dissemination policies, and they have also considered push and pull architectures. They specified the policies in XML. The policy specification contains information about which users can access which portions of the documents. In Bertino et al. (2002), algorithms for access control as well as for computing views of the results are presented. In addition, architectures for securing XML documents are discussed. In Bertino et al. (2004), the authors go further and describe how XML documents may be published on the Web. The idea is for owners to publish documents, for subjects to request access to the documents, and for untrusted publishers to give the subjects the views of the documents they are authorized to see. W3C (World Wide Web Consortium) is specifying standards for XML security. The XML security project (see XML Security) is focusing on providing the implementation of security standards for XML. The focus is on XML-Signature Syntax and Processing, XML-Encryption Syntax and Processing, and XML Key Management. W3C also has a number of working groups, including the XML Signature working group (see XML Signature) and the XML Encryption working group (see XML Encryption). While the standards focus on what can be implemented in the near term, the work reported in Bertino et al. (2002) and Bertino et al. (2004) sets the direction for access control and secure publishing of XML documents. Note also that Bertino and others have specified confidentiality and privacy policies in XML. In other words, not only is there research on securing XML documents, but there is also research on specifying policies in XML (see also Thuraisingham, 2005a).
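The idea of granting access to parts of a document and computing the view a subject is authorized to see can be illustrated with a small sketch. It is not the authorization model of Bertino et al.; the sample document, the role names, and the tag-to-roles policy table are assumptions made for the example, and the policy is held in a Python dictionary rather than specified in XML.

```python
import copy
import xml.etree.ElementTree as ET

DOC = """<patient>
  <name>J. Doe</name>
  <billing><amount>120</amount></billing>
  <diagnosis>influenza</diagnosis>
</patient>"""

# Hypothetical policy: element tag -> roles allowed to read that subtree.
# Tags that are not listed are readable by everyone.
POLICY = {"diagnosis": {"physician"}, "billing": {"clerk", "physician"}}


def view_for(root: ET.Element, role: str) -> ET.Element:
    """Return a copy of the document with unauthorized subtrees pruned."""
    view = copy.deepcopy(root)

    def prune(node: ET.Element) -> None:
        for child in list(node):      # iterate over a copy while removing
            allowed = POLICY.get(child.tag)
            if allowed is not None and role not in allowed:
                node.remove(child)
            else:
                prune(child)

    prune(view)
    return view


root = ET.fromstring(DOC)
clerk_view = view_for(root, "clerk")
print([child.tag for child in clerk_view])   # the clerk sees no diagnosis
```

Pruning a copy leaves the owner’s document untouched, so the same source can serve different views to different subjects.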
RDF Security
RDF is the foundation of the Semantic Web. While XML is limited in providing machine-understandable documents, RDF addresses this limitation. As a result, RDF provides better support for interoperability as well as for searching and cataloging. It also describes the contents of documents as well as the relationships between various entities in the document. While XML provides syntax and notations, RDF supplements this by providing semantic information in a standardized way. The basic RDF model has three object types: resources, properties, and statements. A resource is anything described by RDF expressions. It could be a Web page or a collection of pages. A property is a specific attribute used to describe a resource. An RDF statement is a resource together with a named property and the value of that property. The components of a statement are the subject, the predicate, and the object. So, for example, if we have a sentence of the form “John is the creator of xxx”, then xxx is the subject or resource, the property or predicate is “Creator”, and the object or literal is “John”. There are RDF diagrams, very much like Entity-Relationship diagrams or object diagrams, to represent statements. There are various aspects specific to RDF syntax, and for more details we refer to the various documents on RDF published by W3C. Also, it is very important that the intended interpretation be used for RDF sentences. This is accomplished by RDF schemas (Antoniou & Harmelen, 2004). More advanced concepts in RDF include the container model and statements about statements. The container model has three types of container objects: Bag, Sequence, and Alternative. A bag is an unordered list of resources or literals. It is used to mean that a property has multiple values but that the order is not important. A sequence is a list of ordered resources. Here the order is important. An alternative is a list of resources that represent alternatives for the value
of a property. Various tutorials on RDF describe the syntax of containers in more detail. RDF also provides support for making statements about other statements. For example, with this facility one can make statements of the form “The statement A is false”, where A is the statement “John is the creator of X”. Again, one can use object-like diagrams to represent containers and statements about statements. RDF also has a formal model associated with it. This formal model has a formal grammar. For further information on RDF, we refer to the W3C reports (see RDF Primer). As in the case of any language or model, RDF will continue to evolve. Now, to make the Semantic Web secure, we need to ensure that RDF documents are secure. This would involve securing XML from a syntactic point of view. However, with RDF we also need to ensure that security is preserved at the semantic level. The issues include the security implications of the concepts of resources, properties, and statements. That is, how is access control ensured? How can resources, properties, and statements be protected? How can one provide access control at a finer granularity? What are the security properties of the container model? How can bags, sequences, and alternatives be protected? Can we specify security policies in RDF? How can we resolve semantic inconsistencies for the policies? How can we express security constraints in RDF? What are the security implications of statements about statements? How can we protect RDF schemas? Some initial directions on RDF security are given in Carminati et al. (2004). More details are given in Thuraisingham (2006).
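To make the question about statements about statements concrete, the sketch below represents RDF statements as plain Python tuples and attaches a security label to a single triple. It does not use an RDF library; the example URI, the `assertedFalseBy` property, the two-level label scheme, and the `visible` filter are hypothetical, while `rdf:Statement`, `rdf:subject`, `rdf:predicate`, and `rdf:object` are the standard RDF reification vocabulary.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate obj")

# "John is the creator of xxx": the resource is the subject, "creator" is
# the predicate, and "John" is the object (a literal).
s1 = Triple("http://example.org/xxx", "creator", "John")

# A statement about a statement, in reified form: a node that stands for
# s1, plus an assertion about that node.
reified = [
    Triple("_:stmt1", "rdf:type", "rdf:Statement"),
    Triple("_:stmt1", "rdf:subject", s1.subject),
    Triple("_:stmt1", "rdf:predicate", s1.predicate),
    Triple("_:stmt1", "rdf:object", s1.obj),
    Triple("_:stmt1", "assertedFalseBy", "Mary"),   # hypothetical property
]

LEVELS = {"unclassified": 0, "secret": 1}

# Hypothetical triple-level labels; unlabeled triples are unclassified.
labels = {s1: "secret"}


def visible(triples, clearance: str):
    """Keep only the triples whose label the reader's clearance dominates."""
    limit = LEVELS[clearance]
    return [t for t in triples if LEVELS[labels.get(t, "unclassified")] <= limit]


graph = [s1] + reified
print(len(visible(graph, "secret")), len(visible(graph, "unclassified")))
```

The unclassified view hides the labeled triple itself, but the reification triples still repeat its subject, predicate, and object. Labeling triples one by one is therefore not enough, which is one reason security has to be considered at the semantic level.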
Secure Information Interoperability
Information is everywhere on the Web. Information is essentially data that makes sense. The database community has been working on database integration for several decades. They encountered many challenges, including interoperability of heterogeneous data sources. They used schemas
to integrate the various databases. Schemas are essentially data describing the data in the databases (Sheth & Larson, 1990). Now with the Web, one needs to integrate diverse and disparate data sources. The data may not be in databases. It could be in files, both structured and unstructured. Data could be in the form of tables or in the form of text, images, audio, and video. One needs to come up with technologies to integrate the diverse information sources on the Web. Essentially, one needs Semantic Web services to integrate the information on the Web. The challenge is, how does one integrate the information securely? For example, in Thuraisingham (1994) the schema integration work of Sheth and Larson was extended for security policies. That is, different sites have security policies, and these policies have to be integrated to provide a policy for the federated database system. One needs to examine these issues for the Semantic Web. Each node on the Web may have its own policy. Is it feasible to have a common policy for a community on the Web? Do we need a tight integration of the policies, or do we focus on dynamic policy integration? Ontologies are playing a major role in information integration on the Web. How can ontologies play a role in secure information integration? How do we provide access control for ontologies? Do we have ontologies for specifying the security policies? How can we use some of the ideas discussed in Bertino et al. (2004) to integrate information securely on the Web? That is, what sort of encryption schemes do we need? How do we minimize the trust placed on information integrators on the Web? We are investigating issues related to the above questions.
Secure Query and Rules Processing for the Semantic Web
The layer above the Secure RDF layer is the Secure Query and Rules processing layer. While
RDF can be used to specify security policies (see, for example, Carminati et al., 2004), the Web rules language being developed by W3C is more powerful for specifying complex policies. Furthermore, an inference engine is also being proposed to process the rules. One could integrate ideas from the database inference controller that we have developed (Thuraisingham et al., 1993) with Web rules processing to develop an inference or privacy controller for the Semantic Web. The query-processing module is responsible for accessing the heterogeneous data and information sources on the Semantic Web. W3C is examining ways to integrate techniques from Web query processing with Semantic Web technologies to locate, query, and integrate the heterogeneous data and information sources.
Our Approach to Confidentiality Management
We utilize two popular Semantic Web technologies in our prototype: Intellidimension RDF Gateway and Infered (see Intellidimension, the RDF Gateway) and Jena (see JENA). RDF Gateway is a database and integrated Web server that utilizes RDF and is built from the ground up rather than on top of existing Web servers or databases (RDF Primer). It functions as a data repository for RDF data and also as an interface to various data sources, external or internal, that can be queried. Jena is a Java application programming package for creating, modifying, storing, querying, and otherwise processing RDF/XML (eXtensible Markup Language) documents from Java programs. RDF documents can be created from scratch, or pre-formatted documents can be read into memory so that their various parts can be explored. The way Jena accesses an RDF document closely resembles RDF’s node-arc-node structure: subjects, properties, and objects can each be iterated through different class objects. Jena also has a built-in query engine designed on top of RDFQL (RDF Query Language) that allows querying documents using standard RDFQL query statements. Using these technologies, we specify the confidentiality policies. The confidentiality engine then ensures that the policies are enforced correctly. Under the basic framework, the confidentiality engine enforces the policies but does not examine security violations via inference. In the advanced approach, the confidentiality engine includes what we call an inference controller. We utilize a similar approach for ensuring privacy for the Semantic Web, which will be discussed in a later section.
Inference/Confidentiality Controller for the Semantic Web
Inference is the process of forming conclusions from the responses obtained to queries. If the conclusions are not authorized, then the problem is called the inference problem. An inference controller is the system that ensures that confidentiality violations via inference do not occur. We have designed and developed inference controllers for a database system (Thuraisingham et al., 1993). In such a system, users with different roles access and share a database consisting of data at different security levels. A powerful and dynamic approach to assigning security levels to data is one which utilizes security constraints. Security constraints provide an effective and versatile confidentiality policy. They can be used to assign security levels to the data depending on their content and the context in which the data is displayed. They can also be used to dynamically reclassify the data. An architecture for an inference controller for database systems is illustrated in Figure 5.
Figure 5. Confidentiality enhanced database management system (components: data mining tool, privacy controller, DBMS, database)
While much of our previous work focused on security control in relational databases, our recent work focuses on extending this approach to the Semantic Web. Figure 6 illustrates an inference controller for the Semantic Web. The Semantic Web is augmented by an inference controller that examines the policies specified as ontologies and rules, utilizes the inference engine embedded in the Web rules language, reasons about the applications, and deduces the security violations via inference. In particular, we focus on the design and implementation of an inference controller where the data is represented as RDF documents.
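The security constraints described in this section, which assign levels by content and by context and can reclassify data dynamically, can be sketched as predicates over a record and its context. This is an illustrative sketch only, not the constraint language of the inference controller: the two-level scheme, the ship names, and the `at_war` context flag are assumptions made for the example.

```python
LEVELS = ["unclassified", "secret"]


def classify(record: dict, context: dict, constraints: list) -> str:
    """Return the highest level demanded by any constraint that fires."""
    level = 0
    for condition, demanded in constraints:
        if condition(record, context):
            level = max(level, LEVELS.index(demanded))
    return LEVELS[level]


constraints = [
    # Content-based: any record about the (made-up) ship "Avalon" is secret.
    (lambda rec, ctx: rec.get("ship") == "Avalon", "secret"),
    # Context-based: destinations are secret only while hostilities last.
    (lambda rec, ctx: "destination" in rec and ctx.get("at_war", False), "secret"),
]

record = {"ship": "Borealis", "destination": "Port X"}
print(classify(record, {"at_war": True}, constraints))    # secret
print(classify(record, {"at_war": False}, constraints))   # unclassified
```

Because the level is computed at query time from the current context, the same record is reclassified automatically when the context changes.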
Figure 6. Inference controller for the Semantic Web
PRIVACY FOR THE SEMANTIC WEB
Privacy is about protecting information about individuals. Furthermore, an individual can specify, say to a Web service provider, the information that can be released about him or her. Privacy has been discussed a great deal in the past, especially when it relates to protecting medical information about patients. Social scientists as well as technologists have been working on privacy issues. However, privacy has received enormous attention during the past year. This is mainly because of the advent of the Web, the Semantic Web, counter-terrorism, and national security. For example, in order to extract information about various individuals and perhaps prevent and/or detect potential terrorist attacks, data mining tools are being examined. We have heard much about national security versus privacy in the media. This is mainly due to the fact that people are now realizing that to handle terrorism, the government may need to collect data about individuals and mine the data to extract information. Data may be in relational databases, or it may be text, video, and images. This is causing major concern among various civil liberties unions (Thuraisingham, 2003).
Figure 7. Privacy controller for the Semantic Web
We have utilized the same Semantic Web technologies that we used for our work on the inference controller (i.e., Intellidimension RDF Gateway and Infered, and Jena) to develop the privacy controller. From a technology point of view, our privacy controller is identical to the confidentiality controller which we have designed and developed (Thuraisingham, 2005b). The privacy controller illustrated in Figure 7 is, however, implemented at the client side. Before the client gives out information to a Web site, it checks whether the Web site can divulge aggregated information to a third party and thereby cause privacy violations. For example, the Web site may give out medical records without the identities so that the third party can study the patterns of flu or other infectious diseases. Furthermore, at some other time the Web site may give out the names. However, if the Web site gives out the link between the names and diseases, then there could be privacy violations. The inference engine makes such deductions and determines whether the client should give out personal data to the Web site.
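The names-and-diseases example can be turned into a small check that a client-side privacy controller might run before handing data to a Web site. The sketch rests on a strong simplifying assumption: each release by the Web site is modeled as a set of attribute names, and any two releases that share an attribute are treated as linkable by a third party. The attribute names, including `visit_id`, are hypothetical.

```python
from itertools import combinations


def linkable(release_a: frozenset, release_b: frozenset) -> bool:
    """Assume two releases can be joined when they share any attribute."""
    return bool(release_a & release_b)


def privacy_violations(releases: list, sensitive_pairs: list) -> list:
    """List the sensitive attribute pairs a third party could reassemble."""
    found = []
    for pair in sensitive_pairs:
        for r in releases:                        # pair disclosed in one release
            if pair <= r:
                found.append(pair)
                break
        else:
            for a, b in combinations(releases, 2):    # or by joining two releases
                if linkable(a, b) and pair <= (a | b):
                    found.append(pair)
                    break
    return found


pair = frozenset({"name", "disease"})
safe = [frozenset({"disease", "region"}), frozenset({"name"})]
risky = [frozenset({"disease", "visit_id"}), frozenset({"name", "visit_id"})]
print(privacy_violations(safe, [pair]))    # no link between the two releases
print(privacy_violations(risky, [pair]))   # visit_id links names to diseases
```

Each release in the second scenario looks harmless on its own; the violation only appears when the two are combined, which is the deduction the inference engine is expected to make on the client’s behalf.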
TRUST FOR THE SEMANTIC WEB
Researchers are working on protocols for trust management. Languages for specifying trust management constructs are also being developed. There is also research on the foundations of trust management. For example, if A trusts B and B trusts C, then can A trust C? How do you share the data and information on the Semantic Web and still maintain autonomy? How do you propagate trust? For example, if A trusts B, say, 50% of the time, and B trusts C 30% of the time, then what value do you assign for A trusting C? How do you incorporate trust into semantic interoperability? What are the quality of service primitives for trust and negotiation? That is, for certain situations one may need 100% trust, while for certain other situations 50% trust may suffice (see also Yu & Winslett, 2003).
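The article leaves open which value to assign when trust is propagated along a chain. Purely as an illustration, the sketch below shows two candidate composition rules, multiplication (treating the links as independent probabilities) and the minimum (the weakest link), together with a check of whether the derived trust meets what a situation requires. Neither rule is proposed by the source; both are assumptions made for the example.

```python
def chain_product(*links: float) -> float:
    """Treat each link as an independent probability and multiply."""
    result = 1.0
    for t in links:
        result *= t
    return result


def chain_min(*links: float) -> float:
    """Treat a chain as only as trustworthy as its weakest link."""
    return min(links)


def suffices(trust: float, required: float) -> bool:
    """Does the derived trust meet the level the situation requires?"""
    return trust >= required


a_b, b_c = 0.50, 0.30     # A trusts B 50%, B trusts C 30%
print(round(chain_product(a_b, b_c), 2), chain_min(a_b, b_c))
print(suffices(chain_product(a_b, b_c), 0.5))
```

Under the product rule A’s trust in C is 15%, and under the weakest-link rule it is 30%. The two rules already disagree on this small example, which shows why the choice of a propagation algebra is a research question rather than a detail.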
Another topic that is being investigated is trust propagation and the propagation of privileges. For example, if you grant privileges to A, what privileges can A transfer to B? How can you compose privileges? Is there an algebra and calculus for the composition of privileges? Much research still needs to be done here. One of the layers of the Semantic Web is Logic, Proof, and Trust. Essentially this layer deals with trust management and negotiation between different agents, and with examining the foundations and developing logics for trust management. Some interesting work has been carried out by Finin and Joshi (2002) (see also Denker et al., 2003; Kagal, Finin, & Joshi, 2003).
In order to establish trust, privacy, and confidentiality, it is necessary to have an intelligent system that can evaluate the user’s preferences. The system will be designed as an expert system that stores trust, privacy, and confidentiality policies. These policies can be written using a Web rules language with foundations in first-order logic. Traditional theorem provers can then be applied to the rules to check for inconsistencies and alert the user (Antoniou & Harmelen, 2004). Once the user approves all the policies, the system can take action and properly apply these policies during any transaction occurring on a site. The user can also place percentages next to the policies in order to apply probabilistic scenarios. Figure 8 gives an example of a probabilistic scenario involving a trust policy. In Figure 8, the user has set the trust degree to 59%. Because the user trusts another person 59%, only policies 5-8 will be applied. Figure 9 shows some example policies. These example policies will be converted into a Web rules language, such as the Semantic Web Rules Language (see SWRL), and enforced by the trust engine.
Figure 8. Trust probabilities (trust degree = 59%)
90 Policy1
75 Policy2
70 Policy3
60 Policy4
50 Policy5
35 Policy6
10 Policy7
0 Policy8
Figure 9. Example policies
Policy1: if A then B else C
Policy2: not A or B
Policy3: A or C
Policy4: A or C or D or not E
Policy5: not (A or C)
Figure 10 illustrates an integrated architecture for ensuring confidentiality, privacy, and trust for the Semantic Web. The Web server, as well as the client, has trust management modules. The Web server has a confidentiality engine, whereas the client has a privacy engine. We are currently designing and developing such an integrated CPT system with XML, RDF, and Web rules technologies. Some details of the modules are illustrated in Figure 11. In Figure 11, ontologies, CPT policies, and credentials are given to the expert system so that the expert system can advise the client or server on who should receive access to which resources and how these resources should be further regulated. The expert system sends the policies to the WCOP (Web rules, credentials, ontologies, and policies) parser to check for syntax errors and validate the inputs. The information contained within the dashed box is a part of the system that is only included in the Advanced CPT system. The inference engines (e.g., TIE, PIE, and CIE) use an inference module to determine whether classified information can be inferred from the given known information. For example, given data A and B, can someone deduce classified data X (i.e., A + B → X)? The inference engines also use an inverse inference module to determine whether classified information can be inferred if a user employs inverse resolution techniques. For example, given data A, if the user wants to guarantee that data X remains classified, the user can determine that B, which combined with A implies X, must remain classified as well (i.e., A + ? → X, where the question mark resolves to B). Once the expert system has received the results from the inference engines, it can reach a recommendation and pass it to the client or server, who has the option to either accept or reject the suggestion.
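The inference module and its inverse counterpart can be sketched over simple rules of the form premises → conclusion. This is a minimal sketch under stated assumptions, not the modules of the CPT controller: the rule representation and the function names are hypothetical, and the inverse search only looks for a single missing item.

```python
def closure(known: frozenset, rules: list) -> frozenset:
    """Forward inference: everything derivable from `known`."""
    derived = set(known)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return frozenset(derived)


def can_infer(known, target, rules) -> bool:
    """Inference module: can the target be deduced from the known data?"""
    return target in closure(frozenset(known), rules)


def must_also_protect(known, target, candidates, rules) -> set:
    """Inverse inference: single items that, added to `known`, yield the target."""
    return {b for b in candidates
            if b != target and b not in known
            and can_infer(set(known) | {b}, target, rules)}


rules = [(frozenset({"A", "B"}), "X")]        # A + B -> X
print(can_infer({"A", "B"}, "X", rules))       # True
print(must_also_protect({"A"}, "X", {"B", "C"}, rules))   # {'B'}
```

The inverse search tells the policy owner that B has to stay classified as long as A is released and X must remain protected, which is the kind of recommendation the expert system would pass on for acceptance or rejection.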
Figure 10. Integrated architecture for confidentiality, privacy, and trust
Figure 11. Modules of CPT controller (inputs: CPT policies, ontologies, and Web rules, together with credentials and data; CPT controller components: expert system, user interface, WCOP parser, inconsistency handler, and inference engines with an inference module and an inverse inference module; the expert system advises the user, who is the client or server)
SUMMARY AND DIRECTIONS
This article has provided an overview of administration issues for the Semantic Web. We first discussed a framework for enforcing confidentiality, privacy, and trust for the Semantic Web. Then we discussed security issues for the Semantic Web. We argued that security must cut across all the layers. Next we provided more details on XML security, RDF security, secure information integration, and trust. If the Semantic Web is to be secure, we need all of its components to be secure. We also described our approach to confidentiality and inference control. Next we discussed privacy for the Semantic Web. Finally, we discussed trust management as well as an integrated framework for CPT.
There are many directions for further work. We need to continue with the research on confidentiality, privacy, and trust for the Semantic Web. Then we need to develop the integrated framework for CPT. Finally, we need to formalize the notions of CPT and build a security model. Standards play an important role in the development of the Semantic Web. W3C has been very effective in specifying standards for XML, RDF, and the Semantic Web. We need to continue with these developments and, as much as possible, transfer the research to the standards efforts as well as incorporate security into these efforts. We also need to transfer the research and standards to commercial products.
REFERENCES
Antoniou, G., & Harmelen, F. V. (2004). A semantic Web primer. MIT Press.
Berners-Lee, T., et al. (2001). The semantic Web. Scientific American, (May, 2001).
Bertino, E., et al. (2002). Access control for XML documents. Data and Knowledge Engineering, 43(3).
Bertino, E., et al. (2004). Secure third party publication of XML documents. To appear in IEEE Transactions on Knowledge and Data Engineering.
Carminati, B., et al. (2004). Security for RDF. In Proceedings of the DEXA Conference Workshop on Web Semantics, Zaragoza, Spain.
Denker, G., et al. (2003). Security for DAML Web services: Annotation and matchmaking. International Semantic Web Conference.
Finin, T., & Joshi, A. (2002). Agents, trust, and information access on the semantic Web. ACM SIGMOD, (December).
Intellidimension, the RDF Gateway. Retrieved from http://www.intellidimension.com/
JENA. Retrieved from http://jena.sourceforge.net/
Kagal, L., Finin, T., & Joshi, A. (2003). A policy-based approach to security for the semantic Web. International Semantic Web Conference.
RDF Primer. Retrieved from http://www.w3.org/TR/rdf-primer/
Sheth, A., & Larson, J. (1990). Federated database systems. ACM Computing Surveys, 22(3).
Stoica, A., & Farkas, C. (2004). Ontology-guided XML security engine. Journal of Intelligent Information Systems, 23(3).
SWRL: Semantic Web Rules Language (2004). Retrieved from http://www.w3.org/Submission/SWRL/
Thuraisingham, B., et al. (1993). Design and implementation of a database inference controller. Data and Knowledge Engineering Journal, 11(3).
Thuraisingham, B. (1994). Security issues for federated database systems. Computers and Security, 13(6).
Thuraisingham, B. (2002). XML, databases, and the semantic Web. CRC Press, FL (2001).
Thuraisingham, B. (2003). Data mining, national security, and privacy. ACM SIGKDD, (January, 2003).
Thuraisingham, B. (2005a). Standards for the semantic Web. Computer Standards and Interface Journal.
Thuraisingham, B. (2005b). Privacy constraint processing in a privacy-enhanced database system. Data and Knowledge Engineering Journal.
Thuraisingham, B. (2006). Building trustworthy semantic Webs. CRC Press.
World Wide Web Consortium (W3C). Retrieved from www.w3c.org
XML Encryption. Retrieved from http://www.w3.org/Encryption/2001/
XML Security. Retrieved from http://xml.apache.org/security/
XML Signature. Retrieved from http://www.w3.org/Signature/
Yu, T., & Winslett, M. (2003). A unified scheme for resource protection in automated trust negotiation. IEEE Symposium on Security and Privacy, Oakland, CA.
This work was previously published in International Journal of Information Security and Privacy, Vol. 1, Issue 1, edited by H. Nemati, pp. 18-34, copyright 2007 by Idea Group Publishing (an imprint of IGI Global).
Chapter 1.6
Human and Social Perspectives in Information Technology: An Examination of Fraud on the Internet

C. Richard Baker
University of Massachusetts, USA
ABSTRACT
This chapter adds to the discussion of human and social perspectives in information technology by examining the existence and extent of fraudulent activities conducted through the Internet. The principal question addressed by this chapter is whether fraudulent activities perpetrated using the Internet constitute a new type of fraud, or whether they are classic forms of fraud appearing in a new medium. Three areas of fraud are investigated, namely: securities fraud, fraud in electronic commerce, and fraud arising from the rapid growth of Internet companies. The U.S. Securities and Exchange Commission (SEC) has cited more than 100 companies for committing securities fraud using the Internet. Actions prohibited under U.S. securities laws are now being conducted through the Internet, and the SEC has taken steps to suppress these frauds (SEC, 2001). The rapid growth of electronic commerce, and the natural desire on the part of consumers to feel secure while engaging in electronic commerce, has prompted the creation of mechanisms, such as web site seals and logos, to reduce concerns about fraudulent use of information. It is, however, questionable whether these mechanisms are effective in reducing fraud conducted through the Internet. A third potential area for fraud on the Internet involves the rapid growth of Internet companies, often with little economic substance and lacking in traditional managerial controls. This chapter seeks to examine areas with significant potential for fraud on the Internet and to assess implications of such activities for the management of information technology.
Copyright © 2008, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
INTRODUCTION

We will say then that a considerable advance has been made in mechanical development when all men, in all places, without any loss of time, are cognizant through their senses, of all that they desire to be cognizant of in all other places, at a low rate of charge, so that the back country squatter may hear his wool sold in London and deal with the buyer himself, may sit in his own chair in a back country hut and hear the performance of Israel in Egypt at Exeter Hall, may taste an ice on the Rakaia, which he is paying for and receiving in the Italian opera house Covent Garden. Multiply instances ad libitum–this is the grand annihilation of time and place which we are all striving for, and which in one small part we have been permitted to see actually realized. (Attributed to Samuel Butler with reference to the opening of the first telegraph between cities in New Zealand in 1863)

Speculation about the effects of new information technology is not a new phenomenon. As the quotation cited above indicates, the invention of the telegraph in the early 19th century prompted the belief that the world would quickly become smaller and more closely connected, thereby eliminating wars and conflicts. Sadly, this was not to be the case. Similar speculation has arisen in recent years with regard to the Internet. Is the Internet a liberating tool offering the possibility of rapid increases in human freedom, or does the Internet threaten our right to privacy? By using the Internet, musicians can now bypass recording companies and publish their own music directly online for fans to download. Day traders can buy and sell shares of stock without the intervention of brokers. Readers of newspapers, books, and magazines can choose the news, entertainment, and even people that they wish to interact with. There is a common thread running through these and similar Internet developments. What appears to be going on here is a radical shift in power, whereby individuals use technology to take control of information away from governments and corporations (Kaplan, 1999).

Many observers feel that the advent of the Internet is an unmitigated positive trend, while others believe that there is a dark side to cyberspace. This latter perspective argues that when individuals use technology excessively and avoid contact with other human beings, there is the danger that they will remove themselves from the wider world. The result may be that cyberspace, which has been prized for its diversity and wealth of information, will lead to a certain type of ignorance through over-involvement in virtual communities at the expense of citizenship in real-world communities (Shapiro, 1999).

While the Internet has the potential to shift control of information away from organisations and institutions in interesting ways, individual power and control can be misused. Examples of this misuse include hacking, spreading viruses, sending massive volumes of e-mail spam, distributing pornography, and perpetrating fraudulent schemes. To prevent the abuse of individual power, it may be necessary to curb some of the freedom that has heretofore reigned in cyberspace. The question is whether a balance can be achieved between individual freedom and the needs of civil society. This chapter focuses on one aspect of this question, namely the existence and extent of fraud perpetrated through the Internet. The chapter will discuss whether fraud using the Internet constitutes a new category of fraud or whether it is a classic form of fraud committed through a new medium. Before addressing this question in more detail, the following section will briefly discuss the issue of what fraud is or may be.
A THEORY OF FRAUD

Mitchell et al. (1998) indicate that fraud is a form of white-collar crime. They argue that white-collar crime is: “a contested concept which is invoked to cover abuse of position, power, drug trafficking, insider trading, fraud, poverty wages, violation of laws, theft, exploitation, and concealment, resulting in financial, physical, psychological damage to some individuals and a disruption to the economic, political, and social institutions and values” (Mitchell et al., 1998, p. 593). Mitchell et al. suggest that opportunities to commit white-collar crime have expanded as free-market policies have become the reigning political economic philosophy, rendering it more likely that fraud and other white-collar crimes will go unpunished and unprevented. They also argue that some professionals, including lawyers, accountants, and information technology specialists, have been implicated in white-collar crime and fraudulent activities. Because the Internet has been a repository of strong beliefs about the inadvisability of government regulation, the potential for white-collar crime and fraud to proliferate through the Internet may be greater than it is for other media (Kedrosky, 1998).

From a general point of view, fraud is defined as any act where one party deceives or takes unfair advantage of another. From a legal perspective, fraud is defined more specifically as an act, omission, or concealment, involving a breach of legal or equitable duty or trust, resulting in disadvantage or injury to another. By law, it is necessary to prove that a false representation was made as being true, and that the statement was made with intent to deceive and to induce the other party to act upon it. Ordinarily it must be proven that the person who was defrauded suffered an injury or damage from the act. In sum, fraud is a deliberate misrepresentation of fact for the purpose of depriving someone of a valuable possession (Encyclopaedia Britannica Online, 2001). While fraud can be viewed as a crime, often it is an element of a crime, such as in the act of taking money by false pretenses or by impersonation.
European legal codes often define fraud to include not only intentional misrepresentations of fact designed to deceive another into parting with valuable property, but also misunderstandings arising out of normal business transactions. Thus, any omission or concealment that is injurious to another, or that allows a person to take unfair advantage of another, may constitute criminal fraud in some countries. In Anglo-American legal systems, this latter type of fraud is often treated as deceit, subject to civil action rather than criminal penalties (Encyclopaedia Britannica Online, 2001).

Managers of information technology are often concerned about fraud in their organisations. This is understandable because the annual cost of fraud is high. It is estimated that businesses lose approximately six percent of their annual revenue to fraudulent schemes. On average, organisations lose $9 per day per employee to fraud (Association of Certified Fraud Examiners, 2001). Research indicates that the persons most likely to commit fraud are college- or university-educated white males. Men are responsible for almost four times as many frauds as women. On average, if the perpetrator is male, the loss is $185,000 versus $48,000 for a female. Losses arising from persons with college degrees are five times greater than from high school graduates. Fifty-eight percent of fraud is committed by employees, with an average cost of $60,000 per case, while 12 percent is caused by owners, with an average cost of $1 million per case. Fifty percent of fraud involves the cash account of the organisation. About 10 percent arises from conflicts of interest, and about five percent of fraud arises from fraudulent financial statements (Association of Certified Fraud Examiners, 2001).

There has been a growing realization in recent years that the Internet offers a fertile venue for fraudulent schemes. The focus of this chapter is on three particular areas with significant potential for fraud on the Internet, namely: securities fraud, fraud in electronic commerce, and fraud arising from the rapid growth of Internet companies. The next section will address the issue of securities fraud using the Internet.
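The two loss rates quoted in this section (about six percent of annual revenue, and $9 per employee per day) can be turned into a rough estimate for a given organisation. The Python sketch below is illustrative only: the function names, the example figures, and the assumption that "per day" means every calendar day of a 365-day year are not taken from the source.

```python
# Rates quoted in the text (Association of Certified Fraud Examiners, 2001).
REVENUE_LOSS_RATE = 0.06          # roughly six percent of annual revenue
LOSS_PER_EMPLOYEE_PER_DAY = 9.0   # dollars per employee per day


def loss_from_revenue(annual_revenue):
    """Estimated annual fraud loss based on the revenue rate."""
    return annual_revenue * REVENUE_LOSS_RATE


def loss_from_headcount(employees, days=365):
    """Estimated annual fraud loss based on the per-employee rate.

    Assumption: the daily rate applies to every calendar day.
    """
    return employees * LOSS_PER_EMPLOYEE_PER_DAY * days


if __name__ == "__main__":
    # Hypothetical organisation: $10 million revenue, 200 employees.
    print(round(loss_from_revenue(10_000_000)))
    print(round(loss_from_headcount(200)))
```

The two estimates will generally differ for any particular organisation, since they are averages taken across many businesses; they are best read as order-of-magnitude indicators.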
SECURITIES FRAUD ON THE INTERNET

While the Internet can be helpful in obtaining investment information, it can also be used to commit securities fraud. The U.S. Securities and Exchange Commission has cited more than 100 companies and individuals for committing securities fraud using the Internet (SEC, 2001). Among other things, the perpetrators of securities fraud through the Internet have been cited for failing to tell investors that they were paid for recommending shares of companies, for hiding their lack of independence from the companies they were recommending, for issuing false or misleading information about the companies they recommended, and for using false information to drive up the price of shares so that they could be sold before accurate information became known. Because the Internet allows information to be communicated easily and inexpensively to a vast audience, it is easy for persons intent on committing securities fraud to send credible-looking messages to a large number of possible investors. Investors are often unable to tell the difference between legitimate and false claims. The channels through which securities fraud has been committed using the Internet include online investment newsletters, bulletin boards, and e-mail spam. Many of the fraudulent activities cited by the SEC have been classic investment frauds, such as: The Pump and Dump, The Pyramid, The Risk-Free Fraud, and Off-Shore Frauds (SEC, 2001).
Online Investment Newsletters

There have been a large number of investment newsletters appearing in recent years on the Internet. Online newsletters offer investment advice and recommend the purchase of a specific company’s shares. Legitimate newsletters help investors gather investment information, but some are fraudulent. Companies may pay newsletters to recommend their shares. This practice is not illegal, but U.S. securities laws require newsletters to disclose who paid them, as well as the amount and the type of payment. If the newsletter does not disclose its relationship with the company being recommended, the newsletter has committed securities fraud. The newsletter may appear to be legitimate, but it earns a fee if it persuades investors to buy or sell a particular company’s shares. Some online newsletters commit securities fraud by claiming to perform research on the companies they recommend when in fact they do not. Other newsletters spread false information or promote worthless shares. The goal is to drive up the price of the shares in order to sell before investors can obtain truthful information about the companies (SEC, 2001).
Bulletin Boards

Online bulletin boards exist in several different formats, including chat rooms, newsgroups, and web site-based bulletin boards. Bulletin boards have become a popular way for investors to share information concerning investment opportunities. While some messages are true, many are fraudulent. Persons engaged in fraudulent schemes pretend to reveal inside information about upcoming announcements, new products, or lucrative contracts. It is often difficult to ascertain the reliability of such information because bulletin boards allow users to hide their identity behind aliases. Persons claiming to be unbiased observers may be company insiders, large shareholders, or paid promoters. Acting alone, an individual may be able to create the illusion of widespread interest in a thinly traded stock by posting a large number of messages under various aliases (SEC, 2001).
E-Mail Spam

E-mail spam is similar to junk mail. Because e-mail spam is inexpensive and easy to create, persons intent on committing securities fraud use it to locate potential investors for investment schemes or to spread false information about a company. E-mail spam allows solicitation of many more potential investors than mass mailing or cold calling. Through the use of bulk e-mail programs, personalized messages can be sent to thousands of Internet users simultaneously (SEC, 2001).
Classic Investment Frauds Using the Internet

Investment frauds on the Internet are similar in many respects to frauds using the telephone or the mail. The following are some examples:

• The pump and dump: This type of fraud involves online messages that urge investors to buy shares quickly or recommend selling before the price goes down. The sender of the message claims to have inside information about a company or the ability to pick shares that will increase in price. The perpetrator of the fraud may be an insider or paid promoter who stands to gain by selling their shares after the stock price is pumped up. Once the perpetrator sells their shares and stops promoting the company, the price falls and investors lose their money. This scheme is often employed with small, thinly traded companies because it is easier to manipulate share prices when there is relatively little information available about the company.
• The pyramid: This type of fraud involves a message such as: “How To Make Big Money From Your Home Computer!!” The message might claim that investors can turn $5 into $60,000 in just three to six weeks. The promotion is an electronic version of a classic pyramid scheme where participants make money only if they can recruit new participants into the program.
• The risk-free fraud: This type of fraud involves a message like: “Exciting, Low-Risk Investment Opportunities” inviting participation in wireless cable projects, prime bank securities, or eel farms. The investment products usually do not exist.
• Off-shore frauds: Off-shore frauds targeting U.S. investors are common. The Internet has removed barriers imposed by different time zones, different currencies, and the high costs of international telephone calls and postage. When an investment opportunity originates in another country, it is difficult for U.S. law enforcement agencies to investigate and prosecute the frauds.
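The arithmetic behind the pyramid claim quoted in the list above ($5 turned into $60,000) shows why such schemes depend on continual recruitment. The Python sketch below is a deliberately simplified model, not a description of any actual scheme: it assumes every participant recruits the same number of new participants, that each recruit pays the same entry fee, and that all fees from a participant's downline flow to that one participant. The function names and the choice of six recruits per person are hypothetical.

```python
def downline_size(recruits_per_person, levels):
    """Total participants below one person after `levels` rounds of recruiting."""
    return sum(recruits_per_person ** level for level in range(1, levels + 1))


def levels_needed(target_payout, entry_fee, recruits_per_person):
    """Smallest number of recruiting rounds whose entry fees cover the payout."""
    if recruits_per_person < 1 or entry_fee <= 0:
        raise ValueError("need at least one recruit per person and a positive fee")
    levels = 0
    while downline_size(recruits_per_person, levels) * entry_fee < target_payout:
        levels += 1
    return levels


if __name__ == "__main__":
    # Claim from the text: $5 becomes $60,000.
    # Hypothetical assumption: each participant recruits six others.
    rounds = levels_needed(60_000, 5, 6)
    print(rounds, downline_size(6, rounds))
```

Under these assumptions a single $60,000 payout requires 12,000 paying recruits, and the number of people needed grows geometrically with every additional participant who expects the same return, which is why only early entrants can profit.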
Examples of Securities Fraud on the Internet

Francis Tribble and Sloane Fitzgerald, Inc. sent more than six million unsolicited e-mails, and distributed an online investment newsletter to promote the shares of two small, thinly traded companies (SEC, 2001). Because Tribble and Sloane failed to tell investors that the companies they were recommending had agreed to pay them in cash and securities, the SEC sued to stop them and imposed a $15,000 penalty on Tribble. The massive amount of e-mail spam distributed by Tribble and Sloane resulted in hundreds of complaints being received by the SEC’s online Enforcement Complaint Center (SEC v. Tribble, 1998).

The SEC also cited an Internet newsletter called Future Superstock (FSS), written by Jeffrey Bruss of West Chicago, Illinois. Bruss recommended the purchase of shares in 25 microcap (i.e., small capitalization) companies and predicted that the share prices would double or triple in the months following dissemination of the recommendations. In making these recommendations, FSS: (1) failed to disclose more than $1.6 million of compensation, in cash and stock, from profiled issuers; (2) failed to disclose that it had sold shares in many of the issuers shortly after dissemination of recommendations; (3) said that it had performed independent research and analysis in evaluating the companies profiled by the newsletter when it had conducted little, if any, research; and (4) lied about the success of certain prior stock picks (SEC v. The Future Superstock et al., 1998).

The SEC also cited Charles Huttoe and 12 other defendants for secretly distributing to friends and family nearly 42 million shares of Systems of Excellence, Inc., known by its ticker symbol SEXI (SEC, 2001). In a pump and dump scheme, Huttoe drove up the price of SEXI shares through false press releases claiming multi-million dollar sales which did not exist, an acquisition that had not occurred, and revenue projections that had no basis in reality. He also bribed a co-defendant, SGA Goldstar, to tout SEXI to readers of SGA Goldstar’s online newsletter called Whisper Stocks. The SEC fined Huttoe $12.5 million. Huttoe and Theodore Melcher, the author of the online newsletter, were sentenced to federal prison. In addition, four of Huttoe’s colleagues pled guilty to criminal charges (SEC, 2001).

Matthew Bowin recruited investors for his company, Interactive Products and Services, in a direct public offering completed entirely through the Internet. Bowin raised $190,000 from 150 investors. Instead of using the money to build the company, Bowin pocketed the proceeds. The SEC sued Bowin in a civil case, and the Santa Cruz, California, District Attorney’s Office prosecuted him criminally. He was convicted of 54 felony counts and sentenced to jail (SEC, 2001).

IVT Systems solicited investments to finance the construction of an ethanol plant in the Dominican Republic. The Internet solicitations promised a return of 50% or more with no reasonable basis for the prediction. The solicitations included false information about contracts with well-known companies and omitted other important information about the company. After the SEC filed a complaint, IVT Systems agreed to stop breaking the law (SEC, 2001).
In another case, Gene Block and Renate Haag were charged by the SEC with offering prime bank securities through the Internet, a type of security that does not exist. Block and Haag collected over $3.5 million by promising to double investors’ money in four months. The SEC froze their assets and prevented them from continuing their fraud (SEC, 2001).
Combating Securities Fraud on the Internet

It should be recognized that securities frauds using the Internet are similar to frauds that existed before the Internet. The perpetrators of securities fraud often engage professional advisors such as lawyers, accountants, and information technology specialists for advice concerning accounting, taxation, information systems design, and other matters. Mitchell et al. (1998) indicate that professionals are implicated in white-collar crimes such as money laundering. While there is no specific evidence that lawyers, accountants, and information technology professionals have been involved in securities frauds using the Internet, it seems improbable that such frauds could be perpetrated without at least the tacit involvement of knowledgeable professionals. It is important for information technology managers and professionals to be aware of the activities of their associates. If these activities include securities fraud using the Internet, there should be an attempt to prevent such activities. If an appropriate response is not forthcoming through these efforts, the IT manager should cease further contact with such associates. Obviously, if the IT manager is facilitating such activities, they could be subject to SEC enforcement actions or even criminal prosecution.

Securities fraud on the Internet is not just a U.S. phenomenon. The Fraud Advisory Panel of the Institute of Chartered Accountants in England and Wales (ICAEW) estimates that Internet fraud costs the United Kingdom as much as five billion pounds per year. This estimate includes both securities fraud and other types of fraud in electronic commerce, which is the subject of the next section.
FRAUD IN ELECTRONIC COMMERCE

There is widespread recognition that the Internet offers an innovative and powerful way to conduct business activities (Tedeschi, 1999). Forrester Research, Inc. indicates that participants in electronic commerce purchase an average of $4 billion per month online (Forrester Research, 2001). Many transactions in electronic commerce are consummated with credit cards. The use of credit cards provides a certain degree of comfort to consumers because there are legal limits on losses arising from unauthorized use of credit card information. Nevertheless, perpetrators of fraudulent schemes using the Internet often look for opportunities to obtain credit card information as well as other private information such as e-mail addresses, home addresses, phone numbers, birth dates, social security numbers, and other similar types of information which can be sold to e-mail spam lists. This is a ripe area for fraud. Participants in electronic commerce are frequently concerned about the potential for fraud or other forms of misuse of information transmitted through the Internet. Gray and Debreceny (1998) have detailed some of the concerns that participants in electronic commerce have, including:

• Is this a real company?
• Is this a trustworthy company?
• If I send credit card or bank information, is it safe?
• If I provide information to a company on its web site, where will the information end up?
• If I place an order, will I receive what I asked for?
• Will I receive delivery when promised?
• Will any problems I have be resolved quickly?
• Is a money-back guarantee honored?
• How soon will I get credit for returned items?
• How quickly will the company perform service on warranty items?
• Will the company be able to send me necessary replacement parts quickly?
It should be recognized that the concerns expressed above can exist in any type of transaction, whether conducted face-to-face, over the telephone, or through the Internet. Unscrupulous people will be unscrupulous regardless of the medium through which the transaction is conducted.

Several mechanisms have been developed in recent years to reduce the concerns of participants in electronic commerce, including electronic logos, encryption techniques, and firewalls. The idea behind an electronic logo is that if an online merchant meets certain specified criteria, the merchant is allowed to place a logo on its web site. The logo is provided by an assurance provider, such as a public accounting firm, or another entity organised for that purpose. Examples include: AICPA/CICA’s WebTrust, VeriSign, TRUSTe, ICSA, and BBBOnline. The logo is intended to provide assurance that the merchant has complied with standards established by the assurance provider. Usually, the logo is linked to the assurance provider’s web site. The online consumer can navigate to the assurance provider’s web site to read about the degree of assurance provided by the logo (Gray and Debreceny, 1998). An example is the VeriSign logo (www.verisign.com), which provides assurance that a web site is capable of transmitting and receiving secure information and that the site and company are real. The VeriSign logo focuses primarily on the security of the transaction and the validity of the web site and the electronic merchant.

WebTrust is another logo assurance service that was developed jointly between the American Institute of CPAs (AICPA) and the Canadian Institute of Chartered Accountants (CICA). Other accounting associations in the United Kingdom, Australia, and New Zealand are also participating in the WebTrust program. WebTrust operates under the assumption that consumers seek assurance in the following areas:

• They are dealing with a real company, rather than a fraudulent company seeking to obtain and sell credit card numbers, addresses, and other private information.
• They will receive the goods and services ordered, when promised, at the agreed-upon price.
• They have the option to request that the Internet seller not give or sell any private information provided in an online transaction.
• Private information cannot be intercepted while being transmitted (Primoff, 1998).
WebTrust is an attestation service provided by a licensed public accounting firm. During the assurance engagement, the WebTrust practitioner “audits” the online business to verify compliance with certain principles and criteria. The principles and criteria address matters such as privacy, security, availability, confidentiality, consumer redress for complaints, and business practices. The WebTrust Principles and Criteria were developed jointly by the AICPA and the CICA. In the United States, the WebTrust engagement is performed in accordance with standards specified by the AICPA. At the client’s request, the WebTrust practitioner may also provide consulting advice as part of the preparation for the WebTrust examination. If the online business meets the WebTrust Principles and Criteria, the site can display the WebTrust seal of approval. By “clicking” on the WebTrust seal, online customers can review the site’s business practice disclosures, report of the independent accountant, and management’s assertions, as well as viewing a list of other sites with seals and a digital certificate that authenticates the seal. At least every 90 days, the WebTrust practitioner must update their testing of the relevant activities to determine continued compliance with the WebTrust Principles and Criteria. If the site fails to comply, the seal can be revoked.
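The renewal rule described above (testing updated at least every 90 days, with revocation on non-compliance) can be expressed as a simple date check. The Python sketch below is only an illustration of that rule as stated in the text; the function name, its parameters, and the idea of reducing compliance to a single flag are assumptions, and it does not represent any actual WebTrust system or interface.

```python
from datetime import date, timedelta

# Interval stated in the text: testing must be updated at least every 90 days.
MAX_TESTING_INTERVAL = timedelta(days=90)


def seal_is_current(last_tested, today, complies=True):
    """Return True if a seal could still be displayed under the stated rule.

    last_tested: date of the practitioner's most recent testing (hypothetical input)
    today:       date on which the seal is being checked
    complies:    whether the most recent testing found the site compliant
    """
    within_interval = (today - last_tested) <= MAX_TESTING_INTERVAL
    return complies and within_interval


if __name__ == "__main__":
    print(seal_is_current(date(2001, 1, 1), date(2001, 3, 1)))   # 59 days elapsed
    print(seal_is_current(date(2001, 1, 1), date(2001, 4, 15)))  # 104 days elapsed
```

A consumer-facing check of this kind would only confirm that the testing is recent; it says nothing about whether the merchant actually honours the principles between tests.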
Combating Fraud in Electronic Commerce

Logo assurance services and encryption techniques are intended to reduce concerns about fraud in electronic commerce. IT managers may seek to convince potential online consumers to rely on logos as providing assurance against fraud and misuse of information. However, it is important for online consumers to be aware of the limits of the assurance provided by these logos. It must be recognized that providers of logos disclaim responsibility if the electronic merchant violates the principles and criteria of the logo provider or if fraud is present. Consequently, logo assurance programs do not provide protection against fraud; rather, they are primarily marketing devices.

In addition, the National Consumers’ League Internet Fraud Watch indicates that the greatest number of complaints about fraud in electronic commerce concern online auctions (National Consumers’ League, 1999). In an online auction, the auction web portal takes no responsibility for the quality, the suitability, or even the existence of the merchandise offered for sale. Fraud in online auctions has occurred frequently. For example, in December 1998, using a number of aliases, Jamison Piatt promised on eBay auctions that he had more than 1,500 copies of the popular Furby toy ready for delivery by Christmas. In January 1999, the state of Pennsylvania’s attorney general announced that Piatt had agreed to reimburse 29 persons who never received their Furbys because they had never existed (Wice, 1999).

Many online auction sites are legitimate business enterprises, and they try to ensure that the persons offering items for sale do not mislead buyers, but some sites and some sellers are not legitimate businesses. A typical scheme is to induce online purchasers to submit bids for software, such as Microsoft Office, at below-market prices. Bidders are told that they won the auction and are legally obliged to pay within 24 hours, but the product never arrives and the buyer is left holding the bag (BBC, 1999a).

What appears to be happening, both in the area of Internet securities fraud and fraud in electronic commerce, is that the ability of the perpetrator of the fraud to contact a large number of people at relatively low cost allows the fraud to be conducted more easily. In addition, the lack of face-to-face contact appears to induce people to be more credulous of unlikely claims. The creation of virtual communities and the corresponding decrease in the level of participation in real-world communities reduces the propensity of individuals to question the reasonableness of claims, thereby facilitating the growth of fraud on the Internet.
FrAUD IN tHE rAPID GrOWtH OF INtErNEt cOMPANIEs A third area of potential fraud using the Internet lies in the rapid growth of companies whose existence depends solely on the Internet. This potential has been highlighted during the last several years by the rapid rise in prices of Internet company shares followed by an equally rapid decline, with many dot com companies going bankrupt during the years 2000 and 2001. Even though electronic commerce has been growing very rapidly, it can be described as still in the development stage. In the dot com industry, there are many companies struggling to succeed, and, as recent events have demonstrated, many of these companies will ultimately fail. During a period of rapid growth and contraction in an industry, it is likely that fraudulent practices will develop. Even if most Internet companies are legitimate, some have no economic basis. The business practices of some Internet companies border on fraud in the broader
sense defined in the first part of this chapter. In addition, as with other rapidly growing industries, there is often a lack of control over data and systems, particularly when a significant portion of a company's transactions are conducted through the Internet. In this environment, Internet companies may not have control over the information systems that are essential to their business. This is an environment ripe for fraud.
The Internet has sometimes been viewed as a rainbow with a pot of gold at the end, but we now realize that there is a grim reality to this picture. Most Internet companies do not make money (Hansell, 1998; Kedrosky, 1998). Even Amazon.com, one of the best-known Internet companies, has not made a profit since its inception. The economic basis of many Internet companies is not the sale of products or services, but rather the sale of advertising. Many Internet companies were created on the basis of projections about advertising revenues drawn from market research done by consulting firms. These projections may be suspect for several reasons. First, as with other forms of advertising, Internet advertising revenue is based on the number of persons who view the advertisement, but it has been estimated that the top 10 Internet sites receive 50% of the advertising revenue, and the top 100 sites receive almost 95% of the revenue (Kedrosky, 1998). A second area in which projections concerning Internet advertising revenues may be suspect lies in banner exchanges. Internet companies earn advertising credits by showing advertisements for other Internet companies. In other words, one Internet company provides advertising for another Internet company and vice versa. The payments are in the form of credits for advertising on the other company's web site. Revenues are produced, but there is no cash flow (Kedrosky, 1998). A third area in which Internet advertising revenues may be suspect lies in the measurement of the number of visitors to a web site.
A web site may report that it receives one million hits (i.e., visitors). However, the number
of actual visitors may be as low as one percent of that number (i.e., 10,000). This is because the measurement of hits is based on factors such as the number of links and graphic images on the site. Consequently, the number of actual visitors is difficult to measure with any degree of accuracy (Kedrosky, 1998). Beyond the issue of questionable projections concerning Internet advertising revenues, there is the issue of technologies such as autonomous agents which may reduce the probability of earning a profit from Internet sales. The purpose of an autonomous agent is to locate every seller on the Internet that sells a particular item and then to sort them by price. Consequently, whatever an Internet company may try to do to create brand identity, or provide a service, autonomous agents will drive the market to the lowest price (Kedrosky, 1998). In addition, it is questionable whether Internet companies make money even in a period of rapidly growing electronic commerce. It is estimated that despite large increases in online commerce during recent years, less than five percent of online retailers earned a profit (High, 1999). Another area with potential for fraud lies in the initial public offering of Internet company shares. During 1998 and 1999, there was a stock market fascination with Internet companies which resembled a classic speculative bubble. Internet companies with no earnings, and in some cases no sales or even negative net worth, were able to complete initial public offerings at highly inflated prices. Because most Internet companies did not have earnings, financial analysts invented the price-to-revenues ratio as a comparative indicator. This precipitated a host of misleading accounting practices related to premature recognition of revenues. The lack of economic substance underlying many Internet IPOs resulted in a sharp decline in the price of Internet company shares in 2000 and 2001. 
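The gap between hits and visitors described above can be illustrated with a little arithmetic. The sketch below assumes, purely for illustration, that every page view generates one hit per file requested (the page plus its embedded images) and that each visitor views several pages. The function name and the per-page and per-visit figures are hypothetical and are not taken from Kedrosky (1998).

```python
def estimate_visitors(hits, files_per_page, pages_per_visit):
    """Rough visitor estimate from a raw hit count.

    Assumes each page view produces `files_per_page` hits (the page
    plus its embedded images) and each visitor views `pages_per_visit`
    pages. Both figures are illustrative assumptions.
    """
    hits_per_visitor = files_per_page * pages_per_visit
    return hits // hits_per_visitor


# One million reported hits, 10 files per page, 10 pages per visit.
reported_hits = 1_000_000
visitors = estimate_visitors(reported_hits, files_per_page=10, pages_per_visit=10)
print(visitors)  # 10000, i.e. one percent of the reported hits
```

Under these assumed figures, a site reporting one million hits would have been seen by only about 10,000 people, which matches the one percent ratio mentioned in the text.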
A final area for potential fraud arising from the rapid growth of Internet companies lies in the lack of managerial and internal controls in these
companies. Until recently, the cost of the hardware, software, and professional expertise necessary for electronic commerce served as a barrier to entry. The costs are now much lower. Internet service providers (ISPs) offer turnkey solutions that combine hardware, software, payment processing, and communications in one package. Since the ISP packages are outsourced, they operate solely on the ISP’s computers (Primoff, 1998). In a turnkey ISP approach, all of the information is located with the ISP, potentially compromising the Internet company’s access to information and the ability to exclude unauthorized persons from obtaining access. It is important for companies to understand how their information is controlled and by whom. It is also important to ascertain whether the Internet company and the ISP personnel have the skills necessary to deal with issues of security and internal control and what security techniques are employed (Primoff, 1998).
Combating Fraud in the Rapid Growth of Internet Companies
Many would say that the recent rapid rise and fall of Internet companies is an example of free markets at work. However, what is overlooked in this assessment is that previous speculative bubbles, such as the worldwide stock market crash of 1929, have usually resulted in calls for greater government regulation of private sector economic activity. The U.S. Securities and Exchange Commission has been observing and in some cases punishing securities fraud using the Internet, but it has not taken any visible steps to scrutinize the issuance of shares in Internet companies when there is little or no economic substance. Whether or not these companies will ultimately prove to be successful in a traditional economic sense remains to be seen. What is true is that the issuance of shares of Internet companies during the late 1990s had all of the hallmarks of a classic speculative bubble. As has been historically true of all previous speculative bubbles, this bubble burst,
causing economic losses to many investors. Some would say that if losses have occurred, it is all in the normal course of business, but is this merely an example of the virtual community triumphing over any sense of real-world community?
CONCLUSION
This chapter has examined the issue of fraud on the Internet and has examined three areas with significant potential for fraud, namely: securities fraud, fraud in electronic commerce, and fraud arising from the rapid growth of Internet companies. The SEC has cited many companies and individuals for committing securities fraud on the Internet. Activities prohibited under U.S. law are being conducted through the Internet, and the SEC has taken action to suppress these activities. A second potential area for fraud on the Internet lies in electronic commerce. The rapid growth of electronic commerce in recent years, and the corresponding desire by consumers to feel secure when engaging in electronic commerce, has prompted the creation of logo services such as WebTrust, which are designed to reduce concerns about misuse of information. Nonetheless, it must be recognized that providers of logos and other seals do not actually offer any assurances regarding the lack of fraud. A third area for potential fraud on the Internet discussed in this chapter involves the rapid growth of Internet companies, often based on little economic substance and without traditional management or internal controls. These three potential areas for fraud on the Internet have developed rapidly, and it may well be that we are seeing opportunistic fraudulent schemes perpetrated by clever individuals. However, as Mitchell et al. (1998) point out, complex fraudulent schemes are difficult to perpetrate without the assistance of knowledgeable professionals. Have lawyers, accountants, and information technology professionals been involved
with fraud on the Internet? The evidence on this question is unclear, but the possibility is there.
REFERENCES
Association of Certified Fraud Examiners. (2001). Report to the Nation on Occupational Fraud and Abuse. Available at http://www.cfenet.com/media/report/reportsection1.asp
BBC. (1999a). Internet scam file. BBC On-Line Network, April 7. Available at http://news.bbc.co.uk/hi/english/business/your_money/newsid_313000/313051.stm
BBC. (1999b). Internet fraud to cost UK business billions. BBC On-Line Network, May 13. Available at http://news.bbc.co.uk/hi/english/business/the_economy/newsid_342000/342644.stm
Cahners Publishing Company. (1998). Auditing the website. Electronic News, 44(2219), 48.
Encyclopedia Britannica Online. (2001). Fraud. Available at http://www.eb.com:180/bol/search?type=topic&query=fraud&DBase=Articles&x=20&y=
Forrester Research. (2001). Forrester Online Retail Index. Cambridge, MA: Forrester Research, Inc. Available at http://www.forrester.com/NRF/1,2873,0,00.html
Garcia, A. M. (1998). Global e-commerce explodes: Will you thrive, survive, or die? e-Business Advisor, October.
Gray, G. L. and Debreceny, R. S. (1998). The electronic frontier. Journal of Accountancy, 185(1), 32-37.
Hansell, S. (1998). A quiet year in the Internet industry. The New York Times, December 28, C1.
High, K. (1999). What the holiday web boom hid. Fortune & Your Company, January 4. Available at http://cgi.pathfinder.com/yourco/briefs/0,2246,142,00.html
Kaplan, C. (1999). Writer seeks balance in Internet power shifts. New York Times Cyber Law Journal, June 18. Available at http://www.nytimes.com/library/tech/99/06/cyber/cyberlaw/18law.html
Kedrosky, P. (1998). There's little but fool's gold in the Internet boomtown. The Wall Street Journal, November 23, A22.
Lohr, S. and Markoff, J. (1998). AOL lays out plan for marriage to Netscape. New York Times On-Line, November 28. Available at http://www.nytimes.com
Nagel, K. D. and Gray, G. L. (1998). Guide to Electronic Commerce Assurance Services. New York: Harcourt Brace Professional Publications.
National Consumers League. (1999). Internet Fraud Watch. Available at http://www.nclnet.org/Internetscamfactsheet.html
Primoff, W. M. (1998). Electronic commerce and webtrust. The CPA Journal, 68(November), 14-23.
Schmidt, W. (1998). Webtrust services: AICPA launches webtrust for assurance. The CPA Journal, 68, 70.
SEC. (2001). Internet Fraud: How to Avoid Internet Investment Scams. Washington, DC: Securities and Exchange Commission, October. Available at http://www.sec.gov/investor/pubs/cyberfraud.htm
SEC v. John Wesley Savage et al. (1998). Washington, DC: Securities and Exchange Commission, October. Available at http://www.sec.gov/enforce/litigrel/lr15954.txt
SEC v. The Future Superstock et al. (1998). Washington, DC: Securities and Exchange Commission, October. Available at http://www.sec.gov/enforce/litigrel/lr15958.txt
SEC v. Tribble. (1998). Washington, DC: Securities and Exchange Commission, October. Available at http://www.sec.gov/enforce/litigrel/lr15959.txt
Shapiro, A. L. (1999). The Control Revolution: How the Internet is Putting Individuals in Charge and Changing the World We Know. New York: Public Affairs.
Tedeschi, B. (1998). Real force in e-commerce is business-to-business sales. New York Times Online, January 5. Available at http://www.nytimes.com
Wice, N. (1999). Furby fraud on eBay. Time Digital, January 25. Available at http://cgi.pathfinder.com/time/digital/daily/0,2822,18831,00.html
This work was previously published in Knowledge and Information Technology Management, edited by A. Gunasekaran, O. Khalil, and S. M. Rahman, pp. 268-282, copyright 2003 by Idea Group Publishing (an imprint of IGI Global).
Chapter 1.7
Codes of Ethics in Virtual Communities
Călin Gurău
Groupe Sup. de Co. Montpellier, France
INTRODUCTION
The development of the World Wide Web has created new opportunities for interpersonal interaction. The Internet allows one-to-one (e-mail), one-to-many (Web sites, e-mail lists) or many-to-many (online discussion forums) interaction, which represents a unique feature in comparison with traditional communication channels (Armstrong & Hagel, 1996). In addition, the Internet has specific characteristics, such as:
• Interactivity: The Internet offers multiple possibilities of interactive communication, acting not only as an interface, but also as a communication agent (allowing a direct interaction between individuals and software applications).
• Transparency: The information published online can be accessed and viewed by any Internet user, unless this information is specifically protected.
• Memory: The Web is a channel not only for transmitting information, but also for storing information; in other words, the information published on the Web remains in the memory of the network until it is erased.
These characteristics permit the development of online or virtual communities: groups of people with similar interests who communicate on the Web in a regular manner (Armstrong & Hagel, 1996; Goldsborough, 1999a, 1999b; Gordon, 2000).
Many studies deal with the ethics of research in cyberspace and virtual communities (Bakardjieva, Feenberg, & Goldie, 2004), but very few publications deal with the codes of ethics used in public discussion forums (Belilos, 1998; Johnson, 1997). Other specialists have analyzed specific categories or uses of online discussion forums, such as online learning (Blignaut & Trollip, 2003; DeSanctis, Fayard, Roach, & Jiang, 2003) or the creation of professional communities of practice (Bickart & Schindler, 2001; Kling, McKim & King, 2003; Millen, Fontaine, & Muller, 2002), and in this context have also briefly discussed the importance of netiquette and forum monitoring (Fauske & Wade, 2003/2004). The
Copyright © 2008, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
difference between these online communities and public discussion forums is the degree of control exercised on the functioning and purpose of the forum by a specific individual or organization. This article attempts to investigate, analyze and present the main patterns of the codes/rules of ethics used in the public discussion forums, otherwise known as Newsgroups, and their influence on the profile and functioning of the community.
THE ORGANIZATION OF DISCUSSION FORUMS
The discussion forum is a Web-based application that functions as a worldwide bulletin board (Fox & Roberts, 1999). Each discussion forum has a specific topic, or a series of related topics, and there are literally thousands of newsgroups available on the Internet, covering virtually any issue (Preece, 2001; Rheingold, 2000). Typically, online discussion forums use a three-tiered structure (Bielen, 1997):
1. Forums: Focus on individual topic areas, such as classifieds or current news.
2. Threads: Created by end users to narrow a discussion to a particular topic, such as a car someone is looking to buy or a comment on a previously posted message. A thread opens a new topic of conversation. Once the topic is created, anyone can continue the ongoing conversation.
3. Messages: Individual topic postings. A message is often a reply to someone else's message, or users can post a message to initiate a conversation (thread).
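The three-tiered structure listed above maps naturally onto a small nested data model. The following is a minimal sketch, not a description of any particular forum software; the class names, fields and sample data are hypothetical and chosen only to mirror the forum, thread and message tiers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Message:
    author: str
    body: str
    reply_to: Optional[int] = None  # index of the message being answered, if any


@dataclass
class Thread:
    title: str
    messages: List[Message] = field(default_factory=list)

    def post(self, author, body, reply_to=None):
        """Append a message and return its index within the thread."""
        self.messages.append(Message(author, body, reply_to))
        return len(self.messages) - 1


@dataclass
class Forum:
    topic: str
    threads: List[Thread] = field(default_factory=list)

    def start_thread(self, title, author, body):
        """Open a new thread whose first message initiates the conversation."""
        thread = Thread(title)
        thread.post(author, body)
        self.threads.append(thread)
        return thread


# Hypothetical usage: one forum, one thread, a message and a reply.
classifieds = Forum("Classifieds")
thread = classifieds.start_thread("Looking for a used car", "alice", "Any offers?")
thread.post("bob", "I am selling a hatchback.", reply_to=0)
```

Keeping the reply link on the message, rather than on the thread, reflects the observation that a message is either a reply to an earlier posting or the start of a conversation.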
An interested person can access the messages transmitted by other members of the discussion forum, post messages on the same discussion forum or start a new thread of discussion. Usually, in order to post a message or start a new thread, participants are asked to register first; however,
many discussion forums are totally transparent, since anyone (members or visitors) can access the archived messages and read them. Most discussion forums are monitored by people (monitors and administrators) and/or software applications (e.g., programs that automatically censor specific words from the posted messages). The monitors are usually volunteers that have a good knowledge of and interest in the topics discussed (Preece, 2000).
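The automatic censoring of specific words mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the blocklist is hypothetical, matching is restricted to whole words and is case-insensitive, and each blocked word is replaced by a run of asterisks of the same length.

```python
import re

# Hypothetical blocklist; a real forum would maintain its own.
BLOCKED_WORDS = ("darn", "heck")


def censor(text, blocked=BLOCKED_WORDS):
    """Replace each blocked word with asterisks of the same length."""
    if not blocked:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in blocked) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: "*" * len(match.group(0)), text)


print(censor("Well, darn it. What the HECK happened?"))
# Well, **** it. What the **** happened?
```

Whole-word matching is a deliberate choice in this sketch: it avoids masking harmless words that merely contain a blocked string, at the cost of missing deliberate misspellings.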
CODES OF ETHICS IN DISCUSSION FORUMS
Ethical rules of discussion forums are usually displayed in two formats:
1. Explicit codes of ethics: These are presented under titles such as Terms of Use, Guidelines, Forum Rules, Terms and Conditions, Web Site User Agreement or Forum Policy. Very often, the ethical rules are just a topic among other legal disclaimers and definitions; in other cases they are a stand-alone text that does not include any other legal information. The link to these guidelines is easily identifiable, as the members of the discussion forum are expected to read them before engaging in online interaction.
2. Implicit ethical rules: In this case, no clear indication is provided regarding the ethical rules that have to be respected by forum members; however, indirect references are made in the frequently asked questions (FAQ) section regarding the possibility of censoring members' messages by replacing specific words with a string of "*" characters. In other sites, the ethical rules are limited to a general principle or "Golden Rule," such as "We should do unto others as we would have them do unto us," from which the members can derive the desirable rules of ethical behavior.
When a site hosts multiple discussion forums (e.g., Google Groups), the ethical guidelines officially published by the site usually have a standardized style that might not properly apply to every group active on the site. Also, the site may indicate that it does not monitor the content of specific postings. Sometimes, members of a particular group attempt to create and enforce a specific ethical code for their group, in order to fill in this organizational gap. The attempt is seldom successful, since the member initiating this action does not have any official, recognized authority. The reactions of fellow members to such an initiative are very diverse, ranging from constructive dialogue to ironic criticism.
THE CONTENT OF ETHICAL CODES FOR DISCUSSION FORUMS
The ethical rules used in discussion forums usually fall into one of five general categories:
1. Rules that concern the good technical functioning of the forum. Since the discussion forum is supposed to be an open (and in most cases, free) board for expressing and exchanging ideas and opinions, some members might have the tendency to abuse it. To counter this tendency, participants are forbidden from:
a. Posting multiple messages
b. Inserting in their message long quotes from previous posts of other members, or entire articles downloaded from the Web; as a rule, the member is expected to edit the text, providing only the relevant quotes or an abstract and, if absolutely necessary, a hyperlink to the article of interest
c. Inserting pictures or sound files in their messages
d. Posting files that are corrupted or contain viruses
All these actions occupy substantial computer memory, and slow down or damage the general functioning of the discussion forum.
2. Rules concerning the content of the posted message. Members should not include in their post:
a. Content that is not relevant to the subject of the discussion forum (crossposting)
b. Defamatory, obscene or unlawful material
c. Information that infringes patent, trademark, copyright or trade secret rights
d. Advertisements or commercial offers, other than those accepted by the group (since some discussion forums have a commercial purpose)
e. Questionnaires, surveys, contests, pyramid schemes or chain letters, other than those accepted by the group (since some discussion forums have a data collection purpose)
3. Rules concerning the purpose of use. The members must not use the site to:
a. Defame, abuse, harass, stalk or threaten other members of the same group
b. Encourage racial hatred or discrimination
c. Practice flaming or trolling
d. Engage in flame wars with fellow members
e. Excessively criticize the opinion of other participants (although some sites do not include this advice for restraint)
f. Advertise products
g. Conduct surveys
4. Rules pertaining to personal identification issues. The members should not:
a. Withhold any of the information required for their identification within the discussion forum
b. Impersonate another person or entity
c. Falsify or delete any author attribution
d. Falsely state or otherwise misrepresent their affiliation with a person or entity
e. Delete or revise any material posted by any other person or entity
5. Rules concerning the writing style of the message. Forum members are expected to post messages that:
a. Are in the official language of the Web site
b. Have a title that accurately describes the topic of the message
c. Do not use capital letters excessively (no shouting)
d. Are free from spelling and grammatical mistakes
e. Are not so emotional that they might disrupt the exchange of opinions
f. Do not attack people, but opinions
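Some of the style rules listed above, such as the prohibition on shouting, lend themselves to a simple automated check. The sketch below is only one possible heuristic; the function name and both thresholds are assumptions made for illustration and are not drawn from any of the surveyed forums.

```python
def is_shouting(message, max_upper_ratio=0.5, min_letters=10):
    """Return True if the share of capital letters exceeds the threshold.

    Very short messages are ignored so that acronyms such as "FAQ"
    are not flagged. Both thresholds are illustrative assumptions.
    """
    letters = [ch for ch in message if ch.isalpha()]
    if len(letters) < min_letters:
        return False
    upper = sum(1 for ch in letters if ch.isupper())
    return upper / len(letters) > max_upper_ratio


print(is_shouting("PLEASE READ THIS RIGHT NOW"))  # True
print(is_shouting("Please read the FAQ first."))  # False
```

A check of this kind could support a monitor rather than replace one, since rules about relevance, tone or defamation require human judgment.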
THE ENFORCEMENT OF ETHICAL RULES
Usually, the ethical codes governing the participation in discussion forums are enforced by the forum's monitors or administrators. This can be done proactively or reactively. The proactive enforcement of ethical rules usually takes place in smaller forum communities, where the monitors have the possibility to read every posted message and approve it before it becomes publicly available. The reactive mode is implemented when a participant identifies or has a problem related to the unethical behavior of another member. In this situation the participant can alert the monitor, who can take direct measures against the participant infringing the ethical rules.
In some cases, when a site hosts a huge number of forums, it is clearly specified that site monitors/administrators do not monitor the content of specific postings, although a number of ethical rules might be provided. However, this case can be considered as a particular application of the reactive monitoring system, since the site monitors/administrators will probably react to complaints concerning a blatant breach of ethical rules.
Monitors can take a number of measures progressively to enforce the ethical rules of the forum, such as:
1. A warning for the first-time breach of ethical rules
2. Suspension of posting privileges for repeated infringement of rules
3. Withdrawal of posting rights and deactivation of a member's account for repeated and flagrant violations of ethical rules
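The progressive measures listed above form a simple escalation ladder. As a minimal sketch, assuming that a monitor records the number of earlier breaches and judges whether the current one is flagrant, the ladder could be expressed as follows; the function name, parameters and return strings are hypothetical.

```python
def sanction_for(previous_breaches, flagrant=False):
    """Map a member's breach history to the next enforcement step.

    previous_breaches: breaches recorded before the current one.
    flagrant: whether the current breach is judged flagrant.
    The cut-offs are illustrative assumptions, not a surveyed policy.
    """
    if previous_breaches == 0:
        return "warning"
    if flagrant:
        return "account deactivated"
    return "posting suspended"


print(sanction_for(0))                 # warning
print(sanction_for(2))                 # posting suspended
print(sanction_for(2, flagrant=True))  # account deactivated
```

In this sketch a first breach always leads to a warning, which follows the wording of the list; a forum could equally decide to treat a flagrant first breach more severely.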
In parallel with these actions, the monitors using the proactive mode of surveillance can require a forum participant to edit a message or withdraw it. Alternatively, in some cases, the monitors themselves have the specific right to edit or completely erase posts that might breach the ethical rules of the forum, and to close the threads that are out of focus or that have degenerated into a flame war. These measures are also taken when monitors manage the forum using a reactive mode, but in this case, the action will be initiated by a complaint sent by a forum participant regarding a breach of ethical rules. Any complaint or opposition to these measures should be discussed outside the forum with the moderator, and, if the problem cannot be solved satisfactorily, the member can eventually send an e-mail message to the forum administrator. The advantage of the proactive mode of surveillance is that the messages published online are cleaner and better edited in the first instance. However, this type of monitoring is difficult to implement when the forum is very popular and dynamic, publishing hundreds of messages daily.
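The difference between the two monitoring modes comes down to when a message becomes visible. The toy model below is a sketch under simplifying assumptions (messages are plain strings and a single monitor acts on them); the class and method names are hypothetical.

```python
class ModeratedForum:
    """Toy model of proactive versus reactive monitoring."""

    def __init__(self, proactive):
        self.proactive = proactive
        self.published = []   # messages visible to everyone
        self.pending = []     # messages awaiting approval (proactive mode)
        self.complaints = []  # (message, reason) pairs awaiting review

    def submit(self, message):
        # Proactive: hold the message until a monitor approves it.
        # Reactive: publish immediately.
        if self.proactive:
            self.pending.append(message)
        else:
            self.published.append(message)

    def approve(self, message):
        # A monitor releases a held message to the public board.
        self.pending.remove(message)
        self.published.append(message)

    def complain(self, message, reason):
        # A participant reports a published message to the monitor.
        self.complaints.append((message, reason))

    def remove(self, message):
        # A monitor erases a message that breaches the rules.
        self.published.remove(message)


# Reactive mode: the message is public until a complaint leads to removal.
reactive = ModeratedForum(proactive=False)
reactive.submit("Buy my product!")
reactive.complain("Buy my product!", "advertising")
reactive.remove("Buy my product!")

# Proactive mode: nothing is public until the monitor approves it.
proactive = ModeratedForum(proactive=True)
proactive.submit("Hello everyone")
proactive.approve("Hello everyone")
```

The pending queue also makes the scaling problem visible: in proactive mode every message waits for a monitor, which is why this mode suits smaller forums.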
A POSSIBLE FRAMEWORK FOR ANALYZING AND DESIGNING ETHICAL RULES FOR PUBLIC DISCUSSION FORUMS
The implementation and enforcement of ethical codes in discussion forums represent a complex and multi-dimensional process (see Figure 1). The main purpose of these rules is to regulate the exchange of opinions within the forum by establishing reasonable limits to participants' behavior. The effectiveness of these rules will be determined by two related variables: the clarity of the rules and their enforcement.
The online exploratory survey of 200 discussion forums has provided multiple values for these two variables, indicating the use of a continuous scale as the best tool for their evaluation. The degree of clarity of rules will vary between implicit and explicit ethical codes, and the enforcement of rules between a proactive and a reactive monitoring style. It is therefore possible to design a two-dimensional graph on which every discussion forum can be represented in a position defined by the specific values given to the two variables (see Figure 2).
To evaluate the effectiveness of each of the four ethical systems, 10 messages have been randomly selected and accessed in each of the surveyed discussion forums, and their level of ethical compliance has been evaluated on a scale from 1 (unethical message) to 10 (fully ethical message). The mean of these 10 measurements of ethical compliance was calculated for each forum, and these forum means were then used to calculate the general mean for the sites included in each of the four possible categories.
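The two-step averaging described above (a mean per forum, then a general mean per category) can be sketched as follows. The ratings in the sample are placeholders invented for illustration; they are not the survey data, and the dictionary keys are hypothetical names.

```python
from collections import defaultdict
from statistics import mean


def category_means(forums):
    """Average the per-forum means within each (rules, monitoring) category."""
    per_category = defaultdict(list)
    for forum in forums:
        forum_mean = mean(forum["ratings"])  # mean of the 10 sampled messages
        per_category[(forum["rules"], forum["monitoring"])].append(forum_mean)
    return {category: mean(values) for category, values in per_category.items()}


# Placeholder ratings on the 1 (unethical) to 10 (fully ethical) scale.
sample = [
    {"rules": "explicit", "monitoring": "proactive", "ratings": [9] * 10},
    {"rules": "explicit", "monitoring": "proactive", "ratings": [7] * 10},
    {"rules": "implicit", "monitoring": "reactive", "ratings": [4] * 10},
]
result = category_means(sample)
```

With these placeholder values the explicit and proactive category averages the two forum means of 9 and 7, giving 8, while the implicit and reactive category has a single forum with a mean of 4.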
Figure 1. The implementation and enforcement of ethical rules in the interactive environment of discussion forums
Figure 2. A bi-dimensional framework for the representation of ethical codes implementation in discussion forums
Figure 3. The results of the empirical study regarding ethical codes implementation in discussion forums
Figure 3 represents the level of ethical compliance for each of the four categories of sites (the size of the star is directly related to the general mean of the ethical compliance measurements, written inside each star). As can be seen, a proactive style of monitoring has a strong influence on the ethical dimension of the posted messages, even when the rules are only implicit. On the other hand, the combination of implicit rules with reactive
monitoring creates a highly anarchic discussion forum environment, which in some cases might be the desired option. The proposed framework can be applied by creators/administrators of discussion forums in order to identify the best format for the design and implementation of ethical rules that are consistent with the specific profile and purpose of the site.
Another important consideration is the content of the ethical code in terms of coverage and generality of the ethical issues that are presented, as well as the clarity of the penalties and the specific power/authority of monitors. All these elements can be combined and dynamically tuned to create the desired environment in the discussion forum. The interactive and dynamic nature of a discussion forum introduces evolutionary features in relation to the style of the posted messages, which consequently determines the specific audience profile of the forum. The “personality” and audience of a discussion forum can therefore be changed over time by manipulating the variables presented above.
CONCLUSION
The implementation of ethical codes represents an essential element for the good functioning of virtual communities (Herring, Job-Sluder, Scheckler, & Barab, 2002; Preece & Maloney-Krichmar, 2002). Discussion forums are very diverse in their topic and organization, and these characteristics introduce variability at the level of ethical rules. The content, structure, mode of presentation and style of enforcement of ethical rules in a discussion forum can represent an important tool for defining the type of the community and the style of interpersonal interaction. The results of the study presented here outline the relation between the mode of presentation and style of enforcement of the ethical code, on the one hand, and the style, values and profile of the online community, on the other. The design and enforcement of ethical codes do not and cannot represent an exact science. However, the dynamics of interaction in a discussion forum permit an evolutionary adaptation of the ethical code to the desired profile of the discussion forum. In this context, more research is necessary to identify, define and measure the influence of ethical codes on the specific organization of a
discussion forum. Future studies may concentrate on particular case studies (forums) in order to analyze the parallel evolution of ethical rules and online community, emphasizing the relation between a particular style of ethical rules and the behavior of participants in the community.
REFERENCES
Armstrong, A., & Hagel, J. III. (1996). The real value of online communities. Harvard Business Review, 74(3), 134-141.
Bakardjieva, M., Feenberg, A., & Goldie, J. (2004). User-centered Internet research: The ethical challenge. In E. Buchanan (Ed.), Readings in virtual research ethics: Issues and controversies (pp. 338-350). Hershey, PA: Idea Group Publishing.
Belilos, C. (1998). Networking on the net. Professionalism, ethics and courtesy on the net. Retrieved November, 2004, from www.easytraining.com/networking.htm
Bickart, B., & Schindler, R.M. (2001). Internet forums as influential sources of consumer information. Journal of Interactive Marketing, 15(3), 31-41.
Bielen, M. (1997). Online discussion forums are the latest communication tool. Chemical Market Reporter, 252(7), 9.
Blignaut, A.S., & Trollip, S.R. (2003). Measuring faculty participation in asynchronous discussion forums. Journal of Education for Business, 78(6), 347-354.
DeSanctis, G., Fayard, A.-L., Roach, M., & Jiang, L. (2003). Learning in online forums. European Management Journal, 21(5), 565-578.
Fauske, J., & Wade, S.E. (2003/2004). Research to practice online: Conditions that foster democracy, community, and critical thinking in computer-mediated discussions. Journal of Research on Technology in Education, 36(2), 137-154.
Fox, N., & Roberts, C. (1999). GPs in cyberspace: The sociology of a 'virtual community.' Sociological Review, 47(4), 643-672.
Goldsborough, R. (1999a). Web-based discussion groups. Link-Up, 16(1), 23.
Goldsborough, R. (1999b). Take the 'flame' out of online chats. Computer Dealer News, 15(8), 17.
Gordon, R.S. (2000). Online discussion forums. Link-Up, 17(1), 12.
Herring, S., Job-Sluder, K., Scheckler, R., & Barab, S. (2002). Searching for safety online: Managing "trolling" in a feminist forum. Information Society, 18(5), 371-385.
Johnson, D.G. (1997). Ethics online. Communications of the ACM, 40(1), 60-65.
Kling, R., McKim, G., & King, A. (2003). A bit more to it: Scholarly communication forums as socio-technical interaction networks. Journal of the American Society for Information Science & Technology, 54(1), 47-67.
Millen, D.R., Fontaine, M.A., & Muller, M.J. (2002). Understanding the benefit and costs of communities of practice. Communications of the ACM, 45(4), 69-74.
Preece, J. (2000). Online communities: Designing usability, supporting sociability. Chichester, UK: John Wiley & Sons.
Preece, J. (2001). Sociability and usability in online communities: Determining and measuring success. Behavior and Information Technology, 20(5), 347-356.
Preece, J., & Maloney-Krichmar, D. (2002). Online communities: Focusing on sociability and usability. In J. Jacko & A. Sears (Eds.), The human-computer interaction handbook (pp. 596-620). Mahwah: Lawrence Erlbaum Associates.
Rheingold, H. (2000). The virtual community: Homesteading on the electronic frontier. Cambridge: MIT Press.
KEY TERMS
Crossposting: Posting the same message on multiple threads of discussion, without taking into account the relevance of the message for every discussion thread.
Flame War: The repetitive exchange of offensive messages between members of a discussion forum, which can escalate and degenerate into an exchange of insults.
Flaming: Posting a personally offensive message as a response to an opinion expressed on the discussion forum.
Monitor: A person who oversees the good functioning of a public discussion forum, usually a volunteer who is knowledgeable about and interested in the specific topic of the forum.
Public Discussion Forum: An Internet-based application that permits an open exchange of opinions and ideas among various Internet users on a specific topic of interest, and that can be easily accessed by interested individuals.
Shouting: Posting a message written entirely or partially in capital letters.
Trolling: Posting a controversial message on a discussion forum with the aim of provoking a flaming response, mainly targeting the inexperienced members of the forum.
This work was previously published in Encyclopedia of Virtual Communities and Technologies, edited by S. Dasgupta, pp. 22-28, copyright 2006 by Idea Group Reference (an imprint of IGI Global).
Chapter 1.8
Digital Audio Watermarking Changsheng Xu Institute for Infocomm Research, Singapore Qi Tian Institute for Infocomm Research, Singapore
ABSTRACT
This chapter provides a comprehensive survey and summary of the technical achievements in the research area of digital audio watermarking. In order to give a big picture of the current status of this area, this chapter covers the research aspects of performance evaluation for audio watermarking, human auditory system, digital watermarking for PCM audio, digital watermarking for wav-table synthesis audio, and digital watermarking for compressed audio. Based on the current technology used in digital audio watermarking and the demand from real-world applications, future promising directions are identified.
INTRODUCTION
The recent growth of networked multimedia systems has increased the need for the protection of digital media. This is particularly important for the protection and enhancement of intellectual property rights. Digital media includes text, digital audio, video and images. The ubiquity of digital media in Internet and digital library applications has called for new methods in digital copyright
protection and new measures in data security. Digital watermarking techniques have been developed to meet these growing concerns and have become an active research area. A digital watermark is an invisible structure embedded into the host media. To be effective, a watermark must be imperceptible within its host, discreet to prevent unauthorized removal, easily extracted by the owner, and robust to incidental and intentional distortions. Many watermarking techniques for images and video have been proposed, mainly focusing on the invisibility of the watermark and its robustness against various signal manipulations and hostile attacks. Most recent work can be grouped into two categories: spatial domain methods (Pitas, 1996; Wolfgang & Delp, 1996) and frequency domain methods (Cox et al., 1995; Delaigle et al., 1996; Swanson et al., 1996). There is a current trend towards approaches that make use of information about the human visual system (HVS) to produce a more robust watermark. Such techniques use explicit information about the HVS to exploit the limited dynamic range of the human eye. Compared with digital video and image watermarking, digital audio watermarking provides a special challenge because the human auditory
Copyright © 2008, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
system (HAS) is far more sensitive than the HVS. The HAS is sensitive to a dynamic range of amplitude of one billion to one and of frequency of one thousand to one. Sensitivity to additive random noise is also acute: perturbations in a sound file can be detected at levels as low as one part in ten million (80 dB below ambient level). Although the limit of perceptible noise increases as the noise content of the host audio signal increases, the typical allowable noise level is very low. While the HAS has a large dynamic range, it often has a fairly small differential range. As a result, loud sounds tend to mask out quiet sounds. Additionally, while the HAS is sensitive to the amplitude and relative phase of a sound, it is unable to perceive absolute phase. Finally, there are some environmental distortions so common as to be ignored by the listener in most cases. There is always a conflict between inaudibility and robustness in digital audio watermarking, and achieving an optimal balance between the two is a major challenge. The aim of this chapter is to provide a comprehensive survey and summary of the technical achievements in the research area of digital audio watermarking. In order to give a big picture of the current status of this area, this chapter covers the research aspects of performance evaluation for audio watermarking, human auditory system, digital watermarking for PCM audio, digital watermarking for wav-table synthesis audio, and digital watermarking for compressed audio. Based on the current technology used in digital audio watermarking and the demand from real-world applications, future promising directions are identified.
PERFORMANCE EVALUATION FOR AUDIO WATERMARKING
Digital audio watermarking can be used in many applications, including copyright protection, authentication, tracing of illegal distribution, captioning and digital rights management (DRM). Since different applications have different requirements, a criterion used to evaluate the performance of digital audio watermarking techniques may be more important in some applications than in others. Most of the requirements conflict with one another, and there is no unique set of requirements that all digital audio watermarking techniques must satisfy. Some important performance evaluation criteria are described in the following subsections. These criteria can also be used in image and video watermarking.
Perceptual Quality
One of the basic requirements of digital audio watermarking is that the embedded watermark must not affect the perceptual quality of the host audio signal; that is, the embedded watermark should not be detectable by a listener. This is important in some applications, such as copyright protection and usage tracking. In addition, digital watermarking should not produce artefacts that are perceptually dissimilar from those that may be detected in an original host signal. Usually, the signal-to-noise ratio (SNR) of the original host signal vs. the embedded watermark can be used as a quantitative quality measure (Gordy & Bruton, 2000):

$$\mathrm{SNR} = 10 \log_{10} \frac{\sum_{n=0}^{N-1} x^{2}(n)}{\sum_{n=0}^{N-1} \left[\tilde{x}(n) - x(n)\right]^{2}} \tag{1}$$

where $x(n)$ is the host signal of length $N$ samples and $\tilde{x}(n)$ is the watermarked signal. A subjective quality measure is the listening test. In a listening test, subjects (called golden ears) are selected to listen to the test sample pairs with and without watermarks and to give grades corresponding to different impairment scales. Listening tests can be complemented by objective perceptual measures such as the “Perceptual Audio Quality Measure (PAQM)” (Beerends & Stemerdink, 1992).
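As an illustration, the SNR of Equation 1 can be computed directly from the host and watermarked sample arrays. The following is a minimal sketch in Python with NumPy; the function name `snr_db` and the handling of identical signals are assumptions made for this example and are not taken from Gordy and Bruton (2000).

```python
import numpy as np

def snr_db(host, marked):
    """SNR (dB) of the host signal versus the embedding distortion, as in Eq. 1."""
    host = np.asarray(host, dtype=float)
    marked = np.asarray(marked, dtype=float)
    if host.shape != marked.shape:
        raise ValueError("host and marked must have the same length")
    noise_power = np.sum((marked - host) ** 2)   # denominator of Eq. 1
    if noise_power == 0.0:
        return float("inf")                      # assumption: identical signals
    return 10.0 * np.log10(np.sum(host ** 2) / noise_power)

# Example: a constant offset of 0.1 on a unit-amplitude host gives about 20 dB.
host = np.ones(4)
print(snr_db(host, host + 0.1))
```

A higher value means that the watermark distortion is smaller relative to the host signal; the SNR alone does not say whether the distortion is audible.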
Bit Rate
Bit rate reflects the amount of watermark data that may be reliably embedded within a host signal per unit of time, for example in bits per second. Some watermarking applications, such as insertion of a serial number or author identification, require relatively small amounts of data embedded repeatedly in the host signal. However, a high bit rate is desirable in some envisioned applications, such as covert communication, in order to embed a significant fraction of the amount of data in the host signal. Usually, the reliability is measured as the bit error rate (BER) of extracted watermark data (Gordy & Bruton, 2000). For embedded and extracted watermark sequences of length $B$ bits, the BER (in percent) is given by the expression:

$$\mathrm{BER} = \frac{100}{B} \sum_{n=0}^{B-1} \begin{cases} 1, & \tilde{w}(n) \neq w(n) \\ 0, & \tilde{w}(n) = w(n) \end{cases} \tag{2}$$

where $w(n) \in \{-1, 1\}$ is a bipolar binary sequence of bits to be embedded within the host signal, for $0 \le n \le B-1$, and $\tilde{w}(n)$ denotes the set of watermark bits extracted from the watermarked signal.
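Equation 2 amounts to counting the positions at which the extracted bits differ from the embedded bits. A minimal sketch follows, assuming both sequences are available as equal-length arrays; the function name `ber_percent` is hypothetical.

```python
import numpy as np

def ber_percent(embedded, extracted):
    """Bit error rate in percent between embedded and extracted bits (Eq. 2)."""
    embedded = np.asarray(embedded)
    extracted = np.asarray(extracted)
    if embedded.shape != extracted.shape:
        raise ValueError("sequences must have the same length B")
    # Each mismatch contributes 1 to the sum in Eq. 2; the mean divides by B.
    return 100.0 * float(np.mean(embedded != extracted))

# One wrong bit out of four bipolar bits gives 25 percent.
print(ber_percent([1, -1, 1, 1], [1, 1, 1, 1]))
```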
Robustness
Robustness is another important requirement for digital audio watermarking. Watermarked audio signals may frequently undergo common signal processing operations and malicious attacks. Although these operations and attacks may not affect the perceived quality of the host signal, they may corrupt the embedded data within the host signal. A good and reliable audio watermarking algorithm should survive the following manipulations (MUSE Project, 1998):
• Additive and multiplicative noise
• Linear and nonlinear filtering, for example, lowpass filtering
• Data compression, for example, MPEG audio layer 3, Dolby AC-3
• Local exchange of samples, for example, permutations
• Quantization of sample values
• Temporal scaling, for example, stretch by 10%
• Equalization, for example, +6 dB at 1 kHz and -6 dB at 4 kHz
• Removal or insertion of samples
• Averaging multiple watermarked copies of a signal
• D/A and A/D conversions
• Frequency response distortion
• Group-delay distortions
• Downmixing, for example, stereo to mono
• Overdubbing, for example, placing another track into the audio
Robustness can be measured by the bit error rate (BER) of the extracted watermark data as a function of the amount of distortion introduced by a given manipulation.
Security
In order to prevent an unauthorized user from detecting the presence of embedded data and removing the embedded data, the watermark embedding procedure must be secure in many applications. Different applications have different security requirements. The most stringent requirements arise in covert communication scenarios. Security of data embedding procedures is interpreted in the same way as security of encryption techniques. A secure data embedding procedure cannot be broken unless the unauthorized user has access to a secret key that controls the insertion of the data in the host signal. Hence, a data embedding scheme is truly secure if knowing the exact algorithm for embedding the data does not help an unauthorized party detect the presence of embedded data. An unauthorized user should not be able to extract the data in a reasonable amount of time even if he or she knows that the host signal contains data and is familiar with the exact algorithm for embedding the data. Usually, the watermark embedding method should be open to the public, while the secret key is not released. In some applications, for example, covert communications, the data may also be encrypted prior to insertion in a host signal.
Computational Complexity
Computational complexity refers to the processing required to embed watermark data into a host signal, and/or to extract the data from the signal. It is critical for applications that require online watermark embedding and extraction. Algorithm complexity also influences the choice of implementation structure or DSP architecture. Although there are many ways to measure complexity, such as complexity analysis (or “Big-O” analysis) and actual CPU timings (in seconds), for practical applications more quantitative values are required (Cox et al., 1997).
HUMAN AUDITORY SYSTEM
The human auditory system (HAS) model has been successfully applied in perceptual audio coding such as the MPEG Audio Codec (Brandenburg & Stoll, 1992). Similarly, the HAS model can also be used in digital watermarking to embed the data into the host audio signal more transparently and robustly. Audio masking is a phenomenon where a weaker but audible signal (the maskee) can be made inaudible (masked) by a simultaneously occurring stronger signal (the masker) (Noll, 1993). The masking effect depends on the frequency and temporal characteristics of both the maskee and the masker. Frequency masking refers to masking between frequency components in the audio signal. If masker and maskee are close enough to each other in frequency, the masker may make the maskee inaudible. A masking threshold can be measured below which any signal will not be audible. The masking threshold depends on the sound pressure level (SPL) and the frequency of the masker, and on the characteristics of masker and maskee. For example, with the masking threshold for the SPL=60 dB masker at around 1 kHz, the SPL of the maskee can be surprisingly high; it will be masked as long as its SPL is below the masking threshold. The slope of the masking threshold is steeper towards lower frequencies; that is, higher frequencies are more easily masked. It should be noted that it is easier for a broadband noise to mask a tonal signal than for a tonal signal to mask out a broadband noise. Noise and low-level signal contributions are masked inside and outside the particular critical band if their SPL is below the masking threshold. If the source signal consists of many simultaneous maskers, a global masking threshold can be computed that describes the threshold of just noticeable distortions as a function of frequency. The calculation of the global masking threshold is based on the high-resolution short-term amplitude spectrum of the audio signal, which is sufficient for critical-band-based analyses. In a first step, all individual masking thresholds are determined, depending on signal level, type of masker (noise or tone), and frequency range. Next, the global masking threshold is determined by adding all individual masking thresholds and the threshold in quiet. Adding the threshold in quiet ensures that the computed global masking threshold is not below the threshold in quiet. The effects of masking reaching over critical band bounds must be included in the calculation. Finally, the global signal-to-mask ratio (SMR) is determined as the ratio of the maximum of the signal power and the global masking threshold. Frequency masking models can be readily obtained from the current generation of high quality audio codecs, for example, the masking model defined in ISO-MPEG Audio Psychoacoustic Model 1, for Layer 1 (ISO/IEC IS 11172, 1993).
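The step of adding the individual masking thresholds and the threshold in quiet can be sketched as follows. This is a simplified illustration and not the ISO-MPEG Psychoacoustic Model 1 itself: it assumes that the individual thresholds and the threshold in quiet are already given in dB on a common frequency grid, and that "adding" means addition of powers (an assumption of this sketch). The function name `global_masking_threshold_db` is hypothetical.

```python
import numpy as np

def global_masking_threshold_db(individual_db, quiet_db):
    """Combine individual masking thresholds with the threshold in quiet.

    individual_db: array of shape (num_maskers, num_bins), thresholds in dB.
    quiet_db:      array of shape (num_bins,), threshold in quiet in dB.
    Assumption: thresholds are added as powers, then converted back to dB.
    """
    individual_db = np.atleast_2d(np.asarray(individual_db, dtype=float))
    quiet_db = np.asarray(quiet_db, dtype=float)
    stacked = np.vstack([individual_db, quiet_db[None, :]])
    total_power = np.sum(10.0 ** (stacked / 10.0), axis=0)
    return 10.0 * np.log10(total_power)

# Two maskers and a threshold in quiet on a two-bin grid.
individual = np.array([[40.0, 10.0], [37.0, 0.0]])
quiet = np.array([0.0, 5.0])
print(global_masking_threshold_db(individual, quiet))
```

Because every term in the power sum is positive, the result can never fall below the threshold in quiet, which matches the property stated in the text.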
In addition to frequency masking, two time domain phenomena also play an important role in human auditory perception: pre-masking and post-masking. The temporal masking effects occur before and after a masking signal has been switched on and off, respectively. Pre-masking effects make weaker signals inaudible before the stronger masker is switched on, and post-masking effects make weaker signals inaudible after the stronger masker is switched off. Pre-masking occurs from 5 to 20 ms before the masker is switched on, while post-masking occurs from 50 to 200 ms after the masker is turned off.
DIGITAL WATERMARKING FOR PCM AUDIO
Digital audio can be classified into three categories: PCM audio, WAV-table synthesis audio and compressed audio. Most current audio watermarking techniques mainly focus on PCM audio. The popular methods include low-bit coding, phase coding, spread spectrum coding, echo hiding, perceptual masking and content-adaptive watermarking.
Low Bit Coding
The basic idea in the low bit coding technique is to embed the watermark in an audio signal by replacing the least significant bit of each sampling point by a coded binary string corresponding to the watermark. For example, in a 16-bits-per-sample representation, the four least significant bits can be used for hiding. The retrieval of the hidden data in low-bit coding is done by reading out the value from the low bits. The stego key is the position of the altered bits. Low-bit coding is the simplest way to embed data into digital audio and can be applied in all ranges of transmission rates with digital communication modes. Ideally, the channel capacity will be 8kbps in an 8kHz sampled sequence and 44kbps in a 44kHz sampled sequence for a noiseless channel application. In return for this large channel capacity, audio noise is introduced. The impact of this noise is a direct function of the content of the original signal; for example, a live sports event contains crowd noise that masks the noise resulting from low-bit encoding. The major disadvantage of the low bit coding method is its poor immunity to manipulations.
Encoded information can be destroyed by channel noise, re-sampling, and so forth, unless it is coded using redundancy techniques, which reduce the data rate by one to two orders of magnitude. In practice, it is useful only in closed, digital-to-digital environments. Turner (1989) proposed a method for inserting an identification string into a digital audio signal by substituting the “insignificant” bits of randomly selected audio samples with the bits of an identification code. Bits are deemed “insignificant” if their alteration is inaudible. Unfortunately, Turner’s method may easily be circumvented. For example, if it is known that the algorithm only affects the least significant two bits of a word, then it is possible to randomly flip all such bits, thereby destroying any existing identification code. Bassia and Pitas (1998) proposed a watermarking scheme to embed a watermark in the time domain of a digital audio signal by slightly modifying the amplitude of each audio sample. The characteristics of this modification are determined both by the original signal and the copyright owner. The detection procedure does not use the original audio signal. However, this method can only detect whether an audio signal contains a watermark or not; it cannot recover the watermark information embedded in the audio signal. Aris Technologies, Inc. (Wolosewicz & Jemeli, 1998) proposed a technique to embed data by modifying signal peaks with their MusiCode product. Temporal peaks within a segment of host audio signal are modified to fall within quantized amplitude levels. The quantization pattern of the peaks is used to distinguish the embedded data. In Cooperman and Moskowitz (1997), Fourier transform coefficients are computed on non-overlapping audio blocks. The least significant bits of the transform coefficients are replaced by the embedded data. The DICE company offers a product based on this algorithm.
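The basic low-bit coding idea described in this subsection can be illustrated with a short sketch. It assumes 16-bit signed PCM samples and writes one watermark bit into the least significant bit of each of the first samples; the function names are hypothetical, and no redundancy coding or key-dependent sample selection is included.

```python
import numpy as np

def lsb_embed(samples, bits):
    """Replace the least significant bit of the first len(bits) samples."""
    out = np.asarray(samples, dtype=np.int16).copy()
    bits = np.asarray(bits, dtype=np.int16)
    n = len(bits)
    # Clear the lowest bit (AND with ...11110), then write the watermark bit.
    out[:n] = (out[:n] & -2) | bits
    return out

def lsb_extract(samples, n_bits):
    """Read the watermark back from the least significant bits."""
    return np.asarray(samples, dtype=np.int16)[:n_bits] & 1

pcm = np.array([1000, -1000, 3, -4], dtype=np.int16)
marked = lsb_embed(pcm, [1, 0, 1, 1])
print(marked, lsb_extract(marked, 4))
```

Each sample changes by at most one quantization step, which is why the capacity is one bit per sample. Overwriting the low bits with random values removes the watermark in the same way, which illustrates the poor immunity to manipulation noted above.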
Phase Coding
Phase coding is one of the most effective coding schemes in terms of the signal-to-noise ratio because experiments indicate that listeners might not hear any difference caused by a smooth phase shift, even though the signal pattern may change dramatically. When the phase relation between frequency components is dramatically changed, phase dispersion and “rain barrel” distortions occur. However, as long as the modification of the phase is within certain limits, an inaudible coding can be achieved. In phase coding, a hidden datum is represented by a particular phase or phase change in the phase spectrum. If the audio signal is divided into segments, data are usually hidden only in the first segment under two conditions. First, the phase difference between adjacent segments needs to be preserved. Second, the final phase spectrum with embedded data needs to be smoothed; otherwise, an abrupt phase change becomes audible. Once the embedding procedure is finished, the last step is to update the phase spectrum of each of the remaining segments by adding back the relative phase. Consequently, the embedded signal can be constructed from this set of new phase spectra. For the extraction process, the hidden data can be obtained by detecting the phase values from the phase spectrum of the first segment. The stego key in this implementation includes the phase shift and the size of one segment. Phase coding can be used in both analog and digital modes but it is sensitive to most audio compression algorithms. The procedure for phase coding (Bender et al., 1996) is as follows:
1. Break the sound sequence $s[i]$, $(0 \le i \le I-1)$, into a series of $N$ short segments, $s_n[i]$, where $(0 \le n \le N-1)$.
2. Apply a $K$-point discrete Fourier transform (DFT) to the $n$-th segment, $s_n[i]$, where $(K = I/N)$, and create a matrix of the phase, $\phi_n(\omega_k)$, and magnitude, $A_n(\omega_k)$, for $(0 \le k \le K-1)$.
3. Store the phase difference between each adjacent segment for $(0 \le n \le N-1)$:

$$\Delta\phi_{n+1}(\omega_k) = \phi_{n+1}(\omega_k) - \phi_n(\omega_k) \tag{3}$$

4. A binary set of data is represented as $\phi_{data} = \pi/2$ or $-\pi/2$, representing 0 or 1:

$$\phi'_0 = \phi'_{data} \tag{4}$$

5. Re-create phase matrices for $n > 0$ by using the phase difference:

$$\begin{aligned} \phi'_1(\omega_k) &= \phi'_0(\omega_k) + \Delta\phi_1(\omega_k) \\ &\;\;\vdots \\ \phi'_n(\omega_k) &= \phi'_{n-1}(\omega_k) + \Delta\phi_n(\omega_k) \\ &\;\;\vdots \\ \phi'_N(\omega_k) &= \phi'_{N-1}(\omega_k) + \Delta\phi_N(\omega_k) \end{aligned} \tag{5}$$

6. Use the modified phase matrix $\phi'_n(\omega_k)$ and the original magnitude matrix $A_n(\omega_k)$ to reconstruct the sound signal by applying the inverse DFT.
For the decoding process, the sequence must be synchronized before decoding. The length of the segment, the DFT points, and the data interval must be known at the receiver. The value of the underlying phase of the first segment is detected as a 0 or 1, which represents the coded binary string. Since φ0’(ωk) is modified, the absolute phases of the following segments are modified accordingly. However, the relative phase difference of each adjacent frame is preserved. It is this relative difference in phase that the ear is most sensitive to. Phase coding is also applied to data hiding in speech signals (Yardimci et al., 1997).
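The six-step phase coding procedure and its decoder can be sketched with NumPy as follows. This is a simplified illustration under several assumptions: the real-input FFT (`rfft`) is used, the bits are written into consecutive frequency bins starting at bin 1, the smoothing of the modified phase spectrum is omitted, and perfect synchronization is assumed at the receiver. The function names are hypothetical.

```python
import numpy as np

def phase_encode(signal, bits, seg_len):
    """Hide bits in the phase of the first segment; keep relative phases."""
    n_seg = len(signal) // seg_len
    segs = np.asarray(signal, dtype=float)[:n_seg * seg_len].reshape(n_seg, seg_len)
    spec = np.fft.rfft(segs, axis=1)                  # step 2: DFT per segment
    mag, phase = np.abs(spec), np.angle(spec)
    dphi = np.diff(phase, axis=0)                     # step 3: phase differences
    new_phase = phase.copy()
    bits = np.asarray(bits)
    if len(bits) > seg_len // 2 - 1:
        raise ValueError("too many bits for this segment length")
    # Step 4: 0 -> +pi/2, 1 -> -pi/2 (bins 1..len(bits); DC bin left untouched).
    new_phase[0, 1:1 + len(bits)] = np.where(bits == 0, np.pi / 2, -np.pi / 2)
    for n in range(1, n_seg):                         # step 5: rebuild phases
        new_phase[n] = new_phase[n - 1] + dphi[n - 1]
    # Step 6: original magnitudes with modified phases, inverse DFT.
    out = np.fft.irfft(mag * np.exp(1j * new_phase), n=seg_len, axis=1)
    return out.reshape(-1)

def phase_decode(signal, n_bits, seg_len):
    """Read the bits from the phase of the first segment."""
    spec = np.fft.rfft(np.asarray(signal, dtype=float)[:seg_len])
    return (np.angle(spec[1:1 + n_bits]) < 0).astype(int)

rng = np.random.default_rng(2)
s = rng.standard_normal(16 * 256)
payload = [0, 1, 1, 0, 1, 0, 0, 1]
print(phase_decode(phase_encode(s, payload, 256), len(payload), 256))
```

Only the first segment carries data, so the payload is limited by the segment length; this reflects the low data rate of the method.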
Spread Spectrum Coding
The basic spread spectrum technique is designed to encrypt a stream of information by spreading the encrypted data across as much of the frequency spectrum as possible. It turns out that many spread spectrum techniques adapt well to data hiding in audio signals. Because the hidden data are usually not expected to be destroyed by operations such as compressing and cropping, broadband spread spectrum-based techniques, which make small modifications to a large number of bits for each hidden datum, are expected to be robust against these operations. In a normal communication channel, it is often desirable to concentrate the information in as narrow a region of the frequency spectrum as possible. Among many different variations on the idea of spread spectrum communication, Direct Sequence (DS) is considered here. In general, spreading is accomplished by modulating the original signal with a sequence of random binary pulses (referred to as chips) with values 1 and -1. The chip rate is an integer multiple of the data rate. The bandwidth expansion is typically of the order of 100 and higher. For the embedding process, the data to be embedded are coded as a binary string using error-correction coding so that errors caused by channel noise and original signal modification can be suppressed. Then, the code is multiplied by the carrier wave and the pseudo-random noise sequence, which has a wide frequency spectrum. As a consequence, the frequency spectrum of the data is spread over the available frequency band. The spread data sequence is then attenuated and added to the original signal as additive random noise. For extraction, the same binary pseudorandom noise sequence applied for the embedding is synchronously (in phase) multiplied with the embedded signal. Unlike phase coding, DS introduces additive random noise to the audio signal. To keep the noise level low and inaudible, the spread code is attenuated (without adaptation) to roughly 0.5% of the dynamic range of the original audio signal. The combination of a simple repetition technique and error correction coding ensures the integrity of the code. A short segment of the binary code string is concatenated and added to the original signal so that transient noise can be reduced by averaging over the segment in the extraction process.
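The direct sequence embedding and extraction described above can be sketched as follows. This baseband illustration makes several simplifying assumptions: the carrier wave and the error-correction coding are omitted, the same pseudo-random chip sequence (derived from a shared seed that plays the role of the key) is reused for every bit, and the receiver is perfectly synchronized. The function names and parameters are hypothetical.

```python
import numpy as np

def _chips(chip_len, seed):
    """Pseudo-random chip sequence with values +1 and -1 (the shared key)."""
    return np.random.default_rng(seed).choice([-1.0, 1.0], size=chip_len)

def ds_embed(host, bits, chip_len, alpha, seed):
    """Spread each bit over chip_len samples and add it as low-level noise."""
    pn = _chips(chip_len, seed)
    symbols = 2.0 * np.asarray(bits) - 1.0            # 0/1 -> -1/+1
    spread = np.repeat(symbols, chip_len) * np.tile(pn, len(symbols))
    out = np.asarray(host, dtype=float).copy()
    out[:len(spread)] += alpha * spread               # attenuated spread data
    return out

def ds_extract(marked, n_bits, chip_len, seed):
    """Correlate each bit interval with the same chip sequence."""
    pn = _chips(chip_len, seed)
    frames = np.asarray(marked, dtype=float)[:n_bits * chip_len]
    corr = frames.reshape(n_bits, chip_len) @ pn
    return (corr > 0).astype(int)

rng = np.random.default_rng(1)
host = 0.05 * rng.standard_normal(8 * 4096)
payload = [1, 0, 1, 1, 0, 0, 1, 0]
marked = ds_embed(host, payload, chip_len=4096, alpha=0.01, seed=42)
print(ds_extract(marked, len(payload), 4096, seed=42))
```

The correlation adds the watermark contribution coherently over all chips, while the host signal adds incoherently. Longer chip sequences therefore allow lower embedding amplitudes at the cost of a lower bit rate.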
Most audio watermarking techniques are based on the spread spectrum scheme and are inherently projection techniques on a given key-defined direction. In Tilki and Beex (1996), Fourier transform coefficients over the middle frequency bands are
replaced with spectral components from a signature sequence. The middle frequency band is selected so that the data remain outside of the more sensitive low frequency range. The signature is of short time duration and has a low amplitude relative to the local audio signal. The technique is described as robust to noise and the wow and flutter of analogue tapes. In Wolosewicz (1998), the high frequency portion of an audio segment is replaced with embedded data. Ideally, the algorithm looks for segments in the audio with high energy in the low frequencies. The significant low frequency energy helps to perceptually hide the embedded high frequency data. In addition, the segment should have low energy in the high frequencies to ensure that significant components in the audio are not replaced with the embedded data. In a typical implementation, a block of approximately 675 bits of data is encoded using a spread spectrum algorithm with a 10kHz carrier waveform. The duration of the resulting data block is 0.0675 seconds. The data block is repeated in several locations according to the constraints imposed on the audio spectrum. In another spread spectrum implementation, Pruess et al. (1994) proposed to embed data into the host audio signal as coloured noise. The data are coloured by shaping a pseudo-noise sequence according to the shape of the original signal. The data are embedded within a preselected band of the audio spectrum after proportionally shaping them by the corresponding audio signal frequency components. Since the shaping helps to perceptually hide the embedded data, the inventors claim the composite audio signal is not readily distinguishable from the original audio signal. The data may be recovered by essentially reversing the embedding operation using a whitening filter. Solana Technology Development Corp. (Lee et al., 1998) later introduced a similar approach with their Electronic DNA product.
Time domain modelling, for example, linear predictive coding, or fast Fourier transform is used to determine the spectral shape. Moses (1995) proposed a technique to embed data by encoding them as one or more whitened direct sequence spread spectrum signals and/or a narrowband FSK data signal, transmitted at the time, frequency and level determined by a neural network such that the signal is masked by the audio signal. The neural network monitors the audio channel to determine opportunities to insert the data such that the inserted data are masked.
Echo Hiding
Echo hiding (Gruhl et al., 1996) is a method for embedding information into an audio signal. It seeks to do so in a robust fashion, while not perceivably degrading the original signal. Echo hiding has applications in providing proof of ownership, annotation, and assurance of content integrity. Therefore, the embedded data should not be sensitive to removal by common transformations of the embedded audio, such as filtering, re-sampling, block editing, or lossy data compression. Echo hiding embeds data into a host audio signal by introducing an echo. The data are hidden by varying three parameters of the echo: initial amplitude, decay rate, and delay. As the delay between the original and the echo decreases, the two signals blend. At a certain point, the human ear cannot distinguish between the two signals. The echo is perceived as added resonance. The coder uses two delay times, one to represent a binary one and another to represent a binary zero. Both delay times are below the threshold at which the human ear can resolve the echo. In addition to decreasing the delay time, the echo can also be made imperceptible by setting the initial amplitude and the decay rate below the audible threshold of the human ear. For the embedding process, the original audio signal $v(t)$ is divided into segments and one echo is embedded in each segment. In a simple case, the embedded signal $c(t)$ can, for example, be expressed as follows:

$$c(t) = v(t) + a\,v(t - d) \tag{6}$$

where $a$ is an amplitude factor. The stego key is the pair of echo delay times, $d$ and $d'$.
The extraction is based on the autocorrelation of the cepstrum (i.e., $\log F(c(t))$) of the embedded signal. The result in the time domain is $F^{-1}(\log(F(c(t))^{2}))$. The decision between a $d$ and a $d'$ delay can be made by examining the position of a spike that appears in the autocorrelation diagram. Echo hiding can effectively place imperceptible information into an audio stream. It is robust to noise and does not require a high data transmission channel. The drawback of echo hiding is that its stego key is weak, so the watermark is easy for attackers to detect.
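Echo embedding as in Equation 6 and a cepstrum-based decision between the two delays can be sketched as follows. The sketch is a simplification under stated assumptions: a single echo without decay is added inside each segment, the detector uses the power cepstrum (the inverse FFT of the logarithm of the squared magnitude spectrum) and simply compares its values at the two candidate delays, and segment boundaries are known. The function names, delays and echo amplitude are hypothetical and chosen for a clear demonstration, not for inaudibility.

```python
import numpy as np

def echo_embed(v, bits, seg_len, d0, d1, a):
    """Add one echo per segment: delay d0 encodes 0, delay d1 encodes 1."""
    v = np.asarray(v, dtype=float)
    c = v.copy()
    for i, bit in enumerate(bits):
        d = d1 if bit else d0
        lo, hi = i * seg_len, (i + 1) * seg_len
        c[lo + d:hi] += a * v[lo:hi - d]              # c(t) = v(t) + a v(t - d)
    return c

def echo_extract(c, n_bits, seg_len, d0, d1):
    """Decide each bit from the cepstrum values at the two candidate delays."""
    c = np.asarray(c, dtype=float)
    bits = []
    for i in range(n_bits):
        seg = c[i * seg_len:(i + 1) * seg_len]
        power = np.abs(np.fft.fft(seg)) ** 2
        cepstrum = np.fft.ifft(np.log(power + 1e-12)).real
        bits.append(int(cepstrum[d1] > cepstrum[d0]))
    return bits

rng = np.random.default_rng(0)
v = rng.standard_normal(4 * 4096)
payload = [1, 0, 0, 1]
marked = echo_embed(v, payload, seg_len=4096, d0=100, d1=150, a=0.5)
print(echo_extract(marked, len(payload), 4096, 100, 150))
```

The echo multiplies the spectrum by a periodic ripple, and the logarithm turns that ripple into an additive term whose inverse transform is a spike at the echo delay. Anyone who knows the method can search the cepstrum for such spikes, which is the weakness of the stego key noted above.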
Perceptual Masking
Swanson et al. (1998) proposed a robust audio watermarking approach using perceptual masking. The major contributions of this method include:
• A perception-based watermarking procedure: The embedded watermark adapts to each individual host signal. In particular, the temporal and frequency distribution of the watermark are dictated by the temporal and frequency masking characteristics of the host audio signal. As a result, the amplitude (strength) of the watermark increases and decreases with the host signal, for example, lower amplitude in “quiet” regions of the audio. This guarantees that the embedded watermark is inaudible while having the maximum possible energy. Maximizing the energy of the watermark adds robustness to attacks.
• An author representation that solves the deadlock problem: An author is represented with a pseudo-random sequence created by a pseudo-random generator and two keys. One key is author-dependent, while the second key is signal-dependent. The representation is able to resolve rightful ownership in the face of multiple ownership claims.
• A dual watermark: The watermarking scheme uses the original audio signal to detect the presence of a watermark. The procedure can handle virtually all types of distortions, including cropping, temporal rescaling, and so forth using a generalized likelihood ratio test. As a result, the watermarking procedure is a powerful digital copyright protection tool. This procedure is integrated with a second watermark, which does not require the original signal. The dual watermarks also address the deadlock problem.
Each audio signal is watermarked with a unique noise-like sequence shaped by the masking phenomena. The watermark consists of (1) an author representation, and (2) spectral and temporal shaping using the masking effects of the human auditory system. The watermarking scheme is based on a repeated application of a basic watermarking operation on smaller segments of the audio signal. The length N audio signal is first segmented into blocks si(k) of length 512 samples, i = 0, 1, ..., N/512 -1, and k = 0, 1, ..., 511. The block size of 512 samples is dictated by the frequency masking model. For each audio segment si(k), the algorithm works as follows.
1. Compute the power spectrum Si(k) of the audio segment si(k).
2. Compute the frequency mask Mi(k) of the power spectrum Si(k).
3. Use the mask Mi(k) to weight the noise-like author representation for that audio block, creating the shaped author signature Pi(k) = Yi(k)Mi(k).
4. Compute the inverse FFT of the shaped noise pi(k) = IFFT(Pi(k)).
5. Compute the temporal mask ti(k) of si(k).
6. Use the temporal mask ti(k) to further shape the frequency shaped noise, creating the watermark wi(k) = ti(k)pi(k) of that audio segment.
7. Create the watermarked block si’(k) = si(k) + wi(k).
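The data flow of the seven steps can be sketched for a single 512-sample block as follows. The masks in this sketch are crude stand-ins and not the psychoacoustic models used by Swanson et al. (1998): the frequency mask is assumed to be the block's magnitude spectrum lowered by a fixed offset, and the temporal mask is assumed to be the normalized absolute value of the block. The author signature is passed in as a noise array rather than generated from the two keys. All names and the offset value are hypothetical.

```python
import numpy as np

BLOCK = 512  # block length used in the text

def watermark_block(s, author_noise, mask_offset_db=-20.0):
    """Shape a noise-like author signature by stand-in masks and add it."""
    s = np.asarray(s, dtype=float)
    S = np.abs(np.fft.fft(s)) ** 2                       # 1. power spectrum
    # 2. stand-in frequency mask: host magnitude lowered by a fixed offset
    M = np.sqrt(S) * 10.0 ** (mask_offset_db / 20.0)
    Y = np.fft.fft(np.asarray(author_noise, dtype=float))
    Y = Y / (np.abs(Y) + 1e-12)                          # unit-magnitude signature
    P = Y * M                                            # 3. shaped signature
    p = np.fft.ifft(P).real                              # 4. back to time domain
    envelope = np.abs(s)
    t = envelope / (envelope.max() + 1e-12)              # 5. stand-in temporal mask
    w = t * p                                            # 6. watermark
    return s + w, w                                      # 7. watermarked block

rng = np.random.default_rng(3)
block = rng.standard_normal(BLOCK)
signature = rng.standard_normal(BLOCK)
marked, w = watermark_block(block, signature)
print(10 * np.log10(np.sum(block ** 2) / np.sum(w ** 2)))  # host-to-watermark ratio in dB
```

With these stand-in masks the watermark energy follows the host energy in both frequency and time, so a silent block receives no watermark at all. A real implementation would replace both masks with proper masking models.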
The overall watermark for a signal is simply the concatenation of the watermark segments wi for all of the length 512 audio blocks. The author signature yi for block i is computed in terms of
the personal author key x1 and signal-dependent key x2 computed from block si. The dual localization effects of the frequency and temporal masking control the watermark in both domains. Frequency-domain shaping alone is not enough to guarantee that the watermark will be inaudible. Frequency-domain masking computations are based on a Fourier transform analysis. A fixed length Fourier transform does not provide good time localization for some applications. In particular, a watermark computed using frequency-domain masking will spread in time over the entire analysis block. If the signal energy is concentrated in a time interval that is shorter than the analysis block length, the watermark is not masked outside of that subinterval. This leads to audible distortion, for example, pre-echoes. The temporal mask guarantees that the “quiet” regions are not disturbed by the watermark.
Content-Adaptive Watermarking
A novel content-adaptive watermarking scheme is described in Xu and Feng (2002). The embedding design is based on audio content and the human auditory system. With the content-adaptive embedding scheme, the embedding parameter for setting up the embedding process will vary with the content of the audio signal. For example, because the content of a frame of digital violin music is very different from that of a recording of a large symphony orchestra in terms of spectral details, these two respective music frames are treated differently. By doing so, the embedded watermark signal will better match the host audio signal so that the embedded signal is perceptually negligible. The content-adaptive method couples audio content with the embedded watermark signal. Consequently, it is difficult to remove the embedded signal without destroying the host audio signal. Since the embedding parameters depend on the host audio signal, the tamper-resistance of this watermark embedding technique is also increased. In broad terms, this technique involves segmenting an audio signal into frames in the time domain, classifying the frames as belonging to one of several known classes, and then encoding each frame with an appropriate embedding scheme. The particular scheme chosen is tailored to the relevant class of audio signal according to its properties in the frequency domain. To implement the content-adaptive embedding, two techniques are disclosed: audio frame classification and embedding scheme design.
Figure 1. Watermark embedding scheme for PCM audio (block diagram with the blocks Original Audio, Audio Segmentation, Feature Extraction, Classification & Embedding Selection, Classification Parameters, Embedding Schemes, Watermark Information, Bit Hopping, Bit Embedding, and Watermarked Audio)
Figure 1 illustrates the watermark embedding scheme. The input original signal is divided into frames by audio segmentation. Feature measures are extracted from each frame to represent the characteristics of the audio signal of that frame. Based on the feature measures, the audio frame is classified into one of the pre-defined classes and an embedding scheme tailored to that class is selected accordingly. Using the selected embedding scheme, a watermark is embedded into the audio frame using a multiple-bit hopping and hiding method. In this scheme, the feature extraction method is exactly the same as the one used in the training process. The parameters of the classifier and the embedding schemes are generated in the training process. Figure 2 depicts the training process for an adaptive embedding model. Adaptive embedding, or content-sensitive embedding, embeds the watermark differently for different types of audio signals. In order to do so, a training process is run for each category of audio signal to define embedding schemes that are well suited to the particular category of audio signal. The training process
analyses an audio signal to find an optimal way to classify audio frames into classes and then design embedding schemes for each of those classes. To achieve this objective, the training data should be sufficient to be statistically significant. Audio signal frames are clustered into data clusters and each of them forms a partition in the feature vector space and has a centroid as its representation. Since the audio frames in a cluster are similar, embedding schemes can be designed according to the centroid of the cluster and the human audio system model. The design of embedding schemes may need a lot of testing to ensure the inaudibility and robustness. Consequently, an embedding scheme is designed for each class/cluster of signal that is best suited to the host signal. In the process, inaudibility or the sensitivity of the human auditory system and resistance to attackers must be taken into considerations. The training process needs to be performed only once for a category of audio signals. The derived classification parameters and the embedding schemes are used to embed watermarks in all audio signals in that category. As shown in Figure 1 in the audio classification and embedding scheme selection, similar pre-processing will be conducted to convert the incoming audio signal into feature frame sequences. Each frame is classified into one of the predefined classes. An embedding scheme for a frame is chosen, which is referred to as content-adaptive embedding scheme. In this way, the watermark code is embedded frame by frame into the host audio signal.
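The classification and scheme-selection step described above can be sketched as a nearest-centroid lookup. This is only an illustration under stated assumptions: the two-dimensional feature vectors, the class names, and the per-class parameters (`echo_delays`, `echo_gain`) are hypothetical and are not taken from Xu and Feng (2002); in practice the centroids would come from the training process.

```python
import math

# Hypothetical centroids produced by a training stage: class name -> feature vector.
CENTROIDS = {
    "sparse_spectrum": [0.2, 0.1],
    "dense_spectrum": [0.8, 0.7],
}

# Hypothetical embedding parameters designed for each class (illustrative values only).
SCHEMES = {
    "sparse_spectrum": {"echo_delays": (40, 60), "echo_gain": 0.2},
    "dense_spectrum": {"echo_delays": (80, 120), "echo_gain": 0.5},
}


def classify_frame(features):
    """Return the class whose centroid is nearest to the frame's feature vector."""
    return min(CENTROIDS, key=lambda name: math.dist(features, CENTROIDS[name]))


def select_scheme(features):
    """Pick the embedding scheme that was designed for the frame's class."""
    return SCHEMES[classify_frame(features)]
```

A frame whose features fall closer to the second centroid, for example `select_scheme([0.9, 0.6])`, would be embedded with the parameters stored for that class.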
Figure 2. Training and embedding scheme design. [Blocks: Training Data, Audio Segmentation, Feature Extraction, Feature Clustering, Embedding Design. Input: HAS. Outputs: Classification Parameters, Embedding Schemes.]
Figure 3. Watermark extracting scheme for PCM audio. [Blocks: Watermarked Audio, Audio Segmentation, Bit Detection, Code Mapping, Decryption, Watermark Recovery, Watermark. Inputs: Embedding Schemes, Watermark Key.]

Figure 3 illustrates the watermark extraction scheme. The watermarked audio signal is segmented into frames using the same segmentation method as in the embedding process. Bit detection is then conducted to extract bit delays on a frame-by-frame basis. Because a single bit of the watermark is hopped into multiple bits through bit hopping in the embedding process, multiple delays are detected in each frame. This method is more robust against attackers than the single-bit hiding technique. Firstly, each frame is encoded with multiple bits, and an attacker does not know the coding parameters. Secondly, the embedded signal is weaker and well hidden as a consequence of using multiple bits. The key step of bit detection is detecting the spacing between the bits. To do this, the magnitude (at the relevant locations in each audio frame) of the autocorrelation of the embedded signal's cepstrum (Gruhl et al., 1996) is examined. Cepstral analysis utilises a form of homomorphic system that converts the convolution operation into an addition operation, which makes it useful for detecting the existence of embedded bits. From the autocorrelation of the cepstrum, the embedded bits in each audio frame can be found according to a “power spike” at each delay of the bits.
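The role of the cepstrum in delay detection can be illustrated with a small, dependency-free sketch. It rests on simplifying assumptions that are not from the source: a single echo per frame stands in for the multiple-bit hopping scheme, the cepstrum itself is inspected rather than its autocorrelation, and the function names, candidate delays, and gain are hypothetical. An echo at delay d multiplies the spectrum by a periodic term, which the logarithm turns into an additive component that shows up as a peak at quefrency d.

```python
import cmath
import math


def log_magnitude_spectrum(frame):
    """Log-magnitude of the DFT of a frame (direct O(N^2) DFT, adequate for short frames)."""
    n = len(frame)
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    log_mag = []
    for k in range(n):
        acc = sum(frame[t] * twiddle[(k * t) % n] for t in range(n))
        log_mag.append(math.log(abs(acc) + 1e-12))  # small offset avoids log(0)
    return log_mag


def cepstrum_at(log_mag, quefrency):
    """Real cepstrum coefficient at one quefrency (inverse DFT of the log-magnitude)."""
    n = len(log_mag)
    return sum(log_mag[k] * math.cos(2 * math.pi * k * quefrency / n) for k in range(n)) / n


def add_echo(frame, delay, gain):
    """Embed one echo: y[t] = x[t] + gain * x[t - delay]."""
    return [frame[t] + (gain * frame[t - delay] if t >= delay else 0.0)
            for t in range(len(frame))]


def detect_delay(frame, candidate_delays):
    """Return the candidate delay with the largest cepstral value."""
    log_mag = log_magnitude_spectrum(frame)
    return max(candidate_delays, key=lambda d: cepstrum_at(log_mag, d))
```

With a noise-like host frame, `detect_delay(add_echo(host, 60, 0.5), (40, 60))` would be expected to pick the delay that was actually embedded; a detector that knows which delay encodes which bit can then map the result back to a bit value.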
DIGITAL WATERMARKING FOR WAV-TABLE SYNTHESIS AUDIO

Architectures of WAV-table Audio

Typically, watermarking is applied directly to the data samples themselves, whether these are still image data, video frames, or audio segments. However, such systems fail to address audio coding systems in which the digital audio samples are not available and only a representation from which the audio is later reproduced according to a protocol is. It is well known that tracks of digital audio data can require large amounts of storage and high data transfer rates, whereas synthesis architecture coding protocols such as the Musical Instrument Digital Interface (MIDI) have corresponding requirements that are several orders of magnitude lower for the same audio data. MIDI audio files are not made entirely of sampled audio data (i.e., actual audio sounds) but instead contain synthesizer instructions, or MIDI messages, to reproduce the audio data. The synthesizer instructions contain much smaller amounts of sampled audio data. That is, a synthesizer generates the actual sounds from the instructions in a MIDI audio file. Expanding upon MIDI, Downloadable Sounds (DLS) is a synthesizer architecture specification that requires a hardware or software synthesizer to support all of its components (Downloadable Sounds Level 1, 1997). DLS is a typical WAV-table synthesis audio format and permits additional instruments to be defined and downloaded to a synthesizer besides the standard 128 instruments provided by the MIDI system. The DLS file format stores both samples of digital sound data and articulation parameters to create at least one sound instrument.
An instrument contains “regions” that point to WAVE “files” also embedded in the DLS file. Each region specifies a MIDI note and velocity range that will trigger the corresponding sound and also contains articulation information such as envelopes and loop points. Articulation information can be specified for each individual region or for the entire instrument. Figure 4 illustrates the DLS file structure. DLS is expected to become a new standard in the music industry because of its specific advantages. On the one hand, compared with MIDI, DLS provides a common playback experience and an unlimited sound palette for both instruments and sound effects. On the other hand, compared with PCM audio, it has true audio interactivity and, as noted above, a smaller storage requirement. One of the objectives of the DLS design is that the specification must be open and non-proprietary. It is therefore important to protect its copyright effectively. A novel digital watermarking method for WT synthesis audio, including DLS, is proposed in Xu et al. (2001). Watermark embedding and extraction schemes for WT audio are described in the following subsections.
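The instrument, region, and sample relationship described above can be pictured with a small in-memory model. This is a sketch for orientation only, under the assumption that regions refer to samples by index; the class and field names are hypothetical and do not reproduce the binary layout defined by the DLS specification.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    note_range: tuple       # (lowest, highest) MIDI note that triggers the region
    velocity_range: tuple   # (lowest, highest) MIDI velocity
    sample_index: int       # index of the embedded wave sample the region points to
    articulation: dict = field(default_factory=dict)  # e.g. envelopes, loop points


@dataclass
class Instrument:
    bank: int
    program: int
    regions: list
    articulation: dict = field(default_factory=dict)  # instrument-wide articulation


@dataclass
class DlsFile:
    instruments: list
    samples: list           # embedded wave data, one entry per sample


def find_sample(dls, instrument, note, velocity):
    """Return the sample triggered by a note/velocity pair, or None if no region matches."""
    for region in instrument.regions:
        low_note, high_note = region.note_range
        low_vel, high_vel = region.velocity_range
        if low_note <= note <= high_note and low_vel <= velocity <= high_vel:
            return dls.samples[region.sample_index]
    return None
```

In this picture, a watermarking scheme for WT audio has two places to hide information: the entries of `samples` and the `articulation` dictionaries.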
Watermark Embedding Scheme

Figure 5 illustrates the watermark embedding scheme for WT audio. Generally, a WT audio file either contains two parts, articulation parameters and sample data (as in DLS), or contains only articulation parameters (as in MIDI). Unlike traditional PCM audio, the sample data in WT audio are not the dominant component. On the contrary, it is the articulation parameters that control how the sounds are played. Therefore, in the embedding scheme, watermarks are embedded into both the sample data (if they are included in the WT audio) and the articulation parameters. Firstly, the original WT audio is divided into sample data and articulation parameters. Then, two different embedding schemes are used to process them separately and form the corresponding watermarked outputs. Finally, the watermarked WT audio is generated by integrating the watermarked sample data and the watermarked articulation parameters.
Adaptive Coding Based on Finite Automaton

Figure 6 shows the scheme of adaptive coding. In this scheme, two techniques (finite automaton and redundancy) are proposed to improve robustness. In addition, the bits of the sample data are adaptively coded according to the HAS so as to guarantee minimum distortion of the original sample data. The watermark message is first converted into a binary sequence. Each bit of the sequence replaces a corresponding bit of the sample points. The particular location in the sample points is determined by the finite automaton and the HAS. The number of sample points is calculated according to the redundancy technique. Adaptive bit coding has, however, low immunity to manipulation: embedded information can be destroyed by channel noise, re-sampling, and other operations. The adaptive bit coding technique is nevertheless used, based on several considerations. Firstly, unlike sampled digital audio, WT audio is a parameterised digital audio, so it is difficult to attack it using typical signal processing techniques such as adding noise and re-sampling. Secondly, the wave samples in WT audio are very small, so it is unsuitable to embed a watermark into the samples in the frequency domain. Finally, to ensure robustness, the watermarked bit sequence of the sample data is also embedded into the articulation parameters of the WT audio. If the sample data are distorted, the embedded information can be used to restore the watermarked bits of the sample data.

Figure 4. DLS file structure. [Instrument 1 (bank, instrument #, articulation info) with Regions 1a and 1b; Instrument 2 (bank, instrument #, articulation info) with Region 2a; each region holds a MIDI note/velocity range and articulation info; the regions reference Sample Data 1 and Sample Data 2.]

Figure 5. Watermark embedding scheme for WAV-table synthesis audio. [Blocks: Original WT, Content Extraction, Articulation Parameters, Sample Data, Coding-Bit Extraction, Adaptive Coding, Parameters Hiding, Integration. Input: Watermark. Outputs: Watermarked Sample Data, Watermarked Articulation Parameters, Watermarked WT.]

The functionality of a finite automaton M can be described as a quintuple:

$$M = \langle X, Y, S, \delta, \lambda \rangle \quad (7)$$

where X is a non-empty finite set (the input alphabet of M), Y is a non-empty finite set (the output alphabet of M), S is a non-empty finite set (the state alphabet of M), $\delta : S \times X \to S$ is a single-valued mapping (the next state function of M), and $\lambda : S \times X \to Y$ is a single-valued mapping (the output function of M). The elements are expressed as follows:

$$X = \{0, 1\} \quad (8)$$
$$Y = \{y_1, y_2, y_3, y_4\} \quad (9)$$
$$S = \{S_0, S_1, S_2, S_3, S_4\} \quad (10)$$
$$S_{i+1} = \delta(S_i, x) \quad (11)$$
$$y_i = \lambda(S_i, x) \quad (12)$$

where $y_i$ (i = 1, 2, 3, 4) is the number of sample points that are skipped when embedding a bit in the corresponding state, $S_i$ (i = 0, ..., 4) are five states corresponding to 0, 00, 01, 10, and 11 respectively, and $S_0$ is taken to be the initial state. The state transfer diagram of the finite automaton is shown in Figure 7. An example procedure of the redundancy low-bit coding method based on FA and HAS is described as follows:
1. Convert the watermark message into a binary sequence.
2. Determine the values of the elements in the FA, that is, the number of sample points that will be skipped in the corresponding states:
   y1: state 00
   y2: state 01
   y3: state 10
   y4: state 11
3. Determine the redundancy numbers for the 0 and 1 bits to be embedded:
   r0: the embedded number for a 0 bit;
   r1: the embedded number for a 1 bit.
4. Determine the HAS threshold T.
5. For each bit of the binary sequence corresponding to the watermark message and each sample point in the WT sample data:
   (a) Compare the amplitude value of the sample point with the HAS threshold T; if A [...]

[...] $\sigma_2 > \cdots > \sigma_n$, where the $\sigma_i$ are known as the singular values of D. Then:

$$C = DD^T = USV^T V S^T U^T = US^2U^T = \sum_{i=1}^{n} \sigma_i^2 u_i u_i^T \quad (7)$$

where $S^2_{[N \times N]} = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2, 0, \ldots, 0)$. Thus, only the first n singular values are non-zero. Comparing (7) with (5), we see that the squares of the singular values give us the eigenvalues of C (i.e., $\lambda_i = \sigma_i^2$) and the columns of U are the eigenvectors. Now consider a similar derivation for C'.
Robust Face Recognition for Data Mining
Figure 1. Typical set of eigenfaces as used for face recognition. Leftmost image is average face.
$$C' = D^T D = V S^T U^T U S V^T = V S^2 V^T = \sum_{i=1}^{n} \sigma_i^2 v_i v_i^T \quad (8)$$

where $S^2_{[n \times n]} = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2)$. Comparing (7) and (8), we see that the singular values are identical, so the squares of the singular values yield the eigenvalues of C. The eigenvectors of C can be obtained from the eigenvectors of C', which are the columns of V, by rearranging (6) as follows:

$$U = DVS^{-1} \quad (9)$$
which can be expressed alternatively by

$$u_i = \frac{1}{\sigma_i} D v_i, \quad (10)$$

where i = 1, ..., n. Thus, by performing an eigenvector decomposition on the small matrix C' [n×n], we efficiently obtain both the eigenvalues and the eigenvectors of the very large matrix C [N×N]. For a database of 30 face images of 100×100 pixels each, this shortcut means that we need only decompose a 30×30 matrix instead of a 10,000×10,000 matrix. The eigenvectors of C are often called the eigenfaces and are shown as images in Figure 1. Being the columns of a unitary matrix, the eigenfaces are orthogonal and efficiently describe (span) the space of variation in faces. Generally, we select a small subset of m < n eigenfaces to define a reduced-dimensionality facespace that yields the highest recognition performance on unseen examples of faces; for good recognition performance, the required number of eigenfaces, m, is typically chosen to be of the order of 6 to 10. Thus, in PCA recognition, each face can be represented by just a few components by subtracting out the average face and then calculating principal components by projecting the remaining difference image onto the eigenfaces. Simple methods such as nearest neighbors are normally used to determine which face best matches a given face.
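The shortcut in Equations (8) to (10) can be checked numerically on a toy example. The sketch below is dependency-free and makes several assumptions that are not from the source: the data matrix D holds made-up numbers (N = 4 "pixels", n = 2 images, one image per column), the helper names are hypothetical, and power iteration is used only as a simple stand-in for a full eigendecomposition of C'.

```python
import math


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def top_eigenpair(sym, start, iterations=200):
    """Dominant eigenpair of a symmetric positive semi-definite matrix by power iteration.

    The start vector must not be orthogonal to the dominant eigenvector.
    """
    v = list(start)
    for _ in range(iterations):
        w = matvec(sym, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(x * y for x, y in zip(v, matvec(sym, v)))  # Rayleigh quotient
    return eigenvalue, v


# Toy data matrix D (N x n): each column stands for one vectorised image.
D = [[1.0, -1.0],
     [2.0, -2.0],
     [0.5, 0.5],
     [-1.0, 1.0]]

C_small = matmul(transpose(D), D)              # C' = D^T D, only n x n
lam, v = top_eigenpair(C_small, start=[1.0, 0.5])
sigma = math.sqrt(lam)                         # singular value, since lambda = sigma^2
u = [x / sigma for x in matvec(D, v)]          # Eq. (10): u = D v / sigma

# Check against the large matrix C = D D^T (formed here only for verification).
C_big = matmul(D, transpose(D))
Cu = matvec(C_big, u)
```

Here `Cu` agrees with `lam * u` component by component, so `u` is an eigenvector of the large matrix C even though only the small matrix C' was decomposed.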
Robust PCA Recognition

The authors have developed Adaptive Principal Component Analysis (APCA) to improve the robustness of PCA to nuisance factors such as lighting and expression (Chen & Lovell, 2003, 2004). In the APCA method, we first apply PCA. Then we rotate and warp the facespace by whitening and filtering the eigenfaces according to overall covariance, between-class covariance, and within-class covariance to find an improved set of eigenfeatures. Figure 2 shows the large improvement in robustness to lighting angle. The proposed APCA method allows us to recognize faces with high confidence even if they are half in shadow. Figure 3 shows significant recognition performance gains over standard PCA when both changes in lighting and expression are present.

Figure 2. Contours of 95% recognition performance for the original PCA and the proposed APCA method against lighting elevation and azimuth

Figure 3. Recognition rates for APCA and PCA vs. number of eigenfaces with variations in lighting and expression from Chen and Lovell (2003)

Critical Issues of Face Recognition Technology

Despite the huge number of potential applications for reliable face recognition, the need for such search capabilities in multimedia data mining, and the great strides made in recent decades, there is still much work to do before these applications become routine.

Table 3. A summary of critical issues of face recognition technologies

Privacy Concerns: It is clear that personal privacy may be reduced with the widespread adoption of face recognition technology. However, since September 11, 2001, concerns about privacy have taken a back seat to concerns about personal security. Governments are under intense pressure to introduce stronger security measures. Unfortunately, government’s current need for biometric technology does nothing to improve performance in the short term and may actually damage uptake in the medium term due to unrealistic expectations.

Computational Efficiency: Face recognition can be computationally very intensive for large databases. This is a serious impediment for multimedia data mining.

Accuracy on Large Databases: Studies indicate that recognition error rates of the order of 10% are the best that can be obtained on large databases. This error rate sounds rather high, but trained humans do no better and are much slower at searching.

Sensitivity to Illumination and Other Changes: Changes in lighting, camera angle, and facial expression can greatly affect recognition performance.

Inability to Cope with Multiple Head Poses: Very few systems can cope with non-frontal views of the face. Some researchers propose 3-D recognition systems using stereo cameras for real-time applications, but these are not suitable for data mining.

Ability to Scale: While a laboratory system may work quite well on 20 or 30 faces, it is not clear that these systems will scale to huge face databases as required for many security applications such as detecting faces of known criminals in a crowd or the person locator service on the planetary sensor Web.

FUTURE TRENDS

Face recognition and other biometric technologies are coming of age due to the need to address heightened security concerns in the 21st century. Privacy concerns that have hindered public acceptance of these technologies in the past are now yielding to society’s need for increased security while maintaining a free society. Apart from the demands from the security sector, there are many applications for the technology in other areas of data mining. The performance and robustness of systems will increase significantly as more research effort is brought to bear. In recent real-time systems there is much interest in 3-D reconstruction of the head from multiple camera angles, but in data mining the focus must remain on reliable recognition from single photos.
CONCLUSION

It has been argued that by the end of the 20th century computers were very capable of handling text and numbers and that in the 21st century computers will have to be able to cope with raw data such as images and speech with much the same facility. The explosion of multimedia data on the Internet and the conversion of all information to digital formats (music, speech, television) is driving the demand for advanced multimedia search capabilities, but the pattern recognition technology is mostly unreliable and slow. Yet, the emergence of handheld computers with built-in speech and handwriting recognition ability, however primitive, is a sign of the changing times. The challenge for researchers is to produce pattern recognition algorithms, such as face recognition, that are reliable and fast enough for deployment on data spaces of a planetary scale.
REFERENCES

Adini, Y., Moses, Y., & Ullman, S. (1997). Face recognition: The problem of compensation for changes in illumination direction. IEEE PAMI, 19(4), 721-732.

Agamanolis, S., & Bove, Jr., V.M. (1997). Multilevel scripting for responsive multimedia. IEEE Multimedia, 4(4), 40-50.

Belhumeur, P., & Kriegman, D. (1998). What is the set of images of an object under all possible illumination conditions? International Journal of Computer Vision, 28(3), 245-260.

Beymer, D., & Poggio, T. (1995). Face recognition from one example view. In Proceedings of the International Conference of Computer Vision (pp. 500-507).

Black, M.J., Fleet, D.J., & Yacoob, Y. (2000). Robustly estimating changes in image appearance. Computer Vision and Image Understanding, 78(1), 8-31.

Chen, S., & Lovell, B.C. (2003). Face recognition with one sample image per class. In Proceedings of ANZIIS2003 (pp. 83-88), December 10-12, Sydney, Australia.

Chen, S., & Lovell, B.C. (2004). Illumination and expression invariant face recognition with one sample image. In Proceedings of the International Conference on Pattern Recognition, August 23-26, Cambridge, UK.

Chen, S., Lovell, B.C., & Sun, S. (2002). Face recognition with APCA in variant illuminations. In Proceedings of WOSPA2002 (pp. 9-12), December 17-18, Brisbane, Australia.

Edelman, S., Reisfeld, D., & Yeshurun, Y. (1994). A system for face recognition that learns from examples. In Proceedings of the European Conference on Computer Vision (pp. 787-791). Berlin: Springer-Verlag.

Feraud, R., Bernier, O., Viallet, J.E., & Collobert, M. (2000). A fast and accurate face detector for indexation of face images. In Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition (pp. 77-82), March 28-30.

Gao, Y., & Leung, M.K.H. (2002). Face recognition using line edge map. IEEE PAMI, 24(6), 764-779.
Georghiades, A.S., Belhumeur, P.N., & Kriegman, D.J. (2001). From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 643-660.

Gibbons, P.B., Karp, B., Ke, Y., Nath, S., & Seshan, S. (2003). IrisNet: An architecture for a worldwide sensor Web. Pervasive Computing, 2(4), 22-33.

Li, Y., Goshtasby, A., & Garcia, O. (2000). Detecting and tracking human faces in videos. Proc. 15th Int’l Conference on Pattern Recognition (pp. 807-810), Sept 3-8, 1.

Liu, C., & Wechsler, H. (1998). Evolution of Optimal Projection Axes (OPA) for Face Recognition. Third IEEE International Conference on Automatic Face and Gesture Recognition, FG’98 (pp. 282-287), Nara, Japan, April 14-16.

Liu, X.M., Chen, T., & Kumar, B.V.K.V. (2003). Face authentication for multiple subjects using eigenflow. Pattern Recognition, Special issue on Biometric, 36(2), 313-328.

Ming-Hsuan, Y., Kriegman, D.J., & Ahuja, N. (2002). Detecting faces in images: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(1), 34-58.

Rein-Lien, H., Abdel-Mottaleb, M., & Jain, A.K. (2002). Face detection in color images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5), 696-706.

Swets, D.L., & Weng, J. (1996). Using discriminant eigenfeatures for image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8), 831-836.

The Hypersoap Project. (n.d.) Retrieved February 6, 2004, from http://www.media.mit.edu/hypersoap/

Turk, M.A., & Pentland, A.P. (1991). Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1), 71-86.

Yang, J., Zhang, D., Frangi, A.F., & Jing-Yu, Y. (2004). Two-dimensional PCA: A new approach to appearance-based face representation and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(1), 131-137.

Yilmaz, A., & Gokmen, M. (2000). Eigenhill vs. eigenface and eigenedge. In Proceedings of International Conference Pattern Recognition (pp. 827-830). Barcelona, Spain.

Zhao, L., & Yang, Y.H. (1999). Theoretical analysis of illumination in PCA-based vision systems. Pattern Recognition, 32, 547-564.
KEY TERMS

Biometric: A measurable, physical characteristic or personal behavioral trait used to recognize the identity, or verify the claimed identity, of an enrollee. A biometric identification system identifies a human from a measurement of a physical feature or repeatable action of the individual (for example, hand geometry, retinal scan, iris scan, fingerprint patterns, facial characteristics, DNA sequence characteristics, voice prints, and handwritten signature).

Computer Vision: Using computers to analyze images and video streams and extract meaningful information from them in a similar way to the human vision system. It is related to artificial intelligence and image processing and is concerned with computer processing of images from the real world to recognize features present in the image.

Eigenfaces: Another name for face recognition via principal components analysis.

Face Space: The vector space spanned by the eigenfaces.

Head Pose: Position of the head in 3-D space including head tilt and rotation.

Metadata: Labeling, information describing other information.

Pattern Recognition: Pattern recognition is the ability to take in raw data, such as images, and take action based on the category of the data.

Principal Components Analysis: Principal components analysis (PCA) is a method that can be used to simplify a dataset. It is a transform that chooses a new coordinate system for the data set, such that the greatest variance by any projection of the data set comes to lie on the first axis (then called the first principal component), the second greatest variance on the second axis, and so on. PCA can be used for reducing dimensionality. PCA is also called the Karhunen-Loève transform or the Hotelling transform.

Robust: The opposite of brittle; this can be said of a system that has the ability to recover gracefully from the whole range of exceptional inputs and situations in a given environment. Also has the connotation of elegance in addition to careful attention to detail.
This work was previously published in Encyclopedia of Data Warehousing and Mining, edited by J. Wang, pp. 965-972, copyright 2005 by Information Science Reference, formerly known as Idea Group Reference (an imprint of IGI Global).
Chapter 3.6
Securing an Electronic Legislature Using Threshold Signatures

Brian King
Indiana University – Purdue University Indianapolis (IU–PUI), USA

Yvo Desmedt
University College of London, UK
INTRODUCTION

Today a significant amount of research has focused on trying to apply the advances in information technology to governmental services. One endeavor has been the attempt to apply it to “electronic voting.” Unfortunately, while e-voting technology of questionable security has been widely deployed, the same cannot be said for cryptographically based systems. There is one type of “voting” that has received only limited attention in this respect: the voting that takes place within a legislative body. At first glance, it may not appear difficult to institute electronic voting in a legislature, for it may seem that one only needs to apply the traditional security mechanisms that are used to safeguard networked systems, but as we soon outline, there are significant security risks associated with an electronic legislature. One of our concerns is that entities may attempt to implement an electronic version of a legislature without realizing all the risks and implementing all the needed security mechanisms. In fact, there have been occasional instances of some entities attempting to create some electronic/digital form of legislature, for example (Weidenbener, 2004). In any legislative vote, the legislature’s ability to pass or not to pass legislation should be interpreted as the legislature deciding whether to “sign the proposal” into “law.” Thus, “law” is a signature; anyone can verify that a “proposal” is a “law” by applying the signature verification procedure. As we move towards electronic applications of governmental services, it is only natural that, when this is applied to legislatures, we will replace the “written law” by a “digital signature” (here the term law can be replaced by any internal regulation and a legislature by any regulatory body). The underlying aspect of the article is the security considerations that need to be applied when this is implemented. The question of why to consider an electronic legislature is important. The fundamental reasons
Copyright © 2008, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
for applying today’s information technology to government and its services have always been that it would bring improved services and allow greater accessibility of government to its constituents. An electronic legislature would most certainly improve the legislative service. It would allow legislators to be mobile; they would no longer need to be tied to the legislative house to provide representation. Many industrial employers allow their workers to telecommute. This reflects the employers’ realization that these workers are valuable, as well as a recognition that the workforce and the time constraints on it have changed. In many cases, without this option, these workers might leave the workplace. The same reasoning about a valued worker should be applied to our legislators. Further, it does not make sense that today we would allow a subset of the legislature to make and pass laws because of absenteeism, especially given that many of the mechanisms required to bring about a mobile “electronic legislature” are available. One can argue that allowing legislators to telecommute occasionally will provide an improved workforce (this argument is motivated by the same reasons for which private industry uses “telecommuting”). We also observe that an electronic legislature should give constituents greater access to their legislators. A final argument for an electronic legislature is that it will provide continuity of government in the case of some drastic action like a terrorist attack. In the fall of 2001, the legislative branch of the U.S. federal government came under two attacks. The first attack was performed by Al Qaeda operatives (who, it is speculated, intended to fly one of the planes into the U.S. Capitol), and the second by an unknown entity who contaminated parts of the U.S. Senate (and its offices) with anthrax spores. This second attack was successful in that it denied the Senate the ability to convene for several days.
Although such terrorist attacks on the legislative branch may appear novel, at least in the U.S., such attacks have been perpetrated in other countries for some years (PBS, 2001). The U.S. government has recognized the need to develop a means for the continuity of government in the wake of such disasters (Continuity of Government Commission, 2002); one such solution is to utilize an e-legislature. The concept, model, and a protocol for an e-legislature were first described in Desmedt and King (1999). In Ghodosi and Pieprzyk (2001), the authors described an alternative, which required the use of a trusted administrator. Later, in Desmedt and King (2002), we pointed out the weaknesses and disadvantages of the system in Ghodosi and Pieprzyk (2001) and clarified some aspects of the protocol in Desmedt and King (1999).
sEcUrItY cONcErNs One reason to be concerned about the security of an electronic legislature (e-legislature) is that one can “view” the e-legislature as a “network.” Represent the legislators as computers/hosts and their communications as the network communications. All problems that affect a network can affect an e-legislature; however there are several more reasons to be concerned. First observe that as a “law making body,” an e-legislature and the results derived from its communications need to possess a high integrity. In addition, the participation of members from the legislative body will dynamically vary from time-to-time. Further, since the decisions made by the body (i.e., law) are determined by some fixed percentage of those members present/active, there will need to be some “transfer of power” which allows this percentage of the legislators present to pass legislation. For example, suppose that the legislature makes decisions based on majority rules and that the original legislature contains 50 members. Thus 26 legislators are required to approve a proposal into law. Later we have seven legislators absent. At this time, 22 legislators are needed to pass legislation. Thus, there will need to be some mechanism
Securing an Electronic Legislature Using Threshold Signatures
that allows the original body to transfer signing power from the 50 to the 43 (so that in the latter case 22 can pass legislation). This in turn becomes a great risk to the integrity of the legislature. The reason is that a legislature is a political body and its members will certainly act accordingly. The moment at which a transfer needs to occur will be the moment when the risk to the integrity of the legislature is the highest (unless mechanisms are enacted to ensure the integrity).
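The arithmetic in the example above can be checked with a short sketch. It assumes a simple-majority rule (more than half of those present); the function names are illustrative and do not come from the cited protocol.

```python
def majority_threshold(present: int) -> int:
    """Smallest number of 'yes' votes that is a strict majority of `present`."""
    if present < 1:
        raise ValueError("at least one legislator must be present")
    return present // 2 + 1


def has_quorum(present: int, quorum: int) -> bool:
    """True when enough legislators are present to consider a proposal."""
    return present >= quorum


if __name__ == "__main__":
    # 50 members -> 26 needed; 43 present -> 22 needed; 49 present -> 25 needed
    for present in (50, 43, 49):
        print(present, "present ->", majority_threshold(present), "needed")
```

The point of the sketch is only that the number of signers required changes whenever attendance changes, which is why signing power has to be transferred.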
THRESHOLD SIGNATURES

As we have described earlier, the mechanism that is used to pass a “law” is equivalent to creating a signature, where the legislature, acting as a collective body, constructs the signature. The first question is how to model this construction in an electronic legislature. We could of course provide each legislator with a public-key/private-key pair (Menezes, van Oorschot, & Vanstone, 1996), and when a legislator wishes to vote for a proposal they sign it. If enough legislators sign the proposal then the proposal becomes “law.” The problem is that this approach is unsuitable. First, the essence of this system of law making is that it is a group decision; hence the signature should be created by the group and not consist of individual signatures. There are several other reasons why it is not reasonable to have each legislator sign individually; one is the procedure of verification. To verify that the proposal has been passed, one will need to verify each of the individual signatures using each of the individual public keys, and then verify that a suitable number of legislators have signed (endnote 1). Since the verification of a law can take place at various times by various parties, there would be a need to securely store the information concerning who was present and how many. This information would need to be authenticated; hence some signature may need to be applied. But no one party can sign this information, otherwise they would possess a power concerning the signature of proposals (making law) that others do not possess. Thus we need a signature created by a group to authenticate this information, but we were trying to avoid such a signature. Consequently, a signature created by a group is required, and so we should make the signature of a proposal a “group-generated signature,” which is called a threshold signature.

The next question is how to generate this signature as a group. The solution is to use a cryptographic tool called “threshold secret sharing” (endnotes 2 and 3). The tool is such that a distributor (endnote 4) generates a single “legislative signing key” and divides it into shares, one for each of the n legislators, so that any k of the legislators can reconstruct the signing key (endnote 5). Here k is the quorum number. When a proposal is considered, each of the n legislators decides whether to vote for it. A legislator who decides to vote “yes” creates a partial signature by applying the signature generation function with their share. This process of using a threshold secret sharing scheme within a signature scheme is called threshold signature sharing or, for short, threshold signatures.

Consider a legislative body P_1, …, P_n. They each possess a share of the signing key, so they collectively possess the signing power: when a proposal is made, this body has the power to sign it into law (as long as a quorum of legislators is present). The number of legislators present will vary from time to time. As long as a quorum k exists (a pre-agreed minimum number of legislators needed to be present), a proposal can be passed according to some fixed percentage (threshold); for simplicity we will assume a simple majority vote, i.e., a k_t out of n_t vote, where n_t represents the number of legislators present at time t and k_t = ⌊n_t/2⌋ + 1. We must therefore transfer from a k out of n vote to a k_t out of n_t vote.
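To make the k out of n property concrete, the following is a minimal sketch of polynomial secret sharing over a prime field in the style of Shamir (1979). It is an illustration only: the prime, the function names, and the integer “signing key” are assumptions, and (as endnote 5 stresses) legislators in the protocol never actually reconstruct the key. Reconstruction is shown here solely to demonstrate the threshold behaviour.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the field size is an assumption of this sketch


def split_secret(secret, k, n, prime=PRIME):
    """Return n points (x, f(x)) of a random degree-(k-1) polynomial with f(0) = secret."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    if not 0 <= secret < prime:
        raise ValueError("secret must be a field element")
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of f(x) mod prime
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def reconstruct(shares, prime=PRIME):
    """Lagrange interpolation at x = 0 from a list of (x, y) shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret


if __name__ == "__main__":
    key = 123456789
    shares = split_secret(key, k=26, n=50)
    print(reconstruct(shares[:26]) == key)   # any 26 shares recover the key
    print(reconstruct(shares[10:36]) == key)
```

With fewer than k shares, interpolation yields an unrelated field element, which is the sense in which smaller groups learn nothing about the key.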
We can support the dynamism of legislator attendance by doing the following: select any k of the legislators present at time t. Have each of them independently play the role of a distributor and share their partial signature in a k_t out of n_t manner using threshold secret sharing. Then each of the n_t legislators has received k shares. Each of them (for convenience) compresses these k shares into one share.
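The “share, then compress” step can be sketched with the same polynomial sharing over a prime field. This is a simplified analogue, not the RSA-based construction of Desmedt and King (1999): plain field elements stand in for the message-specific partial signatures, and all names and parameters are assumptions. Each of the k distributors re-shares the value it holds in a k_t out of n_t manner, and each recipient compresses the k sub-shares it receives into one share by taking a Lagrange-weighted sum.

```python
import secrets

PRIME = 2**127 - 1  # field size is an assumption of this sketch


def share(value, threshold, xs, prime=PRIME):
    """Share `value` among the x-coordinates `xs` with the given threshold."""
    coeffs = [value] + [secrets.randbelow(prime) for _ in range(threshold - 1)]
    return {x: sum(c * pow(x, e, prime) for e, c in enumerate(coeffs)) % prime
            for x in xs}


def lagrange_at_zero(xs, prime=PRIME):
    """Lagrange coefficients for interpolating at 0 from the points `xs`."""
    coeffs = {}
    for xi in xs:
        num, den = 1, 1
        for xj in xs:
            if xj != xi:
                num = num * (-xj) % prime
                den = den * (xi - xj) % prime
        coeffs[xi] = num * pow(den, -1, prime) % prime
    return coeffs


def combine(points, prime=PRIME):
    """Recover the shared value from a dict {x: share}."""
    lam = lagrange_at_zero(list(points), prime)
    return sum(lam[x] * y for x, y in points.items()) % prime


def reshare(old_points, new_threshold, present, prime=PRIME):
    """Each holder in `old_points` re-shares its value; recipients compress k sub-shares to one."""
    lam = lagrange_at_zero(list(old_points), prime)
    sub = {i: share(v, new_threshold, present, prime) for i, v in old_points.items()}
    return {j: sum(lam[i] * sub[i][j] for i in old_points) % prime for j in present}


if __name__ == "__main__":
    original = share(2024, 4, range(1, 8))               # 4 out of 7
    distributors = {i: original[i] for i in (1, 3, 4, 6)}
    present = [1, 3, 4, 6, 7]
    k_t = len(present) // 2 + 1                           # simple majority: 3
    new_shares = reshare(distributors, k_t, present)
    print(combine({j: new_shares[j] for j in (3, 6, 7)}) == 2024)
```

The compressed shares lie on a single polynomial of degree k_t - 1 whose constant term is the original value, so any k_t of the legislators present can act on it.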
NECESSARY REQUIREMENTS TO SECURE AN E-LEGISLATURE

The first requirement is that the “law” is created by having the legislators partially sign a proposal and that the signature (law) is generated by using threshold signature sharing. Such a signature is created by using the single legislative key. No one entity possesses this key; rather, the key is shared among the participants by using threshold secret sharing.

Second, as the legislature changes in size, signature power must be transferred from the original body to the body that is present. This sharing of signature power needs to be temporary. For example, suppose the original legislature contains 50 parties. Thus 26 legislators are required to approve a proposal into law. Later 7 legislators are absent. At this time 22 are needed to pass legislation. At some later time 49 are present. Consequently 25 are required to pass legislation. If we had permanently shared the secret signing key, then 22 would still be able to sign. We could ask or require that these members destroy their old shares. But this would require that 49-22=27 destroy their old shares; the irony is that we would require more people to be honest than the current threshold (which is 25). If legislators send “shares of their share” to the others (endnote 6), then these legislators can continue to use this information to sign later messages into law (those that occur at later times). In fact they can impersonate this legislator in future votes. Temporary sharing is achieved by having k participants P_{i_1}, …, P_{i_k} transfer their partial signatures instead of their power to sign. Consequently the “transfer of power” (also called “sharing of shares”) needs to be message-oriented, and so it is achieved by sharing partial signatures.

Third, observe that a few of the k participants P_{i_1}, …, P_{i_k} (out of the n_t present) could defeat the process by not properly transferring their power (shares). This would be especially true if they had a vested interest in the message (law) not being passed. Thus, as the transfer of power (“sharing of shares”) is message-oriented, there is a need for the set P_{i_1}, …, P_{i_k} to transfer power blindly (i.e., the message is encrypted before sharing).

Fourth, the participants A_t = P_{i_1}', …, P_{i_{k_t}}', when given an opportunity to act on legislation, must know that the outcome (“sign” or “not sign”) is a result of their decision and not a result of bad faith on the part of the participants P_{i_1}, …, P_{i_k} who transferred them the power to sign (these are the legislators who “share their shares”). Signature generation should be such that if a signature is not generated, then we can be confident that the only possible reason is that there were not enough “yes” votes. We should not have to wait until voting time to find out that this “sharing of shares” was not fair. Hence, the participants P_1', …, P_{n_t}' (the legislators who are present at time t) need to be able to verify that they were actually given the power to sign that message.

Fifth, no set of participants should gain any information about a motion made during an illegal session, that is, a session where either cheaters have been discovered or the number of legislators present is less than the quorum k. Otherwise, they could use this knowledge to act in later sessions. The point is that cheaters should not benefit. Further, cheaters may be motivated by their political affiliation and attempt to cheat so that their colleagues benefit; hence no one should benefit. This provides another reason to blind the motion.
Sixth, in a receipt-required version of an e-laws protocol, for each legislator belonging to A_t there must exist a record as to how that legislator voted. Note that if each legislator sends a validated partial signature (which we interpret as a valid vote) then
this provides a receipt that the legislator voted in favor of the message. We could use the lack of a validated partial signature as a “no” vote. Lastly, we assume that the network is sufficiently reliably connected, even to overcome any disruption by malicious parties.
AN OUTLINE OF A SECURE E-LEGISLATURE PROTOCOL

The following is an outline of a verifiable democracy protocol. We omit all technical details; for these we refer the reader to Desmedt and King (1999).
Verifiable Democracy Protocol: A Democratic Threshold Scheme

During the set-up, the legislature is empowered with a secret key so that any k out of n legislators can compute the secret signing key. At any time t, a message/proposal m_t may be proposed. A_t represents the set of participants present at time t, n_t = |A_t|, and k_t represents the threshold (the minimal number of participants required to sign). If n_t >= k we proceed with the protocol (we have a quorum); if n_t < k then there are not enough legislators to pass the legislation.
Set-Up Phase

Legislative Key Generation

A secret key K is distributed to the n participants so that a “blinded message/proposal” can be signed in a k out of n threshold manner. In addition to distributing shares of K, the distributor generates auxiliary information (endnote 7) which is used later to verify “partial signatures.” (For example, if the protocol utilizes RSA signatures (Rivest, Shamir, & Adleman, 1978), a “test message” is generated and the distributor broadcasts all n partial signatures of the test message. The test message and the partial signatures of the test message play an important role in the verification of future partial signatures (Gennaro, Jarecki, Krawczyk, & Rabin, 1996).) This can be performed by a trusted third party or by the participants using a protocol such as Desmedt and King (1999) and Ghodosi and Pieprzyk (2001).
Use for Each Law-Proposal

Blinding Message

The participant P*, who proposes message m_t, blinds m_t before presenting it to the legislative body A_t.
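One standard way to blind a message under RSA is the textbook multiplicative blinding shown below. It is offered only as an illustration of what “blinding” can mean; the outline above does not specify the blinding technique, so this should not be read as the exact mechanism of Desmedt and King (1999). The toy key (p = 61, q = 53), the blinding factor r, and the function names are all assumptions, and a single signer stands in for the threshold signers.

```python
from math import gcd


def blind(m, r, e, n):
    """Hide m by multiplying it with r^e mod n; r must be invertible mod n."""
    if gcd(r, n) != 1:
        raise ValueError("blinding factor must be coprime to n")
    return (m * pow(r, e, n)) % n


def sign(x, d, n):
    """Textbook RSA signature (no padding): x^d mod n."""
    return pow(x, d, n)


def unblind(blinded_signature, r, n):
    """Remove the blinding factor: (m * r^e)^d * r^-1 = m^d (mod n)."""
    return (blinded_signature * pow(r, -1, n)) % n


def verify(m, s, e, n):
    """Check s^e = m (mod n)."""
    return pow(s, e, n) == m % n


if __name__ == "__main__":
    n, e, d = 3233, 17, 2753      # toy key from p = 61, q = 53 (far too small for real use)
    m, r = 65, 7
    blinded = blind(m, r, e, n)   # what the signers would see
    s = unblind(sign(blinded, d, n), r, n)
    print(s == sign(m, d, n), verify(m, s, e, n))
```

The signers operate only on the blinded value, yet the unblinded result is an ordinary signature on m_t, which matches the requirement that power be transferred before the content of the motion is known.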
Transfer of Power: Partial Signature Generation (TPSG)

As long as n_t exceeds (or equals) k, the message will be considered for signing. If so, k participants in A_t are chosen and they generate partial signatures for the blinded m_t.
Transfer of Power: Partial Signature Distribution (TPSD)

Each of the k participants shares out their partial signature in a k_t out of n_t manner to A_t (we will refer to these k participants as partial signature distributors). Each participant in A_t thus receives k shares, whereupon they compress the k shares of the partial signatures into one share. In addition to distributing partial signatures, the partial signature distributors also distribute auxiliary information which allows the legislative body A_t to verify the correctness of the partial signatures of the blinded m_t. If enough valid shares of the signatures have been obtained at this stage, then this will allow one to obtain the signed law. Details of this are now described.
Transfer of Power: Partial Signature Verification (TPSV)

The auxiliary information provided in TPSD is first verified by each legislator in A_t. Upon verification, the auxiliary information is used by each legislator to verify the correctness of their share of the partial signature of the blinded m_t. The verification procedure is devised so that with overwhelming probability it can be determined that a recipient has received a valid share; this is achieved via a “verification and complaint” protocol. If a verification fails, a complaint is raised. At that point a cheater has been detected, and what remains is a protocol to determine whether the cheater is the “partial share distributor” or the “complainer.” If the cheater is the partial share distributor, that party is removed and one proceeds without it, which is possible due to the sharing technology. The consequence is that the completion of this stage with no complaints implies that the signature power for the message has been transferred to A_t, so that any k_t members can sign the message.
Unblind the Message

The message is revealed to the legislature. Who reveals the message? P* could. Alternatively, if one utilizes a trusted chairperson as in Ghodosi and Pieprzyk (2001), then the trusted chairperson could reveal m_t. (That protocol has several problems, for example, a trusted chairperson is rarely if ever utilized in a legislature; for more details we refer the reader to Desmedt and King (2002).) In Desmedt and King (1999), the protocol utilized RSA signatures, and so the legislators themselves could unblind the message without revealing their partial signatures of m_t.
Decision: Vote on m_t

The legislators decide whether to vote for or against m_t.

Partial Signatures Sent (PSS)

Any legislator who wishes to vote for the by now known m_t sends their share of the partial signature of the blinded m_t.
Verification of the Signature: Determining the Passage of m_t (PSV)

If k_t or more participants have sent their partial signatures, then the message may be passed. If so, the combiner selects any k_t of the sent partial signatures and verifies their correctness using the ancillary information provided within this protocol. For each partial signature found to be invalid, the combiner selects one of the remaining sent partial signatures and verifies it. If the number of valid partial signatures is less than k_t, then the message m_t is automatically not passed. We have adopted a receipt-required version of the verifiable democracy protocol: the partial signatures sent (PSS), together with the partial signature verification (PSV), imply k_t “valid votes.” Who can play the role of the combiner? Any person, collection of people, or even the legislators.
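The combiner's selection logic in PSV can be sketched as a simple loop. The verification routine itself depends on the verifiable sharing scheme in use, so it is passed in as a caller-supplied function; that callback, the function name, and the vote representation are hypothetical and serve only to illustrate the control flow.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

P = TypeVar("P")


def collect_valid_votes(
    sent: Iterable[P],
    k_t: int,
    is_valid: Callable[[P], bool],
) -> Tuple[bool, List[P]]:
    """Verify sent partial signatures one by one until k_t valid ones are found.

    Checking candidates in turn and skipping invalid ones has the same effect
    as picking k_t of them and replacing each invalid one with a remaining one.
    Returns (passed, valid_partial_signatures_found).
    """
    valid: List[P] = []
    for partial in sent:
        if is_valid(partial):
            valid.append(partial)
            if len(valid) == k_t:
                return True, valid
    return False, valid


if __name__ == "__main__":
    # Hypothetical votes: (legislator id, whether the partial signature verifies)
    sent = [(1, True), (2, False), (3, True), (4, True)]
    passed, used = collect_valid_votes(sent, k_t=3, is_valid=lambda p: p[1])
    print(passed, [legislator for legislator, _ in used])
```

Because each valid partial signature is attributable to a legislator, the list returned alongside the verdict also serves as the receipt of “yes” votes that the receipt-required version calls for.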
Message Passed

The message is passed if a signature of m_t can be constructed and the k_t “yes votes” can be verified using the auxiliary information. A vote for m_t is a valid partial signature.

A comment about the verification steps TPSV and PSV in the protocol: the two procedures may utilize different verifiable secret sharing schemes because of the amount of information known to the senders in TPSD and PSS, respectively. In TPSD the senders know the shares of the secret key, whereas in PSS the senders know partial signatures. Whether TPSV and PSV require different verifiable sharing
schemes may depend on the threshold signature scheme that is used.
CONCLUSION

Developing an electronic legislature with high integrity requires careful consideration of the e-legislature protocol. We have provided several requirements that such a protocol will need to satisfy to ensure this integrity. We have also provided a high-level outline of an e-legislature protocol. One of our main concerns is that one may attempt to implement an e-legislature without giving careful consideration to the security risks, potentially handing over democracies to hackers.
REFERENCES

Boyd, C. (1989). Digital multisignatures. In H. J. Beker & F. C. Piper (Eds.), Cryptography and coding (pp. 241-246). Oxford, UK: Oxford University Press.

Continuity of Government Commission. (2002). Preserving our institutions: The first report of the Continuity of Government Commission. Retrieved from http://www.continuityofgovernment.org

Desmedt, Y. (1988). Society and group oriented cryptography: A new concept. In Advances in Cryptology: Crypto ’87, LNCS 293 (pp. 120-127). Springer Verlag.

Desmedt, Y., & King, B. (1999). Verifiable democracy. Proceedings of the IFIP TC6/TC11 Joint Working Conference on Communications and Multimedia Security (CMS’99) (pp. 53-70). Leuven, Belgium: Kluwer Academic Publishers.

Desmedt, Y., & King, B. (2002, September 2-6). Verifiable democracy: A protocol to secure an electronic legislature. EGOV 2002, eGovernment: State of the Art and Perspectives, Aix-en-Provence, France (LNCS). Berlin: Springer Verlag.

Gennaro, R., Jarecki, S., Krawczyk, H., & Rabin, T. (1996). Robust and efficient sharing of RSA functions. Advances in Cryptology: Crypto ’96, Proceedings, LNCS 1109 (pp. 157-172).

Ghodosi, H., & Pieprzyk, J. (2001). Democratic systems. ACISP 2001 (pp. 392-402).

Menezes, A., van Oorschot, P., & Vanstone, S. (1996). Handbook of applied cryptography. Boca Raton: CRC Press.

PBS. (2001). India blames Pakistan militant group for parliament attack. Retrieved December 14, 2001, from www.pbs.org/newshour/updates/december01/india_12-14.html

Rivest, R., Shamir, A., & Adleman, L. (1978). A method for obtaining digital signatures and public key cryptosystems. Communications of the ACM, 21, 120-126.

Shamir, A. (1979). How to share a secret. Communications of the ACM, 22, 612-613.

Weidenbener, L. (2004). House fails to approve kindergarten funding plan. The Courier Journal. Retrieved from www.courier-journal.com/localnews/2004/02/06in/wir-front-kind02068732.html
KEY TERMS

Digital Signature Scheme: A digital signature scheme is a public-key cryptographic tool which allows a party to provide origin authentication for a message. It consists of two schemes: a signature generation scheme, where a party can “sign” a message with their private key, and a verification scheme, where any party can verify the authenticity of the signature by using the public key.
Threshold Secret Sharing: A cryptographic tool which allows one to distribute pieces of a secret key to n participants so that any k of the n participants can collectively reconstruct the secret key, and any set of participants with fewer than k members cannot generate any information about the secret key.

Verifiable Secret Sharing: A secret sharing scheme for which there exists a mechanism which allows the shareholders to verify the correctness of their shares. That is, by utilizing this mechanism they can be assured that their shares can construct the key, without requiring them to reconstruct the key.

RSA Cryptographic Primitive: A public-key cryptographic primitive that can be used for both public-key encryption and digital signing. The public value N is generated by selecting two large, secret, distinct primes p and q and setting N equal to their product. Two parameters e and d are determined by: (1) selecting e so that the gcd of e and (p-1)(q-1) is 1, and (2) computing d so that e*d = 1 mod (p-1)(q-1). Encryption of a message m is equal to m^e mod N, and decryption of C is performed by computing C^d mod N.

Partial Signature: When utilizing a threshold signature scheme, a partial signature is the data generated by a participant by signing a message using their share of the private key (secret key).
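The RSA definition above can be traced with a toy example. The primes and exponent below are illustrative assumptions and are far too small for real use; the function names are likewise hypothetical, and no padding is applied.

```python
from math import gcd


def rsa_keygen(p, q, e):
    """Return (N, e, d) with N = p*q and e*d = 1 mod (p-1)(q-1)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p-1)(q-1)")
    d = pow(e, -1, phi)  # modular inverse (Python 3.8+)
    return n, e, d


def encrypt(m, e, n):
    """Encryption of m: m^e mod N."""
    return pow(m, e, n)


def decrypt(c, d, n):
    """Decryption of C: C^d mod N."""
    return pow(c, d, n)


if __name__ == "__main__":
    n, e, d = rsa_keygen(61, 53, 17)
    c = encrypt(65, e, n)
    print(n, d, c, decrypt(c, d, n))  # 3233 2753 2790 65
```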
ENDNOTES

1 Recall that this “suitable number” is dependent on some fixed percentage of those legislators that are present/active.

2 One method to construct a threshold sharing scheme is to utilize a polynomial construct method within a field (Shamir, 1979).

3 The importance of using threshold sharing to construct a group signature was independently developed in Boyd (1989) and Desmedt (1988).

4 Technology exists to avoid the need to rely on a single distributor, that is, using several distributors.

5 Of course the legislators will never reconstruct the signing key; what they will do is use this information to construct a signature.

6 Rather than sharing out their partial signatures, the participants could share out their shares of the secret signing key.

7 This ancillary information will be broadcast to all, that is, it is public record. The nature of the ancillary information is dependent on the verifiable sharing scheme that is used. In Desmedt and King (1999) we utilized the RSA signature scheme (Rivest, Shamir, & Adleman, 1978), so the ancillary information was based on this assumption of RSA and using a verifiable secret sharing scheme for RSA.
This work was previously published in Encyclopedia of Digital Government, edited by A. Anttiroiko and M. Malkia, pp. 1445-1450, copyright 2007 by Information Science Reference, formerly known as Idea Group Reference (an imprint of IGI Global).
Chapter 3.7
Use of RFID in Supply Chain Data Processing

Jan Owens, University of Wisconsin – Parkside, USA
Suresh Chalasani, University of Wisconsin – Parkside, USA
INTRODUCTION

The use of Radio Frequency Identification (RFID) is becoming prevalent in supply chains, with large organizations such as Wal-Mart, Tesco, and the Department of Defense phasing in RFID requirements on their suppliers. The implementation of RFID can necessitate changes in the existing data models and will add to the demand for processing and storage capacities. This article discusses the implications of RFID technology for data processing in supply chains.
BACKGROUND

RFID is defined as the use of radio frequencies to read information on a small device known as a tag (Rush, 2003). A tag is a radio frequency device that can be read by an RFID reader from a distance when there is no obstruction or misorientation. A tag affixed to a product flowing through a supply chain will contain pertinent information about that product. There are two types of tags: passive and active. An active tag is powered by its own battery, and it can transmit its ID and related information continuously. If desired, an active tag can be programmed to be turned off after a predetermined period of inactivity. Passive tags receive energy from the RFID reader and use it to transmit their ID to the reader. The reader then may send the data
Figure 1. Reading ID information from an RFID tag: (1) the tag receives energy from the RFID reader; (2) the tag transmits its ID information to the reader; (3) the reader sends the ID information to the host system.
Copyright © 2008, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
Figure 2. Interaction between a retailer and a supplier in a supply chain (retailer locations 1 through L connect over an intranet to the retailer’s central server with inventory information, which communicates with the supplier’s information system over a VPN).

Figure 3. Processing information from operational data stores (ODS) to an enterprise data warehouse (EDWH) and to data marts (DM).
to a host system for processing. Figure 1 depicts the activity of reading the ID from a passive tag by an RFID reader (Microlise, 2003). The ID in the above discussion is a unique ID that identifies the product, together with its manufacturer. MIT’s Auto-ID Center proposed the Electronic Product Code (EPC) that serves as the ID on these tags (Auto-ID Technology Guide, 2002). An EPC can be 64 bits or 96 bits long; however, EPC formats allow the length of the EPC to be extended in the future. The Auto-ID Center envisions RFID tags constituting an “Internet of things.” RFID tag information is generated based on events such as a product leaving a shelf or being checked out by a customer at a (perhaps automatic) checkout counter. Such events or activities generate data for the host system shown in Figure 1. The host system, when it processes these data, in turn may generate more data for other partners in the supply chain. Our focus in this article is to study the use of RFID in supply chains.
MAIN THRUST

This article explores the data generated by RFID tags in a supply chain and where these data may be placed in the data warehouse. In addition, this article explores the acceptance of RFID tags by businesses along the supply chain and by consumers.
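[Figure 3 elements: the location ODSs feed the retailer’s enterprise operational data store (EODS); ETL (extract, transform, load) moves EODS data into the EDWH, from which the data marts (DM) are loaded.]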
Types of Data Generated by RFID Tags

The widespread use of the Internet has prompted companies to manage their supply chains using the Internet as the enabling technology (Gunasekaran, 2001). Internet-based supply chains can reduce the overall cost of managing the supply chains, thus allowing the partners to spend more money and effort on innovative research and product development (Grosvenor & Austin, 2001; Hewitt, 2001). Internet-based supply chains also allow smaller companies to thrive without massive physical infrastructures. The impact of RFID on a retailer-supplier interaction in the supply chain is discussed below.

The information system model for communication between a retailer and a supplier is shown in Figure 2. The retailer is assumed to have several locations, each equipped with RFID readers and RFID-tagged items. Each location has its own computer system comprising a local database of its inventory and application programs that process data from the RFID readings. The complete inventory information for the retailer is maintained at a central location comprising high-end database and application servers (Chalasani & Sounderpandian, 2005). Computer systems at retail locations are interconnected to the central inventory server of the retailer by the company’s intranet. Reordering
of inventory from the supplier, once inventory levels fall below the reorder points, takes place by communication of specific messages between the retailer’s information systems and the supplier’s information system (Ranganathan, 2003). Any such communication is facilitated by a virtual private network (VPN), to which the retailer and the supplier subscribe (Weisberg, 2002). For large retailers, such as Wal-Mart, having each location communicate with one central server is impractical. In such cases, a hierarchical model of interconnected servers, where each server serves a regional group of locations, is more practical (Prasad & Sounderpandian, 2003). In this article, for the sake of simplicity, we assume a flat hierarchy. The ideas developed in this article can be extended and applied to hierarchical models as well. RFID readings and the transactions that may be triggered upon processing the readings are classified by Chalasani and Sounderpandian (2004).
Placing RFID Data in an Enterprise Data Warehouse System

Data warehouse systems at the retailer and the supplier should be able to handle the information generated by the transactions described previously. Figure 3 presents a typical data warehouse system at the retailer’s central location. The operational data store (ODS) at each retailer’s location stores the data relevant to that location. The data from the different retailer ODSs are combined to obtain an enterprise operational data store (EODS). The process commonly referred to as ETL (extract → transform → load) is applied to the EODS data, and the resulting data are loaded into the enterprise data warehouse (EDWH). The data from the EDWH are then sliced along several dimensions to produce data for the data marts.

The transactions described in the previous section are handled by several tables in the ODS and the EODS databases. These tables are depicted in Figure 4. The reader table contains the Reader_ID for each RFID reader. This reader ID is the primary key in this table. In addition, the table contains the location of the reader. Reader_Location often is a composite attribute containing the aisle and shelf and other data that precisely identify the location of the reader. The product table has several attributes pertaining to the product, such as the product description. The primary key in the product table is the Product_EPC, which is the electronic product code (EPC) that uniquely identifies each product and is embedded in the RFID tag. The transaction type table is a lookup table that assigns transaction codes to each type of transaction (e.g., point of sale or shelf replenishment). Each of the tables Reader, Product, and Transaction Type has a one-to-many relationship with the transactions table, with the “many” side of the relationship ending on the transactions table.

The amount of transaction data generated by RFID transactions can be estimated using a simple model. Let N be the total number of items and f be the average number of tag-reads per hour. The total number of RFID transactions is N * f per hour. If B is the number of bytes required for each entry in the transactions table, the total storage requirement in the transactions table per hour is given by N * f * B. For example, if there are 100,000 items on the shelves at a retail store location, and the items are read every 15 minutes (f = 4), there are 400,000 transactions every hour. In addition, if each transaction requires 256 bytes of storage, the total storage requirement per hour is about 100 MB. If the store operates on average for 15 hours a day, the total storage per day is about 1.5 GB. If the retailer has 1,000 locations, the total storage required at the EODS is about 1.5 terabytes a day.
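The estimate N * f * B can be reproduced with a few lines of code. The figures are those of the example above; the function names are illustrative, and decimal units (1 MB = 10^6 bytes) are an assumption made here for the printed values.

```python
def storage_bytes_per_hour(items, reads_per_hour, bytes_per_transaction):
    """N * f * B: bytes added to the transactions table each hour."""
    return items * reads_per_hour * bytes_per_transaction


def storage_bytes_per_day(items, reads_per_hour, bytes_per_transaction,
                          hours_open, locations=1):
    """Daily storage across all locations, in bytes."""
    per_hour = storage_bytes_per_hour(items, reads_per_hour, bytes_per_transaction)
    return per_hour * hours_open * locations


if __name__ == "__main__":
    per_hour = storage_bytes_per_hour(100_000, 4, 256)
    per_day = storage_bytes_per_day(100_000, 4, 256, hours_open=15)
    enterprise = storage_bytes_per_day(100_000, 4, 256, hours_open=15, locations=1_000)
    print(per_hour / 1e6, "MB per hour")       # 102.4
    print(per_day / 1e9, "GB per day")         # 1.536
    print(enterprise / 1e12, "TB per day")     # 1.536
```

The exact products are slightly above the rounded figures quoted in the text (100 MB, 1.5 GB, and 1.5 terabytes).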
To reduce the storage requirements, the following principles are adhered to: (1) archive the transactions data at the EODS on a daily basis; that is, move the data from the EODS transactions table to an archived table; (2) purge the transactions data from the ODS tables on a weekly basis; purging does not cause loss of these data, since they are already archived by the EODS; (3) calculate the summary data and write only the summary data to the EDWH.

Figure 4. Tables in ODS and EODS that hold RFID transactions
Transaction Type: Tran_Type_Code (PK), Tran_Type_Description
Product: Product_EPC (PK), Product_Description, …
Reader: Reader_ID (PK), Reader_Location, …
Transactions: Tran_ID (PK), Tran_Type_Code (FK), Product_EPC (FK), Reader_ID (FK)
Relationships: Transaction Type, Product, and Reader are each related one-to-many (1:M) to Transactions.
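The tables of Figure 4 and the archiving principle (1) can be sketched with an in-memory SQLite database. This is only an illustration: the column types, the Read_Time column, the archive table name, and the helper functions are assumptions that do not appear in the article.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Transaction_Type (
    Tran_Type_Code TEXT PRIMARY KEY,
    Tran_Type_Description TEXT
);
CREATE TABLE Product (
    Product_EPC TEXT PRIMARY KEY,
    Product_Description TEXT
);
CREATE TABLE Reader (
    Reader_ID TEXT PRIMARY KEY,
    Reader_Location TEXT
);
CREATE TABLE Transactions (
    Tran_ID INTEGER PRIMARY KEY,
    Tran_Type_Code TEXT NOT NULL REFERENCES Transaction_Type(Tran_Type_Code),
    Product_EPC TEXT NOT NULL REFERENCES Product(Product_EPC),
    Reader_ID TEXT NOT NULL REFERENCES Reader(Reader_ID),
    Read_Time TEXT NOT NULL  -- assumed column, not shown in Figure 4
);
-- Empty copy of the transactions layout, used as the archived table.
CREATE TABLE Transactions_Archive AS SELECT * FROM Transactions WHERE 0;
"""


def open_store():
    """Create an in-memory database holding the Figure 4 tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def archive_before(conn, cutoff):
    """Move transactions read before `cutoff` into the archive; return rows moved."""
    with conn:  # one transaction: copy, then delete
        conn.execute(
            "INSERT INTO Transactions_Archive "
            "SELECT * FROM Transactions WHERE Read_Time < ?", (cutoff,))
        cur = conn.execute("DELETE FROM Transactions WHERE Read_Time < ?", (cutoff,))
    return cur.rowcount


if __name__ == "__main__":
    conn = open_store()
    conn.execute("INSERT INTO Transaction_Type VALUES ('POS', 'Point of sale')")
    conn.execute("INSERT INTO Product VALUES ('EPC-001', 'Sample item')")
    conn.execute("INSERT INTO Reader VALUES ('R1', 'Aisle 3, shelf 2')")
    conn.executemany(
        "INSERT INTO Transactions (Tran_Type_Code, Product_EPC, Reader_ID, Read_Time) "
        "VALUES ('POS', 'EPC-001', 'R1', ?)",
        [("2004-01-01T10:00",), ("2004-01-02T10:00",)])
    print(archive_before(conn, "2004-01-02"))
```

Copying and deleting inside one database transaction keeps a reading from being lost or duplicated if the job is interrupted, which is the property the daily archiving step relies on.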
Customer Acceptance of RFID Tags

Customer acceptance of RFID technology and diffusion will depend on its perceived value in the supply chain. The rate of acceptance can be explained and predicted through five general product characteristics that impact this perception of value: complexity, compatibility, relative advantage, observability, and trialability (Rogers, 1983). Complexity describes the degree of difficulty involved in understanding and using a new product. The more complex the product, the more difficult it is to understand and use, and the slower will be its diffusion in the marketplace. Compatibility describes the degree to which the new product is consistent with the adopter’s existing values and product knowledge, past experiences, and current needs. Incompatible products diffuse more slowly than compatible products, as adopters must invest more time in understanding the new product and rationalizing or replacing existing operations, knowledge, and behaviors. Relative advantage describes the degree to which a product is perceived as superior to existing substitutes. A product that is notably
easier to use, more efficient, or accomplishes an objective more effectively than other alternatives provides a relative advantage. Related to this is the speed with which the anticipated benefits of the new product accrue to the adopter. Observability describes the degree to which the benefits or other results of using the product can be observed and communicated to target customers. This is the noticeable difference factor, or the ability to observe its features or performance. Trialability describes the degree to which a product can be tried on a limited basis with little or acceptable risk. Demonstrations, test-drives, limited implementation, and sampling are some of the ways that promote customer adoption and diffusion of a new product. By demonstrating the usefulness and value of the new product, potential adopters reduce the risk of making a poor adoption decision.
Producers, Manufacturers, Distributors, and Retailers
The adoption of RFID technology presents some initial advantages and disadvantages for all business-to-business customers. One advantage is the potential supply chain savings, but the savings come with considerable upfront cost, even when issues of compatibility and reliability have been resolved (Mukhopadhyay & Kekre, 2002). Because buyers and sellers have somewhat different operational concerns specific to each business, supply chain partners may value these advantages and disadvantages differently and, thus, may perceive a different relative advantage in adopting RFID. At a basic level, both retailers and suppliers will incur substantial upfront investment in readers and systems integration. However, this cost to retailers will be relatively fixed. In contrast, manufacturers have the additional unit cost of the tags affixed to pallets, cases, or items. This cost is currently prohibitive for many small-ticket, small-margin goods, particularly in high-volume manufacturing such as consumer packaged goods.
Use of RFID in Supply Chain Data Processing
Essentially, the compatibility issue will be dictated by the more powerful players in the supply chain. Even if a firm sees many obstacles in RFID adoption, powerful buyers will dictate a producer’s RFID adoption process if the producer hopes to retain its key accounts. From a producer’s perspective, only compliance may keep its business with an important retailer or customer; that is, there is a strong motivation to master the new technology and become compatible with new technological demands (Pyke et al., 2001). Similarly, the regulatory environment is an important motivator in the adoption of new technologies. Many ranchers and food producers have adopted the RFID tag to meet future tracking requirements in the European Union. Even so, the volume of new data, as well as system compatibility issues, are not small considerations in the effective implementation of RFID (Levinson, 2004). The producers who see the relative advantage of RFID at current RFID unit prices sell higher-margin goods and containers, such as pharmaceuticals, car parts, boxcars, livestock, and apparel. Here, the high value of the item can accommodate the current cost of the RFID tags. Manufacturers that can use additional information about the product at different points in the supply chain would also see a clear advantage of RFID over bar codes. For example, makers of pharmaceuticals that must stay within constant temperature, time, and moisture ranges for efficacy can monitor such data through to the customer’s purchase. High-end apparel can carry tags coded to indicate the availability of other colors or complementary products; RFID tags tied to VINs on automobiles can maintain a car’s manufacturing, use, and repair history. Indeed, makers of some products, such as higher-end consumer electronics, see RFID as a potentially less expensive anti-theft device than the security systems they currently use (Stone, 2004).
Clear identification of counterfeit goods is also a valued advantage of RFID adoption in defending proprietary designs and formulations
across many product categories as well as providing a guarantee of product safety and efficacy. However, even high value-added products, such as pharmaceuticals, currently use only track-and-trace applications until EPC tag specifications can prevent one chip’s programming being copied to another, a key requirement in guaranteeing product authenticity (Whiting, 2004). In contrast, the tags are prohibitively expensive on low-ticket, low-margin items typical of consumer packaged goods. Relative to current, fairly reliable distribution systems, RFID may offer manufacturers less added value and efficiency than it offers retailers. Producers also want assurance that their tags cannot be read by their competitors’ readers, either in competitor stores or warehouses. Producers feel that they have more to lose in divulging inventory and marketing specifications than do retailers (Berkman, 2002). Loss of sensitive, proprietary information can be seen to increase the system’s vulnerability to competitors, a potentially prohibitive cost to its implementation. Software safeguards must be in place so that tags can be read and changed only by approved readers (Hesseldahl, 2004a). Pilot projects will facilitate RFID trialability and observability, both within and between firms, and subsequently demonstrate comparative advantages over the prior inventory systems. Besides improved inventory monitoring and control, new RFID tags that are incorporated into the packaging also promise to reduce system costs. As major supply chain groups fine-tune the technology and share successes and improvements, and as unit costs of RFID decline, RFID becomes more likely to be considered and adopted. In contrast, as glitches arise, only firms that envision the highest payoff from RFID are likely to engage in extensive pilot testing (Shi & Bennett, 2001). However, third-party mid-level firms are increasingly available to smooth the complexities of RFID implementation (Hesseldahl, 2004b).
RFID and Consumer Concerns
Consumers will not be concerned with the technical complexity and compatibility of RFID in the same way as producers and retailers, just as they are unconcerned with the technical issues of bar codes. Instead, they are more concerned about RFID compatibility with lifestyle issues and the comparative advantage of the tags in their everyday lives. Advantages can be seen in everything from assured efficacy of pharmaceuticals to quality parts assurance. New technologies may see RFID tags on shirts that program an appropriate wash cycle; that provide heating instructions from a frozen dinner to a microwave oven; that suggest complementary merchandise to an item of apparel or accessories based on past purchases; that warn of past-due shelf dates; and that can add a product to a shopping list when it is removed from the refrigerator. However, until consumers have homes with appropriately programmed appliances and other infrastructure, or retailers demonstrate value-added services based on RFID technology, observability, trialability, and compatibility will be issues in demonstrating the value of RFID to consumers. A great concern for consumers is privacy. Consumers worry about RFID’s potential for post-purchase surveillance, as it opens the door to corporate, government, and other sources of abuse. Groups such as Consumers Against Supermarket Privacy Invasion and Numbering (CASPIAN) already have organized protests against chains at the forefront of data gathering, such as the collection of information from customer loyalty cards. The clothing retailer Benetton considered putting RFID tags into some apparel, until CASPIAN threatened a boycott. Some consumer groups fear that a thief with an RFID reader easily could identify which homes had the more expensive TVs and other tempting targets, unless the tags are deactivated (Pruitt, 2004).
FUTURE TRENDS
Retailers have proposed various solutions to privacy concerns. A customer may deactivate the tags when exiting a store, but this becomes cumbersome for large shopping trips. Furthermore, deactivation could block useful data in third-party situations, such as dietary information and allergy alerts. Other suggestions have included special carry bags that block the RFID signal, but this is often impractical or ineffective in home storage for many items. However, it is more feasible for small, personally sensitive items such as prescription medications. Expiring product signals may defeat the purpose of accurately identifying product returns or recalls, unless the tag can be reactivated.
CONCLUSION
To date, retailers have been the driving forces behind the adoption of RFID technology. Retailers who have a very wide variety of goods, such as large general merchandisers like Wal-Mart and grocery chains such as Metro, can observe much improvement in inventory systems using RFID compared to bar-code technology. Yet, in their current form, RFID systems are complex to install and incompatible with current systems, plant, and equipment. Furthermore, staff will have to be retrained in order to extract the full value from RFID technology, including the store clerk who can search for additional inventory in an off-site warehouse.
REFERENCES
Auto-Id Technology Guide. (2002). MIT Auto ID Center, Cambridge, MA.
Berkman, E. (2002). How to practice safe B2B: Before swapping information with multiple ecommerce partners, it pays to protect yourself by pushing partners to adopt better security practices. CIO, 15(17), 1-5.
Chalasani, S., & Sounderpandian, J. (2004). RFID for retail store information systems. Proceedings of the Americas Conference on Information Systems (AMCIS 2004), New York.
Chalasani, S., & Sounderpandian, J. (2005). Performance benchmarks and cost sharing models for B2B supply chain information systems [to appear in Benchmarking: An International Journal].
Grosvenor, F., & Austin, T.A. (2001). Cisco’s eHub initiative. Supply Chain Management Review, 5(4), 28-35.
Gunasekaran, A. (2001). Editorial: Benchmarking tools and practices for twenty-first century competitiveness. Benchmarking: An International Journal, 8(2), 86-87.
Hesseldahl, A. (2004a). A hacker’s guide to RFID. Forbes.com. Retrieved from http://www.forbes.com/commerce/2004/07/29/cx_ah_0729rfid.html
Hesseldahl, A. (2004b). Master of the RFID universe. Forbes.com. Retrieved from http://www.forbes.com/manufacturing/2004/06/29/cx_ah_0629rfid.html
Hewitt, F. (2001). After supply chains, think demand pipelines. Supply Chain Management Review, 5(3), 28-41.
Levinson, M. (2004). The RFID imperative. CIO.com. Retrieved from http://www.cio.com.au/pp.php?id=557782928&fp=2&fpid=2
Microlise. (2003). White Paper on RFID Tagging Technology.
Mukhopadhyay, T., & Kekre, S. (2002). Strategic and operational benefits of electronic integration. Management Science, 48(10), 1301-1313.
Prasad, S., & Sounderpandian, J. (2003). Factors influencing global supply chain efficiency: Implications for information systems. Supply Chain Management: An International Journal, 8(3), 241-250.
Pruitt, S. (2004). RFID: Is big brother watching? Infoworld. Retrieved from http://www.inforworld.com/article/04/03/19/HNbigbrother_1.html
Pyke, D. et al. (2001). e-Fulfillment: It’s harder than it looks. Supply Chain Management Review, 5(1), 26-33.
Ranganathan, C. (2003). Evaluating the options for business-to-business e-commerce. In C.V. Brown (Ed.), Information systems management handbook. New York, NY: Auerbach Publications.
Rogers, E.M. (1983). Diffusion of innovations. New York: The Free Press.
Rush, T. (2003). RFID in a nutshell—A primer on tracking technology. UsingRFID.com. Retrieved from http://usingrfid.com/features/read.asp?id=2
Shi, N., & Bennett, D. (2001). Benchmarking for information systems management using issues framework studies: Content and methodology. Benchmarking: An International Journal, 8(5), 358-375.
Stone, A. (2004). Stopping sticky fingers with tech. BusinessWeekOnline. Retrieved from http://www.businessweek.com/technology/content/aug2004/tc20040831_9087_tc172.htm
Thomson, I. (2004). Privacy fears haunt RFID rollouts. CRM Daily. Retrieved from http://wireless.newsfactor.com/story.xhtml?story_id=23471
Weisberg, D. (2002). Virtual private exchanges change e-procurement. Information Executive, 6(1), 4-5.
Whiting, R. (2004). RFID to flourish in pharmaceutical industry. InformationWeek. Retrieved from http://infomrationweek.com/story/showArticle.jhtml?article!ID=29116923
KEY TERMS
Active Tag: An active tag is powered by its own battery, and it can transmit its ID and related information continuously.
Auto Id Technology: A precursor to the RFID technology that led to the definitions of RFID technology, including EPC.
Compatibility: Describes the degree to which the new product is consistent with the adopter’s existing values and product knowledge, past experiences, and current needs.
Electronic Product Code (EPC): Uniquely identifies each product and is normally a 128-bit code. It is embedded in the RFID tag of the product.
Observability: The degree to which the benefits or other results of using the product can be observed and communicated to target customers.
Passive Tag: Receives energy from the RFID reader and then transmits its ID to the reader.
RFID: Radio Frequency Identification, defined as the use of radio frequencies to read information on a small device known as a tag.
Trialability: The degree to which a product can be tried on a limited basis.
This work was previously published in Encyclopedia of Data Warehousing and Mining, edited by J.Wang, pp. 1160-1165, copyright 2005 by Information Science Reference, formerly known as Idea Group Reference (an imprint of IGI Global).
Chapter 3.8
Digital Signature-Based Image Authentication Der-Chyuan Lou National Defense University, Taiwan Jiang-Lung Liu National Defense University, Taiwan Chang-Tsun Li University of Warwick, UK
ABSTRACT
This chapter is intended to disseminate the concept of digital signature-based image authentication. Capabilities of digital signature-based image authentication and its superiority over watermarking-based approaches are described first. Subsequently, two general models of this technique, strict authentication and non-strict authentication, are introduced. Specific schemes of the two general models are also reviewed and compared. Finally, based on the review, design issues faced by researchers and developers are outlined.
INTRODUCTION
In the past decades, the technological advances of international communication networks have facilitated efficient digital image exchanges. However, the availability of versatile digital signal/image processing tools has also made image duplication trivial and manipulations indiscernible to the human visual system (HVS). Therefore, image authentication and integrity verification have become a popular research area in recent years. Generally, image authentication is regarded as a procedure of guaranteeing that the image content has not been altered, or at least that the visual (or semantic) characteristics of the image are maintained after incidental manipulations such
Copyright © 2008, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
as JPEG compression. In other words, one of the objectives of image authentication is to verify the integrity of the image. For many applications, such as medical archiving, news reporting, and the recording of political events, the capability of detecting manipulations of digital images is often required. Another need for image authentication arises from the requirement of checking the identity of the image sender. Consider a buyer who wants to purchase and receive an image over a network. The buyer may obtain the image via e-mail or from Internet-attached servers, which may give a malicious third party the opportunity to intercept and manipulate the original image. The buyer therefore needs assurance that the received image is indeed the original image sent by the seller. This requirement is referred to as the legitimacy requirement in this chapter. To address both the integrity and legitimacy issues, a wide variety of techniques have recently been proposed for image authentication. Depending on the way chosen to convey the authentication data, these techniques can be roughly divided into two categories: labeling-based techniques (e.g., the method proposed by Friedman, 1993) and watermarking-based techniques (e.g., the method proposed by Walton, 1995). The main difference between these two categories is that labeling-based techniques store the authentication data in a separate file, while watermarking-based authentication can be accomplished without the overhead of a separate file. However, compared to watermarking-based techniques, labeling-based techniques potentially have the following advantages.
• They can detect the change of every single bit of the image data if strict integrity has to be assured.
• The image authentication can be performed in a secure and robust way in public domain (e.g., the Internet).
• The data hiding capacity of labeling-based techniques is higher than that of watermarking.
Given these advantages over watermarking-based techniques, this chapter focuses on labeling-based authentication techniques. In labeling-based techniques, the authentication information is conveyed in a separate file called a label. A label is additional information associated with the image content and can be used to identify the image. To associate the label content with the image content, two different methodologies can be employed.
• The first methodology uses the functions commonly adopted in message authentication schemes to generate the authentication data. The authentication data are then encrypted with secret keys or private keys, depending on which cryptographic authentication protocol is employed. When applied to two different bit-streams, these functions produce two different bit sequences (i.e., different authentication data), so that the change of every single bit can be detected. In this chapter, image authentication schemes of this class are referred to as strict authentication.
• The second methodology uses special-purpose functions to extract essential image characteristics (or features) and encrypts them with the sender’s private key (Li, Lou & Chen, 2000; Li, Lou & Liu, 2003). This procedure is the same as the digital signature protocol except that the features must be designed to tolerate some specific image processing techniques such as JPEG compression (Wallace, 1991). In this chapter, image authentication techniques of this class are referred to as non-strict authentication.
The strict authentication approaches should be used when strict image integrity is required and no modification is allowed. The functions used to produce such authentication data (or authenticators) can be grouped into three classes: message encryption, message authentication code (MAC), and hash function (Stallings, 2002). For message encryption, the original message is encrypted, and the encrypted result (or cipher-text) of the entire message serves as its authenticator. To authenticate the content of an image, both the sender and receiver share the same secret key. A message authentication code is a fixed-length value (authenticator) that is generated by a public function with a secret key; here too, the sender and receiver share the secret key that is used to generate the authenticator. A hash function is a public function that maps a message of any length to a fixed-length hash value that serves as the authenticator. Because no secret key is involved in creating this authenticator, hash functions have to be combined with the digital signature procedure for the electronic exchange of messages. The details of how to perform those labeling-based authentication schemes and how to obtain the authentication data are described in the second section.
The non-strict authentication approaches must be chosen when some forms of image modification (e.g., JPEG lossy compression) are permitted, while malicious manipulation (e.g., deletion and modification of objects) must be detected. This task can be accomplished by extracting features that are invariant to predefined image modifications. Most of the techniques proposed in the literature adopt the same authentication procedure as that of digital signature to resolve the legitimacy problem, and exploit invariant features of images to achieve non-strict authentication. These techniques are often regarded as digital signature-based techniques and will be further discussed in the rest of this chapter.
To make the chapter self-contained, some labeling-based techniques that do not follow the
standard digital-signature procedures are also introduced in this chapter. This chapter is organized as follows. Following the introduction in the first section, the second section presents some generic models, including strict and non-strict ones, for digital signature-based image authentication. This is followed by a section discussing various techniques for image authentication. Next, the chapter addresses the challenges of designing secure digital signature-based image authentication methods. The final section concludes this chapter.
GENERIC MODELS
Digital signature-based image authentication is based on the concept of digital signature, which is derived from a cryptographic technique called the public-key cryptosystem (Diffie & Hellman, 1976; Rivest, Shamir & Adleman, 1978). Figure 1 shows the basic model of digital signature. The sender first uses a hash function, such as MD5 (Rivest, 1992), to hash the content of the original data (or plaintext) to a small file (called the digest). Then the digest is encrypted with the sender’s private key. The encrypted digest forms a unique “signature” because only the sender knows the private key. The signature is then sent to the receiver along with the original information. The receiver can use the sender’s public key to decrypt the signature and obtain the original digest. The received information is then hashed with the same hash function used on the sender side. If the decrypted digest matches the newly created digest, the legitimacy and the integrity of the message are authenticated. There are two points worth noting in the process of digital signature. First, the plaintext is not limited to text files. In fact, any type of digital data, such as digitized audio data, can be the original data. Therefore, the original data in Figure 1 can be replaced with a digital image, and the process of digital signature can then be
Figure 1. Process of digital signature
used to verify the legitimacy and integrity of the image. The concept of the trustworthy digital camera (Friedman, 1993) for image authentication is based on this idea. In this chapter, this type of image authentication is referred to as digital signature-based image authentication. Second, the hash function is a mathematical digest function. If a single bit of the original image is changed, the hash output will, with overwhelming probability, be different. Therefore, the strict integrity of the image can be verified, and this is called strict authentication in this chapter. The framework of strict authentication is described in the following subsection.
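The signing and verification steps described above can be sketched in a few lines. The following is a minimal illustration only, under several assumptions that are not part of the original description: SHA-256 stands in for MD5, the key pair is a toy textbook RSA key built from two Mersenne primes with no padding (far too weak for real use), and the names `digest`, `sign`, and `verify` are hypothetical. It requires Python 3.8 or later for the modular inverse.

```python
import hashlib

# Toy RSA key (assumption for illustration; NOT secure, no padding).
P = 2**127 - 1                       # Mersenne prime
Q = 2**89 - 1                        # Mersenne prime
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent


def digest(data: bytes) -> int:
    """Hash the data and reduce it so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(image: bytes) -> int:
    """Sender side: encrypt the digest with the private key."""
    return pow(digest(image), D, N)


def verify(image: bytes, signature: int) -> bool:
    """Receiver side: decrypt with the public key and compare digests."""
    return pow(signature, E, N) == digest(image)


image = b"stand-in for the raw bytes of an image"
signature = sign(image)
authentic = verify(image, signature)             # unmodified image
rejected = not verify(image + b"\x00", signature)  # altered image
```

In practice a vetted cryptographic library and a standard signature scheme would replace the toy key; the sketch only mirrors the hash, sign, and compare structure of the model.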
Strict Authentication
Figure 2 shows the main elements and their interactions in a generic digital signature-based model for image authentication. Assume that the sender wants to send an image I to the receiver, and the legitimate receiver needs to assure the legitimacy and integrity of I. The image I is first hashed to a small file h:

\[ h = H(I), \tag{1} \]

where H(⋅) denotes the hash operator. The hashed result h is then encrypted (signed) with the sender’s private key K_R to generate the signature:

\[ S = E_{K_R}(h), \tag{2} \]

where E(⋅) denotes the public-key encryption operator. The digital signature S is then attached to the original image to form a composite message:

\[ M = I \,\|\, S, \tag{3} \]

where “||” denotes the concatenation operator. If the legitimacy and integrity of the received image I’ needs to be verified, the receiver first
Figure 2. Process of digital signature-based strict authentication
separates the suspicious image I’ from the composite message, and hashes it to obtain the new hashed result, that is:

\[ h' = H(I'). \tag{4} \]

The attached signature is decrypted with the sender’s public key K_p to obtain the possible original hash code:

\[ \hat{h} = D_{K_p}(\hat{S}), \tag{5} \]

where D(⋅) denotes the public-key decryption operator. Note that we use \hat{S} and \hat{h} respectively to represent the received signature and its hash result because the received signature may be a forged one. The legitimacy and integrity can be confirmed by comparing the newly created hash h’ and the possible original hash \hat{h}. If they match, we can claim that the received image I’ is authentic.

The above framework can be employed to ensure the strict integrity of an image because of the characteristics of hash functions. In the process of digital signature, one can easily create the hash of an image, but it is difficult to reverse a hash to obtain the original image. This is referred to as the “one-way” property, and the hash functions used in digital signature are therefore also called one-way hash functions. MD5 and SHA (NIST FIPS PUB, 1993) are two good examples of one-way hash functions. Besides one-way hash functions, there are other authentication functions that can be utilized to perform strict authentication. Those authentication functions can be classified into two broad categories: conventional encryption functions and message authentication code (MAC) functions.

Figure 3. Process of encryption function-based strict authentication

Figure 3 illustrates the basic authentication framework for using conventional encryption functions. An image, I, transmitted from the sender to the receiver, is encrypted using a secret key K shared by both sides. If the decrypted image I’ is meaningful, then the image is considered authentic, because only the legitimate sender has the shared secret key. Although this is a very straightforward method for strict image authentication, it also gives opponents opportunities to forge a meaningful image. For example, if an opponent has the pair (I, C) of an image and its ciphertext, he/she can forge an intelligible image I’ by the cutting and pasting method (Li, Lou & Liu, 2003). One solution to this problem is to use the message authentication code (MAC). Figure 4 demonstrates the basic model of MAC-based strict authentication. The MAC is a cryptographic checksum that is generated with a shared secret key before the transmission of the original image I. The MAC is then transmitted to the receiver along with the original image. To verify the integrity, the receiver conducts the same calculation on the received image I’ using the same secret key to generate a new MAC. If the received MAC matches the calculated MAC, then the integrity of the received image is verified. This is because if an attacker alters the original image without changing the MAC, the newly calculated MAC will differ from the received MAC.
Figure 4. Process of MAC-based strict authentication
The MAC function is similar to an encryption function. One difference is that the MAC algorithm does not need to be reversible, whereas an encryption function must be reversible so that the ciphertext can be decrypted. As a result of this mathematical property, the MAC function is less vulnerable to being broken than an encryption function. Although MAC-based strict authentication can detect a fake image created by an attacker, it cannot prevent “legitimate” forgery, because the sender and the receiver share the same secret key. The receiver can therefore create a fake image, compute a valid MAC for it with the shared secret key, and claim that this image was received from the legitimate sender. Given these problems with encryption and MAC functions, the digital signature-based method seems a better way to perform strict authentication. With the growing number of applications that can tolerate one or more content-preserving manipulations, non-strict authentication is becoming increasingly important.
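The MAC-based check, and the “legitimate” forgery it cannot rule out, can be illustrated with a short sketch. It assumes HMAC with SHA-256 as the MAC algorithm, which the chapter does not prescribe, and the key, the byte strings, and the function names `make_mac` and `check_mac` are hypothetical.

```python
import hashlib
import hmac


def make_mac(key: bytes, image: bytes) -> bytes:
    """Compute a fixed-length MAC over the image with the shared key."""
    return hmac.new(key, image, hashlib.sha256).digest()


def check_mac(key: bytes, image: bytes, received_mac: bytes) -> bool:
    """Recompute the MAC on the received image and compare in constant time."""
    return hmac.compare_digest(make_mac(key, image), received_mac)


shared_key = b"hypothetical shared secret key"
image = b"stand-in for the raw bytes of the original image"
mac = make_mac(shared_key, image)

intact_ok = check_mac(shared_key, image, mac)           # unmodified image
tampered_ok = check_mac(shared_key, image + b"!", mac)  # altered in transit

# The receiver holds the same key, so it can produce a valid MAC for an
# image the sender never created: the MAC alone cannot settle who made it.
forged_image = b"an image the sender never produced"
forged_mac = make_mac(shared_key, forged_image)
forgery_passes = check_mac(shared_key, forged_image, forged_mac)
```

The last three lines show why a shared-key scheme cannot provide the legitimacy guarantee that a private-key signature does.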
Non-strict Authentication
Figure 5. Process of non-strict authentication

Figure 5 shows the process of non-strict authentication. The procedure is similar to that of strict authentication, except that the function used to digest the image is a specially designed feature extraction function f_C. Assume that the sender wants to deliver an image I to the receiver. A feature extraction function f_C is used to extract the image feature and to encode it to a small feature code:

\[ C = f_C(I), \tag{6} \]

where f_C(⋅) denotes the feature extraction and coding operator. The extracted feature code has three significant properties. First, the size of the extracted feature code is relatively small compared to the size of the original image. Second, it preserves the characteristics of the original image. Third, it can tolerate incidental modifications of the original image. The feature code C is then encrypted (signed) with the sender’s private key K_R to generate the signature:

\[ S = E_{K_R}(C). \tag{7} \]

The digital signature S is then attached to the original image to form a composite message:

\[ M = I \,\|\, S. \tag{8} \]

Then the composite message M is forwarded to the receiver. The original image may be lossy compressed, decompressed, or tampered with during transmission. Therefore, the received composite message may include a corrupted image I’. The original I may also be compressed prior to the concatenation operation. If a lossy compression strategy is adopted, the original image I in the composite message can be considered a corrupted one. In order to verify the legitimacy and integrity of the received image I’, the receiver first separates the corrupted image I’ from the composite message, and generates a feature code C’ by using the same feature extraction function as on the sender side, that is:
Figure 6. Idea of the trustworthy digital camera
\[ C' = f_C(I'). \tag{9} \]

The attached signature is decrypted with the sender’s public key K_U to obtain the original feature code:

\[ \hat{C} = D_{K_U}(\hat{S}). \tag{10} \]

Note that we use \hat{S} and \hat{C} to represent the received signature and feature code here because the signature may be forged. The legitimacy and integrity can be verified by comparing the newly generated feature code C’ and the received feature code \hat{C}. To differentiate the errors caused by authorized modifications from the errors of malevolent manipulations, let d(C, C’) be a measure of the difference between the extracted features and the original. Let T denote a tolerable threshold value for examining the values of d(C, C’) (e.g., it can be obtained by performing a maximum compression to an image). The received image may be considered authentic if the condition d(C, C’) < T is met.

Defining a suitable function to generate a feature code that satisfies the requirements for non-strict authentication is another issue. Ideally, a feature code should be able to detect content-changing modifications and tolerate content-preserving modifications. The content-changing modifications may include cropping and object addition, deletion, and modification, while the content-preserving modifications may include lossy compression, format conversion, and contrast enhancement. It is difficult to devise a feature code that is sensitive to all the content-changing modifications while remaining insensitive to all the content-preserving modifications. A practical approach is to design the feature extraction function around specific manipulation methods (e.g., JPEG lossy compression). As we will see in the next section, most of the proposed non-strict authentication techniques are based on this idea.
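The threshold test can be made concrete with a small sketch. It is not one of the schemes reviewed in this chapter: the feature code is assumed to be the list of mean intensities of 4×4 blocks, d is assumed to be the largest absolute difference between corresponding block means, and the threshold value, the test images, and the names `extract_features`, `distance`, and `is_authentic` are all hypothetical. Signing and decrypting the feature code are omitted so that only the comparison step is shown.

```python
def extract_features(img, block=4):
    """Feature code: mean intensity of each block (assumed feature)."""
    h, w = len(img), len(img[0])
    feats = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            vals = [img[y][x]
                    for y in range(r, min(r + block, h))
                    for x in range(c, min(c + block, w))]
            feats.append(sum(vals) / len(vals))
    return feats


def distance(c1, c2):
    """d: largest absolute difference between corresponding features."""
    return max(abs(a - b) for a, b in zip(c1, c2))


def is_authentic(signed_code, received_img, threshold):
    """Accept the image if its feature code stays within the threshold."""
    return distance(signed_code, extract_features(received_img)) < threshold


# 8x8 grayscale test image with values 0..84 (hypothetical data).
original = [[x * 8 + y * 4 for x in range(8)] for y in range(8)]
signed_code = extract_features(original)
T = 5.0  # hypothetical tolerance

# Incidental change: uniform brightening by 2 grey levels.
brightened = [[p + 2 for p in row] for row in original]

# Content change: the top-left 4x4 block is painted white.
tampered = [row[:] for row in original]
for y in range(4):
    for x in range(4):
        tampered[y][x] = 255
```

With these assumptions, `is_authentic(signed_code, brightened, T)` accepts the brightened image, while the painted block moves one block mean far beyond the threshold and the tampered image is rejected.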
STATE OF THE ART
In this section, several existing digital signature-based image authentication schemes are detailed. Specifically, work related to strict authentication is described in the first subsection, and work related to non-strict authentication in the second. Note that the intention of this section is to describe the methodology of the techniques. Some related problems with these techniques are discussed further in the fourth section, which also addresses issues in designing practical schemes for digital signature-based image authentication.
Digital Signature-Based Image Authentication
Figure 7. Verification process of Friedman’s idea
Strict Authentication
Friedman (1993) combined the idea of the digital signature with the digital camera and proposed a "trustworthy digital camera," which is illustrated in Figure 6. The proposed digital camera uses a digital sensor instead of film and delivers the image directly in a computer-compatible format. A secure microprocessor is assumed to be built into the digital camera and programmed at the factory with the private key used to encrypt the digital signature. The public key necessary for later authentication appears on the camera body as well as on the image's border. Once the digital camera captures an image, it produces two output files. One is an all-digital, industry-standard file representing the captured image; the other is an encrypted digital signature generated by applying the camera's unique private key (embedded in the camera's secure microprocessor) to a hash of the captured image file, a procedure described in the second section. The digital image file and the digital signature can later be distributed freely and safely. The verification process of Friedman's idea is illustrated in Figure 7. Authentication is carried out with public-domain verification software. To authenticate a digital image file, the verification software, running on a standard computer platform, needs the digital image, its accompanying digital signature file, and the public key. The program calculates the hash of the input image and uses the public key to decode the digital signature and reveal the original hash. If these two hash values match, the image is considered to be authentic. If they differ, the integrity of the image is questionable. It should be noted that hash values produced by a cryptographic algorithm such as MD5 will not match if even a single bit of the image file is changed. This is the defining characteristic of strict authentication, but it makes the approach unsuitable for images that undergo lossy compression. In that case, the authentication code should be generated in a non-strict way, and non-strict authentication schemes have been proposed for this purpose.
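The sign-then-verify flow of the trustworthy camera can be sketched as follows. Everything cryptographic here is a stand-in chosen for brevity: the key is a toy textbook RSA key built from two small known primes with no padding, SHA-256 replaces the hash named in the text, and the function names are hypothetical. It is far too weak for real use and only illustrates why a one-bit change breaks strict authentication.

```python
import hashlib

# Toy RSA key from two known Mersenne primes (illustration only, not secure).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                                # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent (Python 3.8+)

def digest_as_int(image_bytes):
    """Hash the image file and reduce the digest into the toy modulus."""
    return int.from_bytes(hashlib.sha256(image_bytes).digest(), "big") % N

def sign(image_bytes):
    """Camera side: apply the private key to the hash of the image."""
    return pow(digest_as_int(image_bytes), D, N)

def verify(image_bytes, signature):
    """Verifier side: recover the hash with the public key and compare."""
    return pow(signature, E, N) == digest_as_int(image_bytes)

image = bytes(range(256)) * 4            # stand-in for a captured image file
signature = sign(image)

flipped = bytearray(image)
flipped[10] ^= 0x01                      # change a single bit

print(verify(image, signature))          # True
print(verify(bytes(flipped), signature)) # False: the hashes no longer match
```

The second check fails because the recomputed hash differs from the one recovered from the signature, which is exactly the yes-or-no behavior that makes this approach unsuitable for lossy-compressed images.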
Non-strict Authentication
Instead of using a strict authentication code, Schneider and Chang (1996) used content-based data as the authentication code. Specifically, the content-based data can be considered to be an image feature. Because the image feature is invariant under some content-preserving transformations, the image can still be authenticated even though it may have been manipulated by allowable image transformations. The edge information, DCT
coefficients, color, and intensity histograms are regarded as potentially invariant features. In Schneider and Chang's method, the intensity histogram is employed as the invariant feature in the implementation of the content-based image authentication scheme. To be effective, the image is divided into blocks of variable sizes, and the intensity histogram of each block is computed separately and used as the authentication code. To tolerate incidental modifications, the Euclidean distance between intensity histograms is used as a measure of the change in image content. It is reported that the lossy compression ratio that can be applied to the image without producing a false positive is limited to 4:1 at most. Schneider and Chang also pointed out that using a reduced distance function can increase the maximum permissible compression ratio. The alarm was not triggered even at compression ratios up to 14:1 when the block average intensity was used for detecting image content manipulation. Several works based on this idea have been proposed in the literature. They are introduced in the rest of this subsection.
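A block-histogram feature with a Euclidean distance test might look like the sketch below. It is not Schneider and Chang's implementation: it assumes fixed-size square blocks rather than variable-size ones, an 8-bin histogram, and images given as lists of rows of gray values in 0..255.

```python
import math

def block_histograms(image, block, bins=8, levels=256):
    """Intensity histogram of every block x block tile, in raster order."""
    hists = []
    for top in range(0, len(image), block):
        for left in range(0, len(image[0]), block):
            hist = [0] * bins
            for row in image[top:top + block]:
                for pixel in row[left:left + block]:
                    hist[pixel * bins // levels] += 1
            hists.append(hist)
    return hists

def histogram_distances(hists_a, hists_b):
    """Euclidean distance between corresponding block histograms."""
    return [math.dist(a, b) for a, b in zip(hists_a, hists_b)]

original = [[10, 10, 200, 200],
            [10, 10, 200, 200],
            [90, 90, 30, 30],
            [90, 90, 30, 30]]
tampered = [row[:] for row in original]
for r in (2, 3):
    for c in (2, 3):
        tampered[r][c] = 250             # replace the bottom-right block

distances = histogram_distances(block_histograms(original, 2),
                                block_histograms(tampered, 2))
suspect_blocks = [k for k, d in enumerate(distances) if d > 1.0]
print(suspect_blocks)                    # [3]
```

The threshold of 1.0 is arbitrary; a larger value tolerates more incidental change per block at the cost of missing small manipulations.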
Feature-Based Methods
The major purpose of using the image digest (hash values) as the signature is to speed up the signing procedure. Adopting large image features in the authentication scheme would defeat this purpose. Bhattacharjee and Kutter (1998) proposed another algorithm that extracts a smaller feature from an image. Their feature extraction algorithm is based on the so-called scale interaction model. Instead of using Gabor wavelets, they adopted Mexican-Hat wavelets as the filter for detecting the feature points. The algorithm for detecting feature points is as follows.
• Define the feature-detection function $P_{ij}(\cdot)$ as:

$P_{ij}(x) = |M_i(x) - \gamma \cdot M_j(x)|$ (11)

where $M_i(x)$ and $M_j(x)$ represent the responses of Mexican-Hat wavelets at the image location $x$ for scales i and j, respectively. For the image A, the wavelet response is given by:

$M_i(x) = \langle 2^{-i}\,\psi(2^{-i} \cdot x);\ A \rangle$ (12)

where $\langle \cdot\,;\cdot \rangle$ denotes the convolution of its operands. The normalizing constant γ is given by $\gamma = 2^{-(i-j)}$, the operator |⋅| returns the absolute value of its parameter, and $\psi(x)$ represents the response of the Mexican-Hat mother wavelet, defined as:

$\psi(x) = (2 - |x|^2)\exp\left(-\frac{|x|^2}{2}\right)$ (13)

• Determine the points of local maximum of $P_{ij}(\cdot)$. These points correspond to the set of potential feature points.
• Accept a point of local maximum in $P_{ij}(\cdot)$ as a feature point if the variance of the image pixels in the neighborhood of the point is higher than a threshold. This criterion eliminates spurious local maxima in featureless regions of the image.
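The three steps above can be sketched directly from equations (11) to (13). This is a slow, pure-Python illustration under several assumptions that are not in the source: scales i = 1 and j = 2, a truncated filter window, a 3×3 test for local maxima, a 5×5 window for the variance, and an arbitrary variance threshold.

```python
import math

def mexican_hat(dx, dy):
    """Mother wavelet psi(x) = (2 - |x|^2) * exp(-|x|^2 / 2), equation (13)."""
    r2 = dx * dx + dy * dy
    return (2.0 - r2) * math.exp(-r2 / 2.0)

def response(image, row, col, scale):
    """M_i at (row, col): image filtered with 2^-i * psi(2^-i * x), equation (12)."""
    s = 2.0 ** (-scale)
    radius = int(4 / s)                  # truncate the (symmetric) filter
    total = 0.0
    for r in range(max(0, row - radius), min(len(image), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(image[0]), col + radius + 1)):
            total += image[r][c] * s * mexican_hat(s * (r - row), s * (c - col))
    return total

def detection_map(image, i, j):
    """P_ij(x) = |M_i(x) - gamma * M_j(x)| with gamma = 2^-(i-j), equation (11)."""
    gamma = 2.0 ** (-(i - j))
    return [[abs(response(image, r, c, i) - gamma * response(image, r, c, j))
             for c in range(len(image[0]))] for r in range(len(image))]

def window(grid, row, col, radius):
    """Values of grid in the square neighborhood around (row, col)."""
    return [grid[r][c]
            for r in range(max(0, row - radius), min(len(grid), row + radius + 1))
            for c in range(max(0, col - radius), min(len(grid[0]), col + radius + 1))]

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def feature_points(image, i=1, j=2, var_threshold=100.0):
    """Strict local maxima of P_ij that lie in regions with enough variance."""
    p = detection_map(image, i, j)
    points = []
    for r in range(len(image)):
        for c in range(len(image[0])):
            around = window(p, r, c, 1)
            is_peak = p[r][c] == max(around) and around.count(p[r][c]) == 1
            if is_peak and variance(window(image, r, c, 2)) > var_threshold:
                points.append((r, c))
    return points

flat = [[128] * 12 for _ in range(12)]   # featureless image
blob = [[0] * 12 for _ in range(12)]     # dark image with one bright square
for r in range(4, 8):
    for c in range(4, 8):
        blob[r][c] = 255

print(feature_points(flat))              # [] because the variance is zero everywhere
print(feature_points(blob))
```

A practical implementation would use a fast convolution and tuned scales; the point here is only how the detection function, the local-maximum search, and the variance test fit together.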
The column and row positions of the resulting feature points are concatenated to form a string of digits, which is then encrypted to generate the image signature. A signature constructed in this way can be smaller than one constructed by recording block histograms. To determine whether an image A is authentic with respect to another known image B, the feature set $S_A$ of A is computed and compared with the feature set $S_B$ of B, which is decrypted from the signature of B. The following rules are adopted to authenticate the image A.
Figure 8. Process of edge extraction proposed by Queluz (2001)
• Verify that each feature location is present both in $S_B$ and in $S_A$.
• Verify that no feature location is present in $S_A$ but absent in $S_B$.
• Two feature points with coordinates $x$ and $y$ are said to match if:

$|x - y| < 2$ (14)
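The three rules can be written as a small comparison routine. The sketch assumes that $|x - y|$ is the Euclidean distance between (row, column) coordinates and that feature sets are plain lists of tuples; the function names are hypothetical.

```python
import math

def points_match(p, q, tolerance=2.0):
    """Rule 3: two feature points match when |x - y| < 2."""
    return math.dist(p, q) < tolerance

def is_authentic(s_a, s_b, tolerance=2.0):
    """Rules 1 and 2: every point of S_B appears in S_A, and S_A has no extras."""
    every_b_in_a = all(any(points_match(b, a, tolerance) for a in s_a) for b in s_b)
    every_a_in_b = all(any(points_match(a, b, tolerance) for b in s_b) for a in s_a)
    return every_b_in_a and every_a_in_b

s_b = [(10, 12), (40, 7), (25, 30)]            # decrypted from the signature of B
s_a_shifted = [(10, 13), (41, 7), (25, 30)]    # one-pixel drift after compression
s_a_added = s_a_shifted + [(5, 5)]             # a new feature point appeared
s_a_removed = s_a_shifted[:2]                  # a signed feature point vanished

print(is_authentic(s_a_shifted, s_b))          # True
print(is_authentic(s_a_added, s_b))            # False
print(is_authentic(s_a_removed, s_b))          # False
```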
Edge-Based Methods
The edges in an image are the boundaries or contours where significant changes occur in some physical aspect of the image, such as the surface reflectance, the illumination, or the distance of the visible surfaces from the viewer. Edges are strong content features of an image. However, for common picture formats, coding the value and position of edges produces a huge overhead. One way to resolve this problem is to use a binary map to represent the edges. For example, Li, Lou and Liu (2003) used a binary map to encode the edges of an image in their watermarking-based image authentication scheme. A concern is that edges (both their position and value, and also the resulting binary image) might be modified if high compression ratios are used. Consequently, the success of using edges as the authentication code depends heavily
on the capacity of the authentication system to discriminate the edge differences produced by content-preserving manipulations from those produced by content-changing manipulations. Queluz (2001) proposed an algorithm for edge extraction and edge integrity evaluation. The block diagram of the edge extraction process of Queluz's method is shown in Figure 8. The gradient is first computed at each pixel position with an edge extraction operator. The result is then compared with an image-dependent threshold obtained from the image gradient histogram to obtain a binary image marking edge and no-edge pixels. Depending on the specifications for label size, the bit-map can be sub-sampled to reduce its spatial resolution. Finally, the edges of the bit-map are encoded (compressed). The edge integrity evaluation process is shown in Figure 9. The edge difference computation block produces the suspicious error pixels, that is, the pixels that differ between the original and computed edge bit-maps, together with a certitude value associated with each error pixel. These suspicious error pixels are evaluated in an error relaxation block. This is done by iteratively changing low certitude errors to high certitude errors if necessary, until no further change occurs. At the end, all high certitude errors are considered to be true
Figure 9. Process of edges integrity evaluation proposed by Queluz (2001)
errors and low certitude errors are eliminated. After error relaxation, the maximum connected region is computed according to a predefined threshold. A similar idea was also proposed by Dittmann, Steinmetz and Steinmetz (1999). The feature-extraction process starts with extracting the edge characteristics $C_I$ of the image I with the Canny edge detector E (Canny, 1986). $C_I$ is then transformed into a binary edge pattern $EP_{C_I}$. Variable length coding (VLC) is then used to compress $EP_{C_I}$ into a feature code. This process is formulated as follows:

• Feature extraction: $C_I = E(I)$;
• Binary edge pattern: $EP_{C_I} = f(C_I)$;
• Feature code: $VLC(EP_{C_I})$.

The verification process begins with calculating the actual image edge characteristic $C_T$ and the binary edge pattern $EP_{C_T}$ of the received image T. The original binary edge pattern $EP_{C_I}$ is obtained by decompressing the received $VLC(EP_{C_I})$. $EP_{C_I}$ and $EP_{C_T}$ are then compared to obtain the error map. These steps can also be formulated as follows:

• Extract feature: $C_T = E(T)$, $EP_{C_T} = f(C_T)$;
• Extract the original binary pattern: $EP_{C_I} = \mathrm{Decompress}(VLC(EP_{C_I}))$;
• Check $EP_{C_I} = EP_{C_T}$.
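The extract-and-compare steps above can be illustrated with a deliberately simplified edge operator. The sketch below swaps in a forward-difference gradient with a fixed threshold in place of the Canny detector, and it omits the variable length coding step, so it only shows how a binary edge pattern and an error map are formed.

```python
def edge_pattern(image, threshold):
    """Binary edge map: 1 where the forward-difference gradient exceeds threshold."""
    h, w = len(image), len(image[0])
    pattern = []
    for r in range(h):
        row = []
        for c in range(w):
            gx = image[r][min(c + 1, w - 1)] - image[r][c]
            gy = image[min(r + 1, h - 1)][c] - image[r][c]
            row.append(1 if abs(gx) + abs(gy) > threshold else 0)
        pattern.append(row)
    return pattern

def error_map(ep_original, ep_received):
    """1 wherever the two binary edge patterns disagree."""
    return [[a ^ b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(ep_original, ep_received)]

original = [[10, 10, 200, 200] for _ in range(4)]         # one vertical edge
brightened = [[p + 5 for p in row] for row in original]   # content-preserving
tampered = [row[:] for row in original]
tampered[3][3] = 0                                        # content-changing

ep_original = edge_pattern(original, 50)
errors_brightened = sum(map(sum, error_map(ep_original, edge_pattern(brightened, 50))))
errors_tampered = sum(map(sum, error_map(ep_original, edge_pattern(tampered, 50))))
print(errors_brightened, errors_tampered)                 # 0 2
```

A uniform brightness shift leaves the gradients untouched, so the edge pattern is unchanged, while the altered pixel creates two new edge pixels next to it.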
Mean-Based Methods
Using the local mean as the image feature may be the simplest and most practical way to represent the content of an image. For example, Lou and Liu (2000) proposed an algorithm to generate a mean-based feature code. Figure 10 shows the process of feature code generation. The original image is first divided into non-overlapping blocks. The mean of each block is then calculated and quantized according to a predefined parameter. All the calculated results are then encoded (compressed) to form the authentication code. Figure 11 shows an example of this process. Figure 11(a) is a 256×256 gray image, used as the original image. It is first divided into 8×8 non-overlapping blocks. The mean of each block is then computed, as shown in Figure 11(b). Figure 11(c) shows the 16-step quantized block-means of Figure 11(b). The quantized block-means are further encoded to form the authentication code. Note that Figure 11(c) is visually close to Figure 11(b), which means that the feature of the image is preserved even though only the quantized block-means are encoded. The verification process starts with calculating the quantized block-means of the received image. The quantized code is then compared with the original quantized code by using a sophisticated comparison algorithm. A binary error map is then produced as an output, with “1” denoting match
Figure 10. Process of generation of image feature proposed by Lou and Liu (2000)
and “0” denoting mismatch. The verifier can thus identify the possibly tampered blocks by inspecting the error map. It is worth mentioning that the quantized block-means can also be used to repair the tampered blocks. This capability is attractive for real-time applications such as video. A similar idea was adopted in the process of generating the AIMAC (Approximate Image Message Authentication Code) (Xie, Arce & Graveman, 2001). In order to construct a robust IMAC, an image is divided into non-overlapping 8×8 blocks, and the mean of each block is computed. The most significant bit (MSB) of each block mean is then extracted to form a binary map, and the AIMAC is generated from this binary map. It should be noted that the histogram of the pixels in each block should be adjusted to preserve a gap of 127 gray levels for each block mean. In this way, the MSB is robust enough to distinguish content-preserving manipulations from content-changing manipulations. This step plays a role similar to that of the sophisticated comparison part of the algorithm proposed by Lou and Liu (2000).
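The generation and verification of a mean-based feature code can be sketched as below. The sketch follows the description in the text (8×8 blocks, 16 quantization steps, an error map with 1 for match and 0 for mismatch) but replaces the sophisticated comparison algorithm with plain equality and skips the encoding step, so a block mean lying close to a quantization boundary could be flagged after harmless processing.

```python
def quantized_block_means(image, block=8, steps=16, levels=256):
    """Mean of every block, quantized to `steps` levels."""
    codes = []
    for top in range(0, len(image), block):
        row_codes = []
        for left in range(0, len(image[0]), block):
            pixels = [p for row in image[top:top + block]
                      for p in row[left:left + block]]
            mean = sum(pixels) / len(pixels)
            row_codes.append(int(mean * steps // levels))
        codes.append(row_codes)
    return codes

def error_map(original_codes, received_codes):
    """1 marks a matching block, 0 a mismatching block, as in the text."""
    return [[1 if a == b else 0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(original_codes, received_codes)]

# 16x16 test image made of four flat 8x8 blocks.
image = ([[40] * 8 + [100] * 8 for _ in range(8)] +
         [[160] * 8 + [220] * 8 for _ in range(8)])
brightened = [[p + 3 for p in row] for row in image]      # incidental change
tampered = [row[:] for row in image]
for r in range(8, 16):
    for c in range(8, 16):
        tampered[r][c] = 0                                # bottom-right block replaced

signed_codes = quantized_block_means(image)
print(signed_codes)                                       # [[2, 6], [10, 13]]
print(error_map(signed_codes, quantized_block_means(brightened)))  # [[1, 1], [1, 1]]
print(error_map(signed_codes, quantized_block_means(tampered)))    # [[1, 1], [1, 0]]
```

Because the signed codes are coarse block means, they could also serve as a low-resolution substitute for a block flagged with 0, which is the repair capability mentioned above.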
Relation-Based Methods
Unlike the methods introduced above, relation-based methods divide the original image into non-overlapping blocks and use the relation between blocks as the feature code. The method proposed by Lin and Chang (1998, 2001) is called SARI. The feature code in SARI is designed to survive JPEG compression. To serve this purpose, feature code generation starts by dividing the original image into 8×8 non-overlapping blocks. Each block is then DCT transformed. The DCT blocks are further grouped into two non-overlapping sets with equal numbers of blocks (i.e., there are N/2 DCT blocks in each set if the original image is divided into N blocks). A secret key-dependent mapping function then maps each DCT block in one set one-to-one onto a DCT block in the other set, generating N/2 DCT block pairs. For each block pair, a number of DCT coefficients are selected. The feature code is then generated by comparing the corresponding coefficients of the paired blocks. For example, if the coefficient in
Figure 11. (a) Original image, (b) map of block-means, (c) map of 16-step quantized block-means
the first DCT block is greater than the coefficient in the second DCT block, a “1” is generated. Otherwise, a “0” is generated. The process of generating the feature code is illustrated in Figure 12. To extract the feature code of the received image, the same secret key must be applied in the verification process. The extracted feature code is then compared with the original feature code. If neither block in a block pair has been maliciously manipulated, the relation between the selected coefficients is maintained. Otherwise, the relation between the selected coefficients may be changed. It can be proven that the relationship between the selected DCT coefficients of two given image blocks is maintained even after JPEG compression, provided the same quantization matrix is used for the whole image. Consequently, the SARI authentication system can distinguish JPEG compression from malicious manipulations. Moreover, SARI can locate the tampered blocks because it is a block-wise method.
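The pairing, the bit generation, and the invariance under a shared quantization table can be sketched as follows. This is not the SARI implementation: blocks are assumed to be already DCT transformed and are represented as flat lists of 64 coefficients filled with random numbers, the key-dependent mapping is a seeded shuffle, and the verifier accepts a bit as long as the sign of the coefficient difference has not flipped, since quantization may turn a strict inequality into an equality.

```python
import random

def pair_blocks(num_blocks, secret_key):
    """Key-dependent one-to-one mapping of the blocks into N/2 pairs."""
    order = list(range(num_blocks))
    random.Random(secret_key).shuffle(order)
    half = num_blocks // 2
    return list(zip(order[:half], order[half:2 * half]))

def feature_code(dct_blocks, secret_key, positions):
    """One bit per pair and selected coefficient: 1 if first > second, else 0."""
    bits = []
    for first, second in pair_blocks(len(dct_blocks), secret_key):
        for pos in positions:
            bits.append(1 if dct_blocks[first][pos] > dct_blocks[second][pos] else 0)
    return bits

def relations_hold(dct_blocks, secret_key, positions, code):
    """True when no signed relation has been reversed in the received blocks."""
    bits = iter(code)
    for first, second in pair_blocks(len(dct_blocks), secret_key):
        for pos in positions:
            diff = dct_blocks[first][pos] - dct_blocks[second][pos]
            bit = next(bits)
            if (bit == 1 and diff < 0) or (bit == 0 and diff > 0):
                return False
    return True

def quantize(dct_blocks, q_table):
    """JPEG-style quantization and dequantization with one table for all blocks."""
    return [[round(c / q_table[k]) * q_table[k] for k, c in enumerate(block)]
            for block in dct_blocks]

# Hypothetical data: 8 blocks of random "DCT coefficients".
rng = random.Random(0)
blocks = [[rng.uniform(-100, 100) for _ in range(64)] for _ in range(8)]
KEY, POSITIONS = "shared-secret", [1, 8, 9]
code = feature_code(blocks, KEY, POSITIONS)

compressed = quantize(blocks, [16] * 64)
tampered = [block[:] for block in compressed]
first, second = pair_blocks(len(blocks), KEY)[0]
# Force the first signed relation to reverse.
tampered[first][POSITIONS[0]] = tampered[second][POSITIONS[0]] + (-100 if code[0] else 100)

print(relations_hold(compressed, KEY, POSITIONS, code))   # True
print(relations_hold(tampered, KEY, POSITIONS, code))     # False
```

The first check passes for any coefficient values because rounding to a common step is monotone, which is the property that lets the feature code survive JPEG compression.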
Figure 12. Feature code generated with SARI authentication scheme

Structure-Based Methods
Lu and Liao (2000, 2003) proposed another kind of method to generate the feature code. The feature code is generated according to the structure of the image content. More specifically, the content structure of an image is composed of parent-child pairs in the wavelet domain. Let $w_{s,o}(x, y)$ be a wavelet coefficient at scale s, where the orientation o denotes the horizontal, vertical, or diagonal direction. The inter-scale relationship of wavelet coefficients is defined for the parent node $w_{s+1,o}(x, y)$ and its four children nodes $w_{s,o}(2x+i, 2y+j)$ as either $|w_{s+1,o}(x, y)| \ge |w_{s,o}(2x+i, 2y+j)|$ or $|w_{s+1,o}(x, y)| \le |w_{s,o}(2x+i, 2y+j)|$, where $0 \le i, j \le 1$. The authentication code is generated by recording the parent-child pairs that satisfy $\big|\,|w_{s+1,o}(x, y)| - |w_{s,o}(2x+i, 2y+j)|\,\big| > \rho$, where $\rho > 0$. Clearly, the threshold ρ determines the size of the authentication code and trades off robustness against fragility. It has been shown that the inter-scale relationship is difficult to destroy with content-preserving manipulations and hard to preserve under content-changing manipulations.

DESIGN ISSUES
Digital signature-based image authentication is an important element in the applications of image communication. Usually, the content verifier is neither the creator nor the sender of the original image. That means the original image is not available during the authentication process. Therefore, one of the fundamental requirements for digital signature-based image authentication schemes is blind authentication, or obliviousness, as it is sometimes called. Other requirements depend on the application, which may be based on strict or non-strict authentication. In this section, we discuss some issues in designing effective digital signature-based image authentication schemes.
Error Detection
In some applications, it is sufficient for an authentication scheme to detect that an image has been modified. However, it is beneficial if the scheme can also detect or estimate the errors so that the distortion can be compensated for or even corrected. Techniques for error detection can be categorized into two classes according to the application of image authentication, namely, error type and error location.
Error Type
Generally, strict authentication schemes can only determine whether the content of the original image has been modified. This also means that they cannot differentiate between types of distortion (e.g., compression or filtering). By contrast, non-strict authentication schemes tend to tolerate some forms of error. The key to developing a non-strict authentication scheme is to examine what the digital signature should protect. Ideally, the authentication code should protect the message conveyed by the content of the image, but not the particular representation of that content. The authentication code can then be used to verify the authenticity of an image that has been incidentally modified in a way that leaves the value and meaning of its contents unaffected. Ideally, one can define an authenticity versus modification curve, as in the method proposed by Schneider and Chang (1996), to achieve the desired authenticity. Based on the authenticity versus modification curve, authentication is no longer a yes-or-no question. Instead, it is a continuous interpretation. An image that is bit-by-bit identical to the original image has an authenticity measure of 1.0 and is considered to be completely authentic. An image that has nothing in common with the original image has an authenticity measure of 0.0 and is considered unauthentic. Every other image has an authenticity measure in the range (0.0, 1.0) and is partially authentic.
Error Location
Another desirable requirement for error detection in most applications is error localization. This can be achieved by block-oriented approaches. Before transmission, an image is partitioned into blocks. The authentication code of each block is calculated (for either strict or non-strict authentication). The authentication codes of the original image are concatenated, signed, and transmitted as a separate file. To locate the distorted regions during the authentication process, the received image is first partitioned into blocks. The authentication code of each block is calculated and compared with the authentication code recovered from the received digital signature. Consequently, the smaller the block size, the better the localization accuracy. However, the higher accuracy is gained at the expense of a larger authentication code file and a longer signing and decoding process. This trade-off needs to be taken into account at the design stage of an authentication scheme.
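Block-oriented localization and its size trade-off can be illustrated with per-block hashes. The sketch assumes a strict (hash-based) authentication code per block, uses SHA-256 as the hash, and leaves out the signing of the concatenated codes; images are lists of rows of values in 0..255.

```python
import hashlib

def block_codes(image, block):
    """Authentication code (here a SHA-256 digest) of every block."""
    codes = {}
    for top in range(0, len(image), block):
        for left in range(0, len(image[0]), block):
            data = bytes(p for row in image[top:top + block]
                         for p in row[left:left + block])
            codes[(top // block, left // block)] = hashlib.sha256(data).hexdigest()
    return codes

def tampered_blocks(original_codes, received_image, block):
    """Block positions whose recomputed code differs from the signed one."""
    received = block_codes(received_image, block)
    return sorted(pos for pos, code in original_codes.items()
                  if received.get(pos) != code)

image = [[r * 8 + c for c in range(8)] for r in range(8)]
received = [row[:] for row in image]
received[5][6] ^= 0x01                   # one modified pixel

for block in (4, 2):
    codes = block_codes(image, block)
    print(block, len(codes), tampered_blocks(codes, received, block))
# 4 4 [(1, 1)]
# 2 16 [(2, 3)]
```

Halving the block size narrows the suspect region from a 4×4 to a 2×2 area but quadruples the number of codes that must be signed and transmitted.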
Error Correction
The purpose of error correction is to recover the original image from its manipulated version. This requirement is essential in applications such as military intelligence and motion pictures (Dittmann, Steinmetz & Steinmetz, 1999; Queluz, 2001). Error correction can be achieved by means
of error correction codes (ECC) (Lin & Costello, 1983). However, encrypting the ECC along with the feature code may result in a lengthy signature. Therefore, it is more advantageous to give the authentication code itself the power of error correction. Unfortunately, the authentication code generated by strict authentication schemes carries no information about the image content and cannot be used to correct errors. By comparison, the authentication code generated by non-strict authentication schemes is potentially capable of error correction, because it is usually derived from the image feature and is highly content dependent. An example of using the authentication code for image error correction can be found in Xie, Arce and Graveman (2001). This work uses quantized image gray values as the authentication code. The authentication code is potentially capable of error correction because image features are usually closely related to image gray values. It should be noted that the smaller the quantization step, the better the performance of error correction. However, a smaller quantization step also means a longer signature. Therefore, a trade-off between the performance of error correction and the length of the signature has to be made as well. This is a significant challenge that merits further research.
Security
With the protection of public-key encryption, the security of digital signature-based image authentication reduces to the security of the image digest function that is used to produce the authentication code. For strict authentication, the attacks on hash functions can be grouped into two categories: brute-force attacks and cryptanalysis attacks.
Brute-force Attacks
It is believed that, for a general-purpose secure hash code, the strength of a hash function against brute-force attacks depends solely on the length of the hash code produced by the algorithm. For a code of length n, the level of effort required is proportional to $2^{n/2}$. This is also known as the birthday attack. For example, the length of the hash code of MD5 (Rivest, 1992) is 128 bits. If an attacker has about $2^{64}$ different samples, the chance of finding two with the same hash code is on the order of one half. In other words, to create a fake image that has the same hash result as the original image, an attacker only needs to prepare about $2^{64}$ visually equivalent fake images. This can be accomplished by first creating a fake image and then varying the least significant bit of each of 64 arbitrarily chosen pixels of the fake image. It has been estimated that a collision search machine costing 10 million dollars could find an MD5 collision in 24 days (Stallings, 2002). A simple solution to this problem is to use a hash function that produces a longer hash code. For example, SHA-1 (NIST FIPS PUB 180, 1993) and RIPEMD-160 (Stallings, 2002) provide 160-bit hash codes. It is believed that over 4,000 years would be required to find a collision with the same search machine (Oorschot & Wiener, 1994). Another way to resolve this problem is to link the authentication code with the image feature, which is the strategy adopted by non-strict authentication. Non-strict authentication employs the image feature as the image digest. This makes it harder to create enough visually equivalent fake images to forge a legal one. It should be noted that, mathematically, the relationship between the original image and the authentication code is a many-to-one mapping. To serve the purpose of error tolerance, non-strict authentication schemes may have one authentication code corresponding to even more images. This phenomenon makes non-strict authentication approaches vulnerable and remains a serious design issue.
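The $2^{n/2}$ effort of the birthday attack can be observed on a deliberately shortened hash. The sketch below truncates SHA-256 to n = 16 bits (an assumption made only so the search finishes instantly) and hashes hypothetical "image variants" until two of them collide; with n = 16 the number of trials is typically a few hundred, in line with $2^{8}$.

```python
import hashlib

def truncated_hash(data, bits):
    """The leading `bits` bits of SHA-256, standing in for a short hash code."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - bits)

def find_collision(bits):
    """Hash distinct messages until two share the same truncated hash code."""
    seen = {}
    counter = 0
    while True:
        message = b"fake-image-variant-%d" % counter
        code = truncated_hash(message, bits)
        if code in seen:
            return seen[code], message, counter + 1
        seen[code] = message
        counter += 1

first, second, trials = find_collision(16)
print(first, second, trials)
print(truncated_hash(first, 16) == truncated_hash(second, 16))  # True
```

Each extra bit of hash length multiplies the expected work by about the square root of two, which is why moving from a 128-bit to a 160-bit code raises the collision effort from roughly $2^{64}$ to roughly $2^{80}$.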
Cryptanalysis Attacks
Cryptanalysis attacks on a digest function seek to exploit some property of the algorithm rather than perform an exhaustive search. Cryptanalysis of a strict authentication scheme exploits the internal structure of the hash function. Therefore, we have to select a secure hash function that can resist cryptanalysis. At the time of writing, SHA-1 and RIPEMD-160 had withstood known cryptanalysis and could be included in strict authentication schemes; practical collisions for SHA-1 have since been demonstrated, so the choice of hash function should be revisited for new designs. Cryptanalysis of non-strict authentication has not been well defined so far. It may refer to the analysis of key-dependent digital signature-based schemes. In this case, an attacker tries to derive the secret key from multiple feature codes, as has been done for the SARI image authentication system (Radhakrishnan & Memon, 2001). As defined in the second section, there is no secret key involved in a digital signature-based authentication scheme. This means that the security of such schemes depends on the robustness of the algorithm itself, which must be kept in mind when designing a secure authentication scheme.
CONCLUSIONS
With the advantages of the digital signature (Agnew, Mullin & Vanstone, 1990; ElGamal, 1985; Harn, 1994; ISO/IEC 9796, 1991; NIST FIPS PUB, 1993; Nyberg & Rueppel, 1994; Yen & Laih, 1995), digital signature-based schemes are more applicable than other schemes in image authentication. Depending on the application, digital signature-based authentication schemes are divided into strict and non-strict categories, both of which are described in detail in this chapter. For strict authentication, the authentication code derived from a traditional hash function is sufficiently short. This property enables fast creation of the digital signature. On the other hand, the computed hash is very sensitive to modification of the image content. Changing even a single bit in an image may result in a different hash. As a result, strict authentication can provide only binary authentication (i.e., yes or no). The trustworthy camera is a typical example of this type of authentication scheme. For some image authentication applications, the authentication code should be sensitive to content-changing modifications and tolerate some content-preserving modifications. In this case, the authentication code has to satisfy some basic requirements. Those requirements include locating modified regions and tolerating some forms of image processing operations (e.g., JPEG lossy compression). Many non-strict authentication techniques are also described in this chapter. Most of them are designed to employ a special-purpose authentication code to satisfy the basic requirements listed above. However, few of them are capable of recovering from errors. Such a special-purpose authentication code may be the most useful aspect of non-strict authentication. With the rapid evolution of image processing techniques, existing digital signature-based image authentication schemes will be further improved to meet new requirements. New requirements pose new challenges for designing effective digital signature-based authentication schemes. These challenges may include using large-size authentication code and tolerating more image-processing operations without compromising security. This means that new approaches have to balance the trade-off among these requirements. Moreover, newer techniques combining watermarking and digital signatures may be proposed for future generations of image authentication. Those new image authentication techniques may result in some changes to the watermark and digital signature framework, as demonstrated in Sun and Chang (2002), Sun, Chang, Maeno and Suto (2002a, 2002b) and Lou and Sung (to appear).
REFERENCES
Agnew, G.B., Mullin, R.C., & Vanstone, S.A. (1990). Improved digital signature scheme based on discrete exponentiation. IEEE Electronics Letters, 26, 1024-1025.
Bhattacharjee, S., & Kutter, M. (1998). Compression tolerant image authentication. Proceedings of the International Conference on Image Processing, 1, 435-439.
Canny, J. (1986). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8(6), 679-698.
Diffie, W., & Hellman, M.E. (1976). New directions in cryptography. IEEE Transactions on Information Theory, IT-22(6), 644-654.
Dittmann, J., Steinmetz, A., & Steinmetz, R. (1999). Content-based digital signature for motion pictures authentication and content-fragile watermarking. Proceedings of the IEEE International Conference on Multimedia Computing and Systems, 2, 209-213.
ElGamal, T. (1985). A public-key cryptosystem and a signature scheme based on discrete logarithms. IEEE Transactions on Information Theory, IT-31(4), 469-472.
Friedman, G.L. (1993). The trustworthy digital camera: Restoring credibility to the photographic image. IEEE Transactions on Consumer Electronics, 39(4), 905-910.
Harn, L. (1994). New digital signature scheme based on discrete logarithm. IEE Electronics Letters, 30(5), 396-398.
ISO/IEC 9796. (1991). Information technology - Security techniques - Digital signature scheme giving message recovery. International Organization for Standardization.
Li, C.-T., Lou, D.-C., & Chen, T.-H. (2000). Image authentication via content-based watermarks and a public key cryptosystem. Proceedings of the IEEE International Conference on Image Processing, 3, 694-697.
Li, C.-T., Lou, D.-C., & Liu, J.-L. (2003). Image integrity and authenticity verification via content-based watermarks and a public key cryptosystem. Journal of the Chinese Institute of Electrical Engineering, 10(1), 99-106.
Lin, C.-Y., & Chang, S.-F. (1998). A robust image authentication method surviving JPEG lossy compression. SPIE storage and retrieval of image/video databases. San Jose.
Lin, C.-Y., & Chang, S.-F. (2001). A robust image authentication method distinguishing JPEG compression from malicious manipulation. IEEE Transactions on Circuits and Systems for Video Technology, 11(2), 153-168.
Lin, S., & Costello, D.J. (1983). Error control coding: Fundamentals and applications. NJ: Prentice-Hall.
Lou, D.-C., & Liu, J.-L. (2000). Fault resilient and compression tolerant digital signature for image authentication. IEEE Transactions on Consumer Electronics, 46(1), 31-39.
Lou, D.-C., & Sung, C.-H. (to appear). A steganographic scheme for secure communications based on the chaos and Euler theorem. IEEE Transactions on Multimedia.
Lu, C.-S., & Liao, M.H.-Y. (2000). Structural digital signature for image authentication: An incidental distortion resistant scheme. Proceedings of Multimedia and Security Workshop at the ACM International Conference on Multimedia, pp. 115-118.
Lu, C.-S., & Liao, M.H.-Y. (2003). Structural digital signature for image authentication: An incidental distortion resistant scheme. IEEE Transactions on Multimedia, 5(2), 161-173.
NIST FIPS PUB. (1993). Digital signature standard. National Institute of Standards and Technology, U.S. Department of Commerce, DRAFT.
NIST FIPS PUB 180. (1993). Secure hash standard. National Institute of Standards and Technology, U.S. Department of Commerce, DRAFT.
Nyberg, K., & Rueppel, R. (1994). Message recovery for signature schemes based on the discrete logarithm problem. Proceedings of Eurocrypt'94, 175-190.
Oorschot, P.V., & Wiener, M.J. (1994). Parallel collision search with application to hash functions and discrete logarithms. Proceedings of the Second ACM Conference on Computer and Communication Security, 210-218.
Queluz, M.P. (2001). Authentication of digital images and video: Generic models and a new contribution. Signal Processing: Image Communication, 16, 461-475.
Radhakrishnan, R., & Memon, N. (2001). On the security of the SARI image authentication system. Proceedings of the IEEE International Conference on Image Processing, 3, 971-974.
Rivest, R.L. (1992). The MD5 message digest algorithm. Internet Request For Comments 1321.
Rivest, R.L., Shamir, A., & Adleman, L. (1978). A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2), 120-126.
Schneider, M., & Chang, S.-F. (1996). Robust content based digital signature for image authentication. Proceedings of the IEEE International Conference on Image Processing, 3, 227-230.
Stallings, W. (2002). Cryptography and network security: Principles and practice (3rd ed.). New Jersey: Prentice-Hall.
Sun, Q., & Chang, S.-F. (2002). Semi-fragile image authentication using generic wavelet domain features and ECC. Proceedings of the 2002 International Conference on Image Processing, 2, 901-904.
Sun, Q., Chang, S.-F., Maeno, K., & Suto, M. (2002a). A new semi-fragile image authentication framework combining ECC and PKI infrastructures. Proceedings of the 2002 IEEE International Symposium on Circuits and Systems, 2, 440-443.
Sun, Q., Chang, S.-F., Maeno, K., & Suto, M. (2002b). A quantitive semi-fragile JPEG2000 image authentication system. Proceedings of the 2002 International Conference on Image Processing, 2, 921-924.
Wallace, G.K. (1991, April). The JPEG still picture compression standard. Communications of the ACM, 33, 30-44.
Walton, S. (1995). Image authentication for a slippery new age. Dr. Dobb's Journal, 20(4), 18-26.
Xie, L., Arce, G.R., & Graveman, R.F. (2001). Approximate image message authentication codes. IEEE Transactions on Multimedia, 3(2), 242-252.
Yen, S.-M., & Laih, C.-S. (1995). Improved digital signature algorithm. IEEE Transactions on Computers, 44(5), 729-730.
This work was previously published in Multimedia Security: Steganography and Digital Watermarking Techniques for Protection of Intellectual Property, edited by C.-S. Lu, pp. 207-230, copyright 2005 by IGI Publishing, formerly known as Idea Group Publishing (an imprint of IGI Global).
Chapter 3.9
Digital Certificates and Public-Key Infrastructures
Diana Berbecaru, Politecnico di Torino, Italy
Corrado Derenale, Politecnico di Torino, Italy
Antonio Lioy, Politecnico di Torino, Italy
ABSTRACT
The technical solutions and organizational procedures used to manage certificates are collectively named Public Key Infrastructure (PKI). The overall goal of a PKI is to provide support for usage of public-key certificates within, and also outside, its constituency. To this aim, several functions are needed, such as user registration, key generation, certificate revocation and many others. It is the aim of this paper to describe issues related to digital certificates and PKIs, both from the technical and management viewpoint.
INTRODUCTION
In 1976, Diffie and Hellman introduced the concept of public-key (or asymmetric) cryptography in their paper “New Directions in Cryptography”. This kind of cryptography uses a pair of mathematically related keys to perform the encryption and decryption operations. One key is named the “private key” and is known only to its owner, while the other is named the “public key” and must be publicly known. Public-key cryptography is a quantum leap in the field of security because it offers a better solution to several old problems: data and party authentication, privacy without a shared secret, and key distribution. The full range of benefits of public-key cryptography can be obtained only when there is assurance about the entity associated with the public
Copyright © 2008, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
key being used, that is, the entity that controls the corresponding private key. To this purpose, members of small groups of communicating parties can meet face-to-face and directly exchange their public keys, for example on labelled floppy disks, and then ensure that these keys are securely stored on each user’s local system. This is usually known as manual key distribution (Ford & Baum, 1997), but it is seldom used outside small closed groups because it is highly impractical. Another approach is to aggregate the keys into a so-called “public file” (i.e., the list of keys and associated entities), managed by a trusted entity that makes it publicly available. This solution has its own problems too: the file is insecure and can be manipulated, the trusted entity is a single point of failure (the whole system fails if it is compromised or access to it is denied) and the whole approach does not scale well.
A better solution is to bind the public key to the controlling entity on an individual basis and to protect this binding with some cryptographic measure. To this aim, Loren Kohnfelder, in his MIT Bachelor thesis (1978), proposed using a signed data structure named public-key certificate (PKC). Webster’s Dictionary defines a certificate as a “document containing a certified statement, especially as to the truth of something” (Webster’s New Collegiate Dictionary, 1980). Kohnfelder’s approach leaves open the question of who signs the certificate. The signer could be a user Alice, who would digitally sign Bob’s public key along with Bob’s name and other accessory information. The result would be Bob’s certificate, which could convince anyone who trusts Alice that Bob’s public key really belongs to him. This is the approach taken by some certification systems, such as PGP (Garfinkel, 1995), in which a certificate is signed by all the users that vouch for the data contained in the certificate.
However, this approach is impractical, relies on personal judgement and does not scale well. Thus, the role of certificate signer is usually taken by a specialized entity named Certification Authority (CA) that handles the certificates on behalf of its constituency and takes some sort of liability for having performed the necessary trust and security checks. When no privacy issue exists, the certificates are published in appropriate repositories (such as a web server or an LDAP directory) to make them widely available. Since a certificate is digitally signed information, any modification of it can be detected, and no other specific security measure is needed when it is stored in or downloaded from the repository. When a third party accepts a certificate as part of a security measure to protect a data exchange with a PKI user, it plays the role of a relying party (RP) because it relies on the issuer to have provided accurate data inside the certificate.
In the remainder of the chapter we deal with certificate formats, standards, and certificate management principles.
[Figure 1. Certificates. Labels in the figure: identifier, public-key, authorization, identity certificate, attribute certificate, authorization certificate.]
CERTIFICATE FORMATS
Digital certificates can be divided into three classes based on the data they bind together (Figure 1): identity, attribute and authorization certificates.
An identity certificate binds together a public key and some information that uniquely identifies the certificate’s subject, that is, the person, device, or entity that controls the corresponding private key. A certificate of this type is issued by a CA. Because of its similarity to the identity documents used in the human world, this is the most widely used type of certificate for a variety of applications, and it is usually referred to simply as a public-key certificate.
An attribute certificate binds an identity to an authorization, title or role by a digital signature. That signature is produced by a trusted third party, named Attribute Authority (AA), which has the power and takes the liability of assessing the attribute. Attribute certificates are finding increasing use in access control and electronic signatures.
An authorization certificate binds an authorization, title or role directly to a public key rather than to an identity. The concept here is that the public key can speak for its key-holder, that is, the entity that controls the corresponding private key. Thus, a public key is by itself an identifier. This kind of certificate has been proposed to shorten the authorization process: when the AA coincides with the consumer of the attribute certificate (i.e., the resource controller), the controller can directly issue an authorization certificate. Authorization certificates are rarely used and are applied mostly in access control within closed groups.
The X.509 Standard
The X.509 ISO/IEC/ITU recommendation (ITU-T Recommendation, 2000) is the most widely accepted standard for the format of digital certificates. It is a general and very flexible standard; as a consequence, a profiling operation is often needed to achieve application interoperability. For example, the IETF PKIX working group has defined an X.509 certificate profile for use with Internet applications (Housley et al., 1999; 2002), while Visa, MasterCard and other players adopted X.509 as the basis of the certificate format used by the SET standard for electronic commerce (SET, 2003).
Originally, X.509 was conceived as the authentication framework for the X.500 directory service, but it has later been applied mostly outside its original scope. Four versions of this standard exist: version 1 (1988) is the foundation but quickly showed some severe limitations, which were not solved by version 2 (1993), a minor revision. Widespread use of X.509 took place with version 3 (1996), which addresses the limitations of the previous versions:
• since each subject can hold several certificates (for the same or different public keys), a mechanism to unambiguously distinguish between them is needed;
• as X.500 is rarely used in practice, other ways to identify the subject and issuer are needed;
• for commercial applications, it is important to know the certification policy used to issue the certificate because it is related to the legal liability;
• mechanisms should allow the definition of mutual trust relationships between different certification authorities.
X.509v3 uses optional fields (named extensions) to carry these and other additional data. Finally, ITU-T Recommendation (2000) further extends the aim of this standard by introducing attribute certificates, in addition to the identity certificates supported since version 1.
The basic fields of an X.509v3 certificate are shown in Table 1 and their meaning is explained below.
The serial number field contains the unique identifier of the certificate among all the certificates created by the issuer. Since it is a unique identifier, the “issuer name and serial number” pair always uniquely identifies a certificate and hence a public key.
The signature algorithm field identifies the algorithm and optional parameters used by the issuer when signing the certificate. This is very important to avoid a class of cryptographic attacks based on detaching a signature created by one algorithm and claiming that it was done with a different algorithm.
The issuer name field specifies the X.500 distinguished name (DN) of the CA that issued the certificate. This field must always be non-empty.
The validity period field specifies the start and the expiration date of the validity of the certificate. The issuer backs the certificate only within this period.
The subject name field specifies the X.500 distinguished name (DN) of the entity holding the private key corresponding to the public key identified in the certificate. If this field is left empty (because the subject does not have a directory entry), then the identity information must be carried in the Subject Alternative Name extension field, which must be marked as critical. This permits one to identify the subject by the name or names in the extension, because it binds a key to an application-level identifier (such as an e-mail address or a URI, when support for secure e-mail or the Web is desired).
Table 1. Sample X.509 public-key certificate
Field name                           Example
Version                              Version 3
Serial number                        12345
CA signature algorithm identifier    Algorithm sha1WithRSAEncryption (OID 1.2.840.113549.1.1.5)
Issuer                               CN=Politecnico di Torino Certification Authority, O=Politecnico di Torino, C=IT
Validity Period
  Not Before                         Wed Dec 19 18:00:00 2001 (011219170000Z)
  Not After                          Wed Dec 31 10:00:00 2003 (031231090000Z)
Subject                              CN=Alice Twokeys, OU=Dipartimento di Automatica e Informatica, O=Politecnico di Torino, C=IT
Subject Public Key Information       Algorithm RSA (OID 1.2.840.113549.1.1.1)
Certificate Extensions
  Authority Key Identifier           F40C 6D6D 9E5F 4C62 5639 81E0 DEF9 8F2F 37A0 84C5
  Subject Key Identifier             9E85 84F9 4CAF A6D3 8A33 CB43 ED16 D2AE 8516 6BB8
  Key Usage                          digitalSignature nonRepudiation keyEncipherment dataEncipherment
  Subject alternative names          RFC822: [email protected]
  …                                  …
CA Signature                         BDD66A3EC D7C1C411 146FBD7E DF25FF
                                     4AB7871E 023F18BB 4DA23262 90822E
                                     EFBE8370 58A2248D A6839F23 FA1AA1
                                     E9CC4BC1 144CF894 E391D5B0 B79D86
                                     48F2EE52 7550A7E1 41CDFEDA DCB394
                                     340A3B66 5C835BE7 00B4B97A F1D3D3
                                     4E12E0A0 8983B194 D0101B95 5A94B3
                                     57DF72CA A0E96C01 66BB1CD4 F9C9F7
The subject public key information field contains the value of the public key owned by the subject, and the identifier of the algorithm with which the public key is to be used.
The CA Signature field contains the actual value of the digital signature of the CA and the identifier of the signature algorithm by which it was generated. This identifier must be the same as the one in the CA signature algorithm field.
Since certificate extensions are more complex, their treatment is deferred to the section on X.509 extensions.
The X.509 standard distinguishes between end-entity (EE) certificates and CA certificates. An end-entity certificate is issued by a CA to a key-holder that can use it for several applications (e.g., to digitally sign documents) but not to sign certificates. A CA certificate, in contrast, is issued by a CA to another CA; in this case, the private key corresponding to the certified public key is mainly used to issue other certificates.
CA certificates can be self-signed or cross certificates. A cross certificate is issued by a CA to another CA to empower it to sign certificates for its constituency. A self-signed certificate, in contrast, is issued by a CA to itself (i.e., the subject and the issuer of the certificate are the same entity). This is usually done to distribute the certificate of a so-called root CA or trusted CA, that is, a CA that is directly trusted by the end user. For example, self-signed certificates are installed inside the most common web browsers to let the user automatically trust some commercial root CAs. Whether this is positive or negative from a security viewpoint is still a topic for debate.
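The checks implied by the basic fields can be illustrated with a short Python sketch. It is only a simplified model under stated assumptions: the class BasicCertificate and its method names are hypothetical, the field values are taken from Table 1, no ASN.1 parsing or signature verification is performed, and self-signed certificates are recognised by name comparison alone.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class BasicCertificate:
    """Hypothetical, simplified view of the basic X.509v3 fields."""
    serial_number: int
    issuer: str                    # issuer distinguished name
    subject: str                   # subject distinguished name
    not_before: datetime
    not_after: datetime
    tbs_signature_algorithm: str   # "CA signature algorithm identifier" field
    signature_algorithm: str       # identifier stored with the CA signature

    def identifier(self):
        # The (issuer name, serial number) pair identifies the certificate.
        return (self.issuer, self.serial_number)

    def is_within_validity(self, when):
        # The issuer backs the certificate only inside this interval.
        return self.not_before <= when <= self.not_after

    def is_self_signed(self):
        # Name comparison only (assumption): a real check would also
        # verify the signature with the certified public key.
        return self.subject == self.issuer

    def algorithms_agree(self):
        # The two algorithm identifiers must be the same.
        return self.tbs_signature_algorithm == self.signature_algorithm


CA_DN = "CN=Politecnico di Torino Certification Authority, O=Politecnico di Torino, C=IT"
alice = BasicCertificate(
    serial_number=12345,
    issuer=CA_DN,
    subject="CN=Alice Twokeys, OU=Dipartimento di Automatica e Informatica, "
            "O=Politecnico di Torino, C=IT",
    not_before=datetime(2001, 12, 19, 17, 0, 0, tzinfo=timezone.utc),
    not_after=datetime(2003, 12, 31, 9, 0, 0, tzinfo=timezone.utc),
    tbs_signature_algorithm="1.2.840.113549.1.1.5",
    signature_algorithm="1.2.840.113549.1.1.5",
)
```

With these values, alice.is_self_signed() is false, since the subject and issuer names differ, which is what one expects of an end-entity certificate.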
Certificate Status Information (via CRL)
X.509 certificates have a validity period with an expiration date. However, a certificate can become invalid before its natural expiration for various reasons. For instance, the private key may have been lost or compromised, or the owner of the certificate may have changed affiliation. In general, whenever any data present in a certificate changes, the certificate must be revoked. This is a task of the certificate issuer, which must also make the event publicly known by making available fresh certificate status information about the revoked certificate.
The X.509 standard does not mandate any specific way in which an authority should maintain the revocation information, but suggests the Certificate Revocation List (CRL) method. The CRL is a signed object that contains the serial numbers of the revoked certificates. A sample CRL is shown in Table 2. If a relying party application encounters a revoked certificate, it should be configured to perform different actions depending on the revocation reason. For example, a certificate that was revoked because the private key was compromised has more serious implications than one that was revoked due to a change in the affiliation of the subject. Thus, from version 2 of the CRL format, the revocation reason can also be stated. In addition, a suspension state was introduced, which specifies that the certificate is temporarily invalid. After a period of time, the reference to the certificate can be removed from the CRL, thus making the certificate valid again, or the certificate can be definitively revoked.
The decision to revoke a certificate is the responsibility of the CA, usually as a response to a request coming from an authorized entity. According to its internal rules, the CA authenticates the source of the revocation request and, after taking the decision to revoke the certificate, has the obligation to inform the PKI community about the revocation event. Several methods have been proposed to allow the RP to retrieve the certificate status.
CRL-based mechanisms are the primary methods for revocation notification in PKI: the RP retrieves the CRL from well-known servers. Alternatively, mechanisms that provide immediate notification of revocation have been proposed, such as OCSP, the Online Certificate Status Protocol (Myers et al., 1999).
All the available revocation mechanisms share the design goals of correctness, scalability, and availability:
• all verifiers must be able to correctly determine the state of a certificate within well-known time bounds;
• the costs for the determination of current revocation status of certificates should not grow exponentially with the size of the PKI user community;
• replicated repository servers should be provided in order to ensure service availability.
Table 2. Sample Certificate Revocation List (CRL)
Field name                           Example
Version                              Version 1
CA signature algorithm identifier    md5WithRSAEncryption
Issuer DN                            CN=Politecnico di Torino Certification Authority, O=Politecnico di Torino, C=IT
Validity Period
  This Update                        Sep 4 12:15:11 2003 GMT
  Next Update                        Oct 5 12:15:11 2003 GMT
Revoked certificates
  Serial Number                      0127
  Revocation Date                    Jun 9 12:54:18 2003 GMT
  Serial Number                      00E7
  Revocation Date                    Oct 3 07:53:38 2002 GMT
CA Signature                         0B7FABDC D7C1C411 146FBD7E DF25FF6A
                                     4AB7871E 023F18BB 4DA23262 90822E57
                                     E1238370 58A2248D A6839F23 FCD56A8
                                     E9CC4BC1 01476894 E391D5B0 B79D86C0
                                     48F2EE52 7550A7E1 41CDFEDA DCB394C6
                                     340A3B66 5C835BE7 00D3457A F1D3D349
                                     4E12E0A0 8983B194 D0101B95 5A94B3E2
                                     57FD72CA A0E96C01 66BB1CD4 F9C9F7B5
CRLs may be distributed by the same means as certificates, namely via untrusted channels and servers. The disadvantage is the increasing size
of the CRL, which leads to high repository-to-user communication costs. Thus, CRLs can introduce significant bandwidth and latency costs in large-scale PKIs. Another disadvantage is that the time granularity of revocation is limited to the CRL validity period; hence, the timeliness of revocation information is not guaranteed.
It is worth adding a security warning for clients: when CRLs are retrieved without verifying the server’s identity, there is a risk that an obsolete CRL is delivered. Clearly, an attacker cannot create a false CRL (without compromising the CA), but an attacker does have a window of opportunity to use a compromised key by denying access to the latest CRL and providing access to a still-valid-but-obsolete CRL. The term still-valid-but-obsolete implies that CRLs have a validity period. Certificates have a validity period clearly specifying the start date and the expiry date of the certificate. CRLs instead carry the date when the CRL was issued (the thisUpdate field) and the date by which the next CRL will surely be issued (the nextUpdate field). However, nothing prevents a CA from generating and publishing a new CRL (let us call it an off-cycle CRL) immediately when a new revocation takes place (Ford & Baum, 1997). If an intruder deletes or blocks access to an off-cycle CRL on an untrusted server and leaves the previous periodic CRL in its place, this cannot be detected with certainty by the RP.
[Figure 2. OCSP. Labels in the figure: Relying Party, OCSP responder, CRL; query “valid?”; answers “yes, no, unknown”.]
Another important observation is that a CRL does not become invalid when the next CRL is issued, although some applications behave as though it did. Interpreting the CRL dates as a strict validity period eliminates the ability of the client to decide, based on the value of the information, how fresh its revocation information needs to be. For example, if CRLs are issued every hour, a user might demand a CRL less than two hours old to authenticate a high-value purchase transaction. If a CA issues CRLs every month, a user would prefer to be warned that, according to his application preferences, that CA’s certificates should not be used for high-value purchases. This means that the user does not want the application to blindly treat them as being as fresh as certificates from CAs that issue CRLs daily or hourly. Unfortunately, there is no way to express the update frequency formally inside the CRL; however, this information might be available in the certification practice statement of the CA.
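The idea that the client, not the CRL, decides how fresh revocation data must be can be sketched as follows. This is a minimal illustration, assuming a hypothetical CrlInfo record whose signature has already been verified; the function names and the returned status strings are illustrative and do not belong to any standard API. The dates and serial numbers are those of Table 2.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class CrlInfo:
    """Hypothetical summary of an already signature-checked CRL."""
    this_update: datetime
    next_update: datetime
    revoked_serials: frozenset


def crl_is_acceptable(crl, now, max_age):
    """Accept the CRL only if it is current and fresh enough for the caller."""
    if now < crl.this_update or now > crl.next_update:
        return False                      # not yet valid, or past nextUpdate
    return now - crl.this_update <= max_age   # client-side freshness policy


def certificate_status(serial, crl, now, max_age):
    if not crl_is_acceptable(crl, now, max_age):
        return "unknown"                  # caller should fetch fresher data
    return "revoked" if serial in crl.revoked_serials else "not revoked"


crl = CrlInfo(
    this_update=datetime(2003, 9, 4, 12, 15, 11, tzinfo=timezone.utc),
    next_update=datetime(2003, 10, 5, 12, 15, 11, tzinfo=timezone.utc),
    revoked_serials=frozenset({0x0127, 0x00E7}),
)
now = datetime(2003, 9, 20, 12, 0, 0, tzinfo=timezone.utc)

# A high-value transaction demands a CRL less than two hours old;
# a low-value one accepts anything within the monthly cycle.
high_value = certificate_status(0x0127, crl, now, timedelta(hours=2))
low_value = certificate_status(0x0127, crl, now, timedelta(days=31))
```

Keeping max_age as a caller-supplied parameter reflects the point made in the text: the same CRL can be adequate for one application and too old for another.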
Since the size of a CRL tends to grow over time, it is possible to split the CRL into non-overlapping segments. This can be done via the CRL distribution point (CDP) extension of X.509 certificates. A CDP is the location from which the RP can download the latest CRL for a specific certificate. By using different CDPs for different sets of certificates (e.g., one CDP every 10,000 certificates), it is possible to set an upper limit on the size of a CRL. Usually, the CDP extension contains multiple access methods (such as LDAP, HTTP, FTP), to support as many applications as possible.
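One possible way to obtain non-overlapping segments is to assign a CDP according to the serial number, as in the sketch below. It assumes that the CA allocates serial numbers sequentially; the host crl.example.org, the file naming scheme and the function names are hypothetical and serve only as an illustration.

```python
CERTS_PER_CDP = 10_000  # example segment size mentioned in the text


def cdp_index(serial_number):
    """Segment to which a certificate belongs (assumes sequential serials)."""
    return serial_number // CERTS_PER_CDP


def cdp_url(serial_number, base_url="http://crl.example.org"):
    """Hypothetical URL that the CA would place in the CDP extension."""
    return f"{base_url}/crl{cdp_index(serial_number)}.crl"


# Serial numbers 0..9999 share one CRL segment, 10000..19999 the next one,
# so no segment can ever list more than CERTS_PER_CDP revoked certificates.
url_for_sample_certificate = cdp_url(12345)
```

With this scheme the upper bound on the size of each CRL segment follows directly from the segment size, independently of how many certificates the CA issues overall.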
Certificate Status Information (via OCSP)
The main alternative to CRLs is the Online Certificate Status Protocol (OCSP) (Myers et al., 1999). This protocol allows applications to request the revocation status of a specific certificate by querying an online OCSP responder that provides fresh information about certificate status (Figure 2). The main advantage of OCSP is its speed, since it does not require downloading huge CRLs. The response is very simple and may convey three different values: valid, invalid, unknown (named good, revoked and unknown in the protocol specification). Additionally, OCSP can provide more timely revocation information than CRLs (when it is fed directly with revocation data, before a CRL is generated) and also seems to scale well for large user communities. However, since certificate status validation implies a specific client/server request to OCSP responders, the mechanism can overload the network and generate intense traffic toward the responder. Thus, even if the cost of user-to-repository communication is lower than the traffic involved in transmitting a CRL, there is still intense communication toward the OCSP server. Since the revocation information is produced at the server, the communication channel between the relying party and the server must be secured, most likely by using signed responses. Signing operations could also limit the server scalability, since digital signature generation is computationally intensive.
On the other hand, the decision to trust an OCSP responder is an important one. Consequently, all the issues related to the distribution and maintenance of the trusted CA’s public keys apply to the OCSP responder’s public key too. For revocation notification in enterprise environments, there should be an additional mechanism to manage and enforce trusted responders in a centralized manner.
Revocation of the OCSP server’s public key requires the use of an alternative revocation method for checking the status of the server’s public key. The OCSP responses are signed objects, so the OCSP client must verify the validity of the signature on them. To do so, the client has to verify the status of the OCSP responder certificate. But it cannot ask the OCSP responder whether its own certificate is still valid because, if the OCSP responder’s private key has been compromised, its responses can be manipulated to give false status information. In other words, how will the OCSP clients verify the status of the responder’s certificate? The standard mentions three alternatives to solve this issue.
In the first alternative, the client trusts the responder’s certificate for the entire validity period specified within. For this purpose, the CA issues the responder’s certificate with a special non-critical extension called id-pkix-ocsp-nocheck. However, the effects of such a choice are self-evident: if the responder’s private key is compromised in any way, the entire PKI is compromised, because an attacker in control of the responder’s private key can provide any status information to the PKI community. Therefore, when this alternative is deployed in a PKI, the CA should issue the responder’s certificate with a very short validity period and renew it frequently. As a side effect, the OCSP client has to download a fresh OCSP responder certificate from the certificate repository very often.
In the second alternative, the OCSP clients are not advised to trust the responder certificate, and an alternative method of checking the responder certificate status must be provided. Typically, a CRL distribution point (CDP) extension is inserted into the responder’s certificate when CRLs are employed, or an Authority Information Access (AIA) extension is provided in the responder’s certificate if other means of revocation notification are employed.
The third case is when the CA does not specify any means for checking the revocation status of the responder certificate. In such a case, the OCSP client should fall back to its local security policy in order to decide whether the certificate at hand should be checked or not.
Another interesting feature of OCSP is the pre-production of responses. OCSP responders can produce, at a specified moment in time, a complete set of responses for all the certificates issued by the CA. This procedure has advantages and disadvantages. The advantage is that the responder saves computational resources by having the pre-produced responses available. The disadvantage is that, by means of pre-produced responses, the server exposes itself to replay attacks. This type of attack would typically consist of an attacker sending back to an OCSP client an old response with a good status inside, just before the response expires and after the certificate was revoked. However, this type of attack is feasible also for responses generated on the fly. To prevent replay attacks, a special unique value (named “nonce”) can be inserted in the request and copied back by the responder into the signed response. In this way, a replay attack would be immediately detected.
The OCSP responders are vulnerable to yet another type of attack, namely the denial of service (DoS) attack. This vulnerability is evident when imagining a flood of queries toward the server. The flaw is aggravated by the requirement to sign the responses, which reduces the capacity of the responder to answer queries and can eventually bring it to a halt. Nevertheless, producing unsigned responses is not an alternative, since an attacker could then send false responses to the clients on behalf of the responder. For this reason, protocol implementers must carefully consider restricting access to the responder by accepting only signed requests from known partners or by using other access control techniques.
In general, OCSP is better for fast and specific certificate status lookup at the present time, as needed for online transactions, while CRLs are superior in providing evidence for long periods of time, as needed for archival of electronic documents.
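The nonce mechanism described above can be sketched in a few lines of Python. Several simplifications are assumed: messages are plain dictionaries rather than the encoding used by the real protocol, and an HMAC computed with a demonstration key stands in for the responder’s public-key signature (a real client would verify a signature with the responder’s certificate and would not share any key with it). All function names are hypothetical.

```python
import hashlib
import hmac
import secrets

# Stand-in for the responder's private key (assumption: a real responder
# uses an asymmetric signature, not a key shared with the client).
RESPONDER_KEY = b"demo-key-for-illustration-only"


def sign(message):
    return hmac.new(RESPONDER_KEY, message, hashlib.sha256).digest()


def make_request(serial):
    # A fresh random nonce makes every request unique.
    return {"serial": serial, "nonce": secrets.token_hex(16)}


def make_response(request, status):
    # The responder copies the nonce into the signed part of the response.
    body = f"{request['serial']}|{status}|{request['nonce']}".encode()
    return {"serial": request["serial"], "status": status,
            "nonce": request["nonce"], "signature": sign(body)}


def accept_response(request, response):
    body = f"{response['serial']}|{response['status']}|{response['nonce']}".encode()
    if not hmac.compare_digest(sign(body), response["signature"]):
        return False                      # forged or altered response
    # A replayed response carries the nonce of an older request.
    return (response["serial"] == request["serial"]
            and response["nonce"] == request["nonce"])


old_request = make_request(0x0127)
old_response = make_response(old_request, "valid")   # before revocation
new_request = make_request(0x0127)                   # after revocation
replay_accepted = accept_response(new_request, old_response)
```

Because the nonce is covered by the signature, an attacker can neither reuse the old response for the new request nor rewrite the nonce inside it. Note that pre-produced responses cannot contain a client-chosen nonce, so the two features are mutually exclusive in this sketch.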
X.509 Extensions
Since version 3, X.509 has defined a mechanism to extend the certificate format to include additional information in a standardized and yet general fashion. The term standard extension refers to those extensions that are defined in the standard itself, while the term private extension refers to any extension defined by a single company or closed group. For example, before the definition of OCSP, Netscape defined the NetscapeRevocationURL extension, supported by its products, to provide certificate status lookup.
Each extension consists of three fields, namely the extension type, the extension value and the criticality bit. While the meaning of the first two fields is straightforward, special attention must be paid to the criticality field, which is a single-bit flag. When an extension is marked as critical, the associated value contains information that the application cannot ignore and must process. If an application cannot process a critical extension, it should reject the whole certificate. In contrast, if an application encounters an unrecognised but non-critical extension, it can silently ignore it.
Note that certificates containing critical extensions are defined by the CA to be used for a specific purpose. If an application encounters a critical extension and does not process it in accordance with its definition, then the CA is not liable for the misuse or incorrect processing of the certificate. Thus, certificate extensions are related not only to technical issues but also to legal aspects.
Standard certificate extensions are grouped into four main categories: certificate subject and certificate issuer attributes, key and policy information, certificate path constraints, and CRL distribution points.
The certificate subject and certificate issuer attributes extensions support alternative names of various forms, to identify the subject or the issuer in a way consistent with the application that requires the certificate. These extensions can also convey additional information about the certificate subject, to assist a relying party’s application in being confident that the certificate subject is a specific person or entity. The subject and issuer alternative name extensions allow one or more unique names to be bound to the subject or the issuer of the certificate. These extensions support several forms of names, such as email identifiers, domain names, IP addresses, X.400 originator/recipient addresses, EDI party names and much more.
These additional names are very valuable to perform additional security
Digital Certificates and Public-Key Infrastructures
checks at the application level. For example, if the issuer inserts the RFC-822 email address of the user in the subject alternative name of her certificate (e.g.,
[email protected]), then the secure email applications can associate the sender of an email with her cryptographic identity. This is actually what happens in S/MIMEv3 (Ramsdell, 1999) where the subject alternative name is specified as the preferred mean to verify the correspondence between the “From” header of the RFC-822 message and the identity present in the certificate used to digitally sign the email. If the values do not match then the S/MIME mail user agent usually displays a warning message. The key information extensions convey additional information about the keys involved, to identify a specific key or to restrict key use to specific operations. The authority key identifier extension allows one to identify a particular public key used to sign a certificate. This is the case when a CA uses two key pairs (one for low and one for high assurance operations). The identification of the key can be performed in two ways: either with a key identifier, which typically is the digest of the public key, or with the pair issuer name and serial number. This extension is always non-critical but, nonetheless, in some software it is very important because it is used for constructing the certification paths. The key usage extension identifies the range of applications for which a certain public key can be used. The extension can be critical or non-critical. If the extension is critical, then the certificate can be used only for the cryptographic operations for which the corresponding value is defined. For example, if a certificate contains the extension key usage with the values set up to Digital Signature and Non-Repudiation then the certificate can be used to generate and to verify a digital signature, but not for other purposes. As another example, if a certificate contains the key usage extension with the value set to Key Encipherment then the
corresponding public key in the certificate can be used in a key distribution protocol to encrypt a symmetric key.
The policy information extensions convey additional information about the policy used in certificate creation and management. A certificate policy is a named set of rules that indicates the applicability of a certificate to a particular community and/or class of application with common security requirements. For example, a particular certificate policy might indicate applicability of a type of certificate to the authentication of electronic transactions for trading goods up to a given price. This extension is very important as it is used to establish the validity of a certificate for a specific application. Relying party applications with specific certificate policy requirements are expected to have a list of acceptable policies and to compare the policies in the certificate to those in this list.
The certification path constraints extensions allow constraints to be included in a CA certificate to limit its certification power. The following types of constraints are defined:
• basic constraints tell whether an entity is a CA or not, i.e., whether it is authorised to issue certificates or if it is a leaf in the certification tree
• name constraints restrict the domain of trustworthy names that can be placed by the CA inside the certificates (e.g., only a certain subset of email identifiers or IP addresses)
• policy constraints restrict the set of acceptable policies that can be adopted by the CA in its operations.
These extensions are very important and they must be processed correctly when a relying party application must determine the validity of a certificate. For example, one paper (Hayes, 2001) describes a security attack, named certificate masquerading attack, which successfully occurred
because the certificate-enabled application did not properly apply the external name constraints and policies. The CRL distribution point extension identifies the point of distribution for the CRL to be used in the process of determining the validity of a certain certificate. The value of this extension can be either a directory entry, an email address or a URL.
PKIX Certificate Profile
Extensions are a good way to make the certificates more flexible and accommodate different needs. However, this can make interoperability a nightmare. Therefore, several bodies have started to define certificate profiles that suggest which extensions to use for specific applications or environments. Among all bodies that defined profiles, the work of the IETF-PKIX group is particularly important because it applies X.509 certificates to the security of common Internet applications, such as web protection via SSL/TLS, email security via S/MIME, and the protection of IP networks via IPsec. The PKIX profile was originally defined by RFC-2459 (Housley et al., 1999) and later updated by its successor RFC-3280 (Housley et al., 2002). The profile suggests the use of X.509v3 certificates, coupled with X.509v2 CRL, and addresses the following issues:
• the format and semantics of certificates and certificate revocation lists for the Internet PKI;
• procedures for processing the certification paths;
• encoding rules for popular cryptographic algorithms.
In addition to the standard extensions, PKIX defined several private extensions. However, since the group that defined these extensions is the Internet itself, extensions can hardly be
regarded as private. These PKIX extensions are subject information access, authority information access and CA information access. The subject information access extension specifies a method (e.g., http, whois) to retrieve information about the owner of a certificate and a name to indicate where to locate it (address). This extension is fundamental when X.500 is not used for certificate distribution. The authority information access specifies how to get access to information and services of the CA that issued a certificate, while CA information access specifies how to get access to information and services of the CA owning the certificate. In addition to these new extensions, the PKIX profile introduces new application-specific values for the extended key usage extension, in addition or in place of the basic purposes indicated in the key usage field. For example, if a certificate has the value of this extension set to server authentication then it can be used to authenticate the server in the TLS protocol. Other values are defined for TLS client authentication, OCSP signing, timestamping generation, code authentication and email protection.
Certificate validation
Once a certificate has been issued, the task of the CA is completed. However, when a relying party accepts a digital signature (and hence the associated public-key certificate), it is its own responsibility to check the certificate’s validity. This is quite a complex task that requires many actions to be taken and many checks to be performed, but it is at the heart of trust in PKIs. Suppose that Alice receives a message digitally signed by Bob and she needs to validate the certificate that comes along with the message. First Alice needs to check the authenticity of the CA signature, for which it is necessary to obtain the public key of the CA. Next, if Alice directly trusts this CA then she can validate Bob’s certificate immediately. Otherwise Alice needs to get a set
of certificates that starts from Bob’s certificate and continues with the certificates of all the intermediate CAs, up to a CA that Alice trusts (also called a trust anchor, TA). This ordered set of certificates is called a certification path or certificate chain. More than one certification path can exist for Bob’s certificate. The process of finding all of them is called certificate path discovery (CPD). CPD tries to find a certificate sequence that leads to a trusted CA. This may require constructing several certificate chains before finding an acceptable one. The process of constructing a certificate path rooted in a trusted CA is called path construction. Once the certification path is built, Alice needs to execute a path validation algorithm that takes as input the certification path and returns as output whether the certificate is valid or not. Both the ITU (ITU-T Recommendation, 2000) and the IETF (Housley et al., 2002) provide sample algorithms for path validation. These algorithms are a reference to establish the correct results of path processing, but are not necessarily the best or most optimised way to validate certification paths. They were chosen because they can be described fully and accurately. RPs are free to implement whatever path validation algorithm they choose, as long as the results are guaranteed to be the same as those of these standard algorithms. The path validation algorithm contains the following steps:
a) Syntax check. Parse and check the syntax of the digital certificate and its contents, including some semantic checks such as the use of the certificate compared to its allowed use (key usage extension) and the presence of mandatory fields and critical extensions.
b) Signature validation. Validate the CA’s signature on the certificate. This requires a trusted copy of the CA’s own public key. If the CA is not directly trusted then a certification path (or certificate chain) must be constructed up to a trusted CA. The certificate chain is defined above.
c) Temporal validation. Check that the digital certificate is within its validity period, as expressed by the “not before” and “not after” fields in the certificate. For real-time checking, this must be compared against the current time, while for old signed messages, the signature time must be considered.
d) Revocation status. Check that the certificate is not revoked; that is, declared invalid by the CA before the end of the validity period. This may require a CRL lookup or an OCSP transaction.
e) Semantic check. Process the certificate content by extracting the information that shall be presented to the relying party either through a user interface or as parameters for further processing by the RP application. This should include an indication of the quality of the certificate and of the issuer, based on the identification of the certificate policy that the CA applied for certificate issuance.
f) Chain validation. In case a certificate chain had to be constructed, the above steps must be repeated for each certificate in the chain.
g) Constraints validation. Execute controls on each element of the path to check that a number of constraints have been respected (e.g., naming and policy constraints).
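As a rough illustration of how steps b), c), d), f) and g) fit together, the following Python sketch walks a certification path from the end-entity certificate up to a trust anchor. It is a toy model under stated assumptions, not the ITU or IETF reference algorithm: the ToyCert class, the signed_by field (a stand-in for a real cryptographic signature check), the revoked set (a stand-in for a CRL or OCSP lookup) and the function name validate_path are all hypothetical, and the syntax and semantic checks of steps a) and e) are not modelled.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Set, Tuple


@dataclass(frozen=True)
class ToyCert:
    """Hypothetical, heavily simplified certificate (not an X.509 encoding)."""
    subject: str
    issuer: str
    serial: int
    not_before: datetime
    not_after: datetime
    is_ca: bool = False   # stands in for the basic constraints extension
    signed_by: str = ""   # stands in for a real signature: name of the signer


def validate_path(path: List[ToyCert], trust_anchors: Set[str],
                  revoked: Set[Tuple[str, int]], at: datetime) -> bool:
    """path[0] is the end-entity certificate; each later entry certifies
    the issuer of the previous one; the last issuer must be a trust anchor."""
    if not path:
        return False
    for i, cert in enumerate(path):           # f) repeat the checks for every element
        # c) temporal validation against the reference time
        if not (cert.not_before <= at <= cert.not_after):
            return False
        # d) revocation status: toy CRL keyed by (issuer, serial number)
        if (cert.issuer, cert.serial) in revoked:
            return False
        # b) signature validation, simulated by comparing names only
        if cert.signed_by != cert.issuer:
            return False
        # g) constraints: every certificate above the end entity must be a CA
        if i > 0 and not cert.is_ca:
            return False
        # chain linkage: the issuer must be the next subject, or a trust anchor
        if i + 1 < len(path):
            if cert.issuer != path[i + 1].subject:
                return False
        elif cert.issuer not in trust_anchors:
            return False
    return True
```

Passing the reference time as a parameter, rather than reading the clock inside the function, reflects the remark in step c) that old signed messages must be checked against the signature time.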
A valid certificate is a certificate signed by a CA that a relying party is willing to accept without further checks — that is, the certificate has a CA trust point, the certificate’s signature can be verified, the certificate is not revoked and the certification path up to a trusted CA can be constructed and is valid (Housley, 2001). In simple hierarchical closed environments, where all the entities trust a single root, certificate validation seems a trivial task. However, security attacks can be performed at the application level, as explained in Hayes (2001). When multiple CAs exist, if the entities that trust different CAs want
Figure 3. Example ACL. (1) Alice Smith asks the web server for access, presenting a certificate with ID “Alice Smith” and her public key; (2) the server looks up the ID in the ACL (Bob Green: grant; Alice Smith: grant; Jack Ripper: deny; Carl Blue: grant); (3) positive response; (4) access granted.
to communicate among themselves, the CAs must establish efficient structuring conventions to create trust between each other. These conventions are often called PKI trust models and are discussed later. Generally speaking, a PKI trust model is employed to provide a chain of trust from Alice’s trust anchor to Bob’s certificate. In order to perform certificate validation, it is important to know where certificate status information is available and to locate the certificate of the CA that issued the certificate being validated. Unfortunately, there is not yet general agreement about where this information should be put inside the certificate. The following X.509 extensions could be used:
• the Issuer Alternative Name extension can contain a URI related to the issuer, but the interpretation of this URI is entirely application specific
• the CRL Distribution Point can contain the location of CRLs, but too often this field is omitted
• the Authority Info Access extension can contain a URI indicating the location of the CA certificate and/or the location of an OCSP responder
If a certificate does not contain any of the above extensions – as in the Verisign-Microsoft case (Microsoft, 2001) – then the certificate’s revocation status cannot be automatically checked, unless it is configured in some other way into the relying party application.
Attribute Certificates
Attribute certificates are an extension of the identity certificates and were introduced to offer a robust, distributed and scalable system for the management of authorizations. In fact, identity certificates can contain attribute information to be used for authorization purposes. For example, a widely used access control method employs Access Control Lists (ACLs), together with the identity information placed inside a public-key certificate. This technique is based on a list of records that state the authorization granted by the system to an identity. Figure 3 shows a sample system that uses an ACL for controlling user access to a web server. The performed steps are as follows:
• the user sends her public-key certificate to the server
• the server checks the certificate validity and then engages the user into a challenge-response protocol to check possession of the corresponding private key
• if the check is positive then the identity extracted from the certificate is used as a search key in the ACL and the selected action is executed
Table 3. Main fields of an attribute certificate
Field name | Example
Version | Version 2
Serial number | 3514535
Signature algorithm identifier for AA | RSA with MD5
Issuer | CN=Politecnico di Torino Attribute Authority, O=Politecnico di Torino, C=IT
attrCertValidityPeriod | Start=01/01/2001, expiry=01/02/2002
Holder | Issuer: CN=Politecnico di Torino Certification Authority, O=Politecnico di Torino, C=IT; Serial Number: 12345
Attributes | Type: 2.5.4.72 (role); value: Student
AA Signature | EF1dGhvcm10eTCCASIwL2NybC5kZXIwTgYDB R0gBEcgEEAakHAQEBMDUwMwYIKwYBBQUHAgE Wj2h0dHeg6a5r61a4jUqp4upKxuzgu6unsw/ +RkU2KzlNm053JOcsZs/0IFiMW1GJB2P7225 WWDF01OtQcmLYspoiffUPy2g+KvCG1b9zHmf JoaDn5y+kQQpHs/ZIZeUyNe9ULifu3GgG
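The ACL-based access decision of Figure 3 can be sketched in a few lines of Python. This is only an illustration: the function name decide_access and its boolean parameters are hypothetical, the certificate validation and the challenge-response exchange are assumed to have been performed elsewhere, and the choice to deny identities that are missing from the list is an assumption of this sketch rather than something stated in the text.

```python
from typing import Dict

# ACL entries taken from Figure 3: identity -> action
ACL: Dict[str, str] = {
    "Bob Green": "grant",
    "Alice Smith": "grant",
    "Jack Ripper": "deny",
    "Carl Blue": "grant",
}


def decide_access(cert_identity: str, cert_valid: bool,
                  possession_proved: bool,
                  acl: Dict[str, str] = ACL) -> str:
    """Return "grant" or "deny" for the identity found in a certificate.

    cert_valid: outcome of certificate validation (performed elsewhere).
    possession_proved: outcome of the challenge-response check of the
    private key (performed elsewhere).
    """
    if not cert_valid or not possession_proved:
        return "deny"
    # The identity is the search key in the ACL; unknown identities are
    # denied by default (an assumption of this sketch).
    return acl.get(cert_identity, "deny")
```

For instance, decide_access("Alice Smith", True, True) returns "grant", while the same call with possession_proved set to False returns "deny".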
Other systems base the ACL not on identities, but rather on roles or authorizations. By using X.509 extensions (such as the Directory Attributes one), roles and authorizations can be directly inserted into an identity certificate. However, this solution exposes the certificate to a higher risk of revocation if any of the roles or authorizations change. Moreover, the CA is rarely the correct entity with the right to state roles and permissions that are usually defined directly by the application servers.
To avoid these problems, the Attribute Certificate was defined with the idea to store privilege information in a structure similar to that of a public key certificate but with no cryptographic key. This type of certificate is used for the express purpose of storing privilege information and has been standardized in X.509v4 (ITU-T Recommendation, 2000). The main fields of an X.509 AC are illustrated in Table 3 and discussed in the following:
• Version: This field indicates the version of the format in use. For attribute certificates conforming to the standard (ITU-T Recommendation, 2000) the version must be v2.
• Holder: This field identifies the principal with which the attributes are being associated. Identification can be either directly by name or by reference to an X.509 public key certificate (by a pair issuer name and certificate serial number).
• Issuer: This field identifies the AA that issued the AC.
• Signature: This field indicates the digital signature algorithm used to sign the AC.
• Serial Number: This field contains a unique serial number for the AC. The number is assigned by the issuing AA and used in a CRL to identify the attribute certificate.
• attrCertValidityPeriod: This field may contain a set of possibly overlapping time periods during which the AC is assumed to be valid.
• Attributes: This field contains a list of attributes that were associated to the AC owner (the owner is the principal that is referred to in the subject field). This field is a sequence of attributes, each one defined as a pair type and value. The standard allows one to specify in a single AC a set of attributes for the same owner. The information may be supplied by the subject, the AA, or a third party, depending on the particular attribute type in use.
With an attribute certificate of this type, the issuer declares that the subject has the set of attributes listed in the attributes field. Attributes certified by an AC can be anything, from group membership (e.g., the subject belongs to group of administrators), to financial limitations (e.g., the subject has an upper limit of 500 Euro for online transactions). The attrCertValidityPeriod field indicates the period of time for which the attributes hold for the subject.
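To make the field list more concrete, the sketch below models the fields of Table 3 as plain Python data classes and shows how an application might read an attribute only while the certificate is within its validity period. The class and method names are hypothetical, the AA signature is omitted, a single validity period is used, and the dates of Table 3 are interpreted as day/month/year, which is an assumption because the table does not state the convention.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Holder:
    """Reference to the holder's public-key certificate."""
    issuer: str
    serial_number: int


@dataclass(frozen=True)
class AttributeCertificate:
    version: str
    serial_number: int
    signature_algorithm: str
    issuer: str                          # the AA that issued the AC
    valid_from: date
    valid_until: date
    holder: Holder
    attributes: List[Tuple[str, str]]    # (type, value) pairs

    def attribute_values(self, attr_type: str, on: date) -> List[str]:
        """Values of attr_type, or an empty list outside the validity period."""
        if not (self.valid_from <= on <= self.valid_until):
            return []
        return [value for kind, value in self.attributes if kind == attr_type]


# Example values from Table 3 (dates assumed to be day/month/year).
example = AttributeCertificate(
    version="v2",
    serial_number=3514535,
    signature_algorithm="RSA with MD5",
    issuer="CN=Politecnico di Torino Attribute Authority, "
           "O=Politecnico di Torino, C=IT",
    valid_from=date(2001, 1, 1),
    valid_until=date(2002, 2, 1),
    holder=Holder(
        issuer="CN=Politecnico di Torino Certification Authority, "
               "O=Politecnico di Torino, C=IT",
        serial_number=12345,
    ),
    attributes=[("2.5.4.72", "Student")],   # role attribute
)
```

A real relying party would also verify the AA signature and the holder's public-key certificate before trusting any attribute.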
SDSI/SPKI Certificates
Ronald L. Rivest & Butler Lampson, in their paper (1996), proposed a new distributed security
infrastructure. The motivation of this proposal lies in the authors’ perception that many PKIs (such as those based on X.509 certificates) were incomplete and difficult to deploy due to their dependence on a global name space, like the one proposed by X.500. The main goal of the SDSI infrastructure is to eliminate the need of associating an identity to a key. The key itself is the identifier of the key-holder. According to its authors, this is a “key-centric” system. A public key represents and speaks for the entity that controls the associated private key. Rivest & Lampson base their arguments on the fact that names are never global, but have value only in a local space. Names are bound to a person by experience and have, therefore, always a limited scope. However this leaves room for two entities having the same name: this problem is called by Ellison the “John Wilson problem” (Ellison, 2002). To differentiate between such two entities with the same name, a large quantity of personal details is needed (e.g., birth date and place, town of residence, name of parents and many other issues that would form a dossier of personal information that would violate privacy). In SDSI, names are always related to a very limited and specific context and should therefore disambiguate very easily.
Each public key, along with the algorithm used to generate it, is contained in an object called a principal. Signatures are appended to the object or can also be detached from it. Objects can be co-signed by many signers. A signed object is a particular data structure that must contain:
• the hash of the object being signed, along with the algorithm used to generate it
• the signature date
• the output produced by the signature algorithm
Below is an example of an SDSI signed object:
(Signed:
  (Object-Hash: (SHA-1=7Yhd0mNcGFE071QtzXsap=q/uhb= ) )
  (Date: 1996-02-14T11:46:05.046-0500 )
  (Signature: #3421197655f0021cdd8acb21866b)
  (Re-confirm: PT8H ( Principal: ... ) ) )
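As a rough illustration of how the Date and Re-confirm fields of such an object might be evaluated, the Python sketch below treats a signature as still acceptable while the time elapsed since the signature date is shorter than the re-confirmation period (PT8H meaning eight hours). The function names are hypothetical, only the hours/minutes/seconds part of the ISO 8601 duration syntax is handled, and no query to a reconfirmation principal is modelled.

```python
import re
from datetime import datetime, timedelta

# Only durations of the form PT<h>H<m>M<s>S are supported in this sketch.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_duration(text: str) -> timedelta:
    """Parse a restricted ISO 8601 duration such as 'PT8H'."""
    match = _DURATION.match(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def parse_sdsi_date(text: str) -> datetime:
    """Parse a date written like the Date field of the example object."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f%z")


def is_still_valid(signed_at: datetime, reconfirm: str, now: datetime) -> bool:
    """True while the elapsed time is below the re-confirmation period."""
    return now - signed_at < parse_duration(reconfirm)
```

Once the period has elapsed, a verifier would have to obtain a reconfirmation from the signer or from the delegated principal before accepting the signature again.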
The SDSI signed object can have an absolute expiration time or may need a reconfirmation. An absolute expiration time means that the signature is valid until the date present inside the object. If, due to some accident (e.g., a private-key compromise), the signature is no longer valid before that date, then a procedure is needed to revoke the signed object. Reconfirmation is the opposite way of thinking: a signature remains valid only as long as the signer keeps reconfirming it. In this paradigm the signature is valid if the time elapsed since the signature date is less than the re-confirmation period. The signed object can contain either the Expiration-date: or the Re-confirm: attribute; in the latter case, a time interval is specified in ISO 8601:1998 format: for example, PT8H says that reconfirmation is needed every eight hours. The Re-confirm: field has an interesting optional value used to specify a reconfirmation principal different from the original signer. This is the case, for example, of a server specialized in the reconfirmation duty. The SDSI architecture does not rely on the Certificate Revocation List model to deny validity to a signature, but on a reconfirmation method. The paper (Rivest & Lampson, 1996) proposes a client-server protocol in which someone who needs to verify a signature can query the original signer itself or a delegated entity for reconfirmation of the signature.
SDSI identity certificates are used to bind a principal to a person and, because humans should always examine the identity certificates, they should always contain some readable text to describe the person being certified. An identity
certificate is the union of some data to a signed object. It contains:
• a local name in the field Local-name
• a principal in the fields Value:(Principal:)
• a description of the certified entity in the field Description:
• a signature (signed object) in the field signed
The local name is chosen arbitrarily; however, a system exists to link all the names. For example, to refer to Alice Smith, Bob’s best friend and certified by him, one could simply say Bob’s Alice or, in formal notation,
(ref: bob alice)
while a reference to Alice’s mother would be:
(ref: bob alice mother)
Here is an example of an SDSI identity certificate:
(Cert:
  (Local-Name: Alice)
  (Value: (Principal: ...))
  (Description: [text/richtext] “Alice Smith is a researcher at the Politecnico di Torino. (Phone: 39-011-594-7087 )
  (signed: ...))
In SDSI, each principal (that is, each public key) can act as a certification authority; that is, it can bind a key to an identity by a signature. In doing so the principal should use his personal judgment about the identity of the key owner being certified. Based on the SDSI theory, the SPKI model (Simple PKI) was proposed to the IETF in February 1997 by a working group, which terminated
its work in 2001. The main target of the SPKI certificates (Ellison, 1999) is authorization rather than authentication. SPKI identifies the problem of having a unique identifier for a person as being the problem that led the X.500 project to failure. Therefore SPKI uses local names to represent entities. However, relationships on the net are established between people who have never met in person or who do not have common friends or references. So, SPKI elects as a global identifier the public key itself or a hash of the public key. This is based on the assumption that any public key is different from another.
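The two ideas above, linked local names such as (ref: bob alice mother) and the use of a hash of the public key as a global identifier, can be sketched together as follows. This is an illustrative model only: the function names and the namespace layout are hypothetical, SHA-256 is chosen here purely for convenience (the text does not prescribe a hash function), and the byte strings used as keys in the usage note are placeholders rather than real public keys.

```python
import hashlib
from typing import Dict, Sequence


def principal_id(public_key: bytes) -> str:
    """Identify a principal by a hash of its public key (SHA-256 assumed)."""
    return hashlib.sha256(public_key).hexdigest()


def resolve(namespaces: Dict[str, Dict[str, str]],
            start: str, names: Sequence[str]) -> str:
    """Follow a chain of local names, as in (ref: bob alice mother).

    namespaces maps each principal identifier to that principal's own
    local namespace (local name -> principal identifier). Each name is
    looked up in the namespace of the principal reached so far. A
    KeyError is raised when a name is unknown in the current namespace.
    """
    current = start
    for name in names:
        current = namespaces[current][name]
    return current
```

With a namespace in which the caller knows "bob", Bob knows "alice" and Alice knows "mother", resolve(namespaces, me, ["bob", "alice", "mother"]) returns the identifier of the key that Alice calls mother, without any globally unique name being involved.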
PKI COMPONENTS
A Public Key Infrastructure (PKI), as defined by the IETF working group PKIX (Arsenault et al., 2002), is “the set of hardware, software, people, policies and procedures needed to create, manage, store, distribute, and revoke public-key certificates based on public-key cryptography.” The main components of a PKI are:
• certification authority (CA)
• registration authority (RA)
• local registration authority (LRA)
• repository
• relying party (RP)
• end entity (EE).
The Certification Authority (CA) is the authority trusted by users. The main duties and responsibilities of a CA are:
• issue certificates
• manage certificates (suspend, renew, publish, revoke)
• generate and publish certificate status information
• keep safe its own private key
• keep safe the CA hardware and software.
The CA’s private key is the heart of the whole system. The safety of this key is one of the main responsibilities of a CA. To this aim a CA should adopt any security measure it can afford. At least the key should be stored in an encrypted form and possibly it should be used only by some secure hardware crypto-device, such as a smartcard or a cryptographic accelerator. To achieve a higher degree of security, the key to decrypt the private key or to activate the crypto device could be divided into two parts, each one held by a different person, so that both persons must be present to operate the CA. Another good practice to protect the CA is to keep the system that issues certificates and CRLs off-line in a secure vault. Despite any security measure implemented by a CA, there is always the chance of a human error or of an attack based on human misbehaviour (the so-called “social engineering” attacks). Therefore the CA should also pay special attention to its internal procedures and to the phase of user registration that is usually delegated to a different component (the RA), whose behaviour should be periodically audited.
The Registration Authority (RA) is the entity responsible for the registration of users requiring a certificate. It verifies the identity of the certificate applicants and the validity of their certificate requests. It is an optional part in the PKI architecture because in its absence its duties can be carried out directly by the CA, but it is extremely useful, not only because it relieves the CA of non-technical duties (like control of the applicants’ identity), but also because it can be physically located closer to the users to be certified.
The Local Registration Authority (LRA) is a registration authority that is local to a specific and limited community of users. The users who belong to this community and want a certificate from their community CA should go to the LRA to have their request and identity verified.
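One simple way to divide an activation secret into two parts, so that neither holder alone learns anything about it, is XOR splitting. The Python sketch below shows this technique as an illustration only: the text does not say which splitting method a CA should use, the function names are hypothetical, and a production CA would normally rely on the dual-control features of its hardware security device instead.

```python
import secrets
from typing import Tuple


def split_secret(secret: bytes) -> Tuple[bytes, bytes]:
    """Split a secret into two shares; both are needed to rebuild it."""
    share_a = secrets.token_bytes(len(secret))              # uniformly random pad
    share_b = bytes(x ^ y for x, y in zip(secret, share_a))  # secret XOR pad
    return share_a, share_b


def join_shares(share_a: bytes, share_b: bytes) -> bytes:
    """Recombine the two shares produced by split_secret."""
    if len(share_a) != len(share_b):
        raise ValueError("shares must have the same length")
    return bytes(x ^ y for x, y in zip(share_a, share_b))
```

Each share on its own is indistinguishable from random bytes, so the two holders must both be present to reconstruct the secret that decrypts the private key or activates the device.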
Since the LRA is closer than the CA to the user pool that it should serve, it is likely that it can identify them more easily. Quite often the terms RA and
Figure 4. Sample CA (diagram labels: EE, PKI users, management entities, CA, repository; numbered steps 1 to 5)
Figure 5. Certificate usage (diagram labels: RP, PKI users, management entities, CA, repository; numbered steps 1 to 5)
LRA are used interchangeably, while other times the term RA is used for the general concept (that is, to represent the set of all LRAs). The repository is the logical storage for certificates and other information (such as CRLs) made publicly available by the CA. The repository can be implemented in different technologies and accessed by different access protocols (LDAP, HTTP, FTP...). Since it usually contains only signed information (PKCs and CRLs), no special security measure is needed when accessing it via network. However, to avoid denial-of-service attacks and to offer a better service to its customers, the repository is usually replicated at different sites and sometimes it implements access control measures (such as SSL/TLS client-authentication or password-based authentication over a secure channel). A relying party (RP) is an entity that would make some decision based on the certificate content. Basically, a relying party is someone or something whose action depends on the certificate content and on the result of the certificate validation process. For example, an e-banking server acts as a relying party when it accepts and checks a user certificate for authentication in accessing a bank account. The end-entity (EE) is the holder of the private key corresponding to the public one certified and
consequently the subject of the certificate. It can be a certificate user or a Certification Authority. Using these basic components, several PKI architectures can be built. The simplest PKI architecture is composed merely of a CA, a repository and an end-entity. In this architecture the steps performed to issue a certificate are shown in Figure 4.
1. The EE requests the CA certificate and verifies it with an out-of-band method; if the verification is successful, then the EE sets the public key found in the CA certificate as her trust point.
2. The EE generates an asymmetric key pair, stores the private key locally, and inserts the public key in a certificate request that is sent to the CA.
3. The EE physically goes to the CA to be authenticated according to the CA’s procedures; this step could also occur before the second one.
4. If authentication (step 3) is successful (i.e., the EE’s identity is verified) the CA issues the certificate and publishes it into the repository.
5. The EE downloads her certificate from the repository.
As shown in Figure 5, when an RP needs to use the EE’s certificate, it has to:
1. Fetch the CA certificate from the repository.
2. Verify the CA certificate with an out-of-band method (steps 1 and 2 can be avoided if the RP already performed them in the past or if the RP has pre-configured trusted CAs).
3. Fetch the EE’s certificate from the repository.
4. Check the certificate status (e.g., by fetching the CRL from the repository).
5. Verify the EE’s certificate validity (e.g., check that the current date is within the validity period).
6. Use the key found in the EE’s certificate for the application needs (e.g., to encrypt data to be sent to the EE or to verify an EE’s signature).
Figure 6. Simple CA with RA (diagram labels: EE, PKI users, management entities, RA, CA, repository; numbered steps 1 to 7)
Figure 7: Hierarchy (diagram label: TLCA; legend: certification authority, end entity, certificate)
In both procedures, the first steps (those that establish the CA as a trust point for the EE or the RP) are very critical and difficult because they require out-of-band verification and human decision. If these steps are successful, the EE becomes part of the PKI built around the given CA and the RP trusts the CA for its application needs. The importance of both things should not be underestimated and should require a careful analysis of the CA technical configuration and administrative procedures to see if they match
the needs of the EE or RP. Usually, to help parties in performing these checks, the CA makes publicly available its Certificate Policy (CP) and Certification Practice Statement (CPS); pointers to these documents can be inserted in the certificatePolicies extension of X.509 certificates. A simple architecture that includes an RA is shown in Figure 6. In this architecture, the EE sends the certificate request to the RA rather than to the CA, and goes to the RA to be properly identified.
PKI TRUST MODELS
This section describes several different trust models for CA interconnection and evaluates the relative advantages and disadvantages.
Hierarchical PKI
A strict hierarchy of certification authorities is shown graphically as an inverted tree, with the root CA at the top and the branches extended downward (Figure 7). The root CA issues certificates to its immediate descendants, which in turn certify their descendants, and so on. The CA at the
top of the hierarchy, also known as TLCA (Top Level CA), is the trust anchor for the entire PKI. Each CA between the root and the subscribers is referred to as an intermediate CA. In a hierarchical PKI, trust starts at the TLCA and flows down the hierarchy through a chain of subordinate CAs to the end-entities. All intermediate CAs have a certificate issued by their parent CA, except the TLCA, which has a self-signed certificate as it does not have a parent CA. Therefore all certificates can be verified electronically, except that of the TLCA. As a consequence it is very important that the certificate of the TLCA is distributed to EEs in a secure way. If someone succeeds in substituting the certificate of the TLCA maintained by an EE with another one, he can get the EE to trust a completely different infrastructure.
The hierarchical PKI model is the first that was introduced and therefore it is well understood and supported by nearly all PKI products and commercial CA service providers. Also, two famous electronic financial transaction systems (SET and Identrus) use this model to organize their PKI. A new CA can enter a hierarchical PKI through the process of subordination, in which an existing CA issues a certificate for the new CA. Thus, certificates are issued in only one direction – from parent to child – and a CA never certifies another CA superior to itself. The subordination process is transparent and does not impact the users of the hierarchy, who can correctly process the certificates issued by the new CA without any configuration changes. Integrating an existing, foreign CA into a hierarchical PKI, however, is more problematic as it requires the users of the foreign CA to directly trust the root CA of the hierarchical PKI, which may be difficult to achieve in a peer-to-peer business relationship. The hierarchical trust model offers a scalable, easy-to-administer PKI because each CA serves a specific user community in the hierarchy.
Each member CA processes enrolment requests, issues certificates, and maintains revocation information for its own community. The extension of this community can be restricted by the parent CA via appropriate usage of the X.509 NameConstraints and PolicyConstraints certificate extensions. They permit or restrict the set of names of EEs that can be certified (e.g., only the mail addresses of the form “*@polito.it”) and the applicable policies: these are clear advantages over other trust models, as pointed out by Linn (2000). Another advantage of the hierarchical model is the simplicity in constructing the certification paths between any two entities. This simply requires the RP that wants to validate an EE certificate to retrieve issuer certificates until it finds a certificate issued by the trust anchor. The direction of path construction is bottom-up. With bottom-up chain building, an application starts with an end-entity’s target certificate and uses the information in the certificate to locate the issuer’s certificate, iterating the process until the TLCA is reached. This process is further simplified by the convention adopted by many secure applications (such as SSL/TLS web servers and clients, and S/MIME mailers) to send the whole chain (up to the TLCA) along with the EE certificate.
An important disadvantage of the hierarchical PKI model is the existence of a single point of failure: if the private key of the TLCA is compromised then so is the whole PKI. Instead, if the private key of an intermediate CA is compromised, then only its subordinate branch is compromised and the damage can be limited by quickly revoking the certificate of the compromised CA. This has the effect of invalidating all the certificates issued by any CA belonging to the subtree rooted in the compromised CA. This subtree must then be completely reconstructed. In conclusion, this model is very successful, but it is mainly applicable within isolated, hierarchical organizations, be they a multinational enterprise or a national government.
It is more difficult to apply across organizational boundaries (Linn, 2000), where the issue is clearly political rather than technical: it is hard to identify a single entity trusted by all communicating parties and to establish common policies acceptable to all participants. Participants are reluctant to rely on other organizations to preserve the integrity of their subordinated namespaces.

Figure 8: Trusted CA list (list of trusted CAs)
Figure 9: Mesh (domains D1, D2, D3; CA1, CA2, CA3; cross-certificates X12 and X21)
Figure legend: certification authority, end entity, certificate, cross certificate.
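The bottom-up path construction described for the hierarchical model can be sketched as follows. This is a minimal illustration, not a validation routine: the `Cert` record, the `certs_by_subject` lookup table, and the `max_depth` guard are hypothetical names introduced here, and no signature, validity, or revocation checks are performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    """Hypothetical, heavily simplified certificate: names only, no keys."""
    subject: str
    issuer: str


def build_chain(target, certs_by_subject, trust_anchor, max_depth=10):
    """Walk from the EE certificate up to a certificate issued by the anchor.

    Returns the list of certificates (EE first), or None if an issuer
    certificate cannot be located or the chain grows beyond max_depth.
    """
    chain = [target]
    while chain[-1].issuer != trust_anchor:
        if len(chain) >= max_depth:
            return None  # guards against loops such as unrelated self-signed CAs
        issuer_cert = certs_by_subject.get(chain[-1].issuer)
        if issuer_cert is None:
            return None  # issuer certificate not available to the RP
        chain.append(issuer_cert)
    return chain


# Assumed toy hierarchy: TLCA -> CA1 -> alice
store = {"CA1": Cert("CA1", "TLCA")}
alice = Cert("alice", "CA1")
chain = build_chain(alice, store, "TLCA")
print([c.subject for c in chain])
```

When the peer sends the whole chain along with its EE certificate, the lookup table can be filled from the received chain instead of a directory.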
Trust List

The trusted list model (Figure 8) requires RP applications to maintain a list of trusted TLCAs. In this way, interoperability between different hierarchical PKIs is achieved at the application level of the RP. This model is the most successful commercially: most current secure applications (SSL browsers, S/MIME mailers) come pre-configured with a list of commercial trusted root CAs. However, this model presents some security weaknesses. The main problem is that it moves trust management away from the CAs and towards end users. The organization must rely on the users to take the necessary actions to handle (add, remove, check) the CAs in the trust list, to configure policies, and to keep certificate status information up to date. However, the typical user has little or no idea of what a PKI is or of the policies and operating practices of the various roots, and could easily be fooled into trusting a phoney CA. The alternative is to perform extensive management actions to configure all end users’ applications and/or workstations so as to achieve a uniform trust policy across the organization. Also, the revocation of a TLCA is nearly impossible, since it requires deleting the certificate of that TLCA from the trust list of every end user. Last, but not least, there is no way to limit the span of a specific hierarchy, because all the TLCAs in the list carry the same weight, whereas user-defined name constraints would be needed to restrict a specific hierarchy to a subset of the global namespace. Trust lists are useful when they involve only a relatively small number of globally well-known CAs, for direct use within enterprises, and/or to support interactions across a predefined set of enterprise boundaries.

Figure note: relying parties are colored in the same way as their trust anchors.
Cross-Certification

To overcome the limitations of hierarchical PKIs when interconnecting different realms, various models based on the concept of cross-certification have been proposed.
Digital Certificates and Public-Key Infrastructures
In the mesh (or network) trust model, all CAs are self-signed, and trust flows through the network via cross-certificates. An EE directly trusts only the CA that issued its certificate, and trusts another CA only if its direct CA has cross-certified the foreign CA. Because cross-certification creates a parent-child relationship between two CAs, a network PKI can also be viewed as a hierarchical PKI, with the difference that many self-signed roots and many hierarchies exist at the same time.

Figure 9 shows a sample networked PKI with three PKI domains that reciprocally trust each other through six cross-certificates. The certificate X12 represents a cross-certificate between CA1 and CA2; CA1 is the cross-certifying CA, whereas CA2 is the cross-certified CA. The certificate X21 plays the reverse role and allows CA2 to cross-certify CA1. Together, X12 and X21 create a bilateral cross-certification between domains D1 and D2. Figure 9 depicts a mesh PKI that is fully cross-certified; however, it is possible to deploy an architecture with a mixture of unidirectional and bidirectional cross-certifications.

A new CA enters a networked PKI through the process of cross-certification, in which an existing CA issues a cross-certificate for the new CA. An existing CA leaves the network by revoking its cross-certificates. The cross-certification process is transparent and does not impact the users of the network, provided that they can retrieve the cross-certificates from a global directory. The cross-certification process can also integrate an existing, foreign CA into a networked PKI without changing the relative point of trust for either PKI.

Compared to the hierarchical trust model, the network trust model might better represent peer-to-peer business relationships, where peers develop trust in each other through cross-certification instead of subordination. However, certification chain construction in a mesh PKI is more complex than in a hierarchical PKI because multiple paths can exist between a certificate and the relying party’s trust anchor. One of the major drawbacks of this model is that it requires a globally accessible directory to distribute cross-certificates to PKI clients. Without a global directory, an RP cannot generally find the cross-certificates necessary to chain the certificate of a communicating peer to the CA that it directly trusts, and the certificate validation process fails. Note that a networked peer typically sends only its own certificate and that of its CA when it communicates with another peer, which may have a different direct CA. For example, in the network of Figure 9, a user of CA1 submits its certificate and the self-signed CA1 certificate when communicating with a user of CA3. However, users of CA3 do not generally have local access to all the cross-certificates and must contact a global directory to build a proper certificate chain to their issuing CA. Care must also be taken because the certificates for bidirectional cross-certification are typically stored as certificate pairs in a directory attribute different from that used for individual certificates.

The requirement for an online, globally accessible directory of cross-certificates introduces interoperability issues if a client cannot access the directory or if the directory does not contain an up-to-date list of cross-certificates. Furthermore, users of a mesh PKI may require installation of application plug-ins because current commercial secure applications do not support cross-certification: they do not know how to access a global directory and process cross-certificates. Distributing such plug-ins to all clients in a large network can become a deployment issue. In contrast to a full mesh, a partial mesh can also be built in which not all CAs are cross-certified. In this case, the ability to perform any-to-any certificate validation is not guaranteed.
Meshes do not enable general path construction unless the necessary cross-certificates have been pre-established between one or more pairs of CAs positioned along the path. Meshes have the important advantage of being deployable in a “bottom-up” direction without depending on the prior availability of a top-level root CA: each EE is configured only with the self-signed certificate of its own CA (i.e., the one that acted as issuer for the EE). It is also possible for a hierarchical PKI to become part of a mesh. This is sometimes known as the hybrid trust model (Figure 10). In this case, only the TLCA of the hierarchy needs to cross-certify with the other CAs of the mesh.

Figure 10: Hybrid (legend: certification authority, end entity, certificate, cross certificate; relying parties are colored in the same way as their trust anchors)

In general, all trust models that make use of cross-certification present the following drawbacks:
• increased complexity of the algorithms for certification path construction
• possible creation of unwanted trust paths through transitive trust
• decrease of the security level when a high-assurance PKI with restrictive operating policies is cross-certified with a PKI with less restrictive policies
• scarce or null support of cross-certificates in commercial products.
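To make the first two drawbacks concrete, the following sketch enumerates the candidate trust paths in a mesh. It is an illustration under stated assumptions: cross-certificates are modelled as hypothetical (issuer CA, subject CA) pairs, the search is a plain breadth-first enumeration of loop-free paths, and `max_len` is an arbitrary bound introduced here; real path construction would also check signatures, constraints, and revocation.

```python
from collections import deque


def find_trust_paths(cross_certs, trust_anchor, target_ca, max_len=5):
    """List loop-free CA paths from the RP's trust anchor to the target CA."""
    issued = {}
    for issuer, subject in cross_certs:
        issued.setdefault(issuer, []).append(subject)

    paths = []
    queue = deque([[trust_anchor]])
    while queue:
        path = queue.popleft()
        last = path[-1]
        if last == target_ca:
            paths.append(path)
            continue
        if len(path) > max_len:
            continue  # give up on overly long candidate paths
        for nxt in sorted(issued.get(last, [])):
            if nxt not in path:  # never revisit a CA on the same path
                queue.append(path + [nxt])
    return paths


# Fully cross-certified mesh of three CAs, as in Figure 9 (six cross-certificates).
full_mesh = {("CA1", "CA2"), ("CA2", "CA1"), ("CA1", "CA3"),
             ("CA3", "CA1"), ("CA2", "CA3"), ("CA3", "CA2")}
# A user anchored at CA3 validating a certificate issued by CA1:
print(find_trust_paths(full_mesh, "CA3", "CA1"))
```

Even in this three-CA mesh two distinct paths exist, one direct and one transitive through CA2; in a partial mesh the same call may return no path at all.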
The ICE-TEL web of hierarchies trust model (Chadwick et al., 1997) represents a form of hybrid model. Pairwise inter-hierarchy cross-certification, as in the hybrid model, serves well to link a small number of hierarchies. This model, however, does not scale well to larger numbers because it needs N² − N cross-certificates, where N is the number of hierarchies to be interconnected. Because of this problem, the Bridge CA model, described in the next section, was developed for the U.S. Federal PKI (Burr, 1998); it reduces the number of needed cross-certification relationships to N, one between the bridge and each hierarchy.
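The scaling argument can be checked with a few lines of arithmetic. The function names below are introduced only for this illustration; the pairwise count follows the N² − N figure given above, and the bridge count assumes one cross-certification relationship per hierarchy.

```python
def pairwise_cross_certs(n):
    """Cross-certificates for full pairwise cross-certification of n hierarchies."""
    return n * n - n


def bridge_relationships(n):
    """Cross-certification relationships when every hierarchy links only to a bridge."""
    return n


for n in (3, 10, 50):
    print(n, pairwise_cross_certs(n), bridge_relationships(n))
```

For N = 3 the pairwise count is 6, which matches the six cross-certificates of the fully cross-certified mesh in Figure 9.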
Bridge CA

Another approach to the interconnection of PKIs through cross-certification is a hub-and-spoke configuration (Figure 11), also known as a “bridge certification authority” (BCA). The main role of the BCA is to act as a central point for establishing trust paths among different PKIs. In a bridge configuration, a “principal” CA (PCA) in each participating PKI cross-certifies with the central BCA, whose role is to provide cross-certificates rather than to act as the root of certification paths. Each CA can operate independently and under different certification policies. The BCA does not issue certificates to users, but only cross-certificates to the principal CAs. Compared to the hybrid PKI model, where the number of cross-certificates grows quadratically, in this environment the number of relationships grows only linearly with the number of PKIs.

Figure 11: Bridge (legend: bridge CA, certification authority, end entity, certificate, cross certificate; relying parties are colored in the same way as their trust anchors)

Although this scheme mitigates the political problem of the central trust point of hierarchical PKIs and reduces the number of cross-certificates needed in generic mesh models, it still suffers from all the other problems of PKI models based on cross-certification. Furthermore, the operation of the BCA raises other political problems: the participating organizations need to agree with the BCA operators on the certificate formats and the security policies in order to reduce the risk of unintended trust. This will most often be specified through certificate profiles as a part of the certificate policy and certification practices of the BCA and PCAs. The issue is discussed by Hesse and Lemire (2002), based on the authors’ experience with a PKI interoperability testbed built around a bridge certification authority that interconnects multiple PKIs based on CA products from several vendors (National Security Agency, 2001).
CONCLUSION

This chapter illustrated various digital certificate generation and management schemes by describing the different approaches proposed so far, together with their risks and vulnerabilities and the protections to be implemented. Reference to the standards in this field is essential for implementers to achieve interoperability of certificate-based applications and systems (such as applications that manage electronic signatures or systems that use certificates in communication protocols). Attention has been paid to certificate extensions, such as the ones defined by common profiles, because they are not just optional fields but critical elements for more complex processing tasks such as certificate validation. Usage and application of the concepts and solutions presented in this chapter will provide a reference test bed to evaluate progress in PKI and e-documents in the next few years.
REFERENCES

Arsenault, A., & Turner, S. (2002, July). Internet X.509 Public Key Infrastructure: Roadmap. IETF, Internet draft.

Burr, W. E. (1998, September). Public Key Infrastructure (PKI) technical specification: Part A – Technical concepts of operations. Retrieved from the World Wide Web: http://csrc.nist.gov/pki/twg/baseline/pkicon20b.pdf

Chadwick, D. W., Young, A. J., & Kapidzic Cicovic, N. (1997, May/June). Merging and extending the PGP and PEM trust models: The ICE-TEL trust model. IEEE Network, 11(3), 16-24.

Diffie, W., & Hellman, M. E. (1976). New directions in cryptography. IEEE Transactions on Information Theory, 6, 644-654.

Ellison, C. (2002, April). Improvements on conventional PKI wisdom. In Proceedings of 1st Annual PKI Research Workshop, NIST (pp. 165-175). Retrieved from the World Wide Web: http://www.cs.dartmouth.edu/~pki02/Ellison/paper.pdf

Ellison, C., Frantz, B., Lampson, B., Rivest, R., Thomas, B., & Ylonen, T. (1999). SPKI Certificate Theory. IETF, RFC-2693.

Ford, W., & Baum, M. S. (1997). Secure electronic commerce. Prentice Hall.

Freed, N., & Borenstein, N. (1996). Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. IETF, RFC-2045.

Garfinkel, S. (1995). PGP: Pretty good privacy. O’Reilly & Associates.

Hayes, J. M. (2001). Restricting access with certificate attributes in multiple root environments: A recipe for certificate masquerading. In Proceedings of 17th Annual Computer Security Applications Conference. Retrieved from the World Wide Web: http://www.acsac.org/2001/papers/14.pdf

Hesse, P. M., & Lemire, D. P. (2002). Managing interoperability in non-hierarchical Public Key Infrastructure. In Proceedings of Network and Distributed System Security Symposium. Retrieved from the World Wide Web: www.isoc.org/isoc/conferences/ndss/02/proceedings/papers/hesse.pdf

Housley, R., & Polk, T. (2001). Planning for PKI. New York: John Wiley & Sons, Inc.

Housley, R., Ford, W., Polk, W., & Solo, D. (1999). Internet X.509 Public Key Infrastructure Certificate and CRL Profile. IETF, RFC-2459.

Housley, R., Ford, W., Polk, W., & Solo, D. (2002). Internet X.509 Public Key Infrastructure Certificate and CRL Profile. IETF, RFC-3280.

ITU-T Recommendation X.509 & ISO/IEC International Standard 9594-8. (2000). Information technology - Open systems - The Directory: Public-key and attribute certificate frameworks.

Kohnfelder, L. M. (1978). Towards a practical public-key cryptosystem (Bachelor Thesis). Massachusetts Institute of Technology.

Linn, J. (2000). Trust models and management in Public-Key Infrastructures (Technical report). RSA Data Security, Inc.

Microsoft. (2001). Erroneous VeriSign-Issued Digital Certificates Pose Spoofing Hazard. Microsoft Security Bulletin MS01-017. Retrieved from the World Wide Web: http://www.microsoft.com/TechNet/security/bulletin/MS01-017.asp

Myers, M., Ankney, R., Malpani, A., Galperin, S., & Adams, C. (1999). X.509 Internet Public Key Infrastructure Online Certificate Status Protocol - OCSP. IETF, RFC-2560.

National Security Agency. (2001). Phase II Bridge Certification Authority Interoperability Demonstration Final Report. Retrieved from the World Wide Web: http://www.anassoc.com/Techno.htm

Pfleeger, C. P. (1997). Security in computing. Prentice Hall.

Ramsdell, B. (1999). S/MIME Version 3 Message Specification. IETF, RFC-2633.

Rivest, R. L., & Lampson, B. (1996). SDSI: A simple distributed security infrastructure. Retrieved from the World Wide Web: http://theory.lcs.mit.edu/~rivest/sdsi.html

SET - Secure Electronic Transaction Specification. (n.d.). Retrieved from the World Wide Web: http://www.setco.org/set_specifications.html

Webster’s New Collegiate Dictionary. (1980). Springfield, MA: G&C Merriam Company.
This work was previously published in Information Security Policies and Actions in Modern Integrated Systems, edited by M. Fugini and C. Bellettini, pp. 64-97, copyright 2004 by IGI Publishing, formerly known as Idea Group Publishing (an imprint of IGI Global).
Chapter 3.10
A Flexible Authorization Framework¹

Duminda Wijesekera
George Mason University, USA

Sushil Jajodia
George Mason University, USA
ABSTRACT

Advances in application areas such as Internet-based transactions, cooperating coalitions, and workflow systems have brought new challenges to access control. In order to meet the diverse needs of emerging applications, it has become necessary to support multiple access control policies in one security domain. This chapter describes an authorization framework, referred to as the Flexible Authorization Framework (FAF), which is capable of doing so. FAF is a logic-based framework in which authorizations are specified in terms of a locally stratified rule base. FAF allows permissions and prohibitions to be included in its specification. FAF specifications can be changed by deleting and inserting rules. We also describe FAF’s latest additions, such as revoking granted permissions, provisional authorizations, and obligations.

Copyright © 2008, IGI Global. Distributing in print or electronic forms without written permission of IGI Global is prohibited.

INTRODUCTION

Traditionally, access control plays an integral part in overall system security. Over the years, many different access control models have been developed, and discretionary and mandatory access control models have received considerable attention. Discretionary access control is based on having subjects, objects, and operations as primitives and policies that grant access permissions of the form (s,o,a), where subject s is allowed to execute operation a on object o. Mandatory access control is based on having clearance levels for subjects and classification levels for objects as primitives and policies that grant accesses to subjects whose clearance levels dominate those of the objects they access. These models have been used in the commercial and military domains, and implemented in operating systems, database management systems, and object-oriented systems.

Advances in application areas bring new dimensions to access control models. The need to support multiple access control policies in one security domain, Internet-based transactions, cooperating coalitions, and workflow systems has brought new challenges to access control. In response, new access control models are being proposed to address these emerging needs. The large number of access control models proposed over the years (Dobson, 1989) have each been developed with a number of pre-defined policies in mind and have thereby introduced a sense of inflexibility.

Two alternatives accommodate more than one access control model simultaneously. The first is to have more than one access control mechanism running at the same time, one for each policy. The second is to make access control an application responsibility. The first alternative calls for every application to be closely bound to its access control module, which decreases portability. The second alternative requires all applications to enforce access control consistently. Additionally, because the responsibility for enforcing access control is vested in applications, enforcement will not be held to the same rigorous standards of verification and testing that are imposed on system code. Consequently, both alternatives are undesirable.

This can be seen by considering a number of access control policies that have been used over the years (Castano, 1995). A popular policy is the closed world policy, where accesses that cannot be derived from those explicitly authorized are prohibited. A rarely used alternative is the open world policy, where accesses that are not explicitly denied are permitted. Some policies include explicit prohibitions in terms of negative authorizations.
This, coupled with generalizations and specializations of these policies to structures such as subject and object hierarchies (Bruggemann, 1992; Rabitti, 1991), yields numerous combinations. Hence, custom creation of policy enforcement mechanisms, or passing these complications on to applications, is practically infeasible. One solution to this problem has been to develop flexible authorization models (Jajodia, 2001b), where the flexibility comes from having an access control model that does not depend on any policies or meta-policies, but is capable of imposing any of them that can be specified in the syntax of the model. One of the main advantages of this approach is that access control can now reside within the system, yet it is able to impose application-specific policies. Given that there is a need for flexible access control models, the following requirements are desirable:

• Expressibility: It must be possible to model not only existing policies, such as closed world, open world, and denials take precedence policies, but also policies of emerging applications, such as provisions and obligations (to be discussed shortly).
• Decoupling Policies from Mechanisms: The primary need for flexibility is to obtain a policy-independent framework. Hence, policies expressible in such a framework must be enforceable using generic enforcement mechanisms.
• Conflict Resolution: Having a flexible framework may invite conflicting policies and, consequently, the framework must be able to facilitate their resolution.
• Efficiency: Due to the high frequency of requests coming to access control systems, their processing must be fast. Thus, efficient and simple mechanisms to allow or deny access requests are crucial.
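The difference between the policies named under Expressibility can be illustrated with a small sketch. The function and its parameters are hypothetical names chosen for this example; it assumes that the signs derived for a single request are already known and shows how closed world, open world, and denials-take-precedence choices turn them into a decision.

```python
def decide(derived_signs, meta_policy="closed"):
    """Return True (grant) or False (deny) for one access request.

    derived_signs: set containing '+' and/or '-' for the authorizations
    derivable for the request; an empty set means nothing is derivable.
    meta_policy: 'closed' denies underivable requests, 'open' grants them.
    """
    if "-" in derived_signs:
        return False  # denials take precedence, also over a conflicting '+'
    if "+" in derived_signs:
        return True
    return meta_policy == "open"  # nothing derivable: the meta-policy decides


print(decide({"+", "-"}))             # conflicting authorizations
print(decide(set(), "closed"))        # underivable request, closed world
print(decide(set(), "open"))          # underivable request, open world
```

The sketch hard-codes one conflict-resolution rule; a flexible framework would let that rule, like the meta-policy, be part of the specification.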
In this chapter, we describe a logic-based framework to specify authorizations in the form of rules, referred to as the Flexible Authorization Framework (FAF). In FAF, authorizations are specified using Prolog-style rules. These rules may include both positive and negative authorizations. By placing syntactic restrictions on authorization specification rules, FAF ensures that every specification has a unique stable model. In addition, every FAF specification is complete: for every authorization request, FAF either grants it or denies it.

The flexibility in FAF comes from not having any pre-defined meta-policy such as the closed world or the open world policy. The former prohibits every access whose permission cannot be derived, and the latter permits every access whose prohibition cannot be derived. In fact, such meta-policies can be specified as FAF rules, making FAF applicable to a large number of application scenarios. Furthermore, by materializing FAF rules, FAF derivations can be made efficient.

Due to the changing nature of applications, it may be necessary to change the rules that specify an authorization policy applicable to an application. We later describe how FAF specifications can be changed by changing the FAF rule base, and how these changes affect the materialization. The dynamic nature of applications may also require the flexibility to revoke already granted permissions. The effect of such permission revocation, including its effect on the materialization, is described.

Not all authorizations are absolute: an authorization may depend upon the subject satisfying some condition to obtain the access. As an example, a website offering electronic loans may require a potential borrower to register and prove her credit-worthiness to obtain a loan. Once the loan is granted, the borrower is obligated to pay back the loan. The latter is an example where an authorization is granted based on an obligation, that is, a requirement that needs to be met by a subject after the access has been granted. An extension of FAF that incorporates provisions and obligations is described.
FAF: THE FLEXIBLE AUTHORIZATION FRAMEWORK

The Flexible Authorization Framework (FAF) of Jajodia et al. (Jajodia, 2001; Jajodia, 2001b) is a logic-based framework to specify authorizations in the form of rules. It uses a Prolog-style rule base to specify access control policies that are used to derive permissions. It is based on four stages that are applied in sequence, as shown in Figure 1.

In the first stage, some basic facts, such as authorization subject and object hierarchies (for example, directory structures) and a set of authorizations, along with rules to derive additional authorizations, are given. The intent of this stage is to use structural properties to derive permissions. Hence, these rules are called propagation policies. Although propagation policies are flexible and expressive, they may result in over-specification (i.e., rules could be used to derive both negative and positive authorizations that may be contradictory). To avoid conflicting authorizations, the framework uses conflict resolution policies, which comprise the second stage. At the third stage, decision policies are applied to ensure the completeness of authorizations, so that a decision is made to either grant or deny every access request. This is necessary because the framework makes no assumptions with respect to underivable authorizations, such as the closed policy. The last stage consists of checking integrity constraints, where all authorizations that violate integrity constraints are denied. In addition, FAF ensures that every access request is either honored or rejected, thereby providing a built-in completeness property.

FAF syntax consists of terms that are built from constants and variables (no function symbols), and they belong to four sorts: subjects, objects, actions, and roles. We use the notation Xs, Xo, Xa, and Xr to denote variables belonging to the respective sorts, and lower case letters such as s, a, o, and r for constants. For predicates, FAF has the following:

• A ternary predicate cando(Xs,Xo,Xa), representing grantable or deniable requests (depending on the sign associated with the action), where Xs, Xo, and Xa are a subject, an object, and a signed action term, respectively.
• A ternary predicate dercando(Xs,Xo,Xa), with the same arguments as cando. The predicate dercando represents authorizations derived by the system using logical rules of inference [modus ponens plus rule of stratified negation (Apt, K. R., 1988)].
• A ternary predicate do, with the same arguments as cando, representing the access control decisions made by FAF.
• A 4-ary predicate done(Xs,Xo,Xa,Xt), meaning that subject Xs has executed action Xa on object Xo at time Xt.
• Two 4-ary predicate symbols over. The first takes as arguments two object terms, a subject term, and a signed action term. The second takes as arguments a subject term, an object term, another subject term, and a signed action term. They are needed in the definitions of some of the overriding policies.
• A propositional symbol error indicating violation of an integrity constraint. That is, a rule with an error head must not have a satisfiable body.
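The four stages can be illustrated with a deliberately small sketch. Everything below is an assumption made for illustration rather than FAF itself: authorizations are stored as (subject, object, sign, action) tuples, propagation simply inherits authorizations down an object hierarchy, conflicts are resolved by letting denials take precedence, underivable requests are denied, and the integrity constraint is a hypothetical set of forbidden triples.

```python
def ancestors(obj, parent_of):
    """Yield obj and every object above it in the (assumed) object hierarchy."""
    while obj is not None:
        yield obj
        obj = parent_of.get(obj)


def dercando(subject, obj, action, cando, parent_of):
    """Stage 1 (propagation): signs of explicit authorizations on obj or an ancestor."""
    scope = set(ancestors(obj, parent_of))
    return {sign for (s, o, sign, a) in cando
            if s == subject and a == action and o in scope}


def do(subject, obj, action, cando, parent_of, forbidden=frozenset()):
    """Stages 2-4: return '+' to grant or '-' to deny; every request gets an answer."""
    signs = dercando(subject, obj, action, cando, parent_of)
    # Stage 2 (conflict resolution): here, denials take precedence.
    # Stage 3 (decision): here, anything not positively derived is denied.
    granted = "+" in signs and "-" not in signs
    # Stage 4 (integrity): requests violating a constraint are denied.
    if (subject, obj, action) in forbidden:
        granted = False
    return "+" if granted else "-"


cando = {("alice", "usr", "+", "read"),
         ("alice", "usr/local", "-", "read"),
         ("bob", "usr", "+", "write")}
parent_of = {"usr/local": "usr"}  # usr/local lies below usr

print(do("alice", "usr", "read", cando, parent_of))
print(do("alice", "usr/local", "read", cando, parent_of))
print(do("bob", "usr/local", "write", cando, parent_of))
```

In FAF each of these choices would be written as rules over cando, dercando, do, and error rather than fixed in code, which is what allows a different policy to be imposed without changing the enforcement mechanism.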
Other terms and predicates are necessary to model specific applications. In our examples, we use constants AOH and ASH to denote the authorization object and subject hierarchies, respectively. We use a ternary predicate in, where in(x,y,H) denotes that x < y in hierarchy H. For example, in(usr\local,usr,AOH) says that usr\local is below usr in the authorization object hierarchy AOH.

Obtaining a locally stratified logic program requires a stratification of rules. FAF is stratified by assigning levels to predicates (literals) as given in Table 1, and the level of a rule is the level of its head predicate.² As a logic program, any FAF specification is locally stratified by this level assignment, because the level of a head predicate is not less than the levels of the predicates in its body. For any FAF specification AS, AS_i denotes the rules of AS belonging to the i-th level. Because any FAF specification is a locally stratified logic program, it has a unique stable model (Gelfond, 1988), and a well-founded model (as in Gelfond & Lifshitz, 1988). In addition, the well-founded model coincides with the unique stable model (Baral, 1992; Jajodia, 2001b). Furthermore, the unique stable model can be computed in quadratic time data complexity (van Gelder, 1989). See Jajodia (2001b) for details. Following Jajodia (2001b), we use the notation M(AS) to refer to this unique stable model of specification AS.

Access requests are frequent, so evaluating each of them against the rule base would require long execution times. Changes to access control specifications, in contrast, are relatively few. Therefore, to optimize the processing of requests, a materialization architecture has been proposed in Jajodia (2001b), where instances of derived predicates are maintained. To be able to incrementally update computed materializations upon changes to specifications, Jajodia (2001b) maintains a materialization structure that associates each instance of valid predicates with the rules that directly support its truth. Because predicates belong to strata as stated in Table 1, the materialization structure can be constructed in levels corresponding to them.
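The level condition stated above (the level of a head predicate is not less than the levels of its body predicates) can be checked mechanically. In this sketch the level numbers are placeholders assumed only for illustration and are not taken from Table 1; rules are reduced to a head predicate name and a list of body predicate names.

```python
# Assumed, illustrative levels; Table 1 remains the authoritative assignment.
ASSUMED_LEVELS = {"hie": 0, "rel": 0, "done": 0,
                  "cando": 1, "dercando": 2, "do": 3, "error": 4}


def respects_levels(rules, levels=ASSUMED_LEVELS):
    """True if no rule has a body predicate at a higher level than its head."""
    return all(levels[head] >= levels[pred]
               for head, body in rules
               for pred in body)


ok_rules = [("dercando", ["cando", "hie"]), ("do", ["dercando"])]
bad_rules = [("cando", ["do"])]  # a cando rule may not depend on do

print(respects_levels(ok_rules))
print(respects_levels(bad_rules))
```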
MATERIALIZATION

In FAF, an access request must be presented to the rule base as a query of the form ?do(s,o,+a). In computing the result of this query, the rule base produces instances of literals such as do, dercando, cando, and possibly others, because the rule execution engine needs to backtrack through rule chains. Conversely, if all valid instances of these literals are known, then the execution of the query ?do(s,o,+a) is much faster because backward chaining is eliminated. To obtain this efficiency, Jajodia (2001b) constructs a materialization structure that stores all valid instances of literals. We now present the materialization structure given in Jajodia (2001). In the following description, we use the notation head(r) and body(r) for the head and body of rule r, respectively.
Definition 1: Materialization Structure

The materialization structure for an authorization specification AS is a set of pairs (A,S), where A is a ground atom in the authorization specification language and S is a set of (indices of) rules whose head unifies with A.

Definition 2 gives the relationship among a specification, its stable model semantics, and the materialization structure.

Definition 2: Correctness of Materialization Structures

Let AS be an authorization specification and let MS be a materialization structure. We say that MS correctly models AS if for any pair (A,S) ∈ MS, the following conditions hold:

• A ∈ M(AS) (i.e., A belongs to the model of the authorization specification).
• For each A ∈ M(AS), there is at least one pair (A,S) ∈ MS.
• For all rules r such that θ is the most general unifier of head(r) and A, r ∈ S iff the existential closure of body(r)θ is true in M(AS).
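As an illustration of Definitions 1 and 2, the sketch below builds a materialization structure for a tiny set of ground rules. It is a simplification under stated assumptions: rules are already ground, bodies contain only positive atoms (negation and strata are ignored), facts are rules with empty bodies, and the structure is held in a Python dictionary from atoms to sets of rule indices.

```python
def materialize(rules):
    """Map each derivable ground atom to the indices of the rules supporting it.

    rules: list of (head, body) pairs, where body is a tuple of ground atoms.
    A rule index is recorded for its head once every body atom is derivable.
    """
    ms = {}
    changed = True
    while changed:  # naive bottom-up evaluation to a fixed point
        changed = False
        for idx, (head, body) in enumerate(rules):
            if all(atom in ms for atom in body) and idx not in ms.get(head, set()):
                ms.setdefault(head, set()).add(idx)
                changed = True
    return ms


rules = [
    ("cando(alice,file,+read)", ()),                                   # 0
    ("in(alice,staff,ASH)", ()),                                       # 1
    ("cando(staff,file,+read)", ()),                                   # 2
    ("dercando(alice,file,+read)", ("cando(alice,file,+read)",)),      # 3
    ("dercando(alice,file,+read)",
     ("in(alice,staff,ASH)", "cando(staff,file,+read)")),              # 4
    ("do(alice,file,+read)", ("dercando(alice,file,+read)",)),         # 5
]
ms = materialize(rules)
print(ms["dercando(alice,file,+read)"])
```

The atom dercando(alice,file,+read) is supported by two rules, so it would remain in the model if only one of them were later removed; with the structure in place, answering ?do(alice,file,+read) becomes a dictionary lookup.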
According to Definitions 1 and 2, a materialization structure that correctly models an authorization specification AS contains a pair (A,S) for each atom A that is true in the (unique stable) model of AS, where S contains the indices of the rules that directly support the truth of A. When instances of atoms are added to or deleted from a specification AS by adding or removing rules, corresponding changes need to be reflected in its materialization structure so that the updated materialization structure is correct with respect to the updated model. The update is reflected by adding indices to, or removing indices from, the set S of supporting rules. In this situation, an atom will be deleted from the materialization only when its support S becomes empty. The materialization structure is changed using two algebraic operators ⊕ and ⊗. Operators ⊕ and ⊗, respectively, add and remove a pair (A,S) to/from a materialization structure, and are defined as follows:
Definition 3: ⊕ and ⊗
Let MS(AS) be a materialization structure, A a ground instance of a literal, and S a set of rules. Then ⊕ and ⊗ are as given in Box 1.

Box 1. Definition of ⊕ and ⊗
• MS(AS) ⊕ (A,S)
  = MS(AS) ∪ {(A,S)} if ¬∃ (A,S’) ∈ MS(AS)
  = MS(AS) − {(A,S’)} ∪ {(A, S’ ∪ S)} otherwise.
• MS(AS) ⊗ (A,S)
  = MS(AS) if ¬∃ (A,S’) ∈ MS(AS) such that S ∩ S’ ≠ ∅
  = MS(AS) − {(A,S’)} if ∃ (A,S’) ∈ MS(AS) such that S ⊆ S’
  = MS(AS) − {(A,S’)} ∪ {(A, S’ − S)} if ∃ (A,S’) ∈ MS(AS) such that S ∩ S’ ≠ ∅ and S’ ≠ S

Given a materialization structure MS of an authorization specification AS, the model M of AS is then the projection over the first element of the pairs, written M = ∏1(MS). MSi and Mi denote the materialization structure and the model at stratum ASi, respectively.
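As an illustration only, the two operators can be sketched in Python. The sketch assumes a materialization structure is held as a dictionary from ground atoms to sets of rule indices; the names `oplus`, `otimes`, and `model` are hypothetical. It reads the removal cases in line with the prose above: the indices in S are subtracted from the stored support, and the pair is dropped only when no support remains.

```python
from typing import Dict, FrozenSet, Hashable, Set

Atom = Hashable
MS = Dict[Atom, FrozenSet[int]]  # ground atom -> indices of its supporting rules


def oplus(ms: MS, atom: Atom, support: Set[int]) -> MS:
    """MS ⊕ (A, S): add the pair, merging S into any existing support of A."""
    out = dict(ms)
    out[atom] = out.get(atom, frozenset()) | frozenset(support)
    return out


def otimes(ms: MS, atom: Atom, support: Set[int]) -> MS:
    """MS ⊗ (A, S): withdraw the rules in S from the support of A."""
    out = dict(ms)
    if atom not in out or not (out[atom] & frozenset(support)):
        return out  # no overlapping support: nothing changes
    remaining = out[atom] - frozenset(support)
    if remaining:
        out[atom] = remaining  # A is still supported by other rules
    else:
        del out[atom]  # support became empty: A leaves the model
    return out


def model(ms: MS) -> Set[Atom]:
    """M = ∏1(MS): projection over the first element of the pairs."""
    return set(ms)
```

Returning a new dictionary rather than updating in place keeps each operator a pure function, which mirrors the algebraic style of Box 1.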
Computing the Materialization Structure
Computing the unique stable model of an authorization specification AS is an iterative process that at each step i computes the least model of AS ∪ M(ASi-1), where M(ASi-1) is the least model of the stratum ASi-1. The materialization algorithm presented next follows this process.
Algorithm 1: The Materialization Algorithm

The Base Step of Constructing M0: hie-, rel-, and done predicates are the only ones present in AS0. Hence, M0 is constructed as the union of these base relations, where IA is the set of (indices of) rules that support A.
MS0 = {(A, IA) : A is a hie-, rel-, or done fact}.

Inductive Case where ASn+1 has no Recursive Rules: Suppose that we have constructed MSn, and the stratum ASn+1 does not have recursive rules. Then MSn+1 is defined as follows, where c refers to a ground instance of the predicate p(x):
MSn+1 = ⊕{(p(c), {r}) : r is a rule in ASn+1, θ is grounding, head(r)θ = p(c), and MSn ⊨ body(r)θ}

The Inductive Case where ASn+1 has Recursive Rules: We use a differential fix-point evaluation procedure, as follows:
1. Split the body of each rule r ∈ ASn into two sets, where the first set, denoted Dr, contains all the recursive literals and the second set, denoted Nr, contains all the non-recursive literals of r. Evaluate the conjunction of the non-recursive literals against ∏1(MS0 ∪ … ∪ MSn), the materialized model of all strata up to and including n. Store the result as a materialized view Vr. Rewrite r as the rule rrew given by head(r) ← Vr ∧ {A : A ∈ Dr}. Let tr(ASn) be the set of all rules {rrew | r ∈ ASn}. tr(ASn) and ASn are logically equivalent [see Jajodia (2001b) for the proof]. Hence, we compute the materialization with respect to tr(ASn) instead of ASn.
2. Let MS be any materialization structure. Define the program transformation Φ(ASn+1) as follows, where θ is a grounding substitution:
Φ(ASn+1)(MS) = {(p(c), {r}) | rrew ∈ tr(ASn), head(rrew)θ = p(c), Vr ∪ ∏1(MSn) ⊨ body(rrew)θ}
3. The set of all materialization structures is a complete lattice with respect to subset inclusion, and the operator Φ(ASn) is monotone and continuous on that lattice. Therefore, by the Knaster-Tarski theorem (Tarski, 1955), it follows that Φ(ASn) has a least fixed point. Define MSn+1 to be ⊕LFP(Φ(ASn)(∅)), where LFP(Φ(ASn+1)(MS)) denotes the least fixed point of Φ(ASn+1).
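The fixed-point step can be illustrated with a deliberately simplified Python sketch. It assumes rules are already ground and negation-free, and it uses naive iteration rather than the differential evaluation described above; the function name `materialize`, the tuple layout of a rule, and the sample atoms are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Rule = Tuple[int, str, List[str]]  # (rule index, ground head, ground body atoms)
MS = Dict[str, FrozenSet[int]]     # ground atom -> indices of supporting rules


def materialize(rules: List[Rule], base: MS) -> MS:
    """Iterate to a least fixed point, recording which rules support each atom."""
    ms = dict(base)
    changed = True
    while changed:
        changed = False
        for index, head, body in rules:
            # A rule supports its head once every body atom is in the model.
            if all(atom in ms for atom in body):
                support = ms.get(head, frozenset())
                if index not in support:
                    ms[head] = support | {index}  # the ⊕ step
                    changed = True
    return ms


# Hypothetical stratum: two base facts and a small recursive program.
base_facts = {"edge(a,b)": frozenset({1}), "edge(b,c)": frozenset({2})}
program = [
    (12, "reach(a,c)", ["reach(a,b)", "reach(b,c)"]),
    (10, "reach(a,b)", ["edge(a,b)"]),
    (11, "reach(b,c)", ["edge(b,c)"]),
    (13, "reach(a,c)", ["edge(a,c)"]),  # body never holds, so 13 supports nothing
]
result = materialize(program, base_facts)
```

Because rule 13 never fires, `reach(a,c)` ends up supported by rule 12 alone, which is the property Definition 2 asks of a correct materialization structure.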
We can use the above algorithm to materialize any FAF specification, all the way from AS0 to AS4. In using the materialization algorithm as stated, we need to apply the base step at stratum 0, recursive steps at strata 2 and 4, and the nonrecursive steps at strata 1, 3, and 5. Then, the computation of decision and integrity conflict views would become faster. The following theorem, proved in Jajodia (2001b), states that the above procedure is sound and complete.

Theorem 1: (Correctness of the Materialization Structure) Let AS = ∪{ASi : 0 …

The target element indicates that if the resource is a medical record and the action is to read it, then this policy set may be applied to determine the authorization decision response. Within the target, one element defines the applicable resources or resource types, and, similarly, another defines the applicable actions or action types. A target may also contain an element that defines to which subjects the policy set or policy is applicable. Because this target has no such element, it does not filter any subjects. This policy set is organized so that filtering of subjects (are they hospital staff or not hospital staff) is done at the level of the two contained policies.

Target values: resource http://example.org/schemas/Hospital/Medical_Record.xsd, action read.
Protecting Privacy Using XML, XACML, and SAML
Listing 4. XACML context request for the medical record scenario
Listing 5. XACML context response for the medical record scenario
cal record selected from a search of female patients being treated for Tendonitis). In order to form the XACML context request, the XACML context handler must obtain any necessary information that is not directly in the service request. Recall that Patient Judy has consented to allow Researchers to access the medical conditions part of her patient record but only for the purpose of research. Because, in our scenario, that information is not in the service request, when Researcher M. Curie’s service request attempts to operate on Patient Judy’s medical record, the XACML context handler will need to obtain that information about the purpose of the action. One way of obtaining the purpose would be through a SAML attribute request. If Researcher M. Curie has not already stipulated that she is running the queries for the purpose of research, the attribute request could result in M. Curie’s application notifying her that she must verify that she is running the query for research. Such verification could then be securely logged for future reference in privacy audits. With the necessary attribute information, the XACML context handler would form an XACML context request like the one in Listing 4. The XACML context request contains the following elements:
• An element that specifies the subject as a Researcher from the University.
• An element that specifies the relevant element of Patient Judy’s Medical Record as the resource.
• An element that specifies read as the action.
As the PDP evaluates the XACML context request (Listing 4) against the policy in Listing 3, the following steps occur:
1. The XACML context request is evaluated against the policy set target, which indicates that the policy set applies to read operations on patient medical records. Because the resource and action specified in the XACML context request match the target of the policy set (which does not filter subjects), the PDP determines that the policy set is applicable to the request.
2. The XACML context request is evaluated against the “Hospital Staff” policy target (which indicates the policy is only for hospital employees). As the subject of the XACML context request is not hospital staff, this policy is ignored by the PDP.
3. The XACML context request is evaluated against the “Non-Hospital Staff” policy target, which indicates the policy is only for subjects outside the hospital. Because the subject of the XACML context request indicates she is one, the “Non-Hospital Staff” policy is deemed applicable, and so its rules are evaluated against the XACML context request.
4. The target of the sole rule of the “Non-Hospital Staff” policy matches the subject and resource attributes specified in the XACML context request. As the rule also contains an element holding conditions, those conditions must be evaluated to determine the effect of the rule. The conditions enforce the privacy rule that the action on the resource (reading the data about Patient Judy’s medical conditions) must be for the purpose of research. Once the PDP (through the XACML context handler), in conjunction with the Hospital I&AM infrastructure, has determined this to be true, the rule effect of “Permit” takes force.
5. Because the “Non-Hospital Staff” policy’s rule combining algorithm and the policy set’s policy combining algorithm are both set as permit-overrides, the PDP creates an XACML Context Response, shown in Listing 5, stating a “Permit” decision.
6. The XACML context handler transforms the XACML Context Response into a format pertinent to the policy enforcement point and forwards it. The PEP forwards the Researcher’s service request to the Hospital Patient Medical Records database and retrieves Patient Judy’s physical data and medical conditions information.
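The evaluation steps above can be sketched as a toy policy decision point in Python. This is not the XACML schema or any library API: targets are modelled as plain attribute dictionaries, every attribute name (`subject-group`, `subject-role`, `purpose`, and so on) is a hypothetical stand-in for the content of Listings 3 and 4, and the Indeterminate result of real XACML is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Request = Dict[str, str]


@dataclass
class Rule:
    effect: str                     # "Permit" or "Deny"
    target: Dict[str, str]
    condition: Optional[Callable[[Request], bool]] = None


@dataclass
class Policy:
    name: str
    target: Dict[str, str]
    rules: List[Rule] = field(default_factory=list)


def matches(target: Dict[str, str], request: Request) -> bool:
    """A target matches when every attribute it names has the required value."""
    return all(request.get(key) == value for key, value in target.items())


def permit_overrides(decisions: List[str]) -> str:
    """Simplified permit-overrides: any Permit wins, then any Deny."""
    if "Permit" in decisions:
        return "Permit"
    if "Deny" in decisions:
        return "Deny"
    return "NotApplicable"


def evaluate_rule(rule: Rule, request: Request) -> str:
    if not matches(rule.target, request):
        return "NotApplicable"
    if rule.condition is not None and not rule.condition(request):
        return "NotApplicable"
    return rule.effect


def evaluate_policy(policy: Policy, request: Request) -> str:
    if not matches(policy.target, request):
        return "NotApplicable"  # steps 2 and 3: policy targets filter subjects
    return permit_overrides([evaluate_rule(r, request) for r in policy.rules])


# Step 1: the policy set target names only resource and action, not subjects.
POLICY_SET_TARGET = {
    "resource": "http://example.org/schemas/Hospital/Medical_Record.xsd",
    "action": "read",
}
POLICIES = [
    Policy("Hospital Staff", {"subject-group": "hospital-staff"},
           [Rule("Permit", {})]),
    Policy("Non-Hospital Staff", {"subject-group": "non-hospital-staff"},
           [Rule("Permit",
                 {"subject-role": "researcher", "resource-part": "medical-conditions"},
                 condition=lambda req: req.get("purpose") == "research")]),
]


def decide(request: Request) -> str:
    if not matches(POLICY_SET_TARGET, request):
        return "NotApplicable"
    return permit_overrides([evaluate_policy(p, request) for p in POLICIES])
```

A request from a non-hospital researcher reading the medical-conditions part for the purpose of research yields "Permit"; the same request with any other purpose finds no applicable rule.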
SAML AND PRIVACY
SAML, the Security Assertion Markup Language developed by OASIS, enables enterprise applications to share authentication, attribute, and authorization decision assertions. An authentication assertion testifies that the subject of the assertion has been securely identified. In the context of the current discussion, a user could identify himself on one system (an identity provider) and have an authentication assertion sent by that identity provider to another system (a service provider), allowing the user to procure services from the service provider’s system just as if he had signed on to it directly. Attribute assertions provide attribute information about a subject, such as characteristics and preferences. An authorization decision assertion indicates whether an action by a subject on a resource is authorized. A PEP could use the SAML authorization decision assertion protocol to obtain an authorization decision from another organization’s PDP. Or, as described previously, a PDP could use SAML to obtain attributes about a user from another organization’s I&AM infrastructure. SAML’s enabling of authentication, attribute, and authorization decision sharing across trust domains makes it invaluable, both in its own right and as a companion to XACML, for facilitating the implementation of privacy policies.
Pseudonymous Identifiers
The latest version of SAML, “Assertions and Protocols for the OASIS Security Assertion Markup Language (SAML) V2.0” (Cantor, 2005; Madsen, 2005), incorporates techniques from earlier work by the Liberty Alliance Project (Simon, 2003) that allow a user to coordinate e-services of different organizations across trust domains for a particular task, yet ensure that only the minimum amount of information about that user necessary to accomplish that e-service is shared among them. This aspect of SAML 2.0 is particularly valuable where, for example, consumers want to use a vendor’s Web site to make an auxiliary purchase but do not want the vendor to track their purchasing habits or collect other, potentially personal data. As long as domain B trusts the assertions of domain A that entity X is qualified to do action Y or has certain attributes, then domain B need not know anything more about entity X. In the medical records scenario, the Researcher (whose identity is managed by the University) would be able to collect data from the Hospital’s patient medical records for research without divulging her identity to the Hospital. Through SAML, the Hospital is able to verify that the requestor of the patient data has been authenticated through the University and is a Researcher.

Figure 6 illustrates how pseudonymous identifiers would be used in a slightly expanded version of the medical records scenario where the Researcher wants to use an e-service of Pharmaceutical Inc. to analyze the data collected from the hospital. In Figure 6, the Researcher signs on to the Hospital (a service provider), which, behind the scenes, obtains an authentication assertion from the University (acting as an identity provider). The assertion from the University assures the Hospital that the user signing on is an authentic University Researcher but does not need to divulge which one; it just specifies that the Hospital is to use the identifier “Rad_Chic” for the Researcher. In conducting her research through the Hospital’s e-services, the Researcher decides to use Pharmaceutical’s e-services, so she also signs on to the Pharmaceutical enterprise applications using a second authentication assertion from the University, accomplished in the same way as with the Hospital except that the Pharmaceutical identifier assigned for the Researcher will be “RX_1867”. The Hospital and the Pharmaceutical can then perform services on behalf of the Researcher but cannot cross-link her identity because each has a different identifier for her.

Figure 6. SAML: Using pseudonymous identifiers for privacy

To use the pseudonymous identifiers within the service request, the sender-vouches technique described in the Web Services Security: SAML Token Profile specification (Hallam-Baker, 2004) could be employed. With this technique, the Web Services SOAP message is signed by an attesting entity (the sender), who also vouches for the subject’s identity through an associated (signed) reference to a SOAP security token. If that SOAP security token uses SAML 2.0’s opaque identifier functionality, the receiver would not be able to deduce the identity of the subject. The Patient Medical Record database of Figure 1 would then be accepting requests signed by a notary acting on behalf of the Researcher rather than the Researcher herself.
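One way an identity provider might hand out a different opaque identifier to each service provider is sketched below. This is an assumption made for illustration, not something the SAML specification or this chapter prescribes: the pseudonym is derived with a keyed hash (HMAC) over the service provider name and the local user name, and the function name `pseudonym`, the secret, and the sample names are hypothetical.

```python
import hashlib
import hmac


def pseudonym(idp_secret: bytes, user_id: str, service_provider: str) -> str:
    """Derive a stable, opaque identifier for one user at one service provider.

    Only the identity provider knows idp_secret, so a service provider cannot
    reverse the value or recompute the identifier used at another provider.
    """
    message = f"{service_provider}|{user_id}".encode("utf-8")
    return hmac.new(idp_secret, message, hashlib.sha256).hexdigest()[:16]


# Hypothetical values for the University acting as identity provider.
secret = b"university-idp-secret"
at_hospital = pseudonym(secret, "m.curie", "hospital.example.org")
at_pharma = pseudonym(secret, "m.curie", "pharmaceutical.example.com")
```

The two values differ, so the Hospital and the Pharmaceutical cannot cross-link the Researcher by comparing identifiers, while repeated sign-ons at the same provider keep yielding the same value.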
Listing 6. SAML attribute query by the Pharmaceutical
Listing 7. SAML attribute assertion response by the Researcher
Communicating Consent
Besides pseudonymous identifiers, SAML also protects the privacy of its users through a Consent attribute that can be specified on SAML requests and responses. This mechanism allows, for example, federating parties to indicate that consent was obtained from a user during single sign-on. A service provider’s privacy policy could then require that information may only be collected about a user if that user’s consent has been so indicated. In Figure 6, the Researcher has, through her University acting as an identity provider, requested drug information from the Pharmaceutical organization. The Pharmaceutical organization may wish to gather information about the Researcher such as the name of her University department. SAML’s consent mechanism enables the Researcher to grant, or deny, that consent.
Example
Listing 6 and Listing 7 illustrate a SAML transaction whereby the Pharmaceutical queries the Researcher for the name of her department, and the Researcher responds. In Listing 6, the attribute query issued by the Pharmaceutical identifies the subject as “RX_1867”, the pseudonym specified for the Researcher. Though the Pharmaceutical does not know the identity of the Researcher, it is interested in knowing her departmental affiliation and so requests the value of that attribute. The response to the attribute query contains an assertion, issued by the University, about the departmental affiliation of the Researcher: she is from the Department of Medicine. As in the attribute query of Listing 6, the Researcher is identified in the response through her Pharmaceutical-provided pseudonym “RX_1867”. It is important to note that the Consent attribute in the response indicates that the response is sent with the consent of the subject (who is specifically the Researcher herself, not the University). Also note the element by which the issuer (the University) identifies the intended recipients of the assertion information. The SAML specification details how to use XML Signature to sign SAML messages and assertions in order to assure their integrity and authenticity. For readability, the signatures are not shown in Listing 6 and Listing 7.
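A service provider’s rule that information may be collected only when consent has been indicated could be sketched as follows. The identifiers are the consent URIs defined in the SAML 2.0 core specification; treating exactly these four as "consent given" is an illustrative policy choice, and the function name `may_collect` is hypothetical.

```python
from typing import Optional

CONSENT_PREFIX = "urn:oasis:names:tc:SAML:2.0:consent:"

# Identifiers that indicate the principal's consent was obtained in some form.
CONSENT_GIVEN = {
    CONSENT_PREFIX + "obtained",
    CONSENT_PREFIX + "prior",
    CONSENT_PREFIX + "current-implicit",
    CONSENT_PREFIX + "current-explicit",
}


def may_collect(consent: Optional[str]) -> bool:
    """Return True only if the message's Consent attribute signals consent.

    A missing attribute, or a value such as 'unspecified' or 'unavailable',
    is treated conservatively as no consent.
    """
    return consent in CONSENT_GIVEN
```

With this policy, a response carrying the `current-explicit` identifier would let the Pharmaceutical record the department name, whereas a response without a Consent attribute would not.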
FUTURE TRENDS
At the beginning of this chapter, it was stated that new Web technologies such as Web Services and Service-Oriented Architecture (SOA) will make possible levels and types of e-services far beyond the sophistication of what is seen today. Both SAML and XACML are important technologies for realizing the next generation of e-services, not only because of their security capabilities for cross-domain authentication and authorization, but also because they can support the protection of individuals’ privacy. The potential for linking various domains’ e-services is powerful, yet it raises many questions regarding privacy. For example, not only might a different domain have a different privacy policy, but it may also be located in a different jurisdiction and thus be governed by different privacy laws (though it should be noted that many governments are working to ensure cross-border compatibility of their privacy legislation). In addition, technologies such as XACML and P3P (Platform for Privacy Preferences) can allow individual users to specify their privacy preferences, which could then influence the definition and/or the result of privacy policies. Determining the proper application of privacy laws and privacy policies across domains is not new in itself. What is new is how this may be done in an automatable manner now that technologies like SAML and XACML make it possible to exchange and process high-level, machine-readable, privacy-related information. As an example, a Web vendor’s e-service may in turn require a type of e-service for credit checking. In order to find a particular instance of that service, it may turn to a registry to determine what instantiations of that type of credit-checking e-service are available. Besides the usual criteria such as price and features, an additional criterion can come into play: the content of the e-service’s privacy policy (e.g., how long client information will be retained) and how well it fits with the vendor’s privacy policy and the privacy preferences of its client. The challenge, then, is to explore how cross-enterprise (and perhaps cross-jurisdictional) e-services can be engineered so that the respective privacy requirements of all the privacy stakeholders can be satisfied in a fully automated manner.
CONCLUSIONS
XACML and SAML are important technologies for expressing and enforcing privacy policies in a world of e-services. XACML policies can specify what data is to be protected, who can access it, and what actions can be performed on it, and can require that actions be performed for a limited set of purposes. SAML assertions can be used to authenticate users and provide attributes about them without revealing the full details of their identity. It is important to understand that while information security and privacy are connected, they are not the same: security tends to be the art of making an intentionally malicious act difficult to achieve, whereas privacy focuses on stating rules of behavior, assuming (with the encouragement of audit trails and the law) that humans and applications will behave accordingly. For example, even though SAML pseudonymous identifiers prevent normal collusion among service providers, this feature would likely not thwart hackers who might use traffic analysis to cross-link network identities. XACML and SAML provide the foundations for privacy, but fully protecting privacy requires techniques beyond the scope of this chapter. The science of applying XACML and SAML security technologies to privacy issues is in its earliest stages. As products supporting XACML and SAML become ensconced in enterprise infrastructures, there will no doubt be further interest in exploring how they can be harnessed to support privacy protection. For example, with regard to XACML in particular, it will be important to ensure that enterprise-wide, and even inter-enterprise, privacy policies can be efficiently managed and tested.
ACKNOWLEDGMENTS
Thanks to Tim Moses of Entrust and Paul Madsen of NTT for their reviews and comments.
REFERENCES
Cantor, S., Kemp, J., Philpott, R., & Maler, E. (2005). Assertions and protocols for the OASIS Security Assertion Markup Language (SAML) V2.0. Retrieved May 16, 2005, from http://docs.oasis-open.org/security/saml/v2.0/saml-core2.0-os.pdf
Hallam-Baker, P., Kaler, C., Monzillo, R., & Nadalin, A. (2004). Web Services security: SAML token profile. Retrieved May 16, 2005, from http://docs.oasis-open.org/wss/oasis-wss-samltoken-profile-1.0.pdf
Madsen, P. (2005). SAML 2: The building blocks of federated identity. Retrieved May 16, 2005, from http://www.xml.com/pub/a/2005/01/12/saml2.html
Moses, T. (2004). Privacy policy profile of XACML. Retrieved May 16, 2005, from http://docs.oasis-open.org/xacml/access_control-xacml2_0-privacy_profile-spec-cd-01.pdf
Moses, T. (2005). eXtensible Access Control Markup Language (XACML) version 2.0. Retrieved May 16, 2005, from http://docs.oasis-open.org/xacml/access_control-xacml-2_0-core-speccd-04.pdf
Simon, E. (2003). The Liberty Alliance Project. In M. O’Neill (Ed.), Web Services security (pp. 203-226). New York: Osborne.
This work was previously published in Privacy Protection for E-Services, edited by G. Yee, pp. 203-233, copyright 2006 by IGI Publishing, formerly known as Idea Group Publishing (an imprint of IGI Global).
Chapter 3.13
Multimedia Security and Digital Rights Management Technology
Eduardo Fernandez-Medina, Universidad de Castilla – La Mancha, Spain
Sabrina De Capitani di Vimercati, Università di Milano, Italy
Ernesto Damiani, Università di Milano, Italy
Mario Piattini, Universidad de Castilla – La Mancha, Spain
Pierangela Samarati, Università di Milano, Italy
ABSTRACT
Multimedia content delivery applications are becoming widespread thanks to increasingly cheaper access to high bandwidth networks. Also, the pervasiveness of XML as a data interchange format has given origin to a number of standard formats for multimedia, such as SMIL for multimedia presentations, SVG for vector graphics, VoiceXML for dialog, and MPEG-21 and MPEG-7 for video. Innovative programming paradigms (such as the one of web services) rely on the availability of XML-based markup and metadata in the multimedia flow in order to customize and add value to multimedia content distributed via the Net. In such a context, a number of security issues around multimedia data management need to be addressed. First of all, it is important to identify the parties allowed to use the multimedia resources, the rights available to the parties, and the terms and conditions under which those rights may be executed: this is fulfilled by the Digital Rights Management (DRM) technology. Secondly, a new generation of security and privacy models and languages is needed, capable of expressing complex filtering conditions on a wide range of properties of multimedia data. In this chapter, we analyze the general problem of multimedia security. We summarize the most important XML-based formats for representing multimedia data, and we present languages for expressing access control policies. Finally, we introduce the most important concepts of the DRM technology.

Copyright © 2008, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
INTRODUCTION
Multimedia information (such as free text, audio, video, images, and animations) is of paramount importance for human-computer interaction in a number of application fields, such as entertainment, distance learning, pay-per-view/listen business, and collaboration, among others. Multimedia security problems are very similar to traditional security problems, but some important aspects make the solutions more complex: traditional multimedia documents are usually monolithic, without a clearly defined internal structure; they are conceived to be discretionally distributed; and they are easy to clone, modify, remove, manipulate, and so on. Recently, the importance of complementing binary multimedia data with metadata in the form of XML tagging has been fully realized. XML-based standard formats for multimedia, such as SVG (SVG, 2001), SMIL (SMIL, 2001), and VoiceXML (VoiceXML, 2002), make the internal structure of multimedia flows available to consumer applications and devices. These languages describe the internal properties of the multimedia documents, making it possible to refer to structural or semantic components, such as keywords, and graphical/audio/audiovisual properties. The wealth of semantics- and structure-related information carried by these new XML-based multimedia formats suggests that the time has come to develop novel approaches for controlling access and fruition of multimedia content. In particular, a number of security issues around multimedia data management need to be addressed. Figure 1 illustrates a Multimedia Security Taxonomy according to which multimedia security encompasses techniques to determine who can use multimedia content and under what conditions (controlled fruition); to prevent multimedia content from being illegally copied (prevention); and to protect multimedia content while it is being transmitted or stored (privacy). In the following, we describe these problems in more detail.
Multimedia Fruition Control
Fruition control can be seen as a modern view of access control, where unauthorized accesses are prevented, complex conditions for authorized users can be specified (e.g., timing delay, quality of the rendering), fine-grained encryption can be enforced, and policies and infrastructures are managed. In a nutshell, fruition control techniques try to define the usage conditions of digital products from creation to consumption. They deal mainly with two concepts: digital rights management (DRM) and access control policies. Basically, DRM includes a set of techniques to specify rights and conditions associated with the use and protection of digital contents and services. Access control (AC) policies specify restrictions on individuals or application programs obtaining data from, or placing data into, a storage device. The main difference between DRM and AC is that DRM represents fruition policies specified on the resources, regardless of the system used to deliver them, while AC represents fruition policies that take into consideration the delivery service. Access control and DRM are discussed in later sections.

Figure 1. Multimedia Security Taxonomy (node labels: Multimedia Security; Controlled Fruition; Access Control; DRM; Content Scrambling; Prevention; Privacy; Steganography; Digital Watermarking; Visible; Invisible; Robust; Fragile; Authentication; Annotation; Copyright; Integrity Check)

Figure 2. Example of integrating hidden data in a pixel representation (labels: Original Pixel R, G, B; Steganographed Pixel R, G, B; Secret Data)
Multimedia Prevention With the proliferation of multimedia content and of the concerns of privacy on the Internet, research on information hiding has become even more pressing. Information is collected by different
1290
organizations and the nature of multimedia content allows for the exact duplication of material with no notification that the material has been copied. Systems to analyze techniques for uncovering hidden information and recovering destroyed information are thus of great importance to many parties (e.g., law enforcement authorities in computer forensics and digital traffic analysis). In such a context, we briefly explore two important labeling techniques: steganography and watermarking. Steganography is the art of hiding information inside other information (in our case, multimedia documents). The hidden information can be, for example, a trademark or the serial number of a product, and can be used to detect copyright violations and to prosecute them. Steganography dates back to ancient Greece, and the initial purpose was to hide messages inside other messages, for war, espionage, and many other reasons. Some curious historical steganographic methods are invisible ink, pictographs, null cipher, positional codes, and deliberate misprints. According to the type of multimedia document, different properties can be exploited to hide information (Johnson et al., 2000). For instance, for text, it is possible to code messages by changing
Multimedia Security and Digital Rights Management Technology
Figure 3. Scheme of watermarking flows
Message sent
Key
Message received
Watermark
Generation of watermark
Integration of watermark
channel
Original document
Restauration of original message
Fault clearance Watermark and remaining faults
the space between text lines (line-shift coding) or between words (word-shift coding), or by altering certain text features ( feature coding) such as vertical endlines of letters. For images, the most common techniques include the least significant bit insertion (LSB), redundant pattern encoding, and spread spectrum method. The idea behind the LSB algorithm is to insert the bits of the hidden message into the least significant bits of the pixels. Figure 2 illustrates an example with a 24-bit pixel. Here, the secret data are inserted in the last two bits of each byte. The redundant pattern encoding consists in painting a small message over an image many times. The spread spectrum method scatters an encrypted message throughout an image (not just the least significant bit). For audio, the most common techniques modify bits of the audio files or some audio properties. For instance, the low-bit encoding replaces the least significant bit of each sampling point with a coded binary string. The phase coding method works by substituting the phase of an initial audio segment with a reference phase that represents the data. All these techniques introduce changes in documents that humans are not able to identify, but a computer can identify, thus obtaining the hidden message. Unfortunately, many steganographic techniques are not robust against simple compression. Some compression mechanisms [such as JPEG (JPEG, 2003)] lose the least significant bits to be more efficient, thus also
losing the hidden information embedded in them. Some proposals try to overcome this drawback [see, for example, Currie & Irvine (1996) and Koch & Zhao (1995)]. A complete overview of steganography techniques can be found in Johnson et al. (2000) and Sellars (1999). Watermarking describes techniques used to include hidden information by embedding the information into some innocent-looking cover data (Loo, 2002). Watermarking and steganography are closely related in that both employ mechanisms for hiding information. However, in steganography, the object of communication is the hidden message and the main goal is to keep the message (or the communication of) from being detected. Watermarks, on the other hand, can be considered as attributes of the object in which they are inserted. The existence of an embedded watermark may be known or unknown. Also, steganography can incorporate encryption, while watermarking is noncryptographic. Figure 3 shows the scheme that represents how digital watermarking is managed. Let m be a message to be sent. A process generates the watermark w by using a secret key k. Next, this watermark is integrated into the message, thus obtaining a new message m’. When message m’ is received, the watermark w is extracted by using the key k and the original message is recovered. Once the watermark has been extracted, it is possible to
1291
Multimedia Security and Digital Rights Management Technology
Figure 4. Example of authentication problem
compare it with the initial watermark to detect whether the message has been modified.

Digital watermarks can fall into two different classes, namely, visible and invisible. Visible watermarks are visual patterns that are overlaid on digital content. It is not possible to remove visible watermarks without destroying the digital content. Digital watermarks can also be classified as robust or fragile. Robust watermarks show strong resistance against accidental and malicious attacks such as content alteration, compression, and filtering. Fragile watermarks have just the opposite characteristics in that they change drastically in case of any alteration of the digital content. Watermarking is useful for many applications (Loo, 2002):

• Copyright protection. With embedded data, copyright holders can use the watermark to verify the ownership of copyrighted contents in case of intentional or unauthorized distribution. Emerging metadata-based access control techniques and DRM languages provide additional information that is separate from the multimedia document. This information could be integrated as a watermark in the multimedia document. Access control and DRM policies could then be directly enforced at the client site by extracting and interpreting the watermark. Note, however, that this kind of watermarking information does not prevent people from copying the digital contents.
• Copy protection. A copy protection mechanism prevents users from making unauthorized copies of digital data.
• Fingerprint for pirate tracing. Watermarks are used in fingerprinting applications to identify the legal recipient of the digital contents and are typically used together with copyright protection watermarks.
• Authentication. Authentication watermarking is a technology that prevents forgery of digital contents. If the original digital content is modified, the watermark is destroyed. For instance, Figure 4 shows two photos: the original photo on the left and the modified one on the right. Authentication watermarking allows us to detect that the photo on the right has been modified.
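The scheme of Figure 3 (generate w from k, embed it into m to obtain m’, extract it, and compare it with the initial watermark) can be illustrated with a minimal Python sketch of a fragile watermark. Everything concrete in it is an assumption made for illustration and is not taken from the cited works: the message is treated as a sequence of 8-bit samples, the watermark bits are derived from the key with HMAC-SHA256, and they are written into the least significant bit of the first samples. All function names are hypothetical. Unlike the scheme described above, this simple variant overwrites the least significant bits, so the original message is recovered only approximately.

```python
import hashlib
import hmac


def generate_watermark(key: bytes, n_bits: int) -> list[int]:
    """Derive n_bits watermark bits w from the secret key k.

    Using HMAC-SHA256 as the bit generator is an illustrative assumption.
    """
    bits: list[int] = []
    counter = 0
    while len(bits) < n_bits:
        block = hmac.new(key, counter.to_bytes(4, "big"), hashlib.sha256).digest()
        for byte in block:
            bits.extend((byte >> i) & 1 for i in range(8))
        counter += 1
    return bits[:n_bits]


def embed(message: bytes, watermark: list[int]) -> bytes:
    """Return m': bit i of the watermark replaces the LSB of sample i of m."""
    if len(watermark) > len(message):
        raise ValueError("watermark is longer than the cover message")
    marked = bytearray(message)
    for i, bit in enumerate(watermark):
        marked[i] = (marked[i] & 0xFE) | bit
    return bytes(marked)


def extract(marked: bytes, n_bits: int) -> list[int]:
    """Read the embedded bits back from the LSBs of the first n_bits samples."""
    return [sample & 1 for sample in marked[:n_bits]]


def is_unmodified(marked: bytes, key: bytes, n_bits: int) -> bool:
    """Compare the extracted watermark with the one regenerated from k."""
    return extract(marked, n_bits) == generate_watermark(key, n_bits)


if __name__ == "__main__":
    key = b"secret key k"
    m = bytes(range(100, 164))  # 64 hypothetical 8-bit samples
    m_marked = embed(m, generate_watermark(key, 32))
    print(is_unmodified(m_marked, key, 32))  # True: watermark intact

    tampered = bytearray(m_marked)
    tampered[5] ^= 0x01  # alter one watermarked sample
    print(is_unmodified(bytes(tampered), key, 32))  # False: alteration detected
```

A watermark of this kind detects only alterations that disturb the embedded bits. Practical authentication schemes typically make the watermark depend on the content itself so that any change is detected.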
All watermarking techniques have three major requirements: a watermark should be robust against accidental or malicious attacks, be imperceptible, and carry the required number of bits. These three requirements conflict with each other. For instance, increasing the number of embedded bits increases the capacity but decreases the robustness.
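The conflict between capacity and robustness can be made concrete with a small worked example. The sketch below assumes, purely for illustration, that each payload bit is protected by simple repetition coding over a fixed number of cover samples and decoded by majority vote. Repetition coding is not a technique described in this chapter, and the function names are hypothetical.

```python
def repetition_factor(cover_samples: int, payload_bits: int) -> int:
    """Number of cover samples available to carry each payload bit."""
    if payload_bits <= 0 or payload_bits > cover_samples:
        raise ValueError("payload must fit into the cover")
    return cover_samples // payload_bits


def tolerated_flips(r: int) -> int:
    """Corrupted copies per bit that a strict majority vote still corrects."""
    return (r - 1) // 2


def encode(bits: list[int], r: int) -> list[int]:
    """Repeat every payload bit r times."""
    return [b for b in bits for _ in range(r)]


def decode(coded: list[int], r: int) -> list[int]:
    """Majority vote over each group of r copies (ties decode to 0)."""
    return [
        1 if 2 * sum(coded[i:i + r]) > r else 0
        for i in range(0, len(coded), r)
    ]


if __name__ == "__main__":
    cover = 64  # hypothetical number of usable cover samples
    for payload in (8, 16, 32):
        r = repetition_factor(cover, payload)
        print(payload, r, tolerated_flips(r))
    # 8 bits  -> 8 copies each, 3 corrupted copies per bit tolerated
    # 16 bits -> 4 copies each, 1 corrupted copy per bit tolerated
    # 32 bits -> 2 copies each, 0 corrupted copies per bit tolerated
```

With the cover size fixed, every additional payload bit reduces the redundancy available to the others, which is one simple way to see why capacity and robustness pull in opposite directions.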
Multimedia Privacy

While the privacy of “traditional data” is a well-known problem, the variety of types and usages of multimedia information makes the multimedia privacy problem fuzzier. In fact, the relationship
between multimedia data invasion and privacy has not yet been clearly described. We may not perceive privacy problems in music or in movies, but multimedia is much more than that. For instance, textual information can denote the way things are presented; audio documents can indicate tone of voice, accent, or dialect; and video can show dress, the appearance of the user, and so on (Adams, 2000). Little work has been done on multimedia privacy. However, some interesting work on the relationship between multimedia privacy and users’ perceptions can be found in Adams & Sasse (1999a) and Adams & Sasse (1999b).
XML METADATA FOR MULTIMEDIA

The importance of complementing binary multimedia data with descriptive metadata in the form of XML tagging has been fully realized only recently. The pervasiveness of XML as a data interchange format has given rise to a number of standard formats for multimedia representation such as SMIL (SMIL, 2001) for multimedia presentations, SVG (SVG, 2001) for vector graphics, VoiceXML (VoiceXML, 2002) for dialog, and MPEG-21 (MPEG-21, 2002) and MPEG-7 (MPEG-7, 2002) for video. Innovative programming paradigms (such as the one of web services) rely on the availability of XML-based markup and metadata in the multimedia flow in order to customize and add value to multimedia content distributed via the Net. SVG and VoiceXML make the internal structure of multimedia flows available to a variety of consumer applications and devices. More ambitious efforts, such as MPEG-21 and MPEG-7, are aimed at achieving the same results for video. Intelligent applications like search indexes, topic maps or browsable directories can use XML tagging and auxiliary metadata to identify the structural or semantics-related components of multimedia flows, such as keywords, key frames, audiovisual summaries, semantic concepts, color histograms, and shapes, as well
as recognize speech. XML formats are paving the way to a new generation of applications even in more traditional fields such as image processing. For instance, raster graphical formats (e.g., GIF or JPEG) have severe limitations, inasmuch as they do not carry any information that can be queried, re-organized, or searched through. By contrast, Scalable Vector Graphics (SVG), an XML-based language for describing two-dimensional vector and mixed vector/raster graphics, works well across platforms, output resolutions, and color spaces. Also, SVG clearly specifies the image structure, allowing applications to process data at a finer level of granularity.

The wealth of semantics- and structure-related information carried by new XML-based multimedia formats suggests that the time has come to enrich traditional, coarse-grained access control models with a number of new concepts that are specific to the nature and meaning of multimedia data. An interesting consequence of using XML for representing multimedia metadata is that fine-grained access policies can be specified. XML elements used to refer to multimedia components inside policies can thus be mapped to XML tags and metadata in the data flow. Policy-to-data mapping can be customized to the particular XML-based multimedia format under discussion, achieving fast and effective enforcement via either data filtering or encryption.

In the following subsections we briefly describe these XML-based multimedia standard formats.
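Enforcement via data filtering can be illustrated with a minimal Python sketch. It assumes a hypothetical policy that lists, for each role, the id attributes of the SVG elements that the role must not see, and it prunes those elements (with their subtrees) before the document is delivered. The policy format, the role names, the element ids, and the sample document are all invented for illustration. Only the standard xml.etree.ElementTree module is used.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialize without an "ns0:" prefix

# Hypothetical policy: ids of the elements each role is NOT allowed to see.
POLICY = {
    "visitor": {"patientInformation", "privateArea"},
    "nurse": set(),
}

# Hypothetical SVG document used only to exercise the filter.
SAMPLE_SVG = f"""<svg xmlns="{SVG_NS}" width="200" height="100">
  <title>Floor plan</title>
  <g id="publicArea"><rect id="reception" x="0" y="0" width="50" height="50"/></g>
  <g id="privateArea"><rect id="pharmacy" x="60" y="0" width="50" height="50"/></g>
  <g id="room">
    <rect id="bed" x="120" y="0" width="20" height="40"/>
    <text id="patientInformation" x="120" y="60">Name: EXAMPLE</text>
  </g>
</svg>"""


def filter_svg(svg_text: str, role: str) -> str:
    """Return a view of the document without the elements denied to role."""
    denied = POLICY[role]
    root = ET.fromstring(svg_text)
    # Snapshot the tree first, because elements are removed while walking it.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("id") in denied:
                parent.remove(child)  # drops the element and its subtree
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(filter_svg(SAMPLE_SVG, "visitor"))
```

Filtering on the server side means that denied content never reaches the client. Encryption, the alternative mentioned above, would instead deliver the whole document and rely on key distribution to restrict access.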
Scalable Vector Graphics

SVG (SVG, 2001) is a language for describing two-dimensional vector and mixed vector/raster graphics in XML. An SVG document has a flexible structure, composed of several optional elements placed in the document in an arbitrary order. Figure 5 shows the general structure of an SVG document. Nodes XML Version and DOCTYPE are common for any XML-based document and
Figure 5. General structure of an SVG document (node SVG Document with children XML Version, DOCTYPE, and SVG Tree; node SVG Tree with children descriptive text, script, definitions, and body)
specify the XML version used in the document and information about the type of the document (the public identifier and the system identifier for SVG 1.0), respectively. Node SVG tree contains all the elements specific to SVG documents and is composed of four parts: descriptive text, script, definitions, and body. The descriptive text includes textual information not rendered as part of the graphic and is represented by two elements: title, usually appearing only once, and desc, appearing several times to describe the content of each SVG fragment. The script portion contains function definitions. Each function is associated with an action that can be executed on SVG objects in the document. Functions have a global scope across the entire document.

Figure 6. Example of an SVG document

The definition portion contains global patterns and templates of graphical elements or graphical properties that can be reused in the body of the SVG document. Each definition is characterized by a name, which is used in the body of the document to reference the definition, and by a set of properties. The graphical elements to be rendered are listed after the definitions node, according to the order of rendering. Each element can be any of the basic SVG graphics elements, such as path, text, rect, circle, ellipse, line, polyline, polygon, and image, whose names are self-explanatory. The body of an SVG document contains any number of container and graphics elements. A container element can have graphical elements and other container elements as child elements. Container g is used for
Figure 7. (a) Tree-based graphical representation of an SVG document (b) a portion of an SVG document
[Panel (a): tree rooted at oncologyfloor, with nodes including outline, information, public area, private area, and emergency. Panel (b): SVG fragment titled “Small Fragment of a Oncology Floor” that displays patient information (Name, Illness, State, Treatment).]