Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis, and J. van Leeuwen
2828
Berlin Heidelberg New York Hong Kong London Milan Paris Tokyo
Antonio Lioy Daniele Mazzocchi (Eds.)
Communications and Multimedia Security Advanced Techniques for Network and Data Protection 7th IFIP-TC6 TC11 International Conference, CMS 2003 Torino, Italy, October 2-3, 2003 Proceedings
Series Editors

Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands

Volume Editors

Antonio Lioy
Politecnico di Torino, Dip. di Automatica e Informatica
corso Duca degli Abruzzi, 24, 10129 Torino, Italy
E-mail: [email protected]

Daniele Mazzocchi
Istituto Superiore Mario Boella
corso Trento, 21, 10129 Torino, Italy
E-mail: [email protected]

Cataloging-in-Publication Data applied for

A catalog record for this book is available from the Library of Congress.

Bibliographic information published by Die Deutsche Bibliothek: Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at <http://dnb.ddb.de>.

CR Subject Classification (1998): C.2, E.3, D.4.6, H.5.1, K.4.1, K.6.5, H.4

ISSN 0302-9743
ISBN 3-540-20185-8 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

Springer-Verlag Berlin Heidelberg New York, a member of BertelsmannSpringer Science+Business Media GmbH
http://www.springer.de

© IFIP International Federation for Information Processing, Hofstraße 3, A-2361 Laxenburg, Austria 2003
Printed in Germany

Typesetting: Camera-ready by author, data conversion by PTP-Berlin GmbH
Printed on acid-free paper    SPIN: 10959107    06/3142    5 4 3 2 1 0
Preface
The Communications and Multimedia Security conference (CMS 2003) was organized in Torino, Italy, on October 2–3, 2003. CMS 2003 was the seventh IFIP working conference on communications and multimedia security since 1995. Research issues and practical experiences were the topics of interest, with a special focus on the security of advanced technologies, such as wireless and multimedia communications.

This book contains the 21 articles that were selected by the conference program committee for presentation at CMS 2003. The articles address new ideas and experimental evaluation in several fields related to communications and multimedia security, such as cryptography, network security, multimedia data protection, application security, trust management and user privacy. We think that they will be of interest not only to the conference attendees but also to the wider community of researchers in the security field.

We wish to thank all the participants, organizers, and contributors of the CMS 2003 conference for having made it a success.
October 2003
Antonio Lioy (General Chair of CMS 2003)
Daniele Mazzocchi (Program Chair of CMS 2003)
Organization
CMS 2003 was organized by the TORSEC Computer and Network Security Group of the Dipartimento di Automatica ed Informatica at the Politecnico di Torino, in cooperation with the Istituto Superiore Mario Boella.
Conference Committee

General Chair: Antonio Lioy (Politecnico di Torino, Italy)
Program Chair: Daniele Mazzocchi (Istituto Superiore Mario Boella, Italy)
Organizing Chair: Andrea S. Atzeni (Politecnico di Torino, Italy)
Program Committee

F. Bergadano, Università di Torino
E. Bertino, Università di Milano
L. Breveglieri, Politecnico di Milano
A. Casaca, INESC, chairman IFIP TC6
M. Cremonini, Università di Milano
Y. Deswarte, LAAS-CNRS
M.G. Fugini, Politecnico di Milano
S. Furnell, University of Plymouth
R. Grimm, Technische Universität Ilmenau
B. Jerman-Blažič, Institut Jožef Stefan
S. Kent, BBN
T. Klobučar, Institut Jožef Stefan
A. Lioy, Politecnico di Torino
P. Lipp, IAIK
J. Lopez, Universidad de Málaga
F. Maino, CISCO
D. Mazzocchi, ISMB
S. Muftic, KTH
F. Piessens, Katholieke Universiteit Leuven
P.A. Samarati, Università di Milano
A.F.G. Skarmeta, Universidad de Murcia
L. Strous, De Nederlandsche Bank, chairman IFIP TC11
G. Tsudik, University of California at Irvine
Table of Contents

Cryptography

Computation of Cryptographic Keys from Face Biometrics . . . . . . . . . . 1
   Alwyn Goh, David C.L. Ngo
AUTHMAC_DH: A New Protocol for Authentication and Key Distribution . . . . . . . . . . 14
   Heba K. Aslan
Multipoint-to-Multipoint Secure-Messaging with Threshold-Regulated Authorisation and Sabotage Detection . . . . . . . . . . 27
   Alwyn Goh, David C.L. Ngo

Network Security

Securing the Border Gateway Protocol: A Status Update . . . . . . . . . . 40
   Stephen T. Kent
Towards an IPv6-Based Security Framework for Distributed Storage Resources . . . . . . . . . . 54
   Alessandro Bassi, Julien Laganier
Operational Characteristics of an Automated Intrusion Response System . . . . . . . . . . 65
   Maria Papadaki, Steven Furnell, Benn Lines, Paul Reynolds

Mobile and Wireless Network Security

A Secure Multimedia System in Emerging Wireless Home Networks . . . . . . . . . . 76
   Nut Taesombut, Richard Huang, Venkat P. Rangan
Java Obfuscation with a Theoretical Basis for Building Secure Mobile Agents . . . . . . . . . . 89
   Yusuke Sakabe, Masakazu Soshi, Atsuko Miyaji
A Security Scheme for Mobile Agent Platforms in Large-Scale Systems . . . . . . . . . . 104
   Michelle S. Wangham, Joni da Silva Fraga, Rafael R. Obelheiro

Trust and Privacy

Privacy and Trust in Distributed Networks . . . . . . . . . . 117
   Thomas Rössler, Arno Hollosi
Extending the SDSI / SPKI Model through Federation Webs . . . . . . . . . . 132
   Altair Olivo Santin, Joni da Silva Fraga, Carlos Maziero
Trust-X: An XML Framework for Trust Negotiations . . . . . . . . . . 146
   Elisa Bertino, Elena Ferrari, Anna C. Squicciarini

Application Security

How to Specify Security Services: A Practical Approach . . . . . . . . . . 158
   Javier Lopez, Juan J. Ortega, Jose Vivas, Jose M. Troya
Application Level Smart Card Support through Networked Mobile Devices . . . . . . . . . . 172
   Pierpaolo Baglietto, Francesco Moggia, Nicola Zingirian, Massimo Maresca
Flexibly-Configurable and Computation-Efficient Digital Cash with Polynomial-Thresholded Coinage . . . . . . . . . . 181
   Alwyn Goh, Kuan W. Yip, David C.L. Ngo

Multimedia Security

Selective Encryption of the JPEG2000 Bitstream . . . . . . . . . . 194
   Roland Norcen, Andreas Uhl
Robust Spatial Data Hiding for Color Images . . . . . . . . . . 205
   Xiaoqiang Li, Xiangyang Xue, Wei Li
Watermark Security via Secret Wavelet Packet Subband Structures . . . . . . . . . . 214
   Werner Dietl, Andreas Uhl
A Robust Audio Watermarking Scheme Based on MPEG 1 Layer 3 Compression . . . . . . . . . . 226
   David Megías, Jordi Herrera-Joancomartí, Julià Minguillón
Loss-Tolerant Stream Authentication via Configurable Integration of One-Time Signatures and Hash-Graphs . . . . . . . . . . 239
   Alwyn Goh, G.S. Poh, David C.L. Ngo
Confidential Transmission of Lossless Visual Data: Experimental Modelling and Optimization . . . . . . . . . . 252
   Bubi G. Flepp-Stars, Herbert Stögner, Andreas Uhl

Author Index . . . . . . . . . . 265
Computation of Cryptographic Keys from Face Biometrics

Alwyn Goh¹ and David C.L. Ngo²

¹ Corentix Laboratories, B–19–02 Cameron Towers, Jln 5/58B, 46000 Petaling Jaya, Malaysia
[email protected]
² Faculty of Information Science & Technology, Multimedia University, 75450 Melaka, Malaysia
Abstract. We outline cryptographic key-computation from biometric data, based on error-tolerant transformation of continuous-valued face eigenprojections to zero-error bitstrings suitable for cryptographic use. Bio-hashing is based on iterated inner-products between pseudorandom and user-specific eigenprojections, each of which extracts a single bit from the face data. This discretisation is highly tolerant of data capture offsets, with same-user face data resulting in highly correlated bitstrings. The resultant user identification in terms of a small bitstring-set is then securely reduced to a single cryptographic key via Shamir secret-sharing. Generation of the pseudorandom eigenprojection sequence can be securely parameterised via incorporation of physical tokens. Tokenised bio-hashing is rigorously protective of the face data, with security comparable to cryptographic hashing of token and knowledge key-factors. Our methodology has several major advantages over conventional biometric analysis ie elimination of false accepts (FA) without unacceptable compromise in terms of more probable false rejects (FR), straightforward key-management, and cryptographically rigorous commitment of biometric data in conjunction with verification thereof.
1 Introduction

Biometric ergonomics and cryptographic security are highly complementary attributes, hence the motivation for the presented research. Computation of cryptographic keys from biometric data was first proposed in the Bodo patent [1], and is technically challenging from both signal processing and information security viewpoints. The representation problem is that biometric data (ie linear time-series or planar bitmaps) is continuous and high-uncertainty, while cryptographic parameters are discrete and zero-uncertainty. Biometric consistency—ie the difference between reference and test data, which are (at best) similar but never equal—is hence inadequate for cryptographic purposes, which require exact reproduction. This motivates the formulation of offset-tolerant discretisation methodologies, the end result of which is also required to protect against adversarial recovery of user-specific biometrics.
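To make the offset-tolerant discretisation idea concrete before the formal treatment, here is a minimal sketch in the spirit of the inner-product mixing detailed in Section 3. It assumes an eigenprojection vector is already available; the orthonormalisation step, the zero threshold and all names are our illustrative choices, not the authors' exact algorithm.

```python
import numpy as np

def biohash_bits(a, token_seed, m=80, tau=0.0):
    # Token-parameterised discretisation of a face eigenprojection `a`
    # into an m-bit string via iterated inner-products (illustrative
    # parameters, not the paper's exact ones).
    rng = np.random.default_rng(token_seed)    # token-derived PRNG
    R = rng.standard_normal((a.size, m))       # m pseudorandom vectors
    Q, _ = np.linalg.qr(R)                     # orthonormalise columns
    # One inner-product -> one bit; small capture offsets in `a` only
    # slightly perturb <a, q_i>, so the sign is offset-tolerant.
    return (a @ Q > tau).astype(np.uint8)

# Same user + same token => highly correlated bitstrings
rng = np.random.default_rng(0)
a_enrol = rng.standard_normal(100)
a_test = a_enrol + 0.05 * rng.standard_normal(100)   # capture offset
b1 = biohash_bits(a_enrol, token_seed=42)
b2 = biohash_bits(a_test, token_seed=42)
print("bit disagreements:", int(np.sum(b1 != b2)))   # typically very few of 80
```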
2 Review of Previous Work

The earliest publications in this domain are by Soutar et al [2, 3], whose research outlines cryptographic key-recovery from the integral correlation of freshly captured fingerprint data and previously registered bioscrypts. Bioscrypts result from the mixing of random and user-specific data—thereby preventing recovery of the original fingerprint data—with data capture uncertainties addressed via multiply-redundant majority-result table lookups. This ensures representation tolerance against offsets in same-user test fingerprints, but does not satisfactorily handle the issue of discrimination against different-user data.

The Davida et al [4, 5] formulation outlines cryptographic signature verification of iris data without stored references. This is accomplished via open token-based storage of user-specific Hamming codes necessary to rectify offsets in the test data, thereby allowing verification of the corrected biometrics. Such self-correcting biometric representations are applicable towards key-computation, with recovery of iris data prevented by complexity theory. Resolution of biometric uncertainty via Hamming error correction is rigorous from the security viewpoint, and improves on the somewhat heuristic Soutar et al lookups.

Monrose et al key-computation from user-specific keystroke [6] and voice [7] data is based on the deterministic concatenation of single-bit outputs based on logical characterisations of the biometric data, in particular whether user-specific features are below (0) or above (1) some population-generic threshold. These feature-derived bitstrings are used in conjunction with randomised lookup tables formulated via Shamir [8] secret-sharing. Error correction in this case is also rigorous, with Shamir polynomial thresholding and Hamming error correction considered to be equivalent mechanisms [5]. The inherent scalability of the bitstrings is another major advantage over the Soutar et al methodology.

Direct mixing of random and biometric data (as in Soutar et al) allows incorporation of serialised physical tokens, thereby resulting in token+biometric cryptographic keys. There are also advantages from the operations security viewpoint, arising from the permanent association of biometrics with their owners. Tokenised randomisation protects against biometric fabrication—as demonstrated by Matsumoto et al [9] for fingerprints, which is considered one of the more secure form factors—without adversarial knowledge of the randomisation, or equivalently possession of the corresponding token.
3 Bio–Hash Methodology

This paper outlines cryptographic key-computation from face bitmaps, or specifically from Sirovich-Kirby [10, 11] eigenprojections thereof. The proposed bio-hashing is based on: (1) biometric eigenanalysis: resulting in user-specific eigenprojections with a moderate degree of offset tolerance, (2) biometric discretisation: via iterated inner-product mixing of tokenised and biometric data, with enhanced offset tolerance, and (3) cryptographic interpolation: of Shamir secret-shares corresponding to token and biometric data, culminating in a zero-error key. Bio-hashing has the following advantages: (1) tokenised random mixing: in common with Soutar et al, (2) discretisation scalability: in common with Monrose et al, and (3) rigorous error correction: in common with Davida et al and Monrose et al. The proposed formulation is furthermore highly generic, arising from the proposed discretisation in terms of inner-products ie s = a·b for a, b ∈ IR^n. We believe our work to be the first demonstration of key-computation from face data, which seems difficult to handle (in common with other planar representations) using the Monrose et al procedure.

Bio-hashing is essentially a transformation from representations which are high-dimension and high-uncertainty (the face bitmaps) to those which are low-dimension and zero-uncertainty (the derived keys). The successive representations are: (1) raw bitmap: x ∈ S in domain IR^N, with N the pixelisation dimension, (2) eigenprojection: a ∈ S′ in domain IR^n, with
2.2 Mark Reconstruction
The objective of the mark reconstruction algorithm is to detect whether an audio test signal T is a (possibly attacked) version of the marked signal Ŝ. It is assumed that T is in RIFF-WAVE format. If it were not the case, a format conversion step (for example MPEG 1 Layer 3 decompression) should be performed prior to the application of the reconstruction process. First of all, the spectrum T_F is obtained applying the FFT algorithm and, then, the magnitude at the potentially marked frequencies |T_F(f_mark)|, for all f_mark ∈ F_mark, is computed. Note that this method is strictly positional and, because of this, it is required that the number of samples in Ŝ and T is the same. If there is only a small difference in the number of samples, it is possible to complete the sequences with zeroes. Thus, this methodology cannot be directly applied when resampling attacks occur. In such a case, sampling rate conversion must be performed before the mark reconstruction algorithm can be applied.

When the |T_F(f_mark)| are available, a scaling step is undertaken in order to minimise the distance between the sequences |T_F(f_mark)| and |Ŝ_F(f_mark)|. This scaling is performed to suppress the effect of attacks which modify only a range of frequencies or which scale the PCM signal Ŝ. The following least squares problem is solved:

$$\min_\lambda \sum_{f \in F_{\mathrm{mark}}} \left( |\hat{S}_F(f)| - \lambda\, |T_F(f)| \right)^2 .$$

This problem can be solved analytically as follows. Given the vectors

$$s = \left[\, |S_F(f_1)|\ \ |S_F(f_2)|\ \ \ldots\ \ |S_F(f_n)| \,\right]^T,$$
$$\hat{s} = \left[\, |\hat{S}_F(f_1)|\ \ |\hat{S}_F(f_2)|\ \ \ldots\ \ |\hat{S}_F(f_n)| \,\right]^T,$$
$$t = \left[\, |T_F(f_1)|\ \ |T_F(f_2)|\ \ \ldots\ \ |T_F(f_n)| \,\right]^T,$$

where T stands for the transposition operator, it is possible to write the least squares problem in vector form as

$$\min_\lambda \, (\hat{s} - \lambda t)^T (\hat{s} - \lambda t),$$

which yields the minimum for:

$$\lambda = \frac{\hat{s}^T t}{t^T t}.$$

Now, each component of λt is divided by the corresponding component of s and the value obtained is compared with 10^{d/20} to decide whether a '0', a '1' or a '*' (not identified) might be embedded in this component of λt. Let r_i = λt_i / s_i:

$$r_i \in \left[\, 10^{d/20}\,\frac{100-q}{100},\ \ 10^{d/20}\,\frac{100+q}{100} \,\right] \Rightarrow \hat{b}_i := \text{'1'},$$
$$\frac{1}{r_i} \in \left[\, 10^{d/20}\,\frac{100-q}{100},\ \ 10^{d/20}\,\frac{100+q}{100} \,\right] \Rightarrow \hat{b}_i := \text{'0'}.$$
If none of these two conditions are satisfied, then b̂_i := '*'. Here q ∈ [0, 100] is a percentage (e.g. q = 10) and b̂_i is the i-th component of the vector b̂ which contains a sequence of "detected bits". Finally, the PRBS signal is subtracted from the bits b̂ to recover the true embedded bits b. This operation must preserve the '*' marks unaltered. Once b has been obtained, it must be taken into account that its length n is (much) greater than the length of the extended mark. Hence, each bit of the mark appears at different positions in b. For example, if the length of the extended mark is 434, the first bit should appear at b_1, b_435, b_869, …, b_{1+434j}, … Some of these bits will be identified as '1', others as '0' and the rest as '*'. Now a voting scheme is applied to decide whether the i-th bit of the mark is '1', '0' or unidentified ('*'). Let n_0, n_1 and n_* be the number of '0's, '1's and '*'s identified for the same mark bit. The simplest approach is to assign to each bit the sign which appears most often. For example, if a given mark bit had been identified 100 times with n_0 = 2, n_1 = 47 and n_* = 51, this simple approach would assign a '*' mark to this bit. But, taking into account that any value outside the interval defined above is identified as '*', it is clear that near-'1's are identified as '*' although they are much closer to '1' than to '0'. In the reported example, the big difference between the number of '1's and '0's (47 vs. 2) can reasonably lead to the conclusion that the corresponding bit can be assigned a '1' with very little error probability, since most of the '*'s will probably be near-'1's. As a result of this consideration, the voting scheme used in this method ignores the '*' if n_* is not more than twice the difference |n_1 − n_0|:

$$\mathrm{bit} := \begin{cases} \text{'*'} & \text{if } n_* > 2\,|n_1 - n_0|, \\ \text{'1'} & \text{if } n_* \leq 2\,|n_1 - n_0| \text{ and } n_1 > n_0, \\ \text{'0'} & \text{if } n_* \leq 2\,|n_1 - n_0| \text{ and } n_0 > n_1. \end{cases}$$

A more sophisticated method using statistics might be applied instead of this voting scheme. For instance, an analysis of the distribution of r_i for each bit might be performed. However, the voting procedure described here is simple to implement and fast to execute, which makes it very convenient for real applications.
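The scaling, bit-identification and voting steps above are compact enough to sketch directly; the following is our illustrative rendering in NumPy (with the unspecified tie case n_1 = n_0 treated as unidentified), not the authors' reference code.

```python
import numpy as np

def identify_bits(s, s_hat, t, d=1.0, q=10.0):
    # s, s_hat, t: |S_F|, |S^_F|, |T_F| at the marked frequencies
    lam = (s_hat @ t) / (t @ t)            # least squares: lambda = s^T t / t^T t
    r = lam * t / s                        # r_i = lambda * t_i / s_i
    lo = 10 ** (d / 20) * (100 - q) / 100  # +/- q% band about 10^(d/20)
    hi = 10 ** (d / 20) * (100 + q) / 100
    bits = np.full(r.shape, '*')           # '*' = not identified
    bits[(r >= lo) & (r <= hi)] = '1'
    bits[(1 / r >= lo) & (1 / r <= hi)] = '0'
    return bits

def vote(decisions):
    # Voting rule over the repeated observations of one mark bit
    n0, n1 = decisions.count('0'), decisions.count('1')
    n_star = decisions.count('*')
    if n_star > 2 * abs(n1 - n0) or n1 == n0:   # tie handling is our choice
        return '*'
    return '1' if n1 > n0 else '0'

print(vote(list('1*11*0*1')))   # near-'1' stars are ignored -> '1'
```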
[Fig. 2. Mark reconstruction process — block diagram: the test signal T is FFT-transformed to T_F; least squares scaling against Ŝ_F at F_mark yields λT_F; bit identification (with the PRBS keyed by k) gives b̂ and then b; the voting scheme produces the 434-bit stream W̄_ECC, which the ECC decodes to the 70-bit stream W̄.]

As a result of this voting scheme an identified extended mark W̄_ECC will be available. Finally, W̄_ECC and the error correcting algorithm are used to recover an identified
70-bit mark, W̄, which will be compared with the true mark W. The whole reconstruction process is depicted in Fig. 2. The mark reconstruction algorithm can be described in terms of the following expression:

$$\mathrm{Reconstruct}\left(T, S, \hat{S}, F_{\mathrm{mark}}, \mathrm{parameters} = \{q, d, k\}\right) \rightarrow \{\bar{W}, b\}$$

where b is a byproduct of the algorithm which might be used to perform statistical tests. The proposed scheme is not blind, in the sense that the original signal is needed by the mark reconstruction process. On the other hand, the bit sequence which forms the embedded mark is not needed for reconstruction, which makes this method suitable also for fingerprinting once the mark is properly coded [14].
3 Performance Evaluation

As pointed out in Section 1, three main measures are commonly used to assess the performance of watermarking schemes:

Imperceptibility: the extent to which the embedding process leaves undamaged the perceptual quality of the marked object.
Capacity: the amount of information that may be embedded and recovered.
Robustness: the resistance to accidental removal of the embedded bits.

In this section, we test the properties of the scheme proposed in Section 2. The scheme was implemented using a dual binary Hamming code DH(31, 5) as ECC, with the pseudo-random generator a DES cryptosystem operated in OFB mode. A 70-bit mark W (resulting in an encoded W_ECC with |W_ECC| = 434) was embedded. In order to test the watermarking scheme we have chosen the following parameters for embedding and reconstruction:

– R = 128 Kbps, which is the most widely used bit rate in MPEG 1 Layer 3 files.
– p = 2, meaning that we only consider relevant those frequencies for which the magnitude of S_F is at least 2% of the maximum magnitude.
– ε = 0.05, which implies that a frequency is considered unchanged after compression/decompression if its magnitude varies by less than 5% (relative error).
– d = 1 dB; if higher imperceptibility is required, a lower value can be chosen.
– q = 10, i.e. a ±10% band is defined about d in order to reconstruct '1's and '0's. This choice is quite conservative, since the '0' and the '1' bands are quite far from each other.

It is worth pointing out that these parameters have been chosen without performing a deep analysis on tuning. Basically, R, p and ε affect the places where the mark bits are embedded; d is related to the imperceptibility of the hidden mark, since it describes how much the spectrum of each marked frequency is disturbed; and, finally, q affects the robustness of the method, since it avoids misidentification of the embedded bits.

To test the performance of the suggested audio watermarking scheme, some of the audio files provided in the Sound Quality Assessment Material (SQAM) page [15] have
been used. The following files have been tested: violoncello (m.p.³), trumpet (m.p.), horn (m.p.), glockenspiel (m.p.), harpsichord (arp.⁴), soprano (voice), bass (voice), quartet (voice), English female speech (voice) and English male speech (voice). We have taken only the first ten seconds of each of these files, i.e., 441000 samples, and the mark has been embedded in the left channel only. The glockenspiel file is a special case, since it has about 5 blank seconds out of 10.

3.1 Imperceptibility
The imperceptibility property determines how much the marked signal Ŝ differs from the original one S. That is, imperceptibility is concerned with the distortion added with the inclusion of the mark or, in other words, with the audio quality of the marked signal Ŝ with respect to S. There are several ways to measure audio quality. Here, the signal-to-noise ratio (SNR) and an average SNR (ASNR) are used. The SNR measure determines the power of the noise added by the watermark relative to the original signal, and is defined by

$$\mathrm{SNR} = \frac{\sum_{i=1}^{N} S_i^2}{\sum_{i=1}^{N} \left( S_i - \hat{S}_i \right)^2}$$

where N is the number of samples and S_i (Ŝ_i) denotes the i-th sample of S (Ŝ). Usually, this value is given in dB by performing the operation 10 log₁₀(SNR). Another measure usual in audio quality assessment is an average of the SNR computed taking sample blocks of some length. A typical choice is to consider pieces of 4 ms, which, with a sampling rate of 44100 Hz, means 176 samples. The SNR of all these pieces is computed, and the average for all the sample blocks is obtained. The ASNR measure is often given in dB. The measure used in this paper does not take into account the Human Auditory System (HAS) and, thus, all frequencies are equally weighted.

In Table 1, the SNR and ASNR measures obtained for the ten benchmark files are shown. The SNR measures are about 19 dB whereas the ASNR measures are about 20 dB. This means that the power of the noise introduced by watermarking is roughly 0.01 times the power of the original signal, which is quite satisfying and might even be improved (reduced) by choosing proper tuning parameters. Obviously, the parameter d only affects the imperceptibility of the watermark, since it determines to which extent the spectrum of the marked signal Ŝ is modified with respect to the original signal S. Hence, by reducing d, to say 0.5 dB, the imperceptibility of the mark would increase, though it will be more easily removed. The parameters R, p and ε determine how many frequencies are chosen for watermarking and, thus, they also affect the imperceptibility of the mark. The larger the number of marked frequencies is, the more perceptible the mark becomes. This establishes a link between the imperceptibility and the capacity of the watermarking system.
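The two quality measures are easily reproduced; the sketch below is our reading of the text (in particular, averaging the per-block dB values for the ASNR and skipping silent blocks), not the authors' exact evaluation code.

```python
import numpy as np

def snr_db(s, s_hat):
    # Global SNR of the marked signal, per the formula above, in dB
    return 10 * np.log10(np.sum(s ** 2) / np.sum((s - s_hat) ** 2))

def asnr_db(s, s_hat, block=176):
    # Average SNR over 4 ms blocks (176 samples at 44100 Hz)
    n = len(s) // block * block
    sb = s[:n].reshape(-1, block)
    eb = (s[:n] - s_hat[:n]).reshape(-1, block)
    num = np.sum(sb ** 2, axis=1)
    den = np.sum(eb ** 2, axis=1)
    ok = (num > 0) & (den > 0)             # skip silent / unmarked blocks
    return float(np.mean(10 * np.log10(num[ok] / den[ok])))
```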
“m.p." stands for “melodious phase". “arp." stands for “arpegio".
Table 1. Capacity and imperceptibility

SQAM file       Marked bits   Capacity (bits)   SNR (dB)   ASNR (dB)
violoncello        4477            722            18.92      20.91
trumpet            3829            617            18.83      19.84
horn               1573            253            18.96      21.10
glockenspiel       1258            202            25.78      29.75
harpsichord        3874            624            21.25      22.84
soprano            5042            813            19.47      21.59
bass              15763           2542            19.02      20.08
quartet           13548           2185            19.22      20.36
female speech     10677           1722            19.57      21.84
male speech        9359           1509            19.44      21.49

Hence a tradeoff between imperceptibility and capacity must be achieved. Note, also, that capacity is related to robustness, since an increase in the number of times the mark is embedded into the signal results in decreasing the probability of losing the mark.

3.2 Capacity
The capacity of the watermarking scheme is determined by the parameters R, p and ε used in the embedding process. Since the marked frequencies are chosen according to the difference between S and its compressed/decompressed version, it is obvious that the rate R is a key parameter in this process. The percentage p determines which frequencies are significant enough to be taken into account and, thus, this is a relevant parameter as far as capacity is concerned. Finally, the relative error ε is used to measure whether two spectral values of the original and the compressed/decompressed signal are equal, which also affects the number of marked frequencies.

In Table 1, the capacity of the suggested scheme for the ten benchmark files is displayed. We have considered that the true capacity is not the number of marked bits (the second column), since the extended watermark is highly redundant: 70 bits of information plus 364 bits of redundancy. Hence only 70/434 of the marked bits are the true capacity (third column). However, this redundancy is relevant to the robustness of the method, as it allows errors to be corrected once the extended mark W_ECC is recovered. Note, also, that 10 seconds of music are enough to embed the mark at least 3 times. If 3-minute files were marked using this method, the capacity of the method would be between 3652 bits (or 52 times a 70-bit mark) plus the redundancy for the glockenspiel file and 45763 bits (or 653 times a 70-bit mark) plus the redundancy for the quartet file. It must be taken into account that the glockenspiel file is a special case, since it only contains 5 seconds of music.

3.3 Robustness Assessment
The robustness of the resulting scheme has been tested using the StirMark benchmark for audio [16], version 0.2. Some of the attacks in this benchmark cannot be evaluated
for the watermarking scheme presented in this paper, since the current version of our watermarking scheme does not allow for a large difference between the number of samples of the marked (Ŝ) and the attacked (T) signals. In addition, only the left channel has been marked in the experiments, thus stereo attacks do not apply here either. The attacks considered for this test are summarised in Table 2.

Table 2. Attacks described in the StirMark benchmark for audio

Name          Number   Name              Number   Name          Number
AddBrumm      1–11     AddDynNoise       12       Addnoise      13–17
AddSinus      18       Amplify           19       Compressor    20
Echo          21       Exchange          22       FFT HLPass    23
FFT Invert    24       FFT RealReverse   25       FFT Stat1     26
FFT Test      27       FlippSample       28       Invert        29
LSBZero       30       Normalise         31       RC-HighPass   32
RC-LowPass    33       Smooth            34       Smooth2       35
Stat1         36       Stat2             37       ZeroCross     38
According to this table, thirty-eight different attacks are performed. The attack AddFFTNoise with default parameters destroys the audio file (it produces no sound) and, thus, no results are available for this attack. Future versions of the watermarking scheme should cope with stereo attacks (ExtraStereo and VoiceRemove) and attacks which modify the number of samples in a significant way (CutSamples, ZeroLength, ZeroRemove, CopySample and Resampling), but the current version of the watermarking scheme proposed here cannot cope with them.

In order to test the robustness of the suggested watermarking scheme against these 38 attacks, a correlation measure between the embedded mark W and the identified mark W̄ is used. Let W_i and W̄_i be, respectively, the i-th bit of W and W̄; hence

$$\beta_i = \begin{cases} 1, & \text{if } W_i = \bar{W}_i \\ -1, & \text{if } W_i \neq \bar{W}_i \end{cases}$$

is defined. Now, the correlation is computed, taking into account the β_i for all the |W| bits (70 in our case) of the mark, as follows:

$$\mathrm{Correlation} = \frac{1}{|W|} \sum_{i=1}^{|W|} \beta_i .$$

This measure is 1 when all the |W| bits are correctly recovered (W̄ = W) and it is −1 when all the |W| bits are misidentified. A value of about 0 is expected when 50% of the bits are correctly recovered, as would occur if the mark bits were reconstructed randomly. In the StirMark benchmark test, we have considered that the watermarking scheme survives an attack only if the correlation is exactly 1, i.e. only if all the 70 bits of the mark are correctly recovered.
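In code the measure is a one-liner; how '*' bits enter the comparison is not spelled out above, so treating them as mismatches is our assumption.

```python
def correlation(w_true, w_found):
    # +1 for each matching bit, -1 otherwise ('*' counts as a mismatch
    # here -- our assumption), averaged over the |W| mark bits
    assert len(w_true) == len(w_found)
    return sum(1 if a == b else -1
               for a, b in zip(w_true, w_found)) / len(w_true)

print(correlation("1011010", "1011010"))   #  1.0 -> attack survived
print(correlation("1011010", "10*1010"))   #  ~0.71
```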
Table 3. Survival of the mark to the StirMark test

Number  Survival ratio   Number  Survival ratio   Number  Survival ratio
  1        10/10            2       10/10            3       10/10
  4        10/10            5       10/10            6       10/10
  7        10/10            8       10/10            9       10/10
 10        10/10           11       10/10           12        6/10
 13        10/10           14       10/10           15       10/10
 16        10/10           17        8/10           18        6/10
 19        10/10           20       10/10           21        0/10
 22        10/10           23        3/10           24       10/10
 25        10/10           26        0/10           27        0/10
 28         0/10           29       10/10           30       10/10
 31        10/10           32        1/10           33       10/10
 34         9/10           35        9/10           36       10/10
 37        10/10           38        1/10
In Table 3 the survival of the mark against the StirMark benchmark attacks is displayed. The relation between the attack numbers and the names given in the StirMark benchmark is provided in Table 2. Each attack has been performed on the ten files of the SQAM corpus reported above. Hence, the results are shown as an x/10 ratio, since the total number of files is 10. As remarked above, the mark is considered to be recovered only if all 70 bits are correctly reconstructed.

The results of Table 3 show that only 7 of the 38 attacks of the StirMark benchmark performed in this paper cause serious damage to the embedded mark. The attacks with survival ratios of 6/10 or above produce good correlation values in the non-survived cases, which suggests that better results might arise with an appropriate tuning of the watermarking scheme. The non-survived attacks are the following: 21 (Exchange), 23 (FFT HLPass), 26 (FFT Stat1), 27 (FFT Test), 28 (FlippSample), 32 (RC-HighPass) and 38 (ZeroCross). It must be remarked that most of these attacks produce significant audible damage to the signal and would not be considered acceptable under the most usual situations, especially for music files.

Finally, a set of MPEG 1 Layer 3 compression attacks (using the Blade codec) have been carried out on the marked soprano SQAM file in order to test the robustness of the suggested watermarking scheme against compression. Since the rate used for watermarking is R = 128 Kbps, it was expected that the scheme would be able to overcome compression attacks with bit rates of 128 Kbps and higher. Table 4 displays the correlation values obtained for the MPEG 1 Layer 3 compression attacks at several bit rates, from 320 Kbps to 32 Kbps.
Table 4. MPEG 1 Layer 3 compression attacks

Bit rate (Kbps)       320      256      224      192      160      128      112
Compression ratio    4.41:1   5.51:1   6.30:1   7.35:1   8.82:1  11.03:1  12.60:1
Correlation            1        1        1        1        1        1        1

Bit rate (Kbps)        96       80       64       56       48       40       32
Compression ratio   14.70:1  17.64:1  22.05:1  25.20:1  29.30:1  35.28:1  44.10:1
Correlation            1       0.97     0.97     0.94     0.83     0.80     0.49

This table shows that the watermarking scheme suggested here is not only robust for all bit rates greater than or equal to 128 Kbps, as expected, but also to rates 112 and 96 Kbps, which are more compressed
than the rate used for watermarking (128 Kbps). In addition, the correlation value is very close to 1 even for rates 80, 64 and 56 Kbps. Of course, better robustness against compression attacks might be achieved by choosing a different rate for watermarking, for example R = 64 Kbps.
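The compression ratios in Table 4 follow directly from the 1411.2 Kbps rate of uncompressed 44100 Hz, 16-bit stereo PCM, as this quick check shows:

```python
pcm_kbps = 44100 * 16 * 2 / 1000          # 1411.2 Kbps for CD-quality stereo
for rate in (320, 128, 96, 32):
    print(f"{rate} Kbps -> {pcm_kbps / rate:.2f}:1")
# 4.41:1, 11.03:1, 14.70:1, 44.10:1 -- matching Table 4
```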
4 Conclusions and Further Research

This paper presents a watermarking method which uses MPEG 1 Layer 3 compression to determine the position of the embedded mark. The main idea of the method, borrowed from the image watermarking scheme of [12], is to find the frequencies for which the spectrum of the original signal is not modified after compression. These frequencies are used to embed the mark bits by adding or subtracting a given parameter to the magnitude of the spectrum. The method is complemented with an error correcting code and a pseudo-random binary signal to increase robustness and to avoid collusion of two buyers. Thus, this watermarking approach is also suitable for fingerprinting.

The performance of the suggested scheme has been evaluated on the SQAM file corpus using three measures: imperceptibility, capacity and robustness. We have shown that (without tuning) the power of the embedded watermark is about 0.01 times that of the original signal. As far as capacity is concerned, for typical 3-minute music files, the mark can be repeated hundreds of times within the marked signal. Finally, robustness has been tested by performing the applicable attacks in the StirMark benchmark and also MPEG 1 Layer 3 compression attacks. The suggested scheme has been shown to be robust against most of the StirMark attacks and against compression attacks with compression ratios larger than the one used for watermarking.

There are several directions in which to further the research presented in this paper. Part of the future research will be focused on the parameters of the scheme, since guidelines for tuning these parameters must be suggested. In addition, the watermarking scheme must be adapted to stereo files by marking both the left and the right channels appropriately. It is also required that the watermarking scheme be able to cope with attacks which modify the number of samples in the attacked signal in a significant way. The use of filters which model the HAS to measure imperceptibility is another research topic. Finally, the possibility of working with blocks of samples instead of using the whole file should be addressed.
Acknowledgements. This work is partially supported by the Spanish MCYT and the FEDER funds under grant no. TIC2001-0633-C03-03 STREAMOBILE.
References

1. Petitcolas, F., Anderson, R., Kuhn, M.: Attacks on copyright marking systems. In: 2nd Workshop on Information Hiding. LNCS 1525, Springer-Verlag (1998) 219–239
2. Petitcolas, F., Anderson, R.: Evaluation of copyright marking systems. In: Proceedings of IEEE Multimedia Systems'99 (1999) 574–579
3. Bell, A.: The dynamic digital disk. IEEE Spectrum 36 (1999) 28–35
4. Swanson, M., Kobayashi, M., Tewfik, A.: Multimedia data-embedding and watermarking technologies. Proceedings of the IEEE 86(6) (1998) 1064–1087
5. Swanson, M.D., Zhu, B., Tewfik, A.: Current state of the art, challenges and future directions for audio watermarking. In: Proceedings of IEEE International Conference on Multimedia Computing and Systems. Volume 1, IEEE Computer Society (1999) 19–24
6. Voyatzis, G., Pitas, I.: Protecting digital image copyrights: a framework. IEEE Computer Graphics and Applications 19 (1999) 18–24
7. Cox, I.J., Kilian, J., Leighton, T., Shamoon, T.: Secure spread spectrum watermarking for multimedia. IEEE Transactions on Image Processing 6 (1997) 1673–1687
8. Swanson, M.D., Zhu, B., Tewfik, A., Boney, L.: Robust audio watermarking using perceptual masking. Elsevier Signal Processing, Special Issue on Copyright Protection and Access Control 66 (1998) 337–355
9. Kim, W., Lee, J., Lee, W.: An audio watermarking scheme robust to MPEG audio compression. In: Proc. NSIP. Volume 1, Antalya, Turkey (1999) 326–330
10. Gruhl, D., Lu, A., Bender, W.: Echo hiding. In: Proceedings of the 1st Workshop on Information Hiding. Number 1174 in Lecture Notes in Computer Science, Cambridge, England, Springer-Verlag (1996) 295–316
11. Bassia, P., Pitas, I., Nikolaidis, N.: Robust audio watermarking in the time domain. IEEE Transactions on Multimedia 3 (2001) 232–241
12. Domingo-Ferrer, J., Herrera-Joancomartí, J.: Simple collusion-secure fingerprinting schemes for images. In: Proceedings of the Information Technology: Coding and Computing ITCC'2000, IEEE Computer Society (2000) 128–132
13. Domingo-Ferrer, J., Herrera-Joancomartí, J.: Short collusion-secure fingerprinting based on dual binary Hamming codes. Electronics Letters 36 (2000) 1697–1699
14. Boneh, D., Shaw, J.: Collusion-secure fingerprinting for digital data. In: Advances in Cryptology – CRYPTO'95. LNCS 963, Springer-Verlag (1995) 452–465
15. Purnhagen, H.: SQAM – Sound Quality Assessment Material (2001) http://www.tnt.uni-hannover.de/project/mpeg/audio/sqam/
16. Steinebach, M., Petitcolas, F., Raynal, F., Dittmann, J., Fontaine, C., Seibel, S., Fates, N., Ferri, L.: StirMark benchmark: audio watermarking attacks. In: Proceedings of the Information Technology: Coding and Computing ITCC'2001, IEEE Computer Society (2001) 49–54
Loss-Tolerant Stream Authentication via Configurable Integration of One-Time Signatures and Hash-Graphs

Alwyn Goh¹, G.S. Poh², and David C.L. Ngo³

¹ Corentix Laboratories, B-19-02 Cameron Towers, Jln 5/58B, 46000 Petaling Jaya, Malaysia
[email protected]
² Mimos, Technology Park Malaysia, 57000 Kuala Lumpur, Malaysia
³ Faculty of Information Science & Technology, Multimedia University, 75450 Melaka, Malaysia

Abstract. We present a stream authentication framework featuring preemptive one-time signatures and reactive hash-graphs, thereby enabling simultaneous realisation of near-online performance and packet-loss tolerance. Stream authentication is executed on packet aggregations at three levels ie: (1) GM chaining of packets within groups, (2) WL star connectivity of GM authenticator nodes within meta-groups, and (3) signature m-chaining between meta-groups. The proposed framework leverages the most attractive functional attributes of the constituent mechanisms ie: (1) immediate verifiability of one-time signatures and WL star nodes, (2) robust loss-tolerance of WL stars, and (3) efficient loss-tolerance of GM chains; while compensating for various structural characteristics ie: (1) high overhead of one-time signatures and WL stars, and (2) loss-intolerance of the GM chain authenticators. The resultant scheme can be operated in various configurations based on: (1) ratio of GM chain to WL star occurrence, (2) frequency of one-time signature affixation, and (3) redundancy and spacing of the signature-chain.
1 Introduction

Lossy streaming results in the received stream being a subset of the transmitted one; which is problematic from the data authentication viewpoint, especially in comparison to the well-established authentication and verification of block-oriented data. Such signature protocols allow receiver-side establishment of: (1) absolute data integrity during transit, and (2) association with a specified sender; thereby regarding data loss (even a single bit) in transit as equivalent to fraudulent manipulation. Block-oriented signature protocols are therefore essentially inapplicable on lossy datastreams.

1.1 Overview of Previous Research

Previous research in stream authentication [1-7] has focussed on the integrated use of signatures and hash-chaining, with the latter essentially an amortisation mechanism to compensate for the heavy overheads of the former. This basic concept is established
in the seminal work of Gennaro-Rohatgi (GR) [1], who also introduced the notion of reactive and preemptive mechanisms. The former is applicable when the entire datastream is available a priori to the sender, thereby enabling attachment to every packet of the hash-value of the preceding packet. Earlier sections of the datastream are hence reactively verified by subsequently recovered packets. This contrasts with the preemptive one-time signatures applicable when the datastream is not entirely available a priori. In this case previously recovered one-time public-keys are used to verify subsequent one-time signatures. Note that both formulations as originally presented are intolerant of packet-loss.

Reactive authentication was extended by the hash-trees of Wong-Lam (WL) [2], in which the hash-values of child-nodes are aggregated and used for parent-node computation. WL hash-trees require: (1) sender-side buffering of leaf and intermediate hashes, and (2) affixation of highly redundant authenticative information to the data packets; both of which can constitute significant overheads. This formulation does nevertheless enable immediate receiver-side verification and is also maximally loss-tolerant. This tolerance against worst-case packet-loss contrasts with the more economical presumption of random-bursty loss adopted by Golle-Modadugu (GM) [3]. GM augmented chains are far more efficient to compute than WL trees, but are not loss-tolerant to the same extreme degree.

1.2 Proposed Solution

The major issue in stream authentication is the difficulty of simultaneously enabling: (1) online (ie immediate or with minimal delay) functionality, and (2) packet-loss robustness. The GR formulation satisfies (1) but not (2), with subsequent research [2-6] focussing on the incorporation of loss-tolerance via hash-graph topology. Overemphasis on either of (1) or (2)—as respectively exemplified by the GR and WL/GM approaches—results in stream authentication with a narrow functional emphasis, and therefore limited usefulness.

This paper describes stream authentication in a broad functional context, where online (or near-online) performance and packet-loss tolerance are both important. We outline a composite solution featuring both preemptive and reactive mechanisms, with authentication operations executed at three packet-aggregation layers ie: (1) packet-level GM chaining, thereby ensuring efficient authentication on the bulk of the datastream, (2) group-level WL star connectivity to protect the functionally important chain authenticators, and (3) meta-group-level one-time signature [10-12] chaining to establish data-to-sender association.
2 Basic Mechanisms

A digital stream D differs significantly from conventional block-oriented data in several important respects ie: (1) a priori undefined (ie infinite) length, (2) online generation in terms of finite L-packet substreams D_k = {d_1, …, d_L} ⊂ D, (3) online consumption upon receipt of one or more substreams D′_k, and (4) probable loss of
packets during transit so that D′_k ⊆ D_k and ∪_{∀k} D′_k ⊆ D. Attributes (2, 3)
necessitate high-throughputs for sender-side authentication and receiver-side verification, thereby motivating the use of collision-resistant hash functions H rather than the (significantly slower) number-theoretic constructions. These hashes are used as the fundamental building blocks for the subsequently described H-graphs and sender-irrefutable signatures, the latter of which are functionally equivalent to block-oriented signatures. Such schemes are denoted σ: ⟨G_{x,y}, S_x, V_y⟩ [13], with key-(G)eneration, (S)igning and (V)erification parameterised by a key-pair (x, y) of private and public portions; as would be familiar from number-theoretic cryptography.

H-graphs and one-time signatures are respectively reactive and preemptive authentication mechanisms, with computations in the former case necessitating forward-buffering of data. Reactive authentication enables receiver-side verification to be loss-tolerant to a certain degree, but is not genuinely online ie only nearly so if the buffer-size is relatively small compared to characteristic stream-lengths. Preemptive authentication, in contrast, enables online performance, but requires lossless recovery of the transmitted stream. The inherently dichotomous requirements of online signing/verification (2, 3) and loss-tolerance (4) are an important motivation for the featured research, and will be subsequently discussed in more detail.

2.1 H-Chains

H-graphs result from the conceptualisation of authenticated message packets as vertices and H-based computation as directed edges, the latter of which establishes one-way connectivity among the former. Multi-packet authentication via H-graphs can be (depending on the topology) highly efficient due to overhead amortisation over the multiple packets in a particular graph. This is achieved through appending a particular packet hash (immediately or otherwise) ahead or astern of its location in the packet-graph. H-graph authentication can also be robust—to some degree, depending again on the topology—against packet-loss, usually at the expense of signing/verification delays arising from the necessity for packet buffering during graph construction. Various buffered H-graph schemes [2-7] are therefore loss-tolerant, while others [1] are genuinely online-computable but loss-intolerant. The simplest H-graph construction is a linear chain on finite-stream D ie:
$$\pi_0 = \langle H(d_1),\, S(H(d_1)) \rangle \quad\text{and}\quad \pi_i = \langle d_i,\, H(d_i, H(d_{i+1})) \rangle \tag{1}$$
for i ∈ [1, L–1]. The number-theoretic signature on initial packet π_0 is required to establish sender-association, which bootstraps the authentication process. There is then the necessity for d_{i+1} prior to computation of π_i, which is characteristic of reactive schemes.
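A minimal runnable rendering of this reactive chain (our reading of Eqn 1: each packet carries the hash anchoring its successor, with one conventional signature bootstrapping the head; HMAC stands in for the number-theoretic signature S purely to keep the sketch self-contained):

```python
import hashlib, hmac

def H(*parts: bytes) -> bytes:
    # SHA-256 standing in for the paper's unspecified hash H
    return hashlib.sha256(b"".join(parts)).digest()

SK = b"demo-key"   # placeholder key; a real scheme would sign asymmetrically
sign = lambda m: hmac.new(SK, m, hashlib.sha256).digest()
verify = lambda m, sig: hmac.compare_digest(sign(m), sig)

def build_chain(data):
    # Sender side: the whole substream is buffered, then walked backwards
    # so that pi_i can embed the hash binding it to pi_{i+1}
    h_next, pkts = b"", []
    for d in reversed(data):
        pkts.append((d, h_next))       # pi_i = <d_i, successor hash>
        h_next = H(d, h_next)
    pkts.reverse()
    return (h_next, sign(h_next)), pkts    # pi_0 = <h_1, S(h_1)>

def verify_chain(head, pkts):
    # Receiver side: one signature check, then one hash per packet.
    # Loss-intolerant: a single missing packet breaks the chain.
    h_expect, sig = head
    if not verify(h_expect, sig):
        return False
    for d, h_next in pkts:
        if H(d, h_next) != h_expect:
            return False
        h_expect = h_next
    return True

head, pkts = build_chain([b"pkt1", b"pkt2", b"pkt3"])
print(verify_chain(head, pkts))   # True
```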
2.2 One-Time Signatures

H-based signatures [10-12] are significantly faster than the corresponding number-theoretic operations, but functionally limited in that a key-pair can only be used to sign and verify a single message. This results in a linear key-to-data overhead, as opposed to the constant overhead of number-theoretic formulations with long-term reusable key-pairs. The signatures themselves also tend to be quite large—ie in the kbit-range for the Even-Goldreich-Micali (EGM) [10] formulation—thereby rendering impractical signature affixation on every stream packet. One-time signatures do not (in contrast to H-graphs) require packet-buffering and can therefore be operated online. The basic operational concept is for the i-th signing S_i and verification V_i components (as parameterised by the i-th key-pair) to be applied on packet d_i ∈ D. Note that one-time schemes also require bootstrapping with a number-theoretic signature on the initial one-time public-key y_0. This
certified public-key can subsequently be used for receiver-side verification of a subsequent packet, the logic of which extends down the one-time signature chain as follows:
$$\pi_0 = \langle y_0,\, S(y_0) \rangle \quad\text{and}\quad \pi_i = \langle d_i,\, y_i,\, S_{i-1}(H(d_i, y_i)) \rangle \tag{2}$$
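Eqn 2 in runnable form, using Lamport one-time signatures as a stand-in for the EGM scheme cited above (structurally similar in how keys chain, though EGM signatures are shorter); all names and sizes are our illustrative choices.

```python
import hashlib, os

HASH = lambda b: hashlib.sha256(b).digest()

def ots_keygen():
    # Lamport one-time key-pair over 256-bit message digests
    sk = [[os.urandom(32), os.urandom(32)] for _ in range(256)]
    pk = [[HASH(x0), HASH(x1)] for x0, x1 in sk]
    return sk, pk

def digest_bits(msg):
    return [(byte >> j) & 1 for byte in HASH(msg) for j in range(8)]

def ots_sign(sk, msg):
    return [sk[i][b] for i, b in enumerate(digest_bits(msg))]

def ots_verify(pk, msg, sig):
    return all(HASH(s) == pk[i][b]
               for i, (b, s) in enumerate(zip(digest_bits(msg), sig)))

flat = lambda pk: b"".join(h for pair in pk for h in pair)

# Sender: packet i carries the next public-key y_i plus a signature
# under the key-pair announced one packet earlier (Eqn 2)
sk, y0 = ots_keygen()            # y_0 would carry a conventional signature
packets = []
for d in (b"d1", b"d2", b"d3"):
    sk_next, y_next = ots_keygen()
    packets.append((d, y_next, ots_sign(sk, d + flat(y_next))))
    sk = sk_next

# Receiver: verify each packet with the previously recovered public-key
y, ok = y0, True
for d, y_next, sig in packets:
    ok = ok and ots_verify(y, d + flat(y_next), sig)
    y = y_next
print(ok)   # True; but one lost packet strands all that follow
```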
This specific formulation is not loss-tolerant, and cannot recover from dropped packets. It can, however, be extended via association of multiple public-keys with a particular packet π_i. Such an m-time scheme would be able to tolerate lost packets, but at an increased key-to-data overhead. Note that the signature in the i-th packet (on data and public-keys in π_i) is computed sender-side under the (i–1)-th key-pair, whose public portion is transmitted in π_{i−1}; this is characteristic of preemptive authentication formulations. This contrasts with the reactive authenticative logic of Eqn 1, where sender-side computation of H-chain node π_i presumes prior availability of (buffered) π_{i+1}.

2.3 Wong-Lam H-Star

The H-star is the simplest case of the WL hierarchical authentication scheme [2], the basic idea of which is to bind consecutive packets into groups defined by a common signature (number-theoretic or hash-based) on the packet-group. Each packet is subsequently appended with the authenticative information (including the packet-group signature) necessary to confirm membership in the arbitrary n-sized group. This is explicitly designed to enable verification of single packets independent of any other within the same group. The result is an extremely high degree of loss-tolerance, allowing for group-level authentication even if n–1 (out of n) packets are lost during transmission. Formation of WL H-graphs, on the other hand, requires high buffering and communications overheads. The attributes of packet-loss robustness and high authenticative overhead can be seen from the packet structure ie:
$$\pi_i = \left\langle d_i,\; \{H(d_j)\}_{\forall j \neq i},\; S\!\left(H\!\left(\{H(d_j)\}_{\forall j}\right)\right) \right\rangle \tag{3}$$

with the root-node S corresponding to the common packet-group signature. Note S in this case denotes the signature-based binding together of all packets within a particular group, rather than a distinct node. Eqn 3 facilitates packet-group verification via any packet π_i ∈ Π within a star-authenticated sub-stream, but at the expense of O(n) hashes and one signature per packet-group. This constitutes an extremely high authenticative overhead per packet, and is also manifestly reactive ie necessitating buffering of ∀d_i ∈ D prior to computation of any π_i. Packet-group size n is therefore indicative of both packet-loss robustness and authenticative overheads, with small (large) values appropriate for relatively low (high) loss communications environments.
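A sketch of star construction and single-packet verification per Eqn 3 (HMAC again standing in for the group signature S, and all names ours):

```python
import hashlib, hmac

H = lambda b: hashlib.sha256(b).digest()
sign = lambda m: hmac.new(b"demo-key", m, hashlib.sha256).digest()

def wl_star(group):
    # Every packet carries the hashes of all its siblings plus the one
    # group signature, so each verifies independently: O(n) overhead
    # per packet, O(n^2) per group
    hashes = [H(d) for d in group]
    sig = sign(b"".join(hashes))                  # S over the group hashes
    return [(d, hashes[:i] + hashes[i+1:], sig)   # <d_i, {H(d_j)} j!=i, S>
            for i, d in enumerate(group)]

def wl_verify(pkt, i):
    # Verify packet i in isolation -- every sibling may have been lost
    d, sibling_hashes, sig = pkt
    hashes = sibling_hashes[:i] + [H(d)] + sibling_hashes[i:]
    return hmac.compare_digest(sign(b"".join(hashes)), sig)

pkts = wl_star([b"a", b"b", b"c", b"d"])
print(wl_verify(pkts[2], 2))   # True, even with packets 0, 1, 3 dropped
```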
2.4 Golle-Modadugu Augmented H-Chain

GM H-chaining [3] (in common with the WL constructions) is designed to facilitate verification within the context of lossy transmissions, but with a substantively different presumption of packet-loss. Note the packet-level self-similarity in Eqn 3, which renders WL stars robust against worst-case packet-loss, but at the expense of an extremely high authenticative overhead per packet. The GM approach adopts the less strenuous presumption that packets are lost in random bursts, so that some packets in an n-sized packet-group would be successfully retrieved. There is some evidence [8, 9] that the latter presumption is more realistic, which is fortunate because mitigation of worst-case loss (as addressed by WL stars) should intuitively require heavier overheads than alternative packet-loss models. GM chains in fact enable a significant reduction in the per-packet authenticative overhead, via adoption of: (1) basic chain structure, with far fewer H-connections compared to the above-discussed WL configuration; and (2) non-uniform distribution of authenticative overheads over the packet-group. Attribute (2) results in a more complex definition ie:

$$\pi_i = \begin{cases} \langle \psi_i,\, H(\psi_i) \rangle & i \in [\alpha, n-1] \\ d_n & i = n \\ \langle \psi_\beta,\, S(H(\psi_\beta)) \rangle & i = \beta \end{cases}
\quad\text{with}\quad
\psi_i = \begin{cases} \langle d_i,\, H(d_{i+1}) \rangle & i = \alpha, n-1 \\ \langle d_i,\, H(d_{i+1}),\, H(d_{i+2}) \rangle & i \in [1, n-2] \\ \langle d_\beta,\, H(d_\alpha),\, H(d_1),\, H(d_2) \rangle & i = \beta \end{cases} \tag{4}$$
for i ∈ {α, 1, …, n, β}, with head node α and tail β. This is justified by the resultant constant communications overhead per packet, as opposed to O(n) (ie scaling with group-size) for WL stars. The group-level communications overheads are therefore of O(n), rather than O(n²) for WL stars. Such a reduction is of major practical significance, particularly for operational scenarios with high-volume data transmission and bandwidth-constrained environments.
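The non-uniform packet layout of Eqn 4, rendered literally from our reconstruction above (the signing of β shown here is what the composite framework of Section 3 replaces by WL-star membership, per Eqn 5):

```python
import hashlib, hmac

H = lambda *p: hashlib.sha256(b"".join(p)).digest()
sign = lambda m: hmac.new(b"demo-key", m, hashlib.sha256).digest()  # stand-in for S

def gm_group(d_alpha, d, d_beta):
    # One GM packet-group: head alpha, interior packets d[0..n-1]
    # (the paper's d_1..d_n), tail beta. Interior packets point one
    # and two hops ahead, so random-burst losses leave an alternative
    # hash path to the signed beta node.
    n = len(d)
    psi = {"alpha": (d_alpha, H(d[0]))}              # psi_a = <d_a, H(d_1)>
    for i in range(n - 2):                           # paper's i in [1, n-2]
        psi[i + 1] = (d[i], H(d[i + 1]), H(d[i + 2]))
    psi[n - 1] = (d[n - 2], H(d[n - 1]))             # psi_{n-1} = <d_{n-1}, H(d_n)>
    psi["beta"] = (d_beta, H(d_alpha), H(d[0]), H(d[1]))
    pkts = {k: (v, H(*v)) for k, v in psi.items()}   # pi_i = <psi_i, H(psi_i)>
    pkts["n"] = d[n - 1]                             # pi_n = d_n, unadorned
    pkts["beta"] = (psi["beta"], sign(H(*psi["beta"])))
    return pkts

group = gm_group(b"head", [b"p1", b"p2", b"p3", b"p4", b"p5"], b"tail")
print(sorted(map(str, group)))   # ['1', '2', '3', '4', 'alpha', 'beta', 'n']
```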
GM augmented chaining allows for any packet π_i ∈ Π to be verified so long as there is a path to the signed π_β, which must be retrieved. This position-dependent authenticative prioritisation is diametrically opposed to the node-level uniformity of WL stars, the latter of which results in loss-tolerance irrespective of position. Note also that GM chain formation at both sending and receiving endpoints requires O(n) packet-buffering, with verification predicated on recovery of the β node. This contrasts with the online verifiability of WL star-connected packets, any of which can be verified independently of all others in the packet-group.
3 Proposed Composite Solution

The above-outlined mechanisms have various attractive attributes ie: (1) structural simplicity of H-chains with single/multiple connections, (2) online performance of m-time signature chaining, (3) maximal packet-loss robustness of WL stars, and (4) efficient packet-loss robustness of GM augmented chains; as illustrated below:

[Fig 1. (a) Linear H-chain, (b) linear one-time signature chain, (c) WL H-star, and (d) GM augmented H-chain]
with the arrows indicative of the authentication direction. This section describes such a framework featuring: (1) GM chain connectivity of individual packets in the datastream, (2) WL star connectivity of the GM β nodes, and (3) m-time signature affixation on the WL groups. The basic idea is to use GM chaining—with its simultaneous realisation of loss tolerance and structural efficiency—for the bulk of the datastream. This still necessitates recovery of the β nodes, which are then protected strongly via WL star connectivity. What remains is therefore to establish an association between any given WL star-group and the sender identity, which is efficiently done via a chained sequence of H-based signatures. Note this results in a tiered authentication framework addressing (1) packet, (2) packet-group and (3) stream-level datastructures.
3.1 Packet-Level GM Chain-Connectivity

Packets within groups can be characterised as d_i^k ∈ D_k, with (k, i) the respective group and packet indices. GM chain-authenticated packets π_i^k are straightforwardly obtained via Eqn 4, with the only difference being the handling of the β nodes ie:
$$
\beta^k = \left(\psi_\beta^k,\ H(\psi_\beta^k)\right)
\qquad \text{with} \qquad
\psi_\beta^k = \left(d_\beta^k,\ H(d_\alpha^k),\ H(d_1^k),\ H(d_2^k)\right)
\tag{5}
$$
Note these group-wise anchor nodes are not signed as in Eqn 4, but are rather used (as subsequently outlined) as leaf nodes within a larger-scale WL star encompassing multiple GM chains. The indices i ∈ {α, 1, …, n, β} and k ∈ {1, …, N} are applicable in Eqns 4 and 5. Each data packet in the GM chain then requires a communications overhead of 3 H-words—less for the i ∈ {α, n−1, n} packets—resulting in a total of 3n H-words per packet-group. This is comparable to the overhead of a single WL node, hence the obvious attraction of GM chains. Computation of the H-chain is also relatively efficient if the H(d_i^k) values in Eqn 4 are buffered, resulting in a total requirement of 2n H-computations per group.

3.2 Group-Level WL Star-Connectivity

Node β^k of Eqn 5 allows verification of the k-th packet-group even if some packets d_i^k (for i ≠ β) are dropped, but can itself be lost in transit. Loss of a particular β packet must therefore be mitigated against, so that the consequences do not extend beyond the relatively small n-sized packet-group. This is addressed in our framework via inter-β WL star-connectivity, with one-time signatures on the root-nodes. The resultant structural form is modified from Eqn 3 ie:
$$
B_k^\mu = \left(\beta_k^\mu,\ y^\mu,\ y^{\mu+\sigma},\ \left\{H(\beta_{k'}^\mu)\right\}_{\forall k' \neq k},\ \Sigma^{\mu-1}\right)
\qquad \text{with} \qquad
\Sigma^{\mu-1} = S^{\mu-1}\!\left(H\!\left(\left\{H(\beta_k^\mu)\right\}_{\forall k},\ y^\mu,\ y^{\mu+\sigma}\right)\right)
\tag{6}
$$
for k ∈ {1, …, N}, with µ the meta-group index and σ the inter-group spacing between the affixed one-time public-keys. These star-connected β nodes encompass N packet-groups, and are representative of a meta-group containing Nn data-packets. The WL star configuration ensures maximal robustness against loss of the group-specific β_k^µ, so that each node can be verified independently of all others. One β node out of the N is therefore sufficient to establish associativity within the larger-scale meta-group context, so long as public-key y^{µ−1} (necessary for verification of signature Σ^{µ−1}) has previously been recovered and verified. Note the inclusion of two public-keys per WL leaf in Eqn 6, thereby mitigating against discontinuities in the sequence of one-time public-keys and signatures. Note also the communications overhead of NH + mY + S per packet-group, with key/signature-lengths Y and S further expressible in terms of H-words.
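As an illustrative sketch of Eqn 6 (hypothetical names; ots_sign and ots_verify abstract the one-time EGM operations, and the leaf index k is carried explicitly for simplicity), star formation and single-node verification might read:

```python
import hashlib

def H(*parts: bytes) -> bytes:                  # as in the earlier sketch
    h = hashlib.sha1()
    for p in parts:
        h.update(p)
    return h.digest()

def make_wl_star(betas, y_mu, y_mu_sigma, ots_sign):
    """Form the B_k nodes of Eqn 6 over the N group-wise beta nodes."""
    leaves = [H(b) for b in betas]              # H(beta_k) for all k
    root = H(*leaves, y_mu, y_mu_sigma)         # star root also binds the keys
    sigma = ots_sign(root)                      # Sigma = S(root)
    return [(k, b, y_mu, y_mu_sigma,
             leaves[:k] + leaves[k + 1:],       # sibling hashes, k' != k
             sigma) for k, b in enumerate(betas)]

def verify_wl_member(node, ots_verify) -> bool:
    """Verify one B_k independently of all other star members."""
    k, beta, y_mu, y_mu_sigma, siblings, sigma = node
    leaves = siblings[:k] + [H(beta)] + siblings[k:]   # splice own hash back in
    return ots_verify(H(*leaves, y_mu, y_mu_sigma), sigma)
```

Note how any single surviving B_k reconstructs the signed star root from its own β node and the carried sibling hashes, which is precisely the one-out-of-N associativity property described above.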
The featured EGM protocol—in common with other one-time signature schemes, ie Diffie-Lamport, Merkle and Merkle-Winternitz—has short Y = H key-lengths, but is particularly attractive in that the signature-lengths are both configurable and comparatively short. Typical settings then result in S in the kbit-range, ie comparable to commonly encountered number-theoretic implementations. Efficient computation is facilitated—similar to the previously discussed packet-level GM chaining—by buffering of the H(β_k^µ) values, resulting in an overhead of NH + S per Nn-sized meta-group, with EGM signature-generation S (or verification V, both of which are equal) also configurable.

3.3 Stream-Level m-Time Signatures

The double signature-chaining of the previous section requires equivalently connected initialisation:
$$
B^0 = \left(y^0,\ y^\sigma,\ S\!\left(H(y^0,\ y^\sigma)\right)\right)
\tag{7}
$$
with Eqns 6 and 7 essentially a straightforward extension of the linear chaining of [1]. Incorporation of these preemptive signatures facilitates immediate verification—upon recovery of the B node as specified in Eqn 6—with multiple connectivity allowing resumption of stream verification σ meta-groups astern of any completely dropped meta-group. Characteristic stream-dimension σ is described in [1] as chain-strength, in the sense that such signature-chains would tolerate the loss of σ−1 WL stars between B_k^µ and B_{k′}^{µ+σ}, as illustrated in Fig 2:
Fig 2. Layered framework featuring packet, group and meta-group mechanisms
This facilitates loss-tolerance between meta-groups, and is therefore complementary to the previously discussed WL star-aggregation of β nodes, which addresses packet-loss within specified meta-groups. The m = 2 configuration does nevertheless result in a doubled key overhead per meta-group compared to Eqn 2, hence our incorporation of the EGM formalism with short H-sized public-keys. EGM (in common with other one-time protocols) also requires sender-side generation of the one-time key-pairs—with one required for each transmitted meta-group—which can be pre-computed for enhanced operational efficiency. The framework as outlined features the configurable parameters: (1) m public-keys per meta-group, (2) σ signature-chain meta-group spacing, (3) N groups per meta-group, and (4) n data-packets per group. Note the group/packet-level settings (N, n) facilitate more immediate verification compared to a long GM chain of length Nn, in addition to allowing more flexibility in response to different operational conditions.
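A hypothetical configuration record summarising these four parameters, with illustrative defaults and the buffering delays they induce (as quantified in the delay analysis of Sect. 4.2 below), might read:

```python
from dataclasses import dataclass

@dataclass
class StreamAuthConfig:
    m: int = 2      # one-time public-keys affixed per meta-group
    sigma: int = 1  # signature-chain spacing, in meta-groups (illustrative)
    N: int = 4      # packet-groups per meta-group
    n: int = 5      # data-packets per packet-group

    @property
    def meta_group_size(self) -> int:   # data-packets per meta-group
        return self.N * self.n

    @property
    def send_delay(self) -> int:        # Omega(send) = Nn (Sect. 4.2)
        return self.N * self.n

    @property
    def recv_delay(self) -> int:        # Omega(recv) = n (Sect. 4.2)
        return self.n
```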
4 Analysis of Framework

The proposed scheme can be implemented with any number-theoretic and one-time protocol. Our choice of the Rabin [14] and EGM formalisms is based primarily on performance considerations. Rabin verification (necessitating only a single modular-squaring) is, for instance, significantly more computation-efficient than other number-theoretic and even H-based formulations [15]. EGM, on the other hand, is significantly more communications-efficient than other one-time schemes, but in fact necessitates a higher computation-overhead. We therefore presume the relative preeminence of bandwidth and latency constraints over those related to endpoint computations. EGM signature generation and verification is also more computation-efficient (by up to two orders of magnitude) compared with number-theoretic protocols [15]. The constituent H-operations in EGM can be executed using block ciphers—ie the Data Encryption Standard (DES) as originally suggested—or one-way collision-resistant compressions ie Message Digest (MD) 5 or the Secure Hash Algorithm (SHA). Use of MD5 or SHA hashing results in superior computation-efficiency, and is therefore adopted in our implementation. These hashes are also used to construct the above-discussed WL stars and GM augmented chains.

Practical stream authentication must be both effective and efficient, with evaluation of both dependent on the manner in which packets are dropped in transit. Mechanisms designed to tolerate packet-loss in random-bursts (ie GM chains) can therefore be expected to be significantly more efficient than those designed for worst-case loss (ie WL stars). Our incorporation of both WL stars and GM chains addresses the fact that the β packets of the latter cannot be lost without major functional consequence. The objective is therefore to demonstrate that the featured specification of loss-tolerance effectiveness does not significantly degrade computation and communications efficiency. Our analysis is presented as follows:

4.1 Correctness

The above-outlined layered scheme can be demonstrated to be correct by considering system-wide compromise to be equivalent to compromise of the underlying cryptographic protocols, ie the signatures and H-graphs. We follow the GR methodology [1], in which the presumption of secure signatures—thereby establishing the initial signed packet—is subsequently extended down a linear H-chain. This established security of linear H-chaining is based on the random oracle model, and is therefore itself extensible to non-linear constructions (ie the proposed layered framework) via demonstration that node-level compromise is as difficult as the equivalent effort on the underlying one-time signature and H-function. The proposed framework is therefore as secure as the underlying mechanisms ie: (1) number-theoretic signature, (2) one-time signature, (3) H-graphs and (4) H-function.

4.2 Signing and Verification Delay

Delay Ω is defined as the number of packets which must be buffered prior to signing or verification operations. These operations should ideally be executed on a particular
packet without delay, ie Ω = 0, which denotes genuine online transmission and consumption. Recall that delay-free operations are rendered impossible by our use of H-graphs as an amortisation mechanism against the high overhead of signature operations. Our scheme results in: (1) Ω(send) = Nn from meta-group-level buffering prior to signature generation on the WL root-node, and (2) Ω(recv) = n from group-level buffering prior to verification with respect to the signed β node; the latter of which presumes recovery of the required one-time public-keys.

4.3 Communications Overhead

The communications overhead per packet:

$$
\Omega_i = \begin{cases}
S' + mH & i = 0 \\
2H & i = \alpha,\ n-1 \\
3H & i \in [1,\, n-2] \\
0 & i = n \\
S + (m + N + 3)H & i = \beta
\end{cases}
\tag{8}
$$
can be derived from Eqns 4-7, with: (1) S′ the number-theoretic signature-length, (2) S the one-time signature-length, and (3) H the hash-length. We compare the proposed scheme—in three (N, n) configurations, with m = 2 signature-chaining—against other stream authentication protocols over a 20-packet stream. Table 1 presents the delays and communications overheads associated with the various protocols:

Table 1. Buffering delays and communications overheads for 20-packet stream
Scheme                     Ω (send, recv)   Ω_i (bytes)                                     Loss
GR                         0, 0             (0) 128, (i) 10                                 None
GR (one-time)              0, 0             (0) 128, (i) 146                                None
WL star                    20, 0            (i) 318                                         Worst-case
GM chain                   20, 20           (α, n−1) 20, (i) 30, (n) 0, (β) 158             Random
Proposed scheme (4, 5)     20, 5            (0) 148, (α, n−1) 20, (i) 30, (n) 0, (β) 226    Random
Proposed scheme (2, 10)    20, 10           (0) 148, (α, n−1) 20, (i) 30, (n) 0, (β) 206    Random
Proposed scheme (1, 20)    20, 20           (0) 148, (α, n−1) 20, (i) 30, (n) 0, (β) 196    Random
given 10-byte H-words, 128-byte number-theoretic signatures and 136-byte one-time signatures. These hashes are relatively short compared to those used on blocked data, but are sufficient in the context of interest, ie to ensure target collision-resistance with respect to a fixed message, rather than more generalised collision-resistance.
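Assuming these lengths, the Table 1 entries follow mechanically from Eqn 8; a hypothetical helper reproducing them:

```python
def packet_overhead(position: str, N: int, H: int = 10,
                    S_prime: int = 128, S: int = 136, m: int = 2) -> int:
    """Per-packet communications overhead in bytes, per Eqn 8."""
    return {
        "0":     S_prime + m * H,       # initialisation packet B^0
        "edge":  2 * H,                 # i = alpha, n-1
        "inner": 3 * H,                 # i in [1, n-2]
        "n":     0,                     # i = n
        "beta":  S + (m + N + 3) * H,   # i = beta
    }[position]

# e.g. the proposed (4, 5), (2, 10) and (1, 20) configurations of Table 1:
assert packet_overhead("0", N=4) == 148
assert packet_overhead("beta", N=4) == 226
assert packet_overhead("beta", N=2) == 206
assert packet_overhead("beta", N=1) == 196
```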
Note the lessened verification delay compared to unassisted GM chaining, while retaining the general efficiency of the augmented chain structure. This configurable reduction of the receiver-side delay is attained while simultaneously incorporating (random) loss-tolerance, the latter of which cannot be addressed by the GR and GR one-time formulations. The trade-off between Ω(recv) and Ω_β is also noteworthy, as is the significantly lessened communications overhead compared with the unassisted WL star configuration. Our scheme can therefore be said to possess functional advantages over previously reported formulations.
4.4 Computation Overhead

The signing overhead can also be expressed in terms of the constituent operations ie: (1) S′ computations on a stream basis, (2) S computations on a meta-group basis, and (3) H computations on a group/packet basis. We presume the necessity of only a single S′ per streaming session—thereafter represented as [S′] to denote amortisation over multiple meta-groups—and also prior generation of the one-time key-pair sequence. Each meta-group then requires N(n+2) H-computations to account for all the data packets, with N(n+1) required for GM chain construction, as can be seen from Eqns 4 and 5. Meta-group formation also requires Ω(WL) = S + (N+2)H associated with WL star computation from Eqn 6. Buffering of packet-level hashes as previously discussed allows for significant efficiency gains, thereby allowing analysis in terms of incremental overheads during GM chaining, ie ∆Ω_i = H (for i ≠ n) and ∆Ω_n = 0. This results in a total meta-group overhead of:
$$
\Omega^k = [S'] + S + \left(N(n+2) + 2\right)H
\tag{9}
$$
associated with sender-side signature generation. The verification overhead is likewise expressible in terms of: (1) V′ computations on a stream basis, (2) V computations on a meta-group basis, and (3) H computations on a group/packet basis; with [V′] denoting amortisation over multiple meta-groups. Presumption of packet-level hash-buffering then allows for incremental overheads ∆Ψ_i = H (for i ≠ n), ∆Ψ_n = 0 and ∆Ψ(WL) = V + 3H. This results in:
$$
\Psi^k = [V'] + V + \left(N(n+1) + 3\right)H
\tag{10}
$$
associated with receiver-side signature verification. It should be emphasised that H retrieval rather than recomputation is especially significant for (N, n) configurations with relatively sparse WL star connections over long GM chains.
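Eqns 9 and 10 reduce to simple per-meta-group hash-counts once [S′] and [V′] are amortised; a hypothetical sketch, with the (4, 5) configuration as a check:

```python
def signing_hashes(N: int, n: int) -> int:
    """H-computations per meta-group on the sender side (Eqn 9)."""
    return N * (n + 2) + 2

def verification_hashes(N: int, n: int) -> int:
    """H-computations per meta-group on the receiver side (Eqn 10)."""
    return N * (n + 1) + 3

# e.g. (N, n) = (4, 5): 30 hashes plus one S to sign each meta-group,
# and 27 hashes plus one V to verify it ([S'], [V'] amortised per session).
assert signing_hashes(4, 5) == 30 and verification_hashes(4, 5) == 27
```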
Our framework—once again in three (N, n) configurations—is compared with previously published protocols over multiple µ repetitions of a 20-packet meta-group, resulting in Table 2:

Table 2. Signing and verification overheads for µ repetitions of 20-packet meta-group

Scheme                     Ω (sign)
GR                         S′ + 20µH
GR (one-time)
WL star
GM chain
Proposed scheme (4, 5)
Proposed scheme (2, 10)
Proposed scheme (1, 20)